The rapid growth of the Internet of Things (IoT) and cloud computing allows neuroscientists to collect multilevel and multichannel brain data to better understand brain functions, diagnose diseases, and devise treatments.
To ensure secure and reliable data communication between end-to-end (E2E) devices supported by current IoT and cloud infrastructure, trust management is needed at the IoT and user ends.
This paper introduces a neuro-fuzzy, brain-inspired trust management model (TMM) to secure IoT devices and relay nodes, and to ensure data reliability.
The proposed TMM assesses a node's trustworthiness using node behavioral trust and data trust, estimated with an Adaptive Neuro-Fuzzy Inference System (ANFIS) and a weighted-additive method, respectively.
NS2 simulation results confirm the robustness and accuracy of the proposed TMM in identifying malicious nodes in the communication network, in contrast to existing fuzzy-based TMMs.
With the growing use of cloud-based IoT frameworks in neuroscience research, integrating the proposed TMM into the existing infrastructure will ensure secure and reliable data communication among E2E devices.
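As a minimal sketch of the weighted-additive idea, per-attribute data-trust scores can be combined into a single score with fixed weights. The attribute names and weights below are hypothetical illustrations; the paper's exact formulation may differ.

```python
# Weighted-additive combination of data-trust attributes (sketch).
# Attribute names and weights are hypothetical, not the paper's.

def weighted_additive_trust(attributes, weights):
    """Combine per-attribute trust scores (each in [0, 1]) into one score."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[k] * attributes[k] for k in weights)

# Hypothetical data-trust attributes for a relay node.
attrs = {"data_consistency": 0.9, "timeliness": 0.8, "integrity": 1.0}
wts = {"data_consistency": 0.5, "timeliness": 0.3, "integrity": 0.2}
score = weighted_additive_trust(attrs, wts)  # 0.45 + 0.24 + 0.20 = 0.89
```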
In this paper, we address the problem of computing optimal paths through three consecutive points for the curvature-constrained forward moving Dubins vehicle.
Given initial and final configurations of the Dubins vehicle, and a midpoint with an unconstrained heading, the objective is to compute the midpoint heading that minimizes the total Dubins path length.
We provide a novel geometric analysis of the optimal path, and establish new properties of the optimal Dubins path through three points.
We then show how our method can be used to quickly refine Dubins TSP tours produced using state-of-the-art techniques.
We also provide extensive simulation results showing that the proposed approach improves both runtime and solution quality over the conventional method of uniformly discretizing the heading at the midpoint and solving the minimum Dubins path for each discrete heading.
High quality upsampling of sparse 3D point clouds is critically useful for a wide range of geometric operations such as reconstruction, rendering, meshing, and analysis.
In this paper, we propose a data-driven algorithm that upsamples 3D point clouds without the need for hard-coded rules.
Our approach uses a deep network with Chamfer distance as the loss function, capable of learning the latent features in point clouds belonging to different object categories.
We evaluate our algorithm across different amplification factors, with upsampling learned and performed on objects belonging to the same category as well as different categories.
We also explore the desirable characteristics of input point clouds as a function of the distribution of the point samples.
Finally, we demonstrate the performance of our algorithm in single-category training versus multi-category training scenarios.
The final proposed model is compared against a baseline, optimization-based upsampling method.
Results indicate that our algorithm is capable of generating more uniform and accurate upsamplings.
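The Chamfer distance loss mentioned above can be sketched in a few lines of numpy; note that some variants use squared distances, so this is one common formulation rather than necessarily the paper's exact one.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N,3) and b (M,3):
    mean nearest-neighbor distance from a to b plus from b to a."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.0, 0.0]])
cd = chamfer_distance(a, b)
```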
The Internet is the main source of information nowadays.
Search engines should therefore offer alternative ways of representing search results.
Such representation methods would enable end users, especially visually impaired (VI) web searchers, to access information on the web.
The aim of this paper is to design, evaluate, and improve an interface that allows VI users to search and browse results.
This attempt provides a new accessibility tool for the VI web searchers.
The conceptual modelling technique proposed in this paper is based on Formal Concept Analysis (FCA), which hides the detailed information of the collected results.
This approach highlights the main discovered concepts to be focused on.
This is combined with contextual interactive navigation in an interface called Interactive Search Engine (InteractSE), which minimizes the time and effort required by VI users.
There is no standardised set of guidelines or heuristics that can be used to evaluate the usability and accessibility of such an interface.
Therefore, InteractSE was evaluated by experts using Nielsen's heuristics and the Web Content Accessibility Guidelines (WCAG) 2.0 in terms of both usability and accessibility.
The analysis was carried out based on the number of usability problems identified and their average severity ratings.
The results show that the most frequently violated heuristics from the Nielsen set are consistency and documentation.
The average severity rating of all the problems found using the Nielsen set is minor.
The results also show that the most frequently violated WCAG 2.0 guidelines are distinguishable, followed by navigable and affordance.
The average severity rating of all the problems found using WCAG 2.0 guidelines is also minor.
The results show that Nielsen's heuristics and the WCAG 2.0 guidelines both contributed to identifying a number of usability problems.
Automated Facial Expression Recognition (FER) has been a challenging task for decades.
Many of the existing works use hand-crafted features such as LBP, HOG, LPQ, and Histogram of Optical Flow (HOF) combined with classifiers such as Support Vector Machines for expression recognition.
These methods often require rigorous hyperparameter tuning to achieve good results.
Recently, deep neural networks (DNNs) have been shown to outperform traditional methods in visual object recognition.
In this paper, we propose a two-part network consisting of a DNN-based architecture followed by a Conditional Random Field (CRF) module for facial expression recognition in videos.
The first part captures the spatial relation within facial images using convolutional layers followed by three Inception-ResNet modules and two fully-connected layers.
To capture the temporal relation between image frames, we use a linear-chain CRF in the second part of our network.
We evaluate our proposed network on three publicly available databases.
Experiments are performed in subject-independent and cross-database manners.
Our experimental results show that cascading the deep network architecture with the CRF module considerably improves the recognition of facial expressions in videos.
In particular, it outperforms the state-of-the-art methods in the cross-database experiments and yields comparable results in the subject-independent experiments.
Existing deep multitask learning (MTL) approaches align layers shared between tasks in a parallel ordering.
Such an organization significantly constrains the types of shared structure that can be learned.
The necessity of parallel ordering for deep MTL is first tested by comparing it with permuted ordering of shared layers.
The results indicate that a flexible ordering can enable more effective sharing, thus motivating the development of a soft ordering approach, which learns how shared layers are applied in different ways for different tasks.
Deep MTL with soft ordering outperforms parallel ordering methods across a series of domains.
These results suggest that the power of deep MTL comes from learning highly general building blocks that can be assembled to meet the demands of each task.
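The soft-ordering idea can be sketched as applying, at each depth, a learned soft mixture of the shared layers. The toy layers and weight schedule below are hypothetical, not the paper's architecture; with one-hot weight rows the scheme reduces to a fixed (possibly permuted) layer ordering.

```python
import numpy as np

def soft_order_forward(x, layers, S):
    """Forward pass with soft ordering: at each depth i, apply a weighted
    combination of all shared layers, with weights S[i] (rows sum to 1).
    One-hot rows recover a fixed ordering; soft rows blend the layers."""
    for i in range(S.shape[0]):
        x = sum(S[i, j] * layers[j](x) for j in range(len(layers)))
    return x

# Two toy shared "layers" and a one-hot schedule that applies them
# in permuted order: layer 1 first, then layer 0.
layers = [lambda v: v + 1.0, lambda v: 2.0 * v]
S = np.array([[0.0, 1.0], [1.0, 0.0]])
out = soft_order_forward(3.0, layers, S)  # 2*3 = 6, then 6 + 1 = 7
```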
In this paper, we explore the use of electrical biosignals measured on the scalp, corresponding to mental relaxation and concentration tasks, to control an object in a video game.
To evaluate the requirements of such a system in terms of sensors and signal processing we compare two designs.
The first one uses only one scalp electroencephalographic (EEG) electrode and the power in the alpha frequency band.
The second one uses sixteen scalp EEG electrodes and machine learning methods.
The role of muscular activity is also evaluated using five electrodes positioned on the face and the neck.
Results show that the first design enabled 70% of the participants to successfully control the game, whereas 100% of the participants managed to do it with the second design based on machine learning.
Subjective questionnaires confirm these results: users generally felt in control in both designs, with an increased feeling of control in the second one.
Offline analysis of face and neck muscle activity shows that this activity could also be used to distinguish between relaxation and concentration tasks.
Results suggest that the combination of muscular and brain activity could improve performance of this kind of system.
They also suggest that muscular activity has probably been recorded by EEG electrodes.
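The first design's single-electrode alpha-power feature can be sketched directly: estimate the power spectrum of the EEG signal and sum it over the alpha band. The synthetic signals below are illustrations, not the study's data.

```python
import numpy as np

def alpha_band_power(signal, fs, band=(8.0, 12.0)):
    """Power in the alpha band (8-12 Hz) of a single-channel EEG signal,
    estimated from the one-sided FFT power spectrum."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].sum()

fs = 256
t = np.arange(fs) / fs                    # one second of samples
eeg = np.sin(2 * np.pi * 10 * t)          # synthetic 10 Hz "alpha" rhythm
other = np.sin(2 * np.pi * 40 * t)        # out-of-band component
p_alpha = alpha_band_power(eeg, fs)
p_other = alpha_band_power(other, fs)
```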
Low-density parity-check (LDPC) codes on symmetric memoryless channels have been analyzed using statistical physics by several authors.
In this paper, statistical mechanical analysis of LDPC codes is performed for asymmetric memoryless channels and general Markov channels.
It is shown that the saddle-point equations of the replica-symmetric solution for a Markov channel are equivalent to the density evolution of belief propagation on the factor graph representing LDPC codes on the Markov channel.
The derivation uses the method of types for Markov chains.
Mobile agent networks, such as multi-UAV systems, are constrained by limited resources.
In particular, limited energy affects system performance directly, such as system lifetime.
It has been demonstrated in the wireless sensor network literature that the communication energy consumption dominates the computational and the sensing energy consumption.
Hence, the lifetime of the multi-UAV systems can be extended significantly by optimizing the amount of communication data, at the expense of increasing computational cost.
In this work, we aim at attaining an optimal trade-off between the communication and the computational energy.
Specifically, we propose a mixed-integer optimization formulation for a multi-hop hierarchical clustering-based self-organizing UAV network incorporating data aggregation, to obtain an energy-efficient information routing scheme.
The proposed framework is tested on two applications, namely target tracking and area mapping.
Simulation results show that our method can significantly save energy compared to a baseline strategy with no data aggregation or clustering.
Inspired by speech recognition, recent state-of-the-art algorithms mostly consider scene text recognition as a sequence prediction problem.
Though achieving excellent performance, these methods usually neglect an important fact: text in images is actually distributed in two-dimensional space.
This nature is quite different from that of speech, which is essentially a one-dimensional signal.
In principle, directly compressing features of text into a one-dimensional form may lose useful information and introduce extra noise.
In this paper, we approach scene text recognition from a two-dimensional perspective.
A simple yet effective model, called Character Attention Fully Convolutional Network (CA-FCN), is devised for recognizing the text of arbitrary shapes.
Scene text recognition is realized with a semantic segmentation network, where an attention mechanism for characters is adopted.
Combined with a word formation module, CA-FCN can simultaneously recognize the script and predict the position of each character.
Experiments demonstrate that the proposed algorithm outperforms previous methods on both regular and irregular text datasets.
Moreover, it is proven to be more robust to imprecise localizations in the text detection phase, which are very common in practice.
In this paper, we propose a design solution for the implementation of Virtualized Network Coding Functionality (VNCF) over a service coverage area.
Network Function Virtualization (NFV) and Network Coding (NC) architectural designs are integrated as a toolbox of NC design domains so that NC can be implemented over different underlying physical networks including satellite or hybrid networks.
The design includes identifying theoretical limits of NC over wireless networks in terms of achievable rate region and optimizing coding rates for nodes that implement VNCF.
The overall design target is to achieve a given multicast transmission target reliability at receiver sides.
In addition, the optimization problem uses databases with geo-tagged link statistics and geo-location information of network nodes in the deployment area for some computational complexity/energy constraints.
Numerical results validate our design solution, showing how network conditions and system constraints impact the design and implementation of NC, and how VNCF enables reliable communication over wireless networks with reliability and connectivity up to the theoretical limits.
We propose a method that combines signals from many brain regions observed in functional Magnetic Resonance Imaging (fMRI) to predict the subject's behavior during a scanning session.
Such predictions suffer from the huge number of brain regions sampled on the voxel grid of standard fMRI data sets: the curse of dimensionality.
Dimensionality reduction is thus needed, but it is often performed using a univariate feature selection procedure, that handles neither the spatial structure of the images, nor the multivariate nature of the signal.
By introducing a hierarchical clustering of the brain volume that incorporates connectivity constraints, we reduce the span of the possible spatial configurations to a single tree of nested regions tailored to the signal.
We then prune the tree in a supervised setting, hence the name supervised clustering, in order to extract a parcellation (division of the volume) such that parcel-based signal averages best predict the target information.
Dimensionality reduction is thus achieved by feature agglomeration, and the constructed features now provide a multi-scale representation of the signal.
Comparisons with reference methods on both simulated and real data show that our approach yields higher prediction accuracy than standard voxel-based approaches.
Moreover, the method infers an explicit weighting of the regions involved in the regression or classification task.
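The connectivity constraint means only spatially adjacent regions may merge. A toy one-dimensional version of this greedy agglomeration (merging the adjacent pair with the most similar mean signal) illustrates the idea; the paper's actual method works on the 3-D voxel grid with a supervised pruning step.

```python
import numpy as np

def constrained_agglomeration(signal, n_parcels):
    """Greedy agglomeration of 1-D 'voxels': only spatially adjacent
    parcels may merge, mimicking connectivity-constrained clustering."""
    parcels = [[i] for i in range(len(signal))]
    while len(parcels) > n_parcels:
        # Merge the adjacent pair whose parcel means are closest.
        means = [np.mean(signal[p]) for p in parcels]
        diffs = [abs(means[i + 1] - means[i]) for i in range(len(parcels) - 1)]
        j = int(np.argmin(diffs))
        parcels[j] = parcels[j] + parcels.pop(j + 1)
    return parcels

signal = np.array([0.1, 0.2, 0.15, 5.0, 5.1, 9.0])
parcels = constrained_agglomeration(signal, 3)
```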
As global political preeminence gradually shifted from the United Kingdom to the United States, so did the capacity to culturally influence the rest of the world.
In this work, we analyze how the world-wide varieties of written English are evolving.
We study both the spatial and temporal variations of vocabulary and spelling of English using a large corpus of geolocated tweets and the Google Books datasets corresponding to books published in the US and the UK.
The advantage of our approach is that we can address both standard written language (Google Books) and the more colloquial forms of microblogging messages (Twitter).
We find that American English is the dominant form of English outside the UK and that its influence is felt even within the UK borders.
Finally, we analyze how this trend has evolved over time and the impact that some cultural events have had in shaping it.
A battery swapping and charging station (BSCS) is an energy refueling station, where i) electric vehicles (EVs) with depleted batteries (DBs) can swap their DBs for fully-charged ones, and ii) the swapped DBs are then charged until they are fully-charged.
Successful deployment of a BSCS system necessitates a careful planning of swapping- and charging-related infrastructures, and thus a comprehensive performance evaluation of the BSCS is becoming crucial.
This paper studies such a performance evaluation problem with a novel mixed queueing network (MQN) model and validates this model with extensive numerical simulation.
We adopt the EVs' blocking probability as our quality-of-service measure and focus on studying the impact of the key parameters of the BSCS (e.g., the numbers of parking spaces, swapping islands, chargers, and batteries) on the blocking probability.
We prove a necessary and sufficient condition for showing the ergodicity of the MQN when the number of batteries approaches infinity, and further prove that the blocking probability has two different types of asymptotic behaviors.
Meanwhile, for each type of asymptotic behavior, we analytically derive the asymptotic lower bound of the blocking probability.
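The paper's mixed queueing network is considerably richer, but as background intuition the blocking probability of a single M/M/c/c loss station is given by the classic Erlang B formula, computed here with the standard numerically stable recursion. The offered load and server count below are hypothetical.

```python
def erlang_b(offered_load, servers):
    """Erlang B blocking probability for an M/M/c/c loss system,
    via the recursion B(a, c) = a*B(a, c-1) / (c + a*B(a, c-1))."""
    b = 1.0
    for c in range(1, servers + 1):
        b = offered_load * b / (c + offered_load * b)
    return b

# Hypothetical example: offered load a = lambda/mu = 2 Erlangs
# at a station with 4 swapping islands.
p_block = erlang_b(2.0, 4)
```

Adding servers lowers blocking, matching the intuition that more swapping islands or chargers reduce the EVs' blocking probability.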
The success of graph embeddings or node representation learning in a variety of downstream tasks, such as node classification, link prediction, and recommendation systems, has led to their popularity in recent years.
Representation learning algorithms aim to preserve local and global network structure by identifying node neighborhood notions.
However, many existing algorithms generate embeddings that fail to properly preserve the network structure, or lead to unstable representations due to random processes (e.g., random walks to generate context) and thus cannot generalize to multi-graph problems.
In this paper, we propose RECS, a novel, stable graph embedding algorithmic framework.
RECS learns graph representations using connection subgraphs by employing the analogy of graphs with electrical circuits.
It preserves both local and global connectivity patterns, and addresses the issue of high-degree nodes.
Further, it exploits the strength of weak ties and meta-data that have been neglected by baselines.
The experiments show that RECS outperforms state-of-the-art algorithms by up to 36.85% on multi-label classification problem.
Further, in contrast to baselines, RECS, being deterministic, is completely stable.
In this paper we propose right-angled Artin groups as a platform for secret sharing schemes based on the efficiency (linear time) of the word problem.
Inspired by previous work of Grigoriev-Shpilrain in the context of graphs, we define two new problems: Subgroup Isomorphism Problem and Group Homomorphism Problem.
Based on them, we also propose two new authentication schemes.
For right-angled Artin groups, the Group Homomorphism and Graph Homomorphism problems are equivalent, and the latter is known to be NP-complete.
For the Subgroup Isomorphism problem, we present results due to Bridson showing that there are right-angled Artin groups in which this problem is unsolvable.
Publishing articles in high-impact English journals is difficult for scholars around the world, especially for non-native English-speaking scholars (NNESs), most of whom struggle with proficiency in English.
In order to uncover the differences in English scientific writing between native English-speaking scholars (NESs) and NNESs, we collected a large-scale data set containing more than 150,000 full-text articles published in PLoS between 2006 and 2015.
We divided these articles into three groups according to the ethnic backgrounds of the first and corresponding authors, obtained by Ethnea, and examined the scientific writing styles in English from a two-fold perspective of linguistic complexity: (1) syntactic complexity, including measurements of sentence length and sentence complexity; and (2) lexical complexity, including measurements of lexical diversity, lexical density, and lexical sophistication.
The observations suggest marginal differences between the groups in syntactic and lexical complexity.
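Two of the lexical measures mentioned above can be sketched very simply: the type-token ratio (a diversity measure) and mean word length (a crude proxy for sophistication). This is a whitespace-tokenized toy, not the paper's full measurement pipeline.

```python
def lexical_measures(text):
    """Two simple lexical-complexity measures: type-token ratio
    (lexical diversity) and mean word length (crude sophistication)."""
    tokens = text.lower().split()
    ttr = len(set(tokens)) / len(tokens)          # distinct / total tokens
    mean_len = sum(len(t) for t in tokens) / len(tokens)
    return ttr, mean_len

ttr, mean_len = lexical_measures("the cell divides and the cell grows")
```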
Centrality is an important notion in complex networks; it could be used to characterize how influential a node or an edge is in the network.
It plays an important role in several other network analysis tools including community detection.
Even though there are a small number of axiomatic frameworks associated with this notion, the existing formalizations are not generic in nature.
In this paper we propose a generic axiomatic framework to capture all the intrinsic properties of a centrality measure (a.k.a. centrality index).
We analyze popular centrality measures along with other novel measures of centrality using this framework.
We observe that none of the centrality measures considered satisfies all the axioms.
Reconstruction of signals from compressively sensed measurements is an ill-posed problem.
In this paper, we leverage the recurrent generative model, RIDE, as an image prior for compressive image reconstruction.
Recurrent networks can model long-range dependencies in images and hence are suitable to handle global multiplexing in reconstruction from compressive imaging.
We perform MAP inference with RIDE using back-propagation to the inputs and the projected gradient method.
We also propose an entropy-thresholding-based approach that preserves image texture well.
Our approach shows superior reconstructions compared to recent global reconstruction approaches like D-AMP and TVAL3 on both simulated and real data.
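The projected-gradient step can be illustrated on a toy least-squares recovery. Here a simple box constraint stands in for the learned RIDE prior, and the measurement matrix is a hypothetical example, so this is only a sketch of the optimization template.

```python
import numpy as np

def projected_gradient(A, y, steps=200, lr=0.1):
    """Minimize ||y - A x||^2 by gradient descent on the data term,
    projecting each iterate onto the box [0, 1] (a stand-in for an
    image prior such as pixel-intensity bounds)."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x = x + lr * A.T @ (y - A @ x)   # gradient step on the data term
        x = np.clip(x, 0.0, 1.0)         # projection onto the constraint set
    return x

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy measurement matrix
x_true = np.array([0.3, 0.6])
x_hat = projected_gradient(A, A @ x_true)
```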
In the last decade, social media has evolved into one of the leading platforms to create, share, or exchange information; it is commonly used as a way for individuals to maintain social connections.
In this online digital world, people post texts or pictures to express their views socially and create user-user engagement through discussions and conversations.
Thus, social media has established itself as a bearer of signals relating to human behavior.
One can easily build a user-characteristic network by scraping someone's social media profiles.
In this paper, we investigate the potential of social media in characterizing and understanding predominant drunk texters from the perspective of their social, psychological and linguistic behavior as evident from the content generated by them.
Our research aims to analyze the behavior of drunk texters on social media and to contrast this with non-drunk texters.
We use Twitter social media to obtain the set of drunk texters and non-drunk texters and show that we can classify users into these two respective sets using various psycholinguistic features with an overall average accuracy of 96.78% with very high precision and recall.
Note that such an automatic classification can have far-reaching impact - (i) on health research related to addiction prevention and control, and (ii) in eliminating abusive and vulgar contents from Twitter, borne by the tweets of drunk texters.
This paper explores the potential of extreme learning machine based supervised classification algorithm for land cover classification.
In comparison to a back-propagation neural network, which requires setting several user-defined parameters and may converge to local minima, an extreme learning machine requires setting only one parameter and produces a unique solution.
ETM+ multispectral data set (England) was used to judge the suitability of extreme learning machine for remote sensing classifications.
A back-propagation neural network was used for comparison in terms of classification accuracy and computational cost.
Results suggest that the extreme learning machine performs as well as the back-propagation neural network in terms of classification accuracy on this data set.
The computational cost of the extreme learning machine is very small in comparison to the back-propagation neural network.
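The core of an extreme learning machine fits in a few lines: hidden weights are random and fixed, and only the output weights are solved, in closed form, by least squares, which is why a single parameter (the number of hidden nodes) suffices. The toy regression target below is illustrative, not the remote-sensing data set used in the paper.

```python
import numpy as np

def elm_fit(X, y, n_hidden, rng):
    """Extreme learning machine: random fixed hidden layer; output
    weights solved by least squares (Moore-Penrose pseudo-inverse),
    giving a unique minimum-norm solution with no iterative training."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                # random nonlinear feature map
    beta = np.linalg.pinv(H) @ y          # closed-form output weights
    return lambda Xn: np.tanh(Xn @ W + b) @ beta

rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 200)[:, None]
y = np.sin(3.0 * X[:, 0])                 # toy regression target
predict = elm_fit(X, y, 50, rng)
mse = np.mean((predict(X) - y) ** 2)
```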
End-to-end models for goal-oriented dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors.
We introduce an approach to learning representations of messages in dialogues by maximizing the likelihood of subsequent sentences and actions, which decouples the semantics of the dialogue utterance from its linguistic realization.
We then use these latent sentence representations for hierarchical language generation, planning and reinforcement learning.
Experiments show that our approach increases the end-task reward achieved by the model, improves the effectiveness of long-term planning using rollouts, and allows self-play reinforcement learning to improve decision making without diverging from human language.
Our hierarchical latent-variable model outperforms previous work both linguistically and strategically.
We design a new approach that allows robot learning of new activities from unlabeled human example videos.
Given videos of humans executing the same activity from a human's viewpoint (i.e., first-person videos), our objective is to make the robot learn the temporal structure of the activity as its future regression network, and learn to transfer such model for its own motor execution.
We present a new deep learning model: We extend the state-of-the-art convolutional object detection network for the representation/estimation of human hands in training videos, and newly introduce the concept of using a fully convolutional network to regress (i.e., predict) the intermediate scene representation corresponding to the future frame (e.g., 1-2 seconds later).
Combining these allows direct prediction of future locations of human hands and objects, which enables the robot to infer the motor control plan using our manipulation network.
We experimentally confirm that our approach makes learning of robot activities from unlabeled human interaction videos possible, and demonstrate that our robot is able to execute the learned collaborative activities in real-time directly based on its camera input.
In the Future Internet, it is possible to change elements of congestion control in order to eliminate the jitter and batch loss caused by current control mechanisms based on packet-loss events.
We investigate the fundamental problem of adjusting sending rates to achieve optimal utilization of highly variable bandwidth of a network path using accurate packet rate information.
This is done by continuously controlling the sending rate with a function of the measured packet rate at the receiver.
We propose the relative loss of packet rate between the sender and the receiver (Relative Rate Reduction, RRR) as a new accurate and continuous measure of congestion of a network path, replacing the erratically fluctuating packet loss.
We demonstrate that, by choosing various RRR-based feedback functions, the optimum is reached at an adjustable congestion level.
The proposed method guarantees fair bandwidth sharing of competitive flows.
Finally, we present testbed experiments to demonstrate the performance of the algorithm.
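The RRR measure itself is a one-line ratio, and a rate controller can react to it continuously. The feedback function below (proportional adjustment toward a target congestion level, with hypothetical `target` and `gain` values) is one illustrative choice among the "various RRR based feedback functions" the text mentions, not the paper's specific design.

```python
def rrr(sent_rate, received_rate):
    """Relative Rate Reduction: relative loss of packet rate between
    sender and receiver, a continuous congestion measure in [0, 1]."""
    return (sent_rate - received_rate) / sent_rate

def next_rate(rate, sent_rate, received_rate, target=0.05, gain=0.5):
    """Sketch of an RRR-based feedback function: shrink the sending rate
    when measured RRR exceeds the target level, grow it gently otherwise."""
    congestion = rrr(sent_rate, received_rate)
    return rate * (1.0 + gain * (target - congestion))

r = next_rate(100.0, 100.0, 90.0)  # RRR = 0.10 > target, so rate decreases
```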
Software quality in use comprises quality from the user's perspective.
It has gained its importance in e-government applications, mobile-based applications, embedded systems, and even business process development.
Users' decisions on software acquisitions are often ad hoc or based on preference, due to the difficulty of quantitatively measuring software quality in use.
But, why is quality-in-use measurement difficult?
Although there are many software quality models, to the authors' knowledge no works survey the challenges related to software quality-in-use measurement.
This article has two main contributions: 1) it identifies and explains major issues and challenges in measuring software quality in use in the context of the ISO SQuaRE series and related software quality models and highlights open research areas; and 2) it sheds light on a research direction that can be used to predict software quality in use.
In short, the quality-in-use measurement issues are related to the complexity of the current standard models and the limitations and incompleteness of the customized software quality models.
A sentiment analysis of software reviews is proposed to deal with these issues.
Static type errors are a common stumbling block for newcomers to typed functional languages.
We present a dynamic approach to explaining type errors by generating counterexample witness inputs that illustrate how an ill-typed program goes wrong.
First, given an ill-typed function, we symbolically execute the body to synthesize witness values that make the program go wrong.
We prove that our procedure synthesizes general witnesses in that if a witness is found, then for all inhabited input types, there exist values that can make the function go wrong.
Second, we show how to extend this procedure to produce a reduction graph that can be used to interactively visualize and debug witness executions.
Third, we evaluate the coverage of our approach on two data sets comprising over 4,500 ill-typed student programs.
Our technique is able to generate witnesses for around 85% of the programs, our reduction graph yields small counterexamples for over 80% of the witnesses, and a simple heuristic allows us to use witnesses to locate the source of type errors with around 70% accuracy.
Finally, we evaluate whether our witnesses help students understand and fix type errors, and find that students presented with our witnesses show a greater understanding of type errors than those presented with a standard error message.
While large-scale knowledge graphs provide vast amounts of structured facts about entities, a short textual description can often be useful to succinctly characterize an entity and its type.
Unfortunately, many knowledge graph entities lack such textual descriptions.
In this paper, we introduce a dynamic memory-based network that generates a short open vocabulary description of an entity by jointly leveraging induced fact embeddings as well as the dynamic context of the generated sequence of words.
We demonstrate the ability of our architecture to discern relevant information for more accurate generation of type descriptions by pitting the system against several strong baselines.
We consider the problem of extracting entropy by sparse transformations, namely functions with a small number of overall input-output dependencies.
In contrast to previous works, we seek extractors for essentially all the entropy without any assumption on the underlying distribution beyond a min-entropy requirement.
We give two simple constructions of sparse extractor families, which are collections of sparse functions such that for any distribution X on inputs of sufficiently high min-entropy, the output of most functions from the collection on a random input chosen from X is statistically close to uniform.
For strong extractor families (i.e., functions in the family do not take additional randomness) we give upper and lower bounds on the sparsity that are tight up to a constant factor for a wide range of min-entropies.
We then prove that for some min-entropies weak extractor families can achieve better sparsity.
We show how this construction can be used towards more efficient parallel transformation of (non-uniform) one-way functions into pseudorandom generators.
More generally, sparse extractor families can be used instead of pairwise independence in various randomized or nonuniform settings where preserving locality (i.e., parallelism) is of interest.
One of the most interesting features of Bayesian optimization for direct policy search is that it can leverage priors (e.g., from simulation or from previous tasks) to accelerate learning on a robot.
In this paper, we are interested in situations for which several priors exist but we do not know in advance which one fits best the current situation.
We tackle this problem by introducing a novel acquisition function, called Most Likely Expected Improvement (MLEI), that combines the likelihood of the priors and the expected improvement.
We evaluate this new acquisition function on a transfer learning task for a 5-DOF planar arm and on a possibly damaged, 6-legged robot that has to learn to walk on flat ground and on stairs, with priors corresponding to different stairs and different kinds of damages.
Our results show that MLEI effectively identifies and exploits the priors, even when there is no obvious match between the current situations and the priors.
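The idea of combining prior likelihood with expected improvement can be sketched with standard one-dimensional Gaussian formulas. The multiplicative combination below is an assumed illustration (as is the toy candidate point); the paper's exact MLEI definition may differ.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best):
    """Standard EI (maximization) at a point with posterior mean/std."""
    if sigma == 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * norm_cdf(z) + sigma * norm_pdf(z)

def mlei(mu, sigma, best, log_lik_of_prior):
    """Sketch of Most Likely Expected Improvement: weight EI by how
    likely the observations are under the candidate prior (assumed
    multiplicative combination for illustration)."""
    return math.exp(log_lik_of_prior) * expected_improvement(mu, sigma, best)

# Two hypothetical priors at the same candidate point; the second
# explains the observations better, so it scores higher.
scores = [mlei(1.2, 0.5, 1.0, -3.0), mlei(1.2, 0.5, 1.0, -1.0)]
```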
This paper provides a general result on controlling local Rademacher complexities, which elegantly relates complexities with a constraint on the expected norm to the corresponding ones with a constraint on the empirical norm.
This result is convenient to apply in real applications and could yield refined local Rademacher complexity bounds for function classes satisfying general entropy conditions.
We demonstrate the power of our complexity bounds by applying them to derive effective generalization error bounds.
While machine learning approaches to image restoration offer great promise, current methods risk training models fixated on performing well only for image corruption of a particular level of difficulty---such as a certain level of noise or blur.
First, we examine the weakness of conventional "fixated" models and demonstrate that training general models to handle arbitrary levels of corruption is indeed non-trivial.
Then, we propose an on-demand learning algorithm for training image restoration models with deep convolutional neural networks.
The main idea is to exploit a feedback mechanism to self-generate training instances where they are needed most, thereby learning models that can generalize across difficulty levels.
On four restoration tasks---image inpainting, pixel interpolation, image deblurring, and image denoising---and three diverse datasets, our approach consistently outperforms both the status quo training procedure and curriculum learning alternatives.
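The feedback idea admits a compact sketch: keep a per-difficulty error estimate and allocate each training batch in proportion to it, so the levels the model currently handles worst receive the most examples. This is an illustrative stand-in, not the paper's implementation; `bucket_errors` is a hypothetical per-difficulty validation error.

```python
def allocate_batch(bucket_errors, batch_size):
    """Allocate a training batch across difficulty buckets in proportion
    to the current per-bucket validation error, so harder (currently
    worse-performing) corruption levels receive more examples."""
    total = sum(bucket_errors)
    if total == 0:
        # no error signal yet: split the batch evenly
        return [batch_size // len(bucket_errors)] * len(bucket_errors)
    shares = [e / total for e in bucket_errors]
    counts = [int(round(s * batch_size)) for s in shares]
    # fix rounding so the counts sum exactly to batch_size
    counts[counts.index(max(counts))] += batch_size - sum(counts)
    return counts

# e.g. four noise levels, the hardest currently has the largest error
print(allocate_batch([0.1, 0.2, 0.3, 0.4], 100))  # -> [10, 20, 30, 40]
```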
One important factor determining the computational complexity of evaluating a probabilistic network is the cardinality of the state spaces of the nodes.
By varying the granularity of the state spaces, one can trade off accuracy in the result for computational efficiency.
We present an anytime procedure for approximate evaluation of probabilistic networks based on this idea.
On application to some simple networks, the procedure exhibits a smooth improvement in approximation quality as computation time increases.
This suggests that state-space abstraction is one more useful control parameter for designing real-time probabilistic reasoners.
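As a toy illustration of the granularity trade-off (not the procedure from the paper), a discrete node can be coarsened by merging runs of adjacent states; the expectation of a linear quantity survives the abstraction, while finer-grained queries lose precision as the state space shrinks.

```python
def coarsen(probs, values, factor):
    """Merge runs of `factor` adjacent states into one abstract state:
    the merged probability is the sum, and the merged value is the
    probability-weighted mean of the constituent states."""
    cp, cv = [], []
    for i in range(0, len(probs), factor):
        p = sum(probs[i:i + factor])
        v = sum(pi * vi for pi, vi in zip(probs[i:i + factor],
                                          values[i:i + factor]))
        cp.append(p)
        cv.append(v / p if p > 0 else 0.0)
    return cp, cv

probs = [0.1, 0.2, 0.3, 0.4]
values = [0.0, 1.0, 2.0, 3.0]
exact = sum(p * v for p, v in zip(probs, values))   # expectation on 4 states
cp, cv = coarsen(probs, values, 2)                  # abstract to 2 states
approx = sum(p * v for p, v in zip(cp, cv))         # same expectation
print(exact, approx)
```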
In this paper we present the RuSentRel corpus including analytical texts in the sphere of international relations.
For each document we annotated sentiments from the author to mentioned named entities, and sentiments of relations between mentioned entities.
In the current experiments, we considered the problem of extracting sentiment relations between entities at the document level as a three-class machine learning task.
We experimented with conventional machine-learning methods (Naive Bayes, SVM, Random Forest).
This paper explores the idea that the universe is a virtual reality created by information processing, and relates this strange idea to the findings of modern physics about the physical world.
The virtual reality concept is familiar to us from online worlds, but our world as a virtual reality is usually a subject for science fiction rather than science.
Yet logically the world could be an information simulation running on a multi-dimensional space-time screen.
Indeed, if the essence of the universe is information, matter, charge, energy and movement could be aspects of information, and the many conservation laws could be a single law of information conservation.
If the universe were a virtual reality, its creation at the big bang would no longer be paradoxical, as every virtual system must be booted up.
It is suggested that whether the world is an objective reality or a virtual reality is a matter for science to resolve.
Modern information science can suggest how core physical properties like space, time, light, matter and movement could derive from information processing.
Such an approach could reconcile relativity and quantum theories, with the former being how information processing creates space-time, and the latter how it creates energy and matter.
A central problem to understanding intelligence is the concept of generalisation.
Generalisation allows previously learnt structure to be exploited to solve tasks in novel situations differing in their particularities.
We take inspiration from neuroscience, specifically the hippocampal-entorhinal system known to be important for generalisation.
We propose that to generalise structural knowledge, the representations of the structure of the world, i.e. how entities in the world relate to each other, need to be separated from representations of the entities themselves.
We show, under these principles, artificial neural networks embedded with hierarchy and fast Hebbian memory, can learn the statistics of memories and generalise structural knowledge.
Spatial neuronal representations mirroring those found in the brain emerge, suggesting spatial cognition is an instance of more general organising principles.
We further unify many entorhinal cell types as basis functions for constructing transition graphs, and show these representations effectively utilise memories.
We experimentally support model assumptions, showing a preserved relationship between entorhinal grid and hippocampal place cells across environments.
A mobile robot deployed for remote inspection, surveying or rescue missions can fail for various reasons, which may be hardware or software related.
These failure scenarios necessitate manual recovery (self-rescue) of the robot from the environment.
It would bring unforeseen challenges to recover the mobile robot if the environment where it was deployed had hazardous or harmful conditions (e.g. ionizing radiations).
While it is not fully possible to predict all the failures in the robot, failures can be reduced by employing certain design/usage considerations.
A few example failure cases based on real experiences are presented in this short article, along with generic suggestions on overcoming the illustrated failure situations.
This article presents a novel general-purpose algorithm for large-scale optimization problems.
The algorithm achieves breakthrough speeds for very large-scale optimization on general-purpose laptops and embedded systems.
Applied to the Griewank function with up to 1 billion decision variables in double precision, it took only 64,485 seconds (~18 hours) to solve while consuming 7,630 MB (7.6 GB) of RAM on a single-threaded laptop CPU.
It shows that the algorithm is computationally and memory (space) linearly efficient, and can find the optimal or near-optimal solution in a fraction of the time and memory that many conventional algorithms require.
It is envisaged that this will open up new possibilities of real-time large-scale problems on personal laptops and embedded systems.
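For reference, the Griewank benchmark mentioned above is straightforward to evaluate; each call is linear in both time and memory in the number of decision variables, consistent with the scaling the article reports.

```python
import math

def griewank(x):
    """Griewank test function:
    f(x) = 1 + sum(x_i^2)/4000 - prod(cos(x_i / sqrt(i))),
    with the global minimum f(0) = 0."""
    s = sum(xi * xi for xi in x) / 4000.0
    p = 1.0
    for i, xi in enumerate(x, start=1):
        p *= math.cos(xi / math.sqrt(i))
    return 1.0 + s - p

print(griewank([0.0] * 10))  # -> 0.0 at the global optimum
```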
High level understanding of sequential visual input is important for safe and stable autonomy, especially in localization and object detection.
While traditional object classification and tracking approaches are specifically designed to handle variations in rotation and scale, current state-of-the-art approaches based on deep learning achieve better performance.
This paper focuses on developing a spatiotemporal model to handle videos containing moving objects with rotation and scale changes.
Built on models that combine Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to classify sequential data, this work investigates the effectiveness of incorporating attention modules in the CNN stage for video classification.
The superiority of the proposed spatiotemporal model is demonstrated on the Moving MNIST dataset augmented with rotation and scaling.
We propose a mechanism that incorporates network coding into TCP with only minor changes to the protocol stack, thereby allowing incremental deployment.
In our scheme, the source transmits random linear combinations of packets currently in the congestion window.
At the heart of our scheme is a new interpretation of ACKs - the sink acknowledges every degree of freedom (i.e., a linear combination that reveals one unit of new information) even if it does not reveal an original packet immediately.
Such ACKs enable a TCP-like sliding-window approach to network coding.
Our scheme has the nice property that packet losses are essentially masked from the congestion control algorithm.
Our algorithm therefore reacts to packet drops in a smooth manner, resulting in a novel and effective approach for congestion control over networks involving lossy links such as wireless links.
Our experiments show that our algorithm achieves higher throughput compared to TCP in the presence of lossy wireless links.
We also establish the soundness and fairness properties of our algorithm.
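The degree-of-freedom ACK rule can be sketched with binary coding coefficients (a toy model; the actual scheme uses random linear combinations over a larger field). The sink ACKs whenever a received combination is linearly independent of everything seen so far, even if no original packet can yet be decoded.

```python
class CodedSink:
    """Sink-side bookkeeping: ACK every received combination that adds
    a new degree of freedom, even before any original packet is
    decodable. Coefficient vectors are int bitmasks over GF(2)."""

    def __init__(self):
        self.basis = {}  # pivot bit position -> reduced coefficient vector

    def receive(self, coeffs):
        vec = coeffs
        while vec:
            hb = vec.bit_length() - 1        # highest set bit (pivot)
            if hb not in self.basis:
                self.basis[hb] = vec
                return True                  # new degree of freedom -> ACK
            vec ^= self.basis[hb]            # reduce against the basis
        return False                         # redundant combination -> no ACK

sink = CodedSink()
print(sink.receive(0b110))  # True: combination of packets 2 and 3 is new
print(sink.receive(0b011))  # True: independent, ACKed before decoding
print(sink.receive(0b101))  # False: equals 110 XOR 011, no new information
```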
We report the results of a project to control the use of end user computing tools for business critical applications in a banking environment.
Several workstreams were employed in order to bring about a cultural change within the bank towards the use of spreadsheets and other end-user tools, covering policy development, awareness and skills training, inventory monitoring, user licensing, key risk metrics and mitigation approaches.
The outcomes of these activities are discussed, and conclusions are drawn as to the need for appropriate organisational models to guide the use of these tools.
In this work we have proposed a geometric model that is employed to devise a scheme for identifying the hotspots and zones in a chip.
These spots or zones need to be guarded thermally to ensure the performance and reliability of the chip.
The model, namely the continuous unit sphere model, is presented under the assumption that the 3D region of the chip is uniform, thereby reflecting the possible locations of heat sources and the target observation points.
The experimental results for the continuous domain establish that a region which does not contain any heat sources may become hotter than the regions containing the thermal sources.
Thus a hotspot may appear away from the active sources, and placing heat sinks on the active thermal sources alone may not suffice to tackle thermal imbalance.
While power management techniques aid in obtaining a uniform power profile throughout the chip, we propose an algorithm based on minimum bipartite matching that moves the sources minimally (with minimum perturbation of the chip floor plan) toward cooler points (blocks), so that diffusion of heat from hotter points to cooler ones yields a uniform power profile.
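The matching step can be sketched as a minimum-cost assignment between hot sources and cooler blocks. Exhaustive search is used below purely for illustration, and the cost matrix is made up; a real implementation would use a polynomial-time routine such as Hungarian matching.

```python
from itertools import permutations

def min_cost_matching(cost):
    """Exhaustive minimum-weight perfect matching on a small square cost
    matrix: cost[i][j] is the displacement needed to move source i to
    cooler block j. Returns (total cost, assignment)."""
    n = len(cost)
    best = (float("inf"), None)
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best[0]:
            best = (total, list(perm))
    return best

# three hot sources, three cooler blocks (hypothetical distances)
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
print(min_cost_matching(cost))  # -> (5, [1, 0, 2]): minimal total movement
```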
One of the key differences between the learning mechanism of humans and Artificial Neural Networks (ANNs) is the ability of humans to learn one task at a time.
ANNs, on the other hand, can only learn multiple tasks simultaneously.
Any attempts at learning new tasks incrementally cause them to completely forget about previous tasks.
This lack of ability to learn incrementally, called Catastrophic Forgetting, is considered a major hurdle in building a true AI system.
In this paper, our goal is to isolate the truly effective existing ideas for incremental learning from those that only work under certain conditions.
To this end, we first thoroughly analyze the current state of the art (iCaRL) method for incremental learning and demonstrate that the good performance of the system is not because of the reasons presented in the existing literature.
We conclude that the success of iCaRL is primarily due to knowledge distillation, and recognize a key limitation of knowledge distillation, i.e., it often leads to bias in classifiers.
Finally, we propose a dynamic threshold moving algorithm that is able to successfully remove this bias.
We demonstrate the effectiveness of our algorithm on CIFAR100 and MNIST datasets showing near-optimal results.
Our implementation is available at https://github.com/Khurramjaved96/incremental-learning.
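The shape of such a bias correction can be pictured as a threshold-moving step: rescale per-class scores and re-normalize. This is an illustrative stand-in, not the paper's dynamic algorithm, and the `scale` vector here is made up rather than derived from the distillation statistics.

```python
def threshold_move(scores, scale):
    """Rescale per-class scores by a per-class factor and re-normalize,
    counteracting a classifier biased toward (say) old classes."""
    moved = [s * w for s, w in zip(scores, scale)]
    total = sum(moved)
    return [m / total for m in moved]

# biased softmax output favoring two old classes over a new one
scores = [0.5, 0.4, 0.1]
scale = [1.0, 1.0, 3.0]   # hypothetical boost for the under-scored new class
print(threshold_move(scores, scale))
```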
Policy gradient is an efficient technique for improving a policy in a reinforcement learning setting.
However, vanilla online variants are on-policy only and not able to take advantage of off-policy data.
In this paper we describe a new technique that combines policy gradient with off-policy Q-learning, drawing experience from a replay buffer.
This is motivated by making a connection between the fixed points of the regularized policy gradient algorithm and the Q-values.
This connection allows us to estimate the Q-values from the action preferences of the policy, to which we apply Q-learning updates.
We refer to the new technique as 'PGQL', for policy gradient and Q-learning.
We also establish an equivalency between action-value fitting techniques and actor-critic algorithms, showing that regularized policy gradient techniques can be interpreted as advantage function learning algorithms.
We conclude with some numerical examples that demonstrate improved data efficiency and stability of PGQL.
In particular, we tested PGQL on the full suite of Atari games and achieved performance exceeding that of both asynchronous advantage actor-critic (A3C) and Q-learning.
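The fixed-point relation behind PGQL can be written out directly: for an entropy-regularized policy, the Q-values are estimated from the action preferences as Q~(s, a) = alpha * (log pi(a|s) + H(pi(.|s))). A minimal sketch, assuming this form of the relation (constants and parameterization differ in the paper):

```python
import math

def q_from_policy(probs, alpha):
    """Estimate action values from the action preferences of a
    regularized policy: Q~(s,a) = alpha * (log pi(a|s) + H(pi(.|s))).
    The estimates behave like advantages: their expectation under the
    policy is zero."""
    ent = -sum(p * math.log(p) for p in probs)
    return [alpha * (math.log(p) + ent) for p in probs]

probs = [0.7, 0.2, 0.1]
q = q_from_policy(probs, alpha=1.0)
# the policy's preferred action receives the largest estimated Q-value
print(q)
```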
This work briefly surveys unconventional research in Russia from the end of the 19th until the beginning of the 21st century in areas related to the generation and detection of a 'high-penetrating' emission of non-biological origin.
The overview is based on open scientific and journalistic materials.
The unique character of this research and its history, originating from governmental programs of the USSR, is shown.
Relations to modern studies on biological effects of weak electromagnetic emission, several areas of bioinformatics and theories of physical vacuum are discussed.
In wind farms, wake interaction leads to losses in power capture and accelerated structural degradation when compared to freestanding turbines.
One method to reduce wake losses is by misaligning the rotor with the incoming flow using its yaw actuator, thereby laterally deflecting the wake away from downstream turbines.
However, this demands an accurate and computationally tractable model of the wind farm dynamics.
This problem calls for a closed-loop solution.
This tutorial paper addresses this gap by demonstrating the full closed-loop controller synthesis cycle using a steady-state surrogate model.
Furthermore, a novel, computationally efficient and modular communication interface is presented that enables researchers to straightforwardly test their control algorithms in large-eddy simulations.
High-fidelity simulations of a 9-turbine farm show a power production increase of up to 11% using the proposed closed-loop controller compared to traditional, greedy wind farm operation.
Behavior Trees (BTs) have become a popular framework for designing controllers of autonomous agents in the computer game and in the robotics industry.
One of the key advantages of BTs lies in their modularity, where independent modules can be composed to create more complex ones.
In the classical formulation of BTs, modules can be composed using one of the three operators: Sequence, Fallback, and Parallel.
The Parallel operator is rarely used, despite its strong potential compared to other control architectures such as Finite State Machines.
This is because concurrent actions may lead to unexpected problems similar to those experienced in concurrent programming.
In this paper, we introduce Concurrent BTs (CBTs) as a generalization of BTs in which we introduce the notions of progress and resource usage.
We show how CBTs allow safe concurrent executions of actions and we analyze the approach from a mathematical standpoint.
To illustrate the use of CBTs, we provide a set of use cases in robotics scenarios.
This paper is concerned with the effect of overlay network topology on the performance of live streaming peer-to-peer systems.
The paper focuses on the evaluation of topologies which are aware of the delays experienced between different peers on the network.
Metrics are defined which assess the topologies in terms of delay, bandwidth usage and resilience to peer drop-out.
Several topology creation algorithms are tested and the metrics are measured in a simple simulation testbed.
This gives an assessment of the type of gains which might be expected from locality awareness in peer-to-peer networks.
We present the first parser for UCCA, a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation.
UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structures), discontinuous structures and non-terminal nodes corresponding to complex semantic units.
To our knowledge, the conjunction of these formal properties is not supported by any existing parser.
Our transition-based parser, which uses a novel transition set and features based on bidirectional LSTMs, has value not just for UCCA parsing: its ability to handle more general graph structures can inform the development of parsers for other semantic DAG structures, and in languages that frequently use discontinuous structures.
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance.
Several recent papers extend the baseline to depend on both the state and action and suggest that this significantly reduces variance and improves sample efficiency without introducing bias into the gradient estimates.
To better understand this development, we decompose the variance of the policy gradient estimator and numerically show that learned state-action-dependent baselines do not in fact reduce variance over a state-dependent baseline in commonly tested benchmark domains.
We confirm this unexpected result by reviewing the open-source code accompanying these prior papers, and show that subtle implementation decisions cause deviations from the methods presented in the papers and explain the source of the previously observed empirical gains.
Furthermore, the variance decomposition highlights areas for improvement, which we demonstrate by illustrating a simple change to the typical value function parameterization that can significantly improve performance.
Identifying the relations that exist between words (or entities) is important for various natural language processing tasks such as, relational search, noun-modifier classification and analogy detection.
A popular approach to representing the relations between a pair of words is to extract the patterns in which the words co-occur from a corpus, and assign each word pair a vector of pattern frequencies.
Despite the simplicity of this approach, it suffers from data sparseness, information scalability and linguistic creativity as the model is unable to handle previously unseen word pairs in a corpus.
In contrast, a compositional approach for representing relations between words overcomes these issues by using the attributes of each individual word to indirectly compose a representation for the common relations that hold between the two words.
This study aims to compare different operations for creating relation representations from word-level representations.
We investigate the performance of the compositional methods by measuring the relational similarities using several benchmark datasets for word analogy.
Moreover, we evaluate the different relation representations in a knowledge base completion task.
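The compositional operators under comparison can be stated concretely: given the two word vectors, a relation representation is built by vector difference, concatenation, or elementwise multiplication. This is a sketch with toy 3-dimensional vectors; the study's full operator set may differ.

```python
def compose(a, b, op):
    """Compose a relation representation for the word pair (a, b) from
    the two word vectors, using one of the standard operators compared
    in compositional relation studies."""
    if op == "diff":      # offset method, as in king - man word analogies
        return [x - y for x, y in zip(a, b)]
    if op == "concat":    # keep both words' attributes side by side
        return a + b
    if op == "mult":      # elementwise product
        return [x * y for x, y in zip(a, b)]
    raise ValueError(op)

a, b = [1.0, 2.0, 3.0], [0.5, 1.0, 1.5]
print(compose(a, b, "diff"))         # -> [0.5, 1.0, 1.5]
print(len(compose(a, b, "concat")))  # -> 6
```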
We present a language independent, unsupervised method for building word embeddings using morphological expansion of text.
Our model handles the problem of data sparsity and yields improved word embeddings by relying on training word embeddings on artificially generated sentences.
We evaluate our method using small sized training sets on eleven test sets for the word similarity task across seven languages.
Further, for English, we evaluated the impacts of our approach using a large training set on three standard test sets.
Our method improved results across all languages.
Efforts are underway at UT Austin to build autonomous robot systems that address the challenges of long-term deployments in office environments and of the more prescribed domestic service tasks of the RoboCup@Home competition.
We discuss the contrasts and synergies of these efforts, highlighting how our work to build a RoboCup@Home Domestic Standard Platform League entry led us to identify an integrated software architecture that could support both projects.
Further, naturalistic deployments of our office robot platform as part of the Building-Wide Intelligence project have led us to identify and research new problems in a traditional laboratory setting.
Dou Shou Qi is a game in which two players control a number of pieces, each of them aiming to move one of their pieces onto a given square.
We implemented an engine for analyzing the game.
Moreover, we created a series of endgame tablebases containing all configurations with up to four pieces.
These tablebases are the first steps towards theoretically solving the game.
Finally, we constructed decision trees based on the endgame tablebases.
In this note we report on some interesting patterns.
As deep neural networks (DNNs) have been integrated into critical systems, several methods to attack these systems have been developed.
These adversarial attacks make imperceptible modifications to an image that fool DNN classifiers.
We present an adaptive JPEG encoder which defends against many of these attacks.
Experimentally, we show that our method produces images with high visual quality while greatly reducing the potency of state-of-the-art attacks.
Our algorithm requires only a modest increase in encoding time, and produces a compressed image which can be decompressed by an off-the-shelf JPEG decoder and classified by an unmodified classifier.
A conditional Generative Adversarial Network allows for generating samples conditioned on certain external information.
Being able to recover latent and conditional vectors from a conditional GAN can be potentially valuable in various applications, ranging from image manipulation for entertainment purposes to diagnosis of the neural networks for security purposes.
In this work, we show that it is possible to recover both latent and conditional vectors from generated images given the generator of a conditional generative adversarial network.
Such a recovery is not trivial due to the often multi-layered non-linearity of deep neural networks.
Furthermore, the effect of such recovery applied to real natural images is investigated.
We discovered that there exists a gap between the recovery performance on generated and real images, which we believe comes from the difference between generated data distribution and real data distribution.
Experiments are conducted to evaluate the recovered conditional vectors and the reconstructed images from these recovered vectors quantitatively and qualitatively, showing promising results.
Query-expansion via pseudo-relevance feedback is a popular method of overcoming the problem of vocabulary mismatch and of increasing average retrieval effectiveness.
In this paper, we develop a new method that estimates a query topic model from a set of pseudo-relevant documents using a new language modelling framework.
We assume that documents are generated via a mixture of multivariate Polya distributions, and we show that by identifying the topical terms in each document, we can appropriately select terms that are likely to belong to the query topic model.
The results of experiments on several TREC collections show that the new approach compares favourably to current state-of-the-art expansion methods.
Analyzing job hopping behavior is important for the understanding of job preference and career progression of working individuals.
When analyzed at the workforce population level, job hop analysis helps to gain insights of talent flow and organization competition.
Traditionally, surveys are conducted on job seekers and employers to study job behavior.
While surveys are good at getting direct user input to specially designed questions, they are often not scalable and timely enough to cope with fast-changing job landscape.
In this paper, we present a data science approach to analyze job hops performed by about 490,000 working professionals located in a city using their publicly shared profiles.
We develop several metrics to measure how much work experience is needed to take up a job and how recent/established the job is, and then examine how these metrics correlate with the propensity of hopping.
We also study how job hop behavior is related to job promotion/demotion.
Finally, we perform network analyses at the job and organization levels in order to derive insights on talent flow as well as job and organizational competitiveness.
This paper presents iterative Sequential Action Control (iSAC), a receding horizon approach for control of nonlinear systems.
The iSAC method has a closed-form open-loop solution, which is iteratively updated between time steps by introducing constant control values applied for short duration.
Application of a contractive constraint on the cost is shown to lead to closed-loop asymptotic stability under mild assumptions.
The effect of asymptotically decaying disturbances on system trajectories is also examined.
To demonstrate the applicability of iSAC to a variety of systems and conditions, we employ five different systems, including a 13-dimensional quaternion-based quadrotor.
Each system is tested in different scenarios, ranging from feasible and infeasible trajectory tracking, to setpoint stabilization, with or without the presence of external disturbances.
Finally, limitations of this work are discussed.
We consider a compressed sensing problem in which both the measurement and the sparsifying systems are assumed to be frames (not necessarily tight) of the underlying Hilbert space of signals, which may be finite or infinite dimensional.
The main result gives explicit bounds on the number of measurements in order to achieve stable recovery, which depends on the mutual coherence of the two systems.
As a simple corollary, we prove the efficiency of nonuniform sampling strategies in cases when the two systems are not incoherent, but only asymptotically incoherent, as with the recovery of wavelet coefficients from Fourier samples.
This general framework finds applications to inverse problems in partial differential equations, where the standard assumptions of compressed sensing are often not satisfied.
Several examples are discussed, with a special focus on electrical impedance tomography.
Social health and emotional wellness is a matter of concern in today's urban world.
Being part of a metropolis affects mental health through the influence of increased stressors and factors such as an overcrowded and polluted environment, high levels of violence, and reduced social support.
It is important to realize that only healthy citizens can constitute together a smart city.
In this paper, we present a fuzzy-based approach for analyzing the well being of a person.
We track the general day-to-day activities of a person and analyze their performance.
To do so, we divide the factors affecting a person's wellness into three components: physical, productive, and social.
Using these parameters, we output a coefficient for the overall well being of a person.
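The aggregation step can be sketched as follows. The paper uses fuzzy inference; a crisp weighted average is shown here only as a stand-in to illustrate the shape of the computation, and the weights are made up.

```python
def wellbeing(physical, productive, social, weights=(0.4, 0.3, 0.3)):
    """Aggregate the three component scores (each in [0, 1]) into a
    single well-being coefficient. A crisp weighted average stands in
    for the fuzzy inference the paper actually employs; the weights
    are hypothetical."""
    scores = (physical, productive, social)
    return sum(w * s for w, s in zip(weights, scores))

print(wellbeing(0.8, 0.6, 0.9))  # -> 0.77
```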
The visual observation and tracking of cells and other micrometer-sized objects has many different biomedical applications.
The automation of those tasks based on computer methods helps in the evaluation of such measurements.
In this work, we present a general purpose algorithm that excels at evaluating deterministic behavior of micrometer-sized objects.
Our concrete application is the tracking of fast moving objects over large distances along deterministic trajectories in a microscopic video.
Thereby, we are able to determine characteristic properties of the objects.
For this purpose, we use a set of basic algorithms, including blob recognition, feature-based shape recognition and a graph algorithm, and combine them in a novel way.
An evaluation of the algorithm's performance shows a high accuracy in the recognition of objects as well as of complete trajectories.
Moreover, a direct comparison to a similar algorithm shows superior recognition rates.
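The linking stage can be caricatured as greedy nearest-neighbour association across frames. This is a toy stand-in: the paper's pipeline combines blob recognition, feature-based shape recognition and a graph algorithm, none of which are reproduced here.

```python
def link_tracks(frames, max_dist):
    """Greedy nearest-neighbour linking of per-frame detections into
    trajectories: each detection is appended to the closest open track
    within `max_dist`, otherwise it starts a new track."""
    tracks = []
    for detections in frames:
        for (x, y) in detections:
            best, best_d = None, max_dist
            for tr in tracks:
                tx, ty = tr[-1]
                d = ((x - tx) ** 2 + (y - ty) ** 2) ** 0.5
                if d < best_d:
                    best, best_d = tr, d
            if best is not None:
                best.append((x, y))
            else:
                tracks.append([(x, y)])
    return tracks

# two objects moving rightwards across three frames
frames = [[(0, 0), (10, 0)], [(1, 0), (10, 1)], [(2, 0), (10, 2)]]
print(link_tracks(frames, max_dist=3))
```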
In iterative supervised learning algorithms it is common to reach a point in the search where no further induction seems to be possible with the available data.
If the search is continued beyond this point, the risk of overfitting increases significantly.
Following the recent developments in inductive semantic stochastic methods, this paper studies the feasibility of using information gathered from the semantic neighborhood to decide when to stop the search.
Two semantic stopping criteria are proposed and experimentally assessed in Geometric Semantic Genetic Programming (GSGP) and in the Semantic Learning Machine (SLM) algorithm (the equivalent algorithm for neural networks).
The experiments are performed on real-world high-dimensional regression datasets.
The results show that the proposed semantic stopping criteria are able to detect stopping points that result in a competitive generalization for both GSGP and SLM.
This approach also yields computationally efficient algorithms as it allows the evolution of neural networks in less than 3 seconds on average, and of GP trees in at most 10 seconds.
The usage of the proposed semantic stopping criteria in conjunction with the computation of optimal mutation/learning steps also results in small trees and neural networks.
Convolutional rectifier networks, i.e. convolutional neural networks with rectified linear activation and max or average pooling, are the cornerstone of modern deep learning.
However, despite their wide use and success, our theoretical understanding of the expressive properties that drive these networks is partial at best.
On the other hand, we have a much firmer grasp of these issues in the world of arithmetic circuits.
Specifically, it is known that convolutional arithmetic circuits possess the property of "complete depth efficiency", meaning that besides a negligible set, all functions that can be implemented by a deep network of polynomial size require exponential size in order to be implemented (or even approximated) by a shallow network.
In this paper we describe a construction based on generalized tensor decompositions, that transforms convolutional arithmetic circuits into convolutional rectifier networks.
We then use mathematical tools available from the world of arithmetic circuits to prove new results.
First, we show that convolutional rectifier networks are universal with max pooling but not with average pooling.
Second, and more importantly, we show that depth efficiency is weaker with convolutional rectifier networks than it is with convolutional arithmetic circuits.
This leads us to believe that developing effective methods for training convolutional arithmetic circuits, thereby fulfilling their expressive potential, may give rise to a deep learning architecture that is provably superior to convolutional rectifier networks but has so far been overlooked by practitioners.
An adversarial example is an example that has been adjusted to produce the wrong label when presented to a system at test time.
If adversarial examples existed that could fool a detector, they could be used to (for example) wreak havoc on roads populated with smart vehicles.
Recently, we described our difficulties creating physical adversarial stop signs that fool a detector.
More recently, Evtimov et al. produced a physical adversarial stop sign that fools a proxy model of a detector.
In this paper, we show that these physical adversarial stop signs do not fool two standard detectors (YOLO and Faster RCNN) in standard configuration.
Evtimov et al.'s construction relies on a crop of the image to the stop sign; this crop is then resized and presented to a classifier.
We argue that the cropping and resizing procedure largely eliminates the effects of rescaling and of view angle.
Whether an adversarial attack is robust under rescaling and change of view direction remains moot.
We argue that attacking a classifier is very different from attacking a detector, and that the structure of detectors - which must search for their own bounding box, and which cannot estimate that box very accurately - likely makes it difficult to make adversarial patterns.
Finally, an adversarial pattern on a physical object that could fool a detector would have to be adversarial in the face of a wide family of parametric distortions (scale; view angle; box shift inside the detector; illumination; and so on).
Such a pattern would be of great theoretical and practical interest.
There is currently no evidence that such patterns exist.
The residual neural network (ResNet) is a popular deep network architecture which has the ability to obtain high-accuracy results on several image processing problems.
In order to analyze the behavior and structure of ResNet, recent work has focused on establishing connections between ResNets and continuous-time optimal control problems.
In this work, we show that the post-activation ResNet is related to an optimal control problem with differential inclusions, and provide continuous-time stability results for the differential inclusion associated with ResNet.
Motivated by the stability conditions, we show that alterations of either the architecture or the optimization problem can generate variants of ResNet which improve the theoretical stability bounds.
In addition, we establish stability bounds for the full (discrete) network associated with two variants of ResNet, in particular, bounds on the growth of the features and a measure of the sensitivity of the features with respect to perturbations.
These results also help to show the relationship between the depth, regularization, and stability of the feature space.
Computational experiments on the proposed variants show that the accuracy of ResNet is preserved and that the accuracy seems to be monotone with respect to the depth and various corruptions.
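The continuous-time view described above can be illustrated with a minimal residual update; this is a generic sketch of the post-activation residual recursion, not the authors' code, and the step size `h` is an assumed parameter we introduce for illustration:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def resnet_forward(x, weights, h=1.0):
    """Forward pass through residual blocks: x_{t+1} = x_t + h * relu(W_t x_t).

    With h = 1 this is the standard post-activation residual update;
    viewing h as a step size makes the recursion a forward-Euler
    discretization of the continuous-time system dx/dt = relu(W(t) x),
    which is the connection exploited by the stability analysis.
    """
    for W in weights:
        x = x + h * relu(W @ x)
    return x

rng = np.random.default_rng(0)
weights = [0.1 * rng.standard_normal((4, 4)) for _ in range(10)]
x0 = rng.standard_normal(4)
out = resnet_forward(x0, weights)
```

Scaling the weights (here by 0.1) keeps the discrete dynamics stable, which is exactly the kind of growth bound the stability results formalize.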
Acute kidney injury (AKI) is a common and serious postoperative complication associated with morbidity and mortality.
The majority of existing perioperative AKI risk score prediction models are limited in their generalizability and do not fully utilize the physiological intraoperative time-series data.
Thus, there is a need for intelligent, accurate, and robust systems that can leverage information from large-scale data to predict a patient's risk of developing postoperative AKI.
A retrospective single-center cohort of 2,911 adult patients who underwent surgery at the University of Florida Health has been used for this study.
We used machine learning and statistical analysis techniques to develop perioperative models to predict the risk of AKI (risk during the first 3 days, 7 days, and until the discharge day) before and after the surgery.
In particular, we examined the improvement in risk prediction by incorporating three intraoperative physiologic time series data, i.e., mean arterial blood pressure, minimum alveolar concentration, and heart rate.
For an individual patient, the preoperative model produces a probabilistic AKI risk score, which will be enriched by integrating intraoperative statistical features through a machine learning stacking approach inside a random forest classifier.
We compared the performance of our models using the area under the receiver operating characteristic curve (AUROC), accuracy, and net reclassification improvement (NRI).
The predictive performance of the proposed model is better than the preoperative data only model.
For AKI-7day outcome: The AUC was 0.86 (accuracy was 0.78) in the proposed model, while the preoperative AUC was 0.84 (accuracy 0.76).
Furthermore, with the integration of intraoperative features, we were able to classify patients who were misclassified in the preoperative model.
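The stacking step can be sketched as follows; the data, feature layout, and model choices below are entirely synthetic placeholders (scikit-learn stands in for whatever implementation the authors used):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500
X_pre = rng.standard_normal((n, 5))    # synthetic preoperative features
X_intra = rng.standard_normal((n, 9))  # e.g. summary stats of MAP, MAC, HR
y = (X_pre[:, 0] + X_intra[:, 0] + 0.5 * rng.standard_normal(n) > 0).astype(int)

# Stage 1: a preoperative model produces a probabilistic AKI risk score.
# (For brevity the score is computed in-sample; in practice it would
# come from held-out folds.)
pre_model = LogisticRegression().fit(X_pre, y)
risk_score = pre_model.predict_proba(X_pre)[:, [1]]

# Stage 2: enrich the score by stacking it with intraoperative
# statistical features inside a random forest classifier.
stacked = np.hstack([risk_score, X_intra])
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(stacked, y)
p_post = rf.predict_proba(stacked)[:, 1]
```

The stage-2 model can only refine, never discard, the preoperative score, which is why integrating intraoperative features reclassifies patients the preoperative model got wrong.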
Smart cities are a current research trend that fundamentally aims to improve city management in pursuit of a better quality of life.
This paper proposes a new, complementary autonomic approach to smart city management.
It is argued that smart city management systems with autonomic characteristics will improve and facilitate management functionalities in general.
A framework is also presented as a use case, considering specific application scenarios such as smart-health, smart-grid, smart-environment, and smart-streets.
Artificial intelligence and machine learning have been major research interests in computer science for the better part of the last few decades.
However, both AI and ML have recently grown into media frenzies, pressuring companies and researchers to claim they use these technologies.
As ML continues to percolate into daily life, we, as computer scientists and machine learning researchers, are responsible for ensuring we clearly convey the extent of our work and the humanity of our models.
Readying ML for mass adoption requires a rigorous standard for model interpretability, a deep consideration for human bias in data, and a transparent understanding of a model's societal effects.
Solar forecasting accuracy is affected by weather conditions, and weather-aware forecasting models are expected to improve performance.
However, relying only on meteorological weather categorization to classify different forecasting tasks may be neither feasible nor reliable.
In this paper, an unsupervised clustering-based (UC-based) solar forecasting methodology is developed for short-term (1-hour-ahead) global horizontal irradiance (GHI) forecasting.
This methodology consists of three parts: GHI time series unsupervised clustering, pattern recognition, and UC-based forecasting.
The daily GHI time series is first clustered by an Optimized Cross-validated ClUsteRing (OCCUR) method, which determines the optimal number of clusters and best clustering results.
Then, support vector machine pattern recognition (SVM-PR) is adopted to recognize the category of a certain day using the first few hours' data in the forecasting stage.
GHI forecasts are generated by the most suitable models in different clusters, which are built by a two-layer Machine learning based Multi-Model (M3) forecasting framework.
The developed UC-based methodology is validated using one year of data with six solar features.
Numerical results show that (i) UC-based models outperform non-UC (all-in-one) models with the same M3 architecture by approximately 20%; (ii) M3-based models also outperform the single-algorithm machine learning (SAML) models by approximately 20%.
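A minimal sketch of the cluster-then-dispatch idea, with plain k-means standing in for OCCUR and nearest-center matching standing in for SVM pattern recognition (both substitutions are ours, and the data are synthetic):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means over daily profiles, a stand-in for OCCUR."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Synthetic daily GHI profiles: 24 hourly values, varying amplitude.
rng = np.random.default_rng(1)
days = np.sin(np.linspace(0, np.pi, 24))[None] * rng.uniform(0.3, 1.0, (200, 1))
centers, labels = kmeans(days, k=3)

# Forecasting stage: recognize a new day's cluster from its first few
# hours, then dispatch to that cluster's dedicated forecasting model.
new_day_prefix = days[0, :6]
cluster = int(np.argmin(((centers[:, :6] - new_day_prefix) ** 2).sum(-1)))
```

Each cluster would then train its own model (the M3 framework in the paper), so a forecast for a new day only ever uses the model built for days of its kind.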
We focus on adversarial patrolling games on arbitrary graphs, where the Defender can control a mobile resource, the targets are alarmed by an alarm system, and the Attacker can observe the actions of the mobile resource of the Defender and perform different attacks exploiting multiple resources.
This scenario can be modeled as a zero-sum extensive-form game in which each player can play multiple times.
The game tree is exponentially large both in the size of the graph and in the number of attacking resources.
We show that when the number of the Attacker's resources is unconstrained, the problem of computing the equilibrium path is NP-hard, while when the number of resources is fixed, the equilibrium path can be computed in polynomial time.
We provide a dynamic-programming algorithm that, given the number of the Attacker's resources, computes the equilibrium path requiring poly-time in the size of the graph and exponential time in the number of the resources.
Furthermore, since in real-world scenarios it is implausible that the Defender knows the number of attacking resources, we study the robustness of the Defender's strategy when she makes a wrong guess about that number.
We show that even the error of just a single resource can lead to an arbitrary inefficiency, when the inefficiency is defined as the ratio of the Defender's utilities obtained with a wrong guess and a correct guess.
However, a more suitable definition of inefficiency is given by the difference of the Defender's utilities: this way, we observe that the higher the error in the estimation, the higher the loss for the Defender.
Then, we investigate the performance of online algorithms when no information about the Attacker's resources is available.
Finally, we resort to randomized online algorithms showing that we can obtain a competitive factor that is twice better than the one that can be achieved by any deterministic online algorithm.
Robotic systems are complex and critical: they are inherently hybrid, combining both hardware and software; they typically exhibit both cyber-physical attributes and autonomous capabilities; and are required to be at least safe and often ethical.
While for many engineered systems testing, either through real deployment or via simulation, is deemed sufficient, the uniquely challenging elements of robotic systems, together with their crucial dependence on sophisticated software control and decision-making, require a stronger form of verification.
The increasing deployment of robotic systems in safety-critical scenarios exacerbates this still further and leads us towards the use of formal methods to ensure the correctness of, and provide sufficient evidence for the certification of, robotic systems.
There have been many approaches that have used some variety of formal specification or formal verification in autonomous robotics, but there is no resource that collates this activity into one place.
This paper systematically surveys the state-of-the art in specification formalisms and tools for verifying robotic systems.
Specifically, it describes the challenges arising from autonomy and software architectures, avoiding low-level hardware control, and subsequently identifies approaches for the specification and verification of robotic systems, while avoiding more general approaches.
This paper presents a multi-contact approach to generalized humanoid fall mitigation planning that unifies inertial shaping, protective stepping, and hand contact strategies.
The planner optimizes both the contact sequence and the robot state trajectories.
A high-level tree search is conducted to iteratively grow a contact transition tree.
At each edge of the tree, trajectory optimization is used to calculate robot stabilization trajectories that produce the desired contact transition while minimizing kinetic energy.
Also, at each node of the tree, the optimizer attempts to find a self-motion (inertial shaping movement) to eliminate kinetic energy.
This paper also presents an efficient and effective method to generate initial seeds to facilitate trajectory optimization.
Experiments demonstrate that our proposed algorithm can generate complex stabilization strategies for a simulated robot under varying initial pushes and environment shapes.
Feedback mechanism based algorithms are frequently used to solve network optimization problems.
These schemes involve users and network exchanging information (e.g. requests for bandwidth allocation and pricing) to achieve convergence towards an optimal solution.
However, in practice these algorithms cannot guarantee that messages are delivered to their destination when network congestion occurs.
Congestion often results in packet drops and hence information loss, which in turn can prevent the algorithm from converging.
To prevent this failure, we propose a least-squares (LS) estimation algorithm to recover the missing information when packets are dropped from the network.
Simulation results covering several scenarios demonstrate that LS estimation can restore convergence for feedback-mechanism-based algorithms.
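The recovery idea can be sketched as least-squares linear prediction from the successfully delivered history; this is a generic illustration of LS estimation for a dropped value, not the paper's exact formulation:

```python
import numpy as np

def ls_recover(history, order=2):
    """Estimate a dropped feedback value by least-squares linear prediction.

    Fits coefficients a so that x[t] ~ a1*x[t-1] + ... + a_order*x[t-order]
    from the successfully delivered history, then extrapolates one step.
    """
    x = np.asarray(history, dtype=float)
    A = np.array([x[t - order:t][::-1] for t in range(order, len(x))])
    b = x[order:]
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(coeffs @ x[-order:][::-1])

# A smoothly converging feedback signal whose latest packet was dropped:
received = [2.0, 1.5, 1.25, 1.125, 1.0625, 1.03125]
estimate = ls_recover(received)  # substitute for the missing message
```

Because feedback iterates typically converge smoothly, a short linear predictor recovers the missing message closely enough for the outer algorithm to keep converging.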
We propose a method to perform audio event detection under the common constraint that only limited training data are available.
In training a deep learning system to perform audio event detection, two practical problems arise.
Firstly, most datasets are "weakly labelled", providing only a list of the events present in each recording without any temporal information for training.
Secondly, deep neural networks need a very large amount of labelled training data to achieve good quality performance, yet in practice it is difficult to collect enough samples for most classes of interest.
In this paper, we propose a data-efficient training of a stacked convolutional and recurrent neural network.
This neural network is trained in a multiple-instance learning setting, for which we introduce a new loss function that leads to improved training compared to the usual approaches for weakly supervised learning.
We successfully test our approach on two low-resource datasets that lack temporal labels.
Objects may appear at arbitrary scales in perspective images of a scene, posing a challenge for recognition systems that process images at a fixed resolution.
We propose a depth-aware gating module that adaptively selects the pooling field size in a convolutional network architecture according to the object scale (inversely proportional to the depth) so that small details are preserved for distant objects while larger receptive fields are used for those nearby.
The depth gating signal is provided by stereo disparity or estimated directly from monocular input.
We integrate this depth-aware gating into a recurrent convolutional neural network to perform semantic segmentation.
Our recurrent module iteratively refines the segmentation results, leveraging the depth and semantic predictions from the previous iterations.
Through extensive experiments on four popular large-scale RGB-D datasets, we demonstrate this approach achieves competitive semantic segmentation performance with a model which is substantially more compact.
We carry out extensive analysis of this architecture including variants that operate on monocular RGB but use depth as side-information during training, unsupervised gating as a generic attentional mechanism, and multi-resolution gating.
We find that gated pooling for joint semantic segmentation and depth yields state-of-the-art results for quantitative monocular depth estimation.
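The scale-selection idea can be illustrated with a toy gate that maps depth to a pooling field size; this stand-in uses a fixed heuristic rather than the learned gating module of the paper:

```python
import numpy as np

def depth_gated_pool_size(depth, sizes=(1, 3, 5, 7)):
    """Select a pooling field size per pixel from depth.

    Far pixels (large depth, small apparent object scale) get small
    pooling windows so small details are preserved; near pixels get
    large receptive fields.
    """
    d = np.asarray(depth, dtype=float)
    # Normalize depth to [0, 1], then invert: near -> 1, far -> 0.
    scale = 1.0 - (d - d.min()) / (np.ptp(d) + 1e-8)
    idx = np.clip((scale * len(sizes)).astype(int), 0, len(sizes) - 1)
    return np.take(sizes, idx)

depth = np.array([[1.0, 2.0],
                  [5.0, 10.0]])
pool = depth_gated_pool_size(depth)  # near pixels -> 7, farthest -> 1
```

In the actual architecture the gate is soft and differentiable (and can be driven by stereo disparity or a monocular depth estimate), but the inverse relation between depth and receptive field is the same.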
State-of-the-art branch and bound algorithms for mixed integer programming make use of special methods for making branching decisions.
Strategies that have gained prominence include modern variants of so-called strong branching (Applegate, et al.,1995) and reliability branching (Achterberg, Koch and Martin, 2005; Hendel, 2015), which select variables for branching by solving associated linear programs and exploit pseudo-costs (Benichou et al., 1971).
We suggest new branching criteria and propose alternative branching approaches called narrow gauge and analytical branching.
The perspective underlying our approaches is to focus on prioritization of child nodes to examine fewer candidate variables at the current node of the B&B tree, balanced with procedures to extrapolate the implications of choosing these candidates by generating a small-depth look-ahead tree.
Our procedures can also be used in rules to select among open tree nodes (those whose child nodes have not yet been generated).
We incorporate pre- and post-winnowing procedures to progressively isolate preferred branching candidates, and employ derivative (created) variables whose branches are able to explore the solution space more deeply.
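For context, the pseudo-cost product score that modern reliability branching builds on can be sketched as follows; the narrow gauge and analytical branching criteria proposed in the paper are not reproduced here:

```python
def pseudocost_score(frac, pc_down, pc_up, eps=1e-6):
    """Classic pseudo-cost product score for a branching candidate.

    frac: fractional part of the variable's LP value.
    pc_down, pc_up: average objective gain per unit change observed on
    past down/up branches (the pseudo-costs of Benichou et al., 1971).
    The product rule with an eps floor follows Achterberg et al.
    """
    gain_down = pc_down * frac
    gain_up = pc_up * (1.0 - frac)
    return max(gain_down, eps) * max(gain_up, eps)

# Pick the candidate variable with the best score.
candidates = {"x1": (0.5, 2.0, 3.0), "x2": (0.1, 8.0, 1.0)}
best = max(candidates, key=lambda v: pseudocost_score(*candidates[v]))
```

The proposed approaches differ by examining fewer candidates at the current node and instead extrapolating each candidate's implications through a small-depth look-ahead tree.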
We present a proof procedure for univariate real polynomial problems in Isabelle/HOL.
The core mathematics of our procedure is based on univariate cylindrical algebraic decomposition.
We follow the approach of untrusted certificates, separating solving from verifying: efficient external tools perform expensive real algebraic computations, producing evidence that is formally checked within Isabelle's logic.
This allows us to exploit highly-tuned computer algebra systems like Mathematica to guide our procedure without impacting the correctness of its results.
We present experiments demonstrating the efficacy of this approach, in many cases yielding orders of magnitude improvements over previous methods.
In this report, we present our findings from benchmarking experiments for information extraction on historical handwritten marriage records Esposalles from IEHHR - ICDAR 2017 robust reading competition.
The information extraction is modeled as semantic labeling of the sequence across two sets of labels.
This can be achieved by sequentially or jointly applying handwritten text recognition (HTR) and named entity recognition (NER).
We deploy a pipeline approach where we first apply state-of-the-art HTR and use its output as input for NER.
We show that, given the low-resource setup and the simple structure of the records, high HTR performance ensures high overall performance.
We explore various configurations of conditional random fields and neural networks to benchmark NER on the noisy HTR output.
The best model on 10-fold cross-validation as well as blind test data uses n-gram features with bidirectional long short-term memory.
This paper presents a novel vision for OLAP by fundamentally redefining several of the pillars on which OLAP has been based for the last 20 years.
We redefine OLAP queries, in order to move to higher degrees of abstraction than roll-ups and drill-downs, and we propose a set of novel intentional OLAP operators, namely describe, assess, explain, predict, and suggest, which express the user's need for results.
We fundamentally redefine what a query answer is, and escape from the constraint that the answer is a set of tuples; on the contrary, we complement the set of tuples with models (typically, but not exclusively, results of data mining algorithms over the involved data) that concisely represent the internal structure or correlations of the data.
Due to the diverse nature of the involved models, we come up (for the first time ever, to the best of our knowledge) with a unifying framework for them, that places its pillars on the extension of each data cell of a cube with information about the models that pertain to it -- practically converting the small parts that build up the models to data that annotate each cell.
We exploit this data-to-model mapping to provide highlights of the data, by isolating data and models that maximize the delivery of new information to the user.
We introduce a novel interestingness measure for assessing the surprise that a new query result brings to the user, with respect to the information contained in previous results the user has seen.
The individual parts of our proposal are integrated in a new data model for OLAP, which we call the Intentional Analytics Model.
We complement our contribution with a list of significant open problems for the community to address.
Recently, there have been increasing demands to construct compact deep architectures to remove unnecessary redundancy and to improve the inference speed.
While many recent works focus on reducing redundancy by eliminating unneeded weight parameters, a single deep architecture cannot serve multiple devices with different resource budgets.
When a new device or circumstantial condition requires a new deep architecture, it is necessary to construct and train a new network from scratch.
In this work, we propose a novel deep learning framework, called a nested sparse network, which exploits an n-in-1-type nested structure in a neural network.
A nested sparse network consists of multiple levels of networks with a different sparsity ratio associated with each level, and higher level networks share parameters with lower level networks to enable stable nested learning.
The proposed framework realizes a resource-aware versatile architecture as the same network can meet diverse resource requirements.
Moreover, the proposed nested network can learn different forms of knowledge in its internal networks at different levels, enabling multiple tasks using a single network, such as coarse-to-fine hierarchical classification.
In order to train the proposed nested sparse network, we propose efficient weight connection learning and channel and layer scheduling strategies.
We evaluate our network in multiple tasks, including adaptive deep compression, knowledge distillation, and learning class hierarchy, and demonstrate that nested sparse networks perform competitively, but more efficiently, compared to existing methods.
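The n-in-1 parameter sharing can be illustrated with nested magnitude-based masks over a single weight matrix; this toy sketch is our illustration of the nested structure, not the proposed weight-connection learning or scheduling strategies:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))  # full (highest-level) weight matrix

# Nested binary masks: each level keeps the largest-magnitude weights,
# and every lower (sparser) level's support is contained in the next
# level's, so all levels share parameters of the same network.
ratios = [0.25, 0.5, 1.0]        # fraction of weights kept per level
order = np.argsort(-np.abs(W), axis=None)
masks = []
for r in ratios:
    m = np.zeros(W.size, dtype=bool)
    m[order[: int(r * W.size)]] = True
    masks.append(m.reshape(W.shape))

def forward(x, level):
    """Run the sub-network at a given sparsity level on the same W."""
    return np.maximum((masks[level] * W) @ x, 0.0)
```

A device with a tight budget evaluates `forward(x, 0)` while a powerful one uses `forward(x, 2)`, all from one stored parameter set, which is the resource-aware versatility the framework targets.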
We present a quasi-experiment to investigate whether, and to what extent, sleep deprivation impacts the performance of novice software developers using the agile practice of test-first development (TFD).
We recruited 45 undergraduates and asked them to tackle a programming task.
Among the participants, 23 agreed to stay awake the night before carrying out the task, while 22 slept normally.
We analyzed the quality (i.e., the functional correctness) of the implementations delivered by the participants in both groups, their engagement in writing source code (i.e., the amount of activities performed in the IDE while tackling the programming task) and ability to apply TFD (i.e., the extent to which a participant can use this practice).
By comparing the two groups of participants, we found that a single night of sleep deprivation leads to a reduction of 50% in the quality of the implementations.
There is also evidence that the developers' engagement and their ability to apply TFD are negatively impacted.
Our results also show that sleep-deprived developers make more fixes to syntactic mistakes in the source code.
We conclude that sleep deprivation has possibly disruptive effects on software development activities.
The results open opportunities for improving developers' performance by integrating the study of sleep with other psycho-physiological factors in which the software engineering research community has recently taken an interest.
High-order parametric models that include terms for feature interactions are applied to various data mining tasks, where ground truth depends on interactions of features.
However, with sparse data, the high-dimensional parameters for feature interactions often face three issues: expensive computation, difficulty in parameter estimation, and lack of structure.
Previous work has proposed approaches that can partially resolve the three issues.
In particular, models with factorized parameters (e.g., Factorization Machines) and sparse learning algorithms (e.g., FTRL-Proximal) can tackle the first two issues but fail to address the third.
To address unstructured parameters, constraints or complicated regularization terms are applied so that hierarchical structures can be imposed.
However, these methods make the optimization problem more challenging.
In this work, we propose Strongly Hierarchical Factorization Machines and ANOVA kernel regression where all the three issues can be addressed without making the optimization problem more difficult.
Experimental results show the proposed models significantly outperform the state-of-the-art in two data mining tasks: cold-start user response time prediction and stock volatility prediction.
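For context, the factorized-parameter baseline the proposed models extend is the second-order Factorization Machine, whose pairwise term can be computed in O(nk) time via Rendle's identity (this is the standard FM, not the Strongly Hierarchical variant):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order Factorization Machine prediction.

    y(x) = w0 + <w, x> + sum_{i<j} <V_i, V_j> x_i x_j,
    where the pairwise sum is computed with the identity
    sum_{i<j} <V_i,V_j> x_i x_j
        = 1/2 * sum_f [ (sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2 ].
    """
    linear = w0 + w @ x
    s = V.T @ x                  # shape (k,)
    s2 = (V ** 2).T @ (x ** 2)   # shape (k,)
    return float(linear + 0.5 * np.sum(s ** 2 - s2))

x = np.array([1.0, 0.0, 2.0])
w0, w = 0.5, np.array([0.1, 0.2, 0.3])
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])      # factorized interaction parameters
y = fm_predict(x, w0, w, V)     # 0.5 + 0.7 linear + 2.0 pairwise = 3.2
```

Factorizing the interaction matrix as V V^T is what tackles the computation and estimation issues; the paper's contribution is additionally imposing a strong hierarchy on which interactions may be active.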
Information Technology (IT) significantly impacts the environment throughout its life cycle.
Most enterprises have not paid enough attention to this until recently.
IT's environmental impact can be significantly reduced by behavioral changes, as well as technology changes.
Given the relative energy and materials inefficiency of most IT infrastructures today, many green IT initiatives can be easily tackled at no incremental cost.
The Green Grid, a non-profit trade organization of IT professionals, is one such initiative, formed to address the issues of power and cooling in data centers world-wide.
The Green Grid seeks to define best practices for optimizing the efficient consumption of power at the IT equipment and facility levels, as well as the manner in which cooling is delivered at these levels, thereby helping to reduce environmental hazards and advance the new era of green computing.
In this paper, we review various analytical aspects of The Green Grid's impact on data centers and present the resulting green facts.
Sparse Subspace Clustering (SSC) has been used extensively for subspace identification tasks due to its theoretical guarantees and relative ease of implementation.
However, SSC has quadratic computation and memory requirements with respect to the number of input data points.
This burden has prohibited SSC's use for all but the smallest datasets.
To overcome this, we propose a new method, k-SSC, that screens out a large number of data points, reducing SSC to linear memory and computational requirements.
We provide theoretical analysis for the bounds of success for k-SSC.
Our experiments show that k-SSC exceeds theoretical expectations and outperforms existing SSC approximations by maintaining the classification performance of SSC.
Furthermore, in the spirit of reproducible research, we have publicly released the source code for k-SSC.
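The screening idea can be sketched as restricting each point's candidate dictionary to its k nearest neighbours, so the per-point sparse regression of SSC touches only k columns instead of all N-1; this is a generic illustration in the spirit of k-SSC, not its exact screening rule:

```python
import numpy as np

def knn_screen(X, k):
    """Return, for each point, the indices of its k nearest neighbours.

    These indices define the reduced dictionary for that point's sparse
    self-expression in SSC, shrinking the regression from N-1 columns
    to k columns per point.
    """
    d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # a point may not select itself
    return np.argsort(d2, axis=1)[:, :k]

rng = np.random.default_rng(0)
# Two well-separated 1-D subspaces embedded in 3-D.
A = rng.standard_normal((20, 1)) @ np.array([[1.0, 0.0, 0.0]])
B = rng.standard_normal((20, 1)) @ np.array([[0.0, 1.0, 0.0]]) + 10
X = np.vstack([A, B])
nbrs = knn_screen(X, k=5)  # neighbours stay within each subspace
```

When the subspaces are separated, screening discards exactly the cross-subspace candidates SSC would have zeroed out anyway, which is why the classification performance of SSC is maintained.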
This paper describes a set of neural network architectures, called Prediction Neural Networks Set (PNNS), based on both fully-connected and convolutional neural networks, for intra image prediction.
The choice of neural network for predicting a given image block depends on the block size, hence does not need to be signalled to the decoder.
It is shown that, while fully-connected neural networks give good performance for small block sizes, convolutional neural networks provide better predictions in large blocks with complex textures.
Thanks to the use of masks of random sizes during training, the neural networks of PNNS well adapt to the available context that may vary, depending on the position of the image block to be predicted.
When integrating PNNS into an H.265 codec, PSNR-rate performance gains ranging from 1.46% to 5.20% are obtained.
These gains are on average 0.99% larger than those of prior neural network based methods.
Unlike the H.265 intra prediction modes, which are each specialized in predicting a specific texture, the proposed PNNS can model a large set of complex textures.
During the life span of large software projects, developers often apply the same code changes to different code locations in slight variations.
Since the application of these changes to all locations is time-consuming and error-prone, tools exist that learn change patterns from input examples, search for possible pattern applications, and generate corresponding recommendations.
In many cases, the generated recommendations are syntactically or semantically wrong due to code movements in the input examples.
Thus, they are of low accuracy and developers cannot directly copy them into their projects without adjustments.
We present the Accurate REcommendation System (ARES) that achieves a higher accuracy than other tools because its algorithms take care of code movements when creating patterns and recommendations.
On average, the recommendations by ARES have an accuracy of 96% with respect to code changes that developers have manually performed in commits of source code archives.
At the same time ARES achieves precision and recall values that are on par with other tools.
E-learning is efficient, task-relevant, just-in-time learning that has grown out of the learning requirements of a new, dynamically changing world.
The term Semantic Web covers the steps to create a new WWW architecture that augments the content with formal semantics enabling better possibilities of navigation through the cyberspace and its contents.
In this paper, we present the Semantic Web-Based model for our e-learning system taking into account the learning environment at Saudi Arabian universities.
The proposed system is mainly based on ontology-based descriptions of content, context and structure of the learning materials.
It further provides flexible and personalized access to these learning materials.
The framework has been validated by an interview based qualitative method.
The assumption that training and testing samples are generated from the same distribution does not always hold for real-world machine-learning applications.
The procedure of tackling this discrepancy between the training (source) and testing (target) domains is known as domain adaptation.
We propose an unsupervised version of domain adaptation that considers the presence of only unlabelled data in the target domain.
Our approach centers on finding correspondences between samples of each domain.
The correspondences are obtained by treating the source and target samples as graphs and using a convex criterion to match them.
The criteria used are first-order and second-order similarities between the graphs as well as a class-based regularization.
We have also developed a computationally efficient routine for the convex optimization, thus allowing the proposed method to be used widely.
To verify the effectiveness of the proposed method, computer simulations were conducted on synthetic, image classification and sentiment classification datasets.
Results validated that the proposed local sample-to-sample matching method outperforms traditional moment-matching methods and is competitive with current local domain-adaptation methods.
The novel "Volume-Enclosing Surface exTraction Algorithm" (VESTA) generates triangular isosurfaces from computed tomography volumetric images and/or three-dimensional (3D) simulation data.
Here, we present various benchmarks for GPU-based code implementations of both VESTA and the current state-of-the-art Marching Cubes Algorithm (MCA).
One major result of this study is that VESTA runs significantly faster than the MCA.
Geometric model fitting is a fundamental task in computer graphics and computer vision.
However, most geometric model fitting methods are unable to fit an arbitrary geometric model (e.g. a surface with holes) to incomplete data, because the similarity metrics used in these methods cannot measure the rigid partial similarity between arbitrary models.
This paper hence proposes a novel rigid geometric similarity metric, which is able to measure both the full similarity and the partial similarity between arbitrary geometric models.
The proposed metric enables us to perform partial procedural geometric model fitting (PPGMF).
The task of PPGMF is to search a procedural geometric model space for the model rigidly similar to a query consisting of a non-complete point set.
Models in the procedural model space are generated according to a set of parametric modeling rules.
A typical query is a point cloud.
PPGMF is very useful as it can be used to fit arbitrary geometric models to non-complete (incomplete, over-complete or hybrid-complete) point cloud data.
For example, most laser scanning data is non-complete due to occlusion.
Our PPGMF method uses Markov chain Monte Carlo technique to optimize the proposed similarity metric over the model space.
To accelerate the optimization process, the method also employs a novel coarse-to-fine model dividing strategy to reject dissimilar models in advance.
Our method has been demonstrated on a variety of geometric models and non-complete data.
Experimental results show that the PPGMF method based on the proposed metric is able to fit non-complete data, while methods based on other metrics are not.
It is also shown that our method can be accelerated by several times via early rejection.
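The optimization loop can be sketched as random-walk Metropolis over model parameters, with the similarity metric playing the role of the log-score; the one-parameter quadratic score below is purely illustrative:

```python
import numpy as np

def metropolis_hastings(score, x0, steps=2000, sigma=0.5, seed=0):
    """Maximize a similarity score over a parametric model space by
    random-walk Metropolis. `score` stands in for the rigid geometric
    similarity metric evaluated against the query point cloud.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    best = x
    for _ in range(steps):
        cand = x + sigma * rng.standard_normal(x.shape)
        # Accept with probability exp(score(cand) - score(x)).
        if np.log(rng.uniform()) < score(cand) - score(x):
            x = cand
            if score(x) > score(best):
                best = x
    return best

# Toy model space: one shape parameter, true optimum at 2.0.
best = metropolis_hastings(lambda p: -(p[0] - 2.0) ** 2, np.array([-3.0]))
```

The coarse-to-fine acceleration would slot in before the acceptance test: a cheap coarse similarity check rejects clearly dissimilar candidates so the full metric is evaluated only for promising ones.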
Secure spontaneous authentication between devices worn at arbitrary location on the same body is a challenging, yet unsolved problem.
We propose BANDANA, the first-ever implicit secure device-to-device authentication scheme for devices worn on the same body.
Our approach leverages instantaneous variation in acceleration patterns from gait sequences to extract always-fresh secure secrets.
It enables secure spontaneous pairing of devices worn on the same body or interacted with.
The method is robust against noise in sensor readings and active attackers.
We demonstrate the robustness of BANDANA on two gait datasets and discuss the discriminability of intra- and inter-body cases, robustness to statistical bias, as well as possible attack scenarios.
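The secret-extraction idea can be illustrated by quantizing shared gait acceleration into bits; this toy quantizer (one bit per window, thresholded at the global mean) is our simplification, not BANDANA's actual quantization scheme:

```python
import numpy as np

def gait_fingerprint(acc, win=8):
    """Quantize an acceleration sequence into bits: one bit per window,
    set when the window's mean exceeds the sequence mean. Two devices
    sensing the same body movement produce nearly identical bits,
    while another body yields an uncorrelated fingerprint.
    """
    acc = np.asarray(acc, dtype=float)
    n = len(acc) // win
    means = acc[: n * win].reshape(n, win).mean(axis=1)
    return (means > acc.mean()).astype(int)

rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 256)
body = np.sin(t)  # shared body movement during walking
dev_a = gait_fingerprint(body + 0.1 * rng.standard_normal(256))
dev_b = gait_fingerprint(body + 0.1 * rng.standard_normal(256))
agreement = (dev_a == dev_b).mean()  # high for same-body devices
```

Because each walking sequence yields a fresh fingerprint, the extracted secrets are always-fresh, and residual bit disagreements between devices are reconciled with fuzzy-commitment-style error correction before pairing.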
Along with the emergence and popularity of social communications on the Internet, topic discovery from short texts becomes fundamental to many applications that require semantic understanding of textual content.
As a rising research field, short text topic modeling presents a new and complementary algorithmic methodology that supplements regular topic modeling, especially targeting the limited word co-occurrence information in short texts.
This paper presents STTM, the first comprehensive open-source Java package that integrates state-of-the-art short text topic modeling algorithms, benchmark datasets, and abundant functions for model inference and evaluation.
The package is designed to facilitate the development of new methods in this research field and to make comparisons between new approaches and existing ones accessible.
STTM is open-sourced at https://github.com/qiang2100/STTM.
Based on an in-depth analysis of the essence and features of vague phenomena, this paper establishes the axiomatic foundation of membership degree theory for vague phenomena and presents an axiomatic system that governs membership degrees and their interconnections.
On this basis, the concept of a vague partition is introduced; furthermore, the concept of a fuzzy set, introduced by Zadeh in 1965, is redefined in terms of vague partitions from an axiomatic perspective.
The thesis defended in this paper is that the relationship among vague attribute values should be the starting point to recognize and model vague phenomena from a quantitative view.
We consider a variation of Construction A of lattices from linear codes based on two classes of number fields, totally real and CM Galois number fields.
We propose a generic construction with explicit generator and Gram matrices, then focus on modular and unimodular lattices, obtained in the particular cases of totally real, respectively, imaginary, quadratic fields.
Our motivation comes from coding theory, thus some relevant properties of modular lattices, such as the minimal norm, theta series, kissing number, and secrecy gain, are analyzed.
Interesting lattices are exhibited.
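To make the basic construction concrete, the sketch below applies plain Construction A over the integers (without the number-field generalization studied in the paper) to the [8,4,4] extended Hamming code: the lattice is L = {x in Z^n : x mod 2 in C}, and rescaling by 1/sqrt(2) yields the unimodular E8 lattice, whose Gram matrix, determinant, and minimal norm can be checked directly.

```python
import itertools
import numpy as np

# Generator matrix of the [8,4,4] extended Hamming code (self-dual)
G = np.array([[1, 0, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 0, 1],
              [0, 0, 0, 1, 1, 1, 1, 0]])
n, k = 8, 4

# Lattice basis: lifted codeword generators plus 2*e_i for the remaining
# coordinates, rescaled by 1/sqrt(2)
B = np.vstack([G, 2 * np.eye(n, dtype=int)[k:]]) / np.sqrt(2)
gram = B @ B.T
det = np.linalg.det(gram)                  # 1 for a unimodular lattice

# Minimal norm of the unscaled lattice is min(4, minimum code weight):
# vectors 2*e_i have norm 4, and a codeword c gives a vector of norm wt(c)
wmin = min(int(np.sum(np.dot(m, G) % 2))
           for m in itertools.product([0, 1], repeat=k) if any(m))
min_norm = min(wmin, 4) / 2                # rescaling by 1/sqrt(2) halves norms
```

The determinant of the Gram matrix comes out to 1 and the minimal norm to 2, the defining values of E8.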
Current approaches in video forecasting attempt to generate videos directly in pixel space using Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs).
However, since these approaches try to model all the structure and scene dynamics at once, in unconstrained settings they often generate uninterpretable results.
Our insight is to model the forecasting problem at a higher level of abstraction.
Specifically, we exploit human pose detectors as a free source of supervision and break the video forecasting problem into two discrete steps.
First we explicitly model the high level structure of active objects in the scene---humans---and use a VAE to model the possible future movements of humans in the pose space.
We then use the future poses generated as conditional information to a GAN to predict the future frames of the video in pixel space.
By using the structured space of pose as an intermediate representation, we sidestep the problems that GANs have in generating video pixels directly.
We show through quantitative and qualitative evaluation that our method outperforms state-of-the-art methods for video prediction.
Extreme learning machine (ELM) is an extremely fast learning method with powerful performance on pattern recognition tasks, as demonstrated by numerous studies and engineering applications.
However, its good generalization ability relies on large numbers of hidden neurons, which hinders real-time response at test time.
In this paper, we propose a new method, named "constrained extreme learning machines" (CELMs), to randomly select hidden neurons based on the sample distribution.
Compared to completely random selection of hidden nodes in ELM, the CELMs randomly select hidden nodes from the constrained vector space containing some basic combinations of original sample vectors.
The experimental results show that the CELMs have better generalization ability than traditional ELM, SVM and some other related methods.
Additionally, the CELMs have a similar fast learning speed as ELM.
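A minimal sketch of the idea, assuming the published CELM variant that draws each hidden weight as a normalized difference between two samples from distinct classes, with the bias placing the separating hyperplane through their midpoint; the data and hyperparameters below are illustrative.

```python
import numpy as np

def train_elm(X, y, W, b):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden layer
    return np.linalg.pinv(H) @ y             # least-squares output weights

def predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.sign(H @ beta)

rng = np.random.default_rng(0)
# two separable Gaussian blobs, labels +1 / -1
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(4, 1, (100, 5))])
y = np.concatenate([np.ones(100), -np.ones(100)])
L = 50   # number of hidden nodes

# plain ELM: completely random hidden weights
W_elm = rng.normal(size=(5, L))
b_elm = rng.normal(size=L)
acc_elm = np.mean(predict(X, W_elm, b_elm,
                          train_elm(X, y, W_elm, b_elm)) == y)

# CELM-style: each hidden weight is a normalized difference between a
# class +1 sample and a class -1 sample, biased through their midpoint
i = rng.integers(0, 100, L)
j = rng.integers(100, 200, L)
D = X[i] - X[j]
W_celm = (D / np.linalg.norm(D, axis=1, keepdims=True)).T
b_celm = -np.sum(W_celm.T * (X[i] + X[j]) / 2, axis=1)
acc_celm = np.mean(predict(X, W_celm, b_celm,
                           train_elm(X, y, W_celm, b_celm)) == y)
```

On data this easy both variants fit the training set; the constrained weights matter on harder problems, where random hyperplanes often miss the region where the classes meet.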
Connection calculi allow for very compact implementations of goal-directed proof search.
We give an overview of our work related to connection tableaux calculi: First, we show optimised functional implementations of clausal and nonclausal proof search, including a consistent Skolemisation procedure for machine learning.
Then, we show two guidance methods based on machine learning, namely reordering of proof steps with Naive Bayesian probabilities, and expansion of a proof search tree with Monte Carlo Tree Search.
Finally, we give a translation of connection proofs to LK, enabling proof certification and automatic proof search in interactive theorem provers.
The major aim of this survey is to identify the strengths and weaknesses of a representative set of Data-Mining and Integration (DMI) query languages.
We describe a set of properties of DMI-related languages that we use for a systematic evaluation of these languages.
In addition, we introduce a scoring system that we use to quantify our opinion on how well a DMI-related language supports a property.
The languages surveyed in this paper include: DMQL, MineSQL, MSQL, M2MQL, dmFSQL, OLEDB for DM, MINE RULE, and Oracle Data Mining.
This survey may help researchers to propose a DMI language that goes beyond the state of the art, or it may help practitioners to select an existing language that fits a given purpose well.
Cooperative multi-agent planning (MAP) is a relatively recent research field that combines technologies, algorithms and techniques developed by the Artificial Intelligence Planning and Multi-Agent Systems communities.
While planning has been generally treated as a single-agent task, MAP generalizes this concept by considering multiple intelligent agents that work cooperatively to develop a course of action that satisfies the goals of the group.
This paper reviews the most relevant approaches to MAP, putting the focus on the solvers that took part in the 2015 Competition of Distributed and Multi-Agent Planning, and classifies them according to their key features and relative performance.
Computer graphics not only generates synthetic images and ground truth, but also offers the possibility of constructing virtual worlds in which: (i) an agent can perceive, navigate, and take actions guided by AI algorithms, (ii) properties of the worlds can be modified (e.g., material and reflectance), (iii) physical simulations can be performed, and (iv) algorithms can be learnt and evaluated.
But creating realistic virtual worlds is not easy.
The game industry, however, has spent a lot of effort creating 3D worlds with which a player can interact.
Researchers can therefore build on these resources to create virtual worlds, provided we can access and modify the internal data structures of the games.
To enable this, we created UnrealCV (http://unrealcv.github.io), an open-source plugin for the popular game engine Unreal Engine 4 (UE4).
We show two applications: (i) a proof of concept image dataset, and (ii) linking Caffe with the virtual world to test deep network algorithms.
Three-dimensional (3D) biomedical image sets are often acquired with in-plane pixel spacings that are far less than the out-of-plane spacings between images.
The resultant anisotropy, which can be detrimental in many applications, can be decreased using image interpolation.
Optical flow and/or other registration-based interpolators have proven useful in such interpolation roles in the past.
When the acquired images comprise signals that describe the flow velocity of fluids, additional information is available to guide the interpolation process.
In this paper, we present an optical-flow based framework for image interpolation that also minimizes resultant divergence in the interpolated data.
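A divergence penalty of the kind described can be sketched as follows; the discretization and weighting are illustrative, since the abstract does not specify the exact functional being minimized.

```python
import numpy as np

def divergence_penalty(u, v, dx=1.0, dy=1.0):
    # Mean squared divergence of a 2-D flow field (u, v); a term like this
    # can be added to an interpolation objective to keep the interpolated
    # velocity field near-solenoidal (incompressible flow has zero divergence)
    du_dx = np.gradient(u, dx, axis=1)   # axis 1 is the x (column) direction
    dv_dy = np.gradient(v, dy, axis=0)   # axis 0 is the y (row) direction
    div = du_dx + dv_dy
    return np.mean(div ** 2)

y, x = np.mgrid[0:32, 0:32].astype(float)
# a rigid rotation u = -y, v = x is divergence-free
penalty_rot = divergence_penalty(-y, x)
# a radial expansion u = x, v = y has divergence 2 everywhere
penalty_exp = divergence_penalty(x, y)
```

The rotational field incurs no penalty, while the expanding field is penalized everywhere, which is exactly the behaviour one wants when interpolating incompressible flow data.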
In this paper we present a new and simple language-independent method for word-alignment based on the use of external sources of bilingual information such as machine translation systems.
We show that the few parameters of the aligner can be trained on a very small corpus, which leads to results comparable to those obtained by the state-of-the-art tool GIZA++ in terms of precision.
Regarding other metrics, such as alignment error rate or F-measure, the parametric aligner, when trained on a very small gold-standard (450 pairs of sentences), provides results comparable to those produced by GIZA++ when trained on an in-domain corpus of around 10,000 pairs of sentences.
Furthermore, the results obtained indicate that the training is domain-independent, which enables the use of the trained aligner 'on the fly' on any new pair of sentences.
Conventionally, the Selective Harmonic Elimination (SHE) method in 2-level inverters finds the best switching angles to drive the first voltage harmonic to the reference level while simultaneously eliminating the other harmonics.
Considering an Induction Motor (IM) as the inverter load, and wide DC bus voltage variations, the inverter must operate in both the over-modulation and linear modulation regions.
The main objective of the modified SHE is to reduce harmonic torques by finding the best switching angles.
In this paper, the optimization is based on phasor equations from which the harmonic torques are calculated.
The procedure is as follows: first, the ratio of like torque harmonics is estimated; second, using that estimate, the ratio of the voltage harmonics that generate homogeneous torques is calculated.
The estimation and calculation of these ratios use the motor parameters, the mechanical speed of the rotor, the applied frequency, and the concept of slip.
The advantage of this approach is highlighted when mechanical load and DC bus voltage variations are taken into consideration.
Simulation results are presented under a wide range of working conditions in an induction motor to demonstrate the effectiveness of the proposed method.
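For orientation, in a two-level inverter with quarter-wave-symmetric bipolar switching and N switching angles 0 < α1 < ... < αN < π/2, one standard form of the output-voltage Fourier coefficients is as follows (given here for context; the modified SHE of this paper replaces the usual elimination constraints with torque-harmonic objectives):

```latex
b_n \;=\; \frac{4V_{dc}}{n\pi}\left(1 + 2\sum_{k=1}^{N}(-1)^{k}\cos(n\alpha_k)\right),
\qquad n \ \text{odd}.
```

Conventional SHE solves the nonlinear system b_1 = V_ref and b_5 = b_7 = ... = 0 for the angles α_k.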
Research on optical TEMPEST has moved forward since 2002 when the first pair of papers on the subject emerged independently and from widely separated locations in the world within a week of each other.
Since that time, vulnerabilities have evolved along with systems, and several new threat vectors have consequently appeared.
Although the supply chain ecosystem of Ethernet has reduced the vulnerability of billions of devices through use of standardised PHY solutions, other recent trends including the Internet of Things (IoT) in both industrial settings and the general population, High Frequency Trading (HFT) in the financial sector, the European General Data Protection Regulation (GDPR), and inexpensive drones have made it relevant again for consideration in the design of new products for privacy.
One of the general principles of security is that vulnerabilities, once fixed, sometimes do not stay that way.
Analyzing videos of human actions involves understanding the temporal relationships among video frames.
State-of-the-art action recognition approaches rely on traditional optical flow estimation methods to pre-compute motion information for CNNs.
Such a two-stage approach is computationally expensive, storage demanding, and not end-to-end trainable.
In this paper, we present a novel CNN architecture that implicitly captures motion information between adjacent frames.
We name our approach hidden two-stream CNNs because it only takes raw video frames as input and directly predicts action classes without explicitly computing optical flow.
Our end-to-end approach is 10x faster than its two-stage baseline.
Experimental results on four challenging action recognition datasets: UCF101, HMDB51, THUMOS14 and ActivityNet v1.2 show that our approach significantly outperforms the previous best real-time approaches.
Context: Pre-publication peer review of scientific articles is considered a key element of the research process in software engineering, yet it is often perceived as not to work fully well.
Objective: We aim at understanding the perceptions of and attitudes towards peer review of authors and reviewers at one of software engineering's most prestigious venues, the International Conference on Software Engineering (ICSE).
Method: We invited 932 ICSE 2014/15/16 authors and reviewers to participate in a survey with 10 closed and 9 open questions.
Results: We present a multitude of results, such as: Respondents perceive only one third of all reviews to be good, yet one third as useless or misleading; they propose double-blind or zero-blind reviewing regimes for improvement; they would like to see showable proofs of (good) reviewing work be introduced; attitude change trends are weak.
Conclusion: The perception of the current state of software engineering peer review is fairly negative.
Also, we found hardly any trend that suggests reviewing will improve by itself over time; the community will have to make explicit efforts.
Fortunately, our (mostly senior) respondents appear more open for trying different peer reviewing regimes than we had expected.
Inspired by previous work of Shoup, Lenstra-De Smit and Couveignes-Lercier, we give fast algorithms to compute in (the first levels of) the ell-adic closure of a finite field.
In many cases, our algorithms have quasi-linear complexity.
Convolutional neural networks are designed for dense data, but vision data is often sparse (stereo depth, point clouds, pen stroke, etc.).
We present a method to handle sparse depth data with optional dense RGB, and accomplish depth completion and semantic segmentation by changing only the last layer.
Our proposal efficiently learns sparse features without the need of an additional validity mask.
We show how to ensure network robustness to varying input sparsities.
Our method even works with densities as low as 0.8% (an 8-layer lidar), and outperforms all published state-of-the-art methods on the KITTI depth completion benchmark.
We present Shrinking Horizon Model Predictive Control (SHMPC) for discrete-time linear systems with Signal Temporal Logic (STL) specification constraints under stochastic disturbances.
The control objective is to maximize an optimization function under the restriction that a given STL specification is satisfied with high probability against stochastic uncertainties.
We formulate a general solution, which does not require precise knowledge of the probability distributions of the (possibly dependent) stochastic disturbances; only the bounded support intervals of the density functions and moment intervals are used.
For the specific case of disturbances that are independent and normally distributed, we optimize the controllers further by utilizing knowledge of the disturbance probability distributions.
We show that in both cases, the control law can be obtained by solving optimization problems with linear constraints at each step.
We experimentally demonstrate effectiveness of this approach by synthesizing a controller for an HVAC system.
Bugs that surface in mobile applications can be difficult to reproduce and fix due to several confounding factors including the highly GUI-driven nature of mobile apps, varying contextual states, differing platform versions and device fragmentation.
It is clear that developers need support in the form of automated tools that allow for more precise reporting of application defects in order to facilitate more efficient and effective bug fixes.
In this paper, we present a tool aimed at supporting application testers and developers in the process of On-Device Bug Reporting.
Our tool, called ODBR, leverages the uiautomator framework and low-level event stream capture to offer support for recording and replaying a series of input gesture and sensor events that describe a bug in an Android application.
Evolution sculpts both the body plans and nervous systems of agents together over time.
In contrast, in AI and robotics, a robot's body plan is usually designed by hand, and control policies are then optimized for that fixed design.
The task of simultaneously co-optimizing the morphology and controller of an embodied robot has remained a challenge.
In psychology, the theory of embodied cognition posits that behavior arises from a close coupling between body plan and sensorimotor control, which suggests why co-optimizing these two subsystems is so difficult: most evolutionary changes to morphology tend to adversely impact sensorimotor control, leading to an overall decrease in behavioral performance.
Here, we further examine this hypothesis and demonstrate a technique for "morphological innovation protection", which temporarily reduces selection pressure on recently morphologically-changed individuals, thus enabling evolution some time to "readapt" to the new morphology with subsequent control policy mutations.
We show the potential for this method to avoid local optima and converge to similar highly fit morphologies across widely varying initial conditions, while sustaining fitness improvements further into optimization.
While this technique is admittedly only the first of many steps that must be taken to achieve scalable optimization of embodied machines, we hope that theoretical insight into the cause of evolutionary stagnation in current methods will help to enable the automation of robot design and behavioral training -- while simultaneously providing a testbed to investigate the theory of embodied cognition.
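The protection mechanism can be illustrated with a toy co-optimization problem in which a one-dimensional "morphology" and "controller" must stay matched, so morphological mutations are initially harmful; newly morphology-mutated individuals are shielded from selection for a few generations. All details here are illustrative; the paper evolves simulated soft robots.

```python
import random

def evolve(gens=200, pop_size=20, protect_gens=5, seed=0):
    rng = random.Random(seed)
    target = 1.0

    def fitness(ind):
        # reward the morphology approaching a target, but only when the
        # controller tracks the morphology (the coupling that makes
        # morphological mutations deceptive)
        m, c = ind['m'], ind['c']
        return -((m - target) ** 2) - 5.0 * (m - c) ** 2

    pop = [{'m': 0.0, 'c': 0.0, 'age': protect_gens + 1}
           for _ in range(pop_size)]
    for _ in range(gens):
        children = []
        for p in pop:
            child = dict(p)
            if rng.random() < 0.2:               # morphology mutation
                child['m'] += rng.gauss(0, 0.1)
                child['age'] = 0                 # start protection window
            else:                                # controller mutation
                child['c'] += rng.gauss(0, 0.1)
            children.append(child)
        merged = pop + children
        for ind in merged:
            ind['age'] += 1
        elite = max(merged, key=fitness)
        protected = [i for i in merged if i['age'] <= protect_gens]
        rest = sorted((i for i in merged if i['age'] > protect_gens),
                      key=fitness, reverse=True)
        # protected individuals survive regardless of fitness; elitism keeps
        # the best individual so progress is never lost
        pop, seen = [], set()
        for ind in [elite] + protected + rest:
            if id(ind) not in seen:
                seen.add(id(ind))
                pop.append(ind)
            if len(pop) == pop_size:
                break
    return max(fitness(i) for i in pop)

best = evolve()
```

Without the protection window, morph-mutated individuals (whose controllers no longer match) would be purged before the controller mutations could "readapt" them.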
We propose a new design for a cellular neural network with spintronic neurons and CMOS-based synapses.
Harnessing the magnetoelectric and inverse Rashba-Edelstein effects allows natural emulation of the behavior of an ideal cellular network.
This combination of effects offers an increase in speed and efficiency over other spintronic neural networks.
A rigorous performance analysis via simulation is provided.
Integrating model-free and model-based approaches in reinforcement learning has the potential to achieve the high performance of model-free algorithms with low sample complexity.
However, this is difficult because an imperfect dynamics model can degrade the performance of the learning algorithm, and in sufficiently complex environments, the dynamics model will almost always be imperfect.
As a result, a key challenge is to combine model-based approaches with model-free learning in such a way that errors in the model do not degrade performance.
We propose stochastic ensemble value expansion (STEVE), a novel model-based technique that addresses this issue.
By dynamically interpolating between model rollouts of various horizon lengths for each individual example, STEVE ensures that the model is only utilized when doing so does not introduce significant errors.
Our approach outperforms model-free baselines on challenging continuous control benchmarks with an order-of-magnitude increase in sample efficiency, and in contrast to previous model-based approaches, performance does not degrade in complex environments.
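The horizon-interpolation idea can be sketched as inverse-variance weighting of per-horizon ensemble targets: horizons where the ensemble disagrees (high variance, suggesting model error) receive little weight. This is a simplified illustration of STEVE's weighting; in the paper the estimates come from learned dynamics-model and Q-function ensembles.

```python
import numpy as np

def steve_target(horizon_estimates, eps=1e-8):
    # horizon_estimates: shape (H, E) -- E ensemble value estimates
    # for each of H rollout horizons
    means = horizon_estimates.mean(axis=1)
    variances = horizon_estimates.var(axis=1)
    weights = 1.0 / (variances + eps)   # trust horizons the ensemble agrees on
    weights /= weights.sum()
    return float(np.sum(weights * means))

# horizon 0 (no model rollout): the ensemble agrees the value is ~1
# longer horizons: model error makes the ensemble disagree wildly
estimates = np.array([[1.00, 1.01, 0.99],
                      [1.50, 0.50, 2.50],
                      [5.00, -3.00, 9.00]])
target = steve_target(estimates)
```

The combined target stays close to the low-variance horizon's estimate, so the model is only exploited when exploiting it does not introduce significant errors.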
The Internet of Things (IoT) being a promising technology of the future is expected to connect billions of devices.
The increased number of communications is expected to generate mountains of data, and the security of that data can be a concern.
The devices in the architecture are essentially small in size and low powered.
Conventional encryption algorithms are generally computationally expensive due to their complexity and require many rounds to encrypt, essentially wasting the constrained energy of the gadgets.
A less complex algorithm, however, may compromise the desired integrity.
In this paper we propose a lightweight encryption algorithm named Secure IoT (SIT).
It is a 64-bit block cipher and requires a 64-bit key to encrypt the data.
The architecture of the algorithm is a mixture of a Feistel network and a uniform substitution-permutation network.
Simulation results show that the algorithm provides substantial security in just five encryption rounds.
The hardware implementation of the algorithm is done on a low-cost 8-bit micro-controller, and the results for code size, memory utilization, and encryption/decryption execution cycles are compared with benchmark encryption algorithms.
The MATLAB code for relevant simulations is available online at https://goo.gl/Uw7E0W.
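For orientation, a generic 64-bit, five-round Feistel skeleton is sketched below; the round function, key schedule, and constants are placeholders, not SIT's actual design (which also incorporates substitution-permutation layers).

```python
def f(half, subkey):
    # toy round function: XOR with the subkey, rotate, add a constant
    # (NOT the SIT round function, which is not specified in the abstract)
    x = (half ^ subkey) & 0xFFFFFFFF
    x = ((x << 7) | (x >> 25)) & 0xFFFFFFFF   # rotate left by 7
    return (x + 0x9E3779B9) & 0xFFFFFFFF

def encrypt(block64, subkeys):
    left, right = block64 >> 32, block64 & 0xFFFFFFFF
    for k in subkeys:                          # one subkey per round
        left, right = right, left ^ f(right, k)
    return (left << 32) | right

def decrypt(block64, subkeys):
    # a Feistel network is inverted by running the rounds in reverse order
    left, right = block64 >> 32, block64 & 0xFFFFFFFF
    for k in reversed(subkeys):
        left, right = right ^ f(left, k), left
    return (left << 32) | right

subkeys = [0x1A2B3C4D, 0x5E6F7081, 0x92A3B4C5, 0xD6E7F809, 0x13243546]
pt = 0x0123456789ABCDEF
ct = encrypt(pt, subkeys)
```

The appeal of the Feistel structure for constrained devices is visible here: decryption reuses the same round function, so encryption and decryption share almost all of their code.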
Social media for news consumption is becoming increasingly popular due to its easy access, fast dissemination, and low cost.
However, social media also enable the wide propagation of "fake news", i.e., news with intentionally false information.
Fake news on social media has significant negative societal effects and also presents unique challenges.
To tackle the challenges, many existing works exploit various features, from a network perspective, to detect and mitigate fake news.
In essence, the news dissemination ecosystem on social media involves three dimensions: a content dimension, a social dimension, and a temporal dimension.
In this chapter, we review network properties for studying fake news, and introduce popular network types and how these networks can be used to detect and mitigate fake news on social media.
Increasing numbers of software vulnerabilities are discovered every year whether they are reported publicly or discovered internally in proprietary code.
These vulnerabilities can pose serious risk of exploit and result in system compromise, information leaks, or denial of service.
We leveraged the wealth of C and C++ open-source code available to develop a large-scale function-level vulnerability detection system using machine learning.
To supplement existing labeled vulnerability datasets, we compiled a vast dataset of millions of open-source functions and labeled it with carefully-selected findings from three different static analyzers that indicate potential exploits.
The labeled dataset is available at: https://osf.io/d45bw/.
Using these datasets, we developed a fast and scalable vulnerability detection tool based on deep feature representation learning that directly interprets lexed source code.
We evaluated our tool on code from both real software packages and the NIST SATE IV benchmark dataset.
Our results demonstrate that deep feature representation learning on source code is a promising approach for automated software vulnerability detection.
Neural network based architectures used for sound recognition are usually adapted from other application domains, which may not harness sound-related properties.
The ConditionaL Neural Network (CLNN) is designed to consider the relational properties across frames in a temporal signal, and its extension the Masked ConditionaL Neural Network (MCLNN) embeds a filterbank behavior within the network, which enforces the network to learn in frequency bands rather than bins.
Additionally, it automates the exploration of different feature combinations analogous to handcrafting the optimum combination of features for a recognition task.
We applied the MCLNN to the environmental sounds of the ESC-10 dataset.
The MCLNN achieved competitive accuracies compared to state-of-the-art convolutional neural networks and hand-crafted attempts.
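The masking idea can be sketched as a binary band mask applied elementwise to a dense weight matrix, restricting each hidden unit to a contiguous band of frequency bins; the bandwidth/overlap parametrization below is an illustrative guess at the paper's exact mask definition.

```python
import numpy as np

def band_mask(n_in, n_hidden, bandwidth, overlap):
    # each hidden unit connects to `bandwidth` contiguous input bins;
    # successive units' bands are shifted by (bandwidth - overlap),
    # wrapping around the frequency axis
    mask = np.zeros((n_in, n_hidden))
    shift = max(1, bandwidth - overlap)
    for j in range(n_hidden):
        start = (j * shift) % n_in
        for t in range(bandwidth):
            mask[(start + t) % n_in, j] = 1.0
    return mask

mask = band_mask(n_in=40, n_hidden=10, bandwidth=8, overlap=2)
# masked dense layer: the effective weights W * mask force each hidden
# unit to learn from one frequency band rather than from arbitrary bins
rng = np.random.default_rng(0)
W = rng.normal(size=(40, 10))
W_eff = W * mask
```

Because the mask is fixed, it adds no parameters; it only constrains which connections of the dense layer can be nonzero, which is what embeds the filterbank-like behaviour in the network.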
Deep learning based speech enhancement and source separation systems have recently reached unprecedented levels of quality, to the point that performance is reaching a new ceiling.
Most systems rely on estimating the magnitude of a target source by estimating a real-valued mask to be applied to a time-frequency representation of the mixture signal.
A limiting factor in such approaches is a lack of phase estimation: the phase of the mixture is most often used when reconstructing the estimated time-domain signal.
Here, we propose MagBook, Phasebook, and Combook, three new types of layers based on discrete representations that can be used to estimate complex time-frequency masks.
MagBook layers extend classical sigmoidal units and a recently introduced convex softmax activation for mask-based magnitude estimation.
Phasebook layers use a similar structure to give an estimate of the phase mask without suffering from phase wrapping issues.
Combook layers are an alternative to the MagBook-Phasebook combination that directly estimate complex masks.
We present various training and inference regimes involving these representations, and explain in particular how to include them in an end-to-end learning framework.
We also present an oracle study to assess upper bounds on performance for various types of masks using discrete phase representations.
We evaluate the proposed methods on the wsj0-2mix dataset, a well-studied corpus for single-channel speaker-independent speaker separation, matching the performance of state-of-the-art mask-based approaches without requiring additional phase reconstruction steps.
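The Phasebook principle, estimating a phase as a softmax-weighted combination of unit phasors over a discrete codebook, can be sketched as follows; the codebook values and logits are illustrative, whereas in the paper both are part of a learned layer.

```python
import numpy as np

def phasebook_estimate(logits, codebook):
    # softmax over the codebook entries
    p = np.exp(logits - logits.max())
    p /= p.sum()
    # combine unit phasors rather than raw angles: summing e^{i*theta_k}
    # and taking the angle avoids phase-wrapping discontinuities that
    # plague direct regression of an angle
    z = np.sum(p * np.exp(1j * codebook))
    return np.angle(z)

codebook = np.array([-np.pi / 2, 0.0, np.pi / 2, np.pi])
# near-one-hot logits pick out the third entry (phase pi/2)
est = phasebook_estimate(np.array([0.0, 0.0, 10.0, 0.0]), codebook)
```

When the distribution over codebook entries concentrates on one value, the estimate recovers that phase exactly; when it spreads, the circular mean interpolates smoothly on the unit circle.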
Previous research has pointed out that software applications should not depend on programmers to provide security for end-users, as the majority of programmers are not computer security experts.
On the other hand, some studies have revealed that security experts believe programmers have a major role to play in ensuring the end-users' security.
However, there has been no investigation of what programmers perceive to be their responsibility for the end-users' security of the applications they develop.
In this work, through a qualitative experimental study with 40 software developers, we attempted to understand programmers' perceptions of who is responsible for ensuring the end-users' security of the applications they develop.
Results revealed that the majority of programmers perceive themselves as responsible for the end-users' security of the applications they develop.
Furthermore, results showed that even though programmers are aware of what they need to do to ensure end-users' security, they do not often follow it.
We believe these results would change the current view on the role that different stakeholders of the software development process (i.e. researchers, security experts, programmers and Application Programming Interface (API) developers) have to play in order to ensure the security of software applications.
DBSCAN is a classical density-based clustering procedure with tremendous practical relevance.
However, DBSCAN implicitly needs to compute the empirical density for each sample point, leading to a quadratic worst-case time complexity, which is too slow on large datasets.
We propose DBSCAN++, a simple modification of DBSCAN which only requires computing the densities for a chosen subset of points.
We show empirically that, compared to traditional DBSCAN, DBSCAN++ can provide not only competitive performance but also added robustness in the bandwidth hyperparameter while taking a fraction of the runtime.
We also present statistical consistency guarantees showing the trade-off between computational cost and estimation rates.
Surprisingly, up to a certain point, we can enjoy the same estimation rates while lowering computational cost, showing that DBSCAN++ is a sub-quadratic algorithm that attains minimax optimal rates for level-set estimation, a quality that may be of independent interest.
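A minimal sketch of the subsampled-density idea, assuming uniform subsampling, breadth-first linking of core points, and nearest-core assignment; the paper also analyzes a k-center initialization and provides the formal guarantees referred to above.

```python
from collections import deque
import numpy as np

def dbscan_pp(X, eps, min_pts, m, seed=0):
    # DBSCAN++-style clustering: densities (hence core points) are computed
    # only on an m-point uniform subsample instead of on every point.
    rng = np.random.default_rng(seed)
    n = len(X)
    sub = rng.choice(n, size=m, replace=False)
    d2 = ((X[:, None, :] - X[sub][None, :, :]) ** 2).sum(-1)   # n x m
    counts = (d2 <= eps ** 2).sum(axis=0)      # neighbours of each subsample point
    cores = sub[counts >= min_pts]

    # link core points within eps of each other; BFS finds the components
    C = X[cores]
    adj = ((C[:, None, :] - C[None, :, :]) ** 2).sum(-1) <= eps ** 2
    core_label = -np.ones(len(cores), dtype=int)
    next_label = 0
    for i in range(len(cores)):
        if core_label[i] >= 0:
            continue
        queue = deque([i])
        core_label[i] = next_label
        while queue:
            j = queue.popleft()
            for k in np.flatnonzero(adj[j]):
                if core_label[k] < 0:
                    core_label[k] = next_label
                    queue.append(k)
        next_label += 1

    # assign each point to its nearest core point's cluster, if within eps
    pd2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    nearest = pd2.argmin(axis=1)
    return np.where(pd2[np.arange(n), nearest] <= eps ** 2,
                    core_label[nearest], -1)    # -1 = noise

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))])
labels = dbscan_pp(X, eps=0.8, min_pts=5, m=50)
```

Only the m subsample points require density queries, which is the source of the sub-quadratic runtime.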
Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets.
In this paper, we offer contributions in both these areas to enable similar progress in audio modeling.
First, we detail a powerful new WaveNet-style autoencoder model that conditions an autoregressive decoder on temporal codes learned from the raw audio waveform.
Second, we introduce NSynth, a large-scale and high-quality dataset of musical notes that is an order of magnitude larger than comparable public datasets.
Using NSynth, we demonstrate improved qualitative and quantitative performance of the WaveNet autoencoder over a well-tuned spectral autoencoder baseline.
Finally, we show that the model learns a manifold of embeddings that allows for morphing between instruments, meaningfully interpolating in timbre to create new types of sounds that are realistic and expressive.
In this paper, we present a set of simulation models to more realistically mimic the behaviour of users reading messages.
We propose a User Behaviour Model, where a simulated user reacts to a message by a flexible set of possible reactions (e.g. ignore, read, like, save, etc.) and a mobility-based reaction (visit a place, run away from danger, etc.).
We describe our models and their implementation in OMNeT++.
We strongly believe that these models will significantly contribute to the state of the art of realistically simulating opportunistic networks.
Recently, neural machine translation has achieved remarkable progress by introducing well-designed deep neural networks into its encoder-decoder framework.
From the optimization perspective, residual connections are adopted to improve learning performance for both encoder and decoder in most of these deep architectures, and advanced attention connections are applied as well.
Inspired by the success of the DenseNet model in computer vision problems, in this paper, we propose a densely connected NMT architecture (DenseNMT) that is able to train more efficiently for NMT.
The proposed DenseNMT not only allows dense connection in creating new features for both encoder and decoder, but also uses the dense attention structure to improve attention quality.
Our experiments on multiple datasets show that DenseNMT structure is more competitive and efficient.
Acoustic event detection for content analysis in most cases relies on large amounts of labeled data.
However, manually annotating data is a time-consuming task, which is why few annotated resources are available so far.
Unlike audio event detection, automatic audio tagging, a multi-label acoustic event classification task, relies only on weakly labeled data.
This is highly desirable for practical applications that use audio analysis.
In this paper we propose to use a fully deep neural network (DNN) framework to handle the multi-label classification task in a regression way.
Considering that only chunk-level rather than frame-level labels are available, all (or almost all) frames of the chunk are fed into the DNN to perform multi-label regression for the expected tags.
The fully DNN, which is regarded as an encoding function, can effectively map the audio feature sequence to a multi-tag vector.
A deep pyramid structure was also designed to extract more robust high-level features related to the target tags.
Further improvements, such as Dropout and background-noise-aware training, were adopted to enhance its generalization capability for new audio recordings in mismatched environments.
Compared with the conventional Gaussian Mixture Model (GMM) and support vector machine (SVM) methods, the proposed fully DNN-based method can fully utilize the long-term temporal information by taking the whole chunk as input.
The results show that our approach obtained a 15% relative improvement compared with the official GMM-based method of DCASE 2016 challenge.
In this paper, a secret message/image transmission technique is proposed based on (2, 2) visual cryptographic shares, which are non-interpretable in general.
A binary image is taken as the cover image, and the authenticating message/image is fabricated into it through a hash function, embedding two bits in each pixel within the four bits from the LSB; as a result, the binary image is converted into a grayscale one.
(2, 2) visual cryptographic shares are generated from this converted grayscale image.
During decoding, the shares are combined to regenerate the authenticated image, from which the secret message/image is obtained through the same hash function along with noise reduction.
Noise reduction is also applied to the regenerated authenticated image to recover the original cover image at the destination.
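For reference, the classic (2, 2) visual cryptography construction that such shares build on can be sketched as follows; this is the textbook scheme, not necessarily the paper's exact share generator.

```python
import numpy as np

def make_shares(secret, seed=0):
    # Each secret pixel becomes a 2x2 block of subpixels per share (1 = black).
    # White pixels get identical random half-black patterns in both shares;
    # black pixels get complementary patterns, so stacking (pixelwise OR)
    # renders black pixels fully black and white pixels half black, while
    # each share on its own is uniformly random (non-interpretable).
    rng = np.random.default_rng(seed)
    patterns = np.array([[1, 0, 0, 1], [0, 1, 1, 0]])   # two half-black patterns
    h, w = secret.shape
    s1 = np.zeros((2 * h, 2 * w), dtype=int)
    s2 = np.zeros((2 * h, 2 * w), dtype=int)
    for i in range(h):
        for j in range(w):
            choice = rng.integers(0, 2)
            p = patterns[choice].reshape(2, 2)
            q = p if secret[i, j] == 0 else patterns[1 - choice].reshape(2, 2)
            s1[2 * i:2 * i + 2, 2 * j:2 * j + 2] = p
            s2[2 * i:2 * i + 2, 2 * j:2 * j + 2] = q
    return s1, s2

secret = np.array([[0, 1],
                   [1, 0]])    # 1 = black pixel
s1, s2 = make_shares(secret)
stacked = s1 | s2              # "stacking" the transparencies
```

The contrast loss (white becomes half-black) is inherent to the scheme; decoding needs no computation beyond overlaying the two shares.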
We illustrate how elementary information-theoretic ideas may be employed to provide proofs for well-known, nontrivial results in number theory.
Specifically, we give an elementary and fairly short proof of the following asymptotic result: The sum of (log p)/p, taken over all primes p not exceeding n, is asymptotic to log n as n tends to infinity.
We also give finite-n bounds refining the above limit.
This result, originally proved by Chebyshev in 1852, is closely related to the celebrated prime number theorem.
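The statement is easy to check numerically. Mertens' first theorem bounds the error term by 2 for all n >= 2, which the short sieve-based sketch below verifies at n = 10^5:

```python
import math

def primes_upto(n):
    # simple sieve of Eratosthenes
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            sieve[p*p::p] = b"\x00" * len(sieve[p*p::p])
    return [p for p in range(2, n + 1) if sieve[p]]

n = 10**5
s = sum(math.log(p) / p for p in primes_upto(n))
# Mertens' first theorem: |sum_{p<=n} (log p)/p - log n| <= 2 for n >= 2
assert abs(s - math.log(n)) <= 2
```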
The Ring Learning-With-Errors (LWE) problem, whose security is based on hard ideal lattice problems, has proven to be a promising primitive with diverse applications in cryptography.
There are, however, recent discoveries of faster algorithms for the principal ideal SVP problem, and attempts to generalize the attack to non-principal ideals.
In this work, we study the LWE problem on group rings, and build cryptographic schemes based on this new primitive.
One can regard LWE on cyclotomic integers as a special case in which the underlying group is cyclic, whereas our proposal utilizes non-commutative groups, which eliminates the weakness associated with principal ideal lattices.
In particular, we show how to build public key encryption schemes from dihedral group rings, which maintain the efficiency of ring-LWE while improving its security.
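The non-commutativity underlying the proposal can be illustrated with a toy group-ring multiplication over the dihedral group D_n, where the pair (i, j) denotes r^i s^j. This is only a sketch of the algebraic primitive, not of the encryption scheme itself:

```python
N = 8  # dihedral group D_8: r^i s^j with r^N = s^2 = 1 and s r = r^(-1) s

def gmul(a, b):
    # group law: (i1, j1) * (i2, j2) = (i1 + (-1)^j1 * i2 mod N, j1 + j2 mod 2)
    (i1, j1), (i2, j2) = a, b
    return ((i1 + (-1) ** j1 * i2) % N, (j1 + j2) % 2)

def ring_mul(x, y):
    # group-ring product: convolve coefficient dicts over the group
    z = {}
    for g, cg in x.items():
        for h, ch in y.items():
            k = gmul(g, h)
            z[k] = z.get(k, 0) + cg * ch
    return {k: v for k, v in z.items() if v}

r = {(1, 0): 1}   # the rotation generator as a ring element
s = {(0, 1): 1}   # the reflection generator
# unlike a cyclotomic (cyclic) ring, the product depends on the order
assert ring_mul(r, s) != ring_mul(s, r)
```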
This paper addresses the general problem of blind echo retrieval, i.e., given M sensors measuring in the discrete-time domain M mixtures of K delayed and attenuated copies of an unknown source signal, can the echo locations and weights be recovered?
This problem has broad applications in fields such as sonar, seismology, ultrasound or room acoustics.
It belongs to the broader class of blind channel identification problems, which have been intensively studied in signal processing.
Existing methods in the literature proceed in two steps: (i) blind estimation of sparse discrete-time filters and (ii) echo information retrieval by peak-picking on filters.
The precision of these methods is fundamentally limited by the rate at which the signals are sampled: estimated echo locations are necessarily on-grid, and since true locations never match the sampling grid, the weight estimation precision is impacted.
This is the so-called basis-mismatch problem in compressed sensing.
We propose a radically different approach to the problem, building on the framework of finite-rate-of-innovation sampling.
The approach operates directly in the parameter-space of echo locations and weights, and enables near-exact blind and off-grid echo retrieval from discrete-time measurements.
It is shown to outperform conventional methods by several orders of magnitude in precision.
When multiple radio-frequency sources are connected to multiple loads through a passive multiport matching network, perfect power transfer to the loads across all frequencies is generally impossible.
In this two-part paper, we provide analyses of bandwidth over which power transfer is possible.
Our principal tools include broadband multiport matching upper bounds, presented herein, on the integral over all frequency of the logarithm of a suitably defined power loss ratio.
In general, the larger the integral, the larger the bandwidth over which power transfer can be accomplished.
We apply these bounds in several ways: we show how the number of sources and loads, and the coupling between loads, affect achievable bandwidth.
We analyze the bandwidth of networks constrained to have certain architectures.
We characterize systems whose bandwidths scale as the ratio between the numbers of loads and sources.
The first part of the paper presents the bounds and uses them to analyze loads whose frequency responses can be represented by analytical circuit models.
The second part analyzes the bandwidth of realistic loads whose frequency responses are available numerically.
We provide applications to wireless transmitters where the loads are antennas being driven by amplifiers.
The derivations of the bounds are also included.
We design an active learning algorithm for cost-sensitive multiclass classification: problems where different errors have different costs.
Our algorithm, COAL, makes predictions by regressing to each label's cost and predicting the smallest.
On a new example, it uses a set of regressors that perform well on past data to estimate possible costs for each label.
It queries only the labels that could be the best, ignoring the sure losers.
We prove COAL can be efficiently implemented for any regression family that admits squared loss optimization; it also enjoys strong guarantees with respect to predictive performance and labeling effort.
We empirically compare COAL to passive learning and several active learning baselines, showing significant improvements in labeling effort and test cost on real-world datasets.
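The prediction and query rule can be sketched as follows. Hypothetical linear cost regressors stand in for the regression family, and the cost range for each label is taken over the ensemble; this illustrates only the "query unless a sure loser" rule, not COAL's online updates or guarantees.

```python
import numpy as np

def coal_query(ensembles, x):
    """ensembles: dict label -> list of weight vectors (linear cost regressors).
    Returns (predicted label, set of labels worth querying)."""
    lo = {y: min(w @ x for w in ws) for y, ws in ensembles.items()}
    hi = {y: max(w @ x for w in ws) for y, ws in ensembles.items()}
    # predict the label with the smallest (ensemble-averaged) cost estimate
    pred = min(ensembles, key=lambda y: np.mean([w @ x for w in ensembles[y]]))
    best_hi = min(hi.values())             # best label's worst-case cost
    # query only labels that could still be the best; skip the sure losers
    query = {y for y in ensembles if lo[y] <= best_hi}
    return pred, query
```

A label whose lowest plausible cost already exceeds some other label's highest plausible cost can never be the best, so its true cost is not worth paying for.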
Peer-to-peer (P2P) locality has recently raised a lot of interest in the community.
Indeed, whereas P2P content distribution enables financial savings for the content providers, it dramatically increases the traffic on inter-ISP links.
To solve this issue, the idea to keep a fraction of the P2P traffic local to each ISP was introduced a few years ago.
Since then, P2P solutions exploiting locality have been introduced.
However, several fundamental issues on locality still need to be explored.
In particular, how far can we push locality, and what is, at the scale of the Internet, the reduction of traffic that can be achieved with locality?
In this paper, we perform extensive experiments in a controlled environment with up to 10,000 BitTorrent clients to evaluate the impact of high locality on inter-ISP link traffic and peers' download completion time.
We introduce two simple mechanisms that make high locality possible in challenging scenarios, and we show that we reduce inter-ISP traffic by up to several orders of magnitude compared to traditional locality, without adversely impacting peers' download completion time.
In addition, we crawled 214,443 torrents representing 6,113,224 unique peers spread among 9,605 ASes.
We show that whereas the torrents we crawled generated 11.6 petabytes of inter-ISP traffic, our locality policy implemented for all torrents could have reduced the global inter-ISP traffic by up to 40%.
The recreation of flight trajectories is an important research area.
The design of a flight trajectory recreation and playback system is presented in this paper.
Rather than transferring the flight data to diagrams, graphs and tables, the flight data is visualized on the 3D globe of ossimPlanet, an open-source 3D geo-spatial viewer; the system is realized based on an analysis of this platform.
Users are allowed to choose a flight of interest from an aerial mission.
The aerial photographs and the corresponding configuration files containing the flight data are read in, the flight statuses are stored, and the flight trajectory is then recreated.
Users can view the photographs and flight trajectory marks on the correct positions of 3D global.
The scene along flight trajectory is also simulated at the plane's eye point.
This paper provides a more intuitive way to recreate flight trajectories.
The cost is decreased remarkably, and security is ensured through secondary development on an open-source platform.
We propose a novel neural method to extract drug-drug interactions (DDIs) from texts using external drug molecular structure information.
We encode textual drug pairs with convolutional neural networks and their molecular pairs with graph convolutional networks (GCNs), and then we concatenate the outputs of these two networks.
In the experiments, we show that GCNs can predict DDIs from the molecular structures of drugs with high accuracy, and that the molecular information can enhance text-based DDI extraction by 2.39 percentage points in F-score on the DDIExtraction 2013 shared task data set.
Modern advanced analytics applications make use of machine learning techniques and contain multiple steps of domain-specific and general-purpose processing with high resource requirements.
We present KeystoneML, a system that captures and optimizes the end-to-end large-scale machine learning applications for high-throughput training in a distributed environment with a high-level API.
This approach offers increased ease of use and higher performance over existing systems for large scale learning.
We demonstrate the effectiveness of KeystoneML in achieving high quality statistical accuracy and scalable training using real world datasets in several domains.
By optimizing execution, KeystoneML achieves up to 15x training throughput over unoptimized execution on a real image classification application.
This paper presents an interconnected control-planning strategy for redundant manipulators, subject to system and environmental constraints.
The method incorporates low-level control characteristics and high-level planning components into a robust strategy for manipulators acting in complex environments, subject to joint limits.
This strategy is formulated using an adaptive control rule, the estimated dynamic model of the robotic system and the nullspace of the linearized constraints.
A path is generated that takes into account the capabilities of the platform.
The proposed method is computationally efficient, enabling its implementation on a real multi-body robotic system.
Through experimental results with a 7 DOF manipulator, we demonstrate the performance of the method in real-world scenarios.
We study the unsupervised learning of CNNs for optical flow estimation using proxy ground truth data.
Supervised CNNs, due to their immense learning capacity, have shown superior performance on a range of computer vision problems including optical flow prediction.
They however require the ground truth flow which is usually not accessible except on limited synthetic data.
Without the guidance of ground truth optical flow, unsupervised CNNs often perform worse as they are naturally ill-conditioned.
We therefore propose a novel framework in which proxy ground truth data generated from classical approaches is used to guide the CNN learning.
The models are further refined in an unsupervised fashion using an image reconstruction loss.
Our guided learning approach is competitive with or superior to state-of-the-art approaches on three standard benchmark datasets yet is completely unsupervised and can run in real time.
We propose an effective Hybrid Deep Learning (HDL) architecture for the task of determining the probability that a questioned handwritten word has been written by a known writer.
HDL is an amalgamation of Auto-Learned Features (ALF) and Human-Engineered Features (HEF).
To extract auto-learned features we use two methods: First, Two Channel Convolutional Neural Network (TC-CNN); Second, Two Channel Autoencoder (TC-AE).
Furthermore, human-engineered features are extracted by using two methods: First, Gradient Structural Concavity (GSC); Second, Scale Invariant Feature Transform (SIFT).
Experiments are performed by complementing one of the HEF methods with one ALF method on 150000 pairs of samples of the word "AND" cropped from handwritten notes written by 1500 writers.
Our results indicate that the HDL architecture with AE-GSC achieves 99.7% accuracy on the seen-writer dataset and 92.16% accuracy on the shuffled-writer dataset, which outperforms CEDAR-FOX; on the unseen-writer dataset, AE-SIFT performs comparably to this sophisticated handwriting comparison tool.
We show how faceted search using a combination of traditional classification systems and mixed-membership topic models can go beyond keyword search to inform resource discovery, hypothesis formulation, and argument extraction for interdisciplinary research.
Our test domain is the history and philosophy of scientific work on animal mind and cognition.
The methods can be generalized to other research areas and ultimately support a system for semi-automatic identification of argument structures.
We provide a case study for the application of the methods to the problem of identifying and extracting arguments about anthropomorphism during a critical period in the development of comparative psychology.
We show how a combination of classification systems and mixed-membership models trained over large digital libraries can inform resource discovery in this domain.
Through a novel approach of "drill-down" topic modeling---simultaneously reducing both the size of the corpus and the unit of analysis---we are able to reduce a large collection of fulltext volumes to a much smaller set of pages within six focal volumes containing arguments of interest to historians and philosophers of comparative psychology.
The volumes identified in this way did not appear among the first ten results of a keyword search in the HathiTrust digital library, and the pages reward the kind of "close reading" needed to generate the original interpretations that are at the heart of scholarly work in the humanities.
Zooming back out, we provide a way to place the books onto a map of science originally constructed from very different data and for different purposes.
The multilevel approach advances understanding of the intellectual and societal contexts in which writings are interpreted.
There has been significant increase in penetration of renewable generation (RG) sources all over the world.
Localized concentration of many such generators could initiate a cascade tripping sequence that might threaten the stability of the entire system.
Understanding the impact of the cascade tripping process would help the system planner identify trip sequences that must be blocked in order to increase stability.
In this work, we attempt to understand the consequences of the cascade tripping mechanism through a Lyapunov approach.
A conservative definition of the stability region (SR), along with its estimation for a given cascading sequence using sum-of-squares (SOS) programming, is proposed.
Finally, a simple probabilistic definition of the SR is used to visualize the risk of instability and understand the impact of blocking trip sequences.
A 3-machine system with significant RG penetration is used to demonstrate the idea.
Image processing and pixel-wise dense prediction have been advanced by harnessing the capabilities of deep learning.
One central issue of deep learning is the limited capacity to handle joint upsampling.
We present a deep learning building block for joint upsampling, namely guided filtering layer.
This layer aims at efficiently generating the high-resolution output given the corresponding low-resolution one and a high-resolution guidance map.
The proposed layer is composed of a guided filter, which is reformulated as a fully differentiable block.
To this end, we show that a guided filter can be expressed as a group of spatially varying linear transformation matrices.
This layer can be integrated with convolutional neural networks (CNNs) and jointly optimized through end-to-end training.
To further take advantage of end-to-end training, we plug in a trainable transformation function that generates task-specific guidance maps.
By integrating the CNNs and the proposed layer, we form deep guided filtering networks.
The proposed networks are evaluated on five advanced image processing tasks.
Experiments on MIT-Adobe FiveK Dataset demonstrate that the proposed approach runs 10-100 times faster and achieves the state-of-the-art performance.
We also show that the proposed guided filtering layer helps to improve the performance of multiple pixel-wise dense prediction tasks.
The code is available at https://github.com/wuhuikai/DeepGuidedFilter.
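The layer is built on the classical guided filter of He et al., which applies the per-window linear model q = a*I + b. A minimal NumPy sketch of that underlying (non-differentiable-layer) filter is given below; the box radius r and regularizer eps are illustrative values, and this is not the released implementation:

```python
import numpy as np

def box(x, r):
    # mean filter over (2r+1)x(2r+1) windows via an integral image
    k = 2 * r + 1
    xp = np.pad(x, r, mode="edge").astype(float)
    c = np.pad(np.cumsum(np.cumsum(xp, 0), 1), ((1, 0), (1, 0)))
    return (c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]) / (k * k)

def guided_filter(I, p, r=2, eps=1e-2):
    """Output q_i = mean(a)_i * I_i + mean(b)_i with per-window coefficients."""
    mI, mp = box(I, r), box(p, r)
    a = (box(I * p, r) - mI * mp) / (box(I * I, r) - mI * mI + eps)
    b = mp - a * mI
    return box(a, r) * I + box(b, r)
```

Every operation here (box means, elementwise products, division) is differentiable, which is what allows the same computation to be reformulated as a trainable layer.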
This paper investigates and evaluates support vector machine active learning algorithms for use with imbalanced datasets, which commonly arise in applications such as information extraction.
Algorithms based on closest-to-hyperplane selection and query-by-committee selection are combined with methods for addressing imbalance such as positive amplification based on prevalence statistics from initial random samples.
Three algorithms (ClosestPA, QBagPA, and QBoostPA) are presented and carefully evaluated on datasets for text classification and relation extraction.
The ClosestPA algorithm is shown to consistently outperform the other two in a variety of ways and insights are provided as to why this is the case.
While modern parallel computing systems offer high performance, utilizing these powerful computing resources to the highest possible extent demands advanced knowledge of various hardware architectures and parallel programming models.
Furthermore, optimized software execution on parallel computing systems demands consideration of many parameters at compile-time and run-time.
Determining the optimal set of parameters in a given execution context is a complex task; to address this issue, researchers have proposed different approaches that use heuristic search or machine learning.
In this paper, we undertake a systematic literature review to aggregate, analyze and classify the existing software optimization methods for parallel computing systems.
We review approaches that use machine learning or meta-heuristics for software optimization at compile-time and run-time.
Additionally, we discuss challenges and future research directions.
The results of this study may help to better understand the state-of-the-art techniques that use machine learning and meta-heuristics to deal with the complexity of software optimization for parallel computing systems.
Furthermore, it may aid in understanding the limitations of existing approaches and identification of areas for improvement.
Device-free localization (DFL) methods use measured changes in the received signal strength (RSS) between many pairs of RF nodes to provide location estimates of a person inside the wireless network.
Fundamental challenges for RSS DFL methods include having a model of RSS measurements as a function of a person's location, and maintaining an accurate model as the environment changes over time.
Current methods rely on either labeled empty-area calibration or labeled fingerprints with a person at each location.
Both need to be frequently recalibrated or retrained to stay current with changing environments.
Other DFL methods only localize people in motion.
In this paper, we address these challenges by, first, introducing a new mixture model for link RSS as a function of a person's location, and second, providing the framework to update model parameters without ever being provided labeled data from either empty-area or known-location classes.
We develop two new Bayesian localization methods based on our mixture model and experimentally validate our system at three test sites with seven days of measurements.
We demonstrate that our methods localize a person with non-degrading performance in changing environments, and, in addition, reduce localization error by 11-51% compared to other DFL methods.
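The kind of label-free mixture modeling involved can be sketched with a plain two-component Gaussian mixture fitted by EM on one link's RSS samples. The synthetic dB values and the simple 1-D model below are illustrative; the mixture model and update framework in the paper are richer than this.

```python
import numpy as np

def em_gmm2(x, iters=100):
    """Fit a 2-component 1D Gaussian mixture with EM; no labeled data needed."""
    mu = np.percentile(x, [25, 75]).astype(float)   # spread-out initialization
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        d = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = pi * d
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(0)
# e.g. "link unobstructed" RSS around -60 dBm, "person present" around -50 dBm
x = np.concatenate([rng.normal(-60, 1, 400), rng.normal(-50, 1, 400)])
pi, mu, var = em_gmm2(x)
```

The point of the sketch is that the two RSS modes are recovered without ever labeling which samples came from an empty area or a known location.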
Network quantization is one of network compression techniques to reduce the redundancy of deep neural networks.
It reduces the number of distinct network parameter values by quantization in order to reduce the storage required for them.
In this paper, we design network quantization schemes that minimize the performance loss due to quantization given a compression ratio constraint.
We analyze the quantitative relation of quantization errors to the neural network loss function and identify that the Hessian-weighted distortion measure is locally the right objective function for the optimization of network quantization.
As a result, Hessian-weighted k-means clustering is proposed for clustering network parameters to quantize.
When optimal variable-length binary codes, e.g., Huffman codes, are employed for further compression, we derive that the network quantization problem can be related to the entropy-constrained scalar quantization (ECSQ) problem in information theory and consequently propose two solutions of ECSQ for network quantization, i.e., uniform quantization and an iterative solution similar to Lloyd's algorithm.
Finally, using the simple uniform quantization followed by Huffman coding, we show from our experiments that the compression ratios of 51.25, 22.17 and 40.65 are achievable for LeNet, 32-layer ResNet and AlexNet, respectively.
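The Hessian-weighted k-means step can be sketched as a Lloyd-style alternation; the diagonal Hessian values h are assumed given (e.g. from a second-order approximation of the loss), and the toy data below is illustrative.

```python
import numpy as np

def hessian_weighted_kmeans(w, h, k, iters=50, seed=0):
    """Minimize sum_i h_i * (w_i - c_{a(i)})^2 over centroids c and assignments a."""
    rng = np.random.default_rng(seed)
    c = rng.choice(w, k, replace=False)
    for _ in range(iters):
        # assignment: h_i scales all k distances of point i equally, so the
        # nearest centroid is the same as in unweighted k-means
        a = np.argmin((w[:, None] - c[None, :]) ** 2, axis=1)
        for j in range(k):
            m = a == j
            if m.any():
                # the weighting changes the centroid update: a Hessian-weighted mean
                c[j] = (h[m] * w[m]).sum() / h[m].sum()
    return c, a
```

Parameters with large Hessian entries (those to which the loss is most sensitive) pull their cluster centroid toward themselves, so they incur smaller quantization error.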
In this paper, we analyze efficacy of the fast gradient sign method (FGSM) and the Carlini-Wagner's L2 (CW-L2) attack.
We prove that, within a certain regime, the untargeted FGSM can fool any convolutional neural nets (CNNs) with ReLU activation; the targeted FGSM can mislead any CNNs with ReLU activation to classify any given image into any prescribed class.
For a special two-layer neural network: a linear layer followed by the softmax output activation, we show that the CW-L2 attack increases the ratio of the classification probability between the target and ground truth classes.
Moreover, we provide numerical results to verify all our theoretical results.
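For the linear-softmax special case analyzed here, the untargeted FGSM gradient has a closed form, and the sketch below shows that a single sign-gradient step never decreases the cross-entropy loss, which is convex in the input for this model. The model and data are illustrative, not from the paper's experiments.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(W, x, y):
    # cross-entropy of a linear-softmax classifier: -log softmax(Wx)[y]
    return -np.log(softmax(W @ x)[y])

def fgsm(W, x, y, eps):
    # untargeted FGSM: step in the sign of the input gradient of the loss,
    # which for this model is W^T (softmax(Wx) - e_y)
    p = softmax(W @ x)
    g = W.T @ (p - np.eye(W.shape[0])[y])
    return x + eps * np.sign(g)
```

Since the loss is convex in x, f(x + d) >= f(x) + <g, d> with <g, eps*sign(g)> = eps * ||g||_1 >= 0, so the perturbed loss can only go up.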
R-CNN-style methods are among the state-of-the-art object detection methods, consisting of region proposal generation and deep CNN classification.
However, the proposal generation phase in this paradigm is usually time-consuming, which slows down the whole detection pipeline at test time.
This paper suggests that the value discrepancies among features in deep convolutional feature maps contain plenty of useful spatial information, and proposes a simple approach to extract the information for fast region proposal generation in testing.
The proposed method, namely Relief R-CNN (R2-CNN), adopts a novel region proposal generator in a trained R-CNN style model.
The new generator directly generates proposals from convolutional features by some simple rules, thus resulting in a much faster proposal generation speed and a lower demand of computation resources.
Empirical studies show that R2-CNN achieves the fastest detection speed with comparable accuracy among all compared algorithms at test time.
Traditional authentication in radio-frequency (RF) systems enables secure data communication within a network through techniques such as digital signatures and hash-based message authentication codes (HMAC), which suffer from key recovery attacks.
State-of-the-art IoT networks such as Nest also use Open Authentication (OAuth 2.0) protocols that are vulnerable to cross-site request forgery (CSRF), which shows that these techniques may not prevent an adversary from copying or modeling the secret IDs or encryption keys using invasive, side-channel, learning or software attacks.
Physical unclonable functions (PUF), on the other hand, can exploit manufacturing process variations to uniquely identify silicon chips which makes a PUF-based system extremely robust and secure at low cost, as it is practically impossible to replicate the same silicon characteristics across dies.
Taking inspiration from human communication, which utilizes inherent variations in the voice signatures to identify a certain speaker, we present RF- PUF: a deep neural network-based framework that allows real-time authentication of wireless nodes, using the effects of inherent process variation on RF properties of the wireless transmitters (Tx), detected through in-situ machine learning at the receiver (Rx) end.
The proposed method utilizes the already-existing asymmetric RF communication framework and does not require any additional circuitry for PUF generation or feature extraction.
Simulation results involving the process variations in a standard 65 nm technology node, and features such as LO offset and I-Q imbalance detected with a neural network having 50 neurons in the hidden layer indicate that the framework can distinguish up to 4800 transmitters with an accuracy of 99.9% (~ 99% for 10,000 transmitters) under varying channel conditions, and without the need for traditional preambles.
Network coverage of wireless sensor network (WSN) means how well an area of interest is being monitored by the deployed network.
It depends mainly on sensing model of nodes.
In this paper, we present three types of sensing models, viz. the Boolean sensing model, the shadow-fading sensing model and the Elfes sensing model.
We investigate the impact of sensing models on network coverage.
We also investigate network coverage based on Poisson node distribution.
A comparative study between regular and random node placement is also presented in this paper.
This study will be useful for the coverage analysis of WSNs.
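The three sensing models can be written as detection-probability functions of the node-to-point distance d. The forms below follow the common formulations in the literature; the decay parameters, path-loss exponent and radii are illustrative values, not those used in the paper.

```python
import math

def boolean_model(d, r):
    # Boolean model: detect iff the point lies within the sensing radius
    return 1.0 if d <= r else 0.0

def elfes_model(d, r1, r2, lam=0.5, beta=1.0):
    # Elfes model: certain detection up to r1, exponential decay in (r1, r2]
    if d <= r1:
        return 1.0
    if d <= r2:
        return math.exp(-lam * (d - r1) ** beta)
    return 0.0

def shadow_fading_model(d, r, sigma=4.0, alpha=3.0):
    # shadow-fading model: probability that log-normal shadowing keeps the
    # received power above the detection threshold (Q-function via erfc)
    if d <= 0:
        return 1.0
    z = 10 * alpha * math.log10(d / r) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))
```

At d = r the shadow-fading model gives probability 0.5, and unlike the Boolean model both probabilistic models degrade gradually with distance, which is what changes the resulting coverage analysis.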
Robustness of hybrid control systems to measurement noise, actuator disturbances, and more generally perturbations, is analyzed.
The relationship between the robustness of a hybrid control system and of its implementations is emphasized.
First, a formal definition of the implementation of a hybrid control system is provided, based on the uniqueness of solutions.
Then, two examples are analyzed in detail, showing how the previously developed robustness property fails to guarantee that the implementations, necessarily used in control practice, are also robust.
A new concept of strong robustness is proposed, which guarantees that at least jumping-first and flowing-first implementations are robust when the hybrid control system is strongly robust.
In addition, we provide a sufficient condition for strong robustness based on the previously developed hybrid relaxation results.
Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs.
Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials.
To orientate the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon.
The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling.
We then question why some approaches are more successful than others in different language pairs.
We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair.
To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge.
Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them.
Explaining the unreasonable effectiveness of deep learning has eluded researchers around the globe.
Various authors have described multiple metrics to evaluate the capacity of deep architectures.
In this paper, we draw on the radius-margin bounds described for a support vector machine (SVM) with hinge loss, apply them to deep feed-forward architectures, and derive Vapnik-Chervonenkis (VC) bounds that differ from earlier bounds stated in terms of the number of weights of the network.
In doing so, we also relate the effectiveness of techniques like Dropout and DropConnect in reducing the capacity of the network.
Finally, we describe the effect of maximizing the input as well as the output margin to achieve an input noise-robust deep architecture.
The dynamic character of most social networks requires modeling their evolution in order to enable complex analysis of their dynamics.
The following paper focuses on defining the differences between network snapshots by means of a Graph Differential Tuple.
These differences make it possible to calculate diverse distance measures as well as to investigate the speed of changes.
Four separate measures are suggested in the paper, together with an experimental study on real social network data.
This paper presents a method based on linear programming for trajectory planning of automated vehicles, combining obstacle avoidance, time scheduling for the reaching of waypoints and time-optimal traversal of tube-like road segments.
System modeling is conducted entirely in the spatial domain.
Kinematic vehicle dynamics as well as time are expressed in a road-aligned coordinate frame, with the path along the road centerline serving as the independent variable.
We elaborate on control rate constraints in the spatial domain.
A heuristic constraint is proposed to keep the vehicle's dimensions inside the road boundaries.
It is outlined how friction constraints are accounted for.
The discussion is extended to dynamic vehicle models.
The benefits of the proposed method are illustrated by a comparison to a time-based method.
We consider a certain tiling problem of a planar region in which there are no long horizontal or vertical strips consisting of copies of the same tile.
Intuitively speaking, we would like to create a dappled pattern with two or more kinds of tiles.
We give an efficient algorithm to turn any tiling into one satisfying the condition, and discuss its applications in texturing.
Neural networks are very powerful learning systems, but they do not readily generalize from one task to the other.
This is partly due to the fact that they do not learn in a compositional way, that is, by discovering skills that are shared by different tasks, and recombining them to solve new problems.
In this paper, we explore the compositional generalization capabilities of recurrent neural networks (RNNs).
We first propose the lookup table composition domain as a simple setup to test compositional behaviour and show that it is theoretically possible for a standard RNN to learn to behave compositionally in this domain when trained with standard gradient descent and provided with additional supervision.
We then remove this additional supervision and perform a search over a large number of model initializations to investigate the proportion of RNNs that can still converge to a compositional solution.
We discover that a small but non-negligible proportion of RNNs do reach partial compositional solutions even without special architectural constraints.
This suggests that a combination of gradient descent and evolutionary strategies directly favouring the minority models that developed more compositional approaches might suffice to lead standard RNNs towards compositional solutions.
Autonomous path planning algorithms are significant to planetary exploration rovers, since relying on commands from Earth will heavily reduce their efficiency of executing exploration missions.
This paper proposes a novel learning-based algorithm to deal with global path planning problem for planetary exploration rovers.
Specifically, a novel deep convolutional neural network with double branches (DB-CNN) is designed and trained, which can plan path directly from orbital images of planetary surfaces without implementing environment mapping.
Moreover, the planning procedure requires no prior knowledge about planetary surface terrains.
Finally, experimental results demonstrate that DB-CNN achieves better performance on global path planning and faster convergence during training compared with the existing Value Iteration Network (VIN).
Since the proof of the four color theorem in 1976, computer-generated proofs have become a reality in mathematics and computer science.
During the last decade, we have seen formal proofs using verified proof assistants being used to verify the validity of such proofs.
In this paper, we describe a formalized theory of size-optimal sorting networks.
From this formalization we extract a certified checker that successfully verifies computer-generated proofs of optimality on up to 8 inputs.
The checker relies on an untrusted oracle to shortcut the search for witnesses on more than 1.6 million NP-complete subproblems.
In classifier (or regression) fusion the aim is to combine the outputs of several algorithms to boost overall performance.
Standard supervised fusion algorithms often require accurate and precise training labels.
However, accurate labels may be difficult to obtain in many remote sensing applications.
This paper proposes novel classification and regression fusion models that can be trained from ambiguously and imprecisely labeled training data, in which labels are associated with sets of data points (i.e., "bags") rather than individual data points (i.e., "instances"), following a multiple instance learning framework.
Experiments were conducted with the proposed algorithms on both synthetic data and real applications, such as target detection and crop yield prediction from remote sensing data.
The proposed algorithms show effective classification and regression performance.
Generalized linear mixed-effects models in the context of genome-wide association studies (GWAS) represent a formidable computational challenge: the solution of millions of correlated generalized least-squares problems, and the processing of terabytes of data.
We present high performance in-core and out-of-core shared-memory algorithms for GWAS: By taking advantage of domain-specific knowledge, exploiting multi-core parallelism, and handling data efficiently, our algorithms attain unequalled performance.
When compared to GenABEL, one of the most widely used libraries for GWAS, on a 12-core processor we obtain 50-fold speedups.
As a consequence, our routines enable genome studies of unprecedented size.
The binary similarity problem consists in determining if two functions are similar by only considering their compiled form.
Advanced techniques for binary similarity recently gained momentum as they can be applied in several fields, such as copyright disputes, malware analysis, vulnerability detection, etc., and thus have an immediate practical impact.
Current solutions compare functions by first transforming their binary code into multi-dimensional vector representations (embeddings), and then comparing vectors through simple and efficient geometric operations.
However, embeddings are usually derived from binary code using manual feature extraction, which may fail to consider important function characteristics, or may consider features that are not important for the binary similarity problem.
In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network.
SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur the computational overhead of building or manipulating control flow graphs), and is more general, as it works on stripped binaries and on multiple architectures.
We report the results from a quantitative and qualitative analysis that show how SAFE provides a noticeable performance improvement with respect to previous solutions.
Furthermore, we show how clusters of our embedding vectors are closely related to the semantics of the implemented algorithms, paving the way for further interesting applications (e.g., semantic-based binary function search).
The IETF recently standardized the Opus codec as RFC6716.
Opus targets a wide range of real-time Internet applications by combining a linear prediction coder with a transform coder.
We describe the transform coder, with particular attention to the psychoacoustic knowledge built into the format.
The result out-performs existing audio codecs that do not operate under real-time constraints.
Chemical multisensor devices need calibration algorithms to estimate gas concentrations.
Their possible adoption as indicative air quality measurement devices poses new challenges due to the need to operate in continuous monitoring mode in uncontrolled environments.
Several issues, including slow dynamics, continue to affect their real world performances.
At the same time, estimating pollutant concentrations on board the devices, especially for wearables and IoT deployments, is becoming highly desirable.
In this framework, several calibration approaches have been proposed and tested on a variety of proprietary devices and datasets; still, no thorough comparison is available to researchers.
This work attempts a benchmarking of the most promising calibration algorithms according to recent literature with a focus on machine learning approaches.
We test the techniques against absolute and dynamic performances, generalization capabilities and computational/storage needs using three different datasets sharing continuous monitoring operation methodology.
Our results can guide researchers and engineers in the choice of optimal strategy.
They show that non-linear multivariate techniques yield reproducible results, outperforming linear approaches.
Specifically, the Support Vector Regression method consistently shows good performances in all the considered scenarios.
We highlight the enhanced suitability of shallow neural networks in a trade-off between performance and computational/storage needs.
We confirm, on a much wider basis, the advantages of dynamic approaches with respect to static ones that only rely on instantaneous sensor array response.
The latter have been shown to be the best choice whenever a prompt and precise response is needed.
The ultimate goal of this indoor mapping research is to automatically reconstruct a floorplan simply by walking through a house with a smartphone in a pocket.
This paper tackles this problem by proposing FloorNet, a novel deep neural architecture.
The challenge lies in the processing of RGBD streams spanning a large 3D space.
FloorNet effectively processes the data through three neural network branches: 1) PointNet with 3D points, exploiting the 3D information; 2) CNN with a 2D point density image in a top-down view, enhancing the local spatial reasoning; and 3) CNN with RGB images, utilizing the full image information.
FloorNet exchanges intermediate features across the branches to exploit the best of all the architectures.
We have created a benchmark for floorplan reconstruction by acquiring RGBD video streams for 155 residential houses or apartments with Google Tango phones and annotating complete floorplan information.
Our qualitative and quantitative evaluations demonstrate that the fusion of three branches effectively improves the reconstruction quality.
We hope that the paper together with the benchmark will be an important step towards solving a challenging vector-graphics reconstruction problem.
Code and data are available at https://github.com/art-programmer/FloorNet.
Business Process Management (BPM) is a central element of today's organizations.
Although over the years its main focus has been the support of processes in highly controlled domains, nowadays many domains of interest to the BPM community are characterized by ever-changing requirements, unpredictable environments, and increasing amounts of data that influence the execution of process instances.
Under such dynamic conditions, BPM systems must increase their level of automation to provide the reactivity and flexibility necessary for process management.
On the other hand, the Artificial Intelligence (AI) community has concentrated its efforts on investigating dynamic domains that involve active control of computational entities and physical devices (e.g., robots, software agents, etc.).
In this context, Automated Planning, one of the oldest areas in AI, is conceived as a model-based approach to synthesize autonomous behaviours in an automated way from a model.
In this paper, we discuss how automated planning techniques can be leveraged to enable new levels of automation and support for business processing, and we show some concrete examples of their successful application to the different stages of the BPM life cycle.
We consider the setting of a Master server, M, who possesses confidential data (e.g., personal, genomic or medical data) and wants to run intensive computations on it, as part of a machine learning algorithm for example.
The Master wants to distribute these computations to untrusted workers who have volunteered or are incentivized to help with this task.
However, the data must be kept private and not revealed to the individual workers.
Some of the workers may be stragglers, e.g., slow or busy, and will take a random time to finish the task assigned to them.
We are interested in reducing the delays experienced by the Master.
We focus on linear computations as an essential operation in many iterative algorithms such as principal component analysis, support vector machines and other gradient-descent based algorithms.
A classical solution is to use a linear secret sharing scheme, such as Shamir's scheme, to divide the data into secret shares on which the workers can perform linear computations.
However, classical codes can provide straggler mitigation assuming a worst-case scenario of a fixed number of stragglers.
We propose a solution based on new secure codes, called Staircase codes, introduced previously by two of the authors.
Staircase codes allow flexibility in the number of stragglers up to a given maximum, and universally achieve the information theoretic limit on the download cost by the Master, leading to latency reduction.
Under the shifted exponential model, we find upper and lower bounds on the Master's mean waiting time.
We derive the distribution of the Master's waiting time, and its mean, for systems with up to two stragglers.
For systems with any number of stragglers, we derive an expression that can give the exact distribution, and the mean, of the waiting time of the Master.
We show that Staircase codes always outperform classical secret sharing codes.
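The classical Shamir baseline discussed above can be sketched as follows; the field size, parameters, and linear map are illustrative choices, and the Staircase construction itself is not shown. Workers compute a linear function on their shares, and any t+1 responses suffice to reconstruct, so up to n-t-1 stragglers can be ignored:

```python
import random

P = 2**31 - 1  # an illustrative prime field

def share(secret, t, n):
    """Shamir (t+1)-out-of-n sharing: a random degree-t polynomial
    with constant term `secret`, evaluated at points 1..n."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [(x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(points):
    """Lagrange interpolation at 0 to recover the secret."""
    total = 0
    for x_i, y_i in points:
        num, den = 1, 1
        for x_j, _ in points:
            if x_j != x_i:
                num = num * (-x_j) % P
                den = den * (x_i - x_j) % P
        total = (total + y_i * num * pow(den, P - 2, P)) % P
    return total

# Master shares data d; each worker applies the linear map a*y + b
# to its share. The result is a share of a*d + b.
a, b, d, t, n = 7, 3, 123456, 1, 4
worker_results = [(x, (a * y + b) % P) for x, y in share(d, t, n)]
# Any t+1 = 2 responses suffice; the other workers may straggle.
print(reconstruct(worker_results[:2]))  # == (a*d + b) % P
```

Because the shares of d form a degree-t polynomial, applying an affine map per share yields shares of the mapped secret, which is why linear computations (matrix-vector products, gradients) distribute so naturally in this setting.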
Learning graph representations via low-dimensional embeddings that preserve relevant network properties is an important class of problems in machine learning.
We here present a novel method to embed directed acyclic graphs.
Following prior work, we first advocate for using hyperbolic spaces which provably model tree-like structures better than Euclidean geometry.
Second, we view hierarchical relations as partial orders defined using a family of nested geodesically convex cones.
We prove that these entailment cones admit an optimal shape with a closed form expression both in the Euclidean and hyperbolic spaces, and they canonically define the embedding learning process.
Experiments show significant improvements of our method over strong recent baselines both in terms of representational capacity and generalization.
This paper presents a new algorithm, the Modified Moving Contracting Window Pattern Algorithm (CMCWPM), for the calculation of field similarity.
It strongly relies on previous work by Yang et al.
(2001), correcting previous work in which characters marked as inaccessible for further pattern matching were not treated as boundaries between subfields, occasionally leading to higher than expected scores of field similarity.
A reference Python implementation is provided.
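For illustration, a contracting-window matcher in the spirit of the algorithm, in which matched characters are both marked inaccessible and treated as subfield boundaries (the correction described above), might look as follows. The weighting and normalization here are illustrative choices, not the paper's exact formula; consult the reference implementation for the authoritative version:

```python
def cmcwpm_like_similarity(f1, f2):
    """Illustrative contracting-window field similarity.

    Windows of decreasing size from f1 are searched in f2; matched
    characters become inaccessible AND act as boundaries, so later
    windows cannot span them. Scores sum of w^2 over matches,
    normalized so identical fields score 1 (illustrative choice).
    """
    a, b = list(f1), list(f2)
    used_a, used_b = [False] * len(a), [False] * len(b)

    def free_runs(used, length):
        """Maximal runs of still-accessible positions."""
        runs, start = [], None
        for i in range(length):
            if not used[i] and start is None:
                start = i
            if used[i] and start is not None:
                runs.append((start, i))
                start = None
        if start is not None:
            runs.append((start, length))
        return runs

    total = 0
    for w in range(min(len(a), len(b)), 0, -1):
        found = True
        while found:
            found = False
            for s1, e1 in free_runs(used_a, len(a)):
                for i in range(s1, e1 - w + 1):
                    pat = a[i:i + w]
                    hit = None
                    for s2, e2 in free_runs(used_b, len(b)):
                        for j in range(s2, e2 - w + 1):
                            if b[j:j + w] == pat:
                                hit = j
                                break
                        if hit is not None:
                            break
                    if hit is not None:
                        for k in range(w):
                            used_a[i + k] = used_b[hit + k] = True
                        total += w * w
                        found = True
                        break
                if found:
                    break
    return 2 * total / (len(a) ** 2 + len(b) ** 2) if a and b else 0.0

print(cmcwpm_like_similarity("abcde", "abcde"))  # 1.0
```

Restricting every window search to the free runs is exactly what prevents a pattern from spanning an already-matched (inaccessible) character, the flaw the modified algorithm corrects.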
Deep spiking neural networks (SNNs) hold great potential for improving the latency and energy efficiency of deep neural networks through event-based computation.
However, training such networks is difficult due to the non-differentiable nature of asynchronous spike events.
In this paper, we introduce a novel technique, which treats the membrane potentials of spiking neurons as differentiable signals, where discontinuities at spike times are only considered as noise.
This enables an error backpropagation mechanism for deep SNNs, which works directly on spike signals and membrane potentials.
Thus, compared with previous methods relying on indirect training and conversion, our technique has the potential to capture the statistics of spikes more precisely.
Our novel framework outperforms all previously reported results for SNNs on the permutation invariant MNIST benchmark, as well as the N-MNIST benchmark recorded with event-based vision sensors.
The information available to robots in real tasks is widely distributed both in time and space, requiring the agent to search for relevant data.
Humans, who face the same problem when sounds, images, and smells reach their sensors in everyday scenes, apply a natural mechanism: attention.
As vision plays an important role in our routine, most research regarding attention has involved this sensorial system and the same has been replicated to the robotics field.
However, most robotics tasks nowadays do not rely only on visual data, which is still costly.
To allow the use of attentive concepts with other robotics sensors that are usually used in tasks such as navigation, self-localization, searching and mapping, a generic attentional model has been previously proposed.
In this work, feature mapping functions were designed to build feature maps for this attentive model from range scanner and sonar data.
Experiments were performed in a high fidelity simulated robotics environment and results have demonstrated the capability of the model on dealing with both salient stimuli and goal-driven attention over multiple features extracted from multiple sensors.
In recent years, deep learning algorithms have become increasingly more prominent for their unparalleled ability to automatically learn discriminant features from large amounts of data.
However, within the field of electromyography-based gesture recognition, deep learning algorithms are seldom employed, as they require an unreasonable amount of effort from a single person to generate tens of thousands of examples.
This work's hypothesis is that general, informative features can be learned from the large amounts of data generated by aggregating the signals of multiple users, thus reducing the recording burden while enhancing gesture recognition.
Consequently, this paper proposes applying transfer learning on aggregated data from multiple users, while leveraging the capacity of deep learning algorithms to learn discriminant features from large datasets.
Two datasets comprised of 19 and 17 able-bodied participants respectively (the first one is employed for pre-training) were recorded for this work, using the Myo Armband.
A third Myo Armband dataset was taken from the NinaPro database and is comprised of 10 able-bodied participants.
Three different deep learning networks employing three different modalities as input (raw EMG, Spectrograms and Continuous Wavelet Transform (CWT)) are tested on the second and third dataset.
The proposed transfer learning scheme is shown to systematically and significantly enhance the performance for all three networks on the two datasets, achieving an offline accuracy of 98.31% for 7 gestures over 17 participants for the CWT-based ConvNet and 68.98% for 18 gestures over 10 participants for the raw EMG-based ConvNet.
Finally, a use-case study employing eight able-bodied participants suggests that real-time feedback allows users to adapt their muscle activation strategy which reduces the degradation in accuracy normally experienced over time.
Acoustic ranging based indoor positioning solutions have the advantage of higher ranging accuracy and better compatibility with commercial-off-the-self consumer devices.
However, similar to other time-domain based approaches using Time-of-Arrival and Time-Difference-of-Arrival, they suffer from performance degradation in presence of multi-path propagation and low received signal-to-noise ratio (SNR) in indoor environments.
In this paper, we improve upon our previous work on asynchronous acoustic indoor positioning and develop ARABIS, a robust and low-cost acoustic indoor positioning system (IPS) for mobile devices.
We develop a low-cost acoustic board custom-designed to support large operational ranges and extensibility.
To mitigate the effects of low SNR and multi-path propagation, we devise a robust algorithm that iteratively removes possible outliers by taking advantage of redundant TDoA estimates.
Experiments have been carried out in two testbeds of sizes 10.67m*7.76m and 15m*15m, one in an academic building and one in a convention center.
The proposed system achieves average and 95% quantile localization errors of 7.4cm and 16.0cm in the first testbed with 8 anchor nodes and average and 95% quantile localization errors of 20.4cm and 40.0cm in the second testbed with 4 anchor nodes only.
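The iterative outlier-removal idea above can be illustrated in miniature. The actual algorithm operates on redundant TDoA estimates in 2-D; this sketch applies the same discard loop to simplified 1-D position estimates (a hypothetical reduction, purely to show the loop structure):

```python
def robust_estimate(measurements, threshold):
    """Iteratively discard the worst-fitting measurement while its
    residual exceeds `threshold`, then return the estimate from the
    surviving measurements. Here the 1-D location is just the mean;
    the real system fits a 2-D position to redundant TDoA pairs.
    """
    pts = list(measurements)
    while len(pts) > 2:
        est = sum(pts) / len(pts)
        residuals = [abs(p - est) for p in pts]
        worst = max(range(len(pts)), key=residuals.__getitem__)
        if residuals[worst] <= threshold:
            break
        pts.pop(worst)  # likely a multi-path or low-SNR outlier
    return sum(pts) / len(pts)

# Four consistent position estimates plus one multi-path outlier.
print(robust_estimate([2.01, 1.98, 2.02, 2.00, 5.7], 0.5))  # ~2.0
```

The redundancy of TDoA estimates is what makes this safe: as long as enough consistent measurements survive, removing the worst outlier tightens the fit rather than degrading it.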
Clinical decision support systems (CDSS) are widely used to assist with medical decision making.
However, CDSS typically require manually curated rules and other data which are difficult to maintain and keep up-to-date.
Recent systems leverage advanced deep learning techniques and electronic health records (EHR) to provide more timely and precise results.
Many of these techniques have been developed with a common focus on predicting upcoming medical events.
However, while the prediction results from these approaches are promising, their value is limited by their lack of interpretability.
To address this challenge, we introduce CarePre, an intelligent clinical decision assistance system.
The system extends a state-of-the-art deep learning model to predict upcoming diagnosis events for a focal patient based on his/her historical medical records.
The system includes an interactive framework together with intuitive visualizations designed to support the diagnosis, treatment outcome analysis, and the interpretation of the analysis results.
We demonstrate the effectiveness and usefulness of the CarePre system by reporting results from a quantitative evaluation of the prediction algorithm, a case study, and three interviews with senior physicians.
Ensuring maximum utilization of the limited bandwidth resources and improved quality of service (QoS) is a key issue for wireless communication networks.
Excessive call blocking is a constraint to attain the desired QoS.
In cellular network, as the traffic arrival rate increases, call blocking probability (CBP) increases considerably.
To address this, we propose a scheme that reduces the call blocking probability while keeping the call dropping probability (CDP) approximately steady.
Our proposed scheme also introduces the acceptance factor in specific guard channel where originating calls get access according to the acceptance factor.
Analytical results show that our scheme outperforms the conventional new-call bounding scheme at both higher and lower traffic arrival rates.
The different sets of regulations existing for different agencies within the government make the task of creating AI-enabled solutions in government difficult.
Regulatory restrictions inhibit sharing of data across different agencies, which could be a significant impediment to training AI models.
We discuss the challenges that exist in environments where data cannot be freely shared and assess technologies which can be used to work around these challenges.
We present results on building AI models using the concept of federated AI, which allows creation of models without moving the training data around.
Over the past three decades, considerable effort has been devoted to the study of software architecture.
A major portion of this effort has focused on the originally proposed view of four "C"s---components, connectors, configurations, and constraints---that are the building blocks of a system's architecture.
Despite being simple and appealing, this view has proven to be incomplete and has required further elaboration.
To that end, researchers have more recently tried to approach architectures from another important perspective---that of design decisions that yield a system's architecture.
These more recent efforts have lacked a precise understanding of several key questions, however: (1) What is an architectural design decision (definition)?
(2) How can architectural design decisions be found in existing systems (identification)?
(3) What system decisions are and are not architectural (classification)?
(4) How are architectural design decisions manifested in the code (reification)?
(5) How can important architectural decisions be preserved and/or changed as desired (evolution)?
This paper presents a technique targeted at answering these questions by analyzing information that is readily available about software systems.
We applied our technique on over 100 different versions of two widely adopted open-source systems, and found that it can accurately uncover the architectural design decisions embodied in the systems.
We are aiming at a semantics of logic programs with preferences defined on rules, which always selects a preferred answer set, if there is a non-empty set of (standard) answer sets of the given program.
It is shown in a seminal paper by Brewka and Eiter that the goal mentioned above is incompatible with their second principle and it is not satisfied in their semantics of prioritized logic programs.
Similarly, also according to other established semantics, based on a prescriptive approach, there are programs with standard answer sets, but without preferred answer sets.
According to the standard prescriptive approach no rule can be fired before a more preferred rule, unless the more preferred rule is blocked.
This approach is rather imperative in spirit.
In our approach, rules can be blocked by more preferred rules, but the rules which are not blocked are handled in a more declarative style, their execution does not depend on the given preference relation on the rules.
An argumentation framework (different from Dung's framework) is proposed in this paper.
Argumentation structures are derived from the rules of a given program.
An attack relation on argumentation structures is defined, which is derived from attacks of more preferred rules against the less preferred rules.
Preferred answer sets correspond to complete argumentation structures, which are not blocked by other complete argumentation structures.
Ranking algorithms are the information gatekeepers of the Internet era.
We develop a stylized model to study the effects of ranking algorithms on opinion dynamics.
We consider a search engine that uses an algorithm based on popularity and on personalization.
We find that popularity-based rankings generate an advantage of the fewer effect: fewer websites reporting a given signal attract relatively more traffic overall.
This highlights a novel, ranking-driven channel that explains the diffusion of misinformation, as websites reporting incorrect information may attract an amplified amount of traffic precisely because they are few.
Furthermore, when individuals provide sufficiently positive feedback to the ranking algorithm, popularity-based rankings tend to aggregate information while personalization acts in the opposite direction.
Citations are commonly held to represent scientific impact.
To date, however, there is no empirical evidence in support of this postulate that is central to research assessment exercises and Science of Science studies.
Here, we report on the first empirical verification of the degree to which citation numbers represent scientific impact as it is actually perceived by experts in their respective field.
We run a large-scale survey of about 2000 corresponding authors who performed a pairwise impact assessment task across more than 20000 scientific articles.
Results of the survey show that citation data and perceived impact do not align well, unless one properly accounts for strong psychological biases that affect the opinions of experts with respect to their own papers vs. those of others.
First, researchers tend to largely prefer their own publications to the most cited papers in their field of research.
Second, there is only a mild positive correlation between the number of citations of top-cited papers in given research areas and expert preference in pairwise comparisons.
This also applies to pairs of papers with several orders of magnitude differences in their total number of accumulated citations.
However, when researchers were asked to choose among pairs of their own papers, thus eliminating the bias favouring one's own papers over those of others, they did systematically prefer the most cited article.
We conclude that, when scientists have full information and are making unbiased choices, expert opinion on impact is congruent with citation numbers.
In this paper, we consider the notion of a direct type algorithm introduced by V. A. Bondarenko in 1983.
A direct type algorithm is a linear decision tree with some special properties.
Until recently, it was thought that the class of direct type algorithms is wide and includes many classical combinatorial algorithms, including the branch-and-bound algorithm for the traveling salesman problem proposed by J. D. C. Little, K. G. Murty, D. W. Sweeney, and C. Karel in 1963.
We show that this algorithm is not a direct type algorithm.
This work presents an algorithm for changing from latitudinal to longitudinal formation of autonomous aircraft squadrons.
The maneuvers are defined dynamically by using a predefined set of 3D basic maneuvers.
This formation changing is necessary when the squadron has to perform tasks which demand both formations, such as lift off, georeferencing, obstacle avoidance and landing.
Simulations show that the formation changing is made without collision.
The time complexity analysis of the transformation algorithm reveals that its efficiency is optimal, and the proof of correctness ensures its longitudinal formation features.
We address the problem of super-resolution frequency recovery using prior knowledge of the structure of a spectrally sparse, undersampled signal.
In many applications of interest, some structure information about the signal spectrum is often known.
The prior information might be simply knowing precisely some signal frequencies or the likelihood of a particular frequency component in the signal.
We devise a general semidefinite program to recover these frequencies using theories of positive trigonometric polynomials.
Our theoretical analysis shows that, given sufficient prior information, perfect signal reconstruction is possible using signal samples no more than thrice the number of signal frequencies.
Numerical experiments demonstrate great performance enhancements using our method.
We show that the nominal resolution necessary for the grid-free results can be improved if prior information is suitably employed.
We present a Polyhedral Scene Generator system which creates a random scene based on a few user parameters, renders the scene from random view points and creates a dataset containing the renderings and corresponding annotation files.
We hope that this generator will enable research on how a program could parse a scene if it had multiple viewpoints to consider.
For ambiguous scenes, typically people move their head or change their position to see the scene from different angles as well as seeing how it changes while they move; this research field is called active perception.
The random scene generator presented is designed to support research in this field by generating images of scenes with known complexity characteristics and with verifiable properties with respect to the distribution of features across a population.
Thus, it is well-suited for research in active perception without the requirement of a live 3D environment and mobile sensing agent, including comparative performance evaluations.
The system is publicly available at https://polyhedral.eecs.yorku.ca.
While conventional depth estimation can infer the geometry of a scene from a single RGB image, it fails to estimate scene regions that are occluded by foreground objects.
This limits the use of depth prediction in augmented and virtual reality applications, which aim at scene exploration by synthesizing the scene from a different vantage point, or at diminished reality.
To address this issue, we shift the focus from conventional depth map prediction to the regression of a specific data representation called Layered Depth Image (LDI), which contains information about the occluded regions in the reference frame and can fill in occlusion gaps in case of small view changes.
We propose a novel approach based on Convolutional Neural Networks (CNNs) to jointly predict depth maps and foreground separation masks used to condition Generative Adversarial Networks (GANs) for hallucinating plausible color and depths in the initially occluded areas.
We demonstrate the effectiveness of our approach for novel scene view synthesis from a single image.
We present results of empirical studies on positive speech on Twitter.
By positive speech we understand speech that works for the betterment of a given situation, in this case relations between different communities in a conflict-prone country.
We worked with four Twitter data sets.
Through semi-manual opinion mining, we found that positive speech accounted for < 1% of the data.
In fully automated studies, we tested two approaches: unsupervised statistical analysis, and supervised text classification based on distributed word representation.
We discuss benefits and challenges of those approaches and report empirical evidence obtained in the study.
In this paper, we introduce the "Power Linear Unit" (PoLU), which increases the nonlinearity capacity of a neural network and thus helps improve its performance.
PoLU adopts several advantages of previously proposed activation functions.
First, the output of PoLU for positive inputs is designed to be the identity, to avoid the vanishing gradient problem.
Second, PoLU has a non-zero output for negative inputs such that the output mean of the units is close to zero, hence reducing the bias shift effect.
Third, there is a saturation on the negative part of PoLU, which makes it more noise-robust for negative inputs.
Furthermore, we prove that PoLU is able to map more portions of every layer's input to the same space by using the power function and thus increases the number of response regions of the neural network.
We use image classification for comparing our proposed activation function with others.
In the experiments, MNIST, CIFAR-10, CIFAR-100, Street View House Numbers (SVHN) and ImageNet are used as benchmark datasets.
The neural networks we implemented include widely-used ELU-Network, ResNet-50, and VGG16, plus a couple of shallow networks.
Experimental results show that our proposed activation function outperforms other state-of-the-art models with most networks.
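One negative-branch form consistent with the properties listed above (identity for x >= 0, a non-zero, saturating power-function output for x < 0) is sketched below; the paper's exact parameterization may differ:

```python
def polu(x, n=2.0):
    """Power Linear Unit sketch: identity for positive inputs, a
    saturating power function for negative inputs. The form
    (1 - x)**(-n) - 1 matches the description in the abstract
    (non-zero negative outputs, saturation toward -1), but the
    paper's exact parameterization is an assumption here."""
    return x if x >= 0 else (1.0 - x) ** (-n) - 1.0

print(polu(3.0))        # 3.0 (identity for x >= 0)
print(polu(-100.0))     # close to -1 (negative-side saturation)
```

The function is continuous at 0 (both branches give 0), keeps gradients intact on the positive side, and bounds the negative side in (-1, 0), which is what shifts unit activations toward zero mean.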
The security of communication in everyday life is becoming very important.
On the other hand, all existing encryption protocols require additional knowledge and resources from the user.
In this paper we discuss the problem of public key distribution between interested parties.
We propose to use popular social media as a channel to publish public keys.
This way of key distribution also makes it easy to connect the key owner with a real person or institution (which is not always easy).
Recognizing that mobile devices have become the main tool of communication, we present a description of a mobile application that uses the proposed security methods.
Mobile phone calling is one of the most widely used communication methods in modern society.
The records of calls among mobile phone users provide a valuable proxy for understanding human communication patterns embedded in social networks.
Mobile phone users call each other forming a directed calling network.
If only reciprocal calls are considered, we obtain an undirected mutual calling network.
The preferential communication behavior between two connected users can be statistically tested and it results in two Bonferroni networks with statistically validated edges.
We perform a comparative analysis of the statistical properties of these four networks, which are constructed from the calling records of more than nine million individuals in Shanghai over a period of 110 days.
We find that these networks share many common structural properties and also exhibit idiosyncratic features when compared with previously studied large mobile calling networks.
The empirical findings paint an intriguing picture of a representative large social network and might shed new light on the modelling of large social networks.
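The four network variants described above can be sketched in a few lines (toy call records rather than the Shanghai dataset; the Bonferroni statistical-validation step is omitted here):

```python
# Building a directed calling network and its undirected mutual (reciprocal)
# counterpart from a list of call records. Edge multiplicities count calls.
from collections import Counter

calls = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "c"), ("a", "b")]

directed = Counter(calls)                      # directed calling network
mutual = {frozenset(e) for e in directed       # mutual calling network: keep a
          if (e[1], e[0]) in directed}         # pair only if both directions occur

print(sorted(tuple(sorted(e)) for e in mutual))
```

Here ("a", "c") is dropped from the mutual network because "c" never calls "a" back, while the reciprocal pairs ("a", "b") and ("c", "d") survive.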
In the k-Apex problem the task is to find at most k vertices whose deletion makes the given graph planar.
The graphs for which there exists a solution form a minor-closed class of graphs; hence, by the deep results of Robertson and Seymour, there is an O(n^3) time algorithm for every fixed value of k. However, the proof is extremely complicated and the constants hidden by the big-O notation are huge.
Here we give a much simpler algorithm for this problem with quadratic running time, by iteratively reducing the input graph and then applying techniques for graphs of bounded treewidth.
The massive sizes of real-world graphs, such as social networks and the web graph, impose serious challenges for processing and performing analytics on them.
These issues can be resolved by working on a small summary of the graph instead.
A summary is a compressed version of the graph that removes several details, yet preserves its essential structure.
Generally, some predefined quality measure of the summary is optimized to bound the approximation error incurred by working on the summary instead of the whole graph.
All known summarization algorithms are computationally prohibitive and do not scale to large graphs.
In this paper we present an efficient randomized algorithm to compute graph summaries with the goal to minimize reconstruction error.
We propose a novel weighted sampling scheme to sample vertices for merging that will result in the least reconstruction error.
We provide analytical bounds on the running time of the algorithm and prove approximation guarantee for our score computation.
Efficiency of our algorithm makes it scalable to very large graphs on which known algorithms cannot be applied.
We test our algorithm on several real world graphs to empirically demonstrate the quality of summaries produced and compare to state of the art algorithms.
We use the summaries to answer several structural queries about original graph and report their accuracies.
Benefiting from the rapid development of deep learning techniques, salient object detection has achieved remarkable progress recently.
However, two major challenges still hinder its application in embedded devices: low-resolution output and heavy model weight.
To this end, this paper presents an accurate yet compact deep network for efficient salient object detection.
More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while maintaining accuracy.
Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner.
By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy.
Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
Vajda and Buttyan (VB) proposed a set of five lightweight RFID authentication protocols.
Defend, Fu, and Juels (DFJ) cryptanalyzed two of them, XOR and SUBSET.
Against the XOR protocol, DFJ proposed the repeated-keys attack and the nibble attack.
In this paper, we identify the vulnerability in VB's original successive session-key permutation algorithm.
We propose three enhancements that prevent DFJ's attacks and make the XOR protocol stronger without introducing extra resource cost.
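The weakness that a repeated-keys attack exploits can be illustrated with a generic XOR cipher (a hedged toy example, not the actual VB protocol or DFJ's attack procedure): once the same key material encrypts two messages, XORing the ciphertexts cancels the key entirely.

```python
# Toy demonstration: reusing an XOR key leaks the XOR of the plaintexts.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = bytes([0x5A] * 8)                  # the same key reused across sessions
m1, m2 = b"TAG-0001", b"TAG-0002"
c1, c2 = xor_bytes(m1, key), xor_bytes(m2, key)

leak = xor_bytes(c1, c2)                 # the key cancels out completely
assert leak == xor_bytes(m1, m2)         # an eavesdropper learns m1 XOR m2
print(leak.hex())
```

Any structure shared by the two messages (here, everything but the last byte) is exposed directly, which is why session keys must not repeat.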
Fuzzy logic programming is a growing declarative paradigm aiming to integrate fuzzy logic into logic programming.
One of the most difficult tasks when specifying a fuzzy logic program is determining the right weights for each rule, as well as the most appropriate fuzzy connectives and operators.
In this paper, we introduce a symbolic extension of fuzzy logic programs in which some of these parameters can be left unknown, so that the user can easily see the impact of their possible values.
Furthermore, given a number of test cases, the most appropriate values for these parameters can be automatically computed.
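As a rough illustration of the idea (the rule shape and the product t-norm below are assumptions for the sketch, not the paper's syntax), an unknown rule weight can be recovered from test cases by minimizing the squared error over a grid of candidate values:

```python
# A fuzzy rule "p <-w- q & r" with product conjunction and unknown weight w.
# Given test cases (input truth degrees and desired output degrees), pick
# the w on a grid that best reproduces the expected truth degrees.
def rule_truth(w, q, r):
    return w * (q * r)            # product t-norm scaled by the rule weight

tests = [((0.8, 0.9), 0.54), ((0.5, 0.6), 0.225)]   # ((q, r), expected p)

best_w = min((w / 100 for w in range(101)),
             key=lambda w: sum((rule_truth(w, q, r) - p) ** 2
                               for (q, r), p in tests))
print(best_w)
```

Both test cases are consistent with w = 0.75, so the grid search recovers that value.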
To develop a knowledge-aware recommender system, a key data problem is how we can obtain rich and structured knowledge information for recommender system (RS) items.
Existing datasets or methods either use side information from the original recommender systems (containing very few kinds of useful information) or utilize a private knowledge base (KB).
In this paper, we present the first public linked KB dataset for recommender systems, named KB4Rec v1.0, which has linked three widely used RS datasets with the popular KB Freebase.
Based on our linked dataset, we first perform some interesting qualitative analysis experiments, in which we discuss the effect of two important factors (i.e., popularity and recency) on whether an RS item can be linked to a KB entity.
Finally, we present the comparison of several knowledge-aware recommendation algorithms on our linked dataset.
In a typical real-world application of re-id, a watch-list (gallery set) of a handful of target people (e.g., suspects) must be tracked among a large volume of non-target people across camera views; this is called open-world person re-id.
Unlike conventional (closed-world) person re-id, in the open-world setting a large portion of probe samples do not come from target people.
Moreover, a non-target person may look similar to a target one and thus seriously challenge a re-id system.
In this work, we introduce a deep open-world group-based person re-id model based on adversarial learning to alleviate the attack problem caused by similar non-target people.
The main idea is learning to attack feature extractor on the target people by using GAN to generate very target-like images (imposters), and in the meantime the model will make the feature extractor learn to tolerate the attack by discriminative learning so as to realize group-based verification.
The proposed framework, called adversarial open-world person re-identification, is realized by our Adversarial PersonNet (APN), which jointly learns a generator, a person discriminator, a target discriminator, and a feature extractor; the feature extractor and target discriminator share the same weights, so that the feature extractor learns to tolerate attacks by imposters for better group-based verification.
While open-world person re-id is challenging, we show for the first time that the adversarial-based approach helps stabilize a person re-id system under imposter attack more effectively.
Covariant-contravariant simulation and conformance simulation generalize plain simulation and try to capture the fact that it is not always the case that "the larger the number of behaviors, the better".
We have previously studied their logical characterizations and in this paper we present the axiomatizations of the preorders defined by the new simulation relations and their induced equivalences.
The interest of our results lies in the fact that the axiomatizations help us to know the new simulations better, understanding in particular the role of the contravariant characteristics and their interplay with the covariant ones; moreover, the axiomatizations provide us with a powerful tool to (algebraically) prove results of the corresponding semantics.
But we also consider our results interesting from a metatheoretical point of view: the fact that the covariant-contravariant simulation equivalence is indeed ground axiomatizable when there is no action that exhibits both a covariant and a contravariant behaviour, but becomes non-axiomatizable whenever we have together actions of that kind and either covariant or contravariant actions, offers us a new subtle example of the narrow border separating axiomatizable and non-axiomatizable semantics.
We expect that by studying these examples we will be able to develop a general theory separating axiomatizable and non-axiomatizable semantics.
In this work, we propose a robust Head-Related Transfer Function (HRTF)-based polynomial beamformer design which accounts for the influence of a humanoid robot's head on the sound field.
In addition, it allows for a flexible steering of our previously proposed robust HRTF-based beamformer design.
We evaluate the HRTF-based polynomial beamformer design and compare it to the original HRTF-based beamformer design by means of signal-independent measures as well as word error rates of an off-the-shelf speech recognition system.
Our results confirm the effectiveness of the polynomial beamformer design, which makes it a promising approach to robust beamforming for robot audition.
Mass segmentation is an important task in mammogram analysis, providing effective morphological features and regions of interest (ROI) for mass detection and classification.
Inspired by the success of using deep convolutional features for natural image analysis and conditional random fields (CRF) for structural learning, we propose an end-to-end network for mammographic mass segmentation.
The network employs a fully convolutional network (FCN) to model the potential functions, followed by a CRF to perform structural learning.
Because the mass distribution varies greatly with pixel position, the FCN is combined with a position prior for the task.
Due to the small size of mammogram datasets, we use adversarial training to control over-fitting.
Four models with different convolutional kernels are further fused to improve the segmentation results.
Experimental results on two public datasets, INbreast and DDSM-BCRP, show that our end-to-end network combined with adversarial training achieves state-of-the-art results.
Inference models are a key component in scaling variational inference to deep latent variable models, most notably as encoder networks in variational auto-encoders (VAEs).
By replacing conventional optimization-based inference with a learned model, inference is amortized over data examples and therefore more computationally efficient.
However, standard inference models are restricted to direct mappings from data to approximate posterior estimates.
The failure of these models to reach fully optimized approximate posterior estimates results in an amortization gap.
We aim toward closing this gap by proposing iterative inference models, which learn to perform inference optimization through repeatedly encoding gradients.
Our approach generalizes standard inference models in VAEs and provides insight into several empirical findings, including top-down inference techniques.
We demonstrate the inference optimization capabilities of iterative inference models and show that they outperform standard inference models on several benchmark data sets of images and text.
Most crowd abnormal-event detection methods rely on complex hand-crafted features to represent crowd motion and appearance.
Convolutional Neural Networks (CNNs) have been shown to be a powerful tool with excellent representational capacity, which can alleviate the need for hand-crafted features.
In this paper, we show that keeping track of the changes in the CNN feature across time can facilitate capturing the local abnormality.
We specifically propose a novel measure-based method which allows measuring the local abnormality in a video by combining semantic information (inherited from existing CNN models) with low-level Optical-Flow.
One advantage of this method is that it can be used without incurring fine-tuning costs.
The proposed method is validated on challenging abnormality detection datasets and the results show the superiority of our method compared to the state-of-the-art methods.
Traditional methods to achieve high localization accuracy with tactile sensors usually use a matrix of miniaturized individual sensors distributed on the area of interest.
This approach usually comes at the price of increased complexity in fabrication and circuitry, and can be hard to adapt to non-planar geometries.
We propose to use low cost optic components mounted on the edges of the sensing area to measure how light traveling through an elastomer is affected by touch.
Multiple light emitters and receivers provide us with a rich signal set that contains the necessary information to pinpoint both the location and depth of an indentation with high accuracy.
We demonstrate sub-millimeter accuracy in location and depth on a 20 mm by 20 mm active sensing area.
Our sensor provides high depth sensitivity as a result of two different modalities in how light is guided through our elastomer.
This method results in a low cost, easy to manufacture sensor.
We believe this approach can be adapted to cover non-planar surfaces, simplifying future integration in robot skin applications.
The design and the implementation of a genetic algorithm are described.
The application domain is structure-activity relationships expressed as multiple linear regressions, with predictor variables drawn from families of structure-based molecular descriptors.
An experiment to compare different selection and survival strategies was designed and realized.
The genetic algorithm was run, using the designed experiment, on a set of 206 polychlorinated biphenyls, searching for structure-activity relationships given the measured octanol-water partition coefficients and a family of molecular descriptors.
The experiment shows that different selection and survival strategies create different partitions on the entire population of all possible genotypes.
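A minimal GA of this kind can be sketched on synthetic data (the tournament selection, elitist survival, and size-penalized fitness below are illustrative choices, not the paper's exact strategies or dataset):

```python
# Genotypes are bitmasks selecting predictor columns for a least-squares fit;
# fitness rewards low residual error and penalizes model size.
import random
import numpy as np

rng = random.Random(0)
np.random.seed(0)

n, p = 80, 8
X = np.random.randn(n, p)
y = 2.0 * X[:, 1] - X[:, 4] + 0.05 * np.random.randn(n)   # true predictors: 1 and 4

def fitness(mask):
    cols = [j for j in range(p) if mask[j]]
    if not cols:
        return -1e9                                # empty model is invalid
    A = X[:, cols]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    return -(rss + 0.5 * len(cols))                # penalize model size

pop = [[rng.randint(0, 1) for _ in range(p)] for _ in range(30)]
for _ in range(40):
    children = []
    while len(children) < len(pop):
        a = max(rng.sample(pop, 3), key=fitness)   # tournament selection
        b = max(rng.sample(pop, 3), key=fitness)
        cut = rng.randrange(1, p)
        child = a[:cut] + b[cut:]                  # one-point crossover
        if rng.random() < 0.2:                     # bit-flip mutation
            j = rng.randrange(p)
            child[j] ^= 1
        children.append(child)
    pop = sorted(pop + children, key=fitness, reverse=True)[:30]  # elitist survival

best = max(pop, key=fitness)
print(best)
```

With this setup the search settles on the mask selecting the two true predictors; swapping the selection or survival strategy changes which genotypes the population can reach, which is the effect the designed experiment measures.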
This paper presents a novel statistical state-dependent timing model for voltage-overscaled (VoS) logic circuits that accurately and rapidly finds the timing distribution of output bits.
Using this model, erroneous VoS circuits can be represented as error-free circuits combined with an error injector.
A case study of a two-point DFT unit employing the proposed model is presented and compared to HSPICE circuit simulation.
Results show an accurate match, with significant speedup gains.
Convolutional neural networks (CNNs) show impressive performance for image classification and detection, and are increasingly applied in the medical image domain.
Nevertheless, medical experts are sceptical of these predictions, as the nonlinear multilayer structure that produces a classification outcome is not directly interpretable.
Recently, approaches have been shown which help the user to understand the discriminative regions within an image which are decisive for the CNN to conclude to a certain class.
Although these approaches could help to build trust in CNN predictions, they have rarely been shown to work with medical image data, which often poses a challenge because the decision for a class relies on different lesion areas scattered over the entire image.
Using the DiaretDB1 dataset, we show that on retina images different lesion areas fundamental for diabetic retinopathy are detected on an image level with high accuracy, comparable or exceeding supervised methods.
On the lesion level, we achieve few false positives with high sensitivity, even though the network is trained solely on image-level labels that include no information about existing lesions.
Classifying between diseased and healthy images, we achieve an AUC of 0.954 on the DiaretDB1.
Abstract Dialectical Frameworks (ADFs) generalize Dung's argumentation frameworks allowing various relationships among arguments to be expressed in a systematic way.
We further generalize ADFs so as to accommodate arbitrary acceptance degrees for the arguments.
This makes ADFs applicable in domains where both the initial status of arguments and their relationship are only insufficiently specified by Boolean functions.
We define all standard ADF semantics for the weighted case, including grounded, preferred and stable semantics.
We illustrate our approach using acceptance degrees from the unit interval and show how other valuation structures can be integrated.
In each case it is sufficient to specify how the generalized acceptance conditions are represented by formulas, and to specify the information ordering underlying the characteristic ADF operator.
We also present complexity results for problems related to weighted ADFs.
Though convolutional neural networks have achieved state-of-the-art performance on various vision tasks, they are extremely vulnerable to adversarial examples, which are obtained by adding human-imperceptible perturbations to the original images.
Adversarial examples can thus be used as a useful tool to evaluate and select the most robust models in safety-critical applications.
However, most of the existing adversarial attacks only achieve relatively low success rates under the challenging black-box setting, where the attackers have no knowledge of the model structure and parameters.
To this end, we propose to improve the transferability of adversarial examples by creating diverse input patterns.
Instead of only using the original images to generate adversarial examples, our method applies random transformations to the input images at each iteration.
Extensive experiments on ImageNet show that the proposed attack method can generate adversarial examples that transfer much better to different networks than existing baselines.
To further improve the transferability, we (1) integrate the recently proposed momentum method into the attack process; and (2) attack an ensemble of networks simultaneously.
Evaluated against top defense submissions and official baselines from the NIPS 2017 adversarial competition, this enhanced attack reaches an average success rate of 73.0%, outperforming the top attack submission in the NIPS competition by a large margin of 6.6%.
We hope that our proposed attack strategy can serve as a benchmark for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods in the future.
The code is publicly available at https://github.com/cihangxie/DI-2-FGSM.
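The input-diversity transformation can be sketched as follows (the sizes, probability, and nearest-neighbour resizing below are illustrative; the published attack works on ImageNet-sized inputs inside the gradient loop):

```python
# With some probability, each attack iteration sees a randomly resized and
# zero-padded copy of the (square, single-channel) image instead of the
# original, decorrelating the gradients from any single input pattern.
import numpy as np

rng = np.random.default_rng(0)

def diverse_input(img: np.ndarray, out_size: int = 40, prob: float = 0.9):
    if rng.random() >= prob:
        return img                                   # keep the original sometimes
    h = int(rng.integers(img.shape[0], out_size + 1))  # random target size
    rows = np.arange(h) * img.shape[0] // h          # nearest-neighbour resize
    cols = np.arange(h) * img.shape[1] // h
    resized = img[np.ix_(rows, cols)]
    top = int(rng.integers(0, out_size - h + 1))     # random zero padding
    left = int(rng.integers(0, out_size - h + 1))
    out = np.zeros((out_size, out_size), dtype=img.dtype)
    out[top:top + h, left:left + h] = resized
    return out

x = np.arange(32 * 32, dtype=float).reshape(32, 32)
print(diverse_input(x).shape)
```

In the attack loop, the gradient of the loss is computed on `diverse_input(x_adv)` rather than on `x_adv` itself at each iteration.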
We propose an approximation method for thresholding of singular values using Chebyshev polynomial approximation (CPA).
Many signal processing problems require iterative application of singular value decomposition (SVD) for minimizing the rank of a given data matrix with other cost functions and/or constraints, which is called matrix rank minimization.
In matrix rank minimization, singular values of a matrix are shrunk by hard-thresholding, soft-thresholding, or weighted soft-thresholding.
However, the computational cost of SVD is generally too expensive to handle high dimensional signals such as images; hence, in this case, matrix rank minimization requires enormous computation time.
In this paper, we leverage CPA to (approximately) manipulate singular values without computing singular values and vectors.
The thresholding of singular values is expressed by a multiplication of certain matrices, which is derived from a characteristic of CPA.
The multiplication is also efficiently computed using the sparsity of signals.
As a result, the computational cost is significantly reduced.
Experimental results suggest the effectiveness of our method through several image processing applications based on matrix rank minimization with nuclear norm relaxation in terms of computation time and approximation precision.
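The core trick can be sketched for soft-thresholding (a simplified version of the idea; the polynomial degree, sampling grid, and power-iteration spectral bound below are illustrative choices, not the paper's exact procedure). Writing the soft-thresholded matrix as A @ h(A^T A) with h(s) = max(1 - tau/sqrt(s), 0), a Chebyshev approximation of h lets us apply the thresholding through matrix products alone, with no SVD:

```python
# Approximate singular-value soft-thresholding via a Chebyshev polynomial
# of A^T A, evaluated with the three-term recurrence; compared against the
# explicit SVD result at the end.
import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
tau = 3.0
M = A.T @ A

# Estimate the top of the spectrum of M with power iteration (no SVD).
v = rng.standard_normal(10)
for _ in range(100):
    v = M @ v
    v /= np.linalg.norm(v)
hi = 1.05 * float(v @ M @ v)

def h(s):                                   # h(s) = max(1 - tau/sqrt(s), 0)
    return np.maximum(1.0 - tau / np.sqrt(np.maximum(s, 1e-12)), 0.0)

s_grid = np.linspace(0.0, hi, 6000)
p = Chebyshev.fit(s_grid, h(s_grid), deg=60, domain=[0.0, hi])

c = p.coef
I = np.eye(10)
Ms = (2.0 * M - hi * I) / hi                # map [0, hi] onto [-1, 1]
T_prev, T_cur = I, Ms
out = c[0] * I + c[1] * Ms
for k in range(2, len(c)):                  # Chebyshev three-term recurrence
    T_prev, T_cur = T_cur, 2.0 * Ms @ T_cur - T_prev
    out += c[k] * T_cur
A_svt = A @ out                             # approximate soft-thresholded A

# Reference: explicit SVD soft-thresholding.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
A_ref = U @ np.diag(np.maximum(S - tau, 0.0)) @ Vt
err = np.linalg.norm(A_svt - A_ref) / np.linalg.norm(A_ref)
print(err)
```

The accuracy is governed by the polynomial degree (the kink of h at tau^2 limits convergence); in the full method the matrix polynomial is additionally evaluated exploiting signal sparsity.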
Rate control is widely adopted during video streaming to provide both high video qualities and low latency under various network conditions.
However, although many approaches have been proposed, they fail to tackle one major problem: previous methods determine the future transmission rate as a single value to be used for an entire time slot, while real-world network conditions, unlike lab setups, often suffer rapid and stochastic changes, causing the predictions to fail.
In this paper, we propose a delay-constrained rate control approach based on end-to-end deep learning.
The proposed model predicts future bit rate not as a single value, but as possible bit rate ranges using target delay gradient, with which the transmission delay is guaranteed.
We collect a large scale of real-world live streaming data to train our model, and as a result, it automatically learns the correlation between throughput and target delay gradient.
We build a testbed to evaluate our approach.
Compared with the state-of-the-art methods, our approach demonstrates a better performance in bandwidth utilization.
In all considered scenarios, the range-based rate control approach outperforms the one without ranges by 19% to 35% in average QoE improvement.
Neural Machine Translation (NMT) is a new approach to Machine Translation (MT), and due to its success it has attracted the attention of many researchers in the field.
In this paper, we study an NMT model on Persian-English language pairs, to analyze the model and investigate its appropriateness for low-resource scenarios, the situation that exists for Persian-centered translation systems.
We adjust the model for the Persian language and find the best parameters and hyperparameters for two tasks: translation and transliteration.
We also apply preprocessing to the Persian dataset, which yields an increase of about one BLEU point.
In addition, we modify the loss function to enhance the word alignment of the model; this new loss function yields a total improvement of 1.87 BLEU points in translation quality.
Grid computing is a type of distributed computing that allows sharing of computer resources through the Internet.
It allows us to share not only files but also most software and hardware resources.
An efficient resource discovery mechanism is a fundamental requirement for grid computing systems, as it supports resource management and scheduling of applications.
Among various discovery mechanisms, Peer-to-Peer (P2P) technology has witnessed rapid development, and the key component of this success is P2P's efficient lookup applications.
Chord is a P2P structural model widely used as a routing protocol to find resources in grid environments.
Plenty of ideas have been implemented by researchers to improve the lookup performance of the Chord protocol in grid environments.
In this paper, we discuss recent research on the Chord structured P2P protocol and present our proposed methods, in which we use the address of a Recently Visited Node (RVN) together with a fuzzy technique to locate grid resources easily while reducing message complexity and time complexity.
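A toy Chord lookup (illustrative only; the RVN and fuzzy extensions are not modeled) shows how finger tables locate a key's successor in a logarithmic number of hops:

```python
# Chord on a 6-bit identifier ring: each node keeps fingers at power-of-two
# distances, and a lookup greedily forwards to the closest preceding finger.
M_BITS = 6                                     # identifier space: 0 .. 63
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])

def successor(k):
    k %= 2 ** M_BITS
    return next((n for n in nodes if n >= k), nodes[0])

fingers = {n: [successor(n + 2 ** i) for i in range(M_BITS)] for n in nodes}

def in_interval(x, a, b):                      # x strictly inside (a, b) on the ring
    return (a < x < b) if a < b else (x > a or x < b)

def lookup(start, key, hops=0):
    succ = fingers[start][0]                   # immediate successor of this node
    if key == succ or in_interval(key, start, succ):
        return succ, hops                      # the key lives on our successor
    for f in reversed(fingers[start]):         # closest preceding finger
        if in_interval(f, start, key):
            return lookup(f, key, hops + 1)
    return succ, hops

print(lookup(1, 45))
```

Starting at node 1, the lookup for key 45 is forwarded through fingers 38 and 42 before resolving to node 48, i.e. the successor of 45; each hop roughly halves the remaining ring distance.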
Binary code clone analysis is an important technique which has a wide range of applications in software engineering (e.g., plagiarism detection, bug detection).
The main challenge of the topic lies in the semantics-equivalent code transformation (e.g., optimization, obfuscation) which would alter representations of binary code tremendously.
Another challenge is the trade-off between detection accuracy and coverage.
Unfortunately, existing techniques still rely on semantics-less code features which are susceptible to the code transformation.
Besides, they adopt merely either a static or a dynamic approach to detect binary code clones, which cannot achieve high accuracy and coverage simultaneously.
In this paper, we propose a semantics-based hybrid approach to detect binary clone functions.
We execute a template binary function with its test cases, and emulate the execution of every target function for clone comparison with the runtime information migrated from that template function.
The semantic signatures are extracted during the execution of the template function and emulation of the target function.
Lastly, a similarity score is calculated from their signatures to measure their likeness.
We implement the approach in a prototype system designated as BinMatch which analyzes IA-32 binary code on the Linux platform.
We evaluate BinMatch with eight real-world projects compiled with different compilation configurations and commonly-used obfuscation methods, performing over 100 million function-pair comparisons in total.
The experimental results show that BinMatch is robust to the semantics-equivalent code transformation.
Besides, it not only covers all target functions for clone analysis, but also improves detection accuracy compared to the state-of-the-art solutions.
In this paper, we propose the Fourier frequency vector (FFV), which is inherently associated with the multidimensional Fourier transform (MDFT).
With the help of the FFV, we are able to give physical meaning to the so-called negative frequencies in the MDFT, which in turn enables multidimensional spatial and space-time series analysis.
The complex exponential representation of a sinusoidal function always yields two frequencies in the multidimensional Fourier spectrum: a negative frequency corresponding to each positive frequency and vice versa.
Thus, using the MDFT, we propose the multidimensional Hilbert transform (MDHT) and the associated multidimensional analytic signal (MDAS) with the following properties: (a) the extra and redundant positive, negative, or both frequencies introduced by the complex exponential representation of the multidimensional Fourier spectrum are suppressed; (b) the real part of the MDAS is the original signal; (c) the real and imaginary parts of the MDAS are orthogonal; and (d) the magnitude envelope of the original signal is obtained as the magnitude of its associated MDAS, which is the instantaneous amplitude of the MDAS.
The proposed MDHT and associated MDAS are generalizations of the 1D HT and AS, respectively.
We also provide the decomposition of an image into an AM-FM image model by the Fourier method and obtain an explicit expression for analytic image computation by the 2D DFT.
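The suppression idea is easiest to see in one dimension (a standard 1D analytic-signal computation, shown here only to illustrate properties (b)-(d); the paper's construction is multidimensional):

```python
# Suppressing negative frequencies in the FFT yields the analytic signal:
# its real part is the original signal and its magnitude is the envelope.
import numpy as np

n = 512
t = np.arange(n) / n
x = np.cos(2 * np.pi * 40 * t)                 # real signal, unit amplitude

X = np.fft.fft(x)
H = np.zeros(n)
H[0] = 1.0                                     # keep DC
H[1:n // 2] = 2.0                              # double positive frequencies
H[n // 2] = 1.0                                # Nyquist bin
analytic = np.fft.ifft(X * H)                  # negative frequencies suppressed

assert np.allclose(analytic.real, x)           # property (b)
envelope = np.abs(analytic)                    # property (d): instantaneous amplitude
print(envelope.max(), envelope.min())
```

For a unit-amplitude sinusoid the envelope is identically 1, and the real and imaginary parts (cosine and sine) are orthogonal, matching property (c).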
In this paper, we show how the development of the Digital Video Broadcasting - Handheld (DVB-H) standard makes it possible to deliver live broadcast television to mobile handheld devices.
Building upon the strengths of the Digital Video Broadcasting - Terrestrial (DVB-T) standard in use in millions of homes, DVB-H recognizes the trend towards the personal consumption of media.
We present here a new probabilistic inference algorithm that gives exact results in the domain of discrete probability distributions.
This algorithm, named the Statues algorithm, calculates marginal probability distributions on probabilistic models defined as directed acyclic graphs.
These models are made up of well-defined primitives that allow one to express, in particular, joint probability distributions, Bayesian networks, discrete Markov chains, conditioning, and probabilistic arithmetic.
The Statues algorithm relies on a variable binding mechanism based on the generator construct, a special form of coroutine; although related to the enumeration algorithm, this new algorithm brings important improvements in efficiency, which makes it valuable compared with other exact marginalization algorithms.
After introducing several definitions, primitives, and compositional rules, we present the Statues algorithm in detail.
Then, we briefly discuss the interest of this algorithm compared to others and we present possible extensions.
Finally, we introduce Lea and MicroLea, two Python libraries implementing the Statues algorithm, along with several use cases.
A proof of the correctness of the algorithm is provided in appendix.
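The flavor of generator-based exact inference can be conveyed with a deliberately naive enumeration (a hedged toy, far simpler than the Statues algorithm itself; the model and names are invented for illustration):

```python
# Exact marginalization on a tiny DAG model by enumeration with Python
# generators, each yielding (value, probability) pairs; conditioning is
# done by filtering inconsistent assignments and renormalizing.
from fractions import Fraction

def rain():
    yield True, Fraction(1, 5)
    yield False, Fraction(4, 5)

def sprinkler(r):                      # conditional distribution P(S | R)
    p = Fraction(1, 100) if r else Fraction(2, 5)
    yield True, p
    yield False, 1 - p

def wet(r, s):                         # deterministic node: grass is wet
    yield (r or s), Fraction(1)

def marginal(query, evidence=None):
    dist = {}
    for r, pr in rain():
        for s, ps in sprinkler(r):
            for w, pw in wet(r, s):
                sample = {"rain": r, "sprinkler": s, "wet": w}
                if evidence and any(sample[k] != v for k, v in evidence.items()):
                    continue
                dist[sample[query]] = dist.get(sample[query], 0) + pr * ps * pw
    total = sum(dist.values())
    return {v: p / total for v, p in dist.items()}   # normalize (conditioning)

print(marginal("rain", {"wet": True}))
```

With exact rational arithmetic the posterior P(rain | wet) comes out as 5/13; the Statues algorithm achieves the same exactness while avoiding this full enumeration of the joint.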
The job of software effort estimation is a critical one in the early stages of the software development life cycle when the details of requirements are usually not clearly identified.
Various optimization techniques help in improving the accuracy of effort estimation.
The Support Vector Regression (SVR) is one of several different soft-computing techniques that help in getting optimal estimated values.
The idea of SVR is based upon the computation of a linear regression function in a high dimensional feature space where the input data are mapped via a nonlinear function.
Further, the SVR kernel methods can be applied in transforming the input data and then based on these transformations, an optimal boundary between the possible outputs can be obtained.
The main objective of the research work carried out in this paper is to estimate the software effort using use case point approach.
The use case point approach relies on the use case diagram to estimate the size and effort of software projects.
Then, an attempt has been made to optimize the results obtained from use case point analysis using various SVR kernel methods to achieve better prediction accuracy.
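The feature-space view behind SVR can be sketched with an RBF kernel (kernel ridge regression is used below as a simpler stand-in for the SVR optimizer, and the data are synthetic rather than use-case-point data):

```python
# An RBF kernel implicitly maps inputs into a high-dimensional space where
# a linear fit captures a nonlinear relation; the dual coefficients are
# obtained from the kernel matrix alone.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, size=(60, 1))
y = np.sin(2 * X[:, 0])                                  # nonlinear target

K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y)    # dual coefficients

X_test = np.linspace(-2, 2, 50)[:, None]
y_hat = rbf_kernel(X_test, X) @ alpha                    # kernel prediction
err = np.max(np.abs(y_hat - np.sin(2 * X_test[:, 0])))
print(err)
```

Swapping `rbf_kernel` for a polynomial or linear kernel changes the implicit feature space, which is exactly the lever the various SVR kernel methods pull.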
Recent advances in facial landmark detection achieve success by learning discriminative features from rich deformation of face shapes and poses.
Besides the variance of faces themselves, the intrinsic variance of image styles, e.g., grayscale vs. color images, light vs. dark, intense vs. dull, and so on, has constantly been overlooked.
This issue becomes inevitable as increasing web images are collected from various sources for training neural networks.
In this work, we propose a style-aggregated approach to deal with the large intrinsic variance of image styles for facial landmark detection.
Our method transforms original face images to style-aggregated images by a generative adversarial module.
The proposed scheme uses the style-aggregated images to provide face images that are more robust to environmental changes.
The original face images and their style-aggregated counterparts are then used together to train a landmark detector, with the two complementing each other.
In this way, for each face, our method takes two images as input: one in its original style and the other in the aggregated style.
In experiments, we observe that the large variance of image styles would degenerate the performance of facial landmark detectors.
Moreover, we show the robustness of our method to the large variance of image styles by comparing to a variant of our approach, in which the generative adversarial module is removed, and no style-aggregated images are used.
Our approach is demonstrated to perform well when compared with state-of-the-art algorithms on benchmark datasets AFLW and 300-W. Code is publicly available on GitHub: https://github.com/D-X-Y/SAN
This work discusses how the MPContribs framework in the Materials Project (MP) allows user-contributed data to be shown and analyzed alongside the core MP database.
The Materials Project is a searchable database of electronic structure properties of over 65,000 bulk solid materials that is accessible through a web-based science-gateway.
We describe the motivation for enabling user contributions to the materials data and present the framework's features and challenges in the context of two real applications.
These use-cases illustrate how scientific collaborations can build applications with their own "user-contributed" data using MPContribs.
The Nanoporous Materials Explorer application provides a unique search interface to a novel dataset of hundreds of thousands of materials, each with tables of user-contributed values related to material adsorption and density at varying temperature and pressure.
The Unified Theoretical and Experimental x-ray Spectroscopy application discusses a full workflow for the association, dissemination and combined analyses of experimental data from the Advanced Light Source with MP's theoretical core data, using MPContribs tools for data formatting, management and exploration.
The capabilities being developed for these collaborations are serving as the model for how new materials data can be incorporated into the Materials Project website with minimal staff overhead while giving powerful tools for data search and display to the user community.
We carry out a theoretical analysis of the uplink (UL) of a massive MIMO system with per-user channel correlation and Rician fading, using two processing approaches.
Firstly, we examine the linear minimum-mean-square-error receiver under training-based imperfect channel estimates.
Secondly, we propose a statistical combining technique that is more suitable in environments with strong Line-of-Sight (LoS) components.
We derive closed-form asymptotic approximations of the UL spectral efficiency (SE) attained by each combining scheme in single and multi-cell settings, as a function of the system parameters.
These expressions provide insight into how different factors, such as LoS propagation conditions and pilot contamination, impact the overall system performance.
Furthermore, they are exploited to determine the optimal number of training symbols, which is shown to be of significant interest at low Rician factors.
The study and numerical results substantiate that stronger LoS signals lead to better performance and that, under such conditions, statistical combining yields higher SE gains than the conventional receiver.
Discriminative Correlation Filters (DCF) are efficient in visual tracking but suffer from unwanted boundary effects.
Spatially Regularized DCF (SRDCF) has been suggested to resolve this issue by enforcing a spatial penalty on DCF coefficients, which inevitably improves tracking performance at the price of increased complexity.
To tackle online updating, SRDCF formulates its model on multiple training images, further complicating efforts to improve efficiency.
Motivated by the online Passive-Aggressive (PA) algorithm, we introduce a temporal regularization to SRDCF with a single sample, resulting in our spatial-temporal regularized correlation filters (STRCF).
The STRCF formulation can not only serve as a reasonable approximation to SRDCF with multiple training samples, but also provide a more robust appearance model than SRDCF in the case of large appearance variations.
Besides, it can be efficiently solved via the alternating direction method of multipliers (ADMM).
By incorporating both temporal and spatial regularization, our STRCF can handle boundary effects without much loss in efficiency and achieve superior performance over SRDCF in terms of accuracy and speed.
Experiments are conducted on three benchmark datasets: OTB-2015, Temple-Color, and VOT-2016.
Compared with SRDCF, STRCF with hand-crafted features provides a 5x speedup and achieves gains of 5.4% and 3.6% in AUC score on OTB-2015 and Temple-Color, respectively.
Moreover, STRCF combined with CNN features also performs favorably against state-of-the-art CNN-based trackers and achieves an AUC score of 68.3% on OTB-2015.
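The temporal term can be illustrated in isolation. The sketch below, in Python/NumPy, solves a simplified single-channel correlation filter with only the temporal regularizer; the spatial penalty and the multi-channel ADMM solver of the full STRCF are omitted, and the function names and the value of mu are illustrative assumptions, not the paper's.

```python
import numpy as np

def train_temporal_cf(x, y, f_prev_hat, mu=16.0):
    """Closed-form update for a single-channel correlation filter with a
    temporal regularizer only (spatial regularization omitted).

    Solves  min_f ||x (*) f - y||^2 + mu * ||f - f_prev||^2
    per frequency in the Fourier domain, (*) being circular convolution.
    """
    x_hat = np.fft.fft2(x)
    y_hat = np.fft.fft2(y)
    return (np.conj(x_hat) * y_hat + mu * f_prev_hat) / (np.abs(x_hat) ** 2 + mu)

def detect(z, f_hat):
    """Spatial correlation response of the filter on a new patch z."""
    return np.real(np.fft.ifft2(np.fft.fft2(z) * f_hat))
```

With a small mu the update approaches the unregularized filter; with a large mu it stays close to the previous frame's filter, which is the stabilizing effect the temporal term provides.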
We address the issue of incorporating a particular yet expressive form of integrity constraints (namely, denial constraints) into probabilistic databases.
To this aim, we move away from the common way of giving semantics to probabilistic databases, which relies on considering a unique interpretation of the data, and address two fundamental problems: consistency checking and query evaluation.
The former consists in verifying whether there is an interpretation which conforms to both the marginal probabilities of the tuples and the integrity constraints.
The latter is the problem of answering queries under a "cautious" paradigm, taking into account all interpretations of the data in accordance with the constraints.
In this setting, we investigate the complexity of the above-mentioned problems, and identify several tractable cases of practical relevance.
One way to interpret neural model predictions is to highlight the most important input features---for example, a heatmap visualization over the words in an input sentence.
In existing interpretation methods for NLP, a word's importance is determined by either input perturbation---measuring the decrease in model confidence when that word is removed---or by the gradient with respect to that word.
To understand the limitations of these methods, we use input reduction, which iteratively removes the least important word from the input.
This exposes pathological behaviors of neural models: the remaining words appear nonsensical to humans and are not the ones determined as important by interpretation methods.
As we confirm with human experiments, the reduced examples lack information to support the prediction of any label, but models still make the same predictions with high confidence.
To explain these counterintuitive results, we draw connections to adversarial examples and confidence calibration: pathological behaviors reveal difficulties in interpreting neural models trained with maximum likelihood.
To mitigate their deficiencies, we fine-tune the models by encouraging high entropy outputs on reduced examples.
Fine-tuned models become more interpretable under input reduction without accuracy loss on regular examples.
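The iterative procedure described above can be sketched as follows; `predict` is a stand-in for any classifier that returns a (label, confidence) pair, and the greedy removal order is an assumption of this sketch.

```python
def input_reduction(words, predict):
    """Iteratively remove the least important word while the predicted
    label stays the same. A word's importance is the confidence drop
    observed when that word is removed. Returns the (often nonsensical)
    reduced input that the model still classifies identically."""
    label, _ = predict(words)
    while len(words) > 1:
        candidates = []
        for i in range(len(words)):
            reduced = words[:i] + words[i + 1:]
            new_label, conf = predict(reduced)
            if new_label == label:          # prediction must be preserved
                candidates.append((conf, reduced))
        if not candidates:
            break
        _, words = max(candidates)          # drop the least important word
    return words
```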
Personal connections between creators and evaluators of scientific works are ubiquitous, and the possibility of bias ever-present.
Although connections have been shown to bias prospective judgments of (uncertain) future performance, it is unknown whether such biases occur in the much more concrete task of assessing the scientific validity of already completed work, and if so, why.
This study presents evidence that personal connections between authors and reviewers of neuroscience manuscripts are associated with biased judgments and explores the mechanisms driving the effect.
Using reviews from 7,981 neuroscience manuscripts submitted to the journal PLOS ONE, which instructs reviewers to evaluate manuscripts only on scientific validity, we find that reviewers favored authors close in the co-authorship network by ~0.11 points on a 1.0 - 4.0 scale for each step of proximity.
PLOS ONE's validity-focused review and the substantial amount of favoritism shown by distant vs. very distant reviewers, both of whom should have little to gain from nepotism, point to the central role of substantive disagreements between scientists in different "schools of thought."
The results suggest that removing bias from peer review cannot be accomplished simply by recusing the closely-connected reviewers, and highlight the value of recruiting reviewers embedded in diverse professional networks.
Finding graph indices that are unbiased with respect to network size and density is highly important, both within a given field and across fields, for enhancing the comparability of modern network science studies.
The degree variance is an important metric for characterising network heterogeneity.
Here, we provide an analytically valid normalisation of degree variance to replace previous normalisations which are either invalid or not applicable to all networks.
It is shown that this normalisation provides equal values for graphs and their complements; it is maximal in the star graph (and its complement); and its expected value is constant with respect to density for random networks of the same size.
We strengthen these results with model observations in weighted random networks, random geometric networks and resting-state brain networks, showing that the proposed normalisation is unbiased to both network size and density.
The closed form expression proposed also benefits from high computational efficiency and straightforward mathematical analysis.
In an application of a subnetwork comparability problem of nationwide and within state US airport networks, the nationwide US airport network is shown to be much more heterogeneous than most within-state networks, illustrating the importance of the increased reliability of this true normalisation.
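A rough sketch of the quantities involved is given below. Note that normalising by the star-graph maximum for the same size is only an illustrative choice here; the paper's exact closed-form normalisation is not reproduced.

```python
import numpy as np

def degree_variance(degrees):
    """Plain (population) variance of a degree sequence."""
    d = np.asarray(degrees, dtype=float)
    return ((d - d.mean()) ** 2).mean()

def star_degree_variance(n):
    """Degree variance of the star graph on n nodes (the maximizing
    graph per the abstract): one hub of degree n-1 and n-1 leaves."""
    return degree_variance([n - 1] + [1] * (n - 1))

def normalised_degree_variance(degrees):
    """Illustrative size-normalisation by the star-graph maximum for
    the same n; the paper's closed form may differ."""
    return degree_variance(degrees) / star_degree_variance(len(degrees))
```

Under this illustrative scaling, regular graphs score 0 and the star graph scores 1, bracketing the heterogeneity range.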
Recently, deep residual networks have been successfully applied in many computer vision and natural language processing tasks, pushing the state-of-the-art performance with deeper and wider architectures.
In this work, we interpret deep residual networks as ordinary differential equations (ODEs), which have long been studied in mathematics and physics with rich theoretical and empirical success.
From this interpretation, we develop a theoretical framework on stability and reversibility of deep neural networks, and derive three reversible neural network architectures that can go arbitrarily deep in theory.
The reversibility property allows a memory-efficient implementation, which does not need to store the activations for most hidden layers.
Together with the stability of our architectures, this enables training deeper networks using only modest computational resources.
We provide both theoretical analyses and empirical results.
Experimental results demonstrate the efficacy of our architectures against several strong baselines on CIFAR-10, CIFAR-100 and STL-10 with superior or on-par state-of-the-art performance.
Furthermore, we show our architectures yield superior results when trained using fewer training data.
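The memory-saving mechanism of reversibility can be illustrated with an additive coupling block in the style of reversible residual networks; this minimal NumPy sketch (random residual functions, no training) is an assumption-laden illustration, not one of the three architectures derived in the paper.

```python
import numpy as np

def make_mlp(rng, dim):
    """A tiny random residual function F: R^dim -> R^dim."""
    W1, W2 = rng.randn(dim, dim) * 0.1, rng.randn(dim, dim) * 0.1
    return lambda h: np.tanh(h @ W1) @ W2

def rev_block_forward(x1, x2, F, G):
    """Additive coupling block: activations need not be stored,
    because the inputs are recomputable from the outputs."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_block_inverse(y1, y2, F, G):
    """Recover the block's inputs exactly from its outputs."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2
```

Because every block is invertible, backpropagation can recompute hidden activations on the fly instead of caching them, which is what enables arbitrarily deep training at modest memory cost.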
This is the preprint version of our paper at ICWL 2015.
A virtual reality-based enhanced technology for learning primary geography is proposed, which synthesizes several of the latest information technologies, including virtual reality (VR), 3D geographical information systems (GIS), 3D visualization and multimodal human-computer interaction (HCI).
The main functions of the proposed system are introduced, i.e., buffer analysis, overlay analysis, spatial convex hull calculation, spatial convex decomposition, 3D topology analysis and 3D spatial intersection detection.
The multimodal technologies are employed in the system to enhance the immersive perception of the users.
In this work, we briefly outline the core 5G air interface improvements introduced by the latest New Radio (NR) specifications, as well as elaborate on the unique features of initial access in 5G NR with a particular emphasis on millimeter-wave (mmWave) frequency range.
The highly directional nature of 5G mmWave cellular systems poses a variety of fundamental differences and research problem formulations, and a holistic understanding of the key system design principles behind the 5G NR is essential.
Here, we condense the relevant information collected from a wide diversity of 5G NR standardization documents (based on 3GPP Release 15) to distill the essentials of directional access in 5G mmWave cellular, which becomes the foundation for any corresponding system-level analysis.
Distant supervision can effectively label data for relation extraction, but suffers from the noisy labeling problem.
Recent works mainly perform soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision about false positive samples at the sentence level.
In this paper, we introduce an adversarial learning framework, which we named DSGAN, to learn a sentence-level true-positive generator.
Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator.
The optimal generator is obtained when the discriminator's ability to distinguish samples declines the most.
We adopt the generator to filter the distant supervision training dataset and redistribute the false positive instances into the negative set, thereby providing a cleaned dataset for relation classification.
The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction compared to state-of-the-art systems.
Optimal operation of a country's air transport infrastructure plays a major role in the economic development of nations.
Due to the increasing use of air transportation in today's world, flight boarding times have become a concern for both airlines and airports; hence the importance of knowing beforehand how changes in flight demand parameters and physical airport layout will affect passenger flow and boarding times.
This paper presents a pedestrian modeling study in which a national airport passenger flow was analyzed.
The study was conducted at Vanguardia National Airport in Villavicencio, Meta, Colombia.
The effects of different structural changes are shown, providing decision makers with criteria for judging passenger traffic in airport design.
A hot topic in data center design is to envision geo-distributed architectures spanning a few sites across wide area networks, allowing more proximity to the end users and higher survivability, defined as the capacity of a system to operate after failures.
As a shortcoming, this approach is subject to an increase of latency between servers, caused by their geographic distances.
In this paper, we address the trade-off between latency and survivability in geo-distributed data centers, through the formulation of an optimization problem.
Simulations considering realistic scenarios show that the latency increase is significant only in the case of very strong survivability requirements, whereas it is negligible for moderate survivability requirements.
For instance, the worst-case latency is less than 4~ms when guaranteeing that 80% of the servers are available after a failure, in a network where the latency could be up to 33 ms.
Dealing with structured data requires expressive representation formalisms which, however, raise the problem of the computational complexity of the machine learning process.
Furthermore, real world domains require tools able to manage their typical uncertainty.
Many statistical relational learning approaches try to deal with these problems by combining the construction of relevant relational features with a probabilistic tool.
When the combination is static (static propositionalization), the constructed features are considered as boolean features and used offline as input to a statistical learner; while, when the combination is dynamic (dynamic propositionalization), the feature construction and probabilistic tool are combined into a single process.
In this paper we propose a selective propositionalization method that searches for the optimal set of relational features to be used by a probabilistic learner in order to minimize a loss function.
The new propositionalization approach has been combined with the random subspace ensemble method.
Experiments on real-world datasets show the validity of the proposed method.
In this paper, we present a model of expertise in pragmatics.
We follow knowledge engineering techniques and observe the expert as he analyses a social discussion forum.
Then a number of models are defined.
These models emphasise the process followed by the expert and a number of criteria used in his analysis.
The results can be used as guides that help to understand and annotate discussion forums.
We aim to model other pragmatic analyses in order to complete the base of guides (criteria, processes, etc.) for discussion analysis.
In this paper, we consider user location privacy in mobile edge clouds (MECs).
MECs are small clouds deployed at the network edge to offer cloud services close to mobile users, and many solutions have been proposed to maximize service locality by migrating services to follow their users.
Co-location of a user and his service, however, implies that a cyber eavesdropper observing service migrations between MECs can localize the user up to one MEC coverage area, which can be fairly small (e.g., a femtocell).
We consider using chaff services to defend against such an eavesdropper, with focus on strategies to control the chaffs.
Assuming the eavesdropper performs maximum likelihood (ML) detection, we consider both heuristic strategies that mimic the user's mobility and optimized strategies designed to minimize the detection or tracking accuracy.
We show that a single chaff controlled by the optimal strategy or its online variation can drive the eavesdropper's tracking accuracy to zero when the user's mobility is sufficiently random.
We further propose extended strategies that utilize randomization to defend against an advanced eavesdropper aware of the strategy.
The efficacy of our solutions is verified through both synthetic and trace-driven simulations.
The ARP-Path protocol has emerged as a promising protocol for wired networks, creating shortest paths with the simplicity of pure bridging and competing directly with TRILL and SPB.
After analyzing different alternatives of ARP-Path and creating the All-Path family, migrating the protocol to wireless networks appeared to be a good alternative to protocols such as AODV.
In this article, we check the implications of adapting ARP-Path to a wireless environment, and we prove that good ideas for wired networks might not be directly applicable to wireless networks, as not only the media differs, but also the characterization of these networks varies.
Foreground (FG) pixel labelling plays a vital role in video surveillance.
Recent engineering solutions have attempted to exploit the efficacy of deep learning (DL) models initially targeted for image classification to deal with FG pixel labelling.
One major drawback of such strategy is the lacking delineation of visual objects when training samples are limited.
To grapple with this issue, we introduce a multi-view receptive field fully convolutional neural network (MV-FCN) that harnesses recent seminal ideas such as the fully convolutional structure, inception modules, and residual networking.
Therefrom, we implement a system in an encoder-decoder fashion that subsumes a core and two complementary feature flow paths.
The model exploits inception modules at early and late stages with three different sizes of receptive fields to capture invariance at various scales.
The features learned in the encoding phase are fused with appropriate feature maps in the decoding phase through residual connections for achieving enhanced spatial representation.
These multi-view receptive fields and residual feature connections are expected to yield highly generalized features for an accurate pixel-wise FG region identification.
It is then trained with database-specific exemplar segmentations to predict the desired FG objects.
The comparative experimental results on eleven benchmark datasets validate that the proposed model achieves very competitive performance with the prior- and state-of-the-art algorithms.
We also report how well a transfer learning approach can enhance the performance of our proposed MV-FCN.
Numerous institutions and organizations need not only to preserve the material and publications they produce, but also have as their task (though it would be desirable for it to be an obligation) to publish, disseminate and make publicly available all the results of their research and any other scientific/academic material.
The Open Archives Initiative (OAI) and the introduction of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) make this task much easier.
The main objective of this work is to make a comparative and qualitative study of the data -metadata specifically- contained in the whole set of Argentine repositories listed in the ROAR portal, focusing on the functional perspective of the quality of this metadata.
Another objective is to offer an overview of the status of these repositories, in an attempt to detect common failures and errors institutions incur when storing the metadata of the resources contained in these repositories, and thus be able to suggest measures to improve the loading and subsequent retrieval processes.
It was found that the eight most used Dublin Core fields are: identifier, type, title, date, subject, creator, language and description.
Not all repositories fill all the fields, and the lack of normalization, or the excessive use, of fields like language, type, format and subject is somewhat striking, and in some cases even alarming.
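The field-usage tally behind findings like the "eight most used fields" can be sketched as below; the record structure (Dublin Core field name mapped to a list of values) and the example data are assumptions, not the study's harvested records.

```python
from collections import Counter

def field_usage(records):
    """Count, across harvested records, how many records populate each
    Dublin Core field with at least one non-empty value.
    `records` is a list of dicts: DC field name -> list of values."""
    counts = Counter()
    for rec in records:
        for field, values in rec.items():
            if values:                      # present and non-empty
                counts[field] += 1
    return counts
```

Ranking `counts.most_common()` over a full harvest would reproduce the kind of per-field usage ordering the study reports.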
This work presents the application of artificial neural networks, trained and structurally optimized by genetic algorithms, to the modeling of the crude distillation process at the PKN ORLEN S.A. refinery.
Models for the main fractionator distillation column products were developed using historical data.
The quality of the fractions was predicted based on several chosen process variables.
The performance of the model was validated using test data.
Neural networks used in combination with genetic algorithms proved that they can accurately predict fraction quality shifts, reproducing the results of standard laboratory analysis.
A simple knowledge-extraction method was also applied to the neural network model built.
Genetic algorithms can be successfully utilized in efficient training of large neural networks and finding their optimal structures.
The majority of face recognition algorithms use query faces captured in uncontrolled, in-the-wild environments.
Often owing to the cameras' limited capabilities, it is common for these captured facial images to be blurred or of low resolution.
Super resolution algorithms are therefore crucial in improving the resolution of such images, especially when the image size is small and requires enlargement.
This paper aims to demonstrate the effect of one of the state-of-the-art algorithms in the field of image super resolution.
To demonstrate the functionality of the algorithm, various before and after 3D face alignment cases are provided using images from the Labeled Faces in the Wild (LFW) dataset.
Resulting images are subject to testing on a closed set face recognition protocol using unsupervised algorithms with high dimension extracted features.
The inclusion of the super resolution algorithm resulted in a significantly improved recognition rate over recently reported results obtained from unsupervised algorithms.
It is well established that human decision making and instrumental control use multiple systems, some of which use habitual action selection and some of which require deliberate planning.
Deliberate planning systems predict action outcomes using an internal model of the agent's environment, while habitual action selection systems learn to automate by repeating previously rewarded actions.
Habitual control is computationally efficient but may be inflexible in changing environments.
Conversely, deliberate planning may be computationally expensive, but flexible in dynamic environments.
This paper proposes a general architecture comprising both control paradigms by introducing an arbitrator that controls which subsystem is used at any time.
This system is implemented for a target-reaching task with a simulated two-joint robotic arm that comprises a supervised internal model and deep reinforcement learning.
Through permutation of target-reaching conditions, we demonstrate that the proposed system is capable of rapidly learning the kinematics of the system without a priori knowledge, and is robust to (A) changing environmental reward and kinematics, and (B) occluded vision.
The arbitrator model is compared to exclusive deliberate planning with the internal model and exclusive habitual control instances of the model.
The results show how such a model can harness the benefits of both systems, using fast decisions in reliable circumstances while optimizing performance in changing environments.
In addition, the proposed model learns very fast.
Finally, the system which includes internal models is able to reach the target under the visual occlusion, while the pure habitual system is unable to operate sufficiently under such conditions.
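The arbitration idea can be sketched as a simple reliability gate; the window size, threshold, and interface below are illustrative assumptions, not the paper's architecture (which arbitrates between deep reinforcement learning and a supervised internal model).

```python
def make_arbitrator(habitual, planner, window=20, threshold=0.8):
    """Route control to the cheap habitual policy when its recent
    success rate is high; otherwise fall back to deliberate planning.
    `habitual(state)` and `planner(state)` return actions; the caller
    reports outcomes via `feedback(success)`."""
    history = []

    def act(state):
        rate = sum(history) / len(history) if history else 0.0
        if rate >= threshold:
            return habitual(state), "habitual"
        return planner(state), "planner"

    def feedback(success):
        history.append(1 if success else 0)
        if len(history) > window:
            history.pop(0)

    return act, feedback
```

A gate of this kind yields the behavior the abstract describes: fast habitual decisions in reliable circumstances, with a switch back to planning when the environment changes and successes drop.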
Motivated by many practical applications in logistics and mobility-as-a-service, we study the top-k optimal sequenced routes (KOSR) querying on large, general graphs where the edge weights may not satisfy the triangle inequality, e.g., road network graphs with travel times as edge weights.
The KOSR querying strives to find the top-k optimal routes (i.e., with the top-k minimal total costs) from a given source to a given destination, which must visit a number of vertices with specific vertex categories (e.g., gas stations, restaurants, and shopping malls) in a particular order (e.g., visiting gas stations before restaurants and then shopping malls).
To efficiently find the top-k optimal sequenced routes, we propose two algorithms PruningKOSR and StarKOSR.
In PruningKOSR, we define a dominance relationship between two partially-explored routes.
Partially-explored routes that can be dominated by other partially-explored routes are postponed from being extended, which leads to a smaller search space and thus improves efficiency.
In StarKOSR, we further improve the efficiency by extending routes in an A* manner.
With the help of a judiciously designed heuristic estimation that works for general graphs, the cost of partially explored routes to the destination can be estimated such that the qualified complete routes can be found early.
In addition, we demonstrate the high extensibility of the proposed algorithms by incorporating Hop Labeling, an effective label indexing technique for shortest path queries, to further improve efficiency.
Extensive experiments on multiple real-world graphs demonstrate that the proposed methods significantly outperform the baseline method.
Furthermore, when k=1, StarKOSR also outperforms the state-of-the-art method for the optimal sequenced route queries.
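For reference, the k=1 sequenced-route problem can be solved by a layered dynamic program over shortest-path distances; this baseline sketch is not the paper's PruningKOSR or StarKOSR algorithm, and the graph below is a made-up toy example.

```python
import heapq

def dijkstra(graph, src):
    """Shortest-path distances from src in a weighted digraph given as
    {u: [(v, w), ...]}; weights need not satisfy the triangle inequality."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def optimal_sequenced_route(graph, s, t, category_sequence):
    """Minimum total cost of a route s -> one vertex per category
    (in the given order) -> t.  Layered DP over Dijkstra distances."""
    best = {s: 0.0}                      # cost to stand at v after the prefix
    for category in category_sequence:   # category: set of candidate vertices
        new_best = {}
        for u, cost in best.items():
            dist = dijkstra(graph, u)
            for v in category:
                c = cost + dist.get(v, float("inf"))
                if c < new_best.get(v, float("inf")):
                    new_best[v] = c
        best = new_best
    return min((cost + dijkstra(graph, u).get(t, float("inf"))
                for u, cost in best.items()), default=float("inf"))
```

For example, with gas stations {a1, a2} to be visited before restaurant b1, the DP keeps only the cheapest way to reach each candidate of a category before extending to the next one.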
With the evolution of mobile devices, and smart-phones in particular, comes the ability to create new experiences that enhance the way we see, interact, and manipulate objects, within the world that surrounds us.
It is now possible, using Augmented Reality technology, to blend data from our senses and our devices in numerous ways that simply were not possible before.
In a near future, when all of the office devices as well as your personal electronic gadgets are on a common wireless network, operating them using a universal remote controller would be possible.
This paper presents an off-the-shelf, low-cost prototype that leverages Augmented Reality technology to deliver a novel and interactive way of operating nearby office network devices using a mobile device.
We believe this type of system may provide benefits for controlling multiple integrated devices, visualizing interconnectivity, or utilizing visual elements to pass information from one device to another; it may be especially beneficial for controlling devices when interacting with them physically is difficult or poses danger or harm.
Estimating engagement is critical for human-robot interaction.
Engagement measures typically rely on the dynamics of the social signals exchanged by the partners, especially speech and gaze.
However, the dynamics of these signals is likely to be influenced by individual and social factors, such as personality traits, as it is well documented that they critically influence how two humans interact with each other.
Here, we assess the influence of two factors, namely extroversion and negative attitude toward robots, on speech and gaze during a cooperative task, where a human must physically manipulate a robot to assemble an object.
We evaluate whether the scores of extroversion and negative attitude towards robots co-vary with the duration and frequency of gaze and speech cues.
The experiments were carried out with the humanoid robot iCub and N=56 adult participants.
We found that the more people are extrovert, the more and longer they tend to talk with the robot; and the more people have a negative attitude towards robots, the less they will look at the robot face and the more they will look at the robot hands where the assembly and the contacts occur.
Our results confirm and provide evidence that the engagement models classically used in human-robot interaction should take into account attitudes and personality traits.
Recently, Visual Question Answering (VQA) has emerged as one of the most significant tasks in multimodal learning as it requires understanding both visual and textual modalities.
Existing methods mainly rely on extracting image and question features to learn their joint feature embedding via multimodal fusion or attention mechanism.
Some recent studies utilize external VQA-independent models to detect candidate entities or attributes in images, which serve as semantic knowledge complementary to the VQA task.
However, these candidate entities or attributes might be unrelated to the VQA task and have limited semantic capacities.
To better utilize semantic knowledge in images, we propose a novel framework to learn visual relation facts for VQA.
Specifically, we build up a Relation-VQA (R-VQA) dataset based on the Visual Genome dataset via a semantic similarity module, in which each data instance consists of an image, a corresponding question, a correct answer and a supporting relation fact.
A well-defined relation detector is then adopted to predict visual question-related relation facts.
We further propose a multi-step attention model composed of visual attention and semantic attention sequentially to extract related visual knowledge and semantic knowledge.
We conduct comprehensive experiments on the two benchmark datasets, demonstrating that our model achieves state-of-the-art performance and verifying the benefit of considering visual relation facts.
With the recent growth of conversational systems and intelligent assistants such as Apple Siri and Google Assistant, mobile devices are becoming even more pervasive in our lives.
As a consequence, users are becoming increasingly engaged with mobile apps and frequently search for an information need within them.
However, users cannot search within their apps through their intelligent assistants.
This requires a unified mobile search framework that identifies the target app(s) for the user's query, submits the query to the app(s), and presents the results to the user.
In this paper, we take the first step forward towards developing unified mobile search.
In more detail, we introduce and study the task of target apps selection, which has various potential real-world applications.
To this aim, we analyze attributes of search queries as well as user behaviors, while searching with different mobile apps.
The analyses are done based on thousands of queries that we collected through crowdsourcing.
We finally study the performance of state-of-the-art retrieval models for this task and propose two simple yet effective neural models that significantly outperform the baselines.
Our neural approaches are based on learning high-dimensional representations for mobile apps.
Our analyses and experiments suggest specific future directions in this research area.
Wireless sensor networks face an unbalanced energy consumption problem over time.
Clustering provides an energy-efficient method to improve the lifespan of the sensor network.
A cluster head collects data from other nodes and transmits it towards the sink node.
Cluster heads that are far from the sink consume more power transmitting information towards it.
We propose the Region Based Energy Balanced Inter-cluster communication Protocol (RBEBP) to improve the lifespan of the sensor network.
The monitored area is divided into regions; cluster heads are selected from a specific region based on the residual energy of the nodes in that region.
If the energy of the nodes of a specific region is low, nodes from another region are selected as cluster heads.
Optimized selection of cluster heads helps improve the lifespan of the sensor network.
In our scheme, cluster heads that are far from the sink use other cluster heads as relay nodes to transmit their data to the sink node.
Thus, the energy of cluster heads depletes uniformly and the complete area remains covered by sensor nodes.
Simulation results demonstrate that RBEBP can effectively reduce total energy depletion and considerably extend the lifespan of the network compared to the LEACH protocol.
RBEBP also minimizes the problem of energy holes in the monitored area and improves the throughput of the network.
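The region-based selection rule can be sketched as follows; the node representation, the energy threshold, and the fallback-to-another-region logic are illustrative assumptions of this sketch, not the protocol's exact parameters.

```python
def select_cluster_heads(nodes, n_regions, energy_threshold):
    """Pick one cluster head per region, choosing the node with the
    highest residual energy; if a region's best node falls below the
    threshold, recruit the most energetic node from another region.
    `nodes` is a list of dicts {"region": int, "energy": float}."""
    heads = []
    for r in range(n_regions):
        in_region = [n for n in nodes if n["region"] == r and n not in heads]
        candidate = max(in_region, key=lambda n: n["energy"], default=None)
        if candidate is None or candidate["energy"] < energy_threshold:
            # region energy is low: borrow from the other regions
            outside = [n for n in nodes if n["region"] != r and n not in heads]
            fallback = max(outside, key=lambda n: n["energy"], default=None)
            candidate = fallback if fallback is not None else candidate
        if candidate is not None:
            heads.append(candidate)
    return heads
```

Rotating this selection each round as residual energies change is what spreads the cluster-head burden, and hence energy depletion, across the area.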
Effective data visualization is a key part of the discovery process in the era of big data.
It is the bridge between the quantitative content of the data and human intuition, and thus an essential component of the scientific path from data into knowledge and understanding.
Visualization is also essential in the data mining process, directing the choice of the applicable algorithms, and in helping to identify and remove bad data from the analysis.
However, a high complexity or a high dimensionality of modern data sets represents a critical obstacle.
How do we visualize interesting structures and patterns that may exist in hyper-dimensional data spaces?
A better understanding of how we can perceive and interact with multi-dimensional information poses some deep questions in the fields of cognition technology and human-computer interaction.
To this effect, we are exploring the use of immersive virtual reality platforms for scientific data visualization, using both software and inexpensive commodity hardware.
These potentially powerful and innovative tools for multi-dimensional data visualization can also provide an easy and natural path to collaborative data visualization and exploration, where scientists can interact with their data and their colleagues in the same visual space.
Immersion provides benefits beyond the traditional desktop visualization tools: it leads to a demonstrably better perception of a datascape geometry, more intuitive data understanding, and a better retention of the perceived relationships in the data.
In modern heterogeneous MPSoCs, the management of shared memory resources is crucial in delivering end-to-end QoS.
Previous frameworks have either focused on singular QoS targets or the allocation of partitionable resources among CPU applications at relatively slow timescales.
However, heterogeneous MPSoCs typically require instant response from the memory system where most resources cannot be partitioned.
Moreover, the health of different cores in a heterogeneous MPSoC is often measured by diverse performance objectives.
In this work, we propose a Self-Aware Resource Allocation (SARA) framework for heterogeneous MPSoCs.
Priority-based adaptation allows cores to use different target performance and self-monitor their own intrinsic health.
In response, the system allocates non-partitionable resources based on priorities.
The proposed framework meets a diverse range of QoS demands from heterogeneous cores.
We consider a problem of dispersing points on disjoint intervals on a line.
Given n pairwise disjoint intervals sorted on a line, we want to find a point in each interval such that the minimum pairwise distance of these points is maximized.
Based on a greedy strategy, we present a linear time algorithm for the problem.
Further, we also solve in linear time the cycle version of the problem where the intervals are given on a cycle.
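The greedy strategy the abstract refers to can be illustrated with a short sketch. Note the simplifying assumption: the paper's algorithm runs in linear time, while this version pairs the greedy feasibility test with a binary search on the answer, which is simpler but slower.

```python
def feasible(intervals, d):
    """Greedy check: can one point per sorted disjoint interval be placed
    so that consecutive points are at least d apart? Place each point as
    far left as the distance constraint allows."""
    prev = None
    for lo, hi in intervals:
        p = lo if prev is None else max(lo, prev + d)
        if p > hi:
            return False
        prev = p
    return True

def max_min_dispersion(intervals, eps=1e-9):
    """Binary search on the answer using the greedy feasibility test
    (an O(n log(range/eps)) sketch, not the paper's linear-time method)."""
    lo, hi = 0.0, intervals[-1][1] - intervals[0][0]
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if feasible(intervals, mid):
            lo = mid
        else:
            hi = mid
    return lo
```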
There is increasing usage of smaller cells, or femtocells, to improve the performance and coverage of next-generation heterogeneous wireless networks (HetNets).
However, the interference caused by femtocells to neighboring cells is a limiting performance factor in dense HetNets.
This interference is being managed via distributed resource allocation methods.
However, as the density of the network increases so does the complexity of such resource allocation methods.
Yet, unplanned deployment of femtocells requires an adaptable and self-organizing algorithm to make HetNets viable.
As such, we propose to use a machine learning approach based on Q-learning to solve the resource allocation problem in such complex networks.
By defining each base station as an agent, a cellular network is modelled as a multi-agent network.
Subsequently, cooperative Q-learning can be applied as an efficient approach to manage the resources of a multi-agent network.
Furthermore, the proposed approach considers the quality of service (QoS) for each user and fairness in the network.
In comparison with prior work, the proposed approach can bring more than a four-fold increase in the number of supported femtocells while using cooperative Q-learning to reduce resource allocation overhead.
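For context, the tabular Q-learning update underlying such approaches can be sketched as below. This is a generic single-agent illustration, not the paper's cooperative multi-agent scheme with QoS and fairness terms; the hyperparameters and function names are assumptions.

```python
import random

# Minimal tabular Q-learning sketch. ALPHA (learning rate), GAMMA
# (discount factor), and EPSILON (exploration rate) are illustrative.
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

def q_update(Q, state, action, reward, next_state, actions):
    """Standard Q-learning update toward reward + discounted best value."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

def choose_action(Q, state, actions):
    """Epsilon-greedy action selection over the learned Q-values."""
    if random.random() < EPSILON:  # explore
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))  # exploit
```

In the paper's multi-agent setting, each base station would maintain such a table for its own resource allocation decisions and share learned values cooperatively.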
Hashing has been widely used for large-scale search due to its low storage cost and fast query speed.
By using supervised information, supervised hashing can significantly outperform unsupervised hashing.
Recently, discrete supervised hashing and deep hashing have emerged as two representative advances in supervised hashing.
On one hand, hashing is essentially a discrete optimization problem.
Hence, utilizing supervised information to directly guide the discrete (binary) coding procedure can avoid sub-optimal solutions and improve accuracy.
On the other hand, deep hashing, which integrates deep feature learning and hash-code learning into an end-to-end architecture, can enhance the feedback between feature learning and hash-code learning.
The key in discrete supervised hashing is to adopt supervised information to directly guide the discrete coding procedure in hashing.
The key in deep hashing is to adopt the supervised information to directly guide the deep feature learning procedure.
However, no existing work uses the supervised information to directly guide both the discrete coding procedure and the deep feature learning procedure in the same framework.
In this paper, we propose a novel deep hashing method, called deep discrete supervised hashing (DDSH), to address this problem.
DDSH is the first deep hashing method which can utilize supervised information to directly guide both discrete coding procedure and deep feature learning procedure, and thus enhance the feedback between these two important procedures.
Experiments on three real datasets show that DDSH can outperform other state-of-the-art baselines, including both discrete hashing and deep hashing baselines, for image retrieval.
Low-power design became a significant requirement when CMOS technology entered the nanometer era.
Multiple-Supply Voltage (MSV) is a popular and effective method for both dynamic and static power reduction while maintaining performance.
Level shifters may cause area and Interconnect Length Overhead (ILO), and should be considered at both floorplanning and post-floorplanning stages.
In this paper, we propose a two-phase algorithm framework, called VLSAF, to solve the voltage and level-shifter assignment problem.
At the floorplanning phase, we use a convex cost network flow algorithm to assign voltages and a minimum cost flow algorithm to handle level-shifter assignment.
At the post-floorplanning phase, a heuristic method is adopted to redistribute white spaces and calculate the positions and shapes of the level shifters.
The experimental results show VLSAF is effective.
This paper presents a computational approach to modelling group creativity.
It presents an analysis of two studies of group creativity selected from different research cultures and identifies a common theme ("idea build-up") that is then used in the formalisation of an agent-based model used to support reasoning about the complex dynamics of building on the ideas of others.
We propose a new code design that aims to distribute an LDPC code over a relay channel.
It is based on a split-and-extend approach, which allows the relay to split the set of bits connected to some parity-check of the LDPC code into two or several subsets.
Subsequently, the sums of bits within each subset are used in a repeat-accumulate manner in order to generate extra bits sent from the relay toward the destination.
We show that the proposed design yields LDPC codes with enhanced correction capacity and can be advantageously applied to existing codes, which allows for addressing cooperation issues for evolving standards.
Finally, we derive density evolution equations for the proposed design, and we show that Split-Extended LDPC codes can approach very closely the capacity of the Gaussian relay channel.
To extract meaningful topics from texts, their structure should be considered properly.
In this paper, we aim to analyze structured time-series documents such as a collection of news articles and a series of scientific papers, wherein topics evolve along time depending on multiple topics in the past and are also related to each other at each time.
To this end, we propose a dynamic and static topic model, which simultaneously considers the dynamic structures of the temporal topic evolution and the static structures of the topic hierarchy at each time.
We show the results of experiments on collections of scientific papers, in which the proposed method outperformed conventional models.
Moreover, we show an example of extracted topic structures, which we found helpful for analyzing research activities.
We propose an approach to decomposing a thematic information stream into principal components.
Each principal component is related to a narrow topic extracted from the information stream.
The essence of the approach arises from analogy with the Fourier transform.
We examine methods for analyzing the principal components and propose using multifractal analysis for identifying similar topics.
The decomposition technique is applied to the information stream dedicated to Brexit.
We provide a comparison between the principal components obtained by applying the decomposition to Brexit stream and the related topics extracted by Google Trends.
We present ABA+, a new approach to handling preferences in a well known structured argumentation formalism, Assumption-Based Argumentation (ABA).
In ABA+, preference information given over assumptions is incorporated directly into the attack relation, thus resulting in attack reversal.
ABA+ conservatively extends ABA and exhibits various desirable features regarding relationship among argumentation semantics as well as preference handling.
We also introduce Weak Contraposition, a principle concerning reasoning with rules and preferences that relaxes the standard principle of contraposition, while guaranteeing additional desirable features for ABA+.
The protection of confidential image data from unauthorized access is an important area of research in network communication.
This paper presents a high-level security encryption scheme for grayscale images.
The gray-level image is first decomposed into binary images using bit-scale decomposition.
Each binary image is then compressed by selecting a good scanning path that minimizes the total number of bits needed to encode the bit sequence along the scanning path using two dimensional run encoding.
The compressed bit string is then scrambled iteratively using a pseudo-random number generator and finally encrypted using a bit level permutation OMFLIP.
The performance is tested, illustrated and discussed.
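The decomposition step described above can be sketched as follows. This is a generic bit-plane illustration, not the paper's implementation; the scanning-path compression, iterative scrambling, and OMFLIP permutation stages are omitted.

```python
def bit_planes(image, bits=8):
    """Decompose a gray-level image (list of rows of ints in [0, 255])
    into `bits` binary images, most significant plane first."""
    return [[[(pix >> b) & 1 for pix in row] for row in image]
            for b in range(bits - 1, -1, -1)]

def reassemble(planes):
    """Inverse of bit_planes: recombine binary planes into gray levels."""
    bits = len(planes)
    h, w = len(planes[0]), len(planes[0][0])
    return [[sum(planes[p][i][j] << (bits - 1 - p) for p in range(bits))
             for j in range(w)] for i in range(h)]
```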
This article presents the SIRIUS-LTG-UiO system for the SemEval 2018 Task 7 on Semantic Relation Extraction and Classification in Scientific Papers.
First we extract the shortest dependency path (sdp) between two entities, then we introduce a convolutional neural network (CNN) which takes the shortest dependency path embeddings as input and performs relation classification with differing objectives for each subtask of the shared task.
This approach achieved overall F1 scores of 76.7 and 83.2 for relation classification on clean and noisy data, respectively.
Furthermore, for combined relation extraction and classification on clean data, it obtained F1 scores of 37.4 and 33.6 for each phase.
Our system ranks 3rd in all three sub-tasks of the shared task.
Simplified Molecular Input Line Entry System (SMILES) is a single line text representation of a unique molecule.
One molecule can, however, have multiple SMILES strings, which is why canonical SMILES have been defined, ensuring a one-to-one correspondence between a SMILES string and a molecule.
Here the fact that multiple SMILES represent the same molecule is explored as a technique for data augmentation of a molecular QSAR dataset modeled by a long short term memory (LSTM) cell based neural network.
The augmented dataset was 130 times bigger than the original.
The network trained with the augmented dataset shows better performance on a test set when compared to a model built with only one canonical SMILES string per molecule.
The correlation coefficient R2 on the test set was improved from 0.56 to 0.66 when using SMILES enumeration, and the root mean square error (RMS) likewise fell from 0.62 to 0.55.
The technique also works in the prediction phase.
By taking the average per molecule of the predictions for the enumerated SMILES a further improvement to a correlation coefficient of 0.68 and a RMS of 0.52 was found.
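The prediction-phase averaging can be sketched as below. Both `model_predict` and the variant list are hypothetical stand-ins: actual SMILES enumeration requires a cheminformatics toolkit such as RDKit and a trained LSTM model, neither of which is shown here.

```python
# Hypothetical sketch of test-time augmentation by SMILES enumeration:
# average the model's prediction over all enumerated SMILES strings of
# one molecule, as done in the paper's prediction phase.

def predict_with_enumeration(model_predict, smiles_variants):
    """model_predict: callable mapping a SMILES string to a float;
    smiles_variants: enumerated SMILES strings of a single molecule."""
    preds = [model_predict(s) for s in smiles_variants]
    return sum(preds) / len(preds)
```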
High-dimensional time series are common in many domains.
Since human cognition is not optimized to work well in high-dimensional spaces, these areas could benefit from interpretable low-dimensional representations.
However, most representation learning algorithms for time series data are difficult to interpret.
This is due to non-intuitive mappings from data features to salient properties of the representation and non-smoothness over time.
To address this problem, we propose a new representation learning framework building on ideas from interpretable discrete dimensionality reduction and deep generative modeling.
This framework allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance.
We introduce a new way to overcome the non-differentiability in discrete representation learning and present a gradient-based version of the traditional self-organizing map algorithm that is more performant than the original.
Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the representation space.
This model uncovers the temporal transition structure, improves clustering performance even further and provides additional explanatory insights as well as a natural representation of uncertainty.
We evaluate our model in terms of clustering performance and interpretability on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real world medical time series application on the eICU data set.
Our learned representations compare favorably with competitor methods and facilitate downstream tasks on the real world data.
Monocular depth estimation aims at estimating a pixelwise depth map for a single image, which has wide applications in scene understanding and autonomous driving.
Existing supervised and unsupervised methods face great challenges.
Supervised methods require large amounts of depth measurement data, which are generally difficult to obtain, while unsupervised methods are usually limited in estimation accuracy.
Synthetic data generated by graphics engines provide a possible solution for collecting large amounts of depth data.
However, the large domain gaps between synthetic and realistic data make directly training with them challenging.
In this paper, we propose to use the stereo matching network as a proxy to learn depth from synthetic data and use predicted stereo disparity maps for supervising the monocular depth estimation network.
Cross-domain synthetic data could be fully utilized in this novel framework.
Different strategies are proposed to ensure learned depth perception capability well transferred across different domains.
Our extensive experiments show state-of-the-art results of monocular depth estimation on KITTI dataset.
The objective of this paper is to introduce an artificial intelligence based optimization approach, which is inspired by Piaget's theory of cognitive development.
The approach has been designed according to essential processes that an individual may experience while learning something new or improving his / her knowledge.
These processes are associated with Piaget's ideas on an individual's cognitive development.
The approach expressed in this paper is a simple algorithm employing swarm intelligence oriented tasks in order to overcome single-objective optimization problems.
For evaluating effectiveness of this early version of the algorithm, test operations have been done via some benchmark functions.
The obtained results show that the approach/algorithm can be an alternative to existing methods in the literature for single-objective optimization.
The authors have suggested the name: Cognitive Development Optimization Algorithm (CoDOA) for the related intelligent optimization approach.
The mobile phone 1800 MHz band is already allowed to be used on some airlines.
Many studies have discussed lowering the mobile phone output power to a minimum of 0 dBm, but seldom address the Random Access Channel (RACH), which emits at the highest power of 30 dBm at the instant a call is made and is not controllable until the connection between the base station and the mobile is established. Hence, this paper examines the impact of the RACH and the TDMA noise generated in the aircraft.
The motivation behind this paper is to dissect a secure network for Business-to-Business (B2B) applications by implementing Access Control Lists (ACL) and Service Level Agreements (SLA).
This data provides the nature of attacks reported as external or internal attacks.
This paper presents the initial findings on attacks, the types of attacks, and their ratio within a specific time period.
It demonstrates advanced techniques and methodology to reduce attacks and vulnerabilities, minimize the ratio of attacks on the network and applications, keep the network secure, and run applications smoothly.
It also identifies the location of attacks, the reason behind the attack and the technique used in attacking.
The whole field of system security is vast and in an evolutionary stage.
To comprehend the research being performed today, background knowledge of the web, attacks, and security is vital, and these topics are therefore reviewed.
It provides the statistical analytics about various attacks and nature of attacks for acquiring the results through simulation to prove the hypothesis.
Identifying implicit discourse relations between text spans is a challenging task because it requires understanding the meaning of the text.
To tackle this task, recent studies have tried several deep learning methods but few of them exploited the syntactic information.
In this work, we explore the idea of incorporating syntactic parse tree into neural networks.
Specifically, we employ the Tree-LSTM model and Tree-GRU model, which are based on the tree structure, to encode the arguments in a relation.
Moreover, we further leverage the constituent tags to control the semantic composition process in these tree-structured neural networks.
Experimental results show that our method achieves state-of-the-art performance on PDTB corpus.
Recent years have seen major innovations in developing energy-efficient wireless technologies such as Bluetooth Low Energy (BLE) for Internet of Things (IoT).
Despite demonstrating significant benefits in providing low power transmission and massive connectivity, hardly any of these technologies have made it to directly connect to the Internet.
Recent advances demonstrate the viability of direct communication among heterogeneous IoT devices with incompatible physical (PHY) layers.
These techniques, however, require modifications in transmission power or time, which may affect the media access control (MAC) layer behaviors in legacy networks.
In this paper, we argue that the frequency domain can serve as a free side channel with minimal interruptions to legacy networks.
To this end, we propose DopplerFi, a communication framework that enables a two-way communication channel between BLE and Wi-Fi by injecting artificial Doppler shifts, which can be decoded by sensing the patterns in the Gaussian frequency shift keying (GFSK) demodulator and Channel State Information (CSI).
The artificial Doppler shifts can be compensated by the inherent frequency synchronization module and thus have a negligible impact on legacy communications.
Our evaluation using commercial off-the-shelf (COTS) BLE chips and 802.11-compliant testbeds has demonstrated that DopplerFi can achieve throughput up to 6.5 Kbps at the cost of less than 0.8% throughput loss.
Utilizing device-to-device (D2D) connections among mobile devices is promising to meet the increasing throughput demand over cellular links.
In particular, when mobile devices are in close proximity of each other and are interested in the same content, D2D connections such as Wi-Fi Direct can be opportunistically used to construct a cooperative (and jointly operating) cellular and D2D networking system.
However, it is crucial to understand, quantify, and exploit the potential of network coding for cooperating mobile devices in the joint cellular and D2D setup.
In this paper, we consider this problem, and (i) develop a network coding framework, namely NCMI, for cooperative mobile devices in the joint cellular and D2D setup, where cellular and D2D link capacities are the same, and (ii) characterize the performance of the proposed network coding framework, where we use packet completion time, which is the number of transmission slots to recover all packets, as a performance metric.
We demonstrate the benefits of our network coding framework through simulations.
An instance with a bad mask might make a composite image that uses it look fake.
This encourages us to learn segmentation by generating realistic composite images.
To achieve this, we propose a novel framework that exploits a new proposed prior called the independence prior based on Generative Adversarial Networks (GANs).
The generator produces an image with multiple category-specific instance providers, a layout module and a composition module.
Firstly, each provider independently outputs a category-specific instance image with a soft mask.
Then the provided instances' poses are corrected by the layout module.
Lastly, the composition module combines these instances into a final image.
Training with adversarial loss and penalty for mask area, each provider learns a mask that is as small as possible but enough to cover a complete category-specific instance.
Weakly supervised semantic segmentation methods widely use grouping cues modeling the association between image parts, which are either artificially designed or learned with costly segmentation labels or only modeled on local pairs.
Unlike them, our method automatically models the dependence between any parts and learns instance segmentation.
We apply our framework in two cases: (1) Foreground segmentation on category-specific images with box-level annotation.
(2) Unsupervised learning of instance appearances and masks with only one image of homogeneous object cluster (HOC).
We get appealing results in both tasks, which shows the independence prior is useful for instance segmentation and it is possible to unsupervisedly learn instance masks with only one image.
This article explores the coalitional stability of a new cooperative control policy for freeways and parallel queuing facilities with multiple servers.
Based on predicted future delays per queue or lane, a VOT-heterogeneous population of agents can agree to switch lanes or queues and transfer payments to each other in order to minimize the total cost of the incoming platoon.
The strategic interaction is captured by an n-level Stackelberg model with coalitions, while the cooperative structure is formulated as a partition function game (PFG).
The stability concept explored is the strong-core for PFGs, which we found appropriate given the nature of the problem.
This concept ensures that the efficient allocation is individually rational and coalitionally stable.
We analyze this control mechanism for two settings: a static vertical queue and a dynamic horizontal queue.
For the former, we first characterize the properties of the underlying cooperative game.
Our simulation results suggest that the setting is always strong-core stable.
For the latter, we propose a new relaxation program for the strong-core concept.
Our simulation results on a freeway bottleneck with constant outflow using Newell's car-following model show the imputations to be generally strong-core stable and the coalitional instabilities to remain small with regard to users' costs.
Rigid structure-from-motion (RSfM) and non-rigid structure-from-motion (NRSfM) have long been treated in the literature as separate (different) problems.
Inspired by a previous work which solved directly for 3D scene structure by factoring the relative camera poses out, we revisit the principle of "maximizing rigidity" in structure-from-motion literature, and develop a unified theory which is applicable to both rigid and non-rigid structure reconstruction in a rigidity-agnostic way.
We formulate these problems as a convex semi-definite program, imposing constraints that seek to apply the principle of minimizing non-rigidity.
Our results demonstrate the efficacy of the approach, with state-of-the-art accuracy on various 3D reconstruction problems.
We explore the hypothesis that it is possible to obtain information about the dynamics of a blog network by analysing the temporal relationships between blogs at a semantic level, and that this type of analysis adds to the knowledge that can be extracted by studying the network only at the structural level of URL links.
We present an algorithm to automatically detect fine-grained discussion topics, characterized by n-grams and time intervals.
We then propose a probabilistic model to estimate the temporal relationships that blogs have with one another.
We define the precursor score of blog A in relation to blog B as the probability that A enters a new topic before B, discounting the effect created by asymmetric posting rates.
Network-level metrics of precursor and laggard behavior are derived from these dyadic precursor score estimations.
This model is used to analyze a network of French political blogs.
The scores are compared to traditional link degree metrics.
We obtain insights into the dynamics of topic participation on this network, as well as the relationship between precursor/laggard and linking behaviors.
We validate and analyze results with the help of an expert on the French blogosphere.
Finally, we propose possible applications to the improvement of search engine ranking algorithms.
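A toy version of the precursor score can be sketched as follows. This is a hypothetical simplification: the paper's probabilistic model also discounts the effect of asymmetric posting rates, which this fraction-based version ignores.

```python
# Illustrative precursor score: the score of blog A relative to blog B
# is the fraction of shared topics that A entered first.

def precursor_score(entry_times_a, entry_times_b):
    """entry_times_*: dict topic -> first time the blog posted on it."""
    shared = set(entry_times_a) & set(entry_times_b)
    if not shared:
        return 0.0
    first = sum(1 for t in shared if entry_times_a[t] < entry_times_b[t])
    return first / len(shared)
```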
The emergence and popularization of online social networks suddenly made available a large amount of data from social organization, interaction and human behavior.
All this information opens new perspectives and challenges to the study of social systems, being of interest to many fields.
Although most online social networks are recent (less than fifteen years old), a vast amount of scientific papers was already published on this topic, dealing with a broad range of analytical methods and applications.
This work describes how computational researches have approached this subject and the methods used to analyze such systems.
Founded on a wide though non-exhaustive review of the literature, a taxonomy is proposed to classify and describe different categories of research.
Each research category is described and the main works, discoveries and perspectives are highlighted.
We consider a general small-scale market for agent-to-agent resource sharing, in which each agent could either be a server (seller) or a client (buyer) in each time period.
In every time period, a server has a certain amount of resources that any client could consume, and randomly gets matched with a client.
Our target is to maximize the resource utilization in such an agent-to-agent market, where the agents are strategic.
During each transaction, the server gets money and the client gets resources.
Hence, trade ratio maximization implies efficiency maximization of our system.
We model the proposed market system through a Mean Field Game approach and prove the existence of the Mean Field Equilibrium, which can achieve an almost 100% trade ratio.
Finally, we carry out a simulation study motivated by an agent-to-agent computing market, and a case study on a proposed photovoltaic market, and show the designed market benefits both individuals and the system as a whole.
In this contribution to the 3rd CHiME Speech Separation and Recognition Challenge (CHiME-3) we extend the acoustic front-end of the CHiME-3 baseline speech recognition system by a coherence-based Wiener filter which is applied to the output signal of the baseline beamformer.
To compute the time- and frequency-dependent postfilter gains the ratio between direct and diffuse signal components at the output of the baseline beamformer is estimated and used as approximation of the short-time signal-to-noise ratio.
The proposed spectral enhancement technique is evaluated with respect to word error rates of the CHiME-3 challenge baseline speech recognition system using real speech recorded in public environments.
Results confirm the effectiveness of the coherence-based postfilter when integrated into the front-end signal enhancement.
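The standard Wiener gain rule underlying such a postfilter can be sketched as below, with the estimated direct-to-diffuse ratio standing in for the short-time SNR as the abstract describes; the coherence-based estimation of that ratio itself is not shown.

```python
def wiener_postfilter_gains(ddr):
    """Per time-frequency Wiener gains G = SNR / (1 + SNR), where the
    direct-to-diffuse ratio estimate `ddr` (list of per-bin ratios)
    approximates the short-time SNR."""
    return [r / (1.0 + r) for r in ddr]
```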
The prediction of the long-term impact of a scientific article is a challenging task, addressed by bibliometricians by resorting to a proxy whose reliability increases with the breadth of the citation window.
In the national research assessment exercises using metrics the citation window is necessarily short, but in some cases is sufficient to advise the use of simple citations.
For the Italian VQR 2011-2014, the choice was instead made to adopt a linear weighted combination of citations and journal metric percentiles, with weights differentiated by discipline and year.
Given the strategic importance of the exercise, whose results inform the allocation of a significant share of resources for the national academic system, we examined whether the predictive power of the proposed indicator is stronger than the simple citation count.
The results show the opposite, for all disciplines in the sciences and citation windows above two years.
We propose a novel encoding/transmission scheme called continuous chain (CC) transmission that is able to improve the finite-length performance of a system using spatially-coupled low-density parity-check (SC-LDPC) codes.
In CC transmission, instead of transmitting a sequence of independent codewords from a terminated SC-LDPC code chain, we connect multiple chains in a layered format, where encoding, transmission, and decoding are now performed in a continuous fashion.
The connections between chains are created at specific points, chosen to improve the finite-length performance of the code structure under iterative decoding.
We describe the design of CC schemes for different SC-LDPC code ensembles constructed from protographs: a (J,K)-regular SC-LDPC code chain, a spatially-coupled repeat-accumulate (SC-RA) code, and a spatially-coupled accumulate-repeat-jagged-accumulate (SC-ARJA) code.
In all cases, significant performance improvements are reported and, in addition, it is shown that using CC transmission only requires a small increase in decoding complexity and decoding delay with respect to a system employing a single SC-LDPC code chain for transmission.
This paper presents a novel mechanism to endogenously determine the fair division of a state into electoral districts in a two-party setting.
No geometric constraints are imposed on voter distributions or district shapes; instead, it is assumed that any partition of the population into districts of equal population is feasible.
One party divides the map, then the other party chooses a minimum threshold level of support needed to win a district.
Districts in which neither party meets this threshold are awarded randomly.
Despite the inherent asymmetry, the equilibria of this mechanism always yield fair outcomes, up to integer rounding.
Parallel computing can offer an enormous advantage regarding the performance for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics.
GPUs are high-performance many-core processors that can achieve very high FLOP rates.
Since the first idea of using GPU for general purpose computing, things have evolved and now there are several approaches to GPU programming: CUDA from NVIDIA and Stream from AMD.
CUDA is now a popular programming model for general purpose computations on GPU for C/C++ programmers.
A great number of applications have been ported to the CUDA programming model, obtaining speedups of orders of magnitude compared to optimized CPU implementations.
In this paper we present an implementation of a library for solving linear systems using the CUDA framework.
We present the results of performance tests and show that using the GPU one can obtain speedups of approximately 80 times compared with a CPU implementation.
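The abstract does not specify which solver the library implements; as a hedged illustration, the sketch below shows Jacobi iteration in plain NumPy, a classic linear solver whose per-row independence is exactly the property that maps well onto one-thread-per-row GPU kernels of the kind written in CUDA.

```python
import numpy as np

def jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve Ax = b by Jacobi iteration.

    Each component update is independent of the others, which is why
    this scheme parallelizes naturally on a GPU (one thread per row).
    """
    D = np.diag(A)               # diagonal of A
    R = A - np.diagflat(D)       # off-diagonal part of A
    x = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        x_new = (b - R @ x) / D  # all rows updated "in parallel"
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Diagonally dominant system, so Jacobi is guaranteed to converge.
A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
x = jacobi(A, b)
```

On a GPU, the `x_new` update would be one kernel launch with one thread per component; the serial loop above is only for illustration.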
Cause-effect relations are an important part of human knowledge.
In real life, humans often reason about complex causes linked to complex effects.
By comparison, existing formalisms for representing knowledge about causal relations are quite limited in the kind of specifications of causes and effects they allow.
In this paper, we present the new language C-Log, which offers a significantly more expressive representation of effects, including such features as the creation of new objects.
We show how C-Log integrates with first-order logic, resulting in the language FO(C).
We also compare FO(C) with several related languages and paradigms, including inductive definitions, disjunctive logic programming, business rules and extensions of Datalog.
Vulnerability of dedicated hash functions to various attacks has made the task of designing hash functions much more challenging.
This provides us a strong motivation to design a new cryptographic hash function, viz. HF-hash.
This is a hash function whose compression function is designed using the first 32 polynomials of HFE Challenge-1 with 64 variables, forcing the remaining 16 variables to zero.
HF-hash gives 256 bits message digest and is as efficient as SHA-256.
It is secure against the differential attacks proposed by Chabaud and Joux, as well as by Wang et al., applied to SHA-0 and SHA-1.
One major challenge in training Deep Neural Networks is preventing overfitting.
Many techniques such as data augmentation and novel regularizers such as Dropout have been proposed to prevent overfitting without requiring a massive amount of training data.
In this work, we propose a new regularizer called DeCov which leads to significantly reduced overfitting (as indicated by the difference between train and val performance), and better generalization.
Our regularizer encourages diverse or non-redundant representations in Deep Neural Networks by minimizing the cross-covariance of hidden activations.
This simple intuition has been explored in a number of past works but surprisingly has never been applied as a regularizer in supervised learning.
Experiments across a range of datasets and network architectures show that this loss always reduces overfitting while almost always maintaining or increasing generalization performance and often improving performance over Dropout.
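A minimal NumPy sketch of a DeCov-style penalty, following the description above (penalizing the cross-covariance of hidden activations while leaving per-unit variances free); the exact normalization used in the paper may differ.

```python
import numpy as np

def decov_loss(h):
    """DeCov-style penalty on a batch of hidden activations.

    h: (batch, features). Penalizes the off-diagonal entries of the
    activation covariance matrix, encouraging non-redundant features.
    """
    hc = h - h.mean(axis=0)             # center activations over the batch
    C = hc.T @ hc / h.shape[0]          # covariance matrix of activations
    frob = np.sum(C ** 2)               # squared Frobenius norm of C
    diag = np.sum(np.diag(C) ** 2)      # per-unit variances stay unpenalized
    return 0.5 * (frob - diag)

# Perfectly correlated features incur a positive penalty...
h_corr = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
# ...while decorrelated ones incur none.
h_ind = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
```

In training, this scalar would simply be added to the task loss with a small weight.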
Policy iteration (PI) is a recursive process of policy evaluation and improvement for solving an optimal decision-making problem, e.g., a reinforcement learning (RL) or optimal control problem, and has served as the foundation for developing RL methods.
Motivated by integral PI (IPI) schemes in optimal control and RL methods in continuous time and space (CTS), this paper proposes on-policy IPI to solve the general RL problem in CTS, with its environment modeled by an ordinary differential equation (ODE).
In such continuous domain, we also propose four off-policy IPI methods---two are the ideal PI forms that use advantage and Q-functions, respectively, and the other two are natural extensions of the existing off-policy IPI schemes to our general RL framework.
Compared to the IPI methods in optimal control, the proposed IPI schemes can be applied to more general situations and do not require an initial stabilizing policy to run; they are also strongly relevant to the RL algorithms in CTS such as advantage updating, Q-learning, and value-gradient based (VGB) greedy policy improvement.
Our on-policy IPI is basically model-based but can be made partially model-free; each off-policy method is also either partially or completely model-free.
The mathematical properties of the IPI methods---admissibility, monotone improvement, and convergence towards the optimal solution---are all rigorously proven, together with the equivalence of on- and off-policy IPI.
Finally, the IPI methods are simulated with an inverted-pendulum model to support the theory and verify the performance.
Power and energy consumption is a fundamental issue in Body Sensor Networks (BSNs) since nodes must operate properly and autonomously for a certain period of time without battery replacement or change.
This is due to the fact that the sensors in BSNs are either implanted in the body or positioned very near the body.
Thus, extending the time between battery replacements is of utmost importance.
Most existing research suggests developing improved battery cells or energy-aware routing protocols to tackle energy consumption in WBSNs.
However, most energy consumption in WBSNs occurs as a result of mobility in routing and sensor node placement.
Therefore, improving battery cells alone might not solve the energy consumption problem in WBSNs.
The Graham-Diaconis inequality shows the equivalence between two well-known methods of measuring the similarity of two given ranked lists of items: Spearman's footrule and Kendall's tau.
The original inequality assumes unweighted items in input lists.
In this paper, we first define versions of these methods for weighted items.
We then prove a generalization of the inequality for the weighted versions.
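For reference, the unweighted quantities related by the inequality (Kendall's tau distance K and Spearman's footrule F satisfy K ≤ F ≤ 2K) can be computed as below; the weighted versions defined in the paper generalize these sums with per-item weights, which are not reproduced here.

```python
from itertools import combinations

def footrule(sigma):
    """Spearman's footrule: sum of |i - sigma(i)| against the identity."""
    return sum(abs(i - s) for i, s in enumerate(sigma))

def kendall(sigma):
    """Kendall's tau distance: number of discordant pairs vs. the identity."""
    return sum(1 for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

# Graham-Diaconis: K <= F <= 2K for any permutation.
sigma = (2, 0, 3, 1)
K, F = kendall(sigma), footrule(sigma)   # K = 3, F = 6, so 3 <= 6 <= 6
```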
We introduce Hair-GANs, an architecture of generative adversarial networks, to recover the 3D hair structure from a single image.
The goal of our networks is to build a parametric transformation from 2D hair maps to 3D hair structure.
The 3D hair structure is represented as a 3D volumetric field which encodes both the occupancy and the orientation information of the hair strands.
Given a single hair image, we first align it with a bust model and extract a set of 2D maps encoding the hair orientation information in 2D, along with the bust depth map to feed into our Hair-GANs.
With our generator network, we compute the 3D volumetric field as the structure guidance for the final hair synthesis.
The modeling results not only resemble the hair in the input image but also possess many vivid details in other views.
The efficacy of our method is demonstrated by using a variety of hairstyles and comparing with the prior art.
In this work we combine two research threads from Vision/Graphics and Natural Language Processing to formulate an image generation task conditioned on attributes in a multi-turn setting.
By multi-turn, we mean the image is generated in a series of steps of user-specified conditioning information.
Our proposed approach is practically useful and offers insights into neural interpretability.
We introduce a framework that includes a novel training algorithm as well as model improvements built for the multi-turn setting.
We demonstrate that this framework generates a sequence of images that match the given conditioning information and that this task is useful for more detailed benchmarking and analysis of conditional image generation methods.
We study gaze estimation on tablets. Our key design goal is uncalibrated gaze estimation using the front-facing camera during natural use of tablets, where the posture and method of holding the tablet are not constrained.
We collected the first large unconstrained gaze dataset of tablet users, the Rice TabletGaze dataset.
The dataset consists of 51 subjects, each with 4 different postures and 35 gaze locations.
Subjects vary in race, gender and in their need for prescription glasses, all of which might impact gaze estimation accuracy.
Driven by our observations on the collected data, we present a TabletGaze algorithm for automatic gaze estimation using multi-level HoG features and a Random Forests regressor.
The TabletGaze algorithm achieves a mean error of 3.17 cm.
We perform extensive evaluation on the impact of various factors such as dataset size, race, wearing glasses and user posture on the gaze estimation accuracy and make important observations about the impact of these factors.
Given the wide success of convolutional neural networks (CNNs) applied to natural images, researchers have begun to apply them to neuroimaging data.
To date, however, exploration of novel CNN architectures tailored to neuroimaging data has been limited.
Several recent works fail to leverage the 3D structure of the brain, instead treating the brain as a set of independent 2D slices.
Approaches that do utilize 3D convolutions rely on architectures developed for object recognition tasks in natural 2D images.
Such architectures make assumptions about the input that may not hold for neuroimaging.
For example, existing architectures assume that patterns in the brain exhibit translation invariance.
However, a pattern in the brain may have different meaning depending on where in the brain it is located.
There is a need to explore novel architectures that are tailored to brain images.
We present two simple modifications to existing CNN architectures based on brain image structure.
Applied to the task of brain age prediction, our network achieves a mean absolute error (MAE) of 1.4 years and trains 30% faster than a CNN baseline that achieves a MAE of 1.6 years.
Our results suggest that lessons learned from developing models on natural images may not directly transfer to neuroimaging tasks.
Instead, there remains a large space of unexplored questions regarding model development in this area, whose answers may differ from conventional wisdom.
Spoken dialogue systems allow humans to interact with machines using natural speech.
As such, they have many benefits.
By using speech as the primary communication medium, a computer interface can facilitate swift, human-like acquisition of information.
In recent years, speech interfaces have become ever more popular, as is evident from the rise of personal assistants such as Siri, Google Now, Cortana and Amazon Alexa.
Recently, data-driven machine learning methods have been applied to dialogue modelling and the results achieved for limited-domain applications are comparable to or outperform traditional approaches.
Methods based on Gaussian processes are particularly effective as they enable good models to be estimated from limited training data.
Furthermore, they provide an explicit estimate of the uncertainty which is particularly useful for reinforcement learning.
This article explores the additional steps that are necessary to extend these methods to model multiple dialogue domains.
We show that Gaussian process reinforcement learning is an elegant framework that naturally supports a range of methods, including prior knowledge, Bayesian committee machines and multi-agent learning, for facilitating extensible and adaptable dialogue systems.
The quality of the data in spreadsheets is less discussed than the structural integrity of the formulas.
Yet it is an area of great interest to the owners and users of the spreadsheet.
This paper provides an overview of Information Quality (IQ) and Data Quality (DQ) with specific reference to how data is sourced, structured, and presented in spreadsheets.
Users may struggle to formulate an adequate textual query for their information need.
Search engines assist the users by presenting query suggestions.
To preserve the original search intent, suggestions should be context-aware and account for the previous queries issued by the user.
Achieving context awareness is challenging due to data sparsity.
We present a probabilistic suggestion model that is able to account for sequences of previous queries of arbitrary lengths.
Our novel hierarchical recurrent encoder-decoder architecture allows the model to be sensitive to the order of queries in the context while avoiding data sparsity.
Additionally, our model can produce suggestions for rare, or long-tail, queries.
The produced suggestions are synthetic and are sampled one word at a time, using computationally cheap decoding techniques.
This is in contrast to current synthetic suggestion models relying upon machine learning pipelines and hand-engineered feature sets.
Results show that our model outperforms existing context-aware approaches in a next-query prediction setting.
In addition to query suggestion, our model is general enough to be used in a variety of other applications.
The paper introduces sufficient conditions for input-to-state stability (ISS) of a class of impulsive systems with jump maps that depend on time.
Such systems can naturally represent an interconnection of several impulsive systems with different impulse time sequences.
Using the concept of an ISS-Lyapunov function for subsystems, a small-gain type theorem equipped with a new dwell-time condition is proven to verify ISS of the interconnection.
Odometry forms an important component of many manned and autonomous systems.
In the rail industry in particular, having precise and robust odometry is crucial for the correct operation of the Automatic Train Protection systems that ensure the safety of high-speed trains in operation around the world.
Two problems commonly encountered in such odometry systems are miscalibration of the wheel encoders and slippage of the wheels under acceleration and braking, resulting in incorrect velocity estimates.
This paper introduces an odometry system that addresses these problems.
It comprises an Extended Kalman Filter that tracks the calibration of the wheel encoders as state variables, and a measurement pre-processing stage called Sensor Consensus Analysis (SCA) that scales the uncertainty of a measurement based on how consistent it is with the measurements of the other sensors.
SCA uses the statistical z-test to determine when an individual measurement is inconsistent with the other measurements, and scales the uncertainty until the z-test passes.
This system is demonstrated on data from German Intercity-Express high-speed trains and it is shown to successfully deal with errors due to miscalibration and wheel slip.
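The abstract does not give the exact SCA update rule; the following is a hypothetical simplification in Python. Each sensor's measurement is z-tested against a robust consensus of the others (here, their median, an illustrative choice), and an inconsistent measurement has its variance inflated just enough that the z-test passes.

```python
import math
from statistics import median

def sca_scale(measurements, variances, z_max=2.0):
    """Sensor Consensus Analysis sketch (hypothetical simplification).

    For each measurement, compute a z-score against the median of the
    other sensors; if it is inconsistent (|z| > z_max), inflate its
    variance to the smallest value for which the z-test passes.
    """
    scaled = list(variances)
    n = len(measurements)
    for i, (m, v) in enumerate(zip(measurements, variances)):
        others = [measurements[j] for j in range(n) if j != i]
        mu = median(others)
        z = abs(m - mu) / math.sqrt(v)
        if z > z_max:
            # choose variance so that the z-score equals exactly z_max
            scaled[i] = ((m - mu) / z_max) ** 2
    return scaled

# A wheel-slip outlier (10.0 m/s) gets its uncertainty inflated,
# so the downstream filter effectively ignores it.
meas = [1.0, 1.1, 0.9, 10.0]
var = [0.01, 0.01, 0.01, 0.01]
new_var = sca_scale(meas, var)
```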
Bearing only cooperative localization has been used successfully on aerial and ground vehicles.
In this paper we present an extension of the approach to the underwater domain.
The focus is on adapting the technique to handle the challenging visibility conditions underwater.
Furthermore, data from inertial, magnetic, and depth sensors are utilized to improve the robustness of the estimation.
In addition to robotic applications, the presented technique can be used for cave mapping and for marine archeology surveying, both performed by human divers.
Experimental results from different environments, including a fresh water, low visibility, lake in South Carolina; a cavern in Florida; and coral reefs in Barbados during the day and during the night, validate the robustness and the accuracy of the proposed approach.
A key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks which involve sequential data, is their ability to model intricate long-term temporal dependencies.
However, a well-established measure of RNNs' long-term memory capacity is lacking, and thus formal understanding of the effect of depth on their ability to correlate data throughout time is limited.
Specifically, existing depth efficiency results on convolutional networks do not suffice in order to account for the success of deep RNNs on data of varying lengths.
In order to address this, we introduce a measure of the network's ability to support information flow across time, referred to as the Start-End separation rank, which reflects the distance of the function realized by the recurrent network from modeling no dependency between the beginning and end of the input sequence.
We prove that deep recurrent networks support Start-End separation ranks which are combinatorially higher than those supported by their shallow counterparts.
Thus, we establish that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies, and provide an exemplar of quantifying this key attribute which may be readily extended to other RNN architectures of interest, e.g. variants of LSTM networks.
We obtain our results by considering a class of recurrent networks referred to as Recurrent Arithmetic Circuits, which merge the hidden state with the input via the Multiplicative Integration operation, and empirically demonstrate the discussed phenomena on common RNNs.
Finally, we employ the tool of quantum Tensor Networks to gain additional graphic insight regarding the complexity brought forth by depth in recurrent networks.
Clustering is a useful data exploratory method with its wide applicability in multiple fields.
However, data clustering greatly relies on initialization of cluster centers that can result in large intra-cluster variance and dead centers, therefore leading to sub-optimal solutions.
This paper proposes a novel variance based version of the conventional Moving K-Means (MKM) algorithm called Variance Based Moving K-Means (VMKM) that can partition data into optimal homogeneous clusters, irrespective of cluster initialization.
The algorithm utilizes a novel distance metric and a unique data-element selection criterion to transfer selected elements between clusters, achieving low intra-cluster variance and avoiding dead centers.
Quantitative and qualitative comparison with various clustering techniques is performed on four datasets selected from image processing, bioinformatics, remote sensing and the stock market respectively.
An extensive analysis highlights the superior performance of the proposed method over other techniques.
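The paper defines its own distance metric and selection criterion, which are not reproduced in the abstract; as a neutral illustration, the quantity VMKM seeks to keep low and balanced, the intra-cluster variance, can be computed as follows.

```python
import numpy as np

def intra_cluster_variance(X, labels, centers):
    """Per-cluster variance: mean squared distance to the assigned center.

    A cluster whose variance dominates the others is a natural candidate
    for shedding its farthest members in a moving-k-means style update.
    """
    variances = []
    for k, c in enumerate(centers):
        members = X[labels == k]
        variances.append(float(np.mean(np.sum((members - c) ** 2, axis=1))))
    return variances

# Two clusters: a tight one (variance 0.25) and a spread-out one (4.0).
X = np.array([[0.0, 0.0], [0.0, 1.0], [9.0, 0.0], [9.0, 4.0]])
labels = np.array([0, 0, 1, 1])
centers = np.array([[0.0, 0.5], [9.0, 2.0]])
v = intra_cluster_variance(X, labels, centers)
```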
Image-to-image translation is considered a next frontier in the field of medical image analysis, with numerous potential applications.
However, recent advances in this field offer individualized solutions, either utilizing specialized task-specific architectures or suffering from limited capacity and thus requiring refinement through non-end-to-end training.
In this paper, we propose a novel general purpose framework for medical image-to-image translation, titled MedGAN, which operates in an end-to-end manner on the image level.
MedGAN builds upon recent advances in the field of generative adversarial networks (GANs) by combining the adversarial framework with a unique combination of non-adversarial losses that capture the high- and low-frequency components of the desired target modality.
Namely, we utilize a discriminator network as a trainable feature extractor which penalizes the discrepancy between the translated medical images and the desired modalities in the pixel and perceptual sense.
Moreover, style-transfer losses are utilized to match the textures and fine-structures of the desired target images to the outputs.
Additionally, we present a novel generator architecture, titled CasNet, which enhances the sharpness of the translated medical outputs through progressive refinement via encoder-decoder pairs.
To demonstrate the effectiveness of our approach, we apply MedGAN on three novel and challenging applications: PET-CT translation, correction of MR motion artefacts and PET image denoising.
Qualitative and quantitative comparisons with state-of-the-art techniques have emphasized the superior performance of the proposed framework.
MedGAN can be directly applied as a general framework for future medical translation tasks.
In this research, we investigate the subject of path-finding.
A pruned version of the visibility graph based on Candidate Vertices is formulated, followed by a new visibility check technique.
This combination enables us to quickly identify the useful vertices and thus find the optimal path more efficiently.
The algorithm proposed is demonstrated on various path-finding cases.
The performance of the new technique on visibility graphs is compared to the traditional A* on Grids, Theta* and A* on Visibility Graphs in terms of path length, number of nodes evaluated, as well as computational time.
The key algorithmic contribution is that the new approach combines the merits of grid-based method and visibility graph-based method and thus yields better overall performance.
In the small target detection problem, a pattern to be located is orders of magnitude less numerous than other patterns present in the dataset.
This applies both to supervised detection, where the known template is expected to match in just a few areas, and to unsupervised anomaly detection, as anomalies are rare by definition.
This problem frequently arises in imaging applications, i.e., detection within a scene acquired by a camera.
To maximize available data about the scene, hyperspectral cameras are used; at each pixel, they record spectral data in hundreds of narrow bands.
The typical feature of hyperspectral imaging is that characteristic properties of target materials are visible in a small number of bands, where light of a certain wavelength interacts with characteristic molecules.
A target-independent band selection method based on statistical principles is a versatile tool for solving this problem in different practical applications.
Combination of a regular background and a rare standing out anomaly will produce a distortion in the joint distribution of hyperspectral pixels.
Higher Order Cumulants Tensors are a natural `window' into this distribution, allowing to measure properties and suggest candidate bands for removal.
While there have been attempts at producing band selection algorithms based on the 3rd cumulant tensor, i.e., the joint skewness, the literature lacks a systematic analysis of how the order of the cumulant tensor used affects the effectiveness of band selection in detection applications.
In this paper we present an analysis of a general algorithm for band selection based on higher order cumulants.
We discuss its usability related to the observed breaking points in performance, depending both on method order and the desired number of bands.
Finally we perform experiments and evaluate these methods in a hyperspectral detection scenario.
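As a simplified illustration of why higher-order moments flag anomaly-bearing bands, the sketch below ranks bands by marginal skewness, a first-order proxy for the joint skewness tensor the paper analyzes (the full cumulant-tensor machinery is not reproduced here): a rare bright anomaly injected into one band makes that band's third standardized moment stand out.

```python
import numpy as np

def band_skewness(cube):
    """Marginal (per-band) skewness of hyperspectral pixels.

    cube: (pixels, bands). Returns the third standardized moment of
    each band; a symmetric background yields values near zero, while
    a rare one-sided anomaly produces a visibly skewed distribution.
    """
    x = cube - cube.mean(axis=0)
    std = x.std(axis=0)
    return (x ** 3).mean(axis=0) / std ** 3

rng = np.random.default_rng(0)
pixels = rng.normal(size=(5000, 3))      # 3 symmetric background bands
pixels[:50, 1] += 8.0                    # 1% anomaly, visible only in band 1
skew = band_skewness(pixels)
best_band = int(np.argmax(np.abs(skew))) # band carrying the anomaly
```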
Glaucoma is the second leading cause of blindness all over the world, with approximately 60 million cases reported worldwide in 2010.
If undiagnosed in time, glaucoma causes irreversible damage to the optic nerve leading to blindness.
The optic nerve head examination, which involves measurement of cup-to-disc ratio, is considered one of the most valuable methods of structural diagnosis of the disease.
Estimation of cup-to-disc ratio requires segmentation of optic disc and optic cup on eye fundus images and can be performed by modern computer vision algorithms.
This work presents a universal approach for automatic optic disc and cup segmentation based on deep learning, namely a modification of the U-Net convolutional neural network.
Our experiments include comparison with the best known methods on publicly available databases DRIONS-DB, RIM-ONE v.3, DRISHTI-GS.
For both optic disc and cup segmentation, our method achieves quality comparable to current state-of-the-art methods, outperforming them in terms of the prediction time.
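To connect the segmentation output to the diagnostic measurement mentioned above, a vertical cup-to-disc ratio could be read off binary masks as sketched below; the vertical-extent definition used here is a common convention, not taken from this paper.

```python
import numpy as np

def cup_to_disc_ratio(disc_mask, cup_mask):
    """Vertical cup-to-disc ratio from binary segmentation masks.

    Computed as the ratio of the vertical extents (row spans) of the
    optic cup and optic disc segmentations.
    """
    def vertical_extent(mask):
        rows = np.where(mask.any(axis=1))[0]   # rows containing the region
        return rows.max() - rows.min() + 1
    return vertical_extent(cup_mask) / vertical_extent(disc_mask)

# Toy masks: disc spans 10 rows, cup spans 4 rows -> CDR = 0.4.
disc = np.zeros((20, 20), dtype=bool); disc[5:15, 5:15] = True
cup = np.zeros((20, 20), dtype=bool);  cup[8:12, 8:12] = True
cdr = cup_to_disc_ratio(disc, cup)
```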
In the last two decades, DNA self-assembly has grown into a major area of research attracting people from diverse background.
It has numerous potential applications, such as targeted drug delivery and artificial photosynthesis.
In the last decade, another area known as DNA origami has received wide attention, in which the M13 virus and carefully designed staple strands are used to fold DNA into desired 2-D and 3-D shapes.
In 2016, a group of researchers at MIT developed an automated design strategy for DNA nanostructures and an open-source, MATLAB-based software tool, 'daedalus'.
In this work, we present '3dnaprinter', a truly open-source, Java-based software tool (requiring no MATLAB) that can do the same work.
Speech emotion recognition is a challenging task for three main reasons: 1) human emotion is abstract, which means it is hard to distinguish; 2) in general, human emotion can only be detected in some specific moments during a long utterance; 3) speech data with emotional labeling is usually limited.
In this paper, we present a novel attention based fully convolutional network for speech emotion recognition.
We employ a fully convolutional network because it can handle variable-length speech without requiring segmentation, so critical information is not lost.
The proposed attention mechanism makes our model aware of which time-frequency regions of the speech spectrogram are more emotion-relevant.
Considering the limited data, transfer learning is also adopted to improve accuracy.
In particular, it is interesting to observe the clear improvement obtained with a model pre-trained on natural scene images.
Validated on the publicly available IEMOCAP corpus, the proposed model outperformed the state-of-the-art methods with a weighted accuracy of 70.4% and an unweighted accuracy of 63.9%.
Technological developments call for increasing perception and action capabilities of robots.
Among other skills, vision systems that can adapt to any possible change in the working conditions are needed.
Since these conditions are unpredictable, we need benchmarks that allow us to assess the generalization and robustness capabilities of our visual recognition algorithms.
In this work we focus on robotic kitting in unconstrained scenarios.
As a first contribution, we present a new visual dataset for the kitting task.
Differently from standard object recognition datasets, we provide images of the same objects acquired under various conditions where camera, illumination and background are changed.
This novel dataset allows for testing the robustness of robot visual recognition algorithms to a series of different domain shifts both in isolation and unified.
Our second contribution is a novel online adaptation algorithm for deep models, based on batch-normalization layers, which continuously adapts a model to the current working conditions.
Differently from standard domain adaptation algorithms, it does not require any image from the target domain at training time.
We benchmark the performance of the algorithm on the proposed dataset, showing its capability to fill the gap between the performances of a standard architecture and its counterpart adapted offline to the given target domain.
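A hypothetical, much-simplified sketch of the idea behind batch-normalization-based online adaptation: rather than freezing training-set statistics, the layer keeps updating its running mean and variance from each incoming test batch, so normalization tracks the current illumination and background conditions. The momentum value and update rule here are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

class OnlineBatchNorm:
    """Batch-norm layer whose statistics track the current conditions.

    At test time, the running mean and variance are updated from each
    incoming batch instead of being frozen, letting the model adapt to
    domain shift without any target-domain data at training time.
    """
    def __init__(self, dim, momentum=0.1, eps=1e-5):
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x):
        # exponential-moving-average update from the current batch
        m = self.momentum
        self.mean = (1 - m) * self.mean + m * x.mean(axis=0)
        self.var = (1 - m) * self.var + m * x.var(axis=0)
        return (x - self.mean) / np.sqrt(self.var + self.eps)
```

Feeding batches from a shifted domain drives the statistics toward that domain, which is the adaptation effect the paper exploits.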
We study the problem of compressing recurrent neural networks (RNNs).
In particular, we focus on the compression of RNN acoustic models, which are motivated by the goal of building compact and accurate speech recognition systems which can be run efficiently on mobile devices.
In this work, we present a technique for general recurrent model compression that jointly compresses both recurrent and non-recurrent inter-layer weight matrices.
We find that the proposed technique allows us to reduce the size of our Long Short-Term Memory (LSTM) acoustic model to a third of its original size with negligible loss in accuracy.
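The paper's technique jointly factorizes recurrent and inter-layer matrices; as a hedged illustration of the basic building block, the NumPy sketch below performs a truncated-SVD low-rank factorization of a single weight matrix, under the assumption that a low-rank structure exists.

```python
import numpy as np

def low_rank_compress(W, rank):
    """Factor a weight matrix W ~= A @ B with a truncated SVD.

    Storing A (m x r) and B (r x n) instead of W (m x n) cuts the
    parameter count roughly by a factor of m*n / (r*(m+n)).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]      # absorb singular values into A
    B = Vt[:rank]
    return A, B

rng = np.random.default_rng(0)
# A matrix that is exactly rank 4, so rank-4 compression is lossless.
W = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 64))
A, B = low_rank_compress(W, rank=4)
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

Real weight matrices are only approximately low-rank, so in practice the rank trades model size against a small accuracy loss, as the abstract reports.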
This paper investigates the relations between three different properties, which are of importance in optimal control problems: dissipativity of the underlying dynamics with respect to a specific supply rate, optimal operation at steady state, and the turnpike property.
We show in a continuous-time setting that if along optimal trajectories a strict dissipation inequality is satisfied, then this implies optimal operation at this steady state and the existence of a turnpike at the same steady state.
Finally, we establish novel converse turnpike results, i.e., we show that the existence of a turnpike at a steady state implies optimal operation at this steady state and dissipativity with respect to this steady state.
We draw upon a numerical example to illustrate our findings.
Smart city projects address many of the current problems afflicting highly populated areas and, as such, are a target for governments, institutions and private organizations that plan to exploit their foreseen advantages.
In technical terms, smart city projects present a complex set of requirements, including a large number of users with highly different and heterogeneous requirements.
In this scenario, this paper proposes and analyses the impact and perspectives on adopting software-defined networking and artificial intelligence as innovative approaches for smart city project development and deployment.
Big data is also considered an inherent element of most smart city projects that must be tackled.
A framework layered view is proposed with a discussion about software-defined networking and machine learning impacts on innovation followed by a use case that demonstrates the potential benefits of cognitive learning for smart cities.
It is argued that the complexity of smart city projects requires new innovative approaches that potentially result in more efficient and intelligent systems.
The increasingly dense deployments of wireless CSMA networks arising from applications of Internet-of-things call for an improvement to mitigate the interference among simultaneous transmitting wireless devices.
For cost efficiency and backward compatibility with legacy transceiver hardware, a simple approach to address interference is by appropriately configuring the carrier sensing thresholds in wireless CSMA protocols, particularly in dense wireless networks.
Most prior studies of the configuration of carrier sensing thresholds are based on a simplified conflict graph model, whereas this paper considers a realistic signal-to-interference-and-noise ratio model.
We provide a comprehensive study for two effective wireless CSMA protocols: Cumulative-interference-Power Carrier Sensing and Incremental-interference-Power Carrier Sensing, in two aspects: (1) static approach that sets a universal carrier sensing threshold to ensure interference-safe transmissions regardless of network topology, and (2) adaptive approach that adjusts the carrier sensing thresholds dynamically based on the feedback of nearby transmissions.
We also provide simulation studies to evaluate the starvation ratio, fairness, and goodput of our approaches.
This study considers the control of parent-child systems where a parent system is acted on by a set of controllable child systems (i.e. a swarm).
Examples of such systems include a swarm of robots pushing an object over a surface, a swarm of aerial vehicles carrying a large load, or a set of end effectors manipulating an object.
In this paper, a general approach for decoupling the swarm from the parent system through a low-dimensional abstract state space is presented.
The requirements of this approach are given along with how constraints on both systems propagate through the abstract state and impact the requirements of the controllers for both systems.
To demonstrate, several controllers with hard state constraints are designed to track a given desired angle trajectory of a tilting plane with a swarm of robots driving on top.
Both homogeneous and heterogeneous swarms of varying sizes and properties are considered to test the robustness of this architecture.
The controllers are shown to be locally asymptotically stable and are demonstrated in simulation.
Label space expansion for multi-label classification (MLC) is a methodology that encodes the original label vectors to higher dimensional codes before training and decodes the predicted codes back to the label vectors during testing.
The methodology has been demonstrated to improve the performance of MLC algorithms when coupled with off-the-shelf error-correcting codes for encoding and decoding.
Nevertheless, such a coding scheme can be complicated to implement, and cannot easily satisfy a common application need of cost-sensitive MLC---adapting to different evaluation criteria of interest.
In this work, we show that a simpler coding scheme based on the concept of a reference pair of label vectors achieves cost-sensitivity more naturally.
In particular, our proposed cost-sensitive reference pair encoding (CSRPE) algorithm contains cluster-based encoding, weight-based training and voting-based decoding steps, all utilizing the cost information.
Furthermore, we leverage the cost information embedded in the code space of CSRPE to propose a novel active learning algorithm for cost-sensitive MLC.
Extensive experimental results verify that CSRPE performs better than state-of-the-art algorithms across different MLC criteria.
The results also demonstrate that the CSRPE-backed active learning algorithm is superior to existing algorithms for active MLC, and further justify the usefulness of CSRPE.
Label embedding (LE) is an important family of multi-label classification algorithms that digest the label information jointly for better performance.
Different real-world applications evaluate performance by different cost functions of interest.
Current LE algorithms often aim to optimize one specific cost function, but they can suffer from bad performance with respect to other cost functions.
In this paper, we resolve the performance issue by proposing a novel cost-sensitive LE algorithm that takes the cost function of interest into account.
The proposed algorithm, cost-sensitive label embedding with multidimensional scaling (CLEMS), approximates the cost information with the distances of the embedded vectors by using the classic multidimensional scaling approach for manifold learning.
CLEMS is able to deal with both symmetric and asymmetric cost functions, and effectively makes cost-sensitive decisions by nearest-neighbor decoding within the embedded vectors.
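The nearest-neighbor decoding step can be sketched as follows; the toy embedding and candidate label vectors below are illustrative assumptions, not the actual CLEMS embedding produced by multidimensional scaling.

```python
import math

def nn_decode(predicted_point, candidates):
    """Cost-sensitive nearest-neighbor decoding: return the candidate
    label vector whose embedded point is closest to the predicted point."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    best_label, _ = min(candidates.items(),
                        key=lambda kv: dist(predicted_point, kv[1]))
    return best_label

# Toy embedding: three candidate label vectors mapped to 2-D points whose
# pairwise distances roughly mirror the costs between the label vectors.
candidates = {
    (0, 0, 1): (0.0, 0.0),
    (0, 1, 1): (1.0, 0.2),
    (1, 1, 0): (3.0, 2.5),
}
decoded = nn_decode((0.9, 0.3), candidates)
```

Because the embedded distances approximate costs, choosing the nearest embedded candidate approximately minimizes the expected cost of the prediction.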
We derive theoretical results that justify how CLEMS achieves the desired cost-sensitivity.
Furthermore, extensive experimental results demonstrate that CLEMS is significantly better than a wide spectrum of existing LE algorithms and state-of-the-art cost-sensitive algorithms across different cost functions.
Grounding (i.e. localizing) arbitrary, free-form textual phrases in visual content is a challenging problem with many applications for human-computer interaction and image-text reference resolution.
Few datasets provide the ground truth spatial localization of phrases, thus it is desirable to learn from data with no or little grounding supervision.
We propose a novel approach which learns grounding by reconstructing a given phrase using an attention mechanism, which can be either latent or optimized directly.
During training our approach encodes the phrase using a recurrent network language model and then learns to attend to the relevant image region in order to reconstruct the input phrase.
At test time, the correct attention, i.e., the grounding, is evaluated.
If grounding supervision is available it can be directly applied via a loss over the attention mechanism.
We demonstrate the effectiveness of our approach on the Flickr 30k Entities and ReferItGame datasets with different levels of supervision, ranging from no supervision over partial supervision to full supervision.
Our supervised variant improves by a large margin over the state-of-the-art on both datasets.
We present the results of chest X-ray (CXR) analysis of 2D images aimed at obtaining statistically reliable predictions of the presence of tuberculosis by deep-learning-based computer-aided diagnosis (CADx).
The results demonstrate the efficiency of lung segmentation and of lossless and lossy data augmentation for tuberculosis CADx with a deep convolutional neural network (CNN), even when applied to a small and not well-balanced dataset.
The CNN demonstrates the ability to train (despite overfitting) on the pre-processed dataset obtained after lung segmentation, in contrast to the original unsegmented dataset.
Lossless data augmentation of the segmented dataset leads to the lowest validation loss (without overfitting) and to nearly the same accuracy (within the limits of standard deviation) as the original and other pre-processed datasets after lossy data augmentation.
Additional limited lossy data augmentation lowers the validation loss further, but at the cost of a decrease in validation accuracy.
In conclusion, beyond more complex deep CNNs and bigger datasets, further progress of CADx even for small and not well-balanced datasets can be obtained by better segmentation, data augmentation, dataset stratification, and exclusion of non-evident outliers.
The study of eye gaze fixations on photographic images is an active research area.
In contrast, the image subcategory of freehand sketches has not received as much attention for such studies.
In this paper, we analyze the results of a free-viewing gaze fixation study conducted on 3904 freehand sketches distributed across 160 object categories.
Our analysis shows that fixation sequences exhibit marked consistency within a sketch, across sketches of a category and even across suitably grouped sets of categories.
This multi-level consistency is remarkable given the variability in depiction and extreme image content sparsity that characterizes hand-drawn object sketches.
In our paper, we show that the multi-level consistency in the fixation data can be exploited to (a) predict a test sketch's category given only its fixation sequence and (b) build a computational model which predicts part-labels underlying fixations on objects.
We hope that our findings motivate the community to deem sketch-like representations worthy of gaze-based studies vis-a-vis photographic images.
Social networks have been popular platforms for information propagation.
An important use case is viral marketing: given a promotion budget, an advertiser can choose some influential users as the seed set and provide them with free or discounted sample products; in this way, the advertiser hopes to increase the popularity of the product in the users' friend circles through the word-of-mouth effect, and thus to maximize the number of users that information about the product can reach.
There has been a body of literature studying the influence maximization problem.
Nevertheless, the existing studies mostly investigate the problem on a one-off basis, assuming fixed known influence probabilities among users, or the knowledge of the exact social network topology.
In practice, the social network topology and the influence probabilities are typically unknown to the advertiser, and they can vary over time, e.g., as social ties are newly established, strengthened, or weakened.
In this paper, we focus on a dynamic non-stationary social network and design a randomized algorithm, RSB, based on multi-armed bandit optimization, to maximize influence propagation over time.
The algorithm produces a sequence of online decisions and calibrates its explore-exploit strategy utilizing outcomes of previous decisions.
It is rigorously proven to achieve an upper-bounded regret in reward and to be applicable to large-scale social networks.
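A minimal sketch of the explore-exploit idea, assuming a discounted epsilon-greedy policy as a simplified stand-in for the actual RSB algorithm; the arms, reward function, and discounting scheme below are illustrative only.

```python
import random

def run_bandit(arms, reward_fn, rounds, eps=0.1, discount=0.9, seed=0):
    """Epsilon-greedy selection with exponentially discounted value
    estimates, so older feedback fades and the policy can track a
    non-stationary environment."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in arms}          # discounted reward estimates
    total = 0.0
    for _ in range(rounds):
        if rng.random() < eps:
            arm = rng.choice(arms)          # explore
        else:
            arm = max(arms, key=value.get)  # exploit the current estimate
        r = reward_fn(arm)
        total += r
        # exponential forgetting: recent outcomes dominate the estimate
        value[arm] = discount * value[arm] + (1 - discount) * r
    return total, value

# Toy stand-in for the unknown spread reward: arm "a" is the better seed.
total, value = run_bandit(["a", "b"],
                          lambda arm: 1.0 if arm == "a" else 0.2,
                          rounds=200)
```

The exponential forgetting is what lets the estimates adapt when social ties (and hence rewards) drift over time.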
Practical effectiveness of the algorithm is evaluated using both synthetic and real-world datasets, which demonstrates that our algorithm outperforms previous stationary methods under non-stationary conditions.
We present two novel algorithms for learning formulas in Linear Temporal Logic (LTL) from examples.
The first learning algorithm reduces the learning task to a series of satisfiability problems in propositional Boolean logic and produces a smallest LTL formula (in terms of the number of subformulas) that is consistent with the given data.
Our second learning algorithm, on the other hand, combines the SAT-based learning algorithm with classical algorithms for learning decision trees.
The result is a learning algorithm that scales to real-world scenarios with hundreds of examples, but can no longer guarantee to produce minimal consistent LTL formulas.
We compare both learning algorithms and demonstrate their performance on a wide range of synthetic benchmarks.
Additionally, we illustrate their usefulness on the task of understanding executions of a leader election protocol.
Would you like to have your own cryptography method?
Experts say you should not do it.
If you think you can develop a better cryptography method anyway, read on.
We present a brief discussion of some well-known cryptography methods and of how our model fails against the traditional attacks.
We do not want to discourage anybody; we just want to show that, despite the importance of developing better cryptography models, it is a very hard task.
Compressed sensing (CS) is an innovative technique that allows signals to be represented through a small number of their linear projections.
Hence, CS can be thought of as a natural candidate for acquisition of multidimensional signals, as the amount of data acquired and processed by conventional sensors could create problems in terms of computational complexity.
In this paper, we propose a framework for the acquisition and reconstruction of multidimensional correlated signals.
The approach is general and can be applied to D-dimensional signals, although the algorithms we propose to practically implement such architectures apply to 2-D and 3-D signals.
The proposed architectures employ iterative local signal reconstruction based on a hybrid transform/prediction correlation model, coupled with a proper initialization strategy.
The theory of distributed conceptual structures, as outlined in this paper, is concerned with the distribution and conception of knowledge.
It rests upon two related theories, Information Flow and Formal Concept Analysis, which it seeks to unify.
Information Flow (IF) is concerned with the distribution of knowledge.
The foundations of Information Flow are explicitly based upon a mathematical theory known as the Chu Construction in *-autonomous categories and implicitly based upon the mathematics of closed categories.
Formal Concept Analysis (FCA) is concerned with the conception and analysis of knowledge.
In this paper we connect these two studies by extending the basic theorem of Formal Concept Analysis to the distributed realm of Information Flow.
The main results are the categorical equivalence between classifications and concept lattices at the level of functions, and the categorical equivalence between bonds and complete adjoints at the level of relations.
With this we hope to accomplish a rapprochement between Information Flow and Formal Concept Analysis.
In this paper, we describe our algorithmic approach, which was used for submissions in the fifth Emotion Recognition in the Wild (EmotiW 2017) group-level emotion recognition sub-challenge.
We extracted feature vectors of detected faces using the Convolutional Neural Network trained for face identification task, rather than traditional pre-training on emotion recognition problems.
In the final pipeline, an ensemble of Random Forest classifiers was learned to predict the emotion score using the available training set.
In cases where no faces were detected, one member of our ensemble extracts features from the whole image.
During our experimental study, the proposed approach showed the lowest error rate when compared to other explored techniques.
In particular, we achieved 75.4% accuracy on the validation data, which is 20% higher than the handcrafted feature-based baseline.
The source code using Keras framework is publicly available.
Analysis of informative contents and sentiments of social users has been attempted quite intensively in the recent past.
Most of these systems are usable only for monolingual data and fail or give poor results when used on data with the code-mixing property.
To gather attention and encourage researchers to work on this problem, we prepared gold-standard Bengali-English code-mixed data with language and polarity tags for sentiment analysis purposes.
In this paper, we discuss the systems we prepared to collect and filter raw Twitter data.
In order to reduce manual work while annotation, hybrid systems combining rule based and supervised models were developed for both language and sentiment tagging.
The final corpus was annotated by a group of annotators following a few guidelines.
The gold-standard corpus thus obtained shows impressive inter-annotator agreement in terms of Kappa values.
Various metrics, such as the Code-Mixed Index (CMI) and the Code-Mixed Factor (CF), along with various aspects (language and emotion), were also used to qualitatively assess the code-mixing and sentiment properties of the corpus.
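For illustration, the Code-Mixed Index can be computed as follows, using one common formulation (following Das and Gambäck); the toy tag sequence and the `"univ"` tag name for language-independent tokens are assumptions made for this sketch.

```python
from collections import Counter

def code_mixed_index(tags):
    """Code-Mixed Index in a common formulation:
    CMI = 100 * (1 - max_lang / (n - u)), where max_lang is the token
    count of the dominant language, n the total number of tokens, and
    u the number of language-independent ("univ") tokens."""
    n = len(tags)
    u = sum(1 for t in tags if t == "univ")
    if n == u:
        return 0.0                      # no language-tagged tokens at all
    langs = Counter(t for t in tags if t != "univ")
    return 100.0 * (1.0 - max(langs.values()) / (n - u))

# Language tags for a toy utterance: 4 Bengali, 2 English, 1 universal.
tags = ["bn", "bn", "en", "bn", "univ", "en", "bn"]
cmi = code_mixed_index(tags)
```

A monolingual utterance yields CMI 0; the more evenly languages are mixed, the higher the index.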
Industrial Control Systems (ICS) are used worldwide in critical infrastructures.
An ICS can be a single embedded system working stand-alone to control a simple process, or it can be a very complex Distributed Control System (DCS) connected to Supervisory Control And Data Acquisition (SCADA) system(s) in a nuclear power plant.
Although ICS are widely used today, there is very little research on the forensic acquisition and analysis of ICS artefacts.
In this paper we present a case study of forensics in ICS, in which we describe a method of safeguarding important volatile artefacts from an embedded industrial control system and several other sources.
In this paper, we present a vision based collaborative localization framework for groups of micro aerial vehicles (MAV).
The vehicles are each assumed to be equipped with a forward-facing monocular camera, and to be capable of communicating with each other.
This collaborative localization approach is built upon a distributed algorithm where individual and relative pose estimation techniques are combined for the group to localize against surrounding environments.
The MAVs initially detect and match salient features between each other to create a sparse reconstruction of the observed environment, which acts as a global map.
Once a map is available, each MAV performs feature detection and tracking with a robust outlier rejection process to estimate its own six degree-of-freedom pose.
Occasionally, the MAVs can also fuse relative measurements with individual measurements through feature matching and multiple-view geometry based relative pose computation.
We present the implementation of this algorithm for MAVs and environments simulated within Microsoft AirSim, and discuss the results and the advantages of collaborative localization.
In standard graph clustering/community detection, one is interested in partitioning the graph into more densely connected subsets of nodes.
In contrast, the "search" problem of this paper aims to only find the nodes in a "single" such community, the target, out of the many communities that may exist.
To do so, we are given suitable side information about the target; for example, a very small number of nodes from the target are labeled as such.
We consider a general yet simple notion of side information: all nodes are assumed to have random weights, with nodes in the target having higher weights on average.
Given these weights and the graph, we develop a variant of the method of moments that identifies nodes in the target more reliably, and with lower computation, than generic community detection methods that do not use side information and partition the entire graph.
Our empirical results show significant gains in runtime, and also gains in accuracy over other graph clustering algorithms.
Knowledge Management (KM) is a relatively new phenomenon that appears in the field of Public Sector Organizations (PSO) bringing new paradigms of organizational management, challenges, risks and opportunities for its implementation, development and evaluation.
KM can be seen as a systematic and deliberate effort to coordinate people, technology, organizational structures and its environment through knowledge reuse and innovation.
This management approach has been established in parallel with the development and use of information and communications technologies (ICT).
Nowadays, more PSO are embedding KM practices in their core processes to support them, and as an advanced management strategy to create a new culture based on technology and resource efficiency.
In this paper, we observed that KM can support organizational goals in PSO.
The aim of this paper is to understand KM factors and their associated components, and to propose KM metrics for measuring KM programs in PSO.
Through a critical literature review, we analysed diverse studies related to KM performance indicators in PSO; then, based on previous works, we summarized those most suitable for this purpose.
We found that, in academic literature, studies about KM measurement in PSO are uncommon and emerging.
Finally, in the last section of this paper, we present a proposal of KM metrics for PSO, together with some recommendations and practical implications for the development of KM metrics in PSO.
This academic endeavour seeks to contribute to the theoretical debate about the development of KM measures for KM initiatives in PSO.
MPSoCs are gaining popularity because of their potential to solve computationally expensive applications.
A multi-core processor combines two or more independent cores (normally a CPU) into a single package composed of a single integrated circuit (Chip).
However, as the number of components on a single chip and their performance continue to increase, a shift from computation-based to communication-based design becomes mandatory.
As a result, the communication architecture plays a major role in the area, performance, and energy consumption of the overall system.
In this paper, multiple soft cores (IPs), such as MicroBlaze, on an FPGA are used to study the effect of different connection topologies on the performance of a parallel program.
Learning fine-grained details is a key issue in image aesthetic assessment.
Most of the previous methods extract the fine-grained details via random cropping strategy, which may undermine the integrity of semantic information.
Extensive studies show that humans perceive fine-grained details with a mixture of foveal vision and peripheral vision.
Fovea has the highest possible visual acuity and is responsible for seeing the details.
The peripheral vision is used for perceiving the broad spatial scene and selecting the attended regions for the fovea.
Inspired by these observations, we propose a Gated Peripheral-Foveal Convolutional Neural Network (GPF-CNN).
It is a dedicated double-subnet neural network, i.e. a peripheral subnet and a foveal subnet.
The former aims to mimic the functions of peripheral vision to encode the holistic information and provide the attended regions.
The latter aims to extract fine-grained features on these key regions.
Considering that the peripheral vision and foveal vision play different roles in processing different visual stimuli, we further employ a gated information fusion (GIF) network to weight their contributions.
The weights are determined through the fully connected layers followed by a sigmoid function.
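As a sketch, the gating described above can be reduced to a single scalar gate computed from the concatenated features; the actual GPF-CNN uses fully connected layers over high-dimensional feature maps, so the tiny vectors and untrained gate parameters below are purely illustrative.

```python
import math

def gated_fusion(peripheral, foveal, w_gate, b_gate):
    """Fuse two feature vectors with a learned scalar gate: the gate is
    sigmoid(w . [p; f] + b); it weights the peripheral features, and
    (1 - gate) weights the foveal ones."""
    concat = peripheral + foveal
    z = sum(w * x for w, x in zip(w_gate, concat)) + b_gate
    gate = 1.0 / (1.0 + math.exp(-z))   # sigmoid keeps the gate in (0, 1)
    fused = [gate * p + (1.0 - gate) * f
             for p, f in zip(peripheral, foveal)]
    return fused, gate

# Toy 3-D features with illustrative (untrained) gate parameters.
p = [0.2, 0.5, 0.1]
f = [0.9, 0.1, 0.4]
fused, gate = gated_fusion(p, f, w_gate=[0.1] * 6, b_gate=0.0)
```

Because the gate lies strictly between 0 and 1, every fused feature is a convex combination of the two subnets' contributions.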
We conduct comprehensive experiments on the standard AVA and Photo.net datasets for unified aesthetic prediction tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction.
The experimental results demonstrate the effectiveness of the proposed method.
We investigate a variety of problems of finding tours and cycle covers with minimum turn cost.
Questions of this type have been studied in the past, with complexity and approximation results as well as open problems dating back to work by Arkin et al. in 2001.
A wide spectrum of practical applications have renewed the interest in these questions, and spawned variants: for full coverage, every point has to be covered, for subset coverage, specific points have to be covered, and for penalty coverage, points may be left uncovered by incurring an individual penalty.
We make a number of contributions.
We first show that finding a minimum-turn (full) cycle cover is NP-hard even in 2-dimensional grid graphs, solving the long-standing open Problem 53 in The Open Problems Project edited by Demaine, Mitchell and O'Rourke.
We also prove NP-hardness of finding a subset cycle cover of minimum turn cost in thin grid graphs, for which Arkin et al. gave a polynomial-time algorithm for full coverage; this shows that their boundary techniques cannot be applied to compute exact solutions for subset and penalty variants.
On the positive side, we establish the first constant-factor approximation algorithms for all considered subset and penalty problem variants, making use of LP/IP techniques.
For full coverage in more general grid graphs (e.g., hexagonal grids), our approximation factors are better than the combinatorial ones of Arkin et al.
Our approach can also be extended to other geometric variants, such as scenarios with obstacles and linear combinations of turn and distance costs.
Monte Carlo Tree Search techniques have generally dominated General Video Game Playing, but recent research has started looking at Evolutionary Algorithms and their potential at matching Tree Search level of play or even outperforming these methods.
Online or Rolling Horizon Evolution is one of the options available to evolve sequences of actions for planning in General Video Game Playing, but no research to date has explored the capabilities of the vanilla version of this algorithm across multiple games.
This study aims to critically analyse the different configurations regarding population size and individual length in a set of 20 games from the General Video Game AI corpus.
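A minimal sketch of one planning step of vanilla Rolling Horizon Evolution follows; the toy fitness function, mutation-only reproduction, and fixed elite fraction are simplifying assumptions, not the exact scheme evaluated in the study.

```python
import random

def rolling_horizon_evolve(state, actions, fitness, pop_size, horizon,
                           generations, rng):
    """Evolve a population of fixed-length action sequences and return
    the fittest one; the agent would execute its first action, observe
    the new state, and re-plan."""
    pop = [[rng.choice(actions) for _ in range(horizon)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda seq: fitness(state, seq), reverse=True)
        elite = pop[: pop_size // 2]            # truncation selection
        children = []
        for parent in elite:                    # refill by point mutation
            child = list(parent)
            child[rng.randrange(horizon)] = rng.choice(actions)
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda seq: fitness(state, seq))

# Toy game: fitness simply counts how often "right" appears in the plan.
rng = random.Random(1)
count_right = lambda state, seq: sum(1 for a in seq if a == "right")
best = rolling_horizon_evolve(None, ["left", "right"], count_right,
                              pop_size=10, horizon=5, generations=20,
                              rng=rng)
```

Population size and individual length (the horizon) are exactly the two parameters whose configurations the study analyses.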
Distinctions are made between deterministic and stochastic games, and the implications of using superior time budgets are studied.
Results show that there is scope for the use of these techniques, which in some configurations outperform Monte Carlo Tree Search, and also suggest that further research in these methods could boost their performance.
Although deep learning models have been successfully applied to a variety of tasks, due to the millions of parameters, they are becoming increasingly opaque and complex.
In order to establish trust for their widespread commercial use, it is important to formalize a principled framework to reason over these models.
In this work, we use ideas from causal inference to describe a general framework to reason over CNN models.
Specifically, we build a Structural Causal Model (SCM) as an abstraction over a specific aspect of the CNN.
We also formulate a method to quantitatively rank the filters of a convolution layer according to their counterfactual importance.
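The counterfactual-ranking idea can be sketched as follows, with a toy stand-in model in place of a real CNN and its SCM abstraction; the filter names and contribution values are purely illustrative.

```python
def rank_filters(filters, forward):
    """Rank filters by counterfactual importance: ablate each filter in
    turn and measure how much the model's output changes."""
    baseline = forward(set())               # output with nothing ablated
    scores = {f: abs(baseline - forward({f})) for f in filters}
    return sorted(filters, key=lambda f: scores[f], reverse=True)

# Toy stand-in for a CNN: the output is a sum of per-filter contributions.
contrib = {"f1": 0.7, "f2": 0.1, "f3": 0.3}
model = lambda ablated: sum(v for k, v in contrib.items()
                            if k not in ablated)
order = rank_filters(list(contrib), model)
```

Filters whose removal changes the output most are ranked as most counterfactually important.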
We illustrate our approach with popular CNN architectures such as LeNet5, VGG19, and ResNet32.
Numerous fake images spread on social media today, and they can severely jeopardize the credibility of online content to the public.
In this paper, we employ deep networks to learn distinct fake image related features.
In contrast to authentic images, fake images tend to be eye-catching and visually striking.
Compared with traditional visual recognition tasks, it is extremely challenging to understand these psychologically triggered visual patterns in fake images.
Traditional general image classification datasets, such as ImageNet set, are designed for feature learning at the object level but are not suitable for learning the hyper-features that would be required by image credibility analysis.
In order to overcome the scarcity of training samples of fake images, we first construct a large-scale auxiliary dataset indirectly related to this task.
This auxiliary dataset contains 0.6 million weakly-labeled fake and real images collected automatically from social media.
Through an AdaBoost-like transfer learning algorithm, we train a CNN model with a few instances in the target training set and 0.6 million images in the collected auxiliary set.
This learning algorithm is able to leverage knowledge from the auxiliary set and gradually transfer it to the target task.
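One reweighting step of such a TrAdaBoost-style scheme can be sketched as follows; the fixed beta factors are illustrative (in practice they are derived from the weak learner's error rate), and the toy instance layout is an assumption.

```python
def update_weights(weights, errors, n_source, beta_src, beta_tgt):
    """One TrAdaBoost-style reweighting step: misclassified auxiliary
    (source) instances are down-weighted while misclassified target
    instances are up-weighted, so later rounds focus on the target task."""
    new = []
    for i, (w, wrong) in enumerate(zip(weights, errors)):
        if i < n_source:
            new.append(w * (beta_src if wrong else 1.0))   # beta_src < 1
        else:
            new.append(w * (beta_tgt if wrong else 1.0))   # beta_tgt > 1
    total = sum(new)
    return [w / total for w in new]                        # renormalize

# Four auxiliary (source) instances followed by two target instances;
# True marks a misclassification by the current weak learner.
w = update_weights([1 / 6] * 6,
                   [True, False, True, False, False, True],
                   n_source=4, beta_src=0.5, beta_tgt=2.0)
```

Over many rounds, unhelpful auxiliary images lose influence while the few target-domain fake images gain it, which is the gradual knowledge transfer described above.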
Experiments on a real-world testing set show that our proposed domain transferred CNN model outperforms several competing baselines.
It obtains superior results over transfer learning methods based on the general ImageNet set.
Moreover, case studies show that our proposed method reveals some interesting patterns for distinguishing fake and authentic images.
Sentiment understanding has been a long-term goal of AI in the past decades.
This paper deals with sentence-level sentiment classification.
Though a variety of neural network models have been proposed very recently, previous models either depend on expensive phrase-level annotation, in which case performance drops substantially when they are trained with only sentence-level annotation; or they do not fully employ linguistic resources (e.g., sentiment lexicons, negation words, intensity words), and thus cannot produce linguistically coherent representations.
In this paper, we propose simple models trained with sentence-level annotation, but also attempt to generate linguistically coherent representations by employing regularizers that model the linguistic role of sentiment lexicons, negation words, and intensity words.
Results show that our models effectively capture the sentiment-shifting effect of sentiment, negation, and intensity words, while still obtaining competitive results without sacrificing the models' simplicity.
The ability of deep convolutional neural networks (CNN) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification.
However, the relative scarcity of labeled data has impeded the exploitation of this family of high-capacity models.
This study has two primary contributions: first, we propose a deep convolutional neural network architecture for environmental sound classification.
Second, we propose the use of audio data augmentation for overcoming the problem of data scarcity and explore the influence of different augmentations on the performance of the proposed CNN architecture.
Combined with data augmentation, the proposed model produces state-of-the-art results for environmental sound classification.
We show that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a "shallow" dictionary learning model with augmentation.
Finally, we examine the influence of each augmentation on the model's classification accuracy for each class, and observe that the accuracy for each class is influenced differently by each augmentation, suggesting that the performance of the model could be improved further by applying class-conditional data augmentation.
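For illustration, two simple label-preserving augmentations (additive noise and time shifting) can be written as follows; the augmentations explored in the study itself (e.g., time stretching and pitch shifting) would need signal-processing libraries, so this sketch and its parameters are assumptions.

```python
import random

def augment(samples, noise_level=0.005, max_shift=100, seed=0):
    """Two simple label-preserving augmentations for a 1-D audio signal:
    additive random noise and a random time shift with zero padding."""
    rng = random.Random(seed)
    noisy = [s + rng.uniform(-noise_level, noise_level) for s in samples]
    shift = rng.randint(-max_shift, max_shift)
    if shift >= 0:
        shifted = [0.0] * shift + samples[: len(samples) - shift]
    else:
        shifted = samples[-shift:] + [0.0] * (-shift)
    return noisy, shifted

clip = [0.1] * 1000          # stand-in for a real sampled waveform
noisy, shifted = augment(clip)
```

Each augmented copy keeps the original class label, which is how augmentation multiplies the effective size of a scarce labeled set.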
In this work, we propose using camera arrays coupled with coherent illumination as an effective method of improving spatial resolution in long distance images by a factor of ten and beyond.
Recent advances in ptychography have demonstrated that one can image beyond the diffraction limit of the objective lens in a microscope.
We demonstrate a similar imaging system to image beyond the diffraction limit in long range imaging.
We emulate a camera array with a single camera attached to an X-Y translation stage.
We show that an appropriate phase retrieval based reconstruction algorithm can be used to effectively recover the lost high resolution details from the multiple low resolution acquired images.
We analyze the effects of noise, required degree of image overlap, and the effect of increasing synthetic aperture size on the reconstructed image quality.
We show that coherent camera arrays have the potential to greatly improve imaging performance.
Our simulations show resolution gains of 10x and more are achievable.
Furthermore, experimental results from our proof-of-concept systems show resolution gains of 4x-7x for real scenes.
Finally, we introduce and analyze in simulation a new strategy to capture macroscopic Fourier Ptychography images in a single snapshot, albeit using a camera array.
The ability to semantically interpret hand-drawn line sketches, although very challenging, can pave the way for novel applications in multimedia.
We propose SketchParse, the first deep-network architecture for fully automatic parsing of freehand object sketches.
SketchParse is configured as a two-level fully convolutional network.
The first level contains shared layers common to all object categories.
The second level contains a number of expert sub-networks.
Each expert specializes in parsing sketches from object categories which contain structurally similar parts.
Effectively, the two-level configuration enables our architecture to scale up efficiently as additional categories are added.
We introduce a router layer which (i) relays sketch features from shared layers to the correct expert (ii) eliminates the need to manually specify object category during inference.
To bypass laborious part-level annotation, we sketchify photos from semantic object-part image datasets and use them for training.
Our architecture also incorporates object pose prediction as a novel auxiliary task which boosts overall performance while providing supplementary information regarding the sketch.
We demonstrate SketchParse's abilities (i) on two challenging large-scale sketch datasets (ii) in parsing unseen, semantically related object categories (iii) in improving fine-grained sketch-based image retrieval.
As a novel application, we also outline how SketchParse's output can be used to generate caption-style descriptions for hand-drawn sketches.
Recently, compressed sensing (CS) computed tomography (CT) using sparse projection views has been extensively investigated to reduce the potential risk of radiation to the patient.
However, due to the insufficient number of projection views, an analytic reconstruction approach results in severe streaking artifacts and CS-based iterative approach is computationally very expensive.
To address this issue, here we propose a novel deep residual learning approach for sparse view CT reconstruction.
Specifically, based on a novel persistent homology analysis showing that the manifold of streaking artifacts is topologically simpler than original ones, a deep residual learning architecture that estimates the streaking artifacts is developed.
Once a streaking artifact image is estimated, an artifact-free image can be obtained by subtracting the streaking artifacts from the input image.
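The subtraction step itself is straightforward; a sketch follows, with a constant-offset stand-in in place of the trained residual CNN (the image values and the fake network are illustrative assumptions).

```python
def remove_artifacts(image, estimate_artifact):
    """Residual-learning reconstruction step: a network estimates the
    streaking-artifact image, which is subtracted elementwise from the
    input to yield the artifact-free image."""
    artifact = estimate_artifact(image)
    return [[x - a for x, a in zip(row_x, row_a)]
            for row_x, row_a in zip(image, artifact)]

# Stand-in for the trained residual CNN: a constant-offset "artifact".
fake_net = lambda img: [[0.1 for _ in row] for row in img]
clean = remove_artifacts([[0.5, 0.7], [0.9, 0.3]], fake_net)
```

The network only has to model the (topologically simpler) artifact manifold; the clean image falls out by subtraction.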
Using extensive experiments with a real patient data set, we confirm that the proposed residual learning provides significantly better image reconstruction performance at a computational speed several orders of magnitude faster.
In structure learning, the output is generally a structure that is used as supervision information to achieve good performance.
Considering that the interpretation of deep learning models has attracted increasing attention in recent years, it would be beneficial if we could learn an interpretable structure from deep learning models.
In this paper, we focus on Recurrent Neural Networks (RNNs) whose inner mechanism is still not clearly understood.
We find that Finite State Automata (FSA), which process sequential data, have a more interpretable inner mechanism and can be learned from RNNs as the interpretable structure.
We propose two methods to learn FSA from RNNs, based on two different clustering methods.
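The transition-extraction part of such a pipeline can be sketched as follows; the sign-bucketing state assignment stands in for the actual clustering methods and is purely illustrative, as are the toy traces.

```python
from collections import defaultdict

def build_fsa(hidden_traces, assign_state):
    """Abstract an RNN into an FSA: map each hidden vector to a discrete
    state via a clustering/assignment function, then count symbol-labelled
    transitions between consecutive states."""
    transitions = defaultdict(int)
    for trace in hidden_traces:     # trace: [(symbol, hidden_vector), ...]
        prev = "START"
        for symbol, h in trace:
            state = assign_state(h)
            transitions[(prev, symbol, state)] += 1
            prev = state
    return dict(transitions)

# Toy "clustering": bucket hidden vectors by the sign of their mean.
bucket = lambda h: "POS" if sum(h) / len(h) >= 0 else "NEG"
traces = [[("a", [0.5, 0.1]), ("b", [-0.4, -0.2])],
          [("a", [0.6, 0.2]), ("a", [0.1, 0.3])]]
fsa = build_fsa(traces, bucket)
```

The resulting transition counts can be drawn as a graph, which is the human-followable illustration of the FSA mentioned above.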
We first give the graphical illustration of FSA for human beings to follow, which shows the interpretability.
From the FSA's point of view, we then analyze how the performance of RNNs is affected by the number of gates, as well as the semantic meaning behind the transitions of numerical hidden states.
Our results suggest that RNNs with a simple gated structure, such as the Minimal Gated Unit (MGU), are more desirable, and that the transitions in the FSA leading to a specific classification result are associated with corresponding words that are understandable by human beings.
In this paper, we propose a novel waveform design which efficiently combines two air interface components: Frequency and Quadrature-Amplitude Modulation (FQAM) and Filter Bank Multicarrier (FBMC).
The proposed approach takes the unique characteristics of FQAM into consideration and exploits the design of prototype filters for FBMC to effectively avoid self-interference between adjacent subcarriers in the complex domain, thus providing improved performance compared with conventional solutions in terms of self-interference, spectrum confinement and complexity with negligible rate loss.
The moderation of content in many social media systems, such as Twitter and Facebook, motivated the emergence of a new social network system that promotes free speech, named Gab.
Soon after, Gab was removed from the Google Play Store for violating the company's hate speech policy and was rejected by Apple for similar reasons.
In this paper we characterize Gab, aiming to understand who its users are and what kind of content they share in this system.
Our findings show that Gab is a very politically oriented system that hosts banned users from other social networks, some of them due to possible cases of hate speech and association with extremism.
We provide the first measurement of news dissemination inside a right-leaning echo chamber, investigating a social media where readers are rarely exposed to content that cuts across ideological lines, but rather are fed with content that reinforces their current political or social views.
Sequential dynamics are a key feature of many modern recommender systems, which seek to capture the `context' of users' activities on the basis of actions they have performed recently.
To capture such patterns, two approaches have proliferated: Markov Chains (MCs) and Recurrent Neural Networks (RNNs).
Markov Chains assume that a user's next action can be predicted on the basis of just their last (or last few) actions, while RNNs in principle allow for longer-term semantics to be uncovered.
Generally speaking, MC-based methods perform best in extremely sparse datasets, where model parsimony is critical, while RNNs perform better in denser datasets where higher model complexity is affordable.
The goal of our work is to balance these two strengths, by proposing a self-attention based sequential model (SASRec) that allows us to capture long-term semantics (like an RNN), but, using an attention mechanism, makes its predictions based on relatively few actions (like an MC).
At each time step, SASRec seeks to identify which items are `relevant' from a user's action history, and uses them to predict the next item.
Extensive empirical studies show that our method outperforms various state-of-the-art sequential models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets.
Moreover, the model is an order of magnitude more efficient than comparable CNN/RNN-based models.
Visualizations of attention weights also show how our model adaptively handles datasets of varying density and uncovers meaningful patterns in activity sequences.
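The attention mechanism described above can be illustrated with a minimal NumPy sketch: the last action serves as the query over the whole action history, and the resulting context vector scores every candidate item. This is a simplified single-head, single-layer version under assumed toy dimensions, not the full SASRec architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def next_item_scores(history_ids, item_emb):
    """One attention step: the last action attends over the action
    history; the context vector is matched against all item embeddings
    to score candidates for the next item."""
    H = item_emb[history_ids]               # (t, d) history embeddings
    query = H[-1]                           # last action as the query
    d = item_emb.shape[1]
    attn = softmax(H @ query / np.sqrt(d))  # attention over the history
    context = attn @ H                      # weighted sum of history
    return item_emb @ context               # score every candidate item

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 4))              # toy: 10 items, dim-4 embeddings
scores = next_item_scores([1, 3, 7], emb)   # one score per item
```

Because the attention weights concentrate on a few history positions, predictions depend on relatively few actions, as in a Markov chain.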
In this paper we propose a segmentation-free query by string word spotting method.
Both the documents and the query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a pyramidal histogram of characters (PHOC).
These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings.
In order to search through the whole page, document regions are indexed per character bigram using a similar attribute representation.
On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation.
Finally we introduce a re-ranking step in order to boost retrieval performance.
We show state-of-the-art results for segmentation-free query by string word spotting on standard single-writer and multi-writer datasets.
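The PHOC representation mentioned above can be sketched concisely; this is a simplified version (characters assigned to a pyramid region by their centre position rather than by region overlap) with assumed pyramid levels, intended only to convey the idea:

```python
import string

def phoc(word, levels=(1, 2, 3), alphabet=string.ascii_lowercase):
    """Simplified PHOC: for each pyramid level L the word is split into
    L regions, and a binary attribute records whether each alphabet
    character occurs in each region."""
    word = word.lower()
    vec = []
    for L in levels:
        for region in range(L):
            present = set()
            for i, ch in enumerate(word):
                centre = (i + 0.5) / len(word)   # normalized position
                if int(centre * L) == region:
                    present.add(ch)
            vec.extend(1 if c in present else 0 for c in alphabet)
    return vec

v = phoc("beyond")
# vector length = (1 + 2 + 3) regions * 26 characters = 156 attributes
```

Because both an image (via attribute classifiers) and a string (directly, as above) map to the same binary attribute space, query-by-string retrieval reduces to nearest-neighbour search in that space.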
A basic setup of a two-tier network, where two mobile users exchange messages with a multi-antenna macrocell basestation, is studied from a rate perspective subject to beamforming and power constraints.
The communication is facilitated by two femtocell basestations which act as relays as there is no direct link between the macrocell basestation and the mobile users.
We propose a scheme based on physical-layer network coding and compute-and-forward combined with a novel approach that solves the problem of beamformer design and power allocation.
We also show that the optimal beamformers are always a convex combination of the channels between the macro- and femtocell basestations.
We then establish the cut-set bound of the setup and show numerically that the presented scheme nearly achieves its capacity.
A large number of services for research data management strive to adhere to the FAIR guiding principles for scientific data management and stewardship.
To evaluate these services and to indicate possible improvements, use-case-centric metrics are needed as an addendum to existing metric frameworks.
The retrieval of spatially and temporally annotated images can exemplify such a use case.
A prototypical implementation indicates that currently no research data repository achieves the full score.
Suggestions on how to increase the score include automatic annotation based on the metadata inside the image file and support for content negotiation to retrieve the images.
These and other insights can lead to an improvement of data integration workflows, resulting in a better and more FAIR approach to manage research data.
Derivatives play a critical role in computational statistics, examples being Bayesian inference using Hamiltonian Monte Carlo sampling and the training of neural networks.
Automatic differentiation is a powerful tool to automate the calculation of derivatives and is preferable to more traditional methods, especially when differentiating complex algorithms and mathematical functions.
The implementation of automatic differentiation, however, requires some care to ensure efficiency.
Modern differentiation packages deploy a broad range of computational techniques to improve applicability, run time, and memory management.
Among these techniques are operator overloading, region-based memory, and expression templates.
There also exist several mathematical techniques which can yield high performance gains when applied to complex algorithms.
For example, semi-analytical derivatives can reduce by orders of magnitude the runtime required to numerically solve and differentiate an algebraic equation.
Open problems include the extension of current packages to provide more specialized routines, and efficient methods to perform higher-order differentiation.
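The operator-overloading technique mentioned above can be illustrated with a minimal forward-mode sketch using dual numbers; this is a generic textbook construction, not any particular package's implementation:

```python
class Dual:
    """Forward-mode automatic differentiation via operator overloading:
    each value carries its derivative, and arithmetic propagates both."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)  # product rule
    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at a dual number seeded with dx/dx = 1."""
    return f(Dual(x, 1.0)).dot

# d/dx (x*x + 3x) at x = 2 is 2x + 3 = 7
d = derivative(lambda x: x * x + 3 * x, 2.0)
```

Production packages extend this idea with reverse-mode accumulation, memory management, and expression templates to avoid temporaries.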
Efficiently aggregating data from different sources is a challenging problem, particularly when samples from each source are distributed differently.
These differences can be inherent to the inference task or present for other reasons: sensors in a sensor network may be placed far apart, affecting their individual measurements.
Conversely, it is computationally advantageous to split Bayesian inference tasks across subsets of data, but data need not be identically distributed across subsets.
One principled way to fuse probability distributions is via the lens of optimal transport: the Wasserstein barycenter is a single distribution that summarizes a collection of input measures while respecting their geometry.
However, computing the barycenter scales poorly and requires discretization of all input distributions and the barycenter itself.
Improving on this situation, we present a scalable, communication-efficient, parallel algorithm for computing the Wasserstein barycenter of arbitrary distributions.
Our algorithm can operate directly on continuous input distributions and is optimized for streaming data.
Our method is even robust to nonstationary input distributions and produces a barycenter estimate that tracks the input measures over time.
The algorithm is semi-discrete, needing to discretize only the barycenter estimate.
To the best of our knowledge, we also provide the first bounds on the quality of the approximate barycenter as the discretization becomes finer.
Finally, we demonstrate the practical effectiveness of our method, both in tracking moving distributions on a sphere, as well as in a large-scale Bayesian inference task.
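To give a concrete sense of what a Wasserstein barycenter is, the one-dimensional case admits a closed form: the 2-Wasserstein barycenter is obtained by averaging the quantile functions of the inputs. The sketch below shows that special case only; it is not the paper's scalable semi-discrete algorithm:

```python
import numpy as np

def wasserstein_barycenter_1d(samples_per_source, n_quantiles=100):
    """1-D special case: the 2-Wasserstein barycenter of sampled
    distributions is the average of their quantile functions."""
    qs = np.linspace(0.0, 1.0, n_quantiles)
    quantiles = [np.quantile(s, qs) for s in samples_per_source]
    return np.mean(quantiles, axis=0)   # quantiles of the barycenter

a = np.random.default_rng(1).normal(0.0, 1.0, 1000)
b = np.random.default_rng(2).normal(4.0, 1.0, 1000)
bary = wasserstein_barycenter_1d([a, b])
# the barycenter of N(0,1) and N(4,1) is approximately N(2,1)
```

Unlike a naive mixture (which would be bimodal), the barycenter interpolates the geometry of the inputs, which is why it is a principled fusion of distributions.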
The identification of semantic relations between terms within texts is a fundamental task in Natural Language Processing which can support applications requiring a lightweight semantic interpretation model.
Currently, semantic relation classification concentrates on relations which are evaluated over open-domain data.
This work provides a critique on the set of abstract relations used for semantic relation classification with regard to their ability to express relationships between terms which are found in a domain-specific corpora.
Based on this analysis, this work proposes an alternative semantic relation model based on reusing and extending the set of abstract relations present in the DOLCE ontology.
The resulting set of relations is well grounded, allows a wide range of relations to be captured, and could thus serve as a foundation for the automatic classification of semantic relations.
Ontologies provide features like a common vocabulary, reusability, and machine-readable content, and also allow for semantic search, facilitate agent interaction, and support the ordering and structuring of knowledge for Semantic Web (Web 3.0) applications.
However, the challenge in ontology engineering is automatic learning, i.e., there is still no fully automatic approach to forming an ontology from a text corpus or a dataset of various topics using machine learning techniques.
In this paper, two topic modeling algorithms are explored, namely LSI & SVD and Mr.LDA, for learning a topic ontology.
The objective is to determine the statistical relationship between document and terms to build a topic ontology and ontology graph with minimum human intervention.
Experimental analysis on building a topic ontology and semantically retrieving the corresponding topic ontology for a user's query demonstrates the effectiveness of the proposed approach.
The extended Kalman filter (EKF) does not guarantee consistent mean and covariance under linearization, even though it is the main framework for robotic localization.
While Lie groups improve the modeling of the state space in localization, the EKF on a Lie group still relies on an arbitrary Gaussian assumption in the face of nonlinear models.
We instead use a von Mises filter for orientation estimation together with the conventional Kalman filter for position estimation, and are thus able to characterize the first two moments of the state estimates.
Since the proposed algorithm holds a solid probabilistic basis, it is fundamentally relieved from the inconsistency problem.
Furthermore, we extend the localization algorithm to fully circular representation even for position, which is similar to grid patterns found in mammalian brains and in recurrent neural networks.
The applicability of the proposed algorithms is substantiated not only by a strong mathematical foundation but also by comparison against other common localization methods.
The development of chemical reaction models aids understanding and prediction in areas ranging from biology to electrochemistry and combustion.
A systematic approach to building reaction network models uses observational data not only to estimate unknown parameters, but also to learn model structure.
Bayesian inference provides a natural approach to this data-driven construction of models.
Yet traditional Bayesian model inference methodologies that numerically evaluate the evidence for each model are often infeasible for nonlinear reaction network inference, as the number of plausible models can be combinatorially large.
Alternative approaches based on model-space sampling can enable large-scale network inference, but their realization presents many challenges.
In this paper, we present new computational methods that make large-scale nonlinear network inference tractable.
First, we exploit the topology of networks describing potential interactions among chemical species to design improved "between-model" proposals for reversible-jump Markov chain Monte Carlo.
Second, we introduce a sensitivity-based determination of move types which, when combined with network-aware proposals, yields significant additional gains in sampling performance.
These algorithms are demonstrated on inference problems drawn from systems biology, with nonlinear differential equation models of species interactions.
We study the problem of locating a particularly dangerous node, the so-called black hole in a synchronous anonymous ring network with mobile agents.
A black hole is a harmful stationary process residing in a node of the network that destroys all mobile agents visiting that node without leaving any trace.
We consider the more challenging scenario when the agents are identical and initially scattered within the network.
Moreover, we solve the problem with agents that have constant-sized memory and carry a constant number of identical tokens, which can be placed at nodes of the network.
In contrast, the only known solutions for the case of scattered agents searching for a black hole use stronger models in which the agents have non-constant memory, can write messages on whiteboards located at nodes, or are allowed to mark both the edges and nodes of the network with tokens.
This paper solves the problem for ring networks containing a single black hole.
We are interested in the minimum resources (number of agents and tokens) necessary for locating all links incident to the black hole.
We present deterministic algorithms for ring topologies and provide matching lower and upper bounds for the number of agents and the number of tokens required for deterministic solutions to the black hole search problem, in oriented or unoriented rings, using movable or unmovable tokens.
Our goal is to show that the standard model-theoretic concept of types can be applied in the study of order-invariant properties, i.e., properties definable in a logic in the presence of an auxiliary order relation, but not actually dependent on that order relation.
This is somewhat surprising, since order-invariant properties are more of a combinatorial than a logical object.
We provide two applications of this notion.
One is a proof, from first principles, of a theorem by Courcelle stating that, over trees, order-invariant MSO properties are expressible in MSO with counting quantifiers.
The other is an analog of the Feferman-Vaught theorem for order-invariant properties.
Machine learning (ML) has been widely applied to image classification.
Here, we extend this application to data generated by a camera comprised of only a standard CMOS image sensor with no lens.
We first created a database of lensless images of handwritten digits.
Then, we trained a ML algorithm on this dataset.
Finally, we demonstrated that the trained ML algorithm is able to classify the digits with accuracy as high as 99% for 2 digits.
Our approach clearly demonstrates the potential for non-human cameras in machine-based decision-making scenarios.
We design an algorithm to compute the Newton polytope of the resultant, known as resultant polytope, or its orthogonal projection along a given direction.
The resultant is fundamental in algebraic elimination, optimization, and geometric modeling.
Our algorithm exactly computes vertex- and halfspace-representations of the polytope using an oracle producing resultant vertices in a given direction, thus avoiding walking on the polytope whose dimension is alpha-n-1, where the input consists of alpha points in Z^n.
Our approach is output-sensitive as it makes one oracle call per vertex and facet.
It extends to any polytope whose oracle-based definition is advantageous, such as the secondary and discriminant polytopes.
Our publicly available implementation uses the experimental CGAL package triangulation.
Our method computes 5-, 6- and 7-dimensional polytopes with 35K, 23K and 500 vertices, respectively, within 2 hours, and the Newton polytopes of many important surface equations encountered in geometric modeling in under 1 second, whereas the corresponding secondary polytopes are intractable.
It is faster than tropical geometry software up to dimension 5 or 6.
Hashing determinantal predicates accelerates execution up to 100 times.
One variant computes inner and outer approximations with, respectively, 90% and 105% of the true volume, up to 25 times faster.
Inspired by the trend on unifying theories of programming, this paper shows how the algebraic treatment of standard data dependency theory equips relational data with functional types and an associated type system which is useful for type checking database operations and for query optimization.
Such a typed approach to database programming is then shown to belong to the same family as other programming logics, such as Hoare logic or that of strongest invariant functions, which has been used in the analysis of while statements.
The prospect of using automated deduction systems such as Prover9 for type-checking and query optimization on top of such an algebraic approach is considered.
Non-rigid structure-from-motion (NRSfM) has so far been mostly studied for recovering 3D structure of a single non-rigid/deforming object.
To handle the real world challenging multiple deforming objects scenarios, existing methods either pre-segment different objects in the scene or treat multiple non-rigid objects as a whole to obtain the 3D non-rigid reconstruction.
However, these methods fail to exploit the inherent structure in the problem, as the solutions of segmentation and reconstruction cannot benefit each other.
In this paper, we propose a unified framework to jointly segment and reconstruct multiple non-rigid objects.
To compactly represent complex multi-body non-rigid scenes, we propose to exploit the structure of the scenes along both temporal direction and spatial direction, thus achieving a spatio-temporal representation.
Specifically, we represent the 3D non-rigid deformations as lying in a union of subspaces along the temporal direction and represent the 3D trajectories as lying in the union of subspaces along the spatial direction.
This spatio-temporal representation not only provides competitive 3D reconstruction but also outputs robust segmentation of multiple non-rigid objects.
The resultant optimization problem is solved efficiently using the Alternating Direction Method of Multipliers (ADMM).
Extensive experimental results on both synthetic and real multi-body NRSfM datasets demonstrate the superior performance of our proposed framework compared with the state-of-the-art methods.
Surface parameterizations have been widely used in computer graphics and geometry processing.
In particular, as simply-connected open surfaces are conformally equivalent to the unit disk, it is desirable to compute the disk conformal parameterizations of the surfaces.
In this paper, we propose a novel algorithm for the conformal parameterization of a simply-connected open surface onto the unit disk, which significantly speeds up the computation, enhances the conformality and stability, and guarantees the bijectivity.
The conformality distortions at the inner region and on the boundary are corrected by two steps, with the aid of an iterative scheme using quasi-conformal theories.
Experimental results demonstrate the effectiveness of our proposed method.
Correlation filter (CF) based trackers generally include two modules, i.e., feature representation and on-line model adaptation.
In existing off-line deep learning models for CF trackers, the model adaptation is usually either abandoned or has a closed-form solution, to make it feasible to learn deep representations in an end-to-end manner.
However, such solutions fail to exploit the advances in CF models, and cannot achieve competitive accuracy in comparison with the state-of-the-art CF trackers.
In this paper, we investigate the joint learning of deep representation and model adaptation, where an updater network is introduced for better tracking on future frame by taking current frame representation, tracking result, and last CF tracker as input.
By modeling the representor as a convolutional neural network (CNN), we truncate the alternating direction method of multipliers (ADMM) and interpret it as a deep network of the updater, resulting in our model for learning representation and truncated inference (RTINet).
Experiments demonstrate that our RTINet tracker achieves favorable tracking accuracy against the state-of-the-art trackers and its rapid version can run at a real-time speed of 24 fps.
The code and pre-trained models will be publicly available at https://github.com/tourmaline612/RTINet.
The discovery of frequent itemsets can serve valuable economic and research purposes.
Releasing discovered frequent itemsets, however, presents privacy challenges.
In this paper, we study the problem of how to perform frequent itemset mining on transaction databases while satisfying differential privacy.
We propose an approach, called PrivBasis, which leverages a novel notion called basis sets.
A theta-basis set has the property that any itemset with frequency higher than theta is a subset of some basis.
We introduce algorithms for privately constructing a basis set and then using it to find the most frequent itemsets.
Experiments show that our approach greatly outperforms the current state of the art.
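The theta-basis idea reduces private mining to estimating the supports of the (few) subsets of each basis. The sketch below illustrates only that final noisy-counting step with the Laplace mechanism, splitting the privacy budget evenly across the subsets of one basis; it is an illustration in the spirit of PrivBasis, not the full algorithm:

```python
import itertools
import random

def noisy_subset_frequencies(transactions, basis, epsilon):
    """Count the support of every non-empty subset of one basis set and
    add Laplace noise (budget split evenly across the subsets)."""
    subsets = [frozenset(c)
               for r in range(1, len(basis) + 1)
               for c in itertools.combinations(basis, r)]
    eps_each = epsilon / len(subsets)
    out = {}
    for s in subsets:
        count = sum(1 for t in transactions if s <= t)
        # difference of two Exp(eps_each) draws is Laplace(1/eps_each)
        noise = random.expovariate(eps_each) - random.expovariate(eps_each)
        out[s] = count + noise
    return out

txns = [frozenset(t) for t in (["a", "b"], ["a"], ["a", "b", "c"])]
freqs = noisy_subset_frequencies(txns, ["a", "b"], epsilon=100.0)
```

Because any itemset more frequent than theta is a subset of some basis, ranking these noisy counts suffices to recover the most frequent itemsets privately.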
Analysis and recognition of driving styles are profoundly important to intelligent transportation and vehicle calibration.
This paper presents a novel driving style analysis framework using the primitive driving patterns learned from naturalistic driving data.
In order to achieve this, first, a Bayesian nonparametric learning method based on a hidden semi-Markov model (HSMM) is introduced to extract primitive driving patterns from time series driving data without prior knowledge of the number of these patterns.
In the Bayesian nonparametric approach, we utilize a hierarchical Dirichlet process (HDP), which avoids having to specify the unknown number of smooth dynamical modes of the HSMM in advance, thus generating the primitive driving patterns.
Each primitive pattern is clustered and then labeled using behavioral semantics according to drivers' physical and psychological perception thresholds.
For each driver, 75 primitive driving patterns in car-following scenarios are learned and semantically labeled.
In order to show the HDP-HSMM's utility in learning primitive driving patterns, two other Bayesian nonparametric approaches, HDP-HMM and sticky HDP-HMM, are compared.
The naturalistic driving data of 18 drivers were collected from the University of Michigan Safety Pilot Model Deployment (SPDM) database.
The individual driving styles are discussed according to the distribution characteristics of the learned primitive driving patterns, and the differences in driving styles among drivers are evaluated using the Kullback-Leibler divergence.
The experiment results demonstrate that the proposed primitive pattern-based method can allow one to semantically understand driver behaviors and driving styles.
In this paper we examine the existence of correlation between movie similarity and low level features from respective movie content.
In particular, we demonstrate the extraction of multi-modal representation models of movies based on subtitles, audio and metadata mining.
We emphasize our research in topic modeling of movies based on their subtitles.
In order to demonstrate the proposed content representation approach, we have built a small dataset of 160 widely known movies.
We assess movie similarities, as propagated by the singular modalities and fusion models, in the form of recommendation rankings.
We showcase a novel topic model browser for movies that allows for exploration of the different aspects of similarities between movies and an information retrieval system for movie similarity based on multi-modal content.
A multilayer perceptron can behave as a generative classifier by applying bidirectional learning (BL).
It consists of training an undirected neural network to map input to output and vice versa; it can therefore produce a classifier in one direction and a generator in the opposite direction for the same data.
The learning process of BL tries to reproduce the neuroplasticity stated in Hebbian theory using only backward propagation of errors.
In this paper, two novel learning techniques are introduced which use BL for improving robustness to white noise static and adversarial examples.
The first method is bidirectional propagation of errors, in which error propagation occurs in both the backward and forward directions.
Motivated by the fact that its generative model receives as input a constant vector per class, we introduce as a second method the hybrid adversarial networks (HAN).
Its generative model receives a random vector as input and its training is based on generative adversarial networks (GAN).
To assess the performance of BL, we perform experiments using several architectures with fully and convolutional layers, with and without bias.
Experimental results show that both methods improve robustness to white noise static and adversarial examples, and even increase accuracy, but they behave differently depending on the architecture and task, making one or the other more beneficial in a given setting.
Nevertheless, HAN with a convolutional architecture and batch normalization presents outstanding robustness, reaching state-of-the-art accuracy on adversarial examples of handwritten digits.
In Android, communications between apps and system services are supported by a transaction-based Inter-Process Communication (IPC) mechanism.
Binder, as the cornerstone of this IPC mechanism, separates two communicating parties as client and server.
As with any client-server model, the server should not make any assumptions about the validity (sanity) of client-side transactions.
To our surprise, we find this principle has frequently been overlooked in the implementation of Android system services.
In this paper, we demonstrate the prevalence and severity of this vulnerability surface and try to answer why developers keep making this seemingly simple mistake.
Specifically, we design and implement BinderCracker, an automatic testing framework that supports parameter-aware fuzzing and has identified more than 100 vulnerabilities in six major versions of Android, including the latest version Android 6.0, Marshmallow.
Some of the vulnerabilities have severe security implications, causing privileged code execution or permanent Denial-of-Service (DoS).
We analyzed the root causes of these vulnerabilities to find that most of them exist because system service developers only considered exploitations via public APIs.
We thus highlight the deficiency of testing only on client-side public APIs and argue for the necessity of testing and protection on the Binder interface - the actual security boundary.
Specifically, we discuss the effectiveness and practicality of potential countermeasures, such as precautionary testing and runtime diagnostics.
Artificial neural networks learn how to solve new problems through a computationally intensive and time-consuming process.
One way to reduce the amount of time required is to inject preexisting knowledge into the network.
To make use of past knowledge, we can take advantage of techniques that transfer the knowledge learned from one task, and reuse it on another (sometimes unrelated) task.
In this paper we propose a novel selective breeding technique that extends the transfer learning with behavioural genetics approach proposed by Kohli, Magoulas and Thomas (2013), and evaluate its performance on financial data.
Numerical evidence demonstrates the credibility of the new approach.
We provide insights on the operation of transfer learning and highlight the benefits of using behavioural principles and selective breeding when tackling a diverse set of financial application problems.
We consider the communication scenario where K transmitters are each connected to a common receiver with an orthogonal noiseless link.
One of the transmitters has a message for the receiver, who is prohibited from learning anything in the information theoretic sense about which transmitter sends the message (transmitter anonymity is guaranteed).
The capacity of anonymous communications is the maximum number of bits of desired information that can be anonymously communicated per bit of total communication.
For this anonymous communication problem over a parallel channel with K transmitters and 1 receiver, we show that the capacity is 1/K, i.e., to communicate 1 bit anonymously, each transmitter must send a 1 bit signal.
Further, each transmitter must hold at least 1 bit of correlated randomness (independent of the messages) per message bit, and the total size of the correlated randomness at all K transmitters must be at least K-1 bits per message bit.
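A classic DC-net-style round matches these bounds and can serve as an illustration (this is a generic construction consistent with the stated capacity, not necessarily the paper's achievability scheme): the transmitters share correlated random bits that XOR to zero, the sender additionally XORs in its message bit, and every transmitter sends exactly one bit, so the receiver recovers the message without learning who sent it.

```python
import secrets

def anonymous_round(K, sender, message_bit):
    """One round of anonymous communication: K transmitters each send
    one bit; the XOR of all sent bits equals the message, and each
    individual bit is uniformly random, hiding the sender."""
    shares = [secrets.randbits(1) for _ in range(K - 1)]
    last = 0
    for b in shares:
        last ^= b
    shares.append(last)                 # correlated randomness XORs to 0
    shares[sender] ^= message_bit       # sender embeds the message bit
    received = 0
    for b in shares:
        received ^= b                   # receiver XORs all K bits
    return received

bit = anonymous_round(K=5, sender=2, message_bit=1)
```

Each transmitter consumes 1 random bit per message bit and sends 1 bit, matching the 1/K capacity and the K-1 bits of total correlated randomness.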
Automatic recognition of spontaneous facial expressions is a major challenge in the field of affective computing.
Head rotation, face pose, illumination variation, occlusion, etc., are attributes that increase the complexity of recognizing spontaneous expressions in practical applications.
Effective recognition of expressions depends significantly on the quality of the database used.
Most well-known facial expression databases consist of posed expressions.
However, currently there is a huge demand for spontaneous expression databases for the pragmatic implementation of the facial expression recognition algorithms.
In this paper, we propose and establish a new facial expression database containing spontaneous expressions of both male and female participants of Indian origin.
The database consists of 428 segmented video clips of the spontaneous facial expressions of 50 participants.
In our experiment, emotions were induced among the participants by using emotional videos and simultaneously their self-ratings were collected for each experienced emotion.
Facial expression clips were annotated carefully by four trained decoders and further validated by the nature of the stimuli used and the participants' self-reports of emotions.
An extensive analysis was carried out on the database using several machine learning algorithms and the results are provided for future reference.
Such a spontaneous database will help in the development and validation of algorithms for recognition of spontaneous expressions.
The Apriori algorithm that mines frequent itemsets is one of the most popular and widely used data mining algorithms.
Nowadays, many algorithms have been proposed on parallel and distributed platforms to enhance the performance of the Apriori algorithm.
They differ from each other on the basis of the load balancing technique, memory system, data decomposition technique, and data layout used to implement them.
The problems with most distributed frameworks are the overhead of managing the distributed system and the lack of a high-level parallel programming language.
Also, with grid computing there is always a potential for node failures, which cause multiple re-executions of tasks.
These problems can be overcome by the MapReduce framework introduced by Google.
MapReduce is an efficient, scalable and simplified programming model for large scale distributed data processing on a large cluster of commodity computers and also used in cloud computing.
In this paper, we present an overview of parallel Apriori algorithms implemented on the MapReduce framework.
They are categorized on the basis of the Map and Reduce functions used to implement them, e.g., 1-phase vs. k-phase, the I/O of the Mapper, Combiner and Reducer, and the use of Combiner functionality inside the Mapper.
This survey discusses and analyzes the various implementations of Apriori on MapReduce framework on the basis of their distinguishing characteristics.
Moreover, it also includes the advantages and limitations of MapReduce framework.
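The map/reduce split behind these implementations can be sketched in plain Python. The sketch below simulates a 1-phase variant in which each mapper emits every candidate k-itemset of a transaction and the reducer sums counts and filters by minimum support; the function names and the tiny transaction set are illustrative, not drawn from any surveyed implementation.

```python
from collections import defaultdict
from itertools import combinations

def map_phase(transaction, k):
    """Mapper: emit (candidate k-itemset, 1) for every k-subset of a transaction."""
    for itemset in combinations(sorted(transaction), k):
        yield itemset, 1

def reduce_phase(pairs, min_support):
    """Reducer: sum counts per itemset and keep those meeting min_support."""
    counts = defaultdict(int)
    for itemset, c in pairs:
        counts[itemset] += c
    return {i: c for i, c in counts.items() if c >= min_support}

def one_phase_apriori(transactions, k, min_support):
    # Single MapReduce round: all candidate counting happens in one pass.
    pairs = (p for t in transactions for p in map_phase(t, k))
    return reduce_phase(pairs, min_support)

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
frequent_pairs = one_phase_apriori(transactions, 2, 2)
# Every 2-itemset occurs twice, so all three survive the support threshold.
```

A k-phase variant would instead iterate this round, generating level-(k+1) candidates only from the frequent level-k itemsets, trading extra rounds for far fewer emitted candidates.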
Most of the parameters in large-vocabulary models are used in the embedding layer, to map categorical features to vectors, and in the softmax layer, for the classification weights.
This is a bottleneck in memory-constrained on-device training applications such as federated learning and in on-device inference applications such as automatic speech recognition (ASR).
One way of compressing the embedding and softmax layers is to substitute larger units such as words with smaller sub-units such as characters.
However, often the sub-unit models perform poorly compared to the larger unit models.
We propose WEST, an algorithm for encoding categorical features and output classes with a sequence of random or domain dependent sub-units and demonstrate that this transduction can lead to significant compression without compromising performance.
WEST bridges the gap between larger unit and sub-unit models and can be interpreted as a MaxEnt model over sub-unit features, which can be of independent interest.
Books have the power to make us feel happiness, sadness, pain, surprise, or sorrow.
An author's dexterity in the use of these emotions captivates readers and makes it difficult for them to put the book down.
In this paper, we model the flow of emotions over a book using recurrent neural networks and quantify its usefulness in predicting success in books.
We obtained the best weighted F1-score of 69% for predicting books' success in a multitask setting (simultaneously predicting success and genre of books).
Model compression and knowledge distillation have been successfully applied for cross-architecture and cross-domain transfer learning.
However, a key requirement is that training examples are in correspondence across the domains.
We show that in many scenarios of practical importance such aligned data can be synthetically generated using computer graphics pipelines allowing domain adaptation through distillation.
We apply this technique to learn models for recognizing low-resolution images using labeled high-resolution images, non-localized objects using labeled localized objects, line-drawings using labeled color images, etc.
Experiments on various fine-grained recognition datasets demonstrate that the technique improves recognition performance on the low-quality data and beats strong baselines for domain adaptation.
Finally, we present insights into workings of the technique through visualizations and relating it to existing literature.
This document describes G2D, a software that enables capturing videos from Grand Theft Auto V (GTA V), a popular role playing game set in an expansive virtual city.
The target users of our software are computer vision researchers who wish to collect hyper-realistic computer-generated imagery of a city from the street level, under controlled 6DOF camera poses and varying environmental conditions (weather, season, time of day, traffic density, etc.).
G2D accesses/calls the native functions of the game; hence users can directly interact with G2D while playing the game.
Specifically, G2D enables users to manipulate conditions of the virtual environment on the fly, while the gameplay camera is set to automatically retrace a predetermined 6DOF camera pose trajectory within the game coordinate system.
Concurrently, automatic screen capture is executed while the virtual environment is being explored.
G2D and its source code are publicly available at https://goo.gl/SS7fS6. In addition, we demonstrate an application of G2D to generate a large-scale dataset with ground-truth camera poses for testing structure-from-motion (SfM) algorithms.
The dataset and generated 3D point clouds are also made available at https://goo.gl/DNzxHx
We study verifiable sufficient conditions and computable performance bounds for sparse recovery algorithms such as the Basis Pursuit, the Dantzig selector and the Lasso estimator, in terms of a newly defined family of quality measures for the measurement matrices.
With high probability, the developed measures for subgaussian random matrices are bounded away from zero as long as the number of measurements is reasonably large.
Compared to the restricted isometry constant based performance analysis, the arguments in this paper are much more concise and the obtained bounds are tighter.
Numerical experiments are presented to illustrate our theoretical results.
k-mers (nucleotide strings of length k) form the basis of several algorithms in computational genomics.
In particular, k-mer abundance information in sequence data is useful in read error correction, parameter estimation for genome assembly, digital normalization etc.
We give a streaming algorithm, Kmerlight, for computing the k-mer abundance histogram from sequence data.
Our algorithm is fast and has a very small memory footprint.
We provide analytical bounds on the error guarantees of our algorithm.
Kmerlight can efficiently process genome scale and metagenome scale data using standard desktop machines.
A few applications of the abundance histograms computed by Kmerlight are also shown.
We use the abundance histogram for de novo estimation of the repetitiveness of a genome, based on a simple probabilistic model that we propose.
We also show estimation of the k-mer error rate in the sample using the abundance histogram.
Our algorithm can also be used for abundance estimation in a general streaming setting.
The Kmerlight tool is written in C++ and is available for download and use from https://github.com/nsivad/kmerlight.
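Kmerlight approximates the abundance histogram with a small sketch; the exact quantity it approximates — histogram[i] = number of distinct k-mers occurring exactly i times — is easy to state in plain Python, at the cost of memory linear in the number of distinct k-mers. The sample reads below are invented:

```python
from collections import Counter

def kmer_abundance_histogram(reads, k):
    """Exact k-mer abundance histogram:
    histogram[i] = number of distinct k-mers occurring exactly i times."""
    kmer_counts = Counter()
    for read in reads:
        # Slide a window of width k over each read.
        for i in range(len(read) - k + 1):
            kmer_counts[read[i:i + k]] += 1
    # Histogram of the per-k-mer counts.
    return Counter(kmer_counts.values())

reads = ["ACGTACGT", "ACGTAC"]
hist = kmer_abundance_histogram(reads, 4)
# ACGT occurs 3 times, CGTA and GTAC twice each, TACG once,
# so hist == {3: 1, 2: 2, 1: 1}.
```

A streaming sketch such as Kmerlight replaces the exact `kmer_counts` table with sampled counters, keeping memory sublinear while estimating the same histogram.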
The segmentation of transparent objects can be very useful in computer vision applications.
However, because they borrow texture from their background and have a similar appearance to their surroundings, transparent objects are not handled well by regular image segmentation methods.
We propose a method that overcomes these problems using the consistency and distortion properties of a light-field image.
Graph-cut optimization is applied for the pixel labeling problem.
The light-field linearity is used to estimate the likelihood of a pixel belonging to the transparent object or Lambertian background, and the occlusion detector is used to find the occlusion boundary.
We acquire a light field dataset for the transparent object, and use this dataset to evaluate our method.
The results demonstrate that the proposed method successfully segments transparent objects from the background.
We explore the geometry of London's streets using a computational model of an excitable chemical system, the Belousov-Zhabotinsky (BZ) medium.
We virtually fill the streets with a BZ medium and study the propagation of excitation waves for a range of excitability parameters, covering the gradual transition from excitable to sub-excitable to non-excitable.
We demonstrate a pruning strategy adopted by the medium with decreasing excitability, in which wider and ballistically appropriate streets are selected.
We explain the mechanics of street selection and pruning.
The results of the paper will be used in future studies of the dynamics of cities with living excitable substrates.
It is essential to find new ways of enabling experts in different disciplines to collaborate more efficiently in the development of ever more complex systems, under increasing market pressure.
One possible solution for this challenge is to use a heterogeneous model-based approach where different teams can produce their conventional models and carry out their usual mono-disciplinary analysis, but in addition, the different models can be coupled for simulation (co-simulation), allowing the study of the global behavior of the system.
Due to its potential, co-simulation is being studied in many different disciplines but with limited sharing of findings.
Our aim with this work is to summarize, bridge, and enhance future research in this multidisciplinary area.
We provide an overview of co-simulation approaches, research challenges, and research opportunities, together with a detailed taxonomy covering different aspects of the state of the art of co-simulation and a classification of work from the past five years.
The main research needs identified are: finding generic approaches for modular, stable and accurate coupling of simulation units; and expressing the adaptations required to ensure that the coupling is correct.
We present a weakly supervised model that jointly performs both semantic- and instance-segmentation -- a particularly relevant problem given the substantial cost of obtaining pixel-perfect annotation for these tasks.
In contrast to many popular instance segmentation approaches based on object detectors, our method does not predict any overlapping instances.
Moreover, we are able to segment both "thing" and "stuff" classes, and thus explain all the pixels in the image.
"Thing" classes are weakly-supervised with bounding boxes, and "stuff" with image-level tags.
We obtain state-of-the-art results on Pascal VOC, for both full and weak supervision (which achieves about 95% of fully-supervised performance).
Furthermore, we present the first weakly-supervised results on Cityscapes for both semantic- and instance-segmentation.
Finally, we use our weakly supervised framework to analyse the relationship between annotation quality and predictive performance, which is of interest to dataset creators.
We consider the data shuffling problem in a distributed learning system, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes.
The master node has access to a database of files.
In every shuffling iteration, each worker node processes a new subset of files, and has excess storage to partially cache the remaining files, assuming the cached files are uncoded.
The caches of the worker nodes are updated every iteration, and they should be designed to satisfy any possible unknown permutation of the files in subsequent iterations.
For this problem, we characterize the exact rate-memory trade-off for worst-case shuffling by deriving the minimum communication load for a given storage capacity per worker node.
As a byproduct, the exact rate-memory trade-off for any shuffling is characterized when the number of files is equal to the number of worker nodes.
We propose a novel deterministic coded shuffling scheme, which improves the state of the art, by exploiting the cache memories to create coded functions that can be decoded by several worker nodes.
Then, we prove the optimality of our proposed scheme by deriving a matching lower bound and showing that the placement phase of the proposed coded shuffling scheme is optimal over all shuffles.
The identification of nodes occupying important positions in a network structure is crucial for the understanding of the associated real-world system.
Usually, betweenness centrality is used to evaluate a node's capacity to connect different graph regions.
However, we argue here that this measure is not adapted for that task, as it gives equal weight to "local" centers (i.e. nodes of high degree central to a single region) and to "global" bridges, which connect different communities.
This distinction is important as the roles of such nodes are different in terms of the local and global organisation of the network structure.
In this paper we propose a decomposition of betweenness centrality into two terms, one highlighting the local contributions and the other the global ones.
We call the latter bridgeness centrality and show that it is capable of specifically identifying global bridges.
In addition, we introduce an effective algorithmic implementation of this measure and demonstrate its capability to identify global bridges in air transportation and scientific collaboration networks.
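As an illustration of the decomposition idea, the sketch below computes ordinary betweenness by brute force on a small unweighted graph and, alongside it, a simple neighbourhood-based bridgeness variant that only credits a node for shortest paths whose endpoints lie outside its immediate neighbourhood; the exact decomposition used in the paper may differ. The path graph is a toy example:

```python
from collections import deque

def bfs_counts(graph, s):
    """BFS from s: distances and numbers of shortest paths to every node."""
    dist, sigma, q = {s: 0}, {s: 1}, deque([s])
    while q:
        u = q.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                q.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness_and_bridgeness(graph):
    info = {s: bfs_counts(graph, s) for s in graph}
    bet = {v: 0.0 for v in graph}
    bri = {v: 0.0 for v in graph}
    nodes = list(graph)
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            ds, ss = info[s]
            dt, st = info[t]
            if t not in ds:        # s and t disconnected
                continue
            for v in graph:
                if v in (s, t) or v not in ds or v not in dt:
                    continue
                # v lies on a shortest s-t path iff distances add up.
                if ds[v] + dt[v] == ds[t]:
                    frac = ss[v] * st[v] / ss[t]
                    bet[v] += frac
                    # Bridgeness variant: ignore pairs inside v's neighbourhood.
                    if s not in graph[v] and t not in graph[v]:
                        bri[v] += frac
    return bet, bri

graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
         "d": {"c", "e"}, "e": {"d"}}
bet, bri = betweenness_and_bridgeness(graph)
# On the path a-b-c-d-e the middle node c has betweenness 4
# but bridgeness only 1 (only the pair (a, e) avoids its neighbours).
```

The contrast between `bet` and `bri` mirrors the local-center vs. global-bridge distinction the abstract describes.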
Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP cubes.
We introduce a technique to represent grids supporting aggregated range queries that requires little space when the data points in the grid are clustered, which is common in practice.
We show how this general technique can be used to support two important types of aggregated queries, which are ranked range queries and counting range queries.
Our experimental evaluation shows that this technique can speed up aggregated queries up to more than an order of magnitude, with a small space overhead.
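A counting range query over a grid can be illustrated with a plain 2-D prefix-sum table; this sketch shows only the query semantics, not the paper's space-efficient representation for clustered points, and the sample points are invented:

```python
class CountingGrid:
    """2-D prefix-sum table answering counting range queries in O(1)."""

    def __init__(self, points, width, height):
        # pre[y][x] = number of points with coordinates < (x, y).
        self.pre = [[0] * (width + 1) for _ in range(height + 1)]
        for x, y in points:
            self.pre[y + 1][x + 1] += 1
        for y in range(1, height + 1):
            for x in range(1, width + 1):
                self.pre[y][x] += (self.pre[y - 1][x] + self.pre[y][x - 1]
                                   - self.pre[y - 1][x - 1])

    def count(self, x1, y1, x2, y2):
        """Number of points with x1 <= x <= x2 and y1 <= y <= y2 (inclusive)."""
        p = self.pre
        return (p[y2 + 1][x2 + 1] - p[y1][x2 + 1]
                - p[y2 + 1][x1] + p[y1][x1])

grid = CountingGrid([(0, 0), (1, 1), (2, 2), (1, 2)], 3, 3)
# grid.count(1, 1, 2, 2) counts the three points in the upper-right block.
```

Compressed structures such as the one in the paper answer the same query while exploiting clustering to store far less than this dense table.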
Background: Code review is a cognitively demanding and time-consuming process.
Previous qualitative studies hinted at how changesets divided according to a logical partitioning could be easier to review.
Aims: (1) Quantitatively measure the effects of change-decomposition on the outcome of code review (in terms of number of found defects, wrongly reported issues, suggested improvements, time, and understanding); (2) Qualitatively analyze how subjects approach the review and navigate the code building knowledge and addressing existing issues, in large vs. decomposed changes.
Method: Controlled experiment using the pull-based development model involving 28 software developers among professionals and graduate students.
Results: Change-decomposition leads to fewer wrongly reported issues, influences how subjects approach and conduct the review activity (by increasing context-seeking), yet impacts neither understanding the change rationale nor the number of found defects.
Conclusions: Change-decomposition not only reduces the noise for subsequent data analyses but also significantly supports the tasks of the developers in charge of reviewing the changes.
As such, commits belonging to different concepts should be separated, adopting this as a best practice in software engineering.
Over the last few years, a rapidly increasing number of Internet-of-Things (IoT) systems that adopt voice as the primary user input have emerged.
These systems have been shown to be vulnerable to various types of voice spoofing attacks.
Existing defense techniques can usually only protect from a specific type of attack or require an additional authentication step that involves another device.
Such defense strategies are either not strong enough or lower the usability of the system.
Based on the fact that legitimate voice commands should only come from humans rather than a playback device, we propose a novel defense strategy that is able to detect the sound source of a voice command based on its acoustic features.
The proposed defense strategy does not require any information other than the voice command itself and can protect a system from multiple types of spoofing attacks.
Our proof-of-concept experiments verify the feasibility and effectiveness of this defense strategy.
Modeling human conversations is the essence for building satisfying chat-bots with multi-turn dialog ability.
Conversation modeling will notably benefit from domain knowledge since the relationships between sentences can be clarified due to semantic hints introduced by knowledge.
In this paper, a deep neural network is proposed to incorporate background knowledge for conversation modeling.
Through a specially designed Recall gate, domain knowledge can be transformed into the extra global memory of Long Short-Term Memory (LSTM), so as to enhance LSTM by cooperating with its local memory to capture the implicit semantic relevance between sentences within conversations.
In addition, this paper introduces a loosely structured domain knowledge base, which can be built with a small amount of manual work and easily adopted by the Recall gate.
Our model is evaluated on the context-oriented response selection task, and experimental results on two datasets have shown that our approach is promising for modeling human conversations and building key components of automatic chatting systems.
FPGA becomes a popular technology for implementing Convolutional Neural Network (CNN) in recent years.
Most CNN applications on FPGA are domain-specific, e.g., detecting objects from specific categories, in which commonly-used CNN models pre-trained on general datasets may not be efficient enough.
This paper presents TuRF, an end-to-end CNN acceleration framework to efficiently deploy domain-specific applications on FPGA by transfer learning that adapts pre-trained models to specific domains, replacing standard convolution layers with efficient convolution blocks, and applying layer fusion to enhance hardware design performance.
We evaluate TuRF by deploying a pre-trained VGG-16 model for a domain-specific image recognition task onto a Stratix V FPGA.
Results show that designs generated by TuRF achieve better performance than prior methods for the original VGG-16 and ResNet-50 models, while for the optimised VGG-16 model TuRF designs are more accurate and easier to process.
Complex systems of systems (SoS) are characterized by multiple interconnected subsystems.
Typically, each subsystem is designed and analyzed using methodologies and formalisms that are specific to the particular subsystem model of computation considered --- Petri nets, continuous time ODEs, nondeterministic automata, to name a few.
When interconnecting subsystems, a designer needs to choose, based on the specific subsystems models, a common abstraction framework to analyze the composition.
In this paper we introduce a new framework for abstraction, composition and analysis of SoS that builds on results and methods developed in sheaf theory, category theory and topos theory.
In particular, we model the behaviors of systems using sheaves, leverage category-theoretic methods to define wiring diagrams and formalize composition, and, by establishing a connection with topos theory, define a formal (intuitionistic/constructive) logic with a sound sheaf semantics.
Recent work has shown that Reed-Muller (RM) codes achieve the erasure channel capacity.
However, this performance is obtained with maximum-likelihood decoding which can be costly for practical applications.
In this paper, we propose an encoding/decoding scheme for Reed-Muller codes on the packet erasure channel based on the Plotkin construction.
We present several improvements over the generic decoding.
They allow us, at a small cost, to compete with maximum-likelihood decoding performance, especially on high-rate codes, while significantly outperforming it in terms of speed.
Layer-wise Relevance Propagation (LRP) and saliency maps have been recently used to explain the predictions of Deep Learning models, specifically in the domain of text classification.
Given different attribution-based explanations that highlight relevant words for a predicted class label, experiments based on word-deletion perturbation are a common evaluation method.
This word removal approach, however, disregards any linguistic dependencies that may exist between words or phrases in a sentence, which could semantically guide a classifier to a particular prediction.
In this paper, we present a feature-based evaluation framework for comparing the two attribution methods on customer reviews (public data sets) and Customer Due Diligence (CDD) extracted reports (corporate data set).
Instead of removing words based on the relevance score, we investigate perturbations based on embedded features removal from intermediate layers of Convolutional Neural Networks.
Our experimental study is carried out on embedded-word, embedded-document, and embedded-ngrams explanations.
Using the proposed framework, we provide a visualization tool to assist analysts in reasoning toward the model's final prediction.
Natural disasters affect hundreds of millions of people worldwide every year.
Early warning, humanitarian response and recovery mechanisms can be improved by using big data sources.
Measuring the different dimensions of the impact of natural disasters is critical for designing policies and building up resilience.
Detailed quantification of the movement and behaviours of affected populations requires the use of high granularity data that entails privacy risks.
Leveraging all this data is costly and has to be done ensuring privacy and security of large amounts of data.
Proxies based on social media and data aggregates would streamline this process by providing evidences and narrowing requirements.
We propose a framework that integrates environmental data, social media, remote sensing, digital topography and mobile phone data to understand different types of floods and how data can provide insights useful for managing humanitarian action and recovery plans.
Thus, data is dynamically requested upon data-based indicators forming a multi-granularity and multi-access data pipeline.
We present a composite study of three cases to show the potential variability in the nature of floods, as well as the impact and applicability of the data sources.
Critical heterogeneity of the available data in the different cases has to be addressed in order to design systematic approaches based on data.
The proposed framework establishes the foundation to relate the physical and socio-economical impacts of floods.
Most brain-computer interfaces (BCIs) based on functional near-infrared spectroscopy (fNIRS) require that users perform mental tasks such as motor imagery, mental arithmetic, or music imagery to convey a message or to answer simple yes or no questions.
These cognitive tasks usually have no direct association with the communicative intent, which makes them difficult for users to perform.
In this paper, a 3-class intuitive BCI is presented which enables users to directly answer yes or no questions by covertly rehearsing the word 'yes' or 'no' for 15 s. The BCI also admits an equivalent duration of unconstrained rest which constitutes the third discernable task.
Twelve participants each completed one offline block and six online blocks over the course of 2 sessions.
The mean value of the change in oxygenated hemoglobin concentration during a trial was calculated for each channel and used to train a regularized linear discriminant analysis (RLDA) classifier.
By the final online block, 9 out of 12 participants were performing above chance (p<0.001), with a 3-class accuracy of 83.8±9.4%.
Even when considering all participants, the average online 3-class accuracy over the last 3 blocks was 64.1±20.6%, with only 3 participants scoring below chance (p<0.001).
For most participants, channels in the left temporal and temporoparietal cortex provided the most discriminative information.
To our knowledge, this is the first report of an online fNIRS 3-class imagined speech BCI.
Our findings suggest that imagined speech can be used as a reliable activation task for selected users for the development of more intuitive BCIs for communication.
We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation.
We rely on graph-convolutional networks (GCNs), a recent class of neural networks developed for modeling graph-structured data.
Our GCNs use predicted syntactic dependency trees of source sentences to produce representations of words (i.e. hidden states of the encoder) that are sensitive to their syntactic neighborhoods.
GCNs take word representations as input and produce word representations as output, so they can easily be incorporated as layers into standard encoders (e.g., on top of bidirectional RNNs or convolutional neural networks).
We evaluate their effectiveness with English-German and English-Czech translation experiments for different types of encoders and observe substantial improvements over their syntax-agnostic versions in all the considered setups.
Recent advances in optimization methods used for training convolutional neural networks (CNNs) with kernels, which are normalized according to particular constraints, have shown remarkable success.
This work introduces an approach for training CNNs using ensembles of joint spaces of kernels constructed using different constraints.
For this purpose, we address a problem of optimization on ensembles of products of submanifolds (PEMs) of convolution kernels.
To this end, we first propose three strategies to construct ensembles of PEMs in CNNs.
Next, we expound their geometric properties (metric and curvature properties) in CNNs.
We make use of our theoretical results by developing a geometry-aware SGD algorithm (G-SGD) for optimization on ensembles of PEMs to train CNNs.
Moreover, we analyze convergence properties of G-SGD considering geometric properties of PEMs.
In the experimental analyses, we employ G-SGD to train CNNs on Cifar-10, Cifar-100 and Imagenet datasets.
The results show that geometric adaptive step size computation methods of G-SGD can improve training loss and convergence properties of CNNs.
Moreover, we observe that classification performance of baseline CNNs can be boosted using G-SGD on ensembles of PEMs identified by multiple constraints.
This paper shows how an application can be developed that translates English text into Punjabi, and how the same application can perform text-to-speech (TTS), i.e., pronounce the text.
This application can be especially beneficial for users with special needs.
In this paper, we consider noncoherent random linear coding networks (RLCNs) as a discrete memoryless channel (DMC) whose input and output alphabets consist of subspaces.
This contrasts with previous channel models in the literature which assume matrices as the channel input and output.
No particular assumptions are made on the network topology or the transfer matrix, except that the latter may be rank-deficient according to some rank deficiency probability distribution.
We introduce a random vector basis selection procedure which renders the DMC symmetric.
The capacity we derive can be seen as a lower bound on the capacity of noncoherent RLCNs, where subspace coding suffices to achieve this bound.
A significant weakness of most current deep convolutional neural networks is the need to train them using vast amounts of manually labelled data.
In this work we propose an unsupervised framework to learn a deep convolutional neural network for single-view depth prediction, without requiring a pre-training stage or annotated ground-truth depths.
We achieve this by training the network in a manner analogous to an autoencoder.
At training time we consider a pair of images, source and target, with small, known camera motion between the two such as a stereo pair.
We train the convolutional encoder for the task of predicting the depth map for the source image.
To do so, we explicitly generate an inverse warp of the target image using the predicted depth and known inter-view displacement, to reconstruct the source image; the photometric error in the reconstruction is the reconstruction loss for the encoder.
The acquisition of this training data is considerably simpler than for equivalent systems, requiring no manual annotation, nor calibration of depth sensor to camera.
We show that our network trained on less than half of the KITTI dataset (without any further augmentation) gives comparable performance to that of state-of-the-art supervised methods for single-view depth estimation.
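The reconstruction loss described above can be sketched with a toy 1-D analogue: predicted depth determines a disparity, the target view is sampled at the warped position, and the intensity difference is the loss. The paper operates on 2-D images with differentiable sampling; here the warp is nearest-neighbour and all values are invented.

```python
def photometric_loss(source, target, depth, baseline_focal):
    """Toy 1-D photometric reconstruction loss.
    disparity = baseline_focal / depth; a source pixel at x appears
    in the target view at x - disparity (nearest-neighbour sampling)."""
    err, n = 0.0, 0
    for x, d in enumerate(depth):
        xs = x - round(baseline_focal / d)   # warped coordinate in target
        if 0 <= xs < len(target):
            err += abs(source[x] - target[xs])
            n += 1
    return err / max(n, 1)

# Target is the source shifted by disparity 1; correct depth (disparity 1)
# reconstructs perfectly, a wrong depth (disparity 2) does not.
good = photometric_loss([1, 2, 3, 4], [2, 3, 4, 0], [1, 1, 1, 1], 1.0)
bad = photometric_loss([1, 2, 3, 4], [2, 3, 4, 0], [0.5] * 4, 1.0)
```

The network's encoder is trained to output the `depth` that minimizes this loss, so no ground-truth depth is ever needed.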
Caricature generation is an interesting yet challenging task.
The primary goal is to generate plausible caricatures with reasonable exaggerations given face images.
Conventional caricature generation approaches mainly use low-level geometric transformations such as image warping to generate exaggerated images, which lack richness and diversity in terms of content and style.
The recent progress in generative adversarial networks (GANs) makes it possible to learn an image-to-image transformation from data, so that richer contents and styles can be generated.
However, directly applying the GAN-based models to this task leads to unsatisfactory results because there is a large variance in the caricature distribution.
Moreover, some models require strictly paired training data which largely limits their usage scenarios.
In this paper, we propose CariGAN to overcome these problems.
Instead of training on paired data, CariGAN learns transformations only from weakly paired images.
Specifically, to enforce reasonable exaggeration and facial deformation, facial landmarks are adopted as an additional condition to constrain the generated image.
Furthermore, an attention mechanism is introduced to encourage our model to focus on the key facial parts so that more vivid details in these regions can be generated.
Finally, a Diversity Loss is proposed to encourage the model to produce diverse results to help alleviate the `mode collapse' problem of the conventional GAN-based models.
Extensive experiments on a new large-scale `WebCaricature' dataset show that the proposed CariGAN can generate more plausible caricatures with larger diversity compared with the state-of-the-art models.
This work deals with a classic problem: "Given a set of coins among which there is a counterfeit coin of a different weight, find this counterfeit coin using ordinary balance scales, with the minimum number of weighings possible, and indicate whether it weighs less or more than the rest".
The method proposed here not only calculates the minimum number of weighings necessary, but also indicates how to perform these weighings; it is easily mechanizable and valid for any number of coins.
Instructions are also given as to how to generalize the procedure to include cases where there is more than one counterfeit coin.
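The classical information-theoretic bound for this problem states that w weighings can handle at most (3^w - 3)/2 coins when the counterfeit may be lighter or heavier and its status must be reported, so the minimum number of weighings can be computed directly; the generalization to several counterfeit coins mentioned above is not covered by this sketch.

```python
def min_weighings(n):
    """Smallest w such that w balance weighings suffice to find the single
    counterfeit among n coins and report whether it is lighter or heavier.
    Classical bound: n <= (3**w - 3) / 2."""
    w = 1
    while (3 ** w - 3) // 2 < n:
        w += 1
    return w

# The famous 12-coin puzzle needs exactly 3 weighings,
# and one more coin pushes it to 4.
```

Each weighing has three outcomes (left heavier, right heavier, balanced), which is where the powers of 3 come from; the subtraction of 3 accounts for the outcomes that cannot distinguish the fake's weight status.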
Convolutional neural networks are ubiquitous in machine learning applications for solving a variety of problems.
However, they cannot be used in their native form when the domain of the data is a commonly encountered manifold such as the sphere, the special orthogonal group, the Grassmannian, the manifold of symmetric positive definite matrices, and others.
Most recently, generalization of CNNs to data domains such as the 2-sphere has been reported by some research groups, which is referred to as the spherical CNNs (SCNNs).
The key property of SCNNs distinct from CNNs is that they exhibit the rotational equivariance property that allows for sharing learned weights within a layer.
In this paper, we theoretically generalize the CNNs to Riemannian homogeneous manifolds, that include but are not limited to the aforementioned example manifolds.
Our key contributions in this work are: (i) A theorem stating that linear group equivariance systems are fully characterized by correlation of functions on the domain manifold and vice-versa.
This is fundamental to the characterization of all linear group equivariant systems and parallels the widely used result in linear system theory for vector spaces.
(ii) As a corollary, we prove the equivariance of the correlation operation to group actions admitted by the input domains, which are Riemannian homogeneous manifolds.
(iii) We present the first end-to-end deep network architecture for classification of diffusion magnetic resonance image (dMRI) scans acquired from a cohort of 44 Parkinson Disease patients and 50 control/normal subjects.
(iv) A proof of concept experiment involving synthetic data generated on the manifold of symmetric positive definite matrices is presented to demonstrate the applicability of our network to other types of domains.
Augmented accuracy in prediction of diabetes will open up new frontiers in health prognostics.
Data overfitting is a performance-degrading issue in diabetes prognosis.
In this study, a prediction system for diabetes is presented in which the issue of overfitting is minimized by using the dropout method.
A deep learning neural network is used in which both fully connected layers are followed by dropout layers.
The proposed neural network is shown to outperform other state-of-the-art methods, recording by far the best performance on the Pima Indians Diabetes Data Set.
We propose a framework, named Aggregated Wasserstein, for computing a dissimilarity measure or distance between two hidden Markov models (HMMs) whose state-conditional distributions are Gaussian.
For such HMMs, the marginal distribution at any time spot follows a Gaussian mixture distribution, a fact exploited to softly match, aka register, the states in two HMMs.
We refer to such HMMs as Gaussian mixture model-HMM (GMM-HMM).
The registration of states is inspired by the intrinsic relationship of optimal transport and the Wasserstein metric between distributions.
Specifically, the components of the marginal GMMs are matched by solving an optimal transport problem where the cost between components is the Wasserstein metric for Gaussian distributions.
The solution of the optimization problem is a fast approximation to the Wasserstein metric between two GMMs.
The new Aggregated Wasserstein distance is a semi-metric and can be computed without generating Monte Carlo samples.
It is invariant to relabeling or permutation of the states.
This distance quantifies the dissimilarity of GMM-HMMs by measuring both the difference between the two marginal GMMs and the difference between the two transition matrices.
Our new distance is tested on the tasks of retrieval and classification of time series.
Experiments on both synthetic data and real data have demonstrated its advantages in terms of accuracy as well as efficiency in comparison with existing distances based on the Kullback-Leibler divergence.
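The component-matching step described above can be sketched with generic tools: the 2-Wasserstein distance between two Gaussians has a closed form, and the discrete optimal transport between the mixture weights is a small linear program. The sketch below is a minimal illustration of this idea, matching the marginal GMMs only (it omits the transition-matrix term of the full Aggregated Wasserstein distance); all function names are ours, not the paper's.

```python
import numpy as np
from scipy.linalg import sqrtm
from scipy.optimize import linprog

def w2_gaussian(m1, S1, m2, S2):
    """Closed-form squared 2-Wasserstein distance between two Gaussians."""
    rS2 = sqrtm(S2)
    cross = sqrtm(rS2 @ S1 @ rS2)
    val = np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2 * cross.real)
    return max(float(val), 0.0)

def gmm_ot_distance(w1, means1, covs1, w2, means2, covs2):
    """Optimal-transport distance between two GMMs with Gaussian W2 ground cost."""
    w1, w2 = np.asarray(w1, float), np.asarray(w2, float)
    k1, k2 = len(w1), len(w2)
    # pairwise ground costs between mixture components
    C = np.array([[w2_gaussian(means1[i], covs1[i], means2[j], covs2[j])
                   for j in range(k2)] for i in range(k1)])
    # discrete OT as a linear program: min <T, C> s.t. marginals match
    A_eq = []
    for i in range(k1):
        row = np.zeros(k1 * k2); row[i * k2:(i + 1) * k2] = 1; A_eq.append(row)
    for j in range(k2):
        col = np.zeros(k1 * k2); col[j::k2] = 1; A_eq.append(col)
    res = linprog(C.ravel(), A_eq=np.array(A_eq),
                  b_eq=np.concatenate([w1, w2]), bounds=(0, None))
    return float(np.sqrt(max(res.fun, 0.0)))
```

The linear program is tiny (k1 x k2 variables), consistent with the abstract's claim that the approximation avoids Monte Carlo sampling.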
A minimal solution using two affine correspondences is presented to estimate the common focal length and the fundamental matrix between two semi-calibrated cameras, i.e., cameras whose intrinsic parameters are known except for a common focal length.
To the best of our knowledge, this problem has not been solved before.
The proposed approach extends point correspondence-based techniques with linear constraints derived from local affine transformations.
The obtained multivariate polynomial system is efficiently solved by the hidden-variable technique.
Observing the geometry of local affinities, we introduce novel conditions eliminating invalid roots.
To select the best one among the remaining candidates, a root selection technique is proposed that outperforms recent methods, especially in the case of high-level noise.
The proposed 2-point algorithm is validated on both synthetic data and 104 publicly available real image pairs.
A Matlab implementation of the proposed solution is included in the paper.
Unsupervised pre-trained word embeddings are used effectively for many tasks in natural language processing to leverage unlabeled textual data.
Often these embeddings are either used as initializations or as fixed word representations for task-specific classification models.
In this work, we extend our classification model's task loss with an unsupervised auxiliary loss on the word-embedding level of the model.
This is to ensure that the learned word representations contain both task-specific features, learned from the supervised loss component, and more general features learned from the unsupervised loss component.
We evaluate our approach on the task of temporal relation extraction, in particular, narrative containment relation extraction from clinical records, and show that continued training of the embeddings on the unsupervised objective together with the task objective gives better task-specific embeddings, and results in an improvement over the state of the art on the THYME dataset, using only a general-domain part-of-speech tagger as linguistic resource.
Lifelong machine learning methods acquire knowledge over a series of consecutive tasks, continually building upon their experience.
Current lifelong learning algorithms rely upon a single learning agent that has centralized access to all data.
In this paper, we extend the idea of lifelong learning from a single agent to a network of multiple agents that collectively learn a series of tasks.
Each agent faces some (potentially unique) set of tasks; the key idea is that knowledge learned from these tasks may benefit other agents trying to learn different (but related) tasks.
Our Collective Lifelong Learning Algorithm (CoLLA) provides an efficient way for a network of agents to share their learned knowledge in a distributed and decentralized manner, while preserving the privacy of the locally observed data.
Note that a decentralized scheme is a subclass of distributed algorithms in which no central server exists and, in addition to data, computations are also distributed among the agents.
We provide theoretical guarantees for robust performance of the algorithm and empirically demonstrate that CoLLA outperforms existing approaches for distributed multi-task learning on a variety of data sets.
Due to the proliferation of online social networks (OSNs), users find themselves participating in multiple OSNs.
These users leave their activity traces as they maintain friendships and interact with other users in these OSNs.
In this work, we analyze how users maintain friendship in multiple OSNs by studying users who have accounts in both Twitter and Instagram.
Specifically, we study the similarity of a user's friendship and the evenness of friendship distribution in multiple OSNs.
Our study shows that most users in Twitter and Instagram prefer to maintain different friendships in the two OSNs, keeping only a small clique of common friends across the OSNs.
Based upon our empirical study, we conduct link prediction experiments to predict missing friendship links in multiple OSNs using the neighborhood features, neighborhood friendship maintenance features and cross-link features.
Our link prediction experiments show that unsupervised methods can yield good accuracy in predicting links in one OSN using data from another OSN, and that the prediction accuracy can be further improved using supervised methods with friendship maintenance and other measures as features.
A set S of n points is 2-color universal for a graph G on n vertices if for every proper 2-coloring of G and for every 2-coloring of S with the same sizes of color classes as G has, G is straight-line embeddable on S. We show that the so-called double chain is 2-color universal for paths if each of the two chains contains at least one fifth of all the points, but not if one of the chains is more than approximately 28 times longer than the other.
A 2-coloring of G is equitable if the sizes of the color classes differ by at most 1.
A bipartite graph is equitable if it admits an equitable proper coloring.
We study the case when S is the double-chain with chain sizes differing by at most 1 and G is an equitable bipartite graph.
We prove that this S is not 2-color universal if G is not a forest of caterpillars, and that it is 2-color universal for equitable caterpillars in which at most half of the vertices are non-leaves.
We also show that if this S is equitably 2-colored, then equitably properly 2-colored forests of stars can be embedded on it.
We introduce a new dataset of logical entailments for the purpose of measuring models' ability to capture and exploit the structure of logical expressions against an entailment prediction task.
We use this task to compare a series of architectures which are ubiquitous in the sequence-processing literature, in addition to a new model class---PossibleWorldNets---which computes entailment as a "convolution over possible worlds".
Results show that convolutional networks present the wrong inductive bias for this class of problems relative to LSTM RNNs, tree-structured neural networks outperform LSTM RNNs due to their enhanced ability to exploit the syntax of logic, and PossibleWorldNets outperform all benchmarks.
Requirements engineering provides several practices to analyze how a user wants to interact with future software.
Mockups, prototypes, and scenarios are suitable to understand usability issues and user requirements early.
Nevertheless, users are often dissatisfied with the usability of the resulting software.
Apparently, previously explored information was lost or no longer accessible during the development phase.
Scenarios are one effective practice to describe behavior.
However, they are commonly notated in natural language which is often improper to capture and communicate interaction knowledge comprehensible to developers and users.
The dynamic aspect of interaction is lost if only static descriptions are used.
Digital prototyping enables the creation of interactive prototypes by adding responsive controls to hand- or digitally drawn mockups.
We propose to capture the events of these controls to obtain a representation of the interaction.
From this data, we generate videos, which demonstrate interaction sequences, as additional support for textual scenarios.
Variants of scenarios can be created by modifying the captured event sequences and mockups.
Any change is unproblematic since videos only need to be regenerated.
Thus, we achieve video as a by-product of digital prototyping.
This reduces the effort compared to video recordings such as screencasts.
A first evaluation showed that such a generated video supports a faster understanding of a textual scenario compared to static mockups.
Face completion is a challenging generation task because it requires generating visually pleasing new pixels that are semantically consistent with the unmasked face region.
This paper proposes a geometry-aware Face Completion and Editing NETwork (FCENet) by systematically studying facial geometry from the unmasked region.
Firstly, a facial geometry estimator is learned to estimate facial landmark heatmaps and parsing maps from the unmasked face image.
Then, an encoder-decoder structure generator serves to complete a face image and disentangle its mask areas conditioned on both the masked face image and the estimated facial geometry images.
Besides, since low-rank property exists in manually labeled masks, a low-rank regularization term is imposed on the disentangled masks, enforcing our completion network to manage occlusion area with various shape and size.
Furthermore, our network can generate diverse results from the same masked input by modifying the estimated facial geometry, which provides a flexible means to edit the completed face appearance.
Extensive experimental results qualitatively and quantitatively demonstrate that our network is able to generate visually pleasing face completion results and edit face attributes as well.
This paper compares four different feature extraction approaches for texture segmentation.
The feature extraction methods used for segmentation are Gabor filters (GF), Gaussian Markov random fields (GMRF), the run-length matrix (RLM) and the grey-level co-occurrence matrix (GLCM).
It was shown that GF performed best in terms of segmentation quality, while GLCM localises the texture boundaries better than the other methods.
The finite element method (FEM) involves several computational steps to solve a problem numerically, and many efforts have been directed at accelerating the solution stage of the linear system of equations.
However, the finite element matrix construction, which is also time-consuming for unstructured meshes, has been less investigated.
The generation of the global finite element matrix is performed in two steps, computing the local matrices by numerical integration and assembling them into a global system, which has traditionally been done in serial computing.
This work presents a fast technique to construct the global finite element matrix that arises by solving the Poisson's equation in a three-dimensional domain.
The proposed methodology consists in computing the numerical integration, due to its intrinsic parallel opportunities, in the graphics processing unit (GPU) and computing the matrix assembly, due to its intrinsic serial operations, in the central processing unit (CPU).
In the numerical integration, only the lower triangular part of each local stiffness matrix is computed thanks to its symmetry, which saves GPU memory and computing time.
As a result of symmetry, the global sparse matrix also contains non-zero elements only in its lower triangular part, which reduces the assembly operations and memory usage.
This methodology allows the global sparse matrix to be generated from unstructured finite element meshes of any size on GPUs with little memory capacity, limited only by the CPU memory.
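As an illustration of the symmetry argument above, the local stiffness matrix of the Poisson equation on a linear tetrahedron can be computed from the gradients of the barycentric coordinates, and only its lower triangle needs to be stored. This is a minimal CPU-side sketch of the per-element integration (the paper performs this step on the GPU); the function names are ours.

```python
import numpy as np

def tet_local_stiffness(verts):
    """4x4 local stiffness matrix of the Poisson equation on a linear
    tetrahedron with vertices v0..v3."""
    v0, v1, v2, v3 = (np.asarray(v, dtype=float) for v in verts)
    E = np.column_stack([v1 - v0, v2 - v0, v3 - v0])  # edge matrix
    vol = abs(np.linalg.det(E)) / 6.0                 # element volume
    grads = np.zeros((4, 3))
    grads[1:] = np.linalg.inv(E)       # gradients of barycentric coords 1..3
    grads[0] = -grads[1:].sum(axis=0)  # barycentric gradients sum to zero
    return vol * grads @ grads.T

def lower_triangle(K):
    """Flat row-major storage of the lower triangle, exploiting symmetry
    to halve memory, as described for the GPU kernels."""
    return [K[i, j] for i in range(K.shape[0]) for j in range(i + 1)]
```

Because the barycentric gradients sum to zero, each local matrix (and hence the assembled global matrix) has zero row sums, a useful sanity check on the integration kernel.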
Most mammal species have polygynous mating systems.
The majority of human marriage systems were also polygynous over the course of civilized history; however, socially imposed monogamy has gradually prevailed throughout the world.
This is difficult to understand because those most influential in society would themselves benefit from polygyny.
We argue that the puzzle of monogamous marriage can be explained by a simple mechanism rooted in the sexual selection dynamics of civilized human societies, driven by wealth redistribution.
The discussion in this paper is mainly based on a social computing approach, combining empirical and analytical analysis.
We investigate the possibility of deriving metric trace semantics in a coalgebraic framework.
First, we generalize a technique for systematically lifting functors from the category Set of sets to the category PMet of pseudometric spaces, showing under which conditions also natural transformations, monads and distributive laws can be lifted.
By exploiting some recent work on an abstract determinization, these results enable the derivation of trace metrics starting from coalgebras in Set.
More precisely, for a coalgebra on Set we determinize it, thus obtaining a coalgebra in the Eilenberg-Moore category of a monad.
When the monad can be lifted to PMet, we can equip the final coalgebra with a behavioral distance.
The trace distance between two states of the original coalgebra is the distance between their images in the determinized coalgebra through the unit of the monad.
We show how our framework applies to nondeterministic automata and probabilistic automata.
Community detection of network flows conventionally assumes one-step dynamics on the links.
For sparse networks and interest in large-scale structures, longer timescales may be more appropriate.
Oppositely, for large networks and interest in small-scale structures, shorter timescales may be better.
However, current methods for analyzing networks at different timescales require expensive and often infeasible network reconstructions.
To overcome this problem, we introduce a method that takes advantage of the inner-workings of the map equation and evades the reconstruction step.
This makes it possible to efficiently analyze large networks at different Markov times with no extra overhead cost.
The method also evades the costly unipartite projection for identifying flow modules in bipartite networks.
Finding optimal data for inpainting is a key problem in the context of partial differential equation based image compression.
The data that yields the most accurate reconstruction is real-valued.
Thus, quantisation models are mandatory to allow an efficient encoding.
These can also be understood as challenging data clustering problems.
Although clustering approaches are well suited for this kind of compression codec, very few works actually consider them.
Each pixel has a global impact on the reconstruction and optimal data locations are strongly correlated with their corresponding colour values.
These facts make it hard to predict which feature works best.
In this paper we discuss quantisation strategies based on popular methods such as k-means.
We are led to the central question of which kind of feature vector is best suited for image compression.
To this end we consider choices such as the pixel values, the histogram or the colour map.
Our findings show that the number of colours can be reduced significantly without impacting the reconstruction quality.
Surprisingly, these benefits do not directly translate to a good image compression performance.
The gains in the compression ratio are lost due to increased storage costs.
This suggests that it is integral to evaluate the clustering on both, the reconstruction error and the final file size.
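A minimal sketch of the kind of k-means colour quantisation discussed above: Lloyd's algorithm on pixel colour vectors, from which both the reconstruction error and the codebook size (a proxy for storage cost) can be read off. The deterministic initialisation from the first k pixels is a simplification of ours, not a choice made in the paper.

```python
import numpy as np

def kmeans_quantise(pixels, k, iters=20):
    """Lloyd's algorithm on colour vectors; returns codebook and labels."""
    pixels = np.asarray(pixels, dtype=float)
    centres = pixels[:k].copy()            # deterministic init: first k pixels
    for _ in range(iters):
        # squared distances of every pixel to every centre
        d = ((pixels[:, None, :] - centres[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):                 # move centres to cluster means
            pts = pixels[labels == j]
            if len(pts):
                centres[j] = pts.mean(0)
    return centres, labels
```

The quantised image is `centres[labels]`; comparing `np.mean((pixels - centres[labels])**2)` across values of k exposes the trade-off between reconstruction error and codebook storage that the abstract highlights.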
This is a study of the MOR cryptosystem using the special linear group over finite fields.
The automorphism group of the special linear group is analyzed for this purpose.
At the current state of knowledge, I show that the MOR cryptosystem has better security than the ElGamal cryptosystem over finite fields.
Recently, much progress has been made in learning general-purpose sentence representations that can be used across domains.
However, most existing models treat each word in a sentence equally.
In contrast, extensive studies have shown that humans read sentences efficiently by making a sequence of fixations and saccades.
This motivates us to improve sentence representations by assigning different weights to the vectors of the component words, which can be treated as an attention mechanism on single sentences.
To that end, we propose two novel attention models, in which the attention weights are derived using significant predictors of human reading time, i.e., Surprisal, POS tags and CCG supertags.
The extensive experiments demonstrate that the proposed methods significantly improve upon the state-of-the-art sentence representation models.
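The attention mechanism described above can be reduced to its core: a sentence vector computed as a weighted average of word vectors, with weights supplied by a reading-time predictor such as surprisal. The sketch below assumes the weights are given; deriving them from Surprisal, POS tags or CCG supertags is the paper's contribution and is not reproduced here.

```python
import numpy as np

def attention_sentence_vec(word_vecs, weights):
    """Sentence representation as a weighted average of word vectors.
    `weights` are raw attention scores (e.g. per-word surprisal values),
    normalised here so they sum to one."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (w[:, None] * np.asarray(word_vecs, dtype=float)).sum(axis=0)
```

Setting all weights equal recovers the plain averaging baseline that the abstract argues against.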
Artificial Intelligence is a central topic in the computer science curriculum.
Since 2011, a project-based learning methodology based on computer games has been designed and implemented in the artificial intelligence course at the University of the Bio-Bio.
The project aims to develop software-controlled agents (bots) programmed using the heuristic algorithms covered in the course.
This methodology has yielded good learning results; however, several challenges have been encountered during its implementation.
In this paper we show how linguistic descriptions of data can help to provide students and teachers with technical and personalized feedback about the learned algorithms.
Algorithm behavior profiles and a new Turing test for computer game bots, based on linguistic modelling of complex phenomena, are also proposed to deal with these challenges.
To demonstrate and explore the possibilities of this new technology, a web platform has been designed and implemented by one of the authors, and its incorporation into the assessment process allows us to improve the teaching-learning process.
In today's web scenario, there is a need for effective and meaningful search over the web, which is provided by the Semantic Web.
Existing search engines are keyword based.
They are weak at answering intelligent queries from the user because their results depend on the information available in web pages.
Semantic search engines, in contrast, provide efficient and relevant results, as the Semantic Web is an extension of the current web in which information is given well-defined meaning.
MetaCrawler is a search tool that uses several existing search engines and provides combined results by using their own page ranking algorithm.
This paper proposes the development of a meta-semantic search engine called SemanTelli, which works in the cloud.
SemanTelli fetches results from different semantic search engines such as Hakia, DuckDuckGo, SenseBot with the help of intelligent agents that eliminate the limitations of existing search engines.
We present a unified probabilistic framework for simultaneous trajectory estimation and planning (STEAP).
Estimation and planning problems are usually considered separately, however, within our framework we show that solving them simultaneously can be more accurate and efficient.
The key idea is to compute the full continuous-time trajectory from start to goal at each time-step.
While the robot traverses the trajectory, the history portion of the trajectory signifies the solution to the estimation problem, and the future portion of the trajectory signifies a solution to the planning problem.
Building on recent probabilistic inference approaches to continuous-time localization and mapping and continuous-time motion planning, we solve the joint problem by iteratively recomputing the maximum a posteriori trajectory conditioned on all available sensor data and cost information.
Our approach can contend with high-degree-of-freedom (DOF) trajectory spaces, uncertainty due to limited sensing capabilities, model inaccuracy, the stochastic effect of executing actions, and can find a solution in real-time.
We evaluate our framework empirically in both simulation and on a mobile manipulator.
Using blockchain technology, it is possible to create contracts that offer a reward in exchange for a trained machine learning model for a particular data set.
This would allow users to train machine learning models for a reward in a trustless manner.
The smart contract will use the blockchain to automatically validate the solution, so there would be no debate about whether the solution was correct or not.
Users who submit solutions bear no counterparty risk of not getting paid for their work.
Contracts can be created easily by anyone with a dataset, even programmatically by software agents.
This creates a market where parties who are good at solving machine learning problems can directly monetize their skillset, and where any organization or software agent that has a problem to solve with AI can solicit solutions from all over the world.
This will incentivize the creation of better machine learning models, and make AI more accessible to companies and software agents.
Advances in data collection and data storage technologies have given way to the establishment of transactional databases among companies and organizations, as they allow enormous amounts of data to be stored efficiently.
Useful knowledge can be mined from these data, which can be used in several ways depending on the nature of the data.
Quite often companies and organizations are willing to share data for the sake of mutual benefit.
However, the sharing of such data comes with risks, as problems with privacy may arise.
Sensitive data, along with sensitive knowledge inferred from this data, must be protected from unintentional exposure to unauthorized parties.
One form of the inferred knowledge is frequent patterns mined in the form of frequent itemsets from transactional databases.
The problem of protecting such patterns is known as the frequent itemset hiding problem.
In this paper we present a toolbox, which provides several implementations of frequent itemset hiding algorithms.
Firstly, we summarize the most important aspects of each algorithm.
We then introduce the architecture of the toolbox and its novel features.
Finally, we provide experimental results on real world datasets, demonstrating the efficiency of the toolbox and the convenience it offers in comparing different algorithms.
We investigate an experiential learning paradigm for acquiring an internal model of intuitive physics.
Our model is evaluated on a real-world robotic manipulation task that requires displacing objects to target locations by poking.
The robot gathered over 400 hours of experience by executing more than 100K pokes on different objects.
We propose a novel approach based on deep neural networks for modeling the dynamics of the robot's interactions directly from images, by jointly estimating forward and inverse models of dynamics.
The inverse model objective provides supervision to construct informative visual features, which the forward model can then predict and in turn regularize the feature space for the inverse model.
The interplay between these two objectives creates useful, accurate models that can then be used for multi-step decision making.
This formulation has the additional benefit that it is possible to learn forward models in an abstract feature space and thus alleviate the need of predicting pixels.
Our experiments show that this joint modeling approach outperforms alternative methods.
Serious scientific games are games whose purpose is not only fun.
In the field of science, the serious goals include crucial activities for scientists: outreach, teaching and research.
The number of serious games is increasing rapidly, in particular citizen science games, games that allow people to produce and/or analyze scientific data.
Interestingly, it is possible to build a set of rules providing a guideline to create or improve serious games.
We present arguments gathered from our own experience (Phylo, DocMolecules, HiRE-RNA contest and Pangu) as well as examples from the growing literature on scientific serious games.
Non-orthogonal multiple access (NOMA) is regarded as a candidate radio access technique for the next generation wireless networks because of its manifold spectral gains.
A two-phase cooperative relaying strategy (CRS) is proposed in this paper by exploiting the concepts of both downlink and uplink NOMA (termed DU-CNOMA).
In the proposed protocol, a transmitter acting as the source transmits a NOMA composite signal, consisting of two symbols, to the destination and the relay during the first phase, following the principle of downlink NOMA.
In the second phase, the relay forwards the symbol decoded by successive interference cancellation to the destination, whereas the source transmits a new symbol to the destination in parallel with the relay, following the principle of uplink NOMA.
The ergodic sum capacity, outage probability, and outage sum capacity are investigated comprehensively along with analytical derivations, under both perfect and imperfect successive interference cancellation.
The performance improvement of the proposed DU-CNOMA over the conventional CRS using NOMA is demonstrated through analysis and computer simulation.
Furthermore, the correctness of our analysis is confirmed by the strong agreement between simulation and analytical results.
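The two-user downlink NOMA principle used in the first phase can be illustrated with the standard achievable-rate expressions under perfect successive interference cancellation (SIC). This is a textbook sketch, not the paper's DU-CNOMA analysis; the power-allocation and channel-gain values are illustrative.

```python
import numpy as np

def downlink_noma_rates(p, a_near, g_near, g_far, n0=1.0):
    """Achievable rates (bits/s/Hz) for two-user downlink NOMA.
    p: total transmit power; a_near: power fraction of the near (strong)
    user; g_near/g_far: channel power gains; n0: noise power.
    The far user decodes its symbol treating the near user's signal as
    interference; the near user removes the far user's symbol via SIC
    (assumed perfect here) before decoding its own."""
    a_far = 1.0 - a_near
    r_far = np.log2(1 + a_far * p * g_far / (a_near * p * g_far + n0))
    r_near = np.log2(1 + a_near * p * g_near / n0)  # after perfect SIC
    return r_near, r_far
```

Averaging these rates over random (e.g. Rayleigh) channel gains gives a Monte Carlo estimate of the ergodic sum capacity studied in the abstract; imperfect SIC would add a residual interference term to the near user's denominator.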
In this paper, the basic ideas of the Event Space Theory and of analyzing events are expounded.
We then suggest how to set up an event base library when developing application software, following the design principles of the methodology above.
Finally, to explain how to apply the Event Space Theory to developing economic evaluation software, the "sewage treatment CAD" software from a national "8th Five-Year Plan" research project of the PRC is used as an example.
This software concerns economic effectiveness evaluation for construction projects.
Resource allocation with quality-of-service constraints is one of the most challenging problems in elastic optical networks and is normally formulated as a mixed-integer nonlinear programming (MINLP) optimization problem.
In this paper, we focus on novel properties of geometric optimization and provide a heuristic approach to resource allocation that is much faster than its MINLP counterpart.
Our heuristic consists of two main parts for routing/traffic ordering and power/spectrum assignment.
It aims to minimize transmitted optical power and spectrum usage subject to quality-of-service and physical constraints.
We consider three routing/traffic ordering procedures and compare them in terms of total transmitted optical power, total received noise power and total nonlinear interference including self- and cross-channel interferences.
We propose a posynomial expression for optical signal to noise ratio in which fiber nonlinearities and spontaneous emission noise have been addressed.
We also propose posynomial expressions that relate modulation spectral efficiency to its corresponding minimum required optical signal to noise ratio.
We then use the posynomial expressions to develop six geometric formulations for power/spectrum assignment part of the heuristic which are different in run time, complexity and accuracy.
Simulation results demonstrate that the proposed solution has a very good accuracy and much lower computational complexity in comparison with MINLP formulation.
For example, for the European Cost239 optical network with 46 transmit transponders, the geometric formulations can be more than 59 times faster than the MINLP counterpart.
Numerical results also reveal that in long-haul elastic optical networks, considering the product of the number of common fiber spans and the transmission bit rate is a better goal function for routing/traffic ordering sub-problem.
In this paper we introduce a simple, effective numerical method for computing differential operators of scalar and vector-valued functions on surfaces.
The key idea of our algorithm is an intrinsic and unified way to directly compute the partial derivatives of functions defined on triangular meshes that discretize the regular surfaces under consideration.
Most importantly, the divergence theorem and conservation laws on triangular meshes are fulfilled.
Using deep learning for machine learning tasks such as image classification and word embedding has recently gained much attention.
Its appealing performance across specific natural language processing (NLP) tasks, in comparison with other approaches, is the reason for its popularity.
Word embedding is the task of mapping words or phrases to a low dimensional numerical vector.
In this paper, we use deep learning to embed Wikipedia Concepts and Entities.
The English version of Wikipedia contains more than five million pages, which suggests its capability to cover many English entities, phrases, and concepts.
Each Wikipedia page is considered as a concept.
Some concepts correspond to entities, such as a person's name, an organization or a place.
Contrary to word embeddings, Wikipedia concept embeddings are not ambiguous, so there are different vectors for concepts with similar surface forms but different meanings.
We propose several approaches and evaluate their performance on concept analogy and concept similarity tasks.
The results show that the proposed approaches achieve performance comparable to, and in some cases higher than, state-of-the-art methods.
When conducting modern cybercrime investigations, evidence has often to be gathered from computer systems located at cloud-based data centres of hosting providers.
In cases where the investigation cannot rely on the cooperation of the hosting provider, or where documentation is not available, investigators often find it difficult and extremely time-consuming to identify which distinct server among many is of interest.
To address the problem of identifying these servers, in this paper a new approach to rapidly and reliably identify these cloud hosting computer systems is presented.
In the outlined approach, a handheld device is presented, composed of an embedded computer combined with a method for undetectable interception of Ethernet-based communications.
This device is tested and evaluated, and its usefulness in identifying servers of interest to an investigation is discussed.
VeriFast is a leading research prototype tool for the sound modular verification of safety and correctness properties of single-threaded and multithreaded C and Java programs.
It has been used as a vehicle for exploration and validation of novel program verification techniques and for industrial case studies; it has served well at a number of program verification competitions; and it has been used for teaching by multiple teachers independent of the authors.
However, until now, while VeriFast's operation has been described informally in a number of publications, and specific verification techniques have been formalized, a clear and precise exposition of how VeriFast works has not yet appeared.
In this article we present for the first time a formal definition and soundness proof of a core subset of the VeriFast program verification approach.
The exposition aims to be both accessible and rigorous: the text is based on lecture notes for a graduate course on program verification, and it is backed by an executable machine-readable definition and machine-checked soundness proof in Coq.
In this work, I present an optimization problem which consists of assigning entries of a stellar catalog to multiple entries of another stellar catalog such that the probability of such assignment is maximum.
I show how to model it as a Maximum Weighted Stable Set Problem, which is then used to solve a real astronomical instance, and I partially characterize the forbidden subgraphs of the family of graphs produced by this reduction.
Finally, I prove that the problem is NP-Hard.
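Since the catalog-matching task is reduced to the Maximum Weighted Stable Set Problem, a minimal illustration of that underlying problem may help. The following brute-force sketch is illustrative only (it is not the paper's reduction, and its exponential cost is only viable for toy instances, consistent with the NP-hardness result):

```python
from itertools import combinations

def max_weight_stable_set(vertices, edges, weight):
    """Brute-force maximum weighted stable (independent) set.

    Exponential in |V|; suitable only for toy instances, which is
    consistent with the problem being NP-hard in general.
    """
    adjacent = set(frozenset(e) for e in edges)
    best_set, best_w = frozenset(), 0.0
    for r in range(1, len(vertices) + 1):
        for subset in combinations(vertices, r):
            # A stable set contains no edge between any two of its vertices.
            if any(frozenset(p) in adjacent for p in combinations(subset, 2)):
                continue
            w = sum(weight[v] for v in subset)
            if w > best_w:
                best_set, best_w = frozenset(subset), w
    return best_set, best_w

# Toy instance: a path a-b-c with weights favoring the endpoints.
verts = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
weights = {"a": 2.0, "b": 3.0, "c": 2.0}
s, w = max_weight_stable_set(verts, edges, weights)
print(sorted(s), w)  # ['a', 'c'] 4.0
```

Picking the single heaviest vertex ("b", weight 3.0) is suboptimal here; the two non-adjacent endpoints together weigh 4.0.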
We present an efficient framework that can generate a coherent paragraph to describe a given video.
Previous works on video captioning usually focus on video clips.
They typically treat an entire video as a whole and generate the caption conditioned on a single embedding.
On the contrary, we consider videos with rich temporal structures and aim to generate paragraph descriptions that can preserve the story flow while being coherent and concise.
Towards this goal, we propose a new approach, which produces a descriptive paragraph by assembling temporally localized descriptions.
Given a video, it selects a sequence of distinctive clips and generates sentences thereon in a coherent manner.
Particularly, the selection of clips and the production of sentences are done jointly and progressively, driven by a recurrent network -- what to describe next depends on what has been said before.
Here, the recurrent network is learned via self-critical sequence training with both sentence-level and paragraph-level rewards.
On the ActivityNet Captions dataset, our method demonstrated the capability of generating high-quality paragraph descriptions for videos.
Compared to those by other methods, the descriptions produced by our method are often more relevant, more coherent, and more concise.
Designing fast and scalable algorithms for mining frequent itemsets has long been a prominent problem in data mining.
Apriori is one of the most widely used and popular frequent itemset mining algorithms.
Designing efficient algorithms on the MapReduce framework to process and analyze big datasets is an active area of current research.
In this paper, we have focused on the performance of MapReduce-based Apriori on homogeneous as well as heterogeneous Hadoop clusters.
We have investigated a number of factors that significantly affect the execution time of MapReduce-based Apriori running on homogeneous and heterogeneous Hadoop clusters.
These factors span both algorithmic and non-algorithmic improvements.
The factors specific to algorithmic improvements are filtered transactions and data structures.
Experimental results show how an appropriate data structure and the filtered-transactions technique drastically reduce the execution time.
The non-algorithmic factors include speculative execution, nodes with poor performance, data locality and distribution of data blocks, and parallelism control via the input split size.
We have applied strategies against these factors and fine-tuned the relevant parameters for our particular application.
Experimental results show that taking care of cluster-specific parameters yields a significant reduction in execution time.
We have also discussed issues in the MapReduce implementation of Apriori that may significantly influence performance.
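The filtered-transactions idea can be sketched in plain Python. The following is an illustrative single-machine simulation of one Apriori counting pass in a MapReduce style, not the paper's Hadoop implementation; the helper name `apriori_pass` and the toy data are assumptions for illustration:

```python
from collections import Counter
from itertools import combinations

def apriori_pass(transactions, min_support, k, frequent_items):
    """One MapReduce-style counting pass for candidate k-itemsets.

    'Filtered transactions': each transaction is first pruned to its
    frequent items, shrinking the candidate space the map step emits.
    """
    counts = Counter()  # the 'reduce' side, collapsed into one Counter
    for t in transactions:
        filtered = sorted(i for i in t if i in frequent_items)  # filter step
        for cand in combinations(filtered, k):                  # map step
            counts[cand] += 1
    return {c: n for c, n in counts.items() if n >= min_support}

txns = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
# First pass: frequent 1-itemsets (no filtering available yet).
f1 = apriori_pass(txns, min_support=3, k=1, frequent_items={"a", "b", "c"})
freq_items = {c[0] for c in f1}
# Second pass over filtered transactions: frequent 2-itemsets.
f2 = apriori_pass(txns, min_support=3, k=2, frequent_items=freq_items)
print(f2)  # each of the three pairs has support 3
```

In a real Hadoop job the map step would emit `(candidate, 1)` pairs and the reducer would sum them; the filtering happens before candidate generation in each mapper.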
Ambient backscatter is an intriguing wireless communication paradigm that allows small devices to compute and communicate by using only the power they harvest from radio-frequency (RF) signals in the air.
Ambient backscattering devices reflect existing RF signals emitted by legacy communications systems, such as digital TV broadcasting, cellular or Wi-Fi ones, which would be otherwise treated as harmful sources of interference.
This paper deals with the ultimate performance limits of ambient backscatter systems in broadband fading environments, by considering different amounts of network state information at the receivers.
After introducing a detailed signal model of the relevant communication links, we study the influence of physical parameters on the capacity of both legacy and backscatter systems.
We find that, under reasonable operative conditions, a legacy system employing multicarrier modulation can turn the RF interference arising from the backscatter process into a form of multipath diversity that can be suitably exploited to noticeably increase its performance.
Moreover, we show that, even when employing simple single-carrier modulation techniques, the backscatter system can achieve significant data rates over relatively short distances, especially when the intended recipient of the backscatter signal is co-located with the legacy transmitter, i.e., they are on the same machine.
The paper presents an extension of Shannon fuzzy entropy to intuitionistic fuzzy information.
Firstly, we presented a new formula for calculating the distance and similarity of intuitionistic fuzzy information.
Then, we constructed measures for information features like score, certainty and uncertainty.
Also, a new concept was introduced, namely escort fuzzy information.
Then, using the escort fuzzy information, Shannon's formula for intuitionistic fuzzy information was obtained.
It should be underlined that Shannon's entropy for intuitionistic fuzzy information verifies the four defining conditions of intuitionistic fuzzy uncertainty.
The measures of its two components were also identified: fuzziness (ambiguity) and incompleteness (ignorance).
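The abstract does not give the exact formulas; for orientation only, the commonly used conventions for an intuitionistic fuzzy value $(\mu, \nu)$ with $\mu + \nu \le 1$ are (these are standard definitions from the literature, not necessarily the paper's):

```latex
% An intuitionistic fuzzy value is a pair (\mu, \nu) with \mu + \nu \le 1.
\[
  s(x) = \mu(x) - \nu(x) \quad \text{(score)}, \qquad
  h(x) = \mu(x) + \nu(x) \quad \text{(certainty / accuracy)},
\]
\[
  \pi(x) = 1 - \mu(x) - \nu(x)
  \quad \text{(hesitancy, measuring incompleteness / ignorance)}.
\]
```

The fuzziness (ambiguity) component is typically largest when $\mu \approx \nu$, while the hesitancy $\pi$ captures the incompleteness component mentioned above.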
The distributionally robust Markov Decision Process (MDP) approach asks for a distributionally robust policy that achieves the maximal expected total reward under the most adversarial distribution of uncertain parameters.
In this paper, we study distributionally robust MDPs whose ambiguity sets for the uncertain parameters are of a format that can easily incorporate the uncertainty's generalized moments as well as statistical distance information.
In this way, we generalize existing works on distributionally robust MDPs with generalized-moment-based and statistical-distance-based ambiguity sets, incorporating information such as moments and dispersions from the former class into the latter class, which critically depends on empirical observations of the uncertain parameters.
We show that, under this format of ambiguity sets, the resulting distributionally robust MDP remains tractable under mild technical conditions.
To be more specific, a distributionally robust policy can be constructed by solving a sequence of one-stage convex optimization subproblems.
A scientist may publish tens or hundreds of papers over a career, but these contributions are not evenly spaced in time.
Sixty years of studies on career productivity patterns in a variety of fields suggest an intuitive and universal pattern: productivity tends to rise rapidly to an early peak and then gradually declines.
Here, we test the universality of this conventional narrative by analyzing the structures of individual faculty productivity time series, constructed from over 200,000 publications and matched with hiring data for 2453 tenure-track faculty in all 205 Ph.D.-granting computer science departments in the U.S. and Canada.
Unlike prior studies, which considered only some faculty or some institutions, or lacked common career reference points, here we combine a large bibliographic dataset with comprehensive information on career transitions that covers an entire field of study.
We show that the conventional narrative confidently describes only one fifth of faculty, regardless of department prestige or researcher gender, and the remaining four fifths of faculty exhibit a rich diversity of productivity patterns.
To explain this diversity, we introduce a simple model of productivity trajectories, and explore correlations between its parameters and researcher covariates, showing that departmental prestige predicts overall individual productivity and the timing of the transition from first- to last-author publications.
These results demonstrate the unpredictability of productivity over time, and open the door for new efforts to understand how environmental and individual factors shape scientific productivity.
Current state-of-the-art deep learning systems for visual object recognition and detection use purely supervised training with regularization such as dropout to avoid overfitting.
The performance depends critically on the amount of labeled examples, and in current practice the labels are assumed to be unambiguous and accurate.
However, this assumption often does not hold; e.g. in recognition, class labels may be missing; in detection, objects in the image may not be localized; and in general, the labeling may be subjective.
In this work we propose a generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency.
We consider a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data.
In experiments we demonstrate that our approach yields substantial robustness to label noise on several datasets.
On MNIST handwritten digits, we show that our model is robust to label corruption.
On the Toronto Face Database, we show that our model handles well the case of subjective labels in emotion recognition, achieving state-of-the-art results, and can also benefit from unlabeled face images with no modification to our method.
On the ILSVRC2014 detection challenge data, we show that our approach extends to very deep networks, high resolution images and structured outputs, and results in improved scalable detection.
Developing an intelligent vehicle which can perform human-like actions requires the ability to learn basic driving skills from a large amount of naturalistic driving data.
The algorithms become more efficient if we can decompose the complex driving tasks into motion primitives that represent the elementary compositions of driving skills.
Therefore, the purpose of this paper is to segment unlabeled trajectory data into a library of motion primitives.
By applying a probabilistic inference based on an iterative Expectation-Maximization algorithm, our method segments the collected trajectories while learning a set of motion primitives represented by the dynamic movement primitives.
The proposed method exploits the mutual dependency between the segmentation and the representation of motion primitives, together with a driving-specific initial segmentation.
By exploiting this mutual dependency and the initial condition, this paper shows how the performance of both the segmentation and the construction of the motion primitive library can be enhanced.
We also evaluate the applicability of the primitive representation method to imitation learning and motion planning algorithms.
The model is trained and validated by using the driving data collected from the Beijing Institute of Technology intelligent vehicle platform.
The results show that the proposed approach can find the proper segmentation and establish the motion primitive library simultaneously.
Most of the recent successful methods in accurate object detection and localization used some variants of R-CNN style two stage Convolutional Neural Networks (CNN) where plausible regions were proposed in the first stage then followed by a second stage for decision refinement.
Despite the simplicity of training and the efficiency in deployment, single-stage detection methods have not been as competitive when evaluated on benchmarks that consider mAP at high IoU thresholds.
In this paper, we propose a novel single-stage end-to-end trainable object detection network to overcome this limitation.
We achieved this by introducing Recurrent Rolling Convolution (RRC) architecture over multi-scale feature maps to construct object classifiers and bounding box regressors which are "deep in context".
We evaluated our method in the challenging KITTI dataset which measures methods under IoU threshold of 0.7.
We showed that with RRC, a single reduced VGG-16 based model already significantly outperformed all the previously published results.
At the time this paper was written our models ranked the first in KITTI car detection (the hard level), the first in cyclist detection and the second in pedestrian detection.
These results were not reached by the previous single stage methods.
The code is publicly available.
This paper presents an iterated local search for the fixed-charge uncapacitated network design problem with user-optimal flow (FCNDP-UOF), which concerns routing multiple commodities from their origins to their destinations by designing a network through arc selection, with the objective of minimizing the sum of the fixed costs of the selected arcs plus the sum of the variable costs associated with the flows on each arc.
Moreover, since the FCNDP-UOF is a bi-level problem, each commodity has to be transported along a shortest path, with respect to edge lengths, in the built network.
The proposed algorithm generates an initial solution using a variable-fixing heuristic.
Then a local branching strategy is applied to improve the quality of the solution.
Finally, an efficient perturbation strategy performs cycle-based moves to explore different parts of the solution space.
Computational experiments show that the proposed solution method consistently produces high-quality solutions in reasonable computational times.
We open source fingerprint Match in Box, a complete end-to-end fingerprint recognition system embedded within a 4 inch cube.
Match in Box stands in contrast to a typical bulky and expensive proprietary fingerprint recognition system which requires sending a fingerprint image to an external host for processing and subsequent spoof detection and matching.
In particular, Match in Box is a first-of-its-kind, portable, low-cost, and easy-to-assemble fingerprint reader with an enrollment database embedded within the reader's memory and an open source fingerprint spoof detector, feature extractor, and matcher all running on the reader's internal vision processing unit (VPU).
An onboard touch screen and rechargeable battery pack make this device extremely portable and ideal for applying both fingerprint authentication (1:1 comparison) and fingerprint identification (1:N search) to applications (vaccination tracking, food and benefit distribution programs, human trafficking prevention) in rural communities, especially in developing countries.
We also show that Match in Box is suited for capturing neonate fingerprints due to its high resolution (1900 ppi) cameras.
We review some of the latest approaches to analysing cardiac electrophysiology data using machine learning and predictive modelling.
Cardiac arrhythmias, particularly atrial fibrillation, are a major global healthcare challenge.
Treatment is often through catheter ablation, which involves the targeted localized destruction of regions of the myocardium responsible for initiating or perpetuating the arrhythmia.
Ablation targets are either anatomically defined, or identified based on their functional properties as determined through the analysis of contact intracardiac electrograms acquired with increasing spatial density by modern electroanatomic mapping systems.
While numerous quantitative approaches have been investigated over the past decades for identifying these critical curative sites, few have provided a reliable and reproducible advance in success rates.
Machine learning techniques, including recent deep-learning approaches, offer a potential route to gaining new insight from this wealth of highly complex spatio-temporal information that existing methods struggle to analyse.
Coupled with predictive modelling, these techniques offer exciting opportunities to advance the field and produce more accurate diagnoses and robust personalised treatment.
We outline some of these methods and illustrate their use in making predictions from the contact electrogram and augmenting predictive modelling tools, both by more rapidly predicting future states of the system and by inferring the parameters of these models from experimental observations.
Recent improvements in computing allow for the processing and analysis of very large datasets in a variety of fields.
Often the analysis requires the creation of low-rank approximations to the datasets leading to efficient storage.
This article presents and analyzes a novel approach for creating nonnegative, structured dictionaries using NMF applied to reordered pixels of single, natural images.
We reorder the pixels based on patches and present our approach in general.
We investigate our approach when using the Singular Value Decomposition (SVD) and Nonnegative Matrix Factorizations (NMF) as low-rank approximations.
Peak Signal-to-Noise Ratio (PSNR) and Mean Structural Similarity Index (MSSIM) are used to evaluate the algorithm.
We report that while the SVD provides the best reconstructions, its dictionary of vectors loses both the sign structure of the original image and details of localized image content.
In contrast, the dictionaries produced using NMF preserve the sign structure of the original image matrix and offer a nonnegative, parts-based dictionary.
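As a rough illustration of the kind of factorization involved, here is a NumPy sketch of the classical Lee-Seung multiplicative updates for NMF. This is an assumption for illustration, not necessarily the exact NMF variant used in the paper, and the patch-reordering step is omitted:

```python
import numpy as np

def nmf(V, rank, iters=500, seed=0, eps=1e-9):
    """Lee-Seung multiplicative updates for V ~ W @ H (Frobenius loss).

    The nonnegativity of W and H is preserved by construction, which is
    what yields the sign-preserving, parts-based dictionaries discussed
    above; here V stands in for a matrix of reordered image patches.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update dictionary
    return W, H

# A nonnegative matrix of (nonnegative) rank 2 should be recovered well.
V = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0]])
W, H = nmf(V, rank=2)
print(np.linalg.norm(V - W @ H))  # small reconstruction error
```

Both factors stay elementwise nonnegative throughout, unlike the singular vectors produced by the SVD.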
Mixture of Softmaxes (MoS) has been shown to be effective at addressing the expressiveness limitation of Softmax-based models.
Despite this known advantage, MoS is limited in practice by its large memory consumption and computational cost, due to the need to compute multiple Softmaxes.
In this work, we set out to unleash the power of MoS in practical applications by investigating improved word coding schemes, which could effectively reduce the vocabulary size and hence relieve the memory and computation burden.
We show both BPE and our proposed Hybrid-LightRNN lead to improved encoding mechanisms that can halve the time and memory consumption of MoS without performance losses.
With MoS, we achieve an improvement of 1.5 BLEU scores on IWSLT 2014 German-to-English corpus and an improvement of 0.76 CIDEr score on image captioning.
Moreover, on the larger WMT 2014 machine translation dataset, our MoS-boosted Transformer yields 29.5 BLEU score for English-to-German and 42.1 BLEU score for English-to-French, outperforming the single-Softmax Transformer by 0.8 and 0.4 BLEU scores respectively and achieving the state-of-the-art result on WMT 2014 English-to-German task.
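The Mixture of Softmaxes construction itself is easy to sketch. The following NumPy toy follows common MoS formulations; the shapes, the tanh projection, and all parameter names are illustrative assumptions, not the paper's exact architecture. It also shows why the output is always a valid distribution:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerically stable
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mixture_of_softmaxes(h, Wp, Wk, Wo):
    """Mixture of Softmaxes: p = sum_k pi_k * softmax(f_k(h) @ Wo).

    Illustrative shapes: h (d,), Wp (d, K) mixing logits,
    Wk (K, d, d) per-component projections, Wo (d, V) output embedding.
    Computing K softmaxes over the vocabulary is the cost the paper's
    improved word coding schemes aim to reduce.
    """
    K = Wp.shape[1]
    pi = softmax(h @ Wp)  # mixture weights, shape (K,)
    comps = np.stack([softmax(np.tanh(Wk[k] @ h) @ Wo) for k in range(K)])
    return pi @ comps     # convex combination over components, shape (V,)

rng = np.random.default_rng(0)
d, K, V = 4, 3, 10
p = mixture_of_softmaxes(rng.normal(size=d), rng.normal(size=(d, K)),
                         rng.normal(size=(K, d, d)), rng.normal(size=(d, V)))
print(round(p.sum(), 6))  # 1.0
```

A convex combination of probability distributions is itself a distribution, but unlike a single softmax it is not restricted to a low-rank log-probability matrix.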
The Message Passing Interface (MPI) is the prevalent programming model used on today's supercomputers.
Therefore, MPI library developers are looking for the best possible performance (shortest run-time) of individual MPI functions across many different supercomputer architectures.
Several MPI benchmark suites have been developed to assess the performance of MPI implementations.
Unfortunately, the outcome of these benchmarks is often neither reproducible nor statistically sound.
To overcome these issues, we show which experimental factors have an impact on the run-time of blocking collective MPI operations and how to control them.
We address the problem of process and clock synchronization in MPI benchmarks.
Finally, we present a new experimental method that allows us to obtain reproducible and statistically sound MPI measurements.
The entropy of an ergodic source is the limit of properly rescaled 1-block entropies of sources obtained by applying successive non-sequential recursive pair substitutions (see P. Grassberger, 2002, arXiv:physics/0207023; and D. Benedetto, E. Caglioti and D. Gabrielli, 2006, J. Stat. Mech. Theory Exp., doi:10.1088/1742-5468/2006/09/P09011).
In this paper we prove that the cross entropy and the Kullback-Leibler divergence can be obtained in a similar way.
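A single non-sequential recursive pair substitution step is straightforward to sketch. The following is an illustrative version for integer symbols (the cited papers define the procedure precisely; here pair frequencies are counted over all adjacent positions for simplicity):

```python
from collections import Counter

def pair_substitution_step(seq):
    """One recursive pair substitution step: replace every
    non-overlapping occurrence of the most frequent adjacent pair
    with a fresh symbol. The entropy estimate is then built from the
    properly rescaled 1-block entropies of the rewritten sequences.
    """
    pairs = Counter(zip(seq, seq[1:]))
    if not pairs:
        return seq, None
    (a, b), _ = pairs.most_common(1)[0]
    fresh = max(seq) + 1  # assumes nonnegative integer symbols
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
            out.append(fresh)  # substitute the pair left to right
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out, (a, b)

s = [0, 1, 0, 1, 0, 0, 1]
t, pair = pair_substitution_step(s)
print(t, pair)  # [2, 2, 0, 2] (0, 1)
```

Iterating this step grows the alphabet while shrinking the sequence; the rescaled 1-block entropies of the successive sequences converge to the source entropy.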
A network super point is a special kind of host that plays an important role in network management and security.
For a core network, detecting super points in real time is a demanding task because it requires substantial computing resources to keep up with the high packet rate.
Previous works try to solve this problem by using expensive memory, such as static random access memory, and multiple CPU cores.
But the number of cores in a CPU is small, and each core comes at a high price.
In this work, we use a popular parallel computing platform, the graphics processing unit (GPU), to mine a core network's super points.
We propose a double-direction hash function group that can map hosts randomly and restore them from a dense structure.
Because of the high randomness and simple processing of the double-direction hash functions, our algorithm reduces the memory requirement to less than one-fourth of that of other algorithms.
Because of its small memory requirement, our algorithm runs fast enough on a low-cost GPU, worth only 200 dollars, to handle a high-speed network such as 750 Gb/s.
No other algorithm can cope with such high-bandwidth traffic as accurately as ours on such a cheap platform.
Experiments on traffic collected from a core network demonstrate the advantages of our efficient algorithm.
Future machine to machine (M2M) communications need to support a massive number of devices communicating with each other with little or no human intervention.
Random access techniques were originally proposed to enable M2M multiple access, but suffer from severe congestion and access delay in an M2M system with a large number of devices.
In this paper, we propose a novel multiple access scheme for M2M communications based on the capacity-approaching analog fountain code to efficiently minimize the access delay and satisfy the delay requirement for each device.
This is achieved by allowing M2M devices to transmit at the same time on the same channel in an optimal probabilistic manner based on their individual delay requirements.
Simulation results show that the proposed scheme achieves a near optimal rate performance and at the same time guarantees the delay requirements of the devices.
We further propose a simple random access strategy and characterize the required overhead.
Simulation results show the proposed approach significantly outperforms the existing random access schemes currently used in long term evolution advanced (LTE-A) standard in terms of the access delay.
We performed two online surveys of Stack Overflow answerers and visitors to assess their awareness of outdated code and software licenses in Stack Overflow answers.
The answerer survey targeted 607 highly reputed Stack Overflow users and received a high response rate of 33%.
Our findings are as follows.
Although most of the code snippets in the answers are written from scratch, there are code snippets cloned from the corresponding questions, from personal or company projects, or from open source projects.
Stack Overflow answerers are aware that some of their snippets are outdated.
However, 19% of the participants report that they rarely or never fix their outdated code.
At least 98% of the answerers never include software licenses in their snippets and 69% never check for licensing conflicts with Stack Overflow's CC BY-SA 3.0 if they copy the code from other sources to Stack Overflow answers.
The visitor survey used convenience sampling and received 89 responses.
We found that 66% of the participants experienced a problem from cloning and reusing Stack Overflow snippets.
Fifty-six percent of the visitors never reported the problems back to the Stack Overflow post.
Eighty-five percent of the participants are not aware that Stack Overflow applies the CC BY-SA 3.0 license, and 62% never give attribution to the Stack Overflow posts or answers they copied code from.
Moreover, 66% of the participants do not check for licensing conflicts between the copied Stack Overflow code and their software.
With these findings, we suggest Stack Overflow raise awareness of their users, both answerers and visitors, to the problem of outdated and license-violating code snippets.
Optical flow, semantic segmentation, and surface normals represent different information modalities, yet together they bring better cues for scene understanding problems.
In this paper, we study the influence between the three modalities: how one impacts on the others and their efficiency in combination.
We employ a modular approach using a convolutional refinement network which is trained supervised but isolated from RGB images to enforce joint modality features.
To assist the training process, we create a large-scale synthetic outdoor dataset that supports dense annotation of semantic segmentation, optical flow, and surface normals.
The experimental results show positive influence among the three modalities, especially for objects' boundaries, region consistency, and scene structures.
In this paper, we propose a novel 3D human pose estimation algorithm from a single image based on neural networks.
We adopted the structure of the relational networks in order to capture the relations among different body parts.
In our method, each pair of different body parts generates features, and the average of the features over all pairs is used for 3D pose estimation.
In addition, we propose a dropout method that can be used in relational modules, which inherently imposes robustness to the occlusions.
The proposed network achieves state-of-the-art performance for 3D pose estimation on the Human3.6M dataset, and it effectively produces plausible results even in the presence of missing joints.
Recent work has shown that it is possible to train deep neural networks that are verifiably robust to norm-bounded adversarial perturbations.
Most of these methods are based on minimizing an upper bound on the worst-case loss over all possible adversarial perturbations.
While these techniques show promise, they remain hard to scale to larger networks.
Through a comprehensive analysis, we show how a careful implementation of a simple bounding technique, interval bound propagation (IBP), can be exploited to train verifiably robust neural networks that beat the state-of-the-art in verified accuracy.
While the upper bound computed by IBP can be quite weak for general networks, we demonstrate that an appropriate loss and choice of hyper-parameters allows the network to adapt such that the IBP bound is tight.
This results in a fast and stable learning algorithm that outperforms more sophisticated methods and achieves state-of-the-art results on MNIST, CIFAR-10 and SVHN.
It also allows us to obtain the first verifiably robust model on a downscaled version of ImageNet.
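The bounding technique itself is simple to illustrate. The following NumPy sketch propagates an interval through an affine layer and a ReLU; it is a minimal illustration of interval bound propagation on a toy two-layer network, not the paper's training procedure, and all weights are made up for the example:

```python
import numpy as np

def ibp_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through x -> W @ x + b."""
    mid, rad = (lo + hi) / 2, (hi - lo) / 2
    new_mid = W @ mid + b
    new_rad = np.abs(W) @ rad  # the radius grows with |W|
    return new_mid - new_rad, new_mid + new_rad

def ibp_relu(lo, hi):
    """ReLU is elementwise monotone, so bounds pass straight through."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Bound a toy 2-layer net's output over an eps-ball around x.
x, eps = np.array([1.0, 0.5]), 0.1
W1, b1 = np.array([[1.0, 2.0], [-1.0, 1.0]]), np.zeros(2)
W2, b2 = np.array([[1.0, -1.0]]), np.zeros(1)
lo, hi = x - eps, x + eps
lo, hi = ibp_relu(*ibp_affine(lo, hi, W1, b1))
lo, hi = ibp_affine(lo, hi, W2, b2)
print(lo, hi)  # sound (possibly loose) bounds on the network output
```

Each step is a few matrix operations, which is why IBP is so cheap compared to tighter relaxations; the paper's contribution is showing that training can make these loose-looking bounds tight.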
A/B tests are randomized experiments frequently used by companies that offer services on the Web for assessing the impact of new features.
During an experiment, each user is randomly redirected to one of two versions of the website, called treatments.
Several response models were proposed to describe the behavior of a user in a social network website, where the treatment assigned to her neighbors must be taken into account.
However, there is no consensus as to which model should be applied to a given dataset.
In this work, we propose a new response model, derive theoretical limits for the estimation error of several models, and obtain empirical results for cases where the response model was misspecified.
Persistent Homology (PH) allows tracking homology features like loops, holes and their higher-dimensional analogs, along with a single-parameter family of nested spaces.
Currently, computing descriptors for complex data characterized by multiple functions is becoming an important task in several applications, including physics, chemistry, medicine, geography, etc.
Multiparameter Persistent Homology (MPH) generalizes persistent homology opening to the exploration and analysis of shapes endowed with multiple filtering functions.
Still, computational constraints prevent MPH from being feasible over real-sized data.
In this paper, we consider discrete Morse Theory as a tool to simplify the computation of MPH on a multiparameter dataset.
We propose a new algorithm, well suited for parallel and distributed implementations and we provide the first evaluation of the impact on MPH computations of a preprocessing approach.
Traffic speed is a key indicator for the efficiency of an urban transportation system.
Accurate modeling of the spatiotemporally varying traffic speed thus plays a crucial role in urban planning and development.
This paper addresses the problem of efficient fine-grained traffic speed prediction using big traffic data obtained from static sensors.
Gaussian processes (GPs) have been previously used to model various traffic phenomena, including flow and speed.
However, GPs do not scale with big traffic data due to their cubic time complexity.
In this work, we address their efficiency issues by proposing local GPs to learn from and make predictions for correlated subsets of data.
The main idea is to quickly group speed variables in both spatial and temporal dimensions into a finite number of clusters, so that future and unobserved traffic speed queries can be heuristically mapped to one of such clusters.
A local GP corresponding to that cluster can then be trained on the fly to make predictions in real-time.
We call this method localization.
We use non-negative matrix factorization for localization and propose simple heuristics for cluster mapping.
We additionally leverage the expressiveness of GP kernel functions to model road network topology and incorporate side information.
Extensive experiments using real-world traffic data collected in the two U.S. cities of Pittsburgh and Washington, D.C., show that our proposed local GPs significantly improve both runtime performances and prediction accuracies compared to the baseline global and local GPs.
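The localization idea can be sketched compactly. In the sketch below a nearest-centroid rule stands in for the paper's NMF-based cluster mapping, the GP uses a plain RBF kernel, and the 1-D synthetic data is purely illustrative:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """RBF kernel between row sets A (n, d) and B (m, d)."""
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d / (2 * ls ** 2))

def local_gp_predict(Xtr, ytr, centroids, labels, xq, noise=1e-4):
    """Map the query to its nearest cluster, then train a GP only on
    that cluster's data -- the 'localization' idea in miniature.
    Training on a small cluster avoids the cubic cost of a global GP.
    """
    c = np.argmin(((centroids - xq) ** 2).sum(-1))  # cluster mapping
    X, y = Xtr[labels == c], ytr[labels == c]
    K = rbf(X, X) + noise * np.eye(len(X))          # local GP fit
    k = rbf(xq[None, :], X)
    return float(k @ np.linalg.solve(K, y))         # posterior mean

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0])
centroids = np.array([[2.5], [7.5]])
labels = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
pred = local_gp_predict(X, y, centroids, labels, np.array([3.0]))
print(pred)  # close to sin(3.0)
```

Only the roughly half of the data assigned to the query's cluster enters the O(n^3) solve, which is the source of the runtime gains reported above.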
Exponential integrators are special time discretization methods where the traditional linear system solves used by implicit schemes are replaced with computing the action of matrix exponential-like functions on a vector.
A very general formulation of exponential integrators is offered by the Exponential Propagation Iterative methods of Runge-Kutta type (EPIRK) family of schemes.
The use of Jacobian approximations is an important strategy to drastically reduce the overall computational costs of implicit schemes while maintaining the quality of their solutions.
This paper extends the EPIRK class to allow the use of inexact Jacobians as arguments of the matrix exponential-like functions.
Specifically, we develop two new families of methods: EPIRK-W integrators that can accommodate any approximation of the Jacobian, and EPIRK-K integrators that rely on a specific Krylov-subspace projection of the exact Jacobian.
Classical order conditions theories are constructed for these families.
A practical EPIRK-W method of order three and an EPIRK-K method of order four are developed.
Numerical experiments indicate that the methods proposed herein are computationally favorable when compared to existing exponential integrators.
Pedestrian detection has achieved great improvements in recent years, while complex occlusion handling is still one of the most important problems.
To take advantage of the body parts and context information for pedestrian detection, we propose the part and context network (PCN) in this work.
PCN specially utilizes two branches which detect the pedestrians through body parts semantic and context information, respectively.
In the Part Branch, the semantic information of body parts can communicate with each other via recurrent neural networks.
In the Context Branch, we adopt a local competition mechanism for adaptive context scale selection.
By combining the outputs of all branches, we develop a strong complementary pedestrian detector with a lower miss rate and better localization accuracy, especially for occluded pedestrians.
Comprehensive evaluations on two challenging pedestrian detection datasets (i.e., Caltech and INRIA) demonstrate the effectiveness of the proposed PCN.
One of the challenges of using machine learning techniques with medical data is the frequent dearth of source image data on which to train.
A representative example is automated lung cancer diagnosis, where nodule images need to be classified as suspicious or benign.
In this work we propose an automatic synthetic lung nodule image generator.
Our 3D shape generator is designed to augment the variety of 3D images.
Our proposed system takes root in autoencoder techniques, and we provide extensive experimental characterization that demonstrates its ability to produce quality synthetic images.
Nowadays, a key challenge for supermarket chains is to offer personalized services to their customers.
Next basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services.
Current approaches cannot simultaneously capture the different factors influencing the customer's decision process: co-occurrence, sequentiality, periodicity, and recurrency of the purchased items.
To this aim, we define a pattern named Temporal Annotated Recurring Sequence (TARS), able to capture all these factors simultaneously and adaptively.
We define a method to extract TARS and develop a predictor for the next basket, named TBP (TARS Based Predictor), that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items.
By adopting TBP, supermarket chains could offer tailored suggestions to each individual customer, which in turn could effectively speed up their shopping sessions.
Extensive experimentation shows that TARS are able to explain customer purchase behavior and that TBP outperforms state-of-the-art competitors.
In the quest for knowledge about how to make good process models, recent research focus is shifting from studying the quality of process models to studying the process of process modeling (often abbreviated as PPM) itself.
This paper reports on our efforts to visualize this specific process in such a way that relevant characteristics of the modeling process can be observed graphically.
By recording each modeling operation in a modeling process, one can build an event log that can be used as input for the PPMChart Analysis plug-in we implemented in ProM.
The graphical representation this plug-in generates allows for the discovery of different patterns of the process of process modeling.
It also provides different views on the process of process modeling (by configuring and filtering the charts).
In research activities regarding Magnetic Resonance Imaging in medicine, simulation tools with a universal approach are rare.
Usually, simulators are developed and used which tend to be restricted to a particular, small range of applications.
This led to the design and implementation of a new simulator PARSPIN, the subject of this thesis.
In medical applications, the Bloch equation is a well-suited mathematical model of the underlying physics with a wide scope.
In this thesis, it is shown how analytical solutions of the Bloch equation can be found, which promise substantial execution time advantages over numerical solution methods.
From these analytical solutions of the Bloch equation, a new formalism for the description and the analysis of complex imaging experiments is derived, the K-t formalism.
It is shown that modern imaging methods can be better explained by the K-t formalism than by observing and analysing the magnetization of each spin of a spin ensemble.
Various approaches for a numerical simulation of Magnetic Resonance imaging are discussed.
It is shown that a simulation tool based on the K-t formalism promises a substantial gain in execution time.
Proper spatial discretization according to the sampling theorem, a topic rarely discussed in the literature, is universally derived from the K-t formalism in this thesis.
A spin-based simulator is an application with high demands to computing facilities even on modern hardware.
In this thesis, two approaches for a parallelized software architecture are designed, analysed and evaluated with regard to a reduction of execution time.
A number of possible applications in research and education are demonstrated.
For a choice of imaging experiments, results produced both experimentally and by simulation are compared.
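The execution-time advantage of analytical Bloch solutions that the thesis argues for can be sketched with the free-precession case, where a closed-form solution exists: transverse magnetization rotates at the Larmor frequency while decaying with T2, and longitudinal magnetization recovers with T1. The parameter values and function names below are illustrative, not taken from PARSPIN.

```python
import numpy as np

# Closed-form free-precession solution of the Bloch equation versus a
# small-step forward-Euler integration of the same ODE. The analytic form
# costs a handful of flops per evaluation; the numeric path needs thousands
# of steps for comparable accuracy, which is the thesis's speed argument.
def bloch_analytic(M0, omega, T1, T2, Meq, t):
    mx = (M0[0] * np.cos(omega * t) + M0[1] * np.sin(omega * t)) * np.exp(-t / T2)
    my = (-M0[0] * np.sin(omega * t) + M0[1] * np.cos(omega * t)) * np.exp(-t / T2)
    mz = Meq + (M0[2] - Meq) * np.exp(-t / T1)
    return np.array([mx, my, mz])

def bloch_numeric(M0, omega, T1, T2, Meq, t, n=20000):
    dt = t / n
    M = np.array(M0, dtype=float)
    for _ in range(n):
        dM = np.array([omega * M[1] - M[0] / T2,
                       -omega * M[0] - M[1] / T2,
                       (Meq - M[2]) / T1])
        M = M + dt * dM
    return M

M0 = [1.0, 0.0, 0.0]
Ma = bloch_analytic(M0, omega=2 * np.pi * 5, T1=1.0, T2=0.1, Meq=1.0, t=0.2)
Mn = bloch_numeric(M0, omega=2 * np.pi * 5, T1=1.0, T2=0.1, Meq=1.0, t=0.2)
```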
Deep generative models are tremendously successful in learning low-dimensional latent representations that well-describe the data.
These representations, however, tend to distort relationships between points, i.e., pairwise distances often fail to reflect semantic similarities.
This renders unsupervised tasks, such as clustering, difficult when working with the latent representations.
We demonstrate that taking the geometry of the generative model into account is sufficient to make simple clustering algorithms work well over latent representations.
Leaning on the recent finding that deep generative models constitute stochastically immersed Riemannian manifolds, we propose an efficient algorithm for computing geodesics (shortest paths) and distances in the latent space, while taking its distortion into account.
We further propose a new architecture for modeling uncertainty in variational autoencoders, which is essential for understanding the geometry of deep generative models.
Experiments show that the geodesic distance is very likely to reflect the internal structure of the data.
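A rough way to convey the idea of path-based (geodesic) distances is to connect latent codes in a nearest-neighbour graph and run shortest paths, so distances follow the data manifold instead of cutting straight through the ambient space. This is only a sketch of the concept; the paper computes true Riemannian geodesics under the stochastically immersed manifold view.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist

# Graph-based stand-in for latent-space geodesics: k-nearest-neighbour graph
# weighted by Euclidean edge lengths, then all-pairs shortest paths.
def graph_geodesics(Z, k=2):
    D = cdist(Z, Z)
    n = len(Z)
    rows, cols, vals = [], [], []
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:       # skip self at position 0
            rows.append(i); cols.append(j); vals.append(D[i, j])
    G = csr_matrix((vals, (rows, cols)), shape=(n, n))
    return shortest_path(G, directed=False)

# Latent codes on a circle: the along-the-manifold distance between opposite
# points (~pi) exceeds the straight-line distance (2).
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
Z = np.c_[np.cos(t), np.sin(t)]
G = graph_geodesics(Z)
```

The gap between G[0, 30] ≈ π and the Euclidean value 2 is precisely the kind of distortion that makes straight-line clustering in latent space unreliable.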
In this paper, we extend state of the art Model Predictive Control (MPC) approaches to generate safe bipedal walking on slippery surfaces.
In this setting, we formulate walking as a trade off between realizing a desired walking velocity and preserving robust foot-ground contact.
Exploiting this formulation inside MPC, we show that safe walking on various flat terrains can be achieved by trading off three main attributes, i.e., walking velocity tracking, Zero Moment Point (ZMP) modulation, and Required Coefficient of Friction (RCoF) regulation.
Simulation results show that increasing the walking velocity increases the possibility of slippage, while reducing the slippage possibility conflicts with reducing the tip-over possibility of the contact and vice versa.
Simulators are the most dominant and prominent tools for analyzing and investigating different types of networks.
Simulations can be executed at lower cost than large-scale experiments, since fewer computational resources are required, and a carefully designed simulation model can be more practical than any well-developed mathematical model.
Generally, P2P research follows the principle of simulating first and then experimenting in the real world, and there is no reason that simulation results should not be reproducible.
However, a lack of standard documentation makes verification of results harder, and such poor documentation has also made implementing well-known overlay algorithms difficult.
This paper describes the different types of existing P2P simulators, provides a survey and comparison of them, and identifies the best simulator among them.
Generative adversarial networks have been able to generate striking results in various domains.
This generation capability can be quite general, as the networks gain a deep understanding of the data distribution.
In many domains, this data distribution consists of anomalies and normal data, with the anomalies commonly occurring relatively less, creating datasets that are imbalanced.
The capabilities that generative adversarial networks offer can be leveraged to examine these anomalies and, by creating synthetic anomalies, help alleviate the challenge that imbalanced datasets pose.
This anomaly generation can be specifically beneficial in domains that have costly data creation processes as well as inherently imbalanced datasets.
One of the domains that fits this description is the host-based intrusion detection domain.
In this work, the ADFA-LD dataset is chosen as the dataset of interest, containing system calls of small-footprint next-generation attacks.
The data is first converted into images, and then a Cycle-GAN is used to create images of anomalous data from images of normal data.
The generated data is combined with the original dataset and is used to train a model to detect anomalies.
By doing so, it is shown that the classification results are improved, with the AUC rising from 0.55 to 0.71, and the anomaly detection rate rising from 17.07% to 80.49%.
The results are also compared to SMOTE, showing the potential presented by generative adversarial networks in anomaly generation.
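For reference, the SMOTE baseline compared against above synthesizes minority-class samples by interpolating between a minority point and one of its k nearest minority neighbours. A minimal numpy sketch follows; the paper's main approach instead trains a Cycle-GAN on image-encoded system-call traces.

```python
import numpy as np

# SMOTE-style oversampling: each synthetic point is a random convex
# combination of a minority sample and one of its k nearest minority
# neighbours, so synthetic points stay inside the minority region.
def smote(X_min, n_new, k=3, seed=0):
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = ((X_min - X_min[i]) ** 2).sum(axis=1)
        j = rng.choice(np.argsort(d)[1:k + 1])     # a nearby minority point
        out.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synth = smote(X_min, n_new=10)
```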
This chapter derives the properties of light from the properties of processing, including its ability to be both a wave and a particle, to respond to objects it doesn't physically touch, to take all paths to a destination, to choose a route after it arrives, and to spin both ways at once as it moves.
Here a photon is an entity program spreading as a processing wave of instances.
It becomes a "particle" if any part of it overloads the grid network that runs it, causing the photon program to reboot and restart at a new node.
The "collapse of the wave function" is how quantum processing creates what we call a physical photon.
This informational approach gives insights into issues like the law of least action, entanglement, superposition, counterfactuals, the holographic principle and the measurement problem.
The conceptual cost is that physical reality is a quantum processing output, i.e. virtual.
Dynamic topic modeling facilitates the identification of topical trends over time in temporal collections of unstructured documents.
We introduce a novel unsupervised neural dynamic topic model named Recurrent Neural Network-Replicated Softmax Model (RNN-RSM), where the topics discovered at each time step influence topic discovery in subsequent time steps.
We account for the temporal ordering of documents by explicitly modeling a joint distribution of latent topical dependencies over time, using distributional estimators with temporal recurrent connections.
Applying RNN-RSM to 19 years of articles on NLP research, we demonstrate that, compared to state-of-the-art topic models, RNN-RSM shows better generalization, topic interpretation, evolution, and trends.
We also introduce a metric (named SPAN) to quantify the capability of a dynamic topic model to capture word evolution in topics over time.
Effective information analysis generally boils down to properly identifying the structure or geometry of the data, which is often represented by a graph.
In some applications, this structure may be partly determined by design constraints or pre-determined sensing arrangements, like in road transportation networks for example.
In general, though, the data structure is not readily available and is difficult to define.
In particular, the global smoothness assumptions, that most of the existing works adopt, are often too general and unable to properly capture localized properties of data.
In this paper, we go beyond this classical data model and rather propose to represent information as a sparse combination of localized functions that live on a data structure represented by a graph.
Based on this model, we focus on the problem of inferring the connectivity that best explains the data samples at different vertices of a graph that is a priori unknown.
We concentrate on the case where the observed data is actually the sum of heat diffusion processes, which is a quite common model for data on networks or other irregular structures.
We cast a new graph learning problem and solve it with an efficient nonconvex optimization algorithm.
Experiments on both synthetic and real world data finally illustrate the benefits of the proposed graph learning framework and confirm that the data structure can be efficiently learned from data observations only.
We believe that our algorithm will help solving key questions in diverse application domains such as social and biological network analysis where it is crucial to unveil proper geometry for data understanding and inference.
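The heat-diffusion signal model can be made concrete with a small sketch: an observation is a sum of heat kernels exp(−τL) applied to unit impulses at a few source vertices. The graph here (a path on 10 vertices) and τ are illustrative; the paper's task is the inverse one, inferring L from such observations.

```python
import numpy as np
from scipy.linalg import expm

# Generate a heat-diffusion signal on a known graph Laplacian L:
# x = sum_s exp(-tau * L) e_s, i.e. localized heat kernels at the sources.
def heat_signal(L, sources, tau):
    H = expm(-tau * L)                       # heat kernel on the graph
    x = np.zeros(L.shape[0])
    for s in sources:
        x += H[:, s]                         # diffusion from an impulse at s
    return x

n = 10
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)   # path graph
L = np.diag(A.sum(axis=1)) - A                                  # Laplacian
x = heat_signal(L, sources=[2], tau=1.0)
```

Because L has constant row sums of zero, each kernel column sums to one and the signal stays localized around its source, which is exactly the kind of structure the global-smoothness assumptions criticized above fail to capture.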
Computational models for sarcasm detection have often relied on the content of utterances in isolation.
However, speaker's sarcastic intent is not always obvious without additional context.
Focusing on social media discussions, we investigate two issues: (1) does modeling of conversation context help in sarcasm detection and (2) can we understand what part of conversation context triggered the sarcastic reply.
To address the first issue, we investigate several types of Long Short-Term Memory (LSTM) networks that can model both the conversation context and the sarcastic response.
We show that the conditional LSTM network (Rocktaschel et al., 2015) and LSTM networks with sentence level attention on context and response outperform the LSTM model that reads only the response.
To address the second issue, we present a qualitative analysis of attention weights produced by the LSTM models with attention and discuss the results compared with human performance on the task.
Distributed computing uses many systems to solve large-scale problems.
The growth of high-speed broadband networks in developed and developing countries, the continual increase in computing power, and the rapid growth of the Internet have changed the way society manages information and information services.
Historically, the state of computing has gone through a series of platform and environmental changes.
Distributed computing holds great promise for using computer systems effectively.
As a result, supercomputer sites and data centers have shifted from providing high-performance floating-point computing capabilities to concurrently servicing huge numbers of requests from billions of users.
A distributed computing system uses multiple computers to solve large-scale problems over the Internet; it has become data-intensive and network-centric.
The applications of distributed computing have become increasingly widespread.
In distributed computing, the main emphasis is on large-scale resource sharing and achieving the best possible performance.
In this article, we review the work done in the area of distributed computing paradigms, with the main emphasis on the evolving area of cloud computing.
Most researchers acknowledge an intrinsic hierarchy in the scholarly journals ('journal rank') that they submit their work to, and adjust not only their submission but also their reading strategies accordingly.
On the other hand, much has been written about the negative effects of institutionalizing journal rank as an impact measure.
So far, contributions to the debate concerning the limitations of journal rank as a scientific impact assessment tool have either lacked data, or relied on only a few studies.
In this review, we present the most recent and pertinent data on the consequences of our current scholarly communication system with respect to various measures of scientific quality (such as utility/citations, methodological soundness, expert ratings or retractions).
These data corroborate previous hypotheses: using journal rank as an assessment tool is bad scientific practice.
Moreover, the data lead us to argue that any journal rank (not only the currently-favored Impact Factor) would have this negative impact.
Therefore, we suggest that abandoning journals altogether, in favor of a library-based scholarly communication system, will ultimately be necessary.
This new system will use modern information technology to vastly improve the filter, sort and discovery functions of the current journal system.
The original problem of supervised classification considers the task of automatically assigning objects to their respective classes on the basis of numerical measurements derived from these objects.
Classifiers are the tools that implement the actual functional mapping from these measurements---also called features or inputs---to the so-called class label---or output.
The fields of pattern recognition and machine learning study ways of constructing such classifiers.
The main idea behind supervised methods is that of learning from examples: given a number of example input-output relations, to what extent can the general mapping be learned that takes any new and unseen feature vector to its correct class?
This chapter provides a basic introduction to the underlying ideas of how to approach a supervised classification problem.
In addition, it provides an overview of some specific classification techniques, delves into the issues of object representation and classifier evaluation, and (very) briefly covers some variations on the basic supervised classification task that may also be of interest to the practitioner.
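The learning-from-examples idea can be illustrated with one of the simplest classifiers, the nearest-mean rule: estimate a prototype (mean feature vector) per class from labelled examples, then assign any new feature vector to the class of the closest prototype. The synthetic Gaussian data below is a minimal example only, not from the chapter.

```python
import numpy as np

# Nearest-mean classifier: the functional mapping from features to class
# labels is fit from example input-output pairs, as described in the text.
class NearestMean:
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        d = ((X[:, None, :] - self.means_[None, :, :]) ** 2).sum(axis=-1)
        return self.classes_[d.argmin(axis=1)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(3.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
acc = (NearestMean().fit(X, y).predict(X) == y).mean()
```

Evaluating on the training set, as done here for brevity, is exactly the pitfall the chapter's discussion of classifier evaluation warns against; held-out data gives the honest estimate of generalization.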
Policy enforcers are sophisticated runtime components that can prevent failures by enforcing the correct behavior of the software.
While a single enforcer can be easily designed focusing only on the behavior of the application that must be monitored, the effect of multiple enforcers that enforce different policies might be hard to predict.
So far, mechanisms to resolve interferences between enforcers have been based on priority mechanisms and heuristics.
Although these methods provide a mechanism to take decisions when multiple enforcers try to affect the execution at a same time, they do not guarantee the lack of interference on the global behavior of the system.
In this paper we present a verification strategy that can be exploited to discover interferences between sets of enforcers and thus safely identify a-priori the enforcers that can co-exist at run-time.
In our evaluation, we applied our verification method to several policy enforcers for Android and discovered several incompatibilities.
In the context of personalized medicine, text mining methods pose an interesting option for identifying disease-gene associations, as they can be used to generate novel links between diseases and genes which may complement knowledge from structured databases.
The most straightforward approach to extract such links from text is to rely on a simple assumption postulating an association between all genes and diseases that co-occur within the same document.
However, this approach (i) tends to yield a number of spurious associations, (ii) does not capture different relevant types of associations, and (iii) is incapable of aggregating knowledge that is spread across documents.
Thus, we propose an approach in which disease-gene co-occurrences and gene-gene interactions are represented in an RDF graph.
A machine learning-based classifier is trained that incorporates features extracted from the graph to separate disease-gene pairs into valid disease-gene associations and spurious ones.
On the manually curated Genetic Testing Registry, our approach yields a 30-point increase in F1 score over a plain co-occurrence baseline.
A distributed consensus algorithm for estimating the maximum value of the initial measurements in a sensor network with communication noise is proposed.
In the absence of communication noise, max estimation can be done by updating the state value with the largest received measurements in every iteration at each sensor.
In the presence of communication noise, however, the maximum estimate will incorrectly drift and the estimate at each sensor will diverge.
As a result, a soft-max approximation together with a non-linear consensus algorithm is introduced herein.
A design parameter controls the trade-off between the soft-max error and convergence speed.
An analysis of this trade-off gives a guideline towards how to choose the design parameter for the max estimate.
We also show that if some prior knowledge of the initial measurements is available, the consensus process can converge faster by using an optimal step size in the iterative algorithm.
A shifted non-linear bounded transmit function is also introduced for faster convergence when sensor nodes have some prior knowledge of the initial measurements.
Simulation results corroborating the theory are also provided.
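The soft-max idea can be sketched in a few lines: each node maps its measurement through exp(β·), the network averages via consensus, and a logarithm recovers an approximate maximum at every node. This noise-free numpy illustration with an assumed ring topology makes the role of the design parameter β visible; the paper's algorithm additionally handles communication noise with a non-linear bounded transmit function.

```python
import numpy as np

# Soft-max consensus: max_i x_i is approximated by log(sum_i exp(beta*x_i))/beta,
# computed distributedly as log(N * average of exp(beta*x_i))/beta.
# Larger beta shrinks the soft-max bias; the paper analyzes how it trades off
# against convergence in the noisy setting.
def softmax_consensus(x, adjacency, beta=10.0, iters=200, step=0.1):
    y = np.exp(beta * x)
    L = np.diag(adjacency.sum(axis=1)) - adjacency   # graph Laplacian
    for _ in range(iters):
        y = y - step * L @ y                          # average-consensus step
    return np.log(len(x) * y) / beta                  # per-node max estimate

A = np.zeros((5, 5))                                  # ring of 5 sensors
for i in range(5):
    A[i, (i + 1) % 5] = A[(i + 1) % 5, i] = 1.0
x = np.array([0.2, 0.9, 0.4, 0.1, 0.6])
est = softmax_consensus(x, A)
```

All nodes converge to about 0.906 here rather than the true maximum 0.9; that residual gap is the soft-max error the design parameter controls.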
In this paper we consider regular low-density parity-check codes over a binary-symmetric channel in the decoding regime.
We prove that up to a certain noise threshold the bit-error probability of the bit-sampling decoder converges in mean to zero over the code ensemble and the channel realizations.
To arrive at this result we show that the bit-error probability of the sampling decoder is equal to the derivative of a Bethe free entropy.
The method that we developed is new and is based on convexity of the free entropy and loop calculus.
Convexity is needed to exchange limit and derivative and the loop series enables us to express the difference between the bit-error probability and the Bethe free entropy.
We control the loop series using combinatorial techniques and a first moment method.
We stress that our method is versatile and we believe that it can be generalized for LDPC codes with general degree distributions and for asymmetric channels.
Myerson derived a simple and elegant solution to the single-parameter revenue-maximization problem in his seminal work on optimal auction design assuming the usual model of quasi-linear utilities.
In this paper, we consider a slight generalization of this usual model---from linear to convex "perceived" payments.
This more general problem does not appear to admit a solution as simple and elegant as Myerson's.
While some of Myerson's results extend to our setting, like his payment formula (suitably adjusted), others do not.
For example, we observe that the solutions to the Bayesian and the robust (i.e., non-Bayesian) optimal auction design problems in the convex perceived payment setting do not coincide like they do in the case of linear payments.
We therefore study the two problems in turn.
We derive an upper and a heuristic lower bound on expected revenue in our setting.
These bounds are easily computed pointwise, and yield monotonic allocation rules, so can be supported by Myerson payments (suitably adjusted).
In this way, our bounds yield heuristics that approximate the optimal robust auction, assuming convex perceived payments.
We close with experiments, the final set of which massages the output of one of the closed-form heuristics for the robust problem into an extremely fast, near-optimal heuristic solution to the Bayesian optimal auction design problem.
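For readers less familiar with the linear-payment case being generalized, here is a tiny sketch of Myerson's classical allocation rule for values uniform on [0, 1], where the virtual value is φ(v) = v − (1 − F(v))/f(v) = 2v − 1 and the reserve price is 1/2. Function names are illustrative.

```python
# Classical (linear-payment) Myerson allocation for i.i.d. values uniform on
# [0, 1]: the item goes to the bidder with the highest virtual value, and only
# if that virtual value is non-negative (value above the reserve 1/2).
def virtual_value_uniform(v):
    return 2.0 * v - 1.0          # phi(v) = v - (1 - v)/1 for U[0, 1]

def myerson_winner(bids):
    phis = [virtual_value_uniform(b) for b in bids]
    best = max(range(len(bids)), key=lambda i: phis[i])
    return best if phis[best] >= 0 else None   # None: seller keeps the item

winner = myerson_winner([0.3, 0.8])
```

Because this allocation rule is monotone in each bidder's value, it can be supported by Myerson payments; the paper's bounds for convex perceived payments are built to preserve exactly this monotonicity.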
Different theories posit different sources for feelings of well-being and happiness.
Appraisal theory grounds our emotional responses in our goals and desires and their fulfillment, or lack of fulfillment.
Self Determination theory posits that the basis for well-being rests on our assessment of our competence, autonomy, and social connection.
Surveys that measure happiness empirically note that people require their basic needs for food and shelter to be met, but beyond that tend to be happiest when socializing, eating, or having sex.
We analyze a corpus of private microblogs from a well-being application called ECHO, where users label each written post about daily events with a happiness score between 1 and 9.
Our goal is to ground the linguistic descriptions of events that users experience in theories of well-being and happiness, and then examine the extent to which different theoretical accounts can explain the variance in the happiness scores.
We show that recurrent event types, such as OBLIGATION and INCOMPETENCE, which affect people's feelings of well-being are not captured in current lexical or semantic resources.
We focus on robust and efficient iterative solvers for the pressure Poisson equation in incompressible Navier-Stokes problems.
Preconditioned Krylov subspace methods are popular for these problems, with BiCGStab and GMRES(m) most frequently used for nonsymmetric systems.
BiCGStab is popular because its iterations are cheap, but it may fail for stiff problems, especially early on, when the initial guess is far from the solution.
Restarted GMRES is more robust in this phase, but restarting may lead to very slow convergence.
Therefore, we evaluate the rGCROT method for these systems.
This method recycles a selected subspace of the search space (called recycle space) after a restart.
This generally improves the convergence drastically compared with GMRES(m).
Recycling subspaces is also advantageous for subsequent linear systems, if the matrix changes slowly or is constant.
However, rGCROT iterations are still expensive in memory and computation time compared with those of BiCGStab.
Hence, we propose a new, hybrid approach that combines the cheap iterations of BiCGStab with the robustness of rGCROT.
For the first few time steps the algorithm uses rGCROT and builds an effective recycle space, and then it recycles that space in the rBiCGStab solver.
We evaluate rGCROT on a turbulent channel flow problem, and we evaluate both rGCROT and the new, hybrid combination of rGCROT and rBiCGStab on a porous medium flow problem.
We see substantial performance gains on both problems.
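The two stock building blocks behind the hybrid can be contrasted with SciPy's solvers on a small nonsymmetric, diagonally dominant system. The recycling variants (rGCROT, rBiCGStab) are not available in SciPy, so this sketch only illustrates the restarted-GMRES versus cheap-BiCGStab choice, not the recycling itself.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import bicgstab, gmres

# A small tridiagonal nonsymmetric system standing in for a discretized
# pressure Poisson / convection-diffusion operator.
n = 200
A = diags([-1.2 * np.ones(n - 1), 2.5 * np.ones(n), -0.8 * np.ones(n - 1)],
          offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

x_g, info_g = gmres(A, b, restart=20)   # robust, but restarts can stall
x_b, info_b = bicgstab(A, b)            # cheaper iterations, less robust
```

On an easy system like this both converge; the paper's point is that on hard early time steps only the GMRES-type solver is reliable, after which its recycle space makes the cheap BiCGStab-type iterations safe to use.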
This paper gives an introduction to the problem of mapping simple polygons with autonomous agents.
We focus on minimalistic agents that move from vertex to vertex along straight lines inside a polygon, using their sensors to gather local observations at each vertex.
Our attention revolves around the question of whether a given configuration of sensors and movement capabilities allows the agents to capture enough data to draw conclusions about the global layout of the polygon.
In particular, we study the problem of reconstructing the visibility graph of a simple polygon by an agent moving either inside or on the boundary of the polygon.
Our aim is to provide insight about the algorithmic challenges faced by an agent trying to map a polygon.
We present an overview of techniques for solving this problem with agents that are equipped with simple sensorial capabilities.
We illustrate these techniques on examples with sensors that measure angles between lines of sight or identify the previous location.
We give an overview over related problems in combinatorial geometry as well as graph exploration.
A model of a geometric algorithm is introduced, and the methodology of its operation for the dynamic partitioning of data spaces is presented.
In this paper, we consider the problem of machine reading task when the questions are in the form of keywords, rather than natural language.
In recent years, researchers have achieved significant success on machine reading comprehension tasks, such as SQuAD and TriviaQA.
These datasets provide a natural language question sentence and a pre-selected passage, and the goal is to answer the question according to the passage.
However, in the situation of interacting with machines by means of text, people are more likely to raise a query in form of several keywords rather than a complete sentence.
The keyword-based query comprehension is a new challenge, because small variations to a question may completely change its semantic information and thus yield different answers.
In this paper, we propose a novel neural network system that consists of a Demand Optimization Model, based on passage-attention neural machine translation, and a Reader Model that can find the answer given the optimized question.
The Demand Optimization Model optimizes the original query and outputs multiple reconstructed questions; the Reader Model then takes the new questions as input and locates the answers in the passage.
To make predictions robust, an evaluation mechanism scores the reconstructed questions so that the final answer strikes a good balance between the quality of the Demand Optimization Model and that of the Reader Model.
Experimental results on several datasets show that our framework significantly improves multiple strong baselines on this challenging task.
The use of social media for innovative entrepreneurship has received little attention in the literature, especially in the context of Knowledge Intensive Business Services (KIBS).
Therefore, this paper focuses on bridging this gap by applying text mining and sentiment analysis techniques to identify the innovative entrepreneurship reflected by these companies in their social media.
Finally, we present and analyze the results of our quantitative analysis of 23,483 posts from eleven Spanish and Italian consultancy KIBS Twitter usernames and keywords, using data interpretation techniques such as clustering and topic modeling.
This paper suggests that there is a significant gap between the perceived potential of social media and the entrepreneurial behaviors observed in the social context of business-to-business (B2B) companies.
Scripts define knowledge about how everyday scenarios (such as going to a restaurant) are expected to unfold.
One of the challenges to learning scripts is the hierarchical nature of the knowledge.
For example, a suspect arrested might plead innocent or guilty, and a very different track of events is then expected to happen.
To capture this type of information, we propose an autoencoder model with a latent space defined by a hierarchy of categorical variables.
We utilize a recently proposed vector quantization based approach, which allows continuous embeddings to be associated with each latent variable value.
This permits the decoder to softly decide what portions of the latent hierarchy to condition on by attending over the value embeddings for a given setting.
Our model effectively encodes and generates scripts, outperforming a recent language modeling-based method on several standard tasks and achieving substantially lower perplexity scores.
In this paper, automated user verification techniques for smartphones are investigated.
A unique non-commercial dataset, the University of Maryland Active Authentication Dataset 02 (UMDAA-02) for multi-modal user authentication research is introduced.
This paper focuses on three sensors - front camera, touch sensor and location service while providing a general description for other modalities.
Benchmark results for face detection, face verification, touch-based user identification and location-based next-place prediction are presented, which indicate that more robust methods fine-tuned to the mobile platform are needed to achieve satisfactory verification accuracy.
The dataset will be made available to the research community for promoting additional research.
Dense Multi-GPU systems have recently gained a lot of attention in the HPC arena.
Traditionally, MPI runtimes have been primarily designed for clusters with a large number of nodes.
However, with the advent of MPI+CUDA applications and CUDA-Aware MPI runtimes like MVAPICH2 and OpenMPI, it has become important to address efficient communication schemes for such dense Multi-GPU nodes.
This, coupled with new application workloads brought forward by Deep Learning frameworks like Caffe and Microsoft CNTK, poses additional design constraints due to the very large GPU-buffer messages communicated during the training phase.
In this context, special-purpose libraries like NVIDIA NCCL have been proposed for GPU-based collective communication on dense GPU systems.
In this paper, we propose a pipelined chain (ring) design for the MPI_Bcast collective operation along with an enhanced collective tuning framework in MVAPICH2-GDR that enables efficient intra-/inter-node multi-GPU communication.
We present an in-depth performance landscape for the proposed MPI_Bcast schemes along with a comparative analysis of NVIDIA NCCL Broadcast and NCCL-based MPI_Bcast.
The proposed designs for MVAPICH2-GDR enable up to 14X and 16.6X improvement, compared to NCCL-based solutions, for intra- and inter-node broadcast latency, respectively.
In addition, the proposed designs provide up to 7% improvement over NCCL-based solutions for data parallel training of the VGG network on 128 GPUs using Microsoft CNTK.
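The pipelined chain (ring) idea behind the proposed broadcast can be illustrated with a toy step-driven simulation (ours, not MVAPICH2-GDR code): splitting the message into chunks lets transfers on successive links overlap, so a C-chunk broadcast over an N-rank chain completes in C + N - 2 chunk-steps rather than C * (N - 1).

```python
def pipelined_chain_bcast(message, n_ranks, n_chunks):
    """Simulate a pipelined chain broadcast: the root splits the message
    into chunks and forwards them hop by hop, so transfers on successive
    links overlap in time.  Returns per-rank buffers and step count."""
    size = max(1, len(message) // n_chunks)
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    buffers = [[] for _ in range(n_ranks)]
    buffers[0] = list(chunks)          # rank 0 is the root
    sent = [0] * n_ranks               # chunks already forwarded per rank
    steps = 0
    while len(buffers[-1]) < len(chunks):
        # Iterate back-to-front so a chunk received in this step
        # is not forwarded again within the same step.
        for r in range(n_ranks - 2, -1, -1):
            if sent[r] < len(buffers[r]):
                buffers[r + 1].append(buffers[r][sent[r]])
                sent[r] += 1
        steps += 1
    return buffers, steps
```

Running this with 4 ranks and 4 chunks takes 6 steps (4 + 4 - 2), versus 12 chunk-steps for an unpipelined chain that forwards the whole message hop by hop.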
It is well-known that the spacetime diagrams of some cellular automata have a fractal structure: for instance Pascal's triangle modulo 2 generates a Sierpinski triangle.
Explaining the fractal structure of the spacetime diagrams of cellular automata is a much explored topic, but virtually all of the results revolve around a special class of automata, whose typical features include irreversibility, an alphabet with a ring structure, a global evolution that is a ring homomorphism, and a property known as (weakly) p-Fermat.
The class of automata that we study in this article has none of these properties.
Their cell structure is weaker, as it does not come with a multiplication, and they are far from being p-Fermat, even weakly.
However, they do produce fractal spacetime diagrams, and we explain why and how.
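The Pascal's-triangle example is easy to reproduce; this short sketch generates the space-time diagram of the mod-2 additive automaton, whose nonzero cells trace a Sierpinski triangle.

```python
def pascal_mod2(rows):
    """Space-time diagram of the additive cellular automaton computing
    Pascal's triangle modulo 2: each cell is the mod-2 sum of the two
    cells above it.  The nonzero cells form a Sierpinski triangle."""
    row = [1]
    diagram = [row]
    for _ in range(rows - 1):
        # Pad with a zero on each side and add neighbours modulo 2.
        row = [(a + b) % 2 for a, b in zip([0] + row, row + [0])]
        diagram.append(row)
    return diagram
```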
Specialized classifiers, namely those dedicated to a subset of classes, are often adopted in real-world recognition systems.
However, integrating such classifiers is nontrivial.
Existing methods, e.g. weighted average, usually implicitly assume that all constituents of an ensemble cover the same set of classes.
Such methods can produce misleading predictions when used to combine specialized classifiers.
This work explores a novel approach.
Instead of combining predictions from individual classifiers directly, it first decomposes the predictions into sets of pairwise preferences, treats them as transition channels between classes, constructs a continuous-time Markov chain from them, and uses the equilibrium distribution of this chain as the final prediction.
This approach allows us to form a coherent picture across all specialized predictions.
On large public datasets, the proposed method obtains considerable improvement compared to mainstream ensemble methods, especially when the classifier coverage is highly unbalanced.
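A minimal sketch of the chain construction, under the simplifying assumption that aggregated pairwise preferences are used directly as transition rates (the paper's channel construction is more involved):

```python
def equilibrium_from_preferences(pref, dt=0.01, iters=20000):
    """Combine pairwise class preferences into a continuous-time Markov
    chain and return its equilibrium distribution.  pref[i][j] >= 0 is
    the aggregated preference for class j over class i, used here
    directly as a transition rate (an illustrative assumption)."""
    n = len(pref)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # Probability-flow balance: d(pi_j)/dt = inflow_j - outflow_j.
        inflow = [sum(pi[i] * pref[i][j] for i in range(n) if i != j)
                  for j in range(n)]
        outflow = [pi[j] * sum(pref[j][k] for k in range(n) if k != j)
                   for j in range(n)]
        pi = [pi[j] + dt * (inflow[j] - outflow[j]) for j in range(n)]
    return pi
```

With two classes and a 3-to-1 preference imbalance the chain settles on a 0.25/0.75 split, which serves as the combined prediction.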
We consider two-way amplify and forward relaying, where multiple full-duplex user pairs exchange information via a shared full-duplex massive multiple-input multiple-output (MIMO) relay.
We derive a closed-form lower bound on the spectral efficiency with zero-forcing processing at the relay, using minimum mean squared error channel estimation.
For the system model considered herein, a zero-forcing lower bound valid for an arbitrary number of antennas had not previously been derived in the massive MIMO relaying literature.
We numerically demonstrate the accuracy of the derived lower bound and the performance improvement achieved using zero-forcing processing.
We also numerically demonstrate the spectral gains achieved by a full-duplex system over a half-duplex one for various antenna regimes.
Halal is a notion that applies to both objects and actions, and means permissible according to Islamic law.
It may be most often associated with food and the rules of selecting, slaughtering, and cooking animals.
In the globalized world, halal can be found in street corners of New York and beauty shops of Manila.
In this study, we explore the cultural diversity of the concept, as revealed through social media, and specifically the way it is expressed by different populations around the world, and how it relates to their perception of (i) religious and (ii) governmental authority, and (iii) personal health.
Here, we analyze two Instagram datasets, using Halal in Arabic (325,665 posts) and in English (1,004,445 posts), which provide a global view of major Muslim populations around the world.
We find a great variety in the use of halal within the Arabic-, English-, and Indonesian-speaking populations, with animal trade emphasized in the first (making up 61% of that language's stream), food in the second (80%), and cosmetics and supplements in the third (70%).
The commercialization of the term halal is a powerful signal of its detachment from its traditional roots.
We find a complex social engagement around posts mentioning religious terms, such that when a food-related post is accompanied by a religious term, it on average gets more likes in English and Indonesian, but not in Arabic, indicating a potential shift out of its traditional moral framing.
Inference of space-time varying signals on graphs emerges naturally in a plethora of network science related applications.
A frequently encountered challenge pertains to reconstructing such dynamic processes, given their values over a subset of vertices and time instants.
The present paper develops a graph-aware kernel-based kriged Kalman filter that accounts for the spatio-temporal variations, and offers efficient online reconstruction, even for dynamically evolving network topologies.
The kernel-based learning framework bypasses the need for statistical information by capitalizing on the smoothness that graph signals exhibit with respect to the underlying graph.
To address the challenge of selecting the appropriate kernel, the proposed filter is combined with a multi-kernel selection module.
Such a data-driven method selects a kernel attuned to the signal dynamics on-the-fly within the linear span of a pre-selected dictionary.
The novel multi-kernel learning algorithm exploits the eigenstructure of Laplacian kernel matrices to reduce computational complexity.
Numerical tests with synthetic and real data demonstrate the superior reconstruction performance of the novel approach relative to state-of-the-art alternatives.
Many wireless protocols wait for a small, random amount of time, called jitter, before sending a packet, in order to avoid high contention and packet collisions.
Jitter has already been proposed for many routing protocols, including AODV and LOADng.
However, since existing jitter mechanisms do not consider any link-quality parameters or metrics (such as ETX), they are not efficient in metric-based routing protocols.
In this paper, we propose a metric-based jitter mechanism and derive a closed-form expression for the probability of delay inversion under all available jitter mechanisms.
Simulation results are also presented to show the performance of different jitter mechanisms.
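The delay-inversion quantity studied here can be illustrated with a small Monte-Carlo sketch (illustrative jitter windows, not the paper's closed-form analysis): with identical uniform windows an inversion happens about half the time, while metric-separated windows eliminate it.

```python
import random

def inversion_probability(jitter_better, jitter_worse,
                          trials=100_000, seed=1):
    """Estimate the probability that the node holding the worse route
    transmits before the node holding the better route ("delay
    inversion") when each draws an independent uniform jitter.
    jitter_* = (low, high) windows; the values are illustrative."""
    rng = random.Random(seed)
    inversions = sum(
        rng.uniform(*jitter_worse) < rng.uniform(*jitter_better)
        for _ in range(trials)
    )
    return inversions / trials
```

A metric-based mechanism that maps a better metric to an earlier window, e.g. (0, 5) versus (5, 10), drives the inversion probability to zero.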
Context: Refactoring is recognized as an effective practice to maintain evolving software systems.
For software libraries, we study how library developers refactor their Application Programming Interfaces (APIs), especially when it impacts client users by breaking an API of the library.
Objective: Our work aims to understand how clients that use a library API are affected by refactoring activities.
We target popular libraries that potentially impact more library client users.
Method: We distinguish between library APIs based on their client usage (referred to as client-used APIs) in order to understand the extent to which API breakages relate to refactorings.
Our tool-based approach allows for a large-scale study across eight libraries (183 consecutive versions in total) with around 900 client projects.
Results: We find that library maintainers are less likely to break client-used API classes.
Quantitatively, we find that refactoring activities break less than 37% of all client-used APIs.
In a more qualitative analysis, we show two documented cases where non-refactoring API breaking changes are motivated by other maintenance issues (i.e., bug fixes and new features) and involve more complex refactoring operations.
Conclusion: Using our automated approach, we find that library developers are less likely to break APIs through refactoring, and tend to break client-used APIs when addressing these other maintenance issues.
This paper proposes a method for direct torque control of Brushless DC (BLDC) motors.
Estimation of the trapezoidal back-EMF is required; this is done via a sliding mode observer that uses only one stator-current measurement.
The proposed estimation algorithm reduces the impact of switching noise and consequently eliminates the need for a filter.
Furthermore, to overcome the uncertainties related to BLDC motors, Recursive Least Square (RLS) is regarded as a real-time estimator of inertia and viscous damping coefficients of the BLDC motor.
By substituting the estimated load torque in mechanical dynamic equations, the rotor speed can be calculated.
Also, to increase the robustness and decrease the rise time of the system, Modified Model Reference Adaptive System (MMRAS) is applied in order to design a new speed controller.
Simulation results confirm the validity of this recommended method.
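The RLS ingredient can be sketched on the discretized mechanical model w[k+1] = a*w[k] + b*T[k], from which inertia and viscous damping can be recovered; the model form, initial covariance, and excitation signal below are illustrative assumptions, not the paper's exact formulation.

```python
def rls_fit(samples, lam=1.0):
    """Recursive Least Squares estimate of (a, b) in the discretized
    model w[k+1] = a*w[k] + b*T[k].  samples = [(w_k, T_k, w_next)];
    lam is the forgetting factor (1.0 = no forgetting)."""
    theta = [0.0, 0.0]                       # parameter estimate [a, b]
    P = [[1e6, 0.0], [0.0, 1e6]]             # large initial covariance
    for w, u, y in samples:
        phi = [w, u]                          # regressor
        Pphi = [P[0][0]*phi[0] + P[0][1]*phi[1],
                P[1][0]*phi[0] + P[1][1]*phi[1]]
        denom = lam + phi[0]*Pphi[0] + phi[1]*Pphi[1]
        k = [Pphi[0]/denom, Pphi[1]/denom]    # gain
        err = y - (theta[0]*phi[0] + theta[1]*phi[1])
        theta = [theta[0] + k[0]*err, theta[1] + k[1]*err]
        # Covariance update: P = (P - k * phi' P) / lam.
        P = [[(P[i][j] - k[i]*Pphi[j]) / lam for j in range(2)]
             for i in range(2)]
    return theta
```

On noise-free data with sufficiently exciting torque input, the estimate converges to the true parameters after a handful of samples.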
Deep Learning algorithms have recently become the de-facto paradigm for various prediction problems, which include many privacy-preserving applications like online medical image analysis.
Presumably, the privacy of data in a deep learning system is a serious concern.
There have been several efforts to analyze and exploit the information leakages from deep learning architectures to compromise data privacy.
In this paper, however, we attempt to provide an evaluation strategy for such information leakages through deep neural network architectures by considering a case study on Convolutional Neural Network (CNN) based image classifier.
The approach uses low-level hardware information, provided by Hardware Performance Counters (HPCs), during the execution of a CNN classifier, together with simple hypothesis testing, to raise an alarm if there is any information leakage about the actual input.
Human vision possesses strong invariance in image recognition.
The cognitive capability of deep convolutional neural network (DCNN) is close to the human visual level because of hierarchical coding directly from raw image.
Owing to its superiority in feature representation, DCNN has exhibited remarkable performance in scene recognition of high-resolution remote sensing (HRRS) images and classification of hyper-spectral remote sensing images.
In-depth investigation is still essential for understanding why DCNN can accurately identify diverse ground objects via its effective feature representation.
Thus, we train the deep neural network called AlexNet on our large scale remote sensing image recognition benchmark.
At the neuron level in each convolution layer, we analyze the general properties of DCNN in HRRS image recognition by use of a framework of visual stimulation-characteristic response combined with feature coding-classification decoding.
Specifically, we use histogram statistics, representational dissimilarity matrices, and class activation mapping to observe the selective and invariant representations of DCNN in HRRS image recognition.
We argue that selective and invariant representations play important roles in remote sensing image tasks such as classification, detection, and segmentation.
Selective and invariant representations are also significant for designing new DCNN-like models for analyzing and understanding remote sensing images.
Once self-driving cars become a reality and passengers no longer need to worry about driving, they will need to find new forms of entertainment.
However, retrieving entertainment contents at the Data Center (DC) can hinder content delivery service due to high delay of car-to-DC communication.
To address these challenges, we propose a deep learning based caching for self-driving car, by using Deep Learning approaches deployed on the Multi-access Edge Computing (MEC) structure.
First, at DC, Multi-Layer Perceptron (MLP) is used to predict the probabilities of contents to be requested in specific areas.
To reduce the car-DC delay, MLP outputs are logged into MEC servers attached to roadside units.
Second, in order to cache entertainment contents stylized for car passengers' features such as age and gender, Convolutional Neural Network (CNN) is used to predict age and gender of passengers.
Third, each car requests MLP output from MEC server and compares its CNN and MLP outputs by using k-means and binary classification.
Through this, the self-driving car can identify the contents that need to be downloaded from the MEC server and cached.
Finally, we formulate deep learning based caching in the self-driving car that enhances entertainment services as an optimization problem whose goal is to minimize content downloading delay.
To solve the formulated problem, a Block Successive Majorization-Minimization (BS-MM) technique is applied.
The simulation results show that our prediction of the contents to be cached in the areas of the self-driving car reaches an accuracy of 98.04%, and that our approach minimizes content downloading delay.
Graph-based entropy, an index of the diversity of events in their distribution over parts of a co-occurrence graph, is proposed for detecting signs of structural changes in the data that are informative in explaining latent dynamics of consumers' behavior.
To obtain graph-based entropy, connected subgraphs are first extracted from the graph of co-occurrences of items in the data.
Then, the distribution over these subgraphs of the items occurring in events in the data is reflected in the value of graph-based entropy.
For point-of-sale data, a change in this value is regarded as a sign of the appearance, separation, disappearance, or uniting of consumers' interests.
These phenomena are regarded as signs of dynamic changes in consumers' behavior that may be the effects of external events and information.
Experiments show that graph-based entropy outperforms baseline methods for change detection in explaining substantial changes, and their signs, in consumers' preference of items in supermarket stores.
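One plausible reading of this definition, sketched in code (our toy, with union-find over items that co-occur in a basket):

```python
from collections import Counter
from math import log

def graph_based_entropy(baskets):
    """Build the co-occurrence graph of items, split it into connected
    subgraphs, and compute the entropy (bits) of the distribution of
    item occurrences over those subgraphs."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    def union(a, b):
        parent[find(a)] = find(b)
    # Items in the same basket co-occur, so they share a component.
    for basket in baskets:
        for item in basket[1:]:
            union(basket[0], item)
    # Distribution of item occurrences over connected subgraphs.
    counts = Counter(find(item) for basket in baskets for item in basket)
    total = sum(counts.values())
    return -sum((c / total) * log(c / total, 2) for c in counts.values())
```

Two equally weighted components give one bit of entropy; a single component gives zero, so a rise in the value signals the separation of interests.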
In this paper, a new data-driven multiscale material modeling method, which we refer to as deep material network, is developed based on mechanistic homogenization theory of representative volume element (RVE) and advanced machine learning techniques.
We propose to use a collection of connected mechanistic building blocks with analytical homogenization solutions which avoids the loss of essential physics in generic neural networks, and this concept is demonstrated for 2-dimensional RVE problems and network depth up to 7.
Based on linear elastic RVE data from offline direct numerical simulations, the material network can be effectively trained using stochastic gradient descent with backpropagation algorithm, enhanced by model compression methods.
Importantly, the trained network is valid for any local material laws without the need for additional calibration or micromechanics assumption.
Its extrapolations to unknown material and loading spaces for a wide range of problems are validated through numerical experiments, including linear elasticity with high contrast of phase properties, nonlinear history-dependent plasticity and finite-strain hyperelasticity under large deformations.
By discovering a proper topological representation of RVE with fewer degrees of freedom, this intelligent material model is believed to open new possibilities of high-fidelity efficient concurrent simulations for a large-scale heterogeneous structure.
It also provides a mechanistic understanding of structure-property relations across material length scales and enables the development of parameterized microstructural database for material design and manufacturing.
"How common is interactive visualization on the web?"
"What is the most popular visualization design?"
"How prevalent are pie charts really?"
These questions intimate the role of interactive visualization in the real (online) world.
In this paper, we present our approach (and findings) to answering these questions.
First, we introduce Beagle, which mines the web for SVG-based visualizations and automatically classifies them by type (i.e., bar, pie, etc.).
With Beagle, we extract over 41,000 visualizations across five different tools and repositories, and classify them with 86% accuracy, across 24 visualization types.
Given this visualization collection, we study usage across tools.
We find that most visualizations fall under four types: bar charts, line charts, scatter charts, and geographic maps.
Though controversial, pie charts are relatively rare in practice.
Our findings also indicate that users may prefer tools that emphasize a succinct set of visualization types, and provide diverse expert visualization examples.
We address the problem of finding realistic geometric corrections to a foreground object such that it appears natural when composited into a background image.
To achieve this, we propose a novel Generative Adversarial Network (GAN) architecture that utilizes Spatial Transformer Networks (STNs) as the generator, which we call Spatial Transformer GANs (ST-GANs).
ST-GANs seek image realism by operating in the geometric warp parameter space.
In particular, we exploit an iterative STN warping scheme and propose a sequential training strategy that achieves better results compared to naive training of a single generator.
One of the key advantages of ST-GAN is its applicability to high-resolution images indirectly since the predicted warp parameters are transferable between reference frames.
We demonstrate our approach in two applications: (1) visualizing how indoor furniture (e.g. from product images) might be perceived in a room, (2) hallucinating how accessories like glasses would look when matched with real portraits.
Deep convolutional neural networks (DCNNs) have attracted much attention recently, and have shown to be able to recognize thousands of object categories in natural image databases.
Their architecture is somewhat similar to that of the human visual system: both use restricted receptive fields, and a hierarchy of layers which progressively extract more and more abstracted features.
Yet it is unknown whether DCNNs match human performance at the task of view-invariant object recognition, whether they make similar errors and use similar representations for this task, and whether the answers depend on the magnitude of the viewpoint variations.
To investigate these issues, we benchmarked eight state-of-the-art DCNNs, the HMAX model, and a baseline shallow model and compared their results to those of humans with backward masking.
Unlike in all previous DCNN studies, we carefully controlled the magnitude of the viewpoint variations to demonstrate that shallow nets can outperform deep nets and humans when variations are weak.
When facing larger variations, however, more layers were needed to match human performance and error distributions, and to have representations that are consistent with human behavior.
A very deep net with 18 layers even outperformed humans at the highest variation level, using the most human-like representations.
Cross-lingual word embeddings are becoming increasingly important in multilingual NLP.
Recently, it has been shown that these embeddings can be effectively learned by aligning two disjoint monolingual vector spaces through linear transformations, using no more than a small bilingual dictionary as supervision.
In this work, we propose to apply an additional transformation after the initial alignment step, which moves cross-lingual synonyms towards a middle point between them.
By applying this transformation our aim is to obtain a better cross-lingual integration of the vector spaces.
In addition, and perhaps surprisingly, the monolingual spaces are also improved by this transformation.
This is in contrast to the original alignment, which is typically learned such that the structure of the monolingual spaces is preserved.
Our experiments confirm that the resulting cross-lingual embeddings outperform state-of-the-art models in both monolingual and cross-lingual evaluation tasks.
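The midpoint idea can be sketched directly on dictionary pairs (a simplification: the paper learns a global transformation from such pairs rather than editing only the dictionary entries):

```python
def midpoint_transform(src_vecs, tgt_vecs, pairs, alpha=0.5):
    """Move each vector of a cross-lingual synonym pair toward the
    pair's midpoint; alpha = 1.0 lands exactly on it.  src_vecs and
    tgt_vecs map words to embedding lists; pairs is the bilingual
    dictionary as (source_word, target_word) tuples."""
    src, tgt = dict(src_vecs), dict(tgt_vecs)
    for s, t in pairs:
        mid = [(a + b) / 2 for a, b in zip(src[s], tgt[t])]
        src[s] = [(1 - alpha) * a + alpha * m for a, m in zip(src[s], mid)]
        tgt[t] = [(1 - alpha) * b + alpha * m for b, m in zip(tgt[t], mid)]
    return src, tgt
```

With alpha = 1.0, an aligned pair such as "dog"/"perro" collapses onto a shared point, tightening the cross-lingual integration of the two spaces.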
This study considers the 3D human pose estimation problem in a single RGB image by proposing a conditional random field (CRF) model over 2D poses, in which the 3D pose is obtained as a byproduct of the inference process.
The unary term of the proposed CRF model is defined based on a powerful heat-map regression network, which has been proposed for 2D human pose estimation.
This study also presents a regression network for lifting the 2D pose to 3D pose and proposes the prior term based on the consistency between the estimated 3D pose and the 2D pose.
To obtain the approximate solution of the proposed CRF model, the N-best strategy is adopted.
The proposed inference algorithm can be viewed as sequential processes of bottom-up generation of 2D and 3D pose proposals from the input 2D image based on deep networks and top-down verification of such proposals by checking their consistencies.
To evaluate the proposed method, we use two large-scale datasets: Human3.6M and HumanEva.
Experimental results show that the proposed method achieves the state-of-the-art 3D human pose estimation performance.
A computing environment is proposed, based on batch spreadsheet processing, which produces a spreadsheet display from plain text input files of commands, similar to the way documents are created using LaTeX.
In this environment, besides the usual spreadsheet rows and columns of cells, variables can be defined and are stored in a separate symbol table.
Cell and symbol formulas may contain cycles, and cycles which converge can be used to implement iterative algorithms.
Formulas are specified using the syntax of the C programming language, and all of C's numeric operators are supported, with operators such as ++, +=, etc. being implicitly cyclic.
User-defined functions can be written in C and are accessed using a dynamic link library.
The environment can be combined with a GUI front-end processor to enable easier interaction and graphics including plotting.
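How converging cycles implement iterative algorithms can be sketched with a tiny fixed-point evaluator (an illustration of the behaviour, not the environment's C-based engine):

```python
def solve_cells(formulas, iters=100, tol=1e-12):
    """Evaluate spreadsheet-style cells whose formulas may form cycles
    by repeated re-evaluation until the values converge.  Each formula
    is a function of the current cell-value dict."""
    values = {name: 0.0 for name in formulas}
    for _ in range(iters):
        new = {name: f(values) for name, f in formulas.items()}
        if all(abs(new[n] - values[n]) < tol for n in values):
            return new
        values = new
    return values
```

For example, the mutually referencing cells a = 0.5*b + 1 and b = 0.5*a converge to a = 4/3 and b = 2/3.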
Human activity recognition based on wearable sensor data has been an attractive research topic due to its application in areas such as healthcare and smart environments.
In this context, many works have presented remarkable results using accelerometer, gyroscope and magnetometer data to represent the activities categories.
However, current studies do not consider important issues that lead to skewed results, making it hard to assess the quality of sensor-based human activity recognition and preventing a direct comparison of previous works.
These issues include the samples generation processes and the validation protocols used.
We emphasize that in other research areas, such as image classification and object detection, these issues are already well-defined, which brings more efforts towards the application.
Inspired by this, we conduct an extensive set of experiments that analyze different sample generation processes and validation protocols to indicate the vulnerable points in human activity recognition based on wearable sensor data.
For this purpose, we implement and evaluate several top-performance methods, ranging from handcrafted-based approaches to convolutional neural networks.
According to our study, most of the experimental evaluations currently employed are not adequate for assessing activity recognition from wearable sensor data: recognition accuracy drops considerably when an appropriate evaluation protocol is used instead.
To the best of our knowledge, this is the first study that tackles essential issues that compromise the understanding of the performance in human activity recognition based on wearable sensor data.
DNA sequences are fundamental for encoding genetic information.
The genetic information may not only be understood by symbolic sequences but also from the hidden signals inside the sequences.
The symbolic sequences need to be transformed into numerical sequences so the hidden signals can be revealed by signal processing techniques.
All current transformation methods encode DNA sequences into numerical values of the same length.
These representations have limitations in the applications of genomic signal compression, encryption, and steganography.
We propose an integer chaos game representation (iCGR) of DNA sequences and a lossless encoding method for DNA sequences based on the iCGR.
In the iCGR method, a DNA sequence is represented by the iterated function of the nucleotides and their positions in the sequence.
Then the DNA sequence can be uniquely encoded and recovered using three integers from iCGR.
One integer is the sequence length and the other two integers represent the accumulated distributions of nucleotides in the sequence.
The integer encoding scheme can compress a DNA sequence to 2 bits per nucleotide.
The integer representation of DNA sequences provides a prospective tool for sequence compression, encryption, and steganography.
The Python programs in this study are freely available to the public at https://github.com/cyinbox/iCGR
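A plain positional 2-bit packing illustrates lossless integer encoding at 2 bits per nucleotide; note that this is not the iCGR itself, whose three integers are derived from chaos-game coordinates.

```python
def encode(seq):
    """Pack a DNA string into (length, integer) at 2 bits per
    nucleotide.  The explicit length preserves leading 'A' bases,
    which encode to zero bits."""
    code = {'A': 0, 'C': 1, 'G': 2, 'T': 3}
    n = 0
    for base in seq:
        n = (n << 2) | code[base]
    return len(seq), n

def decode(length, n):
    """Recover the DNA string from its (length, integer) encoding."""
    bases = 'ACGT'
    out = []
    for _ in range(length):
        out.append(bases[n & 3])
        n >>= 2
    return ''.join(reversed(out))
```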
At the heart of deep learning we aim to use neural networks as function approximators - training them to produce outputs from inputs in emulation of a ground truth function or data creation process.
In many cases we only have access to input-output pairs from the ground truth, however it is becoming more common to have access to derivatives of the target output with respect to the input - for example when the ground truth function is itself a neural network such as in network compression or distillation.
Generally these target derivatives are not computed, or are ignored.
This paper introduces Sobolev Training for neural networks, a method for incorporating these target derivatives, in addition to the target values, while training.
By optimising neural networks to not only approximate the function's outputs but also the function's derivatives we encode additional information about the target function within the parameters of the neural network.
Thereby we can improve the quality of our predictors, as well as the data-efficiency and generalization capabilities of our learned function approximation.
We provide theoretical justifications for such an approach as well as examples of empirical evidence on three distinct domains: regression on classical optimisation datasets, distilling policies of an agent playing Atari, and on large-scale applications of synthetic gradients.
In all three domains the use of Sobolev Training, employing target derivatives in addition to target values, results in models with higher accuracy and stronger generalisation.
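A linear-regression miniature of Sobolev Training: fit a quadratic by gradient descent on a loss over both target values and target derivatives (the paper applies the same idea to deep networks; the model and data here are our illustration).

```python
def sobolev_train(xs, ys, dys, lr=0.01, steps=20000):
    """Gradient descent on a Sobolev loss for f(x) = c0 + c1*x + c2*x^2:
    squared error on values plus squared error on derivatives."""
    c = [0.0, 0.0, 0.0]
    for _ in range(steps):
        g = [0.0, 0.0, 0.0]
        for x, y, dy in zip(xs, ys, dys):
            f = c[0] + c[1] * x + c[2] * x * x   # value prediction
            df = c[1] + 2 * c[2] * x             # derivative prediction
            ev, ed = f - y, df - dy
            g[0] += 2 * ev
            g[1] += 2 * ev * x + 2 * ed
            g[2] += 2 * ev * x * x + 2 * ed * 2 * x
        c = [ci - lr * gi / len(xs) for ci, gi in zip(c, g)]
    return c
```

Supervising derivatives alongside values encodes extra information about the target function: here three (value, derivative) pairs pin down f(x) = 1 + 2x + 3x^2 exactly.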
In this paper, we address the shape-from-shading problem by training deep networks with synthetic images.
Unlike conventional approaches that combine deep learning and synthetic imagery, we propose an approach that does not need any external shape dataset to render synthetic images.
Our approach consists of two synergistic processes: the evolution of complex shapes from simple primitives, and the training of a deep network for shape-from-shading.
The evolution generates better shapes guided by the network training, while the training improves by using the evolved shapes.
We show that our approach achieves state-of-the-art performance on a shape-from-shading benchmark.
We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, based on a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain.
Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale.
Specifically, extensions are presented about (i) parameterizing the intermediate temporal scale levels, (ii) analysing the resulting temporal dynamics, (iii) transferring the theory to a discrete implementation, (iv) computing scale-normalized spatio-temporal derivative expressions for spatio-temporal feature detection and (v) computational modelling of receptive fields in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision.
We show that by distributing the intermediate temporal scale levels according to a logarithmic distribution, we obtain much faster temporal response properties (shorter temporal delays) compared to a uniform distribution.
Specifically, these kernels converge very rapidly to a limit kernel possessing true self-similar scale-invariant properties over temporal scales, thereby allowing for true scale invariance over variations in the temporal scale, although the underlying temporal scale-space representation is based on a discretized temporal scale parameter.
We show how scale-normalized temporal derivatives can be defined for these time-causal scale-space kernels and how the composed theory can be used for computing basic types of scale-normalized spatio-temporal derivative expressions in a computationally efficient manner.
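The time-causal smoothing primitive, a cascade of discrete first-order integrators (truncated exponential filters), can be sketched as follows; function and variable names are ours, and the time constants would be distributed logarithmically per the theory.

```python
def integrator_cascade(signal, taus):
    """Cascade of first-order recursive integrators: each stage applies
    the discrete update y[t] = y[t-1] + (x[t] - y[t-1]) / (1 + mu),
    where mu is the stage's time constant.  Each stage has unit DC
    gain, so the cascade preserves the mean of a slowly varying input."""
    out = list(signal)
    for mu in taus:
        y = 0.0
        for t, x in enumerate(out):
            y = y + (x - y) / (1.0 + mu)
            out[t] = y
    return out
```

The cascade is strictly causal: each output sample depends only on past and present input, which is the property that distinguishes this family from non-causal Gaussian temporal smoothing.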
In this paper, we propose to infer music genre embeddings from audio datasets carrying semantic information about genres.
We show that such embeddings can be used for disambiguating genre tags (identification of different labels for the same genre, tag translation from a tag system to another, inference of hierarchical taxonomies on these genre tags).
These embeddings are built by training a deep convolutional neural network genre classifier with large audio datasets annotated with a flat tag system.
We show empirically that they make it possible to retrieve the original taxonomy of a tag system, spot duplicate tags, and translate tags from one tag system to another.
Incidental scene text detection, especially for multi-oriented text regions, is one of the most challenging tasks in many computer vision applications.
Different from the common object detection task, scene text often suffers from a large variance of aspect ratio, scale, and orientation.
To solve this problem, we propose a novel end-to-end scene text detector IncepText from an instance-aware segmentation perspective.
We design a novel Inception-Text module and introduce deformable PSROI pooling to deal with multi-oriented text detection.
Extensive experiments on ICDAR2015, RCTW-17, and MSRA-TD500 datasets demonstrate our method's superiority in terms of both effectiveness and efficiency.
Our proposed method achieves the 1st-place result on the ICDAR2015 challenge and state-of-the-art performance on the other datasets.
Moreover, we have released our implementation as an OCR product which is available for public access.
Inspired by CapsNet's routing-by-agreement mechanism, with its ability to learn object properties, and by center-of-mass calculations from physics, we propose a CapsNet architecture with object coordinate atoms and an LSTM network for evaluation.
The first is based on CapsNet but uses a new routing algorithm to find the objects' approximate positions in the image coordinate system, and the second is a parameterized affine transformation network that can predict future positions from past positions by learning the translation transformation from 2D object coordinates generated from the first network.
We demonstrate the learned translation transformation is transferable to another dataset without the need to train the transformation network again.
Only the CapsNet needs training on the new dataset.
As a result, our work shows that object recognition and motion prediction can be separated, and that motion prediction can be transferred to another dataset with different object types.
We present a probabilistic approach to generate a small, query-able summary of a dataset for interactive data exploration.
Departing from traditional summarization techniques, we use the Principle of Maximum Entropy to generate a probabilistic representation of the data that can be used to give approximate query answers.
We develop the theoretical framework and formulation of our probabilistic representation and show how to use it to answer queries.
We then present solving techniques and give three critical optimizations to improve preprocessing time and query accuracy.
Lastly, we experimentally evaluate our work using a 5 GB dataset of flights within the United States and a 210 GB dataset from an astronomy particle simulation.
While our current work only supports linear queries, we show that our technique can successfully answer queries faster than sampling while introducing, on average, no more error than sampling and can better distinguish between rare and nonexistent values.
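As a hedged illustration of the Principle of Maximum Entropy for approximate query answering: with only row and column marginals as constraints, the maximum-entropy joint distribution can be recovered by iterative proportional fitting, and linear queries are then answered from it. The constraints and values below are toy assumptions, not the paper's formulation.

```python
import numpy as np

def max_entropy_joint(row_marginal, col_marginal, iters=100):
    """Iterative proportional fitting: the maximum-entropy 2D distribution
    matching the given row and column marginal constraints."""
    p = np.ones((len(row_marginal), len(col_marginal)))
    p /= p.sum()
    for _ in range(iters):
        p *= (row_marginal / p.sum(axis=1))[:, None]  # fit row sums
        p *= (col_marginal / p.sum(axis=0))[None, :]  # fit column sums
    return p

rows = np.array([0.7, 0.3])
cols = np.array([0.4, 0.6])
p = max_entropy_joint(rows, cols)
# A linear query (probability of cell (0, 1)) answered from the summary:
print(round(p[0, 1], 3))  # -> 0.42, the product 0.7 * 0.6
```

With only marginal constraints the maximum-entropy solution is exactly the independent product distribution; richer constraint sets would pull the fitted joint away from independence.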
User data is the primary input of digital advertising, the fuel of free Internet as we know it.
As a result, web entities invest heavily in elaborate tracking mechanisms to acquire more and more user data that they can sell to data markets and advertisers.
The primary identification mechanism of the web is the cookie, where each entity assigns a userID on the user's side.
However, each tracker knows the same user with a different ID.
So how can the collected data be sold and merged with the associated user data of the buyer?
To address this, Cookie Synchronization (CSync) came to the rescue.
CSync facilitates an information sharing channel between third parties that may or may not have direct access to the website the user visits.
With CSync, they merge the user data they own in the background, but they can also reconstruct the browsing history of a user, bypassing the same-origin policy.
In this paper, we perform the first, to our knowledge, in-depth study of CSync in the wild, using a year-long dataset of web browsing activity from 850 real mobile users.
Through our study, we aim to understand the characteristics of the CSync protocol and its impact on users' privacy.
Our results show that 97% of regular web users are exposed to CSync, most of them within the first week of their browsing.
In addition, the average user receives ~1 synchronization per 68 GET requests, and the median userID gets leaked, on average, to 3.5 different online entities.
Moreover, we see that CSync increases the number of entities that track the user by a factor of 6.7.
Finally, we propose a novel, machine learning-based method for CSync detection, which can be effective when the synced IDs are obscured.
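The kind of synchronization event studied here can be spotted with a simple heuristic: an ID-looking parameter value that appears in requests to two or more distinct domains. The sketch below is an illustrative assumption, not the paper's machine learning-based detector, and all URLs are hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse, parse_qsl

# Hypothetical log of third-party GET requests made during browsing.
requests = [
    "https://trackerA.example/pixel?uid=a1b2c3d4e5f6a7b8",
    "https://trackerB.example/sync?partner_id=a1b2c3d4e5f6a7b8",
    "https://cdn.example/lib.js?v=3",
]

def find_synced_ids(urls, min_len=12):
    """Heuristic: an ID-looking parameter value seen in requests to two or
    more distinct domains is a candidate cookie synchronization."""
    seen = defaultdict(set)
    for url in urls:
        parsed = urlparse(url)
        for _, value in parse_qsl(parsed.query):
            if len(value) >= min_len and re.fullmatch(r"[A-Za-z0-9_-]+", value):
                seen[value].add(parsed.netloc)
    return {v: d for v, d in seen.items() if len(d) >= 2}

print(find_synced_ids(requests))
```

Such a rule-based detector fails once the synced IDs are hashed or obscured, which is exactly the case the learning-based method in the paper targets.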
This paper proposes a novel algorithm to optimally size and place storage in low voltage (LV) networks based on a linearized multiperiod optimal power flow method which we call forward backward sweep optimal power flow (FBS-OPF).
We show that this method has good convergence properties, that its solution deviates only slightly from the optimum, and that it makes the storage sizing and placement problem tractable for longer investment horizons.
We demonstrate the usefulness of our method by assessing the economic viability of distributed and centralized storage in LV grids with a high photovoltaic (PV) penetration.
As a main result, we quantify that, for the CIGRE LV test grid, distributed storage configurations are preferable, since they allow for less PV curtailment due to grid constraints.
Recent terrorist attacks carried out on behalf of ISIS on American and European soil by lone wolf attackers or sleeper cells remind us of the importance of understanding the dynamics of radicalization mediated by social media communication channels.
In this paper, we shed light on the social media activity of a group of twenty-five thousand users whose association with ISIS online radical propaganda has been manually verified.
By using a computational tool known as dynamical activity-connectivity maps, based on network and temporal activity patterns, we investigate the dynamics of social influence within ISIS supporters.
We finally quantify the effectiveness of ISIS propaganda by determining the adoption of extremist content in the general population and draw a parallel between radical propaganda and epidemic spreading, highlighting that information broadcasters and influential ISIS supporters generate highly infectious cascades of information contagion.
Our findings will help generate effective countermeasures to combat the group and other forms of online extremism.
Relational databases are valuable resources for learning novel and interesting relations and concepts.
Relational learning algorithms learn the Datalog definition of new relations in terms of the existing relations in the database.
In order to constrain the search through the large space of candidate definitions, users must tune the algorithm by specifying a language bias.
Unfortunately, specifying the language bias is done via trial and error and is guided by the expert's intuitions.
Hence, it normally takes a great deal of time and effort to effectively use these algorithms.
In particular, it is hard to find a user who both knows computer science concepts, such as database schemas, and has reasonable intuition about the target relation in specialized domains, such as biology.
We propose AutoMode, a system that leverages information in the schema and content of the database to automatically induce the language bias used by popular relational learning systems.
We show that AutoMode delivers the same accuracy as using manually-written language bias by imposing only a slight overhead on the running time of the learning algorithm.
We investigate how well continuous-time fictitious play in two-player games performs in terms of average payoff, particularly compared to Nash equilibrium payoff.
We show that in many games, fictitious play outperforms Nash equilibrium on average or even at all times, and moreover that any game is linearly equivalent to one in which this is the case.
Conversely, we provide conditions under which Nash equilibrium payoff dominates fictitious play payoff.
A key step in our analysis is to show that fictitious play dynamics asymptotically converges to the set of coarse correlated equilibria (a fact which is implicit in the literature).
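Discrete-time fictitious play, the finite-step analogue of the dynamics studied above, can be sketched in a few lines. The matching-pennies payoff matrix below is a toy assumption, chosen because its Nash value (0) is easy to compare the average payoff against.

```python
import numpy as np

# Row player's payoff matrix; the game is zero-sum (column gets -A).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])  # matching pennies

def fictitious_play(A, steps=5000):
    """Discrete-time fictitious play: each player best-responds to the
    empirical mixture of the opponent's past actions."""
    counts_r = np.ones(A.shape[0])  # row player's observed action counts
    counts_c = np.ones(A.shape[1])  # column player's observed action counts
    total = 0.0
    for _ in range(steps):
        i = np.argmax(A @ (counts_c / counts_c.sum()))   # row best response
        j = np.argmin((counts_r / counts_r.sum()) @ A)   # column best response
        total += A[i, j]
        counts_r[i] += 1
        counts_c[j] += 1
    return total / steps  # row player's average realized payoff

print(round(fictitious_play(A), 2))  # close to the Nash value 0 here
```

In this zero-sum example the average payoff approaches the Nash value; the paper's point is that in many other games fictitious play's average payoff can strictly exceed it.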
Understanding and reasoning about cooking recipes is a fruitful research direction towards enabling machines to interpret procedural text.
In this work, we introduce RecipeQA, a dataset for multimodal comprehension of cooking recipes.
It comprises approximately 20K instructional recipes with multiple modalities, such as titles, descriptions, and aligned sets of images.
With over 36K automatically generated question-answer pairs, we design a set of comprehension and reasoning tasks that require joint understanding of images and text, capturing the temporal flow of events and making sense of procedural knowledge.
Our preliminary results indicate that RecipeQA will serve as a challenging test bed and an ideal benchmark for evaluating machine comprehension systems.
The data and leaderboard are available at http://hucvl.github.io/recipeqa.
In this paper, we propose a novel multi-task learning architecture, which incorporates recent advances in attention mechanisms.
Our approach, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with task-specific soft-attention modules, which are trainable in an end-to-end manner.
These attention modules allow for learning of task-specific features from the global pool, whilst simultaneously allowing for features to be shared across different tasks.
The architecture can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient.
Experiments on the CityScapes dataset show that our method outperforms several baselines in both single-task and multi-task learning, and is also more robust to the various weighting schemes in the multi-task loss function.
We further explore the effectiveness of our method through experiments over a range of task complexities, and show how our method scales well with task complexity compared to baselines.
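The element-wise soft-attention masking over a shared feature pool can be sketched without a deep learning framework. The sigmoid mask and random weights below are illustrative assumptions, not the MTAN implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_attention(global_features, task_weights):
    """Per-task soft attention over a shared feature pool: each task learns
    an element-wise mask in (0, 1) applied to the global features."""
    mask = sigmoid(global_features @ task_weights)
    return mask * global_features

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))  # shared pool: 4 positions, 8 channels
w_seg = rng.normal(size=(8, 8))     # hypothetical segmentation-task weights
w_depth = rng.normal(size=(8, 8))   # hypothetical depth-task weights
print(task_attention(features, w_seg).shape)  # -> (4, 8)
```

Because every task reads from the same `features` pool through its own mask, features can be shared across tasks while each task still selects the subset it needs, which is the design choice the abstract describes.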
We present an algorithm for computing a Smith form with multipliers of a regular matrix polynomial over a field.
This algorithm differs from previous ones in that it computes a local Smith form for each irreducible factor in the determinant separately and then combines them into a global Smith form, whereas other algorithms apply a sequence of unimodular row and column operations to the original matrix.
The performance of the algorithm in exact arithmetic is reported for several test cases.
Local descriptors have gained wide attention due to their enhanced discriminative abilities.
It has been shown that considering multi-scale local neighborhoods improves the performance of a descriptor, though at the cost of increased dimensionality.
This paper proposes a novel method to construct a local descriptor using multi-scale neighborhood by finding the local directional order among the intensity values at different scales in a particular direction.
The local directional order is the multi-radius relationship factor in a particular direction.
The proposed local directional order pattern (LDOP) for a particular pixel is computed by finding the relationship between the center pixel and the local directional order indexes.
To do so, the center value must be transformed into the range of the neighboring orders.
Finally, the histogram of LDOP is computed over the whole image to construct the descriptor.
In contrast to the state-of-the-art descriptors, the dimension of the proposed descriptor does not depend upon the number of neighbors involved to compute the order; it only depends upon the number of directions.
The introduced descriptor is evaluated over the image retrieval framework and compared with the state-of-the-art descriptors over challenging face databases such as PaSC, LFW, PubFig, FERET, AR, AT&T, and ExtendedYale.
The experimental results confirm the superiority and robustness of the LDOP descriptor.
We believe that there is no real data protection without tools of our own.
Therefore, our ongoing aim is to develop more of our own code.
To achieve that, it is necessary that many young researchers become interested in cryptography.
We believe that implementing cryptographic algorithms is an important step in that direction, which is the main reason why, in this paper, we present a software implementation of finding the inverse element, an operation that is essential to both ECC (Elliptic Curve Cryptography) and the RSA digital signature schemes.
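A standard way to compute such a modular inverse is the extended Euclidean algorithm. The sketch below is a common textbook implementation, not necessarily the one presented in the paper.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(a, m):
    """Multiplicative inverse of a modulo m, if gcd(a, m) == 1."""
    g, x, _ = extended_gcd(a % m, m)
    if g != 1:
        raise ValueError("inverse does not exist")
    return x % m

print(mod_inverse(7, 40))  # -> 23, since 7 * 23 = 161 = 4 * 40 + 1
```

In RSA key generation, for example, the private exponent is the inverse of the public exponent modulo the totient; in ECC, field inversion appears in point addition formulas.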
Big data analytics has been gaining massive momentum in the last few years.
Applying machine learning models to big data has become an implicit requirement or an expectation for most analysis tasks, especially in high-stakes applications. Typical applications include sentiment analysis of reviews for analyzing online products, image classification in food-logging applications for monitoring users' daily intake, and stock movement prediction.
Extending traditional database systems to support the above analysis is intriguing but challenging.
First, it is almost impossible to implement all machine learning models in the database engines.
Second, expert knowledge is required to optimize the training and inference procedures in terms of efficiency and effectiveness, which imposes a heavy burden on system users.
In this paper, we develop and present a system, called Rafiki, that provides training and inference services for machine learning models and facilitates complex analytics on top of cloud platforms.
Rafiki provides distributed hyper-parameter tuning for the training service, and online ensemble modeling for the inference service, which trades off latency against accuracy.
Experimental results confirm the efficiency, effectiveness, scalability and usability of Rafiki.
Automatic lane tracking involves estimating the underlying signal from a sequence of noisy signal observations.
Many models and methods have been proposed for lane tracking, and dynamic targets tracking in general.
The Kalman Filter is a widely used method that works well on linear Gaussian models.
However, this paper shows that the Kalman Filter is not suitable for lane tracking, because its Gaussian observation model cannot faithfully represent the procured observations.
We propose using a Particle Filter on top of a novel multiple mode observation model.
Experiments show that our method produces superior performance to a conventional Kalman Filter.
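A minimal bootstrap particle filter illustrates the predict-update-resample loop on which such a tracker is built. The 1D random-walk model and Gaussian weighting below are simplifying assumptions for the sketch; the paper's observation model is multi-modal, not Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(observations, n_particles=1000, proc_std=0.1, obs_std=0.5):
    """Bootstrap particle filter for a 1D random-walk state."""
    particles = rng.normal(0.0, 1.0, n_particles)
    estimates = []
    for z in observations:
        particles += rng.normal(0.0, proc_std, n_particles)        # predict
        weights = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)  # update
        weights /= weights.sum()
        estimates.append(np.sum(weights * particles))              # estimate
        idx = rng.choice(n_particles, n_particles, p=weights)      # resample
        particles = particles[idx]
    return estimates

# Noisy observations of a constant true state (1.0).
obs = 1.0 + rng.normal(0.0, 0.5, 50)
est = particle_filter(obs)
print(round(est[-1], 1))
```

The advantage over a Kalman Filter is that the weighting step can use any observation likelihood, including a multi-mode one, by replacing the Gaussian expression above.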
In imperfect-information games, the optimal strategy in a subgame may depend on the strategy in other, unreached subgames.
Thus a subgame cannot be solved in isolation and must instead consider the strategy for the entire game as a whole, unlike perfect-information games.
Nevertheless, it is possible to first approximate a solution for the whole game and then improve it by solving individual subgames.
This is referred to as subgame solving.
We introduce subgame-solving techniques that outperform prior methods both in theory and practice.
We also show how to adapt them, and past subgame-solving techniques, to respond to opponent actions that are outside the original action abstraction; this significantly outperforms the prior state-of-the-art approach, action translation.
Finally, we show that subgame solving can be repeated as the game progresses down the game tree, leading to far lower exploitability.
These techniques were a key component of Libratus, the first AI to defeat top humans in heads-up no-limit Texas hold'em poker.
In recent years, a large number of RDF data sets have become available on the Web.
However, due to the semi-structured nature of RDF data, missing values affect answer completeness of queries that are posed against this data.
To overcome this limitation, we propose RDF-Hunter, a novel hybrid query processing approach that brings together machine and human computation to execute queries against RDF data.
We develop a novel quality model and query engine in order to enable RDF-Hunter to decide on the fly which parts of a query should be executed through conventional technology or crowd computing.
To evaluate RDF-Hunter, we created a collection of 50 SPARQL queries against the DBpedia data set, executed them using our hybrid query engine, and analyzed the accuracy of the outcomes obtained from the crowd.
The experiments clearly show that the overall approach is feasible and produces query results that reliably and significantly enhance completeness of automatic query processing responses.
The objective of this paper is to design an efficient vehicle license plate recognition system and to implement it for an automatic parking inventory system.
The system detects the vehicle first and then captures the image of the front view of the vehicle.
The vehicle license plate is localized and the characters are segmented.
To locate the plate, a novel real-time method is presented.
A new and robust technique based on directional chain code is used for character recognition.
The resulting vehicle number is then compared with the available database of all the vehicles so as to come up with information about the vehicle type and to charge entrance cost accordingly.
The system is then allowed to open parking barrier for the vehicle and generate entrance cost receipt.
The vehicle information (such as entrance time, date, and cost amount) is also stored in the database to maintain the record.
The hardware and software integrated system is implemented and a working prototype model is developed.
On the available database, the average accuracy of vehicle license plate localization reached 100%.
Using 70% of the character samples for training, we tested our scheme on all samples and obtained a 100% correct recognition rate.
Furthermore, we tested our character recognition stage on a Persian vehicle data set and achieved 99% correct recognition.
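An 8-directional (Freeman) chain code is the usual building block for such directional chain code features. The sketch below shows only the basic contour encoding; the direction convention is an assumption, not the paper's exact feature.

```python
# 8-direction Freeman chain code, with rows growing downward:
# 0=E, 1=NE, 2=N, 3=NW, 4=W, 5=SW, 6=S, 7=SE.
DIRECTIONS = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
              (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def chain_code(points):
    """Chain code of a contour given as consecutive (row, col) pixels."""
    return [DIRECTIONS[(r2 - r1, c2 - c1)]
            for (r1, c1), (r2, c2) in zip(points, points[1:])]

# An "L"-shaped stroke: two steps down, then one step right.
print(chain_code([(0, 0), (1, 0), (2, 0), (2, 1)]))  # -> [6, 6, 0]
```

A character's chain-code sequence (or a histogram of its directions) is translation-invariant and compact, which is what makes it attractive for recognizing segmented plate characters.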
Book covers communicate information to potential readers, but can that same information be learned by computers?
We propose using a deep Convolutional Neural Network (CNN) to predict the genre of a book based on the visual clues provided by its cover.
The purpose of this research is to investigate whether relationships between books and their covers can be learned.
However, determining the genre of a book is a difficult task because covers can be ambiguous and genres can be overarching.
Despite this, we show that a CNN can extract features and learn underlying design rules set by the designer to define a genre.
Using machine learning, we can bring the large amount of available resources to bear on the book cover design process.
In addition, we present a new challenging dataset that can be used for many pattern recognition tasks.
The large spectrum available in the millimeter-Wave (mmWave) band has emerged as a promising solution for meeting the huge capacity requirements of the 5th generation (5G) wireless networks.
However, to fully harness the potential of mmWave communications, obstacles such as severe path loss, channel sparsity and hardware complexity should be overcome.
In this paper, we introduce a generalized reconfigurable antenna multiple-input multiple-output (MIMO) architecture that takes advantage of lens-based reconfigurable antennas.
The considered antennas can support multiple radiation patterns simultaneously by using a single RF chain.
The degrees of freedom provided by the reconfigurable antennas are used to, first, combat channel sparsity in MIMO mmWave systems.
Further, to suppress high path loss and shadowing at mmWave frequencies, we use a rate-one space-time block code.
Our analysis and simulations show that the proposed reconfigurable MIMO architecture achieves full-diversity gain by using linear receivers and without requiring channel state information at the transmitter.
Moreover, simulations show that the proposed architecture outperforms traditional MIMO transmission schemes in mmWave channel settings.
The remarkable technological advance in well-equipped wearable devices is pushing an increasing production of long first-person videos.
However, since most of these videos have long and tedious parts, they are forgotten or never seen.
Despite the large number of techniques proposed to fast-forward these videos by highlighting relevant moments, most of them are image-based only.
Most of these techniques disregard other relevant sensors present in current devices, such as high-definition microphones.
In this work, we propose a new approach to fast-forward videos using psychoacoustic metrics extracted from the soundtrack.
These metrics can be used to estimate the annoyance of a segment allowing our method to emphasize moments of sound pleasantness.
The efficiency of our method is demonstrated through qualitative results and quantitative results in terms of speed-up and instability.
We explain how the prototype automatic chess problem composer, Chesthetica, successfully composed a rare and interesting chess problem using the new Digital Synaptic Neural Substrate (DSNS) computational creativity approach.
This problem represents a greater challenge from a creative standpoint because the checkmate is not always clear and the method of winning even less so.
Creating a decisive chess problem of this type without the aid of an omniscient 7-piece endgame tablebase (and one that also abides by several chess composition conventions) would therefore be a challenge for most human players and composers working on their own.
The fact that a small computer with relatively low processing power and memory was sufficient to compose such a problem using the DSNS approach in just 10 days is therefore noteworthy.
In this report we document the event and result in some detail.
It lends additional credence to the DSNS as a viable new approach in the field of computational creativity, in particular in areas where human-like creativity is required for targeted or specific problems with no clear path to the solution.
Length-matching is an important technique to balance delays of bus signals in high-performance PCB routing.
Existing routers, however, may generate very dense meander segments.
Signals propagating along these meander segments exhibit a speedup effect due to crosstalk between the segments of the same wire, thus leading to mismatch of arrival times even under the same physical wire length.
In this paper, we present a post-processing method to enlarge the width and the distance of meander segments and hence distribute them more evenly on the board so that crosstalk can be reduced.
In the proposed framework, we model the sharing of available routing areas after removing dense meander segments from the initial routing, as well as the generation of relaxed meander segments and their groups for wire length compensation.
This model is transformed into an ILP problem and solved for a balanced distribution of wire patterns.
In addition, we adjust the locations of long wire segments according to wire priorities to swap free spaces toward critical wires that need much length compensation.
To reduce the problem space of the ILP model, we also introduce a progressive fixing technique so that wire patterns are grown gradually from the edge of the routing toward the center area.
Experimental results show that the proposed method can expand meander segments significantly even under very tight area constraints, so that the speedup effect can be alleviated effectively in high-performance PCB designs.
In this paper, a two-hop decode-and-forward cognitive radio system with deployed interference alignment is considered.
The relay node is energy-constrained and scavenges the energy from the interference signals.
In the literature, there are two main energy harvesting protocols, namely, time-switching relaying and power-splitting relaying.
We first demonstrate how to design the beamforming matrices for the considered primary and secondary networks.
Then, the system capacity under perfect and imperfect channel state information scenarios, considering different portions of time-switching and power-splitting protocols, is estimated.
We investigate the association between musical chords and lyrics by analyzing a large dataset of user-contributed guitar tablatures.
Motivated by the idea that the emotional content of chords is reflected in the words used in corresponding lyrics, we analyze associations between lyrics and chord categories.
We also examine the usage patterns of chords and lyrics in different musical genres, historical eras, and geographical regions.
Our overall results confirm a previously known association between Major chords and positive valence.
We also report a wide variation in this association across regions, genres, and eras.
Our results suggest the possible existence of different emotional associations for other types of chords.
Non-intrusive load monitoring (NILM), also known as energy disaggregation, is a blind source separation problem where a household's aggregate electricity consumption is broken down into electricity usages of individual appliances.
In this way, the cost and trouble of installing many measurement devices over numerous household appliances can be avoided, and only one device needs to be installed.
The problem has been well-known since Hart's seminal paper in 1992, and recently significant performance improvements have been achieved by adopting deep networks.
In this work, we focus on the idea that appliances have on/off states, and develop a deep network for further performance improvements.
Specifically, we propose a subtask gated network that combines the main regression network with an on/off classification subtask network.
Unlike typical multitask learning algorithms, where multiple tasks simply share the network parameters to take advantage of the relevance among tasks, the subtask gated network multiplies the main network's regression output by the subtask's classification probability.
When standby-power is additionally learned, the proposed solution surpasses the state-of-the-art performance for most of the benchmark cases.
The subtask gated network can be very effective for any problem that inherently has on/off states.
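The gating mechanism described above, multiplying the regression output by the on/off probability, can be sketched directly. The wattage values and logits below are made-up examples, not outputs of the paper's networks.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_prediction(regression_out, onoff_logits):
    """Subtask gating: multiply the regression output by the on/off
    classification probability, so predictions are pushed toward zero
    whenever the appliance is judged to be off."""
    return regression_out * sigmoid(onoff_logits)

watts = np.array([120.0, 115.0, 118.0, 117.0])  # main regression output
logits = np.array([4.0, 5.0, -6.0, -5.0])       # on, on, off, off
print(np.round(gated_prediction(watts, logits), 1))  # large when on, near 0 when off
```

The design choice is that the regression branch no longer has to learn the sharp drop to zero itself; the classification subtask handles the on/off switching, which is what makes the combination effective for appliances with distinct states.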
A smart city provides its people with a high standard of living through advanced technologies, and transport is one of its major foci.
With the advent of autonomous vehicles (AVs), an AV-based public transportation system has been proposed recently, which is capable of providing new forms of transportation services with high efficiency, high flexibility, and low cost.
For the benefit of passengers, multitenancy can increase market competition leading to lower service charge and higher quality of service.
In this paper, we study the pricing of the multi-tenant AV public transportation system and define three types of services.
The pricing process for each service type is modeled as a combinatorial auction, in which the service providers, as bidders, compete for offering transportation services.
The winners of the auction are determined through an integer linear program.
To prevent the bidders from raising their bids for higher returns, we propose a strategy-proof Vickrey-Clarke-Groves-based charging mechanism, which can maximize the social welfare, to settle the final charges for the customers.
We perform extensive simulations to verify the analytical results and evaluate the performance of the charging mechanism.
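The VCG payment rule, where each winner pays the welfare loss its participation imposes on the others, can be sketched with brute-force winner determination. The bidders, bundles, and bid values below are hypothetical, not the paper's auction instance.

```python
from itertools import product

def best_allocation(bids):
    """Brute-force winner determination: at most one bundle per bidder,
    bundles pairwise disjoint, maximizing the total bid value."""
    names = list(bids)
    options = [[None] + list(bids[n].items()) for n in names]
    best_alloc, best_val = {}, 0
    for combo in product(*options):
        used, val, alloc, feasible = set(), 0, {}, True
        for name, pick in zip(names, combo):
            if pick is None:
                continue
            bundle, bid = pick
            if used & bundle:       # bundle overlaps an already-sold item
                feasible = False
                break
            used |= bundle
            val += bid
            alloc[name] = bid
        if feasible and val > best_val:
            best_alloc, best_val = alloc, val
    return best_alloc, best_val

def vcg(bids):
    """Each winner pays the welfare loss its presence imposes on the others."""
    alloc, total = best_allocation(bids)
    payments = {}
    for w in alloc:
        _, without_w = best_allocation({n: b for n, b in bids.items() if n != w})
        payments[w] = without_w - (total - alloc[w])
    return alloc, payments

# Hypothetical providers bidding on two service bundles, x and y.
bids = {
    "A": {frozenset({"x"}): 5},
    "B": {frozenset({"y"}): 4},
    "C": {frozenset({"x", "y"}): 7},
}
alloc, pay = vcg(bids)
print(pay)  # A pays 3 and B pays 2: the welfare harm each imposes on C
```

Because a winner's payment does not depend on its own bid, truthful bidding is a dominant strategy, which is the strategy-proofness property the charging mechanism relies on.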
The size of a website's active user base directly affects its value.
Thus, it is important to monitor and influence a user's likelihood to return to a site.
Essential to this is predicting when a user will return.
Current state-of-the-art approaches to solve this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods.
We observe that both techniques are severely limited when applied to this problem.
Survival models can only incorporate aggregate representations of users instead of automatically learning a representation directly from a raw time series of user actions.
RNNs can automatically learn features, but cannot be directly trained with examples of non-returning users who have no target value for their return time.
We develop a novel RNN survival model that removes the limitations of the state of the art methods.
We demonstrate that this model can successfully be applied to return time prediction on a large e-commerce dataset with a superior ability to discriminate between returning and non-returning users than either method applied in isolation.
An important challenge in neuroevolution is to evolve complex neural networks with multiple modes of behavior.
Indirect encodings can potentially answer this challenge.
Yet in practice, indirect encodings do not yield effective multimodal controllers.
Thus, this paper introduces novel multimodal extensions to HyperNEAT, a popular indirect encoding.
A previous multimodal HyperNEAT approach called situational policy geometry assumes that multiple brains benefit from being embedded within an explicit geometric space.
However, experiments here illustrate that this assumption unnecessarily constrains evolution, resulting in lower performance.
Specifically, this paper introduces HyperNEAT extensions for evolving many brains without assuming geometric relationships between them.
The resulting Multi-Brain HyperNEAT can exploit human-specified task divisions to decide when each brain controls the agent, or can automatically discover when brains should be used, by means of preference neurons.
A further extension called module mutation allows evolution to discover the number of brains, enabling multimodal behavior with even less expert knowledge.
Experiments in several multimodal domains highlight that multi-brain approaches are more effective than HyperNEAT without multimodal extensions, and show that brains without a geometric relation to each other outperform situational policy geometry.
The conclusion is that Multi-Brain HyperNEAT provides several promising techniques for evolving complex multimodal behavior.
Human activity recognition in the IoT environment plays a central role in ambient assisted living, where human activities can be represented as a concatenated event stream generated from various smart objects.
From this concatenated stream, each activity must be distinguished separately for activity recognition to provide the services that users may need.
In this regard, accurately segmenting the entire stream at the precise boundary of each activity is an indispensable, high-priority task for realizing activity recognition.
Multiple human activities in an IoT environment generate varying event stream patterns, and the unpredictability of these patterns means they may include redundant or missing events.
In dealing with this complex segmentation problem, we found that the dynamic and confusing patterns cause major problems in three forms: inclusive event streams, redundant events, and shared events.
To address these problems, we exploited the contextual relationships associated with the activity status, i.e., whether an activity is ongoing or has terminated/started.
To discover the intrinsic relationships between the events in a stream, we utilized an LSTM model, adapting it for activity segmentation.
The inferred boundaries were then revised by our validation algorithm to correct slightly shifted boundaries.
Our experiments show high accuracy, above 95%, on our own testbed with various smart objects.
This is superior to prior works, which do not even assume an environment with multi-user activities and whose accuracies are only slightly above 80% in their own test environments.
This demonstrates that our approach is feasible enough to be applied in the IoT environment.
Cache-aided coded multicast leverages side information at wireless edge caches to efficiently serve multiple unicast demands via common multicast transmissions, leading to load reductions that are proportional to the aggregate cache size.
However, the increasingly dynamic, unpredictable, and personalized nature of the content that users consume challenges the efficiency of existing caching-based solutions in which only exact content reuse is explored.
This paper generalizes the cache-aided coded multicast problem to specifically account for the correlation among content files, such as the correlation between updated versions of dynamic data.
It is shown that (i) caching content pieces based on their correlation with the rest of the library, and (ii) jointly compressing requested files using cached information as references during delivery, can provide load reductions that go beyond those achieved with existing schemes.
This is accomplished via the design of a class of correlation-aware achievable schemes, shown to significantly outperform state-of-the-art correlation-unaware solutions.
Our results show that as we move towards real-time and/or personalized media dominated services, where exact cache hits are almost non-existent but updates can exhibit high levels of correlation, network cached information can still be useful as references for network compression.
Semantic segmentation and vision-based geolocalization in aerial images are challenging tasks in computer vision.
Due to the advent of deep convolutional nets and the availability of relatively low-cost UAVs, these tasks are currently attracting growing attention in the field.
We propose a novel multi-task multi-stage neural network that is able to handle the two problems at the same time, in a single forward pass.
The first stage of our network predicts pixelwise class labels, while the second stage provides a precise location using two branches.
One branch uses a regression network, while the other is used to predict a location map trained as a segmentation task.
From a structural point of view, our architecture uses encoder-decoder modules at each stage, having the same encoder structure re-used.
Furthermore, its size is limited to be tractable on an embedded GPU.
We achieve commercial GPS-level localization accuracy from satellite images with spatial resolution of 1 square meter per pixel in a city-wide area of interest.
On the task of semantic segmentation, we obtain state-of-the-art results on two challenging datasets, the Inria Aerial Image Labeling dataset and Massachusetts Buildings.
This article analyses the difference in timing between the online availability of articles and their corresponding print publication and how it affects two bibliometric indicators: Journal Impact Factor (JIF) and Immediacy Index.
This research examined 18,526 articles, the complete collection of articles and reviews published by a set of 61 journals on Urology and Nephrology in 2013 and 2014.
The findings suggest that Advance Online Publication (AOP) accelerates the citation of articles and affects the JIF and Immediacy Index values.
Regarding the JIF values, the comparison between journals with or without AOP showed statistically significant differences (P=0.001, Mann-Whitney U test).
The Spearman's correlation between the JIF and the median online-to-print publication delay was not statistically significant.
As to the Immediacy Index, a significant Spearman's correlation (rs=0.280, P=0.029) was found regarding the median online-to-print publication delays for journals published in 2014, although no statistically significant correlation was found for those published in 2013.
Most journals examined (n=52 out of 61) published their articles in AOP.
The analysis also showed different publisher practices: eight journals did not include the online posting dates in the full text, and nine journals published articles showing two different online posting dates, one provided on the journal website and another by Elsevier's ScienceDirect.
These practices suggest the need for transparency and standardization of the AOP dates of scientific articles for calculating bibliometric indicators for journals.
Multi-agent predictive modeling is an essential step for understanding physical, social and team-play systems.
Recently, Interaction Networks (INs) were proposed for the task of modeling multi-agent physical systems; however, INs scale with the number of interactions in the system (typically quadratic or higher order in the number of agents).
In this paper we introduce VAIN, a novel attentional architecture for multi-agent predictive modeling that scales linearly with the number of agents.
We show that VAIN is effective for multi-agent predictive modeling.
Our method is evaluated on tasks from challenging multi-agent prediction domains: chess and soccer, and outperforms competing multi-agent approaches.
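The attentional pooling idea behind the linear scaling can be sketched in plain Python. This is a simplified illustration, not the paper's exact architecture: the `vain_pool` name, the Gaussian kernel on attention vectors, and the per-agent loop are assumptions for exposition.

```python
import math

def vain_pool(encodings, attentions):
    """Attentional pooling sketch: the context for agent i is a weighted sum
    of the other agents' encodings, with weights given by a Gaussian kernel
    on learned attention vectors instead of per-pair interaction networks."""
    n = len(encodings)
    dim = len(encodings[0])
    contexts = []
    for i in range(n):
        weights = []
        for j in range(n):
            if j == i:
                weights.append(0.0)  # an agent does not attend to itself
            else:
                d2 = sum((a - b) ** 2
                         for a, b in zip(attentions[i], attentions[j]))
                weights.append(math.exp(-d2))
        total = sum(weights) or 1.0
        contexts.append([sum(weights[j] * encodings[j][k] for j in range(n)) / total
                         for k in range(dim)])
    return contexts
```

Agents whose attention vectors are close dominate each other's pooled context, so the interaction structure is learned rather than enumerated pairwise by a network.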
This paper proposes a new algorithm for Gaussian process classification based on posterior linearisation (PL).
In PL, a Gaussian approximation to the posterior density is obtained iteratively using the best possible linearisation of the conditional mean of the labels and accounting for the linearisation error.
Considering three widely used likelihood functions, PL generally provides lower classification errors on real data sets than the expectation propagation and Laplace algorithms.
The success of blockchain as the underlying technology for cryptocurrencies has opened up possibilities for its use in other application domains as well.
The main advantages of blockchain for its potential use in other domains are its inherent security mechanisms and immunity to different attacks.
A blockchain relies on a consensus method for agreeing on any new data.
Most of the consensus methods which are currently used for the blockchain of different cryptocurrencies require high computational power and thus are not apt for resource constrained systems.
In this article, we discuss and survey the various blockchain based consensus methods that are applicable to resource constrained IoT devices and networks.
A typical IoT network consists of several devices which have limited computational and communications capabilities.
Most often, these devices cannot perform the intensive computations and are starved for bandwidth.
Therefore, we discuss the possible measures that can be taken to reduce the computational power and convergence time for the underlying consensus methods.
We also discuss some of the alternatives to the public blockchain like private blockchain and tangle, and their potential adoption for IoT networks.
Furthermore, we discuss the existing consensus methods and blockchain implementations and explore the possibility of utilizing them to realize a blockchain based IoT network.
Some of the open research challenges are also put forward.
Energy efficiency is a crucial performance metric in sensor networks, directly determining the network lifetime.
Consequently, a key factor in WSN is to improve overall energy efficiency to extend the network lifetime.
Although many algorithms have been presented to optimize the energy factor, energy efficiency is still one of the major problems of WSNs, especially when there is a need to sample an area with different types of loads.
Unlike other energy-efficient schemes for hierarchical sampling, our hypothesis is that the network lifetime can be prolonged by adaptively modifying the sensing rates of cluster heads (CHs), in particular the processing and transmitting stages, in regions that are triggered significantly less often than others.
In order to do so we introduce the Adaptive Distributed Hierarchical Sensing (ADHS) algorithm.
This algorithm employs a homogeneous sensor network in a distributed fashion and changes the sampling rates of the CHs based on the variance of the sampled data, without significantly degrading the accuracy over the sensed area.
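The variance-driven rate adaptation can be sketched as follows; the function names, the thresholding rule, and the two-level rate choice are illustrative assumptions, not the exact ADHS procedure.

```python
def sample_variance(samples):
    """Population variance of a cluster head's recent sample window."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def adapt_rate(samples, base_rate, min_rate, threshold):
    """Sketch of the ADHS idea: a cluster head keeps its base sampling rate
    while recent data vary, and backs off to a minimum rate in quiet regions
    to save processing and transmission energy."""
    return base_rate if sample_variance(samples) >= threshold else min_rate
```

A region whose readings barely change is sampled at `min_rate`, while active regions keep the full `base_rate`, preserving accuracy where it matters.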
Automated facial identification and facial expression recognition have been topics of active research over the past few decades.
Facial and expression recognition find applications in human-computer interfaces, subject tracking, real-time security surveillance systems and social networking.
Several holistic and geometric methods have been developed to identify faces and expressions using public and local facial image databases.
In this work we present the evolution in facial image data sets and the methodologies for facial identification and recognition of expressions such as anger, sadness, happiness, disgust, fear and surprise.
We observe that most of the earlier methods for facial and expression recognition aimed at improving the recognition rates for facial feature-based methods using static images.
However, the recent methodologies have shifted focus towards robust implementation of facial/expression recognition from large image databases that vary with space (gathered from the internet) and time (video recordings).
The evolution trends in databases and methodologies for facial and expression recognition can be useful for assessing the next-generation topics that may have applications in security systems or personal identification systems that involve "Quantitative face" assessments.
In this paper, we describe a synthesis algorithm for safety specifications described as circuits.
Our algorithm is based on fixpoint computations, abstraction and refinement, and it uses binary decision diagrams as its symbolic data structure.
We evaluate our tool on the benchmarks provided by the organizers of the synthesis competition organized within the SYNT'14 workshop.
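The fixpoint core of safety synthesis can be sketched explicitly over state sets (the tool manipulates BDDs symbolically instead; the `safety_winning_region` helper and the turn-based game encoding here are assumptions for illustration).

```python
def safety_winning_region(safe, succ, is_ctrl):
    """Greatest-fixpoint sketch of safety synthesis (explicit-state analogue
    of the symbolic BDD computation): a safe state survives an iteration if
    the controller can pick a surviving successor (controller states), or if
    every successor survives (environment states)."""
    win = set(safe)
    while True:
        keep = set()
        for s in win:
            succs = succ.get(s, [])
            if is_ctrl(s):
                ok = any(t in win for t in succs)
            else:
                ok = bool(succs) and all(t in win for t in succs)
            if ok:
                keep.add(s)
        if keep == win:  # fixpoint reached
            return win
        win = keep
```

A winning controller strategy exists from exactly the states left in the fixpoint; abstraction and refinement serve to make this computation tractable on large state spaces.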
Human mobility is known to be distributed across several orders of magnitude of physical distances, which makes it generally difficult to endogenously find or define typical and meaningful scales.
Relevant analyses, from movements to geographical partitions, seem to be relative to some ad-hoc scale, or no scale at all.
Relying on geotagged data collected from photo-sharing social media, we apply community detection to movement networks constrained by increasing percentiles of the distance distribution.
Using a simple parameter-free discontinuity detection algorithm, we discover clear phase transitions in the community partition space.
The detection of these phases constitutes the first objective method of characterising endogenous, natural scales of human movement.
Our study covers nine regions, ranging from cities to countries of various sizes and a transnational area.
For all regions, the number of natural scales is remarkably low (2 or 3).
Further, our results hint at scale-related behaviours rather than scale-related users.
The partitions of the natural scales allow us to draw discrete multi-scale geographical boundaries, potentially capable of providing key insights in fields such as epidemiology or cultural contagion where the introduction of spatial boundaries is pivotal.
Data similarity (or distance) computation is a fundamental research topic which underpins many high-level applications based on similarity measures in machine learning and data mining.
However, in large-scale real-world scenarios, exact similarity computation has become daunting due to the "3V" nature (volume, velocity, and variety) of big data.
In such cases, the hashing techniques have been verified to efficiently conduct similarity estimation in terms of both theory and practice.
Currently, MinHash is a popular technique for efficiently estimating the Jaccard similarity of binary sets and furthermore, weighted MinHash is generalized to estimate the generalized Jaccard similarity of weighted sets.
This review focuses on categorizing and discussing the existing works of weighted MinHash algorithms.
In this review, we mainly categorize the Weighted MinHash algorithms into quantization-based approaches, "active index"-based ones and others, and show the evolution and inherent connection of the weighted MinHash algorithms, from the integer weighted MinHash algorithms to real-valued weighted MinHash ones (particularly the Consistent Weighted Sampling scheme).
Also, we have developed a Python toolbox for the algorithms and released it on GitHub.
Based on the toolbox, we experimentally conduct a comprehensive comparative study of the standard MinHash algorithm and the weighted MinHash ones.
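As a concrete anchor for the discussion, the standard (unweighted) MinHash estimator can be sketched as follows; the affine hash-family construction is an illustrative choice, not the toolbox's implementation.

```python
import random

def make_hash_family(k, seed=0):
    """k random affine hash functions modulo a large prime (illustrative)."""
    rng = random.Random(seed)
    prime = 2147483647
    params = [(rng.randrange(1, prime), rng.randrange(prime)) for _ in range(k)]
    return [(lambda x, a=a, b=b: (a * hash(x) + b) % prime) for a, b in params]

def minhash_signature(s, hash_funcs):
    """Signature: per hash function, keep the minimum value over the set."""
    return [min(h(x) for x in s) for h in hash_funcs]

def estimate_jaccard(sig_a, sig_b):
    """The fraction of agreeing slots estimates |A intersect B| / |A union B|,
    since each min-hash collides with probability equal to the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Weighted MinHash schemes generalize exactly this collision-probability argument from binary sets to weighted sets.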
Stream processing has reached the mainstream in the last years, as a new generation of open source distributed stream processing systems, designed for scaling horizontally on commodity hardware, has brought the capability for processing high volume and high velocity data streams to companies of all sizes.
In this work we propose a combination of temporal logic and property-based testing (PBT) for dealing with the challenges of testing programs that employ this programming model.
We formalize our approach in a discrete time temporal logic for finite words, with some additions to improve the expressiveness of properties, which includes timeouts for temporal operators and a binding operator for letters.
In particular we focus on testing Spark Streaming programs written with the Spark API for the functional language Scala, using the PBT library ScalaCheck.
For that we add temporal logic operators to a set of new ScalaCheck generators and properties, as part of our testing library sscheck.
Under consideration in Theory and Practice of Logic Programming (TPLP).
We present a fresh and broad yet simple approach towards information retrieval in general and diagnostics in particular by applying the theory of complex networks on multidimensional, dynamic images.
We demonstrate a successful use of our method with the time series generated from high content thermal imaging videos of patients suffering from the aqueous deficient dry eye (ADDE) disease.
Remarkably, network analyses of thermal imaging time series of contact lens users and patients upon whom Laser-Assisted in situ Keratomileusis (Lasik) surgery has been conducted, exhibit pronounced similarity with results obtained from ADDE patients.
We also propose a general framework for the transformation of multidimensional images to networks for futuristic biometry.
Our approach is general and scalable to other fluctuation-based devices where network parameters derived from fluctuations, act as effective discriminators and diagnostic markers.
A significant amount of research literature is dedicated to interference mitigation in Wireless Mesh Networks (WMNs), with a special emphasis on designing channel allocation (CA) schemes which alleviate the impact of interference on WMN performance.
But having countless CA schemes at one's disposal makes the task of choosing a suitable CA for a given WMN extremely tedious and time consuming.
In this work, we propose a new interference estimation and CA performance prediction algorithm called CALM, which is inspired by social theory.
We borrow the sociological idea of a "sui generis" social reality, and apply it to WMNs with significant success.
To achieve this, we devise a novel Sociological Idea Borrowing Mechanism that facilitates easy operationalization of sociological concepts in other domains.
Further, we formulate a heuristic Mixed Integer Programming (MIP) model called NETCAP which makes use of link quality estimates generated by CALM to offer a reliable framework for network capacity prediction.
We demonstrate the efficacy of CALM by evaluating its theoretical estimates against experimental data obtained through exhaustive simulations in an ns-3 802.11g environment, for a comprehensive CA test-set of forty CA schemes.
We compare CALM with three existing interference estimation metrics, and demonstrate that it is consistently more reliable.
CALM achieves an accuracy of over 90% in performance testing and 88% in stress testing, while the accuracy of other metrics drops to under 75%.
It reduces errors in CA performance prediction by as much as 75% when compared to other metrics.
Finally, we validate the expected network capacity estimates generated by NETCAP, and show that they are quite accurate, deviating by as little as 6.4% on average when compared to experimentally recorded results in performance testing.
Many of the creative and figurative elements that make language exciting are lost in translation in current natural language generation engines.
In this paper, we explore a method to harvest templates from positive and negative reviews in the restaurant domain, with the goal of vastly expanding the types of stylistic variation available to the natural language generator.
We learn hyperbolic adjective patterns that are representative of the strongly-valenced expressive language commonly used in either positive or negative reviews.
We then identify and delexicalize entities, and use heuristics to extract generation templates from review sentences.
We evaluate the learned templates against more traditional review templates, using subjective measures of "convincingness", "interestingness", and "naturalness".
Our results show that the learned templates score highly on these measures.
Finally, we analyze the linguistic categories that characterize the learned positive and negative templates.
We plan to use the learned templates to improve the conversational style of dialogue systems in the restaurant domain.
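The delexicalization step can be sketched as follows; the slot names, the example sentence, and the `delexicalize` helper are hypothetical, for illustration only.

```python
import re

def delexicalize(sentence, entities):
    """Sketch of template extraction by delexicalization: replace named
    entities in a review sentence with slot tokens, yielding a reusable
    generation template that keeps the expressive, valenced language."""
    template = sentence
    for slot, name in entities.items():
        template = re.sub(re.escape(name), "[" + slot + "]", template)
    return template
```

At generation time the process is reversed: the dialogue system fills the slots with entities from the current dialogue state.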
Blockchain has the potential to revolutionize the way we store, use, and process data.
Information on most blockchains can be viewed by every node hosting the blockchain, which means that most blockchains cannot handle private data.
Decentralized databases exist that guarantee privacy by encrypting user data with the user's private key, but this prevents easy data sharing.
However, in many real world applications, from student data to medical records, it is desirable that user data is anonymously searchable.
In this paper we present a novel system that gives users ownership over their data while at the same time enabling them to make their data searchable within previously agreed upon limits.
Our system implements a strong notion of ownership using a self-sovereign identity system and a weak notion of ownership using multiple centralized databases together with a blockchain and a tumbling process.
We discuss applications of our methods to university student records and medical data.
Deep neural networks have been shown to achieve state-of-the-art performance in several machine learning tasks.
Stochastic Gradient Descent (SGD) is the preferred optimization algorithm for training these networks and asynchronous SGD (ASGD) has been widely adopted for accelerating the training of large-scale deep networks in a distributed computing environment.
However, in practice it is quite challenging to tune the training hyperparameters (such as the learning rate) when using ASGD so as to achieve convergence and linear speedup, since the stability of the optimization algorithm is strongly influenced by the asynchronous nature of parameter updates.
In this paper, we propose a variant of the ASGD algorithm in which the learning rate is modulated according to the gradient staleness and provide theoretical guarantees for convergence of this algorithm.
Experimental verification is performed on commonly used image classification benchmarks, CIFAR-10 and ImageNet, to demonstrate the superior effectiveness of the proposed approach compared to SSGD (synchronous SGD) and the conventional ASGD algorithm.
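A staleness-dependent modulation can be sketched as below. The inverse rule used here is a common illustrative choice, not necessarily the exact modulation analyzed in the paper.

```python
def staleness_adjusted_lr(base_lr, staleness):
    """Divide the base learning rate by the gradient staleness (measured in
    parameter-update steps since the gradient was computed), so that very
    delayed gradients take proportionally smaller steps."""
    return base_lr / max(1, staleness)

def asgd_update(weights, grad, base_lr, staleness):
    """One staleness-aware ASGD parameter update."""
    lr = staleness_adjusted_lr(base_lr, staleness)
    return [w - lr * g for w, g in zip(weights, grad)]
```

Fresh gradients (staleness 1) take full steps, while a gradient delayed by ten updates moves the parameters only a tenth as far, which is what stabilizes the asynchronous dynamics.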
Conventional image motion based structure from motion methods first compute optical flow, then solve for the 3D motion parameters based on the epipolar constraint, and finally recover the 3D geometry of the scene.
However, errors in optical flow due to regularization can lead to large errors in 3D motion and structure.
This paper investigates whether performance and consistency can be improved by avoiding optical flow estimation in the early stages of the structure from motion pipeline, and it proposes a new direct method based on image gradients (normal flow) only.
The main idea lies in a reformulation of the positive-depth constraint, which allows the use of well-known minimization techniques to solve for 3D motion.
The 3D motion estimate is then refined and structure estimated adding a regularization based on depth.
Experimental comparisons on standard synthetic datasets and the real-world driving benchmark KITTI using three different optical flow algorithms show that the method achieves better accuracy in all but one case.
Furthermore, it outperforms existing normal flow based 3D motion estimation techniques.
Finally, the recovered 3D geometry is shown to be also very accurate.
Bilinear models provide rich representations compared with linear models.
They have been applied in various visual tasks, such as object recognition, segmentation, and visual question-answering, to get state-of-the-art performances taking advantage of the expanded representations.
However, bilinear representations tend to be high-dimensional, limiting the applicability to computationally complex tasks.
We propose low-rank bilinear pooling using Hadamard product for an efficient attention mechanism of multimodal learning.
We show that our model outperforms compact bilinear pooling in visual question-answering tasks, achieving state-of-the-art results on the VQA dataset while being more parsimonious.
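The core computation, replacing a full bilinear form with a Hadamard product of low-rank projections, can be sketched in plain Python. The shapes and names are assumptions for exposition; the actual model also applies nonlinearities and a final output projection.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def low_rank_bilinear(x, y, U, V):
    """Low-rank bilinear pooling sketch: instead of computing x^T W_k y with
    a full third-order tensor W (quadratic in input dimensions per output),
    project x and y into a shared d-dimensional space and combine them with
    the elementwise (Hadamard) product: z = (U x) o (V y)."""
    return [a * b for a, b in zip(matvec(U, x), matvec(V, y))]
```

The parameter count drops from one full matrix per output dimension to two projection matrices, which is the parsimony the abstract refers to.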
Among existing privacy-preserving approaches, Differential Privacy (DP) is a powerful tool that can provide privacy-preserving noisy query answers over statistical databases and has been widely adopted in many practical fields.
In particular, Randomized Aggregatable Privacy-Preserving Ordinal Response (RAPPOR), a DP mechanism, provides strong privacy, efficiency, and high-utility guarantees for each client string in data crowdsourcing.
However, in Internet of Things (IoT) settings such as the smart grid, data are often processed in batches.
Therefore, a new randomized response algorithm that supports batch processing would be more efficient and suitable for IoT applications than existing randomized response algorithms.
In this paper, we propose a new randomized response algorithm that achieves differential privacy and utility guarantees for consumers' behaviors, and processes a batch of data at a time.
Firstly, by applying sparse coding in this algorithm, a behavior signature dictionary is created from the aggregated energy consumption data in fog.
Then, we add noise into the behavior signature dictionary by classical randomized response techniques and achieve the differential privacy after data re-aggregation.
Through a security analysis based on the principle of differential privacy and experimental verification, we find that our algorithm preserves consumers' privacy without compromising utility.
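The classical binary randomized response underlying RAPPOR-style mechanisms can be sketched as follows; this is the textbook single-bit building block, not the paper's batch algorithm.

```python
import math
import random

def randomized_response(bit, epsilon, rng):
    """Binary randomized response: report the true bit with probability
    e^eps / (e^eps + 1), the flipped bit otherwise; one such report
    satisfies eps-differential privacy."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_true else 1 - bit

def debias_mean(reports, epsilon):
    """Unbiased estimate of the true mean recovered from noisy reports,
    inverting E[report] = mu * (2p - 1) + (1 - p)."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

Each individual report is deniable, yet the aggregator can still recover accurate population statistics after re-aggregation, which is the utility-preservation property the abstract claims.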
In motion analysis and understanding it is important to be able to fit a suitable model or structure to the temporal series of observed data, in order to describe motion patterns in a compact way, and to discriminate between them.
In an unsupervised context, i.e., no prior model of the moving object(s) is available, such a structure has to be learned from the data in a bottom-up fashion.
In recent times, volumetric approaches in which the motion is captured from a number of cameras and a voxel-set representation of the body is built from the camera views, have gained ground due to attractive features such as inherent view-invariance and robustness to occlusions.
Automatic, unsupervised segmentation of moving bodies along entire sequences, in a temporally-coherent and robust way, has the potential to provide a means of constructing a bottom-up model of the moving body, and track motion cues that may be later exploited for motion classification.
Spectral methods such as locally linear embedding (LLE) can be useful in this context, as they preserve "protrusions", i.e., high-curvature regions of the 3D volume, of articulated shapes, while improving their separation in a lower dimensional space, making them in this way easier to cluster.
In this paper we therefore propose a spectral approach to unsupervised and temporally-coherent body-protrusion segmentation along time sequences.
Volumetric shapes are clustered in an embedding space, clusters are propagated in time to ensure coherence, and merged or split to accommodate changes in the body's topology.
Experiments on both synthetic and real sequences of dense voxel-set data are shown.
This supports the ability of the proposed method to cluster body parts consistently over time in a totally unsupervised fashion, its robustness to sampling density and shape quality, and its potential for bottom-up model construction.
In this paper, we discuss the formalized approach for generating and estimating symbols (and alphabets), which can be communicated by the wide range of non-verbal means based on specific user requirements (medium, priorities, type of information that needs to be conveyed).
A short characterization of the basic terms and parameters of such symbols (and alphabets), along with approaches to generate them, is given.
Then the framework, experimental setup, and some machine learning methods to estimate usefulness and effectiveness of the nonverbal alphabets and systems are presented.
Previous results demonstrate that the use of multimodal data sources (such as wearable accelerometers, heart monitors, muscle movement sensors, and brain-computer interfaces) along with machine learning approaches can provide a deeper understanding of the usefulness and effectiveness of such alphabets and systems for nonverbal and situated communication.
The symbols (and alphabets) generated and estimated by such methods may be useful in various applications: from synthetic languages and constructed scripts to multimodal nonverbal and situated interaction between people and artificial intelligence systems through Human-Computer Interfaces, such as mouse gestures, touchpads, body gestures, eyetracking cameras, wearables, and brain-computing interfaces, especially in applications for elderly care and people with disabilities.
Optimization is becoming a crucial element in industrial applications involving sustainable alternative energy systems.
During the design of such systems, the engineer/decision maker would often encounter noise factors (e.g. solar insolation and ambient temperature fluctuations) when their system interacts with the environment.
In this chapter, the sizing and design optimization of a solar-powered irrigation system was considered.
This problem is multivariate, noisy, nonlinear and multiobjective.
This design problem was tackled by first using the Fuzzy Type II approach to model the noise factors.
Subsequently, the Bacterial Foraging Algorithm (BFA) (in the context of a weighted-sum framework) was employed to solve this multiobjective fuzzy design problem.
This method was then used to construct the approximate Pareto frontier as well as to identify the best solution option in a fuzzy setting.
Comprehensive analyses and discussions were performed on the generated numerical results with respect to the implemented solution methods.
This methodology paper addresses high-performance high-productivity programming on spatial architectures.
Spatial architectures are efficient for executing dataflow algorithms, yet for high-performance programming, the productivity is low and verification is painful.
We show that coding and verification are the biggest obstacles to the wide adoption of spatial architectures.
We propose a new programming methodology, T2S (Temporal to Spatial), to remove this obstacle.
A programmer specifies a temporal definition and a spatial mapping.
The temporal definition defines the functionality to compute, while the spatial mapping defines how to decompose the functionality and map the decomposed pieces onto a spatial architecture.
The specification precisely controls a compiler to actually implement the loop and data transformations specified in the mapping.
The specification is loop-nest- and matrix-oriented, and thus lends itself to the compiler for automatic, static verification.
Many generic, strategic loop and data optimizations can be systematically expressed.
Consequently, high performance is expected with substantially higher productivity: compared with high-performance programming in today's high-level synthesis (HLS) languages or hardware description languages (HDLs), the engineering effort on coding and verification is expected to be reduced from months to hours, a reduction of 2 or 3 orders of magnitude.
Proof-of-Stake systems randomly choose, on each round, one of the participants as a consensus leader that extends the chain with the next block such that the selection probability is proportional to the owned stake.
However, distributed random number generation is notoriously difficult.
Systems that derive randomness from the previous blocks are completely insecure; solutions that provide secure random selection are inefficient due to their high communication complexity; and approaches that balance security and performance exhibit selection bias.
When block creation is rewarded with new stake, even a minor bias can have a severe cumulative effect.
In this paper, we propose Robust Round Robin, a new consensus scheme that addresses this selection problem.
We create reliable long-term identities by bootstrapping from an existing infrastructure, such as Intel's SGX processors, or by mining them starting from an initial fair distribution.
For leader selection we use a deterministic approach.
On each round, we select a set of the previously created identities as consensus leader candidates in a round-robin manner.
Because simple round-robin alone is vulnerable to attacks and offers poor liveness, we complement such deterministic selection policy with a lightweight endorsement mechanism that is an interactive protocol between the leader candidates and a small subset of other system participants.
Our solution has good efficiency, as it requires no expensive distributed randomness generation, and it provides block-creation fairness, which is crucial in deployments that reward block creation with new stake.
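The deterministic candidate selection can be sketched as below; the sliding-window scheme and function name are illustrative assumptions, and the endorsement protocol that follows candidate selection is omitted.

```python
def candidates_for_round(identities, round_no, k):
    """Deterministic round-robin candidate selection (sketch): identities,
    ordered by creation time, are scanned in windows of k per round,
    wrapping around, so every identity periodically becomes a leader
    candidate without any distributed randomness generation."""
    n = len(identities)
    start = (round_no * k) % n
    return [identities[(start + i) % n] for i in range(k)]
```

Because the schedule is a pure function of the round number and the identity list, every participant computes the same candidate set locally; the lightweight endorsement mechanism then guards against candidates who misbehave or go offline.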
Feature selection is a dimensionality reduction technique that selects a subset of representative features from high dimensional data by eliminating irrelevant and redundant features.
Recently, feature selection combined with sparse learning has attracted significant attention due to its outstanding performance compared with traditional feature selection methods that ignore correlations between features.
These works first map data onto a low-dimensional subspace and then select features by posing a sparsity constraint on the transformation matrix.
However, they are restricted by design to linear data transformation, a potential drawback given that the underlying correlation structures of data are often non-linear.
To capture a more sophisticated embedding, we propose an unsupervised feature selection approach that uses a single-layer autoencoder in a joint framework of feature selection and manifold learning.
More specifically, we enforce column sparsity on the weight matrix connecting the input layer and the hidden layer, as in previous work.
Additionally, we include spectral graph analysis on the projected data into the learning process to achieve local data geometry preservation from the original data space to the low-dimensional feature space.
Extensive experiments are conducted on image, audio, text, and biological data.
The promising experimental results validate the superiority of the proposed method.
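How column sparsity on the input-to-hidden weight matrix yields a feature ranking can be sketched as follows; the helper names and the post-hoc top-k selection are assumptions for illustration.

```python
def column_scores(W):
    """Score each input feature by the L2 norm of the corresponding column
    of the input-to-hidden weight matrix W (rows = hidden units, columns =
    input features); column sparsity in training drives the columns of
    irrelevant features toward zero."""
    n_features = len(W[0])
    return [sum(row[j] ** 2 for row in W) ** 0.5 for j in range(n_features)]

def select_features(W, k):
    """Indices of the k highest-scoring (most-used) input features."""
    scores = column_scores(W)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]
```

A feature whose column is (near) zero contributes nothing to the hidden representation and can be discarded without affecting the learned embedding.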
Although a number of solutions exist for the problems of coverage, search and target localization---commonly addressed separately---whether there exists a unified strategy that addresses these objectives in a coherent manner without being application-specific remains a largely open research question.
In this paper, we develop a receding-horizon ergodic control approach, based on hybrid systems theory, that has the potential to fill this gap.
The nonlinear model predictive control algorithm plans real-time motions that optimally improve ergodicity with respect to a distribution defined by the expected information density across the sensing domain.
We establish a theoretical framework for global stability guarantees with respect to a distribution.
Moreover, the approach is distributable across multiple agents, so that each agent can independently compute its own control while sharing statistics of its coverage across a communication network.
We demonstrate the method in both simulation and in experiment in the context of target localization, illustrating that the algorithm is independent of the number of targets being tracked and can be run in real-time on computationally limited hardware platforms.
For source sequences of L symbols, we propose a more realistic alternative to the usual benchmark of the number of code letters per source letter.
Our idea is based on a quantifier of information fluctuation of a source, F(U), which corresponds to the second central moment of the random variable that measures the information content of a source symbol.
An alternative interpretation of typical sequences is additionally provided through this approach.
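The quantity F(U) can be computed directly from its definition as the second central moment of the self-information; the function name is ours, but the formula follows the definition in the text.

```python
import math

def info_fluctuation(probs):
    """F(U): the variance (second central moment) of the self-information
    -log2 p(x) of a source symbol. It is zero exactly when all symbols are
    equiprobable, i.e., when the information content never fluctuates."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return sum(p * (-math.log2(p) - entropy) ** 2 for p in probs if p > 0)
```

For the source (1/2, 1/4, 1/4), the self-informations are 1, 2, and 2 bits around an entropy of 1.5 bits, giving F(U) = 0.25 bits squared.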
Lossless Feedback Delay Networks (FDNs) are commonly used as a design prototype for artificial reverberation algorithms.
The lossless property is dependent on the feedback matrix, which connects the output of a set of delays to their inputs, and the lengths of the delays.
Both unitary and triangular feedback matrices are known to constitute lossless FDNs; however, the most general class of lossless feedback matrices has not been identified.
In this contribution, it is shown that the FDN is lossless for any set of delays, if all irreducible components of the feedback matrix are diagonally similar to a unitary matrix.
The necessity of the generalized class of feedback matrices is demonstrated by examples of FDN designs proposed in literature.
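For small explicit matrices, the characterization can be checked numerically; this plain-Python sketch assumes a candidate diagonal similarity d is given, whereas establishing losslessness in general requires finding one.

```python
def is_unitary(M, tol=1e-9):
    """Check M^H M = I for a square (possibly complex) matrix of nested lists."""
    n = len(M)
    for i in range(n):
        for j in range(n):
            s = sum(M[k][i].conjugate() * M[k][j] for k in range(n))
            if abs(s - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

def diagonally_similar_unitary(A, d, tol=1e-9):
    """Losslessness test sketch for an irreducible FDN feedback matrix A:
    check whether D^-1 A D is unitary for the diagonal similarity D = diag(d)."""
    n = len(A)
    M = [[A[i][j] * d[j] / d[i] for j in range(n)] for i in range(n)]
    return is_unitary(M, tol)
```

For example, the non-unitary matrix [[0, 2], [1/2, 0]] is diagonally similar to the unitary permutation matrix [[0, 1], [1, 0]] via D = diag(2, 1), so an FDN built on it is lossless for any delay lengths.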
This paper proposes a novel method to optimize bandwidth usage for object detection in critical communication scenarios.
We develop two operating models of active information seeking.
The first model identifies promising regions in low resolution imagery and progressively requests higher resolution regions on which to perform recognition of higher semantic quality.
The second model identifies promising regions in low resolution imagery while simultaneously predicting the approximate location of the object of higher semantic quality.
From this general framework, we develop a car recognition system via identification of its license plate and evaluate the performance of both models on a car dataset that we introduce.
Results are compared with traditional JPEG compression and demonstrate that our system saves up to one order of magnitude of bandwidth while sacrificing little in terms of recognition performance.
We study the commutative positive varieties of languages closed under various operations: shuffle, renaming and product over one-letter alphabets.
Imputing incomplete medical tests and predicting patient outcomes are crucial for guiding the decision making for therapy, such as after an Achilles Tendon Rupture (ATR).
We formulate the problem of data imputation and prediction for ATR relevant medical measurements into a recommender system framework.
By applying MatchBox, a collaborative filtering approach, to a real dataset collected from 374 ATR patients, we aim to offer personalized medical data imputation and prediction.
In this work, we show the feasibility of this approach and discuss potential research directions by conducting initial qualitative evaluations.
Spectral clustering is one of the most popular clustering approaches with the capability to handle some challenging clustering problems.
Most spectral clustering methods provide a nonlinear map from the data manifold to a subspace.
Only little work focuses on the explicit linear map, which can be viewed as unsupervised distance metric learning.
In practice, the selection of the affinity matrix exhibits a tremendous impact on the unsupervised learning.
While much success of affinity learning has been achieved in recent years, some issues such as noise reduction remain to be addressed.
In this paper, we propose a novel method, dubbed Adaptive Affinity Matrix (AdaAM), to learn an adaptive affinity matrix and derive a distance metric from the affinity.
We assume the affinity matrix to be positive semidefinite with ability to quantify the pairwise dissimilarity.
Our method is based on posing the optimization of objective function as a spectral decomposition problem.
We yield the affinity from both the original data distribution and the widely-used heat kernel.
The provided matrix can be regarded as the optimal representation of pairwise relationship on the manifold.
Extensive experiments on a number of real-world data sets show the effectiveness and efficiency of AdaAM.
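For context, a minimal sketch of the heat-kernel baseline that adaptive affinity learning builds on: construct the affinity, form the symmetric normalized Laplacian, and cut along the Fiedler vector. The data, the bandwidth sigma, and the two-cluster setup are illustrative assumptions, not AdaAM itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated 2-D blobs as a toy stand-in for a data manifold.
X = np.vstack([rng.normal(0, 0.3, (20, 2)),
               rng.normal(4, 0.3, (20, 2))])

# Heat-kernel affinity, the widely used hand-tuned baseline;
# sigma is a free parameter (here fixed to 1.0).
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / (2 * 1.0 ** 2))
np.fill_diagonal(W, 0)

# Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
dinv = 1.0 / np.sqrt(W.sum(1))
L = np.eye(len(X)) - dinv[:, None] * W * dinv[None, :]

# The sign of the Fiedler vector (2nd smallest eigenvector) gives a 2-way cut.
vals, vecs = np.linalg.eigh(L)
labels = (vecs[:, 1] > 0).astype(int)
```

AdaAM replaces the fixed heat kernel W with an affinity learned from the data via spectral decomposition, but the downstream Laplacian machinery is the same.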
Multimedia streaming services over spoken dialog systems have become ubiquitous.
User-entity affinity modeling is critical for the system to understand and disambiguate user intents and personalize user experiences.
However, fully voice-based interaction demands quantification of novel behavioral cues to determine user affinities.
In this work, we propose using play duration cues to learn a matrix factorization based collaborative filtering model.
We first binarize play durations to obtain implicit positive and negative affinity labels.
The Bayesian Personalized Ranking objective and learning algorithm are employed in our low-rank matrix factorization approach.
To cope with uncertainties in the implicit affinity labels, we propose to apply a weighting function that emphasizes the importance of high confidence samples.
Based on a large-scale database of Alexa music service records, we evaluate the affinity models by computing Spearman correlation between play durations and predicted affinities.
Comparing different data utilizations and weighting functions, we find that employing both positive and negative affinity samples with a convex weighting function yields the best performance.
Further analysis demonstrates the model's effectiveness on individual entity level and provides insights on the temporal dynamics of observed affinities.
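The binarization and confidence-weighting steps can be sketched as follows; the 0.5 completion threshold and the particular convex weight are hypothetical choices for illustration, not the functions used in the paper.

```python
def binarize(duration, track_length, thresh=0.5):
    """Implicit affinity label: listening past `thresh` of the track is
    treated as positive, an early skip as negative (threshold assumed)."""
    return 1 if duration / track_length >= thresh else 0

def convex_weight(duration, track_length):
    """Convex confidence weight: completion ratios near 0 or 1 are
    high-confidence evidence, ratios near the threshold are ambiguous."""
    r = min(duration / track_length, 1.0)
    return (2 * r - 1) ** 2      # one possible convex choice

durations = [5.0, 30.0, 110.0, 118.0]   # seconds listened (toy values)
length = 120.0                           # track length in seconds
labels = [binarize(d, length) for d in durations]
weights = [convex_weight(d, length) for d in durations]
```

In the weighted matrix-factorization objective, each (user, entity) sample's loss term would then be scaled by its weight, so near-complete plays and immediate skips dominate training.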
Reinforcement learning (RL) is an area of research that has blossomed tremendously in recent years and has shown remarkable potential for artificial intelligence based opponents in computer games.
This success is primarily due to the vast capabilities of convolutional neural networks, that can extract useful features from noisy and complex data.
Games are excellent tools to test and push the boundaries of novel RL algorithms because they give valuable insight into how well an algorithm can perform in isolated environments without the real-life consequences.
Real-time strategy (RTS) games are a genre of tremendous complexity that challenges the player in both short- and long-term planning.
There is much research that focuses on applied RL in RTS games, and novel advances are therefore anticipated in the not too distant future.
However, there are to date few environments for testing RTS AIs.
Environments in the literature are often either overly simplistic, such as microRTS, or complex and without the possibility for accelerated learning on consumer hardware like StarCraft II.
This paper introduces the Deep RTS game environment for testing cutting-edge artificial intelligence algorithms for RTS games.
Deep RTS is a high-performance RTS game made specifically for artificial intelligence research.
It supports accelerated learning, meaning that it can run up to 50,000 times faster than existing RTS games.
Deep RTS has a flexible configuration, enabling research in several different RTS scenarios, including partially observable state-spaces and map complexity.
We show that Deep RTS lives up to our promises by comparing its performance with microRTS, ELF, and StarCraft II on high-end consumer hardware.
Using Deep RTS, we show that a Deep Q-Network agent beats random-play agents over 70% of the time.
Deep RTS is publicly available at https://github.com/cair/DeepRTS.
A brain computer interface (BCI) is a system which provides direct communication between the mind of a person and the outside world by using only brain activity (EEG).
The event-related potential (ERP)-based BCI problem consists of binary pattern recognition.
Linear discriminant analysis (LDA) is widely used to solve this type of classification problems, but it fails when the number of features is large relative to the number of observations.
In this work we propose a penalized version of the sparse discriminant analysis (SDA), called Kullback-Leibler penalized sparse discriminant analysis (KLSDA).
This method inherits both the discriminative feature selection and classification properties of SDA and it also improves SDA performance through the addition of Kullback-Leibler class discrepancy information.
The KLSDA method is designed to automatically select the optimal regularization parameters.
Numerical experiments with two real ERP-EEG datasets show that this new method outperforms standard SDA.
We describe a novel family of models of multi-layer feedforward neural networks in which the activation functions are encoded via penalties in the training problem.
Our approach is based on representing a non-decreasing activation function as the argmin of an appropriate convex optimization problem.
The new framework allows for algorithms such as block-coordinate descent methods to be applied, in which each step is composed of a simple (no hidden layer) supervised learning problem that is parallelizable across data points and/or layers.
Experiments indicate that the proposed models provide excellent initial guesses for weights for standard neural networks.
In addition, the model provides avenues for interesting extensions, such as robustness against noisy inputs and optimizing over parameters in activation functions.
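As a concrete instance of the argmin representation: ReLU is the argmin of a simple constrained quadratic, max(x, 0) = argmin_{z >= 0} (1/2)(z - x)^2. The sketch below verifies this identity by brute force on a grid (the grid itself is just a checking device, not part of the method).

```python
import numpy as np

def relu_as_argmin(x, grid=None):
    """ReLU recovered as the argmin of a convex problem:
    relu(x) = argmin_{z >= 0} 0.5 * (z - x)^2.
    Brute-force the argmin on a grid to compare with the closed form."""
    if grid is None:
        grid = np.linspace(0, 10, 100001)   # feasible set z >= 0, step 1e-4
    return grid[np.argmin(0.5 * (grid - x) ** 2)]

vals = {x: relu_as_argmin(x) for x in [-3.0, 0.0, 2.5]}
```

Encoding activations this way is what lets each block-coordinate step reduce to a penalized (convex) regression over one layer's weights.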
Recently, researchers started to pay attention to the detection of temporal shifts in the meaning of words.
However, most (if not all) of these approaches restricted their efforts to uncovering change over time, thus neglecting other valuable dimensions such as social or political variability.
We propose an approach for detecting semantic shifts between different viewpoints--broadly defined as a set of texts that share a specific metadata feature, which can be a time-period, but also a social entity such as a political party.
For each viewpoint, we learn a semantic space in which each word is represented as a low dimensional neural embedded vector.
The challenge is to compare the meaning of a word in one space to its meaning in another space and measure the size of the semantic shifts.
We compare the effectiveness of a measure based on optimal transformations between the two spaces with a measure based on the similarity of the neighbors of the word in the respective spaces.
Our experiments demonstrate that the combination of these two performs best.
We show that the semantic shifts not only occur over time, but also along different viewpoints in a short period of time.
For evaluation, we demonstrate how this approach captures meaningful semantic shifts and can help improve other tasks such as the contrastive viewpoint summarization and ideology detection (measured as classification accuracy) in political texts.
We also show that the two laws of semantic change which were empirically shown to hold for temporal shifts also hold for shifts across viewpoints.
These laws state that frequent words are less likely to shift meaning while words with many senses are more likely to do so.
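One natural reading of the "optimal transformation between the two spaces" is orthogonal Procrustes alignment; the sketch below (with synthetic embeddings, not the paper's corpora) recovers a planted rotation between two viewpoint spaces and scores a word's shift as post-alignment cosine distance.

```python
import numpy as np

rng = np.random.default_rng(2)

def procrustes_align(A, B):
    """Best orthogonal map R minimizing ||A R - B||_F:
    R = U V^T from the SVD of A^T B (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

# Embedding matrix for viewpoint 1, and a rotated copy for viewpoint 2.
A = rng.standard_normal((50, 8))
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
B = A @ Q

R = procrustes_align(A, B)
residual = np.linalg.norm(A @ R - B)

def shift_score(a_vec, b_vec):
    """Per-word semantic shift: cosine distance after alignment."""
    a, b = a_vec @ R, b_vec
    return 1 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```

Words whose vectors remain close after the global alignment are stable; large post-alignment distances flag candidate semantic shifts, which the paper combines with a neighbor-similarity measure.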
Author name ambiguity in a digital library may affect the findings of research that mines authorship data of the library.
This study evaluates author name disambiguation in DBLP, a widely used but insufficiently evaluated digital library for its disambiguation performance.
In doing so, this study takes a triangulation approach that author name disambiguation for a digital library can be better evaluated when its performance is assessed on multiple labeled datasets with comparison to baselines.
Tested on three types of labeled data containing 5,000 ~ 700K disambiguated names and 6M pairs of disambiguated names, DBLP is shown to assign author names quite accurately to distinct authors, resulting in pairwise precision, recall, and F1 measures around 0.90 or above overall.
DBLP's author name disambiguation performs well even on large ambiguous name blocks, but poorly at distinguishing different authors who share the same name.
When compared to other disambiguation algorithms, DBLP's disambiguation performance is quite competitive, possibly due to its hybrid disambiguation approach combining algorithmic disambiguation and manual error correction.
A discussion follows on strengths and weaknesses of labeled datasets used in this study for future efforts to evaluate author name disambiguation on a digital library scale.
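The pairwise precision, recall, and F1 measures used above can be computed directly from two clusterings of names; a minimal sketch on a toy example:

```python
from itertools import combinations

def pairs(clusters):
    """All unordered pairs of items placed in the same cluster."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}

def pairwise_prf(truth, predicted):
    t, p = pairs(truth), pairs(predicted)
    tp = len(t & p)                         # pairs both clusterings agree on
    prec = tp / len(p) if p else 0.0
    rec = tp / len(t) if t else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

# Toy labeled data: truth merges a, b, c; the system merges c with d instead.
truth = [{"a", "b", "c"}, {"d"}]
pred = [{"a", "b"}, {"c", "d"}]
prec, rec, f1 = pairwise_prf(truth, pred)
```

Pairwise measures reward a disambiguator for placing mention pairs of the same author together, independently of how cluster identifiers are assigned.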
Stories can have tremendous power -- not only useful for entertainment, they can activate our interests and mobilize our actions.
The degree to which a story resonates with its audience may be in part reflected in the emotional journey it takes the audience upon.
In this paper, we use machine learning methods to construct emotional arcs in movies, calculate families of arcs, and demonstrate the ability for certain arcs to predict audience engagement.
The system is applied to Hollywood films and high quality shorts found on the web.
We begin by using deep convolutional neural networks for audio and visual sentiment analysis.
These models are trained on both new and existing large-scale datasets, after which they can be used to compute separate audio and visual emotional arcs.
We then crowdsource annotations for 30-second video clips extracted from highs and lows in the arcs in order to assess the micro-level precision of the system, with precision measured in terms of agreement in polarity between the system's predictions and annotators' ratings.
These annotations are also used to combine the audio and visual predictions.
Next, we look at macro-level characterizations of movies by investigating whether there exist `universal shapes' of emotional arcs.
In particular, we develop a clustering approach to discover distinct classes of emotional arcs.
Finally, we show on a sample corpus of short web videos that certain emotional arcs are statistically significant predictors of the number of comments a video receives.
These results suggest that the emotional arcs learned by our approach successfully represent macroscopic aspects of a video story that drive audience engagement.
Such machine understanding could be used to predict audience reactions to video stories, ultimately improving our ability as storytellers to communicate with each other.
We propose a new differentially-private decision forest algorithm that minimizes both the number of queries required, and the sensitivity of those queries.
To do so, we build an ensemble of random decision trees that avoids querying the private data except to find the majority class label in the leaf nodes.
Rather than using a count query to return the class counts like the current state-of-the-art, we use the Exponential Mechanism to only output the class label itself.
This drastically reduces the sensitivity of the query -- often by several orders of magnitude -- which in turn reduces the amount of noise that must be added to preserve privacy.
Our improved sensitivity is achieved by using "smooth sensitivity", which takes into account the specific data used in the query rather than assuming the worst-case scenario.
We also extend work done on the optimal depth of random decision trees to handle continuous features, not just discrete features.
This, along with several other improvements, allows us to create a differentially private decision forest with substantially higher predictive power than the current state-of-the-art.
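The core idea of outputting a leaf's class label via the Exponential Mechanism can be sketched as follows. This simplified version uses global sensitivity 1 (utility = class count, which one record changes by at most 1), rather than the smooth-sensitivity refinement described above.

```python
import math
import random

def exp_mech_label(counts, epsilon, rng=random.Random(0)):
    """Exponential mechanism over class labels in a leaf node.
    With utility = class count and sensitivity 1, each label is
    sampled with probability proportional to exp(epsilon * count / 2)."""
    labels = list(counts)
    m = max(counts.values())                 # subtract max for stability
    weights = [math.exp(epsilon * (counts[l] - m) / 2) for l in labels]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for label, w in zip(labels, weights):
        acc += w
        if r <= acc:
            return label
    return labels[-1]

counts = {"cat": 40, "dog": 10}
# With a large privacy budget, the majority label is returned almost surely.
picks = [exp_mech_label(counts, epsilon=50.0) for _ in range(100)]
```

Because only a single label is released per leaf (not the counts themselves), the noise needed to preserve privacy is far smaller than for a count query.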
Magnetic induction (MI) based communication and power transfer systems have gained increasing attention in recent years.
Typical applications for these systems lie in the area of wireless charging, near-field communication, and wireless sensor networks.
For an optimal system performance, the power efficiency needs to be maximized.
Typically, this optimization refers to the impedance matching and tracking of the split-frequencies.
However, the important role of the magnitude and phase of the input signal has been mostly overlooked.
Especially for the wireless power transfer systems with multiple transmitter coils, the optimization of the transmit signals can dramatically improve the power efficiency.
In this work, we propose an iterative algorithm for the optimization of the transmit signals for a transmitter with three orthogonal coils and multiple single coil receivers.
The proposed scheme significantly outperforms the traditional baseline algorithms in terms of power efficiency.
Network function virtualization (NFV) based service function chaining (SFC) allows the provisioning of various security and traffic engineering applications in a cloud network.
Inefficient deployment of network functions can lead to security violations and performance overhead.
In an OpenFlow enabled cloud, the key problem with current mechanisms is that several packet field match and flow rule action sets associated with the network functions are non-overlapping and can be parallelized for performance enhancement.
We introduce SFC-NFP, a scheme that applies Network Function Parallelism (NFP) to OpenFlow networks.
Our solution utilizes network function parallelism over the OpenFlow rules to improve SFC performance in the cloud network.
We have utilized the DPDK platform with an OpenFlow switch (OVS) for experimental analysis.
Our solution achieves a 1.40-1.90x reduction in latency for SFC in an OpenStack cloud network managed by the SDN framework.
We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs.
Our best end-to-end parser, which jointly performs supertagging, POS tagging, and parsing, outperforms the previously reported best results by more than 2.2 LAS and UAS points.
The graph-based parsing architecture allows for global inference and rich feature representations for TAG parsing, alleviating the fundamental trade-off between transition-based and graph-based parsing systems.
We also demonstrate that the proposed parser achieves state-of-the-art performance in the downstream tasks of Parsing Evaluation using Textual Entailments (PETE) and Unbounded Dependency Recovery.
This provides further support for the claim that TAG is a viable formalism for problems that require rich structural analysis of sentences.
In this paper we propose a class of propagation models for multiple competing products over a social network.
We consider two propagation mechanisms: social conversion and self conversion, corresponding, respectively, to endogenous and exogenous factors.
A novel concept, the product-conversion graph, is proposed to characterize the interplay among competing products.
According to the chronological order of social and self conversions, we develop two Markov-chain models and, based on the independence approximation, we approximate them with two respective difference equations systems.
Theoretical analysis on these two approximation models reveals the dependency of the systems' asymptotic behavior on the structures of both the product-conversion graph and the social network, as well as the initial condition.
In addition to the theoretical work, accuracy of the independence approximation and the asymptotic behavior of the Markov-chain model are investigated via numerical analysis, for the case where social conversion occurs before self conversion.
Finally, we propose a class of multi-player and multi-stage competitive propagation games and discuss the seeding-quality trade-off, as well as the allocation of seeding resources among the individuals.
We investigate the unique Nash equilibrium at each stage and analyze the system's behavior when every player is adopting the policy at the Nash equilibrium.
We present novel graph kernels for graphs with node and edge labels that have ordered neighborhoods, i.e., where the neighbors of each node follow an order.
Graphs with ordered neighborhoods are a natural data representation for evolving graphs where edges are created over time, which induces an order.
Combining convolutional subgraph kernels and string kernels, we design new scalable algorithms for generation of explicit graph feature maps using sketching techniques.
We obtain precise bounds for the approximation accuracy and computational complexity of the proposed approaches and demonstrate their applicability on real datasets.
In particular, our experiments demonstrate that neighborhood ordering results in more informative features.
For the special case of general graphs, i.e., graphs without ordered neighborhoods, the new graph kernels yield efficient and simple algorithms for the comparison of label distributions between graphs.
We investigate the secret key generation in the multiterminal source model, where the users discuss under limited rate.
For the minimally connected hypergraphical sources, we give an explicit formula of the maximum achievable secret key rate, called the secrecy capacity, under any given total discussion rate.
Besides, we also partially characterize the region of achievable secret key rate and discussion rate tuple.
When specialized to hypertree sources, our results give rise to a complete characterization of the region.
In order to obtain reliable accuracy estimates for automatic MOOC dropout predictors, it is important to train and test them in a manner consistent with how they will be used in practice.
Yet most prior research on MOOC dropout prediction has measured test accuracy on the same course used for training the classifier, which can lead to overly optimistic accuracy estimates.
In order to understand better how accuracy is affected by the training+testing regime, we compared the accuracy of a standard dropout prediction architecture (clickstream features + logistic regression) across 4 different training paradigms.
Results suggest that (1) training and testing on the same course ("post-hoc") can overestimate accuracy by several percentage points; (2) dropout classifiers trained on proxy labels based on students' persistence are surprisingly competitive with post-hoc training (87.33% versus 90.20% AUC averaged over 8 weeks of 40 HarvardX MOOCs); and (3) classifier performance does not vary significantly with the academic discipline.
Finally, we also research new dropout prediction architectures based on deep, fully-connected, feed-forward neural networks and find that (4) networks with as many as 5 hidden layers can statistically significantly increase test accuracy over that of logistic regression.
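The AUC metric reported above has a simple rank-based form: the probability that a randomly chosen dropout case is scored above a randomly chosen persisting case. A sketch on toy scores (not the paper's data):

```python
def auc(scores, labels):
    """AUC as the probability that a random positive (dropout) case is
    scored above a random negative (persisting) case; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy dropout scores: label 1 = dropped out, 0 = persisted.
value = auc([0.9, 0.8, 0.7, 0.85], [1, 1, 0, 0])
```

Because AUC depends only on the ranking of scores, it makes classifiers trained under different regimes (post-hoc vs. proxy labels) directly comparable.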
In this work we give a concise definition of information loss from a system-theoretic point of view.
Based on this definition, we analyze the information loss in static input-output systems subject to a continuous-valued input.
For a certain class of multiple-input, multiple-output systems the information loss is quantified.
An interpretation of this loss is accompanied by upper bounds which are simple to evaluate.
Finally, a class of systems is identified for which the information loss is necessarily infinite.
Quantizers and limiters are shown to belong to this class.
In this paper, we consider the secure transmission design for a multiple-input single-output Femtocell overlaid with a Macrocell in co-channel deployment.
The Femtocell base station sends confidential messages to information receiving Femtocell users (FUs) and energy signals to energy receiving (ER) FUs while limiting the interference to Macrocell users (MUs).
The ER FUs have the potential to wiretap the confidential messages.
By taking fairness into account, we propose a sum logarithmic secrecy rate maximization beamforming design problem under the interference constraints for MUs and energy harvesting (EH) constraints for ER FUs.
The formulated design problem is nontrivial to solve due to the nonconvexity which lies in the objective and the constraints.
To tackle the design problem, a semidefinite relaxation and successive convex approximation based algorithm is proposed.
Simulation results demonstrate the effectiveness of the proposed beamforming design.
This is a purely pedagogical paper with no new results.
The goal of the paper is to give a fairly self-contained introduction to Judea Pearl's do-calculus, including proofs of his 3 rules.
This paper exemplifies the implementation of an efficient Information Retrieval (IR) System to compute the similarity between a dataset and a query using Fuzzy Logic.
TREC dataset has been used for the same purpose.
The dataset is parsed to generate keywords index which is used for the similarity comparison with the user query.
Each query is assigned a score value based on its fuzzy similarity with the index keywords.
The relevant documents are retrieved based on the score value.
The performance and accuracy of the proposed fuzzy similarity model is compared with Cosine similarity model using Precision-Recall curves.
The results demonstrate the superiority of the fuzzy similarity based IR system.
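For reference, the cosine similarity baseline against which the fuzzy model is compared can be sketched over raw term-frequency vectors; the toy query and document below are illustrative, and the paper's TREC pipeline (index keywords, score thresholds) is not reproduced here.

```python
import math
from collections import Counter

def cosine(query_terms, doc_terms):
    """Cosine similarity between term-frequency vectors."""
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)
    nq = math.sqrt(sum(v * v for v in q.values()))
    nd = math.sqrt(sum(v * v for v in d.values()))
    return dot / (nq * nd) if nq and nd else 0.0

doc = "fuzzy logic information retrieval system".split()
score_rel = cosine("fuzzy retrieval".split(), doc)     # overlapping terms
score_irr = cosine("quantum chemistry".split(), doc)   # disjoint terms
```

A fuzzy similarity model replaces this exact term matching with graded membership scores, which is what the precision-recall comparison in the paper evaluates.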
Many techniques for synthesizing digital hardware from C-like languages have been proposed, but none have emerged as successful as Verilog or VHDL for register-transfer-level design.
This paper looks at two of the fundamental challenges: concurrency and timing control.
Traditionally, formal languages are defined as sets of words.
More recently, the alternative coalgebraic or coinductive representation as infinite tries, i.e., prefix trees branching over the alphabet, has been used to obtain compact and elegant proofs of classic results in language theory.
In this article, we study this representation in the Isabelle proof assistant.
We define regular operations on infinite tries and prove the axioms of Kleene algebra for those operations.
Thereby, we exercise corecursion and coinduction and confirm the coinductive view being profitable in formalizations, as it improves over the set-of-words view with respect to proof automation.
In this paper, we investigate a few memristor-based analog circuits, namely the phase shift oscillator, integrator, and differentiator, which have been explored extensively using traditional lumped components.
We use LTspice-IV platform for simulation of the above-said circuits.
The investigation resorts to the nonlinear dopant drift model of memristor and the window function portrayed in the literature for nonlinearity realization.
The results of our investigations show good agreement with the conventional lumped-component-based phase shift oscillator, integrator, and differentiator circuits.
These results showcase the potential of the memristor as a promising candidate for next-generation analog circuits.
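A minimal sketch of the nonlinear dopant-drift memristor model with a Joglekar-style window f(x) = 1 - (2x - 1)^(2p), integrated by forward Euler under a sinusoidal drive. The parameter values are illustrative assumptions, not those used in the LTspice simulations.

```python
import math

R_on, R_off = 100.0, 16e3      # limiting resistances (ohms), illustrative
D, mu = 10e-9, 1e-14           # device length (m), dopant mobility (m^2/(V*s))
p = 2                          # window exponent
k = mu * R_on / D ** 2

dt, steps = 1e-6, 20000
x = 0.1                        # normalized doped-region width, kept in [0, 1]
xs = []
for n in range(steps):
    v = math.sin(2 * math.pi * 50 * n * dt)          # 50 Hz sinusoidal drive
    R = R_on * x + R_off * (1 - x)                   # memristance M(x)
    i = v / R                                        # device current
    x += dt * k * i * (1 - (2 * x - 1) ** (2 * p))   # windowed dopant drift
    x = min(max(x, 0.0), 1.0)                        # clamp state to [0, 1]
    xs.append(x)
```

The window function suppresses drift near the boundaries x = 0 and x = 1, which is the nonlinearity that the circuit-level investigations rely on.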
Reading comprehension has been widely studied.
One of the most representative reading comprehension tasks is the Stanford Question Answering Dataset (SQuAD), on which machines already perform comparably to humans.
On the other hand, accessing large collections of multimedia or spoken content is much more difficult and time-consuming than plain text content for humans.
It is therefore highly attractive to develop machines that can automatically understand spoken content.
In this paper, we propose a new listening comprehension task - Spoken SQuAD.
On the new task, we found that speech recognition errors have catastrophic impact on machine comprehension, and several approaches are proposed to mitigate the impact.
One of the most important assets of any company is being able to easily access information on itself and on its business.
In this line, it has been observed that this important information is often stored in one of the millions of spreadsheets created every year, due to simplicity in using and manipulating such an artifact.
Unfortunately, in many cases it is quite difficult to retrieve the intended information from a spreadsheet: information is often stored in a huge unstructured matrix, with no care for readability or comprehensibility.
In an attempt to aid users in the task of extracting information from a spreadsheet, researchers have been working on models, languages, and tools for querying spreadsheets.
In this paper we present an empirical study evaluating such proposals assessing their usage to query spreadsheets.
We investigate the use of the Google Query Function, textual model-driven querying, and visual model-driven querying.
To compare these different querying approaches we present an empirical study whose results show that the end-users' productivity increases when using model-driven queries, specially using its visual representation.
Convolutional networks (ConvNets) have achieved great successes in various challenging vision tasks.
However, the performance of ConvNets would degrade when encountering the domain shift.
The domain adaptation is more significant while challenging in the field of biomedical image analysis, where cross-modality data have largely different distributions.
Given that annotating the medical data is especially expensive, the supervised transfer learning approaches are not quite optimal.
In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentations.
Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction.
Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features which are aligned with source domain feature space.
A domain critic module (DCM) is set up for discriminating the feature space of both domains.
We optimize the DAM and DCM via an adversarial loss without using any target domain label.
Our proposed method is validated by adapting a ConvNet trained with MRI images to unpaired CT data for cardiac structures segmentations, and achieved very promising results.
We describe CITlab's recognition system for the ANWRESH-2014 competition attached to the 14th International Conference on Frontiers in Handwriting Recognition (ICFHR 2014).
The task comprises word recognition from segmented historical documents.
The core components of our system are based on multi-dimensional recurrent neural networks (MDRNN) and connectionist temporal classification (CTC).
The software modules behind that as well as the basic utility technologies are essentially powered by PLANET's ARGUS framework for intelligent text recognition and image processing.
In remote sensing, each sensor can provide complementary or reinforcing information.
It is valuable to fuse outputs from multiple sensors to boost overall performance.
Previous supervised fusion methods often require accurate labels for each pixel in the training data.
However, in many remote sensing applications, pixel-level labels are difficult or infeasible to obtain.
In addition, outputs from multiple sensors may have different levels of resolution or modalities (such as rasterized hyperspectral imagery versus LiDAR 3D point clouds).
This paper presents a Multiple Instance Multi-Resolution Fusion (MIMRF) framework that can fuse multi-resolution and multi-modal sensor outputs while learning from ambiguously and imprecisely labeled training data.
Experiments were conducted on the MUUFL Gulfport hyperspectral and LiDAR data set and a remotely-sensed soybean and weed data set.
Results show improved, consistent performance on scene understanding and agricultural applications when compared to traditional fusion methods.
Rate-Splitting Multiple Access (RSMA) is a general and powerful multiple access framework for downlink multi-antenna systems, and contains Space-Division Multiple Access (SDMA) and Non-Orthogonal Multiple Access (NOMA) as special cases.
RSMA relies on linearly precoded rate-splitting with Successive Interference Cancellation (SIC) to decode part of the interference and treat the remaining part of the interference as noise.
Recently, RSMA has been shown to outperform both SDMA and NOMA rate-wise in a wide range of network loads (underloaded and overloaded regimes) and user deployments (with a diversity of channel directions, channel strengths and qualities of Channel State Information at the Transmitter).
Moreover, RSMA was shown to provide spectral efficiency and QoS enhancements over NOMA at a lower computational complexity for the transmit scheduler and the receivers.
In this paper, we build upon those results and investigate the energy efficiency of RSMA compared to SDMA and NOMA.
Considering a multiple-input single-output broadcast channel, we show that RSMA is more energy-efficient than SDMA and NOMA in a wide range of user deployments (with a diversity of channel directions and channel strengths).
We conclude that RSMA is more spectrally and energy-efficient than SDMA and NOMA.
We discuss Bayesian inference (BI) for the probabilistic identification of material parameters.
This contribution aims to shed light on the use of BI for the identification of elastoplastic material parameters.
For this purpose a single spring is considered, for which the stress-strain curves are artificially created.
Besides offering a didactic introduction to BI, this paper proposes an approach to incorporate statistical errors both in the measured stresses, and in the measured strains.
It is assumed that the uncertainty is only due to measurement errors and the material is homogeneous.
Furthermore, a number of possible misconceptions on BI are highlighted based on the purely elastic case.
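In the spirit of the didactic introduction above, a grid-evaluated Bayesian identification of a single elastic spring parameter can be sketched as follows; the true stiffness, noise level, and prior are made-up illustrative choices, and only stress noise (not strain noise) is modeled.

```python
import numpy as np

rng = np.random.default_rng(3)

# Infer spring stiffness k from artificial stress-strain data,
# sigma = k * eps + Gaussian measurement noise.
k_true, noise = 210.0, 5.0
eps = np.linspace(0.0, 1.0, 30)
sigma = k_true * eps + rng.normal(0, noise, eps.size)   # "measured" stresses

k_grid = np.linspace(100.0, 300.0, 2001)
log_prior = np.zeros_like(k_grid)                # flat prior over the grid

# Gaussian log-likelihood of all observations for each candidate k.
resid = sigma[None, :] - k_grid[:, None] * eps[None, :]
log_like = -0.5 * (resid ** 2).sum(1) / noise ** 2

log_post = log_prior + log_like
post = np.exp(log_post - log_post.max())         # stabilize before exponentiating
post /= post.sum()                               # normalize posterior on the grid

k_mean = (k_grid * post).sum()                   # posterior mean estimate
```

The paper's elastoplastic setting adds more parameters and incorporates errors in the measured strains as well, but the posterior construction follows the same pattern.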
In this work, a recently proposed Head-Related Transfer Function (HRTF)-based Robust Least-Squares Frequency-Invariant (RLSFI) beamformer design is analyzed with respect to its robustness against localization errors, which lead to a mismatch between the HRTFs corresponding to the actual target source position and the HRTFs which have been used for the beamformer design.
The impact of this mismatch on the performance of the HRTF-based RLSFI beamformer is evaluated, including a comparison to the free-field-based beamformer design, using signal-based measures and word error rates for an off-the-shelf speech recognizer.
Music history, referring to the records of users' listening or downloading history in online music services, is the primary source for music service providers to analyze users' preferences on music and thus to provide personalized recommendations to users.
In order to engage users into the service and to improve user experience, it would be beneficial to provide visual analyses of one user's music history as well as visualized recommendations to that user.
In this paper, we take a user-centric approach to the design of such visual analyses.
We start by investigating user needs on such visual analyses and recommendations, then propose several different visualization schemes, and perform a pilot study to collect user feedback on the designed schemes.
We further conduct user studies to verify the utility of the proposed schemes, and the results not only demonstrate the effectiveness of our proposed visualization, but also provide important insights to guide the visualization design in the future.
In this paper, we investigate a C-arm tomographic technique as a new three-dimensional (3D) kidney imaging method for nephrolithiasis and kidney stone detection over a view angle of less than 180°.
Our C-arm tomographic technique provides a series of two-dimensional (2D) images with a single scan over a 40° view angle.
Experimental studies were performed with a kidney phantom that was formed from a pig kidney with two embedded kidney stones.
Different reconstruction methods were developed for the C-arm tomographic technique to generate 3D kidney information, including point-by-point back projection (BP), filtered back projection (FBP), the simultaneous algebraic reconstruction technique (SART), and maximum likelihood expectation maximization (MLEM).
A computer simulation study was also performed with a simulated 3D spherical object to evaluate the reconstruction results.
Preliminary results demonstrated the capability of our C-arm tomographic technique to generate 3D kidney information for kidney stone detection with low radiation exposure.
The kidney stones are visible on reconstructed planes with identifiable shapes and sizes.
This paper considers a multi-source multi-relay network, in which relay nodes employ a coding scheme based on random linear network coding on source packets and generate coded packets.
If a destination node collects enough coded packets, it can recover the packets of all source nodes.
The links between source-to-relay nodes and relay-to-destination nodes are modeled as packet erasure channels.
Improved bounds on the probability of decoding failure are presented, which are markedly close to simulation results and notably better than previous bounds.
Examples demonstrate the tightness and usefulness of the new bounds over the old bounds.
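A Monte Carlo sketch (not the paper's analytical bounds) of the underlying decoding condition: with random linear network coding over GF(2), a destination recovers all K source packets iff the K×N binary coefficient matrix of the N collected packets has full rank K.

```python
import numpy as np

def gf2_rank(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    rows, cols = M.shape
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(rows):
            if r != rank and M[r, c]:
                M[r] ^= M[rank]   # eliminate column c in all other rows
        rank += 1
    return rank

def failure_prob(K, N, trials=2000, seed=1):
    """Empirical probability that K x N random binary matrix is rank-deficient."""
    rng = np.random.default_rng(seed)
    fails = sum(gf2_rank(rng.integers(0, 2, (K, N))) < K
                for _ in range(trials))
    return fails / trials

# Collecting a few extra coded packets drives failure probability down fast.
p8 = failure_prob(K=8, N=8)
p12 = failure_prob(K=8, N=12)
```

The rapid decay with N is what makes tight bounds on the failure probability practically useful for dimensioning the number of coded packets.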
A robust and informative local shape descriptor plays an important role in mesh registration.
In this regard, spectral descriptors that are based on the spectrum of the Laplace-Beltrami operator have gained a spotlight among the researchers for the last decade due to their desirable properties, such as isometry invariance.
Despite this, however, spectral descriptors often fail to give a correct similarity measure in non-isometric cases where the metric distortion between the models is large.
Hence, they are in general not suitable for registration problems, except in the special case where the models are near-isometric.
In this paper, we investigate a way to develop shape descriptors for non-isometric registration tasks by embedding the spectral shape descriptors into a different metric space where the Euclidean distance between the elements directly indicates the geometric dissimilarity.
We design and train a Siamese deep neural network to find such an embedding, where the embedded descriptors are promoted to rearrange based on the geometric similarity.
We find that our approach can significantly enhance the performance of conventional spectral descriptors on non-isometric registration tasks, and that it outperforms a recent state-of-the-art method reported in the literature.
Convolutional networks have marked their place over the last few years as the best performing model for various visual tasks.
They are, however, most suited for supervised learning from large amounts of labeled data.
Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques.
These attempts require different architectures and training methods.
In this work we present a novel approach for unsupervised training of Convolutional networks that is based on contrasting between spatial regions within images.
This criterion can be employed within conventional neural networks and trained using standard techniques such as SGD and back-propagation, thus complementing supervised methods.
Many challenges in natural language processing require generating text, including language translation, dialogue generation, and speech recognition.
For all of these problems, text generation becomes more difficult as the text becomes longer.
Current language models often struggle to keep track of coherence for long pieces of text.
Here, we attempt to have the model construct and use an outline of the text it generates to keep it focused.
We find that the usage of an outline improves perplexity.
We do not find that using the outline improves human evaluation over a simpler baseline, revealing a discrepancy in perplexity and human perception.
Similarly, hierarchical generation is not found to improve human evaluation scores.
Case-Based Reasoning (CBR) has been widely used to generate good software effort estimates.
The predictive performance of CBR is dataset-dependent and subject to an extremely large space of configuration possibilities.
Regardless of the type of adaptation technique, deciding on the optimal number of similar cases to be used before applying CBR is a key challenge.
In this paper we propose a new technique based on the bisecting k-medoids clustering algorithm to better understand the structure of a dataset and to discover the optimal cases for each individual project by excluding irrelevant cases.
Results obtained showed that understanding the data characteristics prior to the prediction stage can help in automatically finding the best number of cases for each test project.
Performance figures of the proposed estimation method are better than those of other regular K-based CBR methods.
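To make the "optimal number of similar cases" problem concrete, here is a minimal analogy-based effort estimation sketch (not the paper's bisecting k-medoids method; the project data is invented): effort is predicted as the mean of the k nearest historical projects, and k is chosen by leave-one-out error.

```python
import numpy as np

# Toy historical projects: two feature columns, observed effort.
X = np.array([[1.0, 3], [2, 5], [3, 4], [8, 9], [9, 8], [7, 7]])
y = np.array([10.0, 14, 12, 40, 44, 38])

def predict(X_hist, y_hist, query, k):
    """Mean effort of the k nearest cases (Euclidean distance)."""
    d = np.linalg.norm(X_hist - query, axis=1)
    return y_hist[np.argsort(d)[:k]].mean()

def loo_mae(k):
    """Leave-one-out mean absolute error for a given k."""
    errs = [abs(y[i] - predict(np.delete(X, i, 0), np.delete(y, i), X[i], k))
            for i in range(len(y))]
    return np.mean(errs)

best_k = min(range(1, 4), key=loo_mae)
```

Clustering-based case selection, as in the proposed method, goes further by discarding irrelevant cases per test project rather than fixing one global k.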
This paper presents an efficient automatic color image segmentation method based on seeded region growing and merging over square elemental regions.
Our segmentation method consists of three steps: generating seed regions, merging the regions, and applying a pixel-wise boundary determination algorithm to the resultant polygonal regions.
The major features of our method are as follows: the use of square elemental regions instead of pixels as the processing unit, a seed generation method based on enhanced gradient values, a seed region growing method exploiting local gradient values, a region merging method using a similarity measure including a homogeneity distance based on Tsallis entropy, and a termination condition of region merging using an estimated desired number of regions.
Using square regions as the processing unit substantially reduces the time complexity of the algorithm and makes the performance stable.
The experimental results show that our method exhibits stable performance for a variety of natural images, including heavily textured areas, and produces good segmentation results using the same parameter values.
The results of our method are fairly comparable to, and in some respects better than, those of existing algorithms.
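The homogeneity distance mentioned above builds on Tsallis entropy; a minimal sketch of that quantity on a normalized intensity histogram (the entropic index q is a free parameter; q → 1 recovers Shannon entropy):

```python
import numpy as np

def tsallis_entropy(p, q=2.0):
    """Tsallis entropy S_q = (1 - sum(p^q)) / (q - 1) of a distribution p."""
    p = p[p > 0]
    if q == 1.0:
        return -np.sum(p * np.log(p))      # Shannon limit
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

uniform = np.full(4, 0.25)                    # maximally mixed region
peaked = np.array([0.97, 0.01, 0.01, 0.01])   # homogeneous region
```

A homogeneous region yields a sharply peaked histogram and hence low entropy, which is why an entropy-based distance can serve as a merging criterion.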
Research into the stylistic properties of translations is an issue which has received some attention in computational stylistics.
Previous work by Rybicki (2006) on distinguishing character idiolects in the work of Polish author Henryk Sienkiewicz and two corresponding English translations, using Burrows' Delta method, concluded that idiolectal differences could be observed in the source texts and that this variation was preserved to a large degree in both translations.
This study also found that the two translations were highly distinguishable from one another.
Burrows (2002) examined English translations of Juvenal, also using the Delta method; the results of this work suggest that some translators are more adept at concealing their own style when translating the works of another author, whereas others tend to imprint their own style to a greater extent on the work they translate.
Our work examines the writing of a single author, Norwegian playwright Henrik Ibsen, and these writings translated into both German and English from Norwegian, in an attempt to investigate the preservation of characterization, defined here as the distinctiveness of textual contributions of characters.
Feature selection (FS) is a process which attempts to select more informative features.
In some cases, too many redundant or irrelevant features may overpower main features for classification.
Feature selection can remedy this problem and therefore improve the prediction accuracy and reduce the computational overhead of classification algorithms.
The main aim of feature selection is to determine a minimal feature subset from a problem domain while retaining a suitably high accuracy in representing the original features.
In this paper, Principal Component Analysis (PCA), Rough PCA, Unsupervised Quick Reduct (USQR) algorithm and Empirical Distribution Ranking (EDR) approaches are applied to discover discriminative features that will be the most adequate ones for classification.
Efficiency of the approaches is evaluated using standard classification metrics.
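Of the approaches compared above, plain PCA is the simplest to sketch (Rough PCA, USQR, and EDR are not shown here): project the data onto the top principal components that explain a target fraction of the variance, discarding redundant directions.

```python
import numpy as np

def pca_reduce(X, var_target=0.95):
    """Keep the fewest principal components explaining var_target variance."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(np.cumsum(explained), var_target) + 1)
    return Xc @ Vt[:k].T, k

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))                       # 2 informative dims
X = np.hstack([base, base @ rng.normal(size=(2, 3))])  # +3 redundant dims
Z, k = pca_reduce(X)
```

Because the three extra columns are exact linear combinations of the first two, PCA collapses the five-dimensional data back to at most two components, illustrating how redundant features can be removed before classification.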
Unsupervised learning for visual perception of 3D geometry is of great interest to autonomous systems.
Recent works on unsupervised learning have made considerable progress on geometry perception; however, they perform poorly on dynamic objects and scenarios with dark and noisy environments.
In contrast, supervised learning algorithms, which are robust, require large labeled geometric datasets.
This paper introduces SIGNet, a novel framework that provides robust geometry perception without requiring geometrically informative labels.
Specifically, SIGNet integrates semantic information to make unsupervised robust geometric predictions for dynamic objects in low lighting and noisy environments.
SIGNet is shown to improve upon the state of the art in unsupervised learning for geometry perception by 30% (in squared relative error for depth prediction).
In particular, SIGNet improves the dynamic object class performance by 39% in depth prediction and 29% in flow prediction.
We propose a novel activation function that implements piece-wise orthogonal non-linear mappings based on permutations.
It is straightforward to implement, very computationally efficient, and has low memory requirements.
We tested it on two toy problems for feedforward and recurrent networks; it shows performance similar to tanh and ReLU.
The OPLU activation function ensures norm preservation of the backpropagated gradients, and is therefore potentially well suited for training deep, very deep, and recurrent neural networks.
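A minimal sketch of the pairwise permutation non-linearity as commonly defined: each pair (x1, x2) maps to (max(x1, x2), min(x1, x2)). Since the output is a permutation of the input, both activations and backpropagated gradients keep their norm.

```python
import numpy as np

def oplu(x):
    """Pairwise max/min permutation activation (assumes even length)."""
    x = x.reshape(-1, 2)
    out = np.stack([x.max(axis=1), x.min(axis=1)], axis=1)
    return out.reshape(-1)

v = np.array([3.0, -1.0, 0.5, 2.0])
w = oplu(v)   # pairs (3,-1) and (0.5,2) -> (3,-1) and (2,0.5)
```

The Jacobian of this mapping is a permutation matrix almost everywhere, which is why gradient norms pass through unchanged.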
User Interfaces (UIs) intensively rely on event-driven programming: widgets send UI events, which capture users' interactions, to dedicated objects called controllers.
Controllers use several UI listeners that handle these events to produce UI commands.
First, we reveal the presence of design smells in the code that describes and controls UIs.
Second, we demonstrate that specific code analyses are necessary to analyze and refactor UI code, because of its coupling with the rest of the code.
We conducted an empirical study on four large Java Swing and SWT open-source software systems.
We study to what extent the number of UI commands that a UI listener can produce has an impact on the change- and fault-proneness of the UI listener code.
We develop a static code analysis for detecting UI commands in the code.
We identify a new type of design smell, called Blob Listener that characterizes UI listeners that can produce more than two UI commands.
We propose a systematic static code analysis procedure that searches for Blob Listeners that we implement in InspectorGuidget.
We conducted experiments on the four software systems for which we manually identified 53 instances of Blob Listener.
InspectorGuidget successfully detected 52 Blob Listeners out of 53.
The results exhibit a precision of 81.25% and a recall of 98.11%.
We then developed a semi-automatic, behavior-preserving refactoring process to remove Blob Listeners.
49.06% of the 53 Blob Listeners were automatically refactored.
Patches for JabRef and FreeCol have been accepted and merged.
Discussions with developers of the four software systems assess the relevance of the Blob Listener.
This work shows that UI code also suffers from design smells that have to be identified and characterized.
We argue that studies have to be conducted to find other UI design smells and tools that analyze UI code must be developed.
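The detection figures reported above can be cross-checked arithmetically. Note that the total number of detections (64) is not stated in the abstract; it is implied here by the precision figure and is an assumption of this sketch.

```python
# Recall: 52 of the 53 manually identified Blob Listeners were found.
true_positives = 52
actual_instances = 53
recall = true_positives / actual_instances   # ~0.9811, i.e. 98.11%

# Precision of 81.25% implies 64 total detections (52 / 0.8125 = 64);
# the 64 is inferred, not stated in the abstract.
detections = 64
precision = true_positives / detections      # 0.8125, i.e. 81.25%
```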
Entity-oriented search deals with a wide variety of information needs, from displaying direct answers to interacting with services.
In this work, we aim to understand what are prominent entity-oriented search intents and how they can be fulfilled.
We develop a scheme of entity intent categories, and use them to annotate a sample of queries.
Specifically, we annotate unique query refiners on the level of entity types.
We observe that, on average, over half of those refiners seek to interact with a service, while over a quarter of the refiners search for information that may be looked up in a knowledge base.
Segmentation is a fundamental task for extracting semantically meaningful regions from an image.
The goal of segmentation algorithms is to accurately assign object labels to each image location.
However, image-noise, shortcomings of algorithms, and image ambiguities cause uncertainty in label assignment.
Estimating the uncertainty in label assignment is important in multiple application domains, such as segmenting tumors from medical images for radiation treatment planning.
One way to estimate these uncertainties is through the computation of posteriors of Bayesian models, which is computationally prohibitive for many practical applications.
On the other hand, most computationally efficient methods fail to estimate label uncertainty.
We therefore propose in this paper the Active Mean Fields (AMF) approach, a technique based on Bayesian modeling that uses a mean-field approximation to efficiently compute a segmentation and its corresponding uncertainty.
Based on a variational formulation, the resulting convex model combines any label-likelihood measure with a prior on the length of the segmentation boundary.
A specific implementation of that model is the Chan-Vese segmentation model (CV), in which the binary segmentation task is defined by a Gaussian likelihood and a prior regularizing the length of the segmentation boundary.
Furthermore, the Euler-Lagrange equations derived from the AMF model are equivalent to those of the popular Rudin-Osher-Fatemi (ROF) model for image denoising.
Solutions to the AMF model can thus be implemented by directly utilizing highly-efficient ROF solvers on log-likelihood ratio fields.
We qualitatively assess the approach on synthetic data as well as on real natural and medical images.
For a quantitative evaluation, we apply our approach to the icgbench dataset.
Weighted automata are non-deterministic automata where the transitions are equipped with weights.
They can model quantitative aspects of systems like costs or energy consumption.
The value of a run can be computed, for example, as the maximum, limit average, or discounted sum of transition weights.
In multi-weighted automata, transitions carry several weights and can model, for example, the ratio between rewards and costs, or the efficiency of use of a primary resource under some upper bound constraint on a secondary resource.
Here, we introduce a general model for multi-weighted automata as well as a multi-weighted MSO logic.
In our main results, we show that this multi-weighted MSO logic and multi-weighted automata are expressively equivalent both for finite and infinite words.
The translation process is effective, leading to decidability results for our multi-weighted MSO logic.
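The three run valuation functions mentioned above can be sketched on a finite weight sequence (the limit average degenerates to a plain average on finite words):

```python
def value_max(weights):
    """Value of a run as the maximum transition weight."""
    return max(weights)

def value_limit_average(weights):
    """Limit average, here the plain average on a finite word."""
    return sum(weights) / len(weights)

def value_discounted(weights, lam=0.5):
    """Discounted sum: each step's weight scaled by lam^i."""
    return sum(w * lam**i for i, w in enumerate(weights))

run = [2, 0, 4, 2]   # transition weights along an example run
```

In the multi-weighted setting, each transition carries a tuple of weights and the valuation combines them, e.g. as a reward/cost ratio.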
Scene modeling is crucial for robots that need to perceive, reason about, and manipulate the objects in their environments.
In this paper, we adapt and extend Boltzmann Machines (BMs) for contextualized scene modeling.
Although there are many models on the subject, ours is the first to bring together objects, relations, and affordances in a highly capable generative model.
To this end, we introduce a hybrid version of BMs in which relations and affordances are incorporated through shared, tri-way connections.
Moreover, we contribute a dataset for relation estimation and modeling studies.
We evaluate our method in comparison with several baselines on object estimation, out-of-context object detection, relation estimation, and affordance estimation tasks.
Moreover, to illustrate the generative capability of the model, we show several example scenes that the model is able to generate.
Consider a transmission scheme with a single transmitter and multiple receivers over a faulty broadcast channel.
For each receiver, the transmitter has a unique infinite stream of packets, and its goal is to deliver them at the highest throughput possible.
While such multiple-unicast models are unsolved in general, several network-coding-based schemes have been suggested.
In such schemes, the transmitter can either send an uncoded packet, or a coded packet which is a function of a few packets.
The packets sent can be received by the designated receiver (with some probability) or heard and stored by other receivers.
Two functional modes are considered; the first presumes that the storage time is unlimited, while in the second it is limited by a given Time to Expire (TTE) parameter.
We model the transmission process as an infinite-horizon Markov Decision Process (MDP).
Since the large state space renders exact solutions computationally impractical, we introduce policy restricted and induced MDPs with significantly reduced state space, and prove that with proper reward function they have equal optimal value function (hence equal optimal throughput).
We then derive a reinforcement learning algorithm, which learns the optimal policy for the induced MDP.
This optimal strategy of the induced MDP, once applied to the policy restricted one, significantly improves over uncoded schemes.
Next, we enhance the algorithm by means of analysis of the structural properties of the resulting reward functional.
We demonstrate that our method scales well in the number of users, and automatically adapts to the packet loss rates, unknown in advance.
In addition, the performance is compared to the recent bound by Wang, which assumes much stronger coding (e.g., intra-session and buffering of coded packets), yet is shown to be comparable.
Given a web-scale graph that grows over time, how should its edges be stored and processed on multiple machines for rapid and accurate estimation of the count of triangles?
The count of triangles (i.e., cliques of size three) has proven useful in many applications, including anomaly detection, community detection, and link recommendation.
For triangle counting in large and dynamic graphs, recent work has focused largely on streaming algorithms and distributed algorithms.
To achieve the advantages of both approaches, we propose DiSLR, a distributed streaming algorithm that estimates the counts of global triangles and local triangles associated with each node.
Making one pass over the input stream, DiSLR carefully processes and stores the edges across multiple machines so that the redundant use of computational and storage resources is minimized.
Compared to its best competitors, DiSLR is (a) Accurate: giving up to 39X smaller estimation error, (b) Fast: up to 10.4X faster, scaling linearly with the number of edges in the input stream, and (c) Theoretically sound: yielding unbiased estimates with variances decreasing faster as the number of machines is scaled up.
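A single-machine sketch in the spirit of (but much simpler than) DiSLR's distributed scheme: keep a fixed-size edge sample over one pass of the stream, and scale each counted triangle by the probability that its two earlier edges are both in the sample.

```python
import random
from itertools import combinations

def estimate_triangles(stream, k=1000, seed=7):
    """One-pass reservoir-based global triangle count estimate."""
    random.seed(seed)
    sample, adj, est, t = set(), {}, 0.0, 0
    for u, v in stream:
        t += 1
        # Inverse probability that both earlier edges of a triangle
        # closed by (u, v) survive in a size-k uniform edge sample.
        p = min(1.0, (k / t) * ((k - 1) / (t - 1))) if t > 2 else 1.0
        for w in adj.get(u, set()) & adj.get(v, set()):
            est += 1.0 / p
        if len(sample) < k:
            keep = True
        elif random.random() < k / t:
            old = random.choice(tuple(sample))   # evict a random edge
            sample.discard(old)
            adj[old[0]].discard(old[1]); adj[old[1]].discard(old[0])
            keep = True
        else:
            keep = False
        if keep:
            sample.add((u, v))
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return est

# Sanity check: the complete graph on 12 nodes has C(12,3) = 220 triangles,
# and all 66 edges fit in the sample, so the estimate is exact.
edges = list(combinations(range(12), 2))
exact = estimate_triangles(edges, k=1000)
```

DiSLR additionally partitions edges across machines and tracks per-node local counts; this sketch only conveys the streaming-estimation idea.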
Convolutional Neural Networks (CNNs) can be shifted across 2D images or 3D videos to segment them.
They have a fixed input size and typically perceive only small local contexts of the pixels to be classified as foreground or background.
In contrast, Multi-Dimensional Recurrent NNs (MD-RNNs) can perceive the entire spatio-temporal context of each pixel in a few sweeps through all pixels, especially when the RNN is a Long Short-Term Memory (LSTM).
Despite these theoretical advantages, however, unlike CNNs, previous MD-LSTM variants were hard to parallelize on GPUs.
Here we re-arrange the traditional cuboid order of computations in MD-LSTM in pyramidal fashion.
The resulting PyraMiD-LSTM is easy to parallelize, especially for 3D data such as stacks of brain slice images.
PyraMiD-LSTM achieved the best known pixel-wise brain image segmentation results on MRBrainS13 (and competitive results on EM-ISBI12).
Optimizing floating-point arithmetic is vital because it is ubiquitous, costly, and used in compute-heavy workloads.
Implementing precise optimizations correctly, however, is difficult, since developers must account for all the esoteric properties of floating-point arithmetic to ensure that their transformations do not alter the output of a program.
Manual reasoning is error prone and stifles incorporation of new optimizations.
We present an approach to automate reasoning about floating-point optimizations using satisfiability modulo theories (SMT) solvers.
We implement the approach in LifeJacket, a system for automatically verifying precise floating-point optimizations for the LLVM assembly language.
We have used LifeJacket to verify 43 LLVM optimizations and to discover eight incorrect ones, including three previously unreported problems.
LifeJacket is an open source extension of the Alive system for optimization verification.
Formal modelling of Multi-Agent Systems (MAS) is a challenging task due to high complexity, interaction, parallelism and continuous change of roles and organisation between agents.
In this paper we record our research experience on formal modelling of MAS.
We review our research throughout the last decade, by describing the problems we have encountered and the decisions we have made towards resolving them and providing solutions.
Much of this work involved membrane computing and classes of P Systems, such as Tissue and Population P Systems, targeted to the modelling of MAS whose dynamic structure is a prominent characteristic.
More particularly, social insects (such as colonies of ants, bees, etc.), biology inspired swarms and systems with emergent behaviour are indicative examples for which we developed formal MAS models.
Here, we aim to review our work and disseminate our findings to fellow researchers who might face similar challenges and, furthermore, to discuss important issues for advancing research on the application of membrane computing in MAS modelling.
Long training times for high-accuracy deep neural networks (DNNs) impede research into new DNN architectures and slow the development of high-accuracy DNNs.
In this paper we present FireCaffe, which successfully scales deep neural network training across a cluster of GPUs.
We also present a number of best practices to aid in comparing advancements in methods for scaling and accelerating the training of deep neural networks.
The speed and scalability of distributed algorithms is almost always limited by the overhead of communicating between servers; DNN training is not an exception to this rule.
Therefore, the key consideration here is to reduce communication overhead wherever possible, while not degrading the accuracy of the DNN models that we train.
Our approach has three key pillars.
First, we select network hardware that achieves high bandwidth between GPU servers -- Infiniband or Cray interconnects are ideal for this.
Second, we consider a number of communication algorithms, and we find that reduction trees are more efficient and scalable than the traditional parameter server approach.
Third, we optionally increase the batch size to reduce the total quantity of communication during DNN training, and we identify hyperparameters that allow us to reproduce the small-batch accuracy while training with large batch sizes.
When training GoogLeNet and Network-in-Network on ImageNet, we achieve a 47x and 39x speedup, respectively, when training on a cluster of 128 GPUs.
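The back-of-the-envelope model behind the reduction-tree choice (a simplification, ignoring bandwidth and message-size effects): with n workers, a central parameter server serializes O(n) gradient transfers per step, while a binary reduction tree needs only O(log2 n) sequential levels, since transfers within a level proceed in parallel.

```python
import math

def param_server_steps(n):
    """Sequential transfers when all n workers funnel into one server."""
    return n

def reduction_tree_steps(n):
    """Sequential levels of a binary reduction tree over n workers."""
    return math.ceil(math.log2(n))

n = 128
ps, tree = param_server_steps(n), reduction_tree_steps(n)
```

At the paper's scale of 128 GPUs this is the difference between 128 serialized transfers and 7 tree levels per gradient aggregation.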
Future wireless standards such as 5G envision dense wireless networks with large number of simultaneously connected devices.
In this context, interference management becomes critical in achieving high spectral efficiency.
Orthogonal signaling, which limits the number of users utilizing the resource simultaneously, gives a sum-rate that remains constant with increasing number of users.
An alternative approach called interference alignment promises a throughput that scales linearly with the number of users.
However, this approach requires very high SNR or long time duration for sufficient channel variation, and therefore may not be feasible in real wireless systems.
We explore ways to manage interference in large networks with delay and power constraints.
Specifically, we devise an interference phase alignment strategy that combines precoding and scheduling without using power control to exploit the diversity inherent in a system with large number of users.
We show that this scheme achieves a sum-rate that scales almost logarithmically with the number of users.
We also show that no scheme using single symbol phase alignment, which is asymmetric complex signaling restricted to a single complex symbol, can achieve better than logarithmic scaling of the sum-rate.
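A toy illustration (not the paper's analysis) of why scheduling across many users can buy a logarithmically growing gain: the best of n i.i.d. Rayleigh-faded channel power gains (exponentially distributed) grows like ln(n), which is the multiuser diversity the strategy exploits.

```python
import numpy as np

rng = np.random.default_rng(3)

def avg_best_gain(n, trials=4000):
    """Monte Carlo estimate of E[max of n exponential(1) channel gains]."""
    return rng.exponential(size=(trials, n)).max(axis=1).mean()

# E[max] equals the harmonic number H_n ~ ln(n) + 0.577.
g10, g100 = avg_best_gain(10), avg_best_gain(100)
```

Going from 10 to 100 users raises the expected best gain by roughly ln(10), consistent with the logarithmic sum-rate scaling described above.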
The face morphing attack has proven to be a serious threat to existing face recognition systems.
Although a few face morphing detection methods have been put forward, the face morphing accomplice's facial restoration remains a challenging problem.
In this paper, a face-demorphing generative adversarial network (FD-GAN) is proposed to restore the accomplice's facial image.
It utilizes a symmetric dual network architecture and two levels of restoration losses to separate the identity feature of the morphing accomplice.
By exploiting the captured face image (containing the criminal's identity) from the face recognition system and the morphed image stored in the e-passport system (containing both criminal and accomplice's identities), the FD-GAN can effectively restore the accomplice's facial image.
Experimental results and analysis demonstrate the effectiveness of the proposed scheme.
It has great potential to be implemented for detecting the face morphing accomplice in a real identity verification scenario.
Many state-of-the-art algorithms for solving hard combinatorial problems in artificial intelligence (AI) include elements of stochasticity that lead to high variations in runtime, even for a fixed problem instance.
Knowledge about the resulting runtime distributions (RTDs) of algorithms on given problem instances can be exploited in various meta-algorithmic procedures, such as algorithm selection, portfolios, and randomized restarts.
Previous work has shown that machine learning can be used to individually predict mean, median and variance of RTDs.
To establish a new state-of-the-art in predicting RTDs, we demonstrate that the parameters of an RTD should be learned jointly and that neural networks can do this well by directly optimizing the likelihood of an RTD given runtime observations.
In an empirical study involving five algorithms for SAT solving and AI planning, we show that neural networks predict the true RTDs of unseen instances better than previous methods, and can even do so when only few runtime observations are available per training instance.
This paper presents the "Leipzig Corpus Miner", a technical infrastructure for supporting qualitative and quantitative content analysis.
The infrastructure aims at the integration of 'close reading' procedures on individual documents with procedures of 'distant reading', e.g. lexical characteristics of large document collections.
To this end, information retrieval systems, lexicometric statistics, and machine learning procedures are combined in a coherent framework that enables qualitative data analysts to make use of state-of-the-art Natural Language Processing techniques on very large document collections.
Applicability of the framework ranges from social sciences to media studies and market research.
As an example we introduce the usage of the framework in a political science study on post-democracy and neoliberalism.
With the rapid development of wireless mobile ad hoc network technology, there is vast scope for research.
Wireless communication for mobile networks has many application areas, such as routing services and security services.
A mobile ad hoc network (MANET) is a wireless communication network in which the mobile nodes organize themselves without any centralized administrator.
MANET routing protocols fall into several categories, such as reactive, proactive, and hybrid.
In this paper, the reactive MANET routing protocol DSR is simulated for traffic analysis with 50 mobile nodes and IP traffic flows.
Throughput is also analyzed for the DSR and ER-DSR protocols.
Finally, the memory utilized during simulation of DSR and ER-DSR is evaluated in order to compare the two.
The true distribution parameterizations of commonly used image datasets are inaccessible.
Rather than designing metrics for feature spaces with unknown characteristics, we propose to measure GAN performance by evaluating on explicitly parameterized, synthetic data distributions.
As a case study, we examine the performance of 16 GAN variants on six multivariate distributions of varying dimensionalities and training set sizes.
In this learning environment, we observe that: GANs exhibit similar performance trends across dimensionalities; learning depends on the underlying distribution and its complexity; the number of training samples can have a large impact on performance; evaluation and relative comparisons are metric-dependent; diverse sets of hyperparameters can produce a "best" result; and some GANs are more robust to hyperparameter changes than others.
These observations both corroborate findings of previous GAN evaluation studies and make novel contributions regarding the relationship between size, complexity, and GAN performance.
In this paper, we derive a simple method for separating topological noise from topological features using a novel measure for comparing persistence barcodes called persistent entropy.
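The measure itself is easy to sketch: given a barcode with bar lengths l_i and total length L = Σ l_i, the persistent entropy is E = −Σ (l_i/L) log(l_i/L). Many short noise bars raise the entropy relative to a barcode of a few long feature bars, which is what the separation criterion exploits.

```python
import numpy as np

def persistent_entropy(bars):
    """Persistent entropy of a barcode given as (birth, death) pairs."""
    lengths = np.array([d - b for b, d in bars], dtype=float)
    p = lengths / lengths.sum()
    return float(-np.sum(p * np.log(p)))

clean = [(0.0, 1.0), (0.0, 1.0)]                        # two equal features
noisy = [(0.0, 1.0), (0.0, 1.0)] + [(0.0, 0.05)] * 20   # plus short noise bars
```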
We present a novel finite element integration method for low order elements on GPUs.
We achieve more than 100 GFLOPS for element integration on first-order discretizations of both the Laplacian and elasticity operators.
We introduce a Noise-based prior Learning (NoL) approach for training neural networks that are intrinsically robust to adversarial attacks.
We find that implicit generative modeling of random noise with the same loss function used during posterior maximization improves a model's understanding of the data manifold, furthering adversarial robustness.
We evaluate our approach's efficacy and provide a simple visualization tool for understanding adversarial data, using Principal Component Analysis.
Our analysis reveals that adversarial robustness, in general, manifests in models with higher variance along the high-ranked principal components.
We show that models learnt with our approach perform remarkably well against a wide-range of attacks.
Furthermore, combining NoL with state-of-the-art adversarial training extends the robustness of a model, even beyond what it is adversarially trained for, in both white-box and black-box attack scenarios.
We release to the community six large-scale sense-annotated datasets in multiple languages to pave the way for supervised multilingual Word Sense Disambiguation.
Our datasets cover all the nouns in the English WordNet and their translations in other languages for a total of millions of sense-tagged sentences.
Experiments prove that these corpora can be effectively used as training sets for supervised WSD systems, surpassing the state of the art for low-resourced languages and providing competitive results for English, where manually annotated training sets are accessible.
The data is available at trainomatic.org.
The performance of value classes is highly dependent on how they are represented in the virtual machine.
Value class instances are immutable, have no identity, and can only refer to other value objects or primitive values. Since they should be very lightweight and fast, it is important to optimize them carefully.
In this paper we present a technique to detect and compress common patterns of value class usage to improve memory usage and performance.
The technique identifies patterns of frequent value object references and introduces abbreviated forms for them.
This makes it possible to store multiple inter-referenced value objects in an inlined memory representation, reducing the overhead stemming from metadata and object references.
Applied to a small prototype and an implementation of the Racket language, we found improvements in memory usage and execution time for several micro-benchmarks.
Since its conception, the smart app market has grown exponentially.
Success in the app market depends on many factors, among which app quality, including energy use, is a significant contributor.
Nevertheless, smartphones, as a subset of mobile computing devices, inherit the limited power resource constraint.
Therefore, there is a challenge of conserving this resource while increasing the quality of the target app.
This paper introduces Learning Automata (LA) as an online learning method to learn and predict the app usage routines of the users.
Such prediction can leverage the app cache functionality of the operating system and thus (i) decrease app launch time and (ii) preserve battery.
Our algorithm, an online learning approach, incrementally updates and improves its internal state.
In particular, it learns the transition probabilities between app launches.
Each app launch updates the transition probabilities related to that app, which in turn improves the prediction.
We use a real-world lifelogging dataset, and our experimental results show considerable improvement over two baseline methods currently used for smartphone app prediction.
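The transition-probability learning outlined above can be sketched in code. This is an illustrative sketch only: the class name, the `alpha` learning rate, and the linear reward-inaction update rule are our assumptions, not the paper's exact scheme.

```python
from collections import defaultdict

class AppTransitionModel:
    """Illustrative online learner for app-launch transition
    probabilities, updated with a linear reward-inaction rule."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        # probs[prev_app][next_app] -> estimated transition probability
        self.probs = defaultdict(dict)

    def update(self, prev_app, next_app):
        """Observe one launch transition and adjust probabilities."""
        row = self.probs[prev_app]
        row.setdefault(next_app, 0.0)
        for app in row:
            if app == next_app:
                # move probability mass toward the observed transition
                row[app] += self.alpha * (1.0 - row[app])
            else:
                row[app] *= 1.0 - self.alpha

    def predict(self, prev_app):
        """Return the most likely next app, or None if unseen."""
        row = self.probs.get(prev_app)
        return max(row, key=row.get) if row else None
```

Each observed launch strengthens one transition and decays the others, so the model tracks changing usage routines without storing the full launch history.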
This work provides an in-depth analysis of the relation between the different types of collaboration and research productivity, showing how both are influenced by some personal and organizational variables.
By applying different cross-lagged panel models, we are able to analyze the relationship among research productivity, collaboration and their determinants.
In particular, we show that only collaboration at intramural and domestic level has a positive effect on research productivity.
Conversely, all forms of collaboration are positively affected by research productivity.
The results can favor the reexamination of the theories related to these issues, and inform policies that would be more suited to their management.
Stack Overflow (SO) has been a great source of natural language questions and their code solutions (i.e., question-code pairs), which are critical for many tasks including code retrieval and annotation.
In most existing research, question-code pairs were collected heuristically and tend to have low quality.
In this paper, we investigate a new problem of systematically mining question-code pairs from Stack Overflow (in contrast to heuristically collecting them).
It is formulated as predicting whether or not a code snippet is a standalone solution to a question.
We propose a novel Bi-View Hierarchical Neural Network which can capture both the programming content and the textual context of a code snippet (i.e., two views) to make a prediction.
On two manually annotated datasets in Python and SQL domain, our framework substantially outperforms heuristic methods with at least 15% higher F1 and accuracy.
Furthermore, we present StaQC (Stack Overflow Question-Code pairs), the largest dataset to date of ~148K Python and ~120K SQL question-code pairs, automatically mined from SO using our framework.
Under various case studies, we demonstrate that StaQC can greatly help develop data-hungry models for associating natural language with programming language.
This paper studies the problem of reconstructing a two-dimensional scalar field using a swarm of networked robots with local communication capabilities.
We consider the communication network of the robots to form either a chain or a grid topology.
We formulate the reconstruction problem as an optimization problem that is constrained by first-order linear dynamics on a large, interconnected system.
To solve this problem, we employ an optimization-based scheme that uses a gradient-based method with an analytical computation of the gradient.
In addition, we derive bounds on the trace of observability Gramian of the system, which helps us to quantify and compare the estimation capability of chain and grid networks.
A comparison based on a performance measure related to the H2 norm of the system is also used to study robustness of the network topologies.
Our results are validated using both simulated scalar fields and actual ocean salinity data.
We present two Secure Two Party Computation (STPC) protocols for piecewise function approximation on private data.
The protocols rely on a piecewise approximation of the to-be-computed function easing the implementation in a STPC setting.
The first protocol relies entirely on Garbled Circuit (GC) theory, while the second one exploits a hybrid construction where GC and Homomorphic Encryption (HE) are used together.
In addition to piecewise constant and linear approximation, polynomial interpolation is also considered.
From a communication complexity perspective, the full-GC implementation is preferable when the input and output variables can be represented with a small number of bits, while the hybrid solution is preferable otherwise.
With regard to computational complexity, the full-GC solution is generally more convenient.
With existing face recognition methods based on deep neural networks, it is difficult to know clearly what kinds of features are used to discriminate the identities of face images.
To investigate the effective features for face recognition, we propose a novel face recognition method, called a pairwise relational network (PRN), that obtains local appearance patches around landmark points on the feature map, and captures the pairwise relation between a pair of local appearance patches.
The PRN is trained to capture unique and discriminative pairwise relations among different identities.
Because the existence and meaning of pairwise relations should be identity dependent, we add to the PRN a face identity state feature, which is obtained from a long short-term memory (LSTM) network fed with the sequential local appearance patches on the feature maps.
To further improve the accuracy of face recognition, we combine the global appearance representation with the pairwise relational feature.
Experimental results on the LFW show that the PRN using only pairwise relations achieved 99.65% accuracy and the PRN using both pairwise relations and face identity state feature achieved 99.76% accuracy.
On the YTF, both the PRN using only pairwise relations and the PRN using pairwise relations and the face identity state feature achieved the state-of-the-art (95.7% and 96.3%).
The PRN also achieved comparable results to the state-of-the-art for both face verification and face identification tasks on the IJB-A, and the state-of-the-art on the IJB-B.
In a secondary spectrum market, primaries set prices for their unused channels to the secondaries.
The payoff of a primary depends on the availability of unused channels of its competitors.
We consider a model where a primary can acquire its competitor's channel state information (C-CSI) at a cost.
We formulate a game between two primaries where each primary decides whether to acquire C-CSI or not and then selects its price based on that.
We first characterize the Nash Equilibrium (NE) of this game for a symmetric model where the C-CSI is perfect.
We show that the payoff of a primary is independent of the C-CSI acquisition cost.
We then generalize our analysis to allow for imperfect estimation and cases where the two primaries have different C-CSI costs or different channel availabilities.
Interestingly, our results show that the payoff of a primary increases when there is estimation error.
We also show that, surprisingly, the expected payoff of a primary may decrease as the C-CSI acquisition cost decreases when primaries have different availabilities.
Self-aligned double patterning (SADP) has become a promising technique to push pattern resolution limit to sub-22nm technology node.
Although SADP provides good overlay controllability, it encounters many challenges in physical design stages to obtain conflict-free layout decomposition.
In this paper, we study the impact on placement by different standard cell layout decomposition strategies.
We propose an SADP-friendly standard cell configuration that provides pre-coloring results for standard cells.
These configurations are brought into the placement stage to help ensure layout decomposability and save the extra effort for solving conflicts in later stages.
Augmenting deep neural networks with skip connections, as introduced in the so called ResNet architecture, surprised the community by enabling the training of networks of more than 1000 layers with significant performance gains.
It has been shown that identity skip connections eliminate singularities and improve the optimization landscape of the network.
This paper deciphers ResNet by analyzing the effect of skip connections in the backward path and sets forth new theoretical results on the advantages of identity skip connections in deep neural networks.
We prove that the skip connections in the residual blocks facilitate preserving the norm of the gradient and lead to well-behaved and stable back-propagation, which is a desirable feature from an optimization perspective.
We also show that, perhaps surprisingly, as more residual blocks are stacked, the network becomes more norm-preserving.
Traditionally, norm-preservation is enforced on the network only at the beginning of training, by using initialization techniques.
However, we show that identity skip connections retain norm-preservation during the training procedure.
Our theoretical arguments are supported by extensive empirical evidence.
Can we push for more norm-preservation?
We answer this question by proposing zero-phase whitening of the fully-connected layer and adding norm-preserving transition layers.
Our numerical investigations demonstrate that the learning dynamics and the performance of ResNets can be improved by making them even more norm-preserving through changing only a few blocks in very deep residual networks.
Our results and the introduced modification for ResNet, referred to as Procrustes ResNets, can be used as a guide for studying more complex architectures such as DenseNet, training deeper networks, and inspiring new architectures.
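The norm-preservation property of identity skip connections can be illustrated numerically. The following is our own toy check, not code from the paper: for a linear residual block y = x + Wx, the backward pass multiplies the gradient by (I + W)^T, whereas a plain block y = Wx multiplies it by W^T alone.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
# Small random residual branch, as might arise from careful initialization.
W = 0.1 * rng.standard_normal((d, d)) / np.sqrt(d)
g = rng.standard_normal(d)  # incoming gradient from the next layer

# Gradient norm ratio after passing backward through each block type.
residual_ratio = np.linalg.norm((np.eye(d) + W).T @ g) / np.linalg.norm(g)
plain_ratio = np.linalg.norm(W.T @ g) / np.linalg.norm(g)

print(residual_ratio)  # close to 1: the residual block is nearly norm-preserving
print(plain_ratio)     # well below 1: the gradient norm shrinks
```

The identity term dominates the Jacobian when the residual branch is small, which is the intuition behind the stable back-propagation claimed above.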
Recent excitement in the database community surrounding new applications (analytic, scientific, graph, geospatial, etc.) has led to an explosion in research on database storage systems.
New storage systems are vital to the database community, as they are at the heart of making database systems perform well in new application domains.
Unfortunately, each such system also represents a substantial engineering effort including a great deal of duplication of mechanisms for features such as transactions and caching.
In this paper, we make the case for RodentStore, an adaptive and declarative storage system providing a high-level interface for describing the physical representation of data.
Specifically, RodentStore uses a declarative storage algebra whereby administrators (or database design tools) specify how a logical schema should be grouped into collections of rows, columns, and/or arrays, and the order in which those groups should be laid out on disk.
We describe the key operators and types of our algebra, outline the general architecture of RodentStore, which interprets algebraic expressions to generate a physical representation of the data, and describe the interface between RodentStore and other parts of a database system, such as the query optimizer and executor.
We provide a case study of the potential use of RodentStore in representing dense geospatial data collected from a mobile sensor network, showing the ease with which different storage layouts can be expressed using some of our algebraic constructs and the potential performance gains that a RodentStore-built storage system can offer.
Using Atomic Force Microscopes (AFM) to manipulate nano-objects is a current challenge for surface scientists.
Basic haptic interfaces between the AFM and experimentalists have already been implemented.
The multi-sensory renderings (seeing, hearing and feeling), studied from a cognitive point of view, increase the efficiency of the current interfaces.
To allow the experimentalist to feel and touch the nano-world, we add mixed realities between an AFM and a force feedback device, thus enriching the direct connection with a modeling engine.
We present in this paper the first results from a real-time remote-control handling of an AFM by our Force Feedback Gestural Device through the example of the approach-retract curve.
In the modern era of radio frequency (RF) spectrum crunch, visible light communication (VLC) is a recent and promising alternative technology that operates at the visible light spectrum.
Thanks to its unlicensed and large bandwidth, VLC can deliver high throughput, better energy efficiency, and low cost data communications.
In this article, a hybrid RF/VLC architecture is considered that can simultaneously provide lighting and communication coverage across a room.
The considered architecture involves a novel multi-element hemispherical bulb design, which can transmit multiple data streams over light emitting diode (LED) modules.
Simulations considering various VLC transmitter configurations and topologies show that good link quality and high spatial reuse can be maintained in typical indoor communication scenarios.
The regression test selection problem--selecting a subset of a test-suite given a change--has been studied widely over the past two decades.
However, the problem has seen little attention when constrained to high-criticality developments, where a "safe" selection of tests needs to be chosen.
Further, no practical approaches have been presented for the programming language Ada.
In this paper, we introduce an approach to solving the selection problem given a combination of both static and dynamic data for a program and a change-set.
We present a change impact analysis for Ada that selects the safe set of tests that need to be re-executed to ensure no regressions.
We have implemented the approach in the commercial, unit-testing tool VectorCAST, and validated it on a number of open-source examples.
On an example of a fully-functioning Ada implementation of a DNS server (IRONSIDES), the experimental results show a 97% reduction in test-case execution.
Signal identification represents the task of a receiver to identify the signal type and its parameters, with applications to both military and commercial communications.
In this paper, we investigate the identification of spatial multiplexing (SM) and Alamouti (AL) space-time block code (STBC) with single carrier frequency division multiple access (SC-FDMA) signals, when the receiver is equipped with a single antenna.
We develop a discriminating feature based on a fourth-order statistic of the received signal, as well as a constant false alarm rate decision criterion which relies on the statistical properties of the feature estimate.
Furthermore, we present the theoretical performance analysis of the proposed identification algorithm.
The algorithm does not require channel or noise power estimation, modulation classification, and block synchronization.
Simulation results show the validity of the proposed algorithm, as well as a very good agreement with the theoretical analysis.
Samples from intimate (non-linear) mixtures are generally modeled as being drawn from a smooth manifold.
Scenarios where the data contains multiple intimate mixtures with some constituent materials in common can be thought of as manifolds which share a boundary.
Two important steps in the processing of such data are (i) to identify (cluster) the different mixture-manifolds present in the data and (ii) to eliminate the non-linearities present in the data by mapping each mixture-manifold into some low-dimensional Euclidean space (embedding).
Manifold clustering and embedding techniques appear to be an ideal tool for this task, but the present state-of-the-art algorithms perform poorly for hyperspectral data, particularly in the embedding task.
We propose a novel reconstruction-based algorithm for improved clustering and embedding of mixture-manifolds.
The algorithm attempts to reconstruct each target point as an affine combination of its nearest neighbors, with an additional rank penalty on the neighborhood to ensure that only neighbors on the same manifold as the target point are used in the reconstruction.
The reconstruction matrix generated by using this technique is block-diagonal and can be used for clustering (using spectral clustering) and embedding.
The improved performance of the algorithm vis-a-vis its competitors is exhibited on a variety of simulated and real mixture datasets.
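The affine-reconstruction step can be sketched as the classic LLE-style local weight computation. Note this is an illustrative sketch only: the rank penalty on the neighborhood described in the abstract is omitted, and the regularization constant `reg` is our assumption.

```python
import numpy as np

def affine_weights(x, neighbors, reg=1e-3):
    """Find weights w, summing to 1, that reconstruct the target point
    x as an affine combination of its neighbors (rows of `neighbors`),
    i.e. minimize ||x - w @ neighbors||^2 subject to sum(w) = 1."""
    Z = neighbors - x                    # shift neighbors to the target point
    G = Z @ Z.T + reg * np.eye(len(Z))   # regularized local Gram matrix
    w = np.linalg.solve(G, np.ones(len(Z)))
    return w / w.sum()                   # enforce the affine constraint
```

With the rank penalty added on top of this, the weights concentrate on neighbors from the same manifold, which is what makes the resulting reconstruction matrix approximately block-diagonal and hence usable for spectral clustering.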
Recent developments in speech synthesis have produced systems capable of producing intelligible speech, but now researchers strive to create models that more accurately mimic human voices.
One such development is the incorporation of multiple linguistic styles in various languages and accents.
HMM-based speech synthesis is of great interest to many researchers due to its ability to produce sophisticated features with a small footprint.
Despite such progress, its quality has not yet reached the level of the predominant unit-selection approaches that choose and concatenate recordings of real speech.
Recent efforts have been made in the direction of improving these systems.
In this paper we present the application of Long-Short Term Memory Deep Neural Networks as a Postfiltering step of HMM-based speech synthesis, in order to obtain closer spectral characteristics to those of natural speech.
The results show how HMM-voices could be improved using this approach.
New high-data-rate multimedia services and applications are evolving continuously and exponentially increasing the demand for wireless capacity of fifth-generation (5G) and beyond.
The existing radio frequency (RF) communication spectrum is insufficient to meet the demands of future high-data-rate 5G services.
Optical wireless communication (OWC), which uses an ultra-wide range of unregulated spectrum, has emerged as a promising solution to overcome the RF spectrum crisis.
It has attracted growing research interest worldwide in the last decade for indoor and outdoor applications.
OWC offloads huge data traffic applications from RF networks.
A 100 Gb/s data rate has already been demonstrated through OWC.
It offers services indoors as well as outdoors, and communication distances range from several nm to more than 10000 km.
This paper provides a technology overview and a review on optical wireless technologies, such as visible light communication, light fidelity, optical camera communication, free space optical communication, and light detection and ranging.
We survey the key technologies for understanding OWC and present state-of-the-art criteria in aspects, such as classification, spectrum use, architecture, and applications.
The key contribution of this paper is to clarify the differences among different promising optical wireless technologies and between these technologies and their corresponding similar existing RF technologies.
A group of transition probability functions forms a Shannon channel, whereas a group of truth functions forms a semantic channel.
By the third kind of Bayes' theorem, we can directly convert a Shannon channel into an optimized semantic channel.
When a sample is not big enough, we can use a truth function with parameters to produce the likelihood function, then train the truth function by the conditional sampling distribution.
The third kind of Bayes' theorem is proved.
A semantic information theory is briefly introduced.
The semantic information measure reflects Popper's hypothesis-testing thought.
The Semantic Information Method (SIM) adheres to maximum semantic information criterion which is compatible with maximum likelihood criterion and Regularized Least Squares criterion.
It supports Wittgenstein's view: the meaning of a word lies in its use.
Letting the two channels mutually match, we obtain the Channels' Matching (CM) algorithm for machine learning.
The CM algorithm is used to explain the evolution of the semantic meaning of natural language, such as "Old age".
The semantic channel for medical tests and the confirmation measures of test-positive and test-negative are discussed.
The applications of the CM algorithm to semi-supervised learning and unsupervised learning are briefly introduced.
As a predictive model, the semantic channel fits variable sources and hence can overcome the class-imbalance problem.
The SIM strictly distinguishes statistical probability and logical probability and uses both at the same time.
This method is compatible with the thoughts of Bayes, Fisher, Shannon, Zadeh, Tarski, Davidson, Wittgenstein, and Popper. It is a competitive alternative to Bayesian inference.
We study the related problems of subspace tracking in the presence of missing data (ST-miss) as well as robust subspace tracking with missing data (RST-miss).
Here "robust" refers to robustness to sparse outliers.
In recent work, we have studied the RST problem without missing data.
In this work, we show that simple modifications of our solution approach for RST also provably solve ST-miss and RST-miss under weaker and similar assumptions respectively.
To our knowledge, our result is the first complete guarantee for both ST-miss and RST-miss.
This means we are able to show that, under assumptions on only the algorithm inputs (input data and/or initialization), the output subspace estimates are close to the true data subspaces at all times.
Our guarantees hold under mild and easily interpretable assumptions and handle time-varying subspaces (unlike all previous work).
We also show that our algorithm and its extensions are fast and have competitive experimental performance when compared with existing methods.
The rapid advances in e-commerce and Web 2.0 technologies have greatly increased the impact of commercial advertisements on the general public.
As a key enabling technology, a multitude of recommender systems exists which analyzes user features and browsing patterns to recommend appealing advertisements to users.
In this work, we seek to study the attributes that characterize an effective advertisement and recommend a useful set of features to aid the design and production of commercial advertisements.
We analyze the temporal patterns from multimedia content of advertisement videos including auditory, visual and textual components, and study their individual roles and synergies in the success of an advertisement.
The objective of this work is then to measure the effectiveness of an advertisement, and to recommend a useful set of features to advertisement designers to make it more successful and approachable to users.
Our proposed framework employs the signal processing technique of cross modality feature learning where data streams from different components are employed to train separate neural network models and are then fused together to learn a shared representation.
Subsequently, a neural network model trained on this joint feature embedding representation is utilized as a classifier to predict advertisement effectiveness.
We validate our approach using subjective ratings from a dedicated user study, the sentiment strength of online viewer comments, and a viewer opinion metric of the ratio of the Likes and Views received by each advertisement from an online platform.
This is the preprint version of our paper at REHAB2015.
A balance measurement software package based on the Kinect2 sensor is evaluated by intuitive comparison to a gold-standard balance measurement platform.
The software analyzes the body data tracked by the Kinect2 sensor and obtains the user's center of mass (CoM) as well as its motion route on a plane.
The software is evaluated by several comparison tests, and the evaluation results preliminarily prove its reliability.
In software-defined networking (SDN), as data plane scale expands, scalability and reliability of the control plane have become major concerns.
To mitigate such concerns, two kinds of solutions have been proposed separately.
One is multi-controller architecture, i.e., a logically centralized control plane with physically distributed controllers.
The other is control devolution, i.e., delegating control of some flows back to switches.
Most existing solutions adopt either static switch-controller association or static devolution, which may not adapt well to traffic variation, leading to high communication costs between switches and controllers, and high computation costs for switches.
In this paper, we propose a novel scheme to jointly consider both solutions, i.e., we dynamically associate switches with controllers and dynamically devolve control of flows to switches.
Our scheme is an efficient online algorithm that does not need the statistics of traffic flows.
By adjusting a parameter, we can make a trade-off between costs and queue backlogs.
Theoretical analysis and extensive simulations show that our scheme yields much lower costs or latency compared to other schemes, as well as balanced loads among controllers.
A sufficient condition reported very recently for perfect recovery of a K-sparse vector via orthogonal matching pursuit (OMP) in K iterations is that the restricted isometry constant of the sensing matrix satisfies delta_{K+1} < 1/(sqrt(K+1)+1).
By exploiting an approximate orthogonality condition characterized via the achievable angles between two orthogonal sparse vectors upon compression, this paper shows that this upper bound can be further relaxed to delta_{K+1} < (sqrt(1+4K)-1)/(2K). This result thus narrows the gap between the so-far best known bound and the ultimate performance guarantee delta_{K+1} < 1/sqrt(K+1) that was conjectured by Dai and Milenkovic in 2009.
The proposed approximate orthogonality condition is also exploited to derive less restricted sufficient conditions for signal reconstruction in several compressive sensing problems, including signal recovery via OMP in a noisy environment, compressive domain interference cancellation, and support identification via the subspace pursuit algorithm.
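For reference, the baseline algorithm whose recovery guarantees are being sharpened here, orthogonal matching pursuit, can be sketched in a few lines. This is a textbook implementation, not code from the paper:

```python
import numpy as np

def omp(A, y, K):
    """Orthogonal matching pursuit: greedily pick the column of A most
    correlated with the current residual, then re-fit the coefficients
    by least squares on the selected support."""
    residual, support = y.copy(), []
    for _ in range(K):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s  # orthogonal to chosen columns
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x
```

The RIP-based conditions discussed above state when this greedy procedure picks a correct support index at every one of its K iterations.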
Multiple query criteria active learning (MQCAL) methods have a higher potential performance than conventional active learning methods in which only one criterion is deployed for sample selection.
A central issue related to MQCAL methods concerns the development of an integration criteria strategy (ICS) that makes full use of all criteria.
The conventional ICSs adopted in relevant research all facilitate the desired effects, but several limitations still must be addressed.
For instance, some of the strategies are not sufficiently scalable during the design process, and the number and type of criteria involved are dictated.
Thus, it is challenging for the user to integrate other criteria into the original process unless modifications are made to the algorithm.
Other strategies are too dependent on empirical parameters, which can only be acquired by experience or cross-validation and thus lack generality; additionally, these strategies are counter to the intention of active learning, as samples need to be labeled in the validation set before the active learning process can begin.
To address these limitations, we propose a novel MQCAL method for classification tasks that employs a third strategy via weighted rank aggregation.
The proposed method serves as a heuristic means to select high-value samples of high scalability and generality and is implemented through a three-step process: (1) the transformation of the sample selection to sample ranking and scoring, (2) the computation of the self-adaptive weights of each criterion, and (3) the weighted aggregation of each sample rank list.
Ultimately, the sample at the top of the aggregated ranking list is the most comprehensively valuable and must be labeled.
Several experiments, yielding 257 wins, 194 ties and 49 losses against other state-of-the-art MQCAL methods, are conducted to verify that the proposed method achieves superior results.
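Step (3) of the three-step process above can be sketched with a weighted Borda count. Note this is an illustrative sketch only: weighted Borda counting is our assumed aggregation rule, and the paper's self-adaptive criterion weights from step (2) are passed in here as plain numbers.

```python
import numpy as np

def weighted_borda(rank_lists, weights):
    """Aggregate several per-criterion rankings (each a list of sample
    indices, best first) into one ranking. A sample at position p in a
    list of length n earns a Borda score of (n - p), scaled by that
    criterion's weight; samples are re-ranked by total score."""
    n = len(rank_lists[0])
    scores = np.zeros(n)
    for ranking, w in zip(rank_lists, weights):
        for position, sample in enumerate(ranking):
            scores[sample] += w * (n - position)
    return list(np.argsort(-scores))  # aggregated ranking, best first
```

The sample at the head of the aggregated list is the one judged most valuable across all criteria, matching the selection rule described above.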
We consider the problem of cooperative output regulation for linear multi-agent systems.
A distributed dynamic output feedback design method is presented that solves the cooperative output regulation problem and also ensures that all agents track the desired reference signal without overshoot in their transient response.
Legged robots are becoming popular not only in research, but also in industry, where they can demonstrate their superiority over wheeled machines in a variety of applications.
Either when acting as mobile manipulators or just as all-terrain ground vehicles, these machines need to precisely track the desired base and end-effector trajectories, perform Simultaneous Localization and Mapping (SLAM), and move in challenging environments, all while keeping balance.
A crucial aspect for these tasks is that all onboard sensors must be properly calibrated and synchronized to provide consistent signals for all the software modules they feed.
In this paper, we focus on the problem of calibrating the relative pose between a set of cameras and the base link of a quadruped robot.
This pose is fundamental to successfully perform sensor fusion, state estimation, mapping, and any other task requiring visual feedback.
To solve this problem, we propose an approach based on factor graphs that jointly optimizes the mutual position of the cameras and the robot base using kinematics and fiducial markers.
We also quantitatively compare its performance with other state-of-the-art methods on the hydraulic quadruped robot HyQ.
The proposed approach is simple, modular, and independent from external devices other than the fiducial marker.
We propose a novel framework for the analysis of learning algorithms that allows us to say when such algorithms can and cannot generalize certain patterns from training data to test data.
In particular we focus on situations where the rule that must be learned concerns two components of a stimulus being identical.
We call such a basis for discrimination an identity-based rule.
Identity-based rules have proven to be difficult or impossible for certain types of learning algorithms to acquire from limited datasets.
This is in contrast to human behaviour on similar tasks.
Here we provide a framework for rigorously establishing which learning algorithms will fail at generalizing identity-based rules to novel stimuli.
We use this framework to show that such algorithms are unable to generalize identity-based rules to novel inputs unless trained on virtually all possible inputs.
We demonstrate these results computationally with a multilayer feedforward neural network.
The problem of communicating over the additive white Gaussian noise (AWGN) channel with lattice codes is addressed in this paper.
Theoretically, Voronoi constellations have proved to yield very powerful lattice codes when the fine/coding lattice is AWGN-good and the coarse/shaping lattice has an optimal shaping gain.
However, achieving Shannon capacity with these premises and practically implementable encoding algorithms is in general not an easy task.
In this work, a new way to encode and demap Construction-A Voronoi lattice codes is presented.
As a meaningful application of this scheme, the second part of the paper is focused on Leech constellations of low-density Construction-A (LDA) lattices: LDA Voronoi lattice codes are presented whose numerically measured waterfall region is situated at less than 0.8 dB from Shannon capacity.
These LDA lattice codes are based on dual-diagonal nonbinary low-density parity-check codes.
With this choice, encoding, iterative decoding, and demapping have all linear complexity in the blocklength.
The issue of representing attacks on attacks in argumentation is receiving increasing attention as a useful conceptual modelling tool in several contexts.
In this paper we present AFRA, a formalism encompassing unlimited recursive attacks within argumentation frameworks.
AFRA satisfies the basic requirements of definition simplicity and rigorous compatibility with Dung's theory of argumentation.
This paper provides a complete development of the AFRA formalism complemented by illustrative examples and a detailed comparison with other recursive attack formalizations.
Recently, many healthcare organizations have been adopting CRM as a strategy for managing interactions with their patients, which involves using technology to organize, automate, and coordinate business processes.
CRM combined with Web technology gives healthcare providers the ability to broaden their services beyond usual practices, and thus offers a suitable environment, using the latest technology, to achieve superb patient care.
This paper discusses and demonstrates how a new approach to CRM based on Web 2.0 will help healthcare providers improve their customer support, avoid conflict, and promote better patient health.
With this new approach, patients will benefit from customized personal service with full information access, enabling them to self-manage their own health.
It also helps healthcare providers retain the right customers.
A conceptual framework of the new approach will be discussed.
Even though there are sophisticated AI planning algorithms, many integrated, large-scale projects do not use planning.
One reason seems to be the missing support by engineering tools such as syntax highlighting and visualization.
We propose myPDDL - a modular toolbox for efficiently creating PDDL domains and problems.
To evaluate myPDDL, we compare it to existing knowledge engineering tools for PDDL and experimentally assess its usefulness for novice PDDL users.
Dialog act recognition is an important step for dialog systems since it reveals the intention behind the uttered words.
Most approaches to the task use word-level tokenization.
In contrast, this paper explores the use of character-level tokenization.
This is relevant since there is information at the sub-word level that is related to the function of the words and, thus, their intention.
We also explore the use of different context windows around each token, which are able to capture important elements, such as affixes.
Furthermore, we assess the importance of punctuation and capitalization.
We performed experiments on both the Switchboard Dialog Act Corpus and the DIHANA Corpus.
In both cases, the experiments not only show that character-level tokenization leads to better performance than the typical word-level approaches, but also that both approaches are able to capture complementary information.
Thus, the best results are achieved by combining tokenization at both levels.
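The context-window idea above can be sketched in a few lines. This is a minimal illustration of character-level tokenization with symmetric context windows; the window size and padding symbol are assumptions, not the papers' exact setup.

```python
# Character-level tokenization with symmetric context windows.
# Window size and padding symbol are illustrative assumptions.

def char_windows(utterance, window=2, pad="#"):
    """Return, for each character, that character plus `window` context
    characters on each side (padded at the utterance boundaries)."""
    padded = pad * window + utterance + pad * window
    return [padded[i:i + 2 * window + 1] for i in range(len(utterance))]

# Affix-like cues such as "ing#" become visible inside the windows.
print(char_windows("going", window=1))
```

Each window exposes sub-word elements such as affixes, which is precisely the information the character-level approach is meant to capture.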
Correct inference of the genetic regulations inside a cell is one of the greatest challenges of the post-genomic era for biologists and researchers.
Several intelligent techniques and models were already proposed to identify the regulatory relations among genes from the biological database like time series microarray data.
The Recurrent Neural Network (RNN) is one of the simplest and most popular approaches for modeling the dynamics as well as for inferring the correct dependencies among genes.
In this paper, Bat Algorithm (BA) is applied to optimize the model parameters of RNN model of Gene Regulatory Network (GRN).
Initially, the proposed method is tested on a small artificial network without any noise, and its efficiency is observed in terms of the number of iterations, the population size, and the BA optimization parameters.
The model is also validated in the presence of different levels of random noise for the small artificial network, which demonstrates its ability to infer the correct regulations under noise, as found in real-world datasets.
In the next phase of this research, BA based RNN is applied to real world benchmark time series microarray dataset of E. coli.
The results show that it can identify the maximum number of true positive regulations, although it also includes some false positive regulations.
Therefore, BA is very suitable for identifying biologically plausible GRNs with the help of the RNN model.
We propose a method based on minimum-variance polynomial approximation to extract system poles from a data set of samples of the impulse response of a linear system.
The method is capable of handling the problem under general conditions of sampling and noise characteristics.
The superiority of the proposed method is demonstrated by statistical comparison of its performance with the performances of two existing methods in the special case of uniform sampling.
This paper deals with the semantic interpretation of information resources (e.g., images, videos, 3D models).
We present a case study of an approach based on semantic and context dependent similarity applied to the industrial design.
Different application contexts are considered and modelled to browse a repository of 3D digital objects according to different perspectives.
The paper briefly summarises the basic concepts behind the semantic similarity approach and illustrates its application and results.
This paper presents a multi-platform, open-source application that aims to protect data stored and shared in existing cloud storage services.
The access to the cryptographic material used to protect data is implemented using the identification and authentication functionalities of national electronic identity (eID) tokens.
All peer-to-peer dialogs to exchange cryptographic material are implemented using the cloud storage facilities.
Furthermore, we have included a set of mechanisms to prevent files from being permanently lost or damaged due to concurrent modification, deletion and malicious tampering.
We have implemented a prototype in Java that is agnostic with respect to cloud storage providers; it only manages local folders, one of them being the local image of a cloud folder.
We have successfully tested our prototype in Windows, Mac OS X and Linux, with Dropbox, OneDrive, Google Drive and SugarSync.
We present a model for aggregation of product review snippets by joint aspect identification and sentiment analysis.
Our model simultaneously identifies an underlying set of ratable aspects presented in the reviews of a product (e.g., sushi and miso for a Japanese restaurant) and determines the corresponding sentiment of each aspect.
This approach directly enables discovery of highly-rated or inconsistent aspects of a product.
Our generative model admits an efficient variational mean-field inference algorithm.
It is also easily extensible, and we describe several modifications and their effects on model structure and inference.
We test our model on two tasks, joint aspect identification and sentiment analysis on a set of Yelp reviews and aspect identification alone on a set of medical summaries.
We evaluate the performance of the model on aspect identification, sentiment analysis, and per-word labeling accuracy.
We demonstrate that our model outperforms applicable baselines by a considerable margin, yielding up to 32% relative error reduction on aspect identification and up to 20% relative error reduction on sentiment analysis.
Ideas about how to increase the unconscious participation in interaction between 'a human' and 'a computer' are developed in this paper.
Evidence of impact of the unconscious functioning is presented.
The unconscious is characterised as being a responsive, contextual, and autonomous participant of human-computer interaction.
The unconscious participation occurs independently of one's cognitive and educational levels and, if ignored, leads to learning inefficiencies and compulsive behaviours, illustrations of which are provided.
Three practical approaches to the study of subjective user experience are outlined: (a) tracing the operant conditioning effects of software, (b) registering signs of brain activity whose psychological or information-processing meaning is well explored, and (c) exploring submodality interfaces.
Implications for improvement of current usability study methods, such as eye-tracking, are generally considered.
The conclusions consider the advantages and disadvantages of unconscious-embracing design, and warn of a loss of human evolutionary choices if unconscious participation is ignored, complicated, or blocked in interaction with computer interfaces and the built environment.
Recently, great success has been achieved in offline handwritten Chinese character recognition by using deep learning methods.
Chinese characters are mainly logographic and consist of basic radicals; however, previous research has mostly treated each Chinese character as a whole, without explicitly considering its internal two-dimensional structure and radicals.
In this study, we propose a novel radical analysis network with densely connected architecture (DenseRAN) to analyze Chinese character radicals and its two-dimensional structures simultaneously.
DenseRAN first encodes input image to high-level visual features by employing DenseNet as an encoder.
Then a decoder based on recurrent neural networks is employed, aiming at generating captions of Chinese characters by detecting radicals and two-dimensional structures through attention mechanism.
Treating a Chinese character as a composition of two-dimensional structures and radicals reduces the size of the vocabulary and enables DenseRAN to recognize unseen Chinese character classes, provided the corresponding radicals have been seen in the training set.
Evaluated on the ICDAR-2013 competition database, the proposed approach significantly outperforms the whole-character modeling approach, with a relative character error rate (CER) reduction of 18.54%.
Meanwhile, for the case of recognizing 3277 unseen Chinese characters in CASIA-HWDB1.2 database, DenseRAN can achieve a character accuracy of about 41% while the traditional whole-character method has no capability to handle them.
Mobile apps can access a wide variety of secure information, such as contacts and location.
However, current mobile platforms include only coarse access control mechanisms to protect such data.
In this paper, we introduce interaction-based declassification policies, in which the user's interactions with the app constrain the release of sensitive information.
Our policies are defined extensionally, so as to be independent of the app's implementation, based on sequences of security-relevant events that occur in app runs.
Policies use LTL formulae to precisely specify which secret inputs, read at which times, may be released.
We formalize a semantic security condition, interaction-based noninterference, to define our policies precisely.
Finally, we describe a prototype tool that uses symbolic execution to check interaction-based declassification policies for Android, and we show that it enforces policies correctly on a set of apps.
Counting the frequency of small subgraphs is a fundamental technique in network analysis across various domains, most notably in bioinformatics and social networks.
The special case of triangle counting has received much attention.
Getting results for 4-vertex patterns is highly challenging, and there are few practical results known that can scale to massive sizes.
Indeed, even a highly tuned enumeration code takes more than a day on a graph with millions of edges.
Most previous work that runs on truly massive graphs employs clusters and massive parallelization.
We provide a sampling algorithm that provably and accurately approximates the frequencies of all 4-vertex pattern subgraphs.
Our algorithm is based on a novel technique of 3-path sampling and a special pruning scheme to decrease the variance in estimates.
We provide theoretical proofs for the accuracy of our algorithm, and give formal bounds for the error and confidence of our estimates.
We perform a detailed empirical study and show that our algorithm provides estimates within 1% relative error for all subpatterns (over a large class of test graphs), while being orders of magnitude faster than enumeration and other sampling based algorithms.
Our algorithm takes less than a minute (on a single commodity machine) to process an Orkut social network with 300 million edges.
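The core sampling primitive can be illustrated with a rough sketch. This is a simplification of the paper's estimator, not the tuned algorithm: sample an edge with probability proportional to the number of 3-paths centered on it, then extend it on both sides.

```python
import random

# Simplified 3-path sampling (illustrative only, not the paper's
# variance-reduced estimator): pick a center edge (u, v) with weight
# (d_u - 1)(d_v - 1), then extend it with one neighbor on each side.

def sample_three_path(adj, edges, rng=random):
    weights = [(len(adj[u]) - 1) * (len(adj[v]) - 1) for u, v in edges]
    u, v = rng.choices(edges, weights=weights, k=1)[0]
    up = rng.choice([w for w in adj[u] if w != v])
    vp = rng.choice([w for w in adj[v] if w != u])
    if up == vp:           # the "path" folds back into a triangle
        return None
    return (up, u, v, vp)  # a 3-path: 4 distinct vertices, 3 edges

adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
edges = [(0, 1), (1, 2), (2, 3)]
print(sample_three_path(adj, edges))  # the only 3-path here: (0, 1, 2, 3)
```

Classifying the induced subgraph on the four sampled vertices, and rescaling by the total 3-path count, yields frequency estimates for the 4-vertex patterns.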
Fractional order derivatives and integrals (differintegrals) are viewed from a frequency-domain perspective using the formalism of Riesz, providing a computational tool as well as a way to interpret the operations in the frequency domain.
Differintegrals provide a logical extension of current techniques, generalizing the notion of integral and differential operators and acting as kind of frequency-domain filtering that has many of the advantages of a nonlocal linear operator.
Several important properties of differintegrals are presented, and sample applications are given to one- and two-dimensional signals.
Computer code to carry out the computations is made available on the author's website.
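The frequency-domain view admits a compact sketch for periodic signals: multiply the FFT coefficients by $(i\omega)^\alpha$. This is an illustrative fragment under that simplifying assumption; the Riesz formalism in the paper handles more general settings.

```python
import numpy as np

# Frequency-domain differintegral of a periodic signal: scale each
# Fourier coefficient by (i*k)**alpha. The DC term is dropped, so the
# sketch assumes zero-mean signals.

def differintegral(f, alpha):
    n = len(f)
    k = np.fft.fftfreq(n) * n          # integer frequencies ..., -1, 0, 1, ...
    mult = (1j * k) ** alpha
    mult[k == 0] = 0.0                 # drop the DC component
    return np.real(np.fft.ifft(np.fft.fft(f) * mult))

x = np.linspace(0, 2 * np.pi, 256, endpoint=False)
half = differintegral(np.sin(x), 0.5)   # half-derivative of sin
first = differintegral(half, 0.5)       # applying it twice...
print(np.allclose(first, np.cos(x)))    # ...recovers d/dx sin = cos
```

The semigroup property visible here (two half-derivatives compose into one full derivative) is one of the properties the frequency-domain interpretation makes transparent.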
Evolutionary clustering aims at capturing the temporal evolution of clusters.
This issue is particularly important in the context of social media data that are naturally temporally driven.
In this paper, we propose a new probabilistic model-based evolutionary clustering technique.
The Temporal Multinomial Mixture (TMM) is an extension of the classical mixture model that optimizes feature co-occurrences in a trade-off with temporal smoothness.
Our model is evaluated for two recent case studies on opinion aggregation over time.
We compare four different probabilistic clustering models and we show the superiority of our proposal in the task of instance-oriented clustering.
Creating user defined functions (UDFs) is a powerful method to improve the quality of computer applications, in particular spreadsheets.
However, the only direct way to use UDFs in spreadsheets is to switch from the functional and declarative style of spreadsheet formulas to the imperative VBA, which creates a high entry barrier even for proficient spreadsheet users.
It has been proposed to extend Excel by UDFs declared by a spreadsheet: user defined spreadsheet functions (UDSFs).
In this paper we present a method to create a limited form of UDSFs in Excel without any use of VBA.
Calls to those UDSFs utilize what-if data tables to execute the same part of a worksheet several times, thus turning it into a reusable function definition.
Standard machine translation systems process sentences in isolation and hence ignore extra-sentential information, even though extended context can both prevent mistakes in ambiguous cases and improve translation coherence.
We introduce a context-aware neural machine translation model designed in such a way that the flow of information from the extended context to the translation model can be controlled and analyzed.
We experiment with an English-Russian subtitles dataset, and observe that much of what is captured by our model deals with improving pronoun translation.
We measure correspondences between induced attention distributions and coreference relations and observe that the model implicitly captures anaphora.
This is consistent with gains for sentences where pronouns need to be gendered in translation.
Beside improvements in anaphoric cases, the model also improves in overall BLEU, both over its context-agnostic version (+0.7) and over simple concatenation of the context and source sentences (+0.6).
Chaotic compressive sensing is a nonlinear framework for compressive sensing.
Along the framework, this paper proposes a chaotic analog-to-information converter, chaotic modulation, to acquire and reconstruct band-limited sparse analog signals at sub-Nyquist rate.
In chaotic modulation, the sparse signal is randomized through state modulation of a continuous-time chaotic system, and one state output is sampled as compressive measurements.
The reconstruction is achieved through the estimation of the sparse coefficients using the principle of chaotic impulsive synchronization and Lp-norm regularized nonlinear least squares.
The concept of supreme local Lyapunov exponents (SLLE) is introduced to study reconstructability.
It is found that the sparse signals are reconstructable if the largest SLLE of the error dynamical system is negative.
As examples, the Lorenz system and Liu system excited by the sparse multi-tone signals are taken to illustrate the principle and the performance.
Binary hypothesis testing under the Neyman-Pearson formalism is a statistical inference framework for distinguishing data generated by two different source distributions.
Privacy restrictions may require the curator of the data or the data respondents themselves to share data with the test only after applying a randomizing privacy mechanism.
Using mutual information as the privacy metric and the relative entropy between the two distributions of the output (postrandomization) source classes as the utility metric (motivated by the Chernoff-Stein Lemma), this work focuses on finding an optimal mechanism that maximizes the chosen utility function while ensuring that the mutual information based leakage for both source distributions is bounded.
Focusing on the high-privacy regime, a Euclidean information-theoretic (E-IT) approximation to the tradeoff problem is presented.
It is shown that the solution to the E-IT approximation is independent of the alphabet size and clarifies that a mutual information based privacy metric preserves the privacy of the source symbols in inverse proportion to their likelihood.
In this paper, we present an approach that can handle Z-numbers in the context of Multi-Criteria Decision Making (MCDM) problems.
Z-numbers are composed of two parts, the first one is a restriction on the values that can be assumed, and the second part is the reliability of the information.
As human beings, we communicate with other people by means of natural language, using sentences like: the journey time from home to university takes about half an hour, very likely.
Firstly, Z-numbers are converted to fuzzy numbers using a standard procedure.
Next, the Z-TODIM and Z-TOPSIS are presented as a direct extension of the fuzzy TODIM and fuzzy TOPSIS, respectively.
The proposed methods are applied to two case studies and compared with the standard approach using crisp values.
Results obtained show the feasibility of the approach.
In addition, a graphical interface was built to handle both methods, Z-TODIM and Z-TOPSIS, allowing ease of use for users in other areas of knowledge.
We provide an efficient algorithm for determining how a road network has evolved over time, given two snapshot instances from different dates.
To allow for such determinations across different databases and even against hand drawn maps, we take a strictly topological approach in this paper, so that we compare road networks based strictly on graph-theoretic properties.
Given two road networks of the same region from two different dates, our approach allows one to match the road network portions that remain intact, and also to point out the added or removed portions.
We analyze our algorithm both theoretically, showing that it runs in polynomial time for non-degenerate road networks even though a related problem is NP-complete, and experimentally, using dated road networks from the TIGER/Line archive of the U.S. Census Bureau.
Self-powered, energy harvesting small cell base stations (SBS) are expected to be an integral part of next-generation wireless networks.
However, due to uncertainties in harvested energy, it is necessary to adopt energy-efficient power control schemes to reduce an SBS's energy consumption and thus ensure quality-of-service (QoS) for users.
Caching popular content at the SBS can also prove beneficial in this regard by reducing the usage of the capacity-limited SBS backhaul.
In this paper, an online energy efficient power control scheme is developed for an energy harvesting SBS equipped with a wireless backhaul and local storage.
In our model, energy arrivals are assumed to be Poisson distributed and the popularity distribution of requested content is modeled using Zipf's law.
The power control problem is formulated as a (discounted) infinite horizon dynamic programming problem and solved numerically using the value iteration algorithm.
Using simulations, we provide valuable insights on the impact of energy harvesting and caching on the energy and sum-throughput performance of the SBS as the network size is varied.
Our results also show that the size of cache and energy harvesting equipment at the SBS can be traded off, while still meeting the desired system performance.
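The dynamic programming formulation above can be sketched on a toy model. The battery-state MDP below, with a simple recharge probability and a logarithmic throughput-like reward, is an illustrative assumption rather than the paper's exact formulation.

```python
import numpy as np

# Toy value iteration for a discounted infinite-horizon power control
# problem: states are battery levels, actions spend stored energy, and
# the battery recharges stochastically. Model parameters are made up.

def value_iteration(B=10, p_arrival=0.4, gamma=0.9, tol=1e-6):
    V = np.zeros(B + 1)
    while True:
        V_new = np.empty_like(V)
        for b in range(B + 1):
            best = -np.inf
            for a in range(b + 1):            # spend a units of energy
                reward = np.log1p(a)          # throughput-like utility
                nb = b - a
                # one energy unit arrives with probability p_arrival
                ev = p_arrival * V[min(nb + 1, B)] + (1 - p_arrival) * V[nb]
                best = max(best, reward + gamma * ev)
            V_new[b] = best
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

V = value_iteration()
print(V[0] <= V[-1])  # a fuller battery is never worse
```

Value iteration converges geometrically at rate gamma, which is why the numerical solution in the paper's setting is tractable for moderate state spaces.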
Querying graph structured data is a fundamental operation that enables important applications including knowledge graph search, social network analysis, and cyber-network security.
However, the growing size of real-world data graphs poses severe challenges for graph databases to meet the response-time requirements of the applications.
Planning the computational steps of query processing - Query Planning - is central to address these challenges.
In this paper, we study the problem of learning to speed up query planning in graph databases, with the goal of improving the computational efficiency of query processing via training queries.
We present a Learning to Plan (L2P) framework that is applicable to a large class of query reasoners that follow the Threshold Algorithm (TA) approach.
First, we define a generic search space over candidate query plans, and identify target search trajectories (query plans) corresponding to the training queries by performing an expensive search.
Subsequently, we learn greedy search control knowledge to imitate the search behavior of the target query plans.
We provide a concrete instantiation of our L2P framework for STAR, a state-of-the-art graph query reasoner.
Our experiments on benchmark knowledge graphs including DBpedia, YAGO, and Freebase show that using the query plans generated by the learned search control knowledge, we can significantly improve the speed of STAR with negligible loss in accuracy.
One of the key requirements for fifth-generation (5G) cellular networks is their ability to handle densely connected devices with different quality of service (QoS) requirements.
In this article, we present multi-service oriented multiple access (MOMA), an integrated access scheme for massive connections with diverse QoS profiles and/or traffic patterns originating from both handheld devices and machine-to-machine (M2M) transmissions.
MOMA is based on a) establishing separate classes of users based on relevant criteria that go beyond the simple handheld/M2M split, b) class-dependent hierarchical spreading of the data signal, and c) a mix of multiuser and single-user detection schemes at the receiver.
Practical implementations of the MOMA principle are provided for base stations (BSs) that are equipped with a large number of antenna elements.
Finally, it is shown that such a massive multiple-input multiple-output (MIMO) scenario enables all the benefits of MOMA to be achieved even with a simple receiver structure that concentrates the receiver complexity where it is effectively needed.
Two key factors dominate the development of effective production grade machine learning models.
First, it requires a local software implementation and iteration process.
Second, it requires distributed infrastructure to efficiently conduct training and hyperparameter optimization.
While modern machine learning frameworks are very effective at the former, practitioners are often left building ad hoc frameworks for the latter.
We present SigOpt Orchestrate, a library for such simultaneous training in a cloud environment.
We describe the motivating factors and resulting design of this library, feedback from initial testing, and future goals.
Multipath forwarding consists of using multiple paths simultaneously to transport data over the network.
While most such techniques require endpoint modifications, we investigate how multipath forwarding can be done inside the network, transparently to endpoint hosts.
With such a network-centric approach, packet reordering becomes a critical issue as it may cause critical performance degradation.
We present a Software Defined Network architecture which automatically sets up multipath forwarding, including solutions for reordering and performance improvement, both at the sending side through multipath scheduling algorithms, and the receiver side, by resequencing out-of-order packets in a dedicated in-network buffer.
We implemented a prototype with commonly available technology and evaluated it in both emulated and real networks.
Our results show consistent throughput improvements, thanks to the use of aggregated path capacity.
We give comparisons to Multipath TCP, where we show our approach can achieve a similar performance while offering the advantage of endpoint transparency.
This paper presents a new approach for detecting outliers by introducing the notion of object's proximity.
The main idea is that a normal point has characteristics similar to those of several of its neighbors.
Thus, a point is not an outlier if it has a high degree of proximity to a sufficiently large set of neighbors.
The performance of this approach is illustrated on real datasets.
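The proximity criterion can be sketched directly. The neighborhood size, radius, and count threshold below are illustrative assumptions, not the paper's parameters.

```python
import math

# Proximity-based outlier test: a point is normal when enough of its
# k nearest neighbors lie within a radius. Thresholds are made up.

def is_outlier(p, data, k=3, radius=1.5, min_close=2):
    dists = sorted(math.dist(p, q) for q in data if q != p)
    close = sum(1 for d in dists[:k] if d <= radius)
    return close < min_close

cluster = [(0, 0), (0.5, 0.2), (0.3, 0.7), (0.8, 0.4)]
print(is_outlier((10, 10), cluster))    # far point: flagged as outlier
print(is_outlier((0.4, 0.3), cluster))  # dense region: not an outlier
```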
In this paper, we introduce a new channel model we term the q-ary multi-bit channel (QMBC).
This channel models a memory device, where q-ary symbols (q=2^s) are stored in the form of current/voltage levels.
The symbols are read in a measurement process, which provides a symbol bit in each measurement step, starting from the most significant bit.
An error event occurs when not all the symbol bits are known.
To deal with such error events, we use GF(q) low-density parity-check (LDPC) codes and analyze their decoding performance.
We start with iterative-decoding threshold analysis, and derive optimal edge-label distributions for maximizing the decoding threshold.
We later move to finite-length iterative-decoding analysis and propose an edge-labeling algorithm for improved decoding performance.
We then provide finite-length maximum-likelihood decoding analysis for both the standard non-binary random ensemble and LDPC ensembles.
Finally, we demonstrate by simulations that the proposed edge-labeling algorithm improves finite-length decoding performance by orders of magnitude.
In this paper, we model the document revision detection problem as a minimum cost branching problem that relies on computing document distances.
Furthermore, we propose two new document distance measures, word vector-based Dynamic Time Warping (wDTW) and word vector-based Tree Edit Distance (wTED).
Our revision detection system is designed for a large scale corpus and implemented in Apache Spark.
We demonstrate that our system can more precisely detect revisions than state-of-the-art methods by utilizing the Wikipedia revision dumps https://snap.stanford.edu/data/wiki-meta.html and simulated data sets.
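The wDTW measure can be sketched as standard dynamic time warping over sequences of word embeddings. The tiny 2-d "embeddings" below are stand-ins for real word vectors, and Euclidean distance is one plausible per-pair cost.

```python
import numpy as np

# Word-vector-based DTW: warp one sequence of embeddings onto another,
# accumulating Euclidean distances between aligned vectors.

def wdtw(A, B):
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(A[i - 1] - B[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

doc_a = np.array([[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])
print(wdtw(doc_a, doc_a))            # identical documents: 0.0
print(wdtw(doc_a, doc_a[::-1]) > 0)  # reordered document: positive distance
```

Such pairwise distances feed the minimum cost branching step, which then recovers the revision structure over the corpus.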
This report describes my research activities at the Hasso Plattner Institute and summarizes my Ph.D. plan and several novel, end-to-end trainable approaches for analyzing medical images using deep learning algorithms.
In this report, as an example, we explore different novel methods based on deep learning for brain abnormality detection, recognition, and segmentation.
This report was prepared for the doctoral consortium at the AIME-2017 conference.
In Wireless Sensor Networks, the sensor nodes are battery powered small devices designed for long battery life.
These devices are also limited in terms of processing capability and memory.
In order to provide high confidentiality to these resource-constrained network nodes, a suitable security algorithm needs to be deployed that can establish a balance between security level and processing overhead.
The objective of this research work is to perform a security analysis and performance evaluation of the recently proposed Secure Force (SF) algorithm.
This paper shows the comparison of Secure Force 64, 128, and 192 bit architecture on the basis of avalanche effect (key sensitivity), entropy change analysis, image histogram, and computational time.
Moreover, based on the evaluation results, the paper also suggests the possible solutions for the weaknesses of the SF algorithm.
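The avalanche-effect measurement used in such evaluations is easy to sketch: flip a single input bit and count how many output bits change. Since the Secure Force cipher is not a standard library primitive, SHA-256 stands in below purely for illustration.

```python
import hashlib

# Avalanche effect: fraction of output bits that flip when one input
# bit is flipped. SHA-256 is a stand-in for the cipher under test.

def avalanche(msg: bytes, bit: int) -> float:
    flipped = bytearray(msg)
    flipped[bit // 8] ^= 1 << (bit % 8)
    h1 = hashlib.sha256(msg).digest()
    h2 = hashlib.sha256(bytes(flipped)).digest()
    diff = sum(bin(a ^ b).count("1") for a, b in zip(h1, h2))
    return diff / (8 * len(h1))     # fraction of output bits flipped

print(avalanche(b"plaintext block!", 0))  # ideally close to 0.5
```

A value near 0.5 indicates strong key/plaintext sensitivity, which is the criterion the SF comparison uses.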
This work proposes a unified heuristic algorithm for a large class of earliness-tardiness (E-T) scheduling problems.
We consider single/parallel machine E-T problems that may or may not consider some additional features such as idle time, setup times and release dates.
In addition, we also consider those problems whose objective is to minimize either the total (average) weighted completion time or the total (average) weighted flow time, which arise as particular cases when the due dates of all jobs are either set to zero or to their associated release dates, respectively.
The developed local search based metaheuristic framework is quite simple, but at the same time relies on sophisticated procedures for efficiently performing local search according to the characteristics of the problem.
We present efficient move evaluation approaches for some parallel machine problems that generalize the existing ones for single machine problems.
The algorithm was tested in hundreds of instances of several E-T problems and particular cases.
The results obtained show that our unified heuristic is capable of producing high quality solutions when compared to the best ones available in the literature that were obtained by specific methods.
Moreover, we provide an extensive annotated bibliography on the problems related to those considered in this work, where we not only indicate the approach(es) used in each publication, but we also point out the characteristics of the problem(s) considered.
Beyond that, we classify the existing methods in different categories so as to have a better idea of the popularity of each type of solution procedure.
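The E-T objective being minimized can be stated in a few lines. The sketch below assumes a single machine, a fixed sequence, and no inserted idle time; the job data are made up.

```python
# Weighted earliness-tardiness cost of a single-machine sequence,
# processed back to back (no idle time). Job data are illustrative.

def et_cost(jobs):
    """jobs: list of (processing_time, due_date, w_early, w_tardy)."""
    t, total = 0, 0
    for p, d, we, wt in jobs:
        t += p                      # completion time of this job
        total += we * max(d - t, 0) + wt * max(t - d, 0)
    return total

jobs = [(3, 4, 1, 2), (2, 5, 1, 2), (4, 8, 1, 2)]
print(et_cost(jobs))  # → 3
```

Setting all due dates to zero reduces this objective to total weighted completion time, which is how those particular cases arise in the framework.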
Estimating the 6-DoF pose of a camera from a single image relative to a pre-computed 3D point-set is an important task for many computer vision applications.
Perspective-n-Point (PnP) solvers are routinely used for camera pose estimation, provided that a good quality set of 2D-3D feature correspondences are known beforehand.
However, finding optimal correspondences between 2D key-points and a 3D point-set is non-trivial, especially when only geometric (position) information is known.
Existing approaches to the simultaneous pose and correspondence problem use local optimisation, and are therefore unlikely to find the optimal solution without a good pose initialisation, or introduce restrictive assumptions.
Since a large proportion of outliers are common for this problem, we instead propose a globally-optimal inlier set cardinality maximisation approach which jointly estimates optimal camera pose and optimal correspondences.
Our approach employs branch-and-bound to search the 6D space of camera poses, guaranteeing global optimality without requiring a pose prior.
The geometry of SE(3) is used to find novel upper and lower bounds for the number of inliers and local optimisation is integrated to accelerate convergence.
The evaluation empirically supports the optimality proof and shows that the method performs much more robustly than existing approaches, including on a large-scale outdoor data-set.
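The branch-and-bound principle behind the approach can be illustrated on a 1D analogue of inlier set cardinality maximisation (the paper searches SE(3) with geometric bounds; this toy version, maximising inliers over a translation t with simple interval bounds, is only a sketch):

```python
def max_inliers_bnb(x, y, eps, lo, hi, tol=1e-6):
    # Find t in [lo, hi] maximizing |{i : |x_i + t - y_i| <= eps}| by
    # branch-and-bound over the interval of candidate translations.
    c = [yi - xi for xi, yi in zip(x, y)]  # per-point optimal translations

    def count(t):            # lower bound: inliers at a concrete t
        return sum(abs(t - ci) <= eps for ci in c)

    def upper(a, b):         # upper bound: points feasible somewhere in [a, b]
        return sum(a - eps <= ci <= b + eps for ci in c)

    best_t, best = lo, count(lo)
    queue = [(lo, hi)]
    while queue:
        a, b = queue.pop()
        if upper(a, b) <= best:
            continue         # prune: this branch cannot beat the incumbent
        m = 0.5 * (a + b)
        lb = count(m)
        if lb > best:
            best, best_t = lb, m
        if b - a > tol:
            queue += [(a, m), (m, b)]
    return best_t, best
```

Because the upper bound is valid over the whole interval, the returned count is globally optimal, mirroring (in one dimension) the guarantee the paper obtains over 6D camera poses.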
We propose a novel approach to multi-fingered grasp planning leveraging learned deep neural network models.
We train a convolutional neural network to predict grasp success as a function of both visual information of an object and grasp configuration.
We can then formulate grasp planning as inferring the grasp configuration which maximizes the probability of grasp success.
We efficiently perform this inference using a gradient-ascent optimization inside the neural network using the backpropagation algorithm.
Our work is the first to directly plan high quality multifingered grasps in configuration space using a deep neural network without the need of an external planner.
We validate our inference method performing both multifinger and two-finger grasps on real robots.
Our experimental results show that our planning method outperforms existing planning methods for neural networks; while offering several other benefits including being data-efficient in learning and fast enough to be deployed in real robotic applications.
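Gradient-ascent inference over the grasp configuration can be sketched with a stand-in model: below, a hypothetical bilinear-logistic predictor replaces the trained CNN, and the grasp vector g is updated along the analytic gradient of the success probability (the model, weights, and dimensions are illustrative assumptions, not the paper's network).

```python
import numpy as np

def grasp_success_prob(g, v, W, w_g, w_v, b):
    # Stand-in for the learned network: p(success | grasp g, visual features v).
    z = g @ w_g + v @ w_v + g @ W @ v + b
    return 1.0 / (1.0 + np.exp(-z))

def plan_grasp(g0, v, W, w_g, w_v, b, lr=0.5, steps=200):
    # Gradient ascent on the grasp configuration g; the model weights stay
    # fixed, mirroring inference-by-backpropagation through the predictor.
    g = g0.copy()
    for _ in range(steps):
        p = grasp_success_prob(g, v, W, w_g, w_v, b)
        grad = p * (1 - p) * (w_g + W @ v)   # dp/dg for this simple model
        g += lr * grad
    return g
```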
Sparse connectivity is an important factor behind the success of convolutional neural networks and recurrent neural networks.
In this paper, we consider the problem of learning sparse connectivity for feedforward neural networks (FNNs).
The key idea is that a unit should be connected to a small number of units at the next level below that are strongly correlated.
We use the Chow-Liu algorithm to learn a tree-structured probabilistic model for the units at the current level, use the tree to identify subsets of units that are strongly correlated, and introduce a new unit with a receptive field over each such subset.
The procedure is repeated on the new units to build multiple layers of hidden units.
The resulting model is called a TRF-net.
Empirical results show that, compared to dense FNNs, TRF-nets achieve better or comparable classification performance with far fewer parameters and sparser structures.
They are also more interpretable.
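The structure-learning step can be sketched as follows: estimate pairwise mutual information between units and extract a maximum-weight spanning tree (here via Prim's algorithm on binary data); the strongly correlated subsets then come from the tree's edges. This is a minimal illustration, not the paper's full TRF-net construction.

```python
import numpy as np

def mutual_info(x, y):
    # Empirical mutual information between two binary variables.
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pab = np.mean((x == a) & (y == b))
            pa, pb = np.mean(x == a), np.mean(y == b)
            if pab > 0:
                mi += pab * np.log(pab / (pa * pb))
    return mi

def chow_liu_tree(X):
    # Maximum spanning tree over pairwise mutual information (Prim's algorithm).
    d = X.shape[1]
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        best = max(((i, j, mutual_info(X[:, i], X[:, j]))
                    for i in in_tree for j in range(d) if j not in in_tree),
                   key=lambda e: e[2])
        edges.append((best[0], best[1]))
        in_tree.add(best[1])
    return edges
```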
In many common scenarios, programmers need to implement functionality that is already provided by some third party library.
This paper presents a tool called Hunter that facilitates code reuse by finding relevant methods in large code bases and automatically synthesizing any necessary wrapper code.
The key technical idea underlying our approach is to use types to both improve search results and guide synthesis.
Specifically, our method computes similarity metrics between types and uses this information to solve an integer linear programming (ILP) problem in which the objective is to minimize the cost of synthesis.
We have implemented Hunter as an Eclipse plug-in and evaluate it by (a) comparing it against S6, a state-of-the-art code reuse tool, and (b) performing a user study.
Our evaluation shows that Hunter compares favorably with S6 and significantly increases programmer productivity.
The application of psychophysiology in human-computer interaction is a growing field with significant potential for future smart personalised systems.
Working in this emerging field requires comprehension of an array of physiological signals and analysis techniques.
Eye tracking is a widely used method for tracking user attention via gaze location; through a number of eye- and eyelid-movement parameters, it also provides information on the user's internal cognitive and contextual state, intention, and the locus of visual attention in interactive settings.
This paper presents a short review on the application of eye tracking in human-computer interaction.
This paper aims to serve as a primer for the novice, enabling rapid familiarisation with the latest core concepts.
We put special emphasis on everyday human-computer interface applications to distinguish from the more common clinical or sports uses of psychophysiology.
This paper is an extract from a comprehensive review of the entire field of ambulatory psychophysiology, including 12 similar chapters, plus application guidelines and systematic review.
Thus any citation should be made using the following reference: B. Cowley, M. Filetti, K. Lukander, J. Torniainen, A. Henelius, L. Ahonen, O. Barral, I. Kosunen, T. Valtonen, M. Huotilainen, N. Ravaja, G. Jacucci.
The Psychophysiology Primer: a guide to methods and a broad review with a focus on human-computer interaction.
Foundations and Trends in Human-Computer Interaction, vol.9, no.3-4, pp.150--307, 2016.
Model based diagnosis finds a growing range of practical applications, and significant performance-wise improvements have been achieved in recent years.
Some of these improvements result from formulating the problem with maximum satisfiability (MaxSAT).
Whereas recent work focuses on analyzing failing observations separately, it is also the case that in practical settings there may exist many failing observations.
This paper first investigates the drawbacks of analyzing failing observations separately.
It then shows that existing solutions do not scale for large systems.
Finally, the paper proposes a novel approach for diagnosing systems with many failing observations.
The proposed approach is based on implicit hitting sets and so is tightly related with the original seminal work on model based diagnosis.
The experimental results demonstrate not only the importance of analyzing multiple observations simultaneously, but also the significance of the implicit hitting set approach.
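The hitting-set connection can be illustrated on a toy instance: given conflict sets (subsets of components that cannot all be healthy), the minimum-cardinality hitting sets are the candidate diagnoses in Reiter-style model-based diagnosis. The paper computes these implicitly via an oracle; this brute-force sketch simply enumerates.

```python
from itertools import combinations

def min_hitting_sets(conflicts, components):
    # Minimum-cardinality sets that intersect every conflict set; each one is
    # a candidate diagnosis in Reiter-style model-based diagnosis.
    for k in range(1, len(components) + 1):
        hits = [set(c) for c in combinations(components, k)
                if all(set(c) & set(conf) for conf in conflicts)]
        if hits:
            return hits
    return []
```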
A good measure of similarity between data points is crucial to many tasks in machine learning.
Similarity and metric learning methods learn such measures automatically from data, but they do not scale well with respect to the dimensionality of the data.
In this paper, we propose a method that can efficiently learn a similarity measure from high-dimensional sparse data.
The core idea is to parameterize the similarity measure as a convex combination of rank-one matrices with specific sparsity structures.
The parameters are then optimized with an approximate Frank-Wolfe procedure to maximally satisfy relative similarity constraints on the training data.
Our algorithm greedily incorporates one pair of features at a time into the similarity measure, providing an efficient way to control the number of active features and thus reduce overfitting.
It enjoys very appealing convergence guarantees and its time and memory complexity depends on the sparsity of the data instead of the dimension of the feature space.
Our experiments on real-world high-dimensional datasets demonstrate its potential for classification, dimensionality reduction and data exploration.
Reduction operations are extensively employed in many computational problems.
A reduction consists of combining all elements of a finite set of numeric values into a single value by means of a combiner function.
A parallel reduction, in turn, is the reduction operation concurrently performed when multiple execution units are available.
The current work reports an investigation on this subject and presents a GPU-based parallel approach for it.
Employing techniques like Loop Unrolling, Persistent Threads and Algebraic Expressions to avoid thread divergence, the presented approach achieved a 2.8x speedup over the work of Catanzaro, using generic, simple and easily portable code.
Experiments conducted to evaluate the approach show that the strategy is able to perform efficiently in AMD and NVidia's hardware, as well as in OpenCL and CUDA.
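The core reduction tree can be sketched sequentially: at each level, disjoint pairs are combined, and on a GPU all combines within one level run concurrently, giving O(log n) parallel steps. This is an illustrative model of the operation, not the GPU kernel itself.

```python
def tree_reduce(values, combine):
    # Pairwise tree reduction: the combines within each level are independent
    # and could execute concurrently on parallel hardware.
    vals = list(values)
    while len(vals) > 1:
        nxt = [combine(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # odd element passes through to the next level
        vals = nxt
    return vals[0]
```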
We present a framework to learn privacy-preserving encodings of images that inhibit inference of chosen private attributes, while allowing recovery of other desirable information.
Rather than simply inhibiting a given fixed pre-trained estimator, our goal is that an estimator be unable to learn to accurately predict the private attributes even with knowledge of the encoding function.
We use a natural adversarial optimization-based formulation for this---training the encoding function against a classifier for the private attribute, with both modeled as deep neural networks.
The key contribution of our work is a stable and convergent optimization approach that is successful at learning an encoder with our desired properties---maintaining utility while inhibiting inference of private attributes, not just within the adversarial optimization, but also by classifiers that are trained after the encoder is fixed.
We adopt a rigorous experimental protocol for verification wherein classifiers are trained exhaustively till saturation on the fixed encoders.
We evaluate our approach on tasks of real-world complexity---learning high-dimensional encodings that inhibit detection of different scene categories---and find that it yields encoders that are resilient at maintaining privacy.
Relational representations in reinforcement learning allow for the use of structural information like the presence of objects and relationships between them in the description of value functions.
Through this paper, we show that such representations allow for the inclusion of background knowledge that qualitatively describes a state and can be used to design agents that demonstrate learning behavior in domains with large state and action spaces, such as computer games.
Restricted Boltzmann Machines (RBMs) and models derived from them have been successfully used as basic building blocks in deep artificial neural networks for automatic features extraction, unsupervised weights initialization, but also as density estimators.
Thus, their generative and discriminative capabilities, but also their computational time are instrumental to a wide range of applications.
Our main contribution is to look at RBMs from a topological perspective, bringing insights from network science.
Firstly, we show that RBMs and Gaussian RBMs (GRBMs) are bipartite graphs which naturally have a small-world topology.
Secondly, we demonstrate both on synthetic and real-world datasets that by constraining RBMs and GRBMs to a scale-free topology (while still considering local neighborhoods and data distribution), we reduce the number of weights that need to be computed by a few orders of magnitude, at virtually no loss in generative performance.
Thirdly, we show that, for a fixed number of weights, our proposed sparse models (which by design have a higher number of hidden neurons) achieve better generative capabilities than standard fully connected RBMs and GRBMs (which by design have a smaller number of hidden neurons), at no additional computational costs.
Consider a device-to-device (D2D) fog-radio access network wherein a set of devices are required to store a set of files.
Each device is connected to a subset of the cloud data centers and thus possesses a subset of the data.
This paper investigates the problem of disseminating all files among the devices while reducing the total time of communication, i.e., the completion time, using instantly decodable network coding (IDNC).
While previous studies on the use of IDNC in D2D systems assume a fully connected communication network, this paper tackles the more realistic scenario of a partially connected network in which devices can only target devices in their transmission range.
The paper first formulates the optimal joint optimization of selecting the transmitting device(s) and the file combination(s) and exhibits its intractability.
The completion time is approximated using the celebrated decoding delay approach by deriving the relationship between the quantities in a partially connected network.
The paper introduces the cooperation graph and demonstrates that the relaxed problem is equivalent to a maximum weight clique problem over the newly designed graph wherein the weights are obtained by solving a similar problem on the local IDNC graphs.
Extensive simulations reveal that the proposed solution provides noticeable performance enhancement and outperforms previously proposed IDNC-based schemes.
A new algorithm to generate all Dyck words is presented, which is used in ranking and unranking Dyck words.
We emphasize the importance of using Dyck words in encoding objects related to Catalan numbers.
As a consequence of the formulas used in the ranking algorithm, we can obtain a recursive formula for the nth Catalan number.
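The counting fact underlying the ranking scheme is directly checkable: the number of Dyck words with n pairs of parentheses is the nth Catalan number. A minimal recursive generator (an illustration, not the paper's algorithm):

```python
from math import comb

def dyck_words(n):
    # All balanced strings of n '(' and n ')', generated recursively: a ')'
    # may be placed only when fewer closes than opens have been emitted.
    def rec(s, opens, closes):
        if opens == closes == n:
            yield s
            return
        if opens < n:
            yield from rec(s + '(', opens + 1, closes)
        if closes < opens:
            yield from rec(s + ')', opens, closes + 1)
    return list(rec('', 0, 0))

def catalan(n):
    # nth Catalan number: C(2n, n) / (n + 1).
    return comb(2 * n, n) // (n + 1)
```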
Bots have been playing a crucial role in online platform ecosystems, as efficient and automatic tools to generate content and diffuse information to the social media human population.
In this chapter, we will discuss the role of social bots in content spreading dynamics in social media.
In particular, we will first investigate some differences between diffusion dynamics of content generated by bots, as opposed to humans, in the context of political communication, then study the characteristics of bots behind the diffusion dynamics of social media spam campaigns.
Binarization of degraded historical manuscript images is an important pre-processing step for many document processing tasks.
We formulate binarization as a pixel classification learning task and apply a novel Fully Convolutional Network (FCN) architecture that operates at multiple image scales, including full resolution.
The FCN is trained to optimize a continuous version of the Pseudo F-measure metric and an ensemble of FCNs outperform the competition winners on 4 of 7 DIBCO competitions.
This same binarization technique can also be applied to different domains such as Palm Leaf Manuscripts with good performance.
We analyze the performance of the proposed model w.r.t. the architectural hyperparameters, size and diversity of training data, and the input features chosen.
Diagrammatic models of feeding choices reveal fundamental robotic behaviors.
Successful choices are reinforced by positive feedback, while unsuccessful ones by negative feedback.
This paper will address robotic feeding by causally relating consequential behavior subtended by a strong dependence upon survival.
Unlike their deterministic counterparts, static and stochastic vehicle routing problems (SS-VRPs) aim at modeling and solving real-life operational problems by considering uncertainty in the data.
We consider the SS-VRPTW-CR introduced in Saint-Guillain et al. (2017).
Like the SS-VRP introduced by Bertsimas (1992), we search for optimal first stage routes for a fleet of vehicles to handle a set of stochastic customer demands, i.e., demands are uncertain and we only know their probabilities.
In addition to capacity constraints, customer demands are also constrained by time windows.
Unlike existing SS-VRP variants, the SS-VRPTW-CR does not make any assumption on the time at which a stochastic demand is revealed, i.e., the reveal time is stochastic as well.
To handle this new problem, we introduce waiting locations: Each vehicle is assigned a sequence of waiting locations from which it may serve some associated demands, and the objective is to minimize the expected number of demands that cannot be satisfied in time.
In this paper, we propose two new recourse strategies for the SS-VRPTW-CR, together with closed-form expressions for efficiently computing their expectations: the first allows us to take vehicle capacities into account; the second allows us to optimize routes by avoiding some useless trips.
We propose two algorithms for searching for routes with optimal expected costs: The first one is an extended branch-and-cut algorithm, based on a stochastic integer formulation, and the second one is a local search based heuristic method.
We also introduce a new public benchmark for the SS-VRPTW-CR, based on real-world data coming from the city of Lyon.
We evaluate our two algorithms on this benchmark and empirically demonstrate the expected superiority of the SS-VRPTW-CR anticipative actions over a basic "wait-and-serve" policy.
In this paper we present Foggy, an architectural framework and software platform based on Open Source technologies.
Foggy orchestrates application workload, negotiates resources and supports IoT operations for multi-tier, distributed, heterogeneous and decentralized Cloud Computing systems.
Foggy is tailored for emerging domains such as 5G Networks and IoT, which demand resources and services to be distributed and located close to data sources and users following the Fog Computing paradigm.
Foggy provides a platform for infrastructure owners and tenants (i.e., application providers) offering functionality of negotiation, scheduling and workload placement taking into account traditional requirements (e.g. based on RAM, CPU, disk) and non-traditional ones (e.g. based on networking) as well as diversified constraints on location and access rights.
Economics and pricing of resources can also be considered by the Foggy model in a near future.
The ability of Foggy to find a trade-off between infrastructure owners' and tenants' needs, in terms of efficient and optimized use of the infrastructure while satisfying the application requirements, is demonstrated through three use cases in the video surveillance and vehicle tracking contexts.
This paper describes context-free grammar (CFG) based grammatical relations for Myanmar sentences, combined with a corpus-based function tagging system.
Part of the challenge of statistical function tagging for Myanmar sentences comes from the fact that Myanmar has free phrase order and a complex morphological system.
Function tagging is a pre-processing step to show grammatical relations of Myanmar sentences.
In the task of function tagging, which tags the function of Myanmar sentences with correct segmentation, POS (part-of-speech) tagging and chunking information, we use Naive Bayesian theory to disambiguate the possible function tags of a word.
We apply context free grammar (CFG) to find out the grammatical relations of the function tags.
We also create a functional annotated tagged corpus for Myanmar and propose the grammar rules for Myanmar sentences.
Experiments show that our analysis achieves a good result with simple sentences and complex sentences.
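The naive Bayes disambiguation step can be sketched as follows, with toy features and tag names that are purely illustrative: each candidate function tag is scored by its prior times the smoothed likelihood of the observed segmentation/POS/chunk features.

```python
from collections import Counter, defaultdict

def train_nb(tagged):
    # tagged: list of (feature list, function tag) pairs; the counts give
    # the naive Bayes estimates.
    tag_counts = Counter(tag for _, tag in tagged)
    feat_counts = defaultdict(Counter)
    for feats, tag in tagged:
        feat_counts[tag].update(feats)
    vocab = {f for feats, _ in tagged for f in feats}
    return tag_counts, feat_counts, vocab

def best_tag(feats, tag_counts, feat_counts, vocab, alpha=1.0):
    # argmax over tags of P(tag) * prod_f P(f | tag), add-alpha smoothed.
    total = sum(tag_counts.values())
    def score(tag):
        s = tag_counts[tag] / total
        denom = sum(feat_counts[tag].values()) + alpha * len(vocab)
        for f in feats:
            s *= (feat_counts[tag][f] + alpha) / denom
        return s
    return max(tag_counts, key=score)
```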
Bimanual gestures are of the utmost importance for the study of motor coordination in humans and in everyday activities.
A reliable detection of bimanual gestures in unconstrained environments is fundamental for their clinical study and to assess common activities of daily living.
This paper investigates techniques for a reliable, unconstrained detection and classification of bimanual gestures.
It assumes the availability of inertial data originating from the two hands/arms, builds upon a previously developed technique for gesture modelling based on Gaussian Mixture Modelling (GMM) and Gaussian Mixture Regression (GMR), and compares different modelling and classification techniques, which are based on a number of assumptions inspired by literature about how bimanual gestures are represented and modelled in the brain.
Experiments show results related to 5 everyday bimanual activities, which have been selected on the basis of three main parameters: (not) constraining the two hands by a physical tool, (not) requiring a specific sequence of single-hand gestures, being recursive (or not).
In the best performing combination of modeling approach and classification technique, five out of five activities are recognized up to an accuracy of 97%, a precision of 82% and a level of recall of 100%.
This paper describes QCRI's machine translation systems for the IWSLT 2016 evaluation campaign.
We participated in the Arabic->English and English->Arabic tracks.
We built both phrase-based and neural machine translation models, in an effort to probe whether the newly emerged NMT framework surpasses traditional phrase-based systems for Arabic-English language pairs.
We trained a very strong phrase-based system including a big language model, the Operation Sequence Model, the Neural Network Joint Model (NNJM) and class-based models, along with different domain adaptation techniques such as MML filtering, mixture modeling and fine-tuning of the NNJM model.
However, a neural MT system, trained by stacking data from different genres through fine-tuning and applying an ensemble over 8 models, beat our very strong phrase-based system by a significant margin of 2 BLEU points in the Arabic->English direction.
We did not obtain similar gains in the other direction but were still able to outperform the phrase-based system.
We also applied system combination on phrase-based and NMT outputs.
The splendid success of convolutional neural networks (CNNs) in computer vision is largely attributed to the availability of large annotated datasets, such as ImageNet and Places.
However, in biomedical imaging, it is very challenging to create such large annotated datasets, as annotating biomedical images is not only tedious, laborious, and time consuming, but also demanding of costly, specialty-oriented skills, which are not easily accessible.
To dramatically reduce annotation cost, this paper presents a novel method to naturally integrate active learning and transfer learning (fine-tuning) into a single framework, called AFT*, which starts directly with a pre-trained CNN to seek "worthy" samples for annotation and gradually enhance the (fine-tuned) CNN via continuous fine-tuning.
We have evaluated our method in three distinct biomedical imaging applications, demonstrating that it can cut the annotation cost by at least half, in comparison with the state-of-the-art method.
This performance is attributed to the several advantages derived from the advanced active, continuous learning capability of our method.
Although AFT* was initially conceived in the context of computer-aided diagnosis in biomedical imaging, it is generic and applicable to many tasks in computer vision and image analysis; we illustrate the key ideas behind AFT* with the Places database for scene interpretation in natural images.
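The sample-selection idea can be sketched with a generic uncertainty criterion: pick for annotation the candidates on which the current model is least confident. This stand-in (probability closest to 0.5) is only illustrative; AFT* defines "worthiness" via its own criteria on the pre-trained CNN's outputs.

```python
import numpy as np

def select_for_annotation(probs, k):
    # Return indices of the k most uncertain samples: predicted probability
    # closest to 0.5, most uncertain first.
    uncertainty = -np.abs(np.asarray(probs) - 0.5)
    return np.argsort(uncertainty)[-k:][::-1]
```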
In monocular vision systems, lack of knowledge about metric distances caused by the inherent scale ambiguity can be a strong limitation for some applications.
We offer a method for fusing inertial measurements with monocular odometry or tracking to estimate metric distances in inertial-monocular systems and to increase the rate of pose estimates.
Since the fusion is performed in a loosely-coupled manner, each input block can easily be replaced with one's preferred alternative, which makes our method quite flexible.
We evaluated our method using the ORB-SLAM algorithm for the monocular tracking input and Euler forward integration to process the inertial measurements.
We chose sets of data recorded on UAVs to design a suitable system for flying robots.
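The core scale-recovery idea admits a simple closed form: given per-interval displacement magnitudes from the inertial side and the (scale-ambiguous) monocular side, the least-squares metric scale minimizing ||d_imu - s * d_mono||^2 is s* = <d_imu, d_mono> / <d_mono, d_mono>. This is an illustrative estimator, not the paper's exact fusion filter.

```python
import numpy as np

def estimate_scale(inertial_disp, mono_disp):
    # Least-squares metric scale aligning monocular displacement magnitudes
    # with inertially integrated ones.
    d_i = np.asarray(inertial_disp)
    d_m = np.asarray(mono_disp)
    return float(d_i @ d_m / (d_m @ d_m))
```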
High Utility Itemset (HUI) mining problem is one of the important problems in the data mining literature.
The problem offers greater flexibility to a decision maker to incorporate her/his notion of utility into the pattern mining process.
The problem, however, requires the decision maker to choose a minimum utility threshold value for discovering interesting patterns.
This is quite challenging due to the disparate itemset characteristics and their utility distributions.
In order to address this issue, Top-K High Utility Itemset (THUI) mining problem was introduced in the literature.
THUI mining problem is primarily a variant of the HUI mining problem that allows a decision maker to specify the desired number of HUIs rather than the minimum utility threshold value.
Several algorithms have been introduced in the literature to efficiently mine top-k HUIs.
This paper systematically analyses the top-k HUI mining methods in the literature, describes the methods, and performs a comparative analysis.
The data structures, threshold raising strategies, and pruning strategies adopted for efficient top-k HUI mining are also presented and analysed.
Furthermore, the paper reviews several extensions of the top-k HUI mining problem such as data stream mining, sequential pattern mining and on-shelf utility mining.
The paper is likely to be useful for researchers to examine the key methods in top-k HUI mining, evaluate the gaps in literature, explore new research opportunities and enhance the state-of-the-art in high utility pattern mining.
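The top-k HUI problem itself can be stated in a few lines: keep the k itemsets with the highest total utility across all transactions containing them. The naive enumerator below illustrates the definition only; the surveyed algorithms replace the exhaustive scan with utility upper bounds, threshold-raising and pruning strategies.

```python
from itertools import combinations
import heapq

def top_k_hui(transactions, k):
    # transactions: list of {item: utility} dicts. A min-heap of size k keeps
    # the k highest-utility itemsets seen during exhaustive enumeration.
    items = sorted({i for t in transactions for i in t})
    heap = []  # (utility, itemset)
    for r in range(1, len(items) + 1):
        for iset in combinations(items, r):
            u = sum(sum(t[i] for i in iset) for t in transactions
                    if all(i in t for i in iset))
            heapq.heappush(heap, (u, iset))
            if len(heap) > k:
                heapq.heappop(heap)
    return sorted(heap, reverse=True)
```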
Unmanned Aerial Vehicles (UAVs) have been recently considered as means to provide enhanced coverage or relaying services to mobile users (MUs) in wireless systems with limited or no infrastructure.
In this paper, a UAV-based mobile cloud computing system is studied in which a moving UAV is endowed with computing capabilities to offer computation offloading opportunities to MUs with limited local processing capabilities.
The system aims at minimizing the total mobile energy consumption while satisfying quality of service requirements of the offloaded mobile application.
Offloading is enabled by uplink and downlink communications between the mobile devices and the UAV that take place by means of frequency division duplex (FDD) via orthogonal or non-orthogonal multiple access (NOMA) schemes.
The problem of jointly optimizing the bit allocation for uplink and downlink communication as well as for computing at the UAV, along with the cloudlet's trajectory under latency and UAV's energy budget constraints is formulated and addressed by leveraging successive convex approximation (SCA) strategies.
Numerical results demonstrate the significant energy savings that can be accrued by means of the proposed joint optimization of bit allocation and cloudlet's trajectory as compared to local mobile execution as well as to partial optimization approaches that design only the bit allocation or the cloudlet's trajectory.
Semi-supervised node classification in attributed graphs, i.e., graphs with node features, involves learning to classify unlabeled nodes given a partially labeled graph.
Label predictions are made by jointly modeling the node and its neighborhood features.
State-of-the-art models for node classification on such attributed graphs use differentiable recursive functions that enable aggregation and filtering of neighborhood information from multiple hops.
In this work, we analyze the representation capacity of these models to regulate information from multiple hops independently.
From our analysis, we conclude that these models, despite being powerful, have limited representation capacity to capture multi-hop neighborhood information effectively.
Further, we also propose a mathematically motivated, yet simple extension to existing graph convolutional networks (GCNs) which has improved representation capacity.
We extensively evaluate the proposed model, F-GCN on eight popular datasets from different domains.
F-GCN outperforms the state-of-the-art models for semi-supervised learning on six datasets while being extremely competitive on the other two.
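The single-hop aggregation step that such models build on can be sketched in a few lines of NumPy: add self-loops, symmetrically normalize the adjacency matrix, aggregate neighborhood features, and apply a linear transform with ReLU. This is the standard GCN layer, not the F-GCN extension itself.

```python
import numpy as np

def gcn_layer(A, X, W):
    # One graph-convolution step: H' = ReLU(D^-1/2 (A + I) D^-1/2 X W).
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)
```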
In this paper, we introduce a simple, yet powerful pipeline for medical image segmentation that combines Fully Convolutional Networks (FCNs) with Fully Convolutional Residual Networks (FC-ResNets).
We propose and examine a design that takes particular advantage of recent advances in the understanding of both Convolutional Neural Networks as well as ResNets.
Our approach focuses upon the importance of a trainable pre-processing when using FC-ResNets and we show that a low-capacity FCN model can serve as a pre-processor to normalize medical input data.
In our image segmentation pipeline, we use FCNs to obtain normalized images, which are then iteratively refined by means of an FC-ResNet to generate a segmentation prediction.
As in other fully convolutional approaches, our pipeline can be used off-the-shelf on different image modalities.
We show that using this pipeline, we exhibit state-of-the-art performance on the challenging Electron Microscopy benchmark, when compared to other 2D methods.
We improve segmentation results on CT images of liver lesions, when contrasting with standard FCN methods.
Moreover, when applying our 2D pipeline on a challenging 3D MRI prostate segmentation challenge we reach results that are competitive even when compared to 3D methods.
The obtained results illustrate the strong potential and versatility of the pipeline by achieving highly accurate results on multi-modality images from different anatomical regions and organs.
What properties about the internals of a program explain the possible differences in its overall running time for different inputs?
In this paper, we propose a formal framework for considering this question, which we dub trace-set discrimination.
We show that even though the algorithmic problem of computing maximum likelihood discriminants is NP-hard, approaches based on integer linear programming (ILP) and decision tree learning can be useful in zeroing-in on the program internals.
On a set of Java benchmarks, we find that compactly-represented decision trees scalably discriminate with high accuracy---more scalably than maximum likelihood discriminants and with comparable accuracy.
We demonstrate on three larger case studies how decision-tree discriminants produced by our tool are useful for debugging timing side-channel vulnerabilities (i.e., where a malicious observer infers secrets simply from passively watching execution times) and availability vulnerabilities.
We propose a Deep Texture Encoding Network (Deep-TEN) with a novel Encoding Layer integrated on top of convolutional layers, which ports the entire dictionary learning and encoding pipeline into a single model.
Current methods build from distinct components, using standard encoders with separate off-the-shelf features such as SIFT descriptors or pre-trained CNN features for material recognition.
Our new approach provides an end-to-end learning framework, where the inherent visual vocabularies are learned directly from the loss function.
The features, dictionaries and the encoding representation for the classifier are all learned simultaneously.
The representation is orderless and therefore is particularly useful for material and texture recognition.
The Encoding Layer generalizes robust residual encoders such as VLAD and Fisher Vectors, and has the property of discarding domain specific information which makes the learned convolutional features easier to transfer.
Additionally, joint training using multiple datasets of varied sizes and class labels is supported resulting in increased recognition performance.
The experimental results show superior performance as compared to state-of-the-art methods using gold-standard databases such as MINC-2500, Flickr Material Database, KTH-TIPS-2b, and two recent databases 4D-Light-Field-Material and GTOS.
The source code for the complete system is publicly available.
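The residual-encoding idea that the Encoding Layer generalizes can be sketched in NumPy: descriptors are softly assigned to learned codewords, and their residuals are aggregated per codeword. The scale s and shapes here are illustrative; in Deep-TEN the codewords and assignment weights are learned end-to-end.

```python
import numpy as np

def encoding_layer(X, C, s=1.0):
    # Soft-assignment residual encoding (VLAD-like): residuals of the n
    # descriptors X (n, d) to the k codewords C (k, d), weighted by a softmax
    # over negative scaled squared distances, aggregated per codeword.
    R = X[:, None, :] - C[None, :, :]       # (n, k, d) residuals
    d2 = (R ** 2).sum(-1)                   # (n, k) squared distances
    w = np.exp(-s * d2)
    w /= w.sum(axis=1, keepdims=True)       # soft assignments per descriptor
    return (w[:, :, None] * R).sum(axis=0)  # (k, d) aggregated encoding
```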
We present a formalized, fully decentralized runtime semantics for a core subset of ABS, a language and framework for modelling distributed object-oriented systems.
The semantics incorporates an abstract graph representation of a network infrastructure, with network endpoints represented as graph nodes, and links as arcs with buffers, corresponding to OSI layer 2 interconnects.
The key problem we wish to address is how to allocate computational tasks to nodes so that certain performance objectives are met.
To this end, we use the semantics as a foundation for performing network-adaptive task execution via object migration between nodes.
Adaptability is analyzed in terms of three Quality of Service objectives: node load, arc load and message latency.
We have implemented the key parts of our semantics in a simulator and evaluated how well objectives are achieved for some application-relevant choices of network topology, migration procedure and ABS program.
The evaluation suggests that it is feasible in a decentralized setting to continually meet both the objective of a node-balanced task allocation and make headway towards minimizing communication, and thus arc load and message latency.
Multiple automakers have in development or in production automated driving systems (ADS) that offer freeway-pilot functions.
This type of ADS is typically limited to restricted-access freeways only, that is, the transition from manual to automated modes takes place only after the ramp merging process is completed manually.
One major challenge in extending the automation to ramp merging is that the automated vehicle needs to incorporate and optimize long-term objectives (e.g. a successful and smooth merge) while near-term actions must be safely executed.
Moreover, the merging process involves interactions with other vehicles whose behaviors are sometimes hard to predict but may influence the merging vehicle's optimal actions.
To tackle such a complicated control problem, we propose to apply Deep Reinforcement Learning (DRL) techniques for finding an optimal driving policy by maximizing the long-term reward in an interactive environment.
Specifically, we apply a Long Short-Term Memory (LSTM) architecture to model the interactive environment, from which an internal state containing historical driving information is conveyed to a Deep Q-Network (DQN).
The DQN is used to approximate the Q-function, which takes the internal state as input and generates Q-values as output for action selection.
With this DRL architecture, the historical impact of interactive environment on the long-term reward can be captured and taken into account for deciding the optimal control policy.
The proposed architecture has the potential to be extended and applied to other autonomous driving scenarios such as driving through a complex intersection or changing lanes under varying traffic flow conditions.
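A toy sketch (not the authors' implementation) of the DQN step described above: a linear Q-function maps an internal state h (e.g. the LSTM's summary of driving history) to one Q-value per discrete action, the agent picks the argmax, and the standard Q-learning target propagates the long-term reward. The weights, bias, and action set are illustrative.

```python
def q_values(h, W, b):
    # Q(h, a) = W[a] . h + b[a], one value per discrete action
    return [sum(w_i * h_i for w_i, h_i in zip(W_a, h)) + b_a
            for W_a, b_a in zip(W, b)]

def greedy_action(h, W, b):
    q = q_values(h, W, b)
    return max(range(len(q)), key=lambda a: q[a])

def td_target(reward, next_h, W, b, gamma=0.99):
    # standard Q-learning target: r + gamma * max_a' Q(h', a')
    return reward + gamma * max(q_values(next_h, W, b))

# two actions (e.g. "yield" vs "merge"), three state features
W, b = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1]], [0.0, 0.1]
a = greedy_action([1.0, 1.0, 0.0], W, b)
```

In the full architecture, the state h would come from the LSTM rather than being hand-supplied, and W, b would be the learned DQN parameters.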
The way in which electric power depends on the topology of circuits with mixed voltage and current sources is examined.
The power flowing in any steady-state DC circuit is shown to depend on a minimal set of key variables called fundamental node voltages and fundamental edge currents.
Every steady-state DC circuit can be decomposed into a voltage controlled subcircuit and a current controlled subcircuit.
In terms of such a decomposition, the I^2R losses of a mixed source circuit are always the sum of losses on the voltage controlled subcircuit and the current controlled subcircuit.
The paper concludes by showing that the total power flowing in a mixed source circuit can be found as critical points of the power expressed in terms of the key voltage and current variables mentioned above.
The possible relationship to topology control of electric grid operations is discussed.
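A worked nodal-analysis example for a steady-state DC circuit with one voltage source and one current source (component values are illustrative): solving KCL at the single free node yields the node voltage, and the total I^2 R loss equals the power injected by both sources.

```python
# One free node v: a voltage source V feeds it through R1, a current source
# injects I into it, and R2 connects it to ground.
def solve_node(V, I, R1, R2):
    # KCL at node v: (V - v)/R1 + I = v/R2
    v = (V / R1 + I) / (1.0 / R1 + 1.0 / R2)
    i1 = (V - v) / R1          # current through R1 (from the voltage source)
    i2 = v / R2                # current through R2 (to ground)
    loss = i1 ** 2 * R1 + i2 ** 2 * R2
    injected = V * i1 + I * v  # power delivered by the two sources
    return v, loss, injected

v, loss, injected = solve_node(V=10.0, I=1.0, R1=2.0, R2=3.0)
```

The equality of `loss` and `injected` is the power-balance check that the decomposition into voltage- and current-controlled subcircuits refines.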
We present an effect system for core Eff, a simplified variant of Eff, which is an ML-style programming language with first-class algebraic effects and handlers.
We define an expressive effect system and prove safety of operational semantics with respect to it.
Then we give a domain-theoretic denotational semantics of core Eff, using Pitts's theory of minimal invariant relations, and prove it adequate.
We use this fact to develop tools for finding useful contextual equivalences, including an induction principle.
To demonstrate their usefulness, we use these tools to derive the usual equations for mutable state, including a general commutativity law for computations using non-interfering references.
We have formalized the effect system, the operational semantics, and the safety theorem in Twelf.
The rank aggregation problem has received significant recent attention within the computer science community.
Its applications today range far beyond the original aim of building metasearch engines to problems in machine learning, recommendation systems and more.
Several algorithms have been proposed for these problems, and in many cases approximation guarantees have been proven for them.
However, it is also known that some Markov chain based algorithms (MC1, MC2, MC3, MC4) perform extremely well in practice, yet had no known performance guarantees.
We prove supra-constant lower bounds on approximation guarantees for all of them.
We also raise the lower bound for sorting by Copeland score from 3/2 to 2 and prove an upper bound of 11, before showing that in particular ways, MC4 can nevertheless be seen as a generalization of Copeland score.
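For concreteness, sorting by Copeland score (the baseline whose bounds are discussed above) can be sketched as follows: each candidate's score is its number of pairwise majority wins over the input rankings, and the aggregate ranking sorts by that score. The rankings below are illustrative.

```python
def copeland_ranking(rankings, candidates):
    def beats(a, b):
        # a beats b if a is ranked above b in a strict majority of rankings
        wins = sum(r.index(a) < r.index(b) for r in rankings)
        return wins > len(rankings) / 2
    score = {c: sum(beats(c, d) for d in candidates if d != c)
             for c in candidates}
    return sorted(candidates, key=lambda c: -score[c])

rankings = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
order = copeland_ranking(rankings, ["a", "b", "c"])
```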
Use case driven development methodologies put use cases at the center of the software development process.
However, in order to support automated development and analysis, use cases need to be appropriately formalized.
This will also help guarantee consistency between requirements specifications and the developed solutions.
Formal methods tend to suffer from take-up issues, as they are usually hard for industry to accept.
In this context, it is relevant not only to produce languages and approaches to support formalization, but also to perform their validation.
In previous works we have developed an approach to formalize use cases resorting to ontologies.
In this paper we present the validation of one such approach.
Through a three stage study, we evaluate the acceptance of the language and supporting tool.
The first stage focuses on the acceptance of the process and language, the second on the support the tool provides to the process, and the third on the tool's usability aspects.
Results show test subjects found the approach feasible and useful and the tool easy to use.
Contemporary electricity distribution systems are being challenged by the variability of renewable energy sources.
Slow response times and long energy management periods prevent the efficient integration of intermittent renewable generation and demand.
Yet stochasticity can be judiciously coupled with system flexibilities to enhance grid operation efficiency.
Voltage magnitudes for instance can transiently exceed regulation limits, while smart inverters can be overloaded over short time intervals.
To implement such a mode of operation, an ergodic energy management framework is developed here.
Considering a distribution grid with distributed energy sources and a feed-in tariff program, active power curtailment and reactive power compensation are formulated as a stochastic optimization problem.
Tighter operational constraints are enforced in an average sense, while looser margins are enforced to be satisfied at all times.
Stochastic dual subgradient solvers are developed based on exact and approximate grid models of varying complexity.
Numerical tests on a real-world 56-bus distribution grid and the IEEE 123-bus test feeder relying on both grid models corroborate the advantages of the novel schemes over their deterministic alternatives.
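A toy sketch of the stochastic dual subgradient scheme described above (the problem data are illustrative, not the paper's grid models): maximize expected profit p_t * x_t subject to the average constraint E[x_t] <= c with x_t in [0, 1]. Each step solves the Lagrangian subproblem for the current sample and updates the multiplier along the constraint's stochastic subgradient.

```python
import random

def dual_subgradient(prices, c, step=0.05):
    lam, xs = 0.0, []
    for p in prices:
        # primal subproblem: maximize (p - lam) * x over x in [0, 1]
        x = 1.0 if p > lam else 0.0
        xs.append(x)
        # dual update: projected stochastic subgradient of E[x] - c
        lam = max(0.0, lam + step * (x - c))
    return lam, sum(xs) / len(xs)

random.seed(0)
prices = [random.random() for _ in range(5000)]
lam, avg_x = dual_subgradient(prices, c=0.3)
```

The constraint is enforced only on average, so individual samples may have x_t = 1 even though the long-run mean hovers near c; this mirrors how voltage and loading limits above are allowed transient excursions.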
It is well known that the emptiness problem for binary probabilistic automata and so for quantum automata is undecidable.
We present the current status of the emptiness problems for unary probabilistic and quantum automata with connections with Skolem's and positivity problems.
We also introduce the concept of linear recurrence automata in order to show the connection naturally.
Then, we also give possible generalizations of linear recurrence relations and automata on vectors.
Advances in machine learning have led to broad deployment of systems with impressive performance on important problems.
Nonetheless, these systems can be induced to make errors on data that are surprisingly similar to examples the learned system handles correctly.
The existence of these errors raises a variety of questions about out-of-sample generalization and whether bad actors might use such examples to abuse deployed systems.
As a result of these security concerns, there has been a flurry of recent papers proposing algorithms to defend against such malicious perturbations of correctly handled examples.
It is unclear how such misclassifications represent a different kind of security problem than other errors, or even other attacker-produced examples that have no specific relationship to an uncorrupted input.
In this paper, we argue that adversarial example defense papers have, to date, mostly considered abstract, toy games that do not relate to any specific security concern.
Furthermore, defense papers have not yet precisely described all the abilities and limitations of attackers that would be relevant in practical security.
Towards this end, we establish a taxonomy of motivations, constraints, and abilities for more plausible adversaries.
Finally, we provide a series of recommendations outlining a path forward for future work to more clearly articulate the threat model and perform more meaningful evaluation.
There is a growing demand for accurate high-resolution land cover maps in many fields, e.g., in land-use planning and biodiversity conservation.
Developing such maps has been performed using Object-Based Image Analysis (OBIA) methods, which usually reach good accuracies but require high human supervision, and the best configuration for one image can hardly be extrapolated to a different image.
Recently, the deep learning Convolutional Neural Networks (CNNs) have shown outstanding results in object recognition in the field of computer vision.
However, they have not been fully explored yet in land cover mapping for detecting species of high biodiversity conservation interest.
This paper analyzes the potential of CNNs-based methods for plant species detection using free high-resolution Google Earth™ images and provides an objective comparison with the state-of-the-art OBIA-methods.
We consider as case study the detection of Ziziphus lotus shrubs, which are protected as a priority habitat under the European Union Habitats Directive.
According to our results, compared to OBIA-based methods, the proposed CNN-based detection model, in combination with data augmentation, transfer learning and pre-processing, achieves higher performance with less human intervention. Moreover, the knowledge the model acquires from the first image can be transferred to other images, which makes the detection process very fast.
The provided methodology can be systematically reproduced for the detection of other species.
We introduce a game-theoretic framework to study the hypothesis testing problem, in the presence of an adversary aiming at preventing a correct decision.
Specifically, the paper considers a scenario in which an analyst has to decide whether a test sequence has been drawn according to a probability mass function (pmf) P_X or not.
In turn, the goal of the adversary is to take a sequence generated according to a different pmf and modify it in such a way to induce a decision error.
P_X is known only through one or more training sequences.
We derive the asymptotic equilibrium of the game under the assumption that the analyst relies only on first order statistics of the test sequence, and compute the asymptotic payoff of the game when the length of the test sequence tends to infinity.
We introduce the concept of indistinguishability region, as the set of pmf's that can not be distinguished reliably from P_X in the presence of attacks.
Two different scenarios are considered: in the first one the analyst and the adversary share the same training sequence, in the second scenario, they rely on independent sequences.
The obtained results are compared to a version of the game in which the pmf P_X is perfectly known to the analyst and the adversary.
Motivated by online advertising auctions, we consider repeated Vickrey auctions where goods of unknown value are sold sequentially and bidders only learn (potentially noisy) information about a good's value once it is purchased.
We adopt an online learning approach with bandit feedback to model this problem and derive bidding strategies for two models: stochastic and adversarial.
In the stochastic model, the observed values of the goods are random variables centered around the true value of the good.
In this case, logarithmic regret is achievable when competing against well-behaved adversaries.
In the adversarial model, the goods need not be identical and we simply compare our performance against that of the best fixed bid in hindsight.
We show that sublinear regret is also achievable in this case and prove matching minimax lower bounds.
To our knowledge, this is the first complete set of strategies for bidders participating in auctions of this type.
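A toy sketch (ours, not the paper's exact strategy) of optimistic bidding in repeated Vickrey auctions under the stochastic model: the bidder bids its empirical value estimate plus a confidence bonus, and only observes a noisy value sample on rounds it wins, which is the bandit feedback described above. All constants are illustrative.

```python
import math, random

def optimistic_bidder(true_value, others_bids, noise=0.05, seed=1):
    rng = random.Random(seed)
    n, total, wins = 0, 0.0, 0
    for t, b_other in enumerate(others_bids, start=1):
        est = total / n if n else 1.0           # optimistic initial estimate
        bonus = math.sqrt(2 * math.log(t + 1) / n) if n else 0.0
        bid = est + bonus
        if bid > b_other:                       # win: pay b_other (Vickrey)
            wins += 1
            sample = true_value + rng.uniform(-noise, noise)
            n, total = n + 1, total + sample
    return total / n, wins

others = [0.5] * 2000
est, wins = optimistic_bidder(true_value=0.7, others_bids=others)
```

Because the Vickrey payment is the second-highest bid, overbidding by the confidence bonus costs nothing when the true value exceeds the competition, which is what makes the optimistic strategy viable.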
Testing is the most widely employed method to find vulnerabilities in real-world software programs.
Compositional analysis, based on symbolic execution, is an automated testing method to find vulnerabilities in medium- to large-scale programs consisting of many interacting components.
However, existing compositional analysis frameworks do not assess the severity of reported vulnerabilities.
In this paper, we present a framework to analyze vulnerabilities discovered by an existing compositional analysis tool and assign CVSS3 (Common Vulnerability Scoring System v3.0) scores to them, based on various heuristics such as interaction with related components, ease of reachability, complexity of design and likelihood of accepting unsanitized input.
By analyzing vulnerabilities reported with CVSS3 scores in the past, we train simple machine learning models.
By presenting our interactive framework to developers of popular open-source software and other security experts, we gather feedback on our trained models and further improve the features to increase the accuracy of our predictions.
By providing qualitative (based on community feedback) and quantitative (based on prediction accuracy) evidence from 21 open-source programs, we show that our severity prediction framework can effectively assist developers with assessing vulnerabilities.
To aid a variety of research studies, we propose TWIROLE, a hybrid model for role-related user classification on Twitter, which detects male-related, female-related, and brand-related (i.e., organization or institution) users.
TWIROLE leverages features from tweet contents, user profiles, and profile images, and then applies our hybrid model to identify a user's role.
To evaluate it, we used two existing large datasets about Twitter users, and conducted both intra- and inter-comparison experiments.
TWIROLE outperforms existing methods and obtains more balanced results over the several roles.
We also confirm that user names and profile images are good indicators for this task.
Our research extends prior work that does not consider brand-related users, and is an aid to future evaluation efforts relative to investigations that rely upon self-labeled datasets.
In this paper we present OSC, a scientific workflow specification language based on software architecture principles.
In contrast with other approaches, OSC employs connectors as first-class constructs.
In this way, we leverage reusability and compositionality in the workflow modeling process, specially in the configuration of mechanisms that manage non-functional attributes.
We present a new computational model for gaze prediction in egocentric videos by exploring patterns in temporal shift of gaze fixations (attention transition) that are dependent on egocentric manipulation tasks.
Our assumption is that the high-level context of how a task is completed in a certain way has a strong influence on attention transition and should be modeled for gaze prediction in natural dynamic scenes.
Specifically, we propose a hybrid model based on deep neural networks which integrates task-dependent attention transition with bottom-up saliency prediction.
In particular, the task-dependent attention transition is learned with a recurrent neural network to exploit the temporal context of gaze fixations, e.g. looking at a cup after moving gaze away from a grasped bottle.
Experiments on public egocentric activity datasets show that our model significantly outperforms state-of-the-art gaze prediction methods and is able to learn meaningful transition of human attention.
We present an online deliberation system using mutual evaluation in order to collaboratively develop solutions.
Participants submit their proposals and evaluate each other's proposals; some of them may then be invited by the system to rewrite 'problematic' proposals.
Two cases are discussed: a proposal supported by many but not by a given person, who is then invited to rewrite it to make it even more acceptable; and a poorly presented but presumably interesting proposal.
The first of these cases has been successfully implemented.
Proposals are evaluated along two axes: understandability (or clarity, or, more generally, quality) and agreement.
The latter is used by the system to cluster proposals according to their ideas, while the former is used both to present the best proposals on top of their clusters and to find poorly written proposals that are candidates for rewriting.
These functionalities may be considered as important components of a large scale online deliberation system.
In this paper, we propose a multi-variable LSTM capable of accurate forecasting and variable importance interpretation for time series with exogenous variables.
Current attention mechanism in recurrent neural networks mostly focuses on the temporal aspect of data and falls short of characterizing variable importance.
To this end, the multi-variable LSTM equipped with tensorized hidden states is developed to learn hidden states for individual variables, which give rise to our mixture of temporal and variable attention.
Based on such attention mechanism, we infer and quantify variable importance.
Extensive experiments using real datasets with Granger-causality test and the synthetic dataset with ground truth demonstrate the prediction performance and interpretability of multi-variable LSTM in comparison to a variety of baselines.
It exhibits the prospect of multi-variable LSTM as an end-to-end framework for both forecasting and knowledge discovery.
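A minimal sketch of the variable-attention idea described above (the hidden states and scoring weights are illustrative, not the paper's trained values): each exogenous variable keeps its own hidden state, a scalar score per variable is turned into a softmax attention weight, and those weights double as a variable-importance measure.

```python
import math

def variable_attention(var_hidden_states, score_w):
    # one scalar score per variable: s_j = w . h_j
    scores = [sum(w * h for w, h in zip(score_w, hj))
              for hj in var_hidden_states]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    Z = sum(exps)
    weights = [e / Z for e in exps]            # attention = importance proxy
    # mix the per-variable states into one context vector
    mixed = [sum(wj * hj[d] for wj, hj in zip(weights, var_hidden_states))
             for d in range(len(var_hidden_states[0]))]
    return weights, mixed

H = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.1]]       # 3 variables, 2-dim states
weights, mixed = variable_attention(H, score_w=[1.0, 1.0])
```

Reading off the softmax weights is what allows the same trained model to serve both forecasting and knowledge discovery.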
Computationally synthesized blood vessels can be used for training and evaluation of medical image analysis applications.
We propose a deep generative model to synthesize blood vessel geometries, with an application to coronary arteries in cardiac CT angiography (CCTA).
In the proposed method, a Wasserstein generative adversarial network (GAN) consisting of a generator and a discriminator network is trained.
While the generator tries to synthesize realistic blood vessel geometries, the discriminator tries to distinguish synthesized geometries from those of real blood vessels.
Both real and synthesized blood vessel geometries are parametrized as 1D signals based on the central vessel axis.
The generator can optionally be provided with an attribute vector to synthesize vessels with particular characteristics.
The GAN was optimized using a reference database with parametrizations of 4,412 real coronary artery geometries extracted from CCTA scans.
After training, plausible coronary artery geometries could be synthesized based on random vectors sampled from a latent space.
A qualitative analysis showed strong similarities between real and synthesized coronary arteries.
A detailed analysis of the latent space showed that the diversity present in coronary artery anatomy was accurately captured by the generator.
Results show that Wasserstein generative adversarial networks can be used to synthesize blood vessel geometries.
This paper presents an open platform, which collects multimodal environmental data related to air quality from several sources including official open sources, social media and citizens.
Collecting and fusing different sources of air quality data into a unified air quality indicator is a highly challenging problem. It leverages recent advances in image analysis, open hardware, machine learning and data fusion, and is expected to result in increased geographical coverage and temporal granularity of air quality data.
Robust Stable Marriage (RSM) is a variant of the classical Stable Marriage problem, where the robustness of a given stable matching is measured by the number of modifications required for repairing it in case an unforeseen event occurs.
We focus on the complexity of finding an (a,b)-supermatch.
An (a,b)-supermatch is defined as a stable matching in which if any 'a' (non-fixed) men/women break up it is possible to find another stable matching by changing the partners of those 'a' men/women and also the partners of at most 'b' other couples.
In order to show that deciding whether there exists an (a,b)-supermatch is NP-Complete, we first introduce a SAT formulation that is shown to be NP-Complete using Schaefer's Dichotomy Theorem.
Then, we show the equivalence between the SAT formulation and finding a (1,1)-supermatch on a specific family of instances.
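The robustness notions above build on ordinary stability, which can be checked directly: a matching is stable iff no man-woman pair mutually prefer each other to their assigned partners (a "blocking pair"). The preference lists below are illustrative.

```python
def is_stable(matching, men_prefs, women_prefs):
    # matching: dict man -> woman
    partner_of = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break  # remaining women rank below m's current partner
            wp = women_prefs[w]
            if wp.index(m) < wp.index(partner_of[w]):
                return False  # (m, w) is a blocking pair
    return True

men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m1", "m2"], "w2": ["m1", "m2"]}
stable = is_stable({"m1": "w1", "m2": "w2"}, men, women)
unstable = is_stable({"m1": "w2", "m2": "w1"}, men, women)
```

An (a,b)-supermatch additionally requires that this property be repairable after any 'a' break-ups by changing at most 'b' other couples, which is the source of the hardness.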
We propose Nazr-CNN, a deep learning pipeline for object detection and fine-grained classification in images acquired from Unmanned Aerial Vehicles (UAVs) for damage assessment and monitoring.
Nazr-CNN consists of two components.
The function of the first component is to localize objects (e.g. houses or infrastructure) in an image by carrying out a pixel-level classification.
In the second component, a hidden layer of a Convolutional Neural Network (CNN) is used to encode Fisher Vectors (FV) of the segments generated from the first component in order to help discriminate between different levels of damage.
To showcase our approach we use data from UAVs that were deployed to assess the level of damage in the aftermath of a devastating cyclone that hit the island of Vanuatu in 2015.
The collected images were labeled by a crowdsourcing effort and the labeling categories consisted of fine-grained levels of damage to built structures.
Since our data set is relatively small, a pre-trained network for pixel-level classification and FV encoding was used.
Nazr-CNN attains promising results both for object detection and damage assessment suggesting that the integrated pipeline is robust in the face of small data sets and labeling errors by annotators.
While the focus of Nazr-CNN is on assessment of UAV images in a post-disaster scenario, our solution is general and can be applied in many diverse settings.
We show one such case of transfer learning to assess the level of damage in aerial images collected after a typhoon in the Philippines.
The most widely used activation functions in current deep feed-forward neural networks are rectified linear units (ReLU), and many alternatives have been successfully applied, as well.
However, none of the alternatives have managed to consistently outperform the rest and there is no unified theory connecting properties of the task and network with properties of activation functions for most efficient training.
A possible solution is to have the network learn its preferred activation functions.
In this work, we introduce Adaptive Blending Units (ABUs), a trainable linear combination of a set of activation functions.
Since ABUs learn the shape, as well as the overall scaling of the activation function, we also analyze the effects of adaptive scaling in common activation functions.
We experimentally demonstrate advantages of both adaptive scaling and ABUs over common activation functions across a set of systematically varied network specifications.
We further show that adaptive scaling works by mitigating covariate shifts during training, and that the observed advantages in performance of ABUs likewise rely largely on the activation function's ability to adapt over the course of training.
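A minimal sketch of an Adaptive Blending Unit as described above: the unit's output is a learnable linear combination sum_i alpha_i * f_i(x) over a fixed set of activation functions. The blending weights below are illustrative, not trained values.

```python
import math

# the fixed set of candidate activation functions
ACTIVATIONS = [
    lambda x: max(0.0, x),                 # ReLU
    math.tanh,                             # tanh
    lambda x: x,                           # identity
]

def abu(x, alphas):
    # blending weights learn both the shape and the overall scale
    return sum(a * f(x) for a, f in zip(alphas, ACTIVATIONS))

y = abu(1.0, alphas=[0.5, 0.25, 0.25])
```

Because the alphas are unconstrained, the unit subsumes adaptive scaling of any single activation (set one alpha free and the rest to zero), which is why the paper analyzes scaling separately.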
We explore learning-based approaches for feedback control of a dexterous five-finger hand performing non-prehensile manipulation.
First, we learn local controllers that are able to perform the task starting at a predefined initial state.
These controllers are constructed using trajectory optimization with respect to locally-linear time-varying models learned directly from sensor data.
In some cases, we initialize the optimizer with human demonstrations collected via teleoperation in a virtual environment.
We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state.
We then consider two interpolation methods for generalizing to a wider range of initial conditions: deep learning, and nearest neighbors.
We find that nearest neighbors achieve higher performance.
Nevertheless, the neural network has its advantages: it uses only tactile and proprioceptive feedback but no visual feedback about the object (i.e. it performs the task blind) and learns a time-invariant policy.
In contrast, the nearest neighbors method switches between time-varying local controllers based on the proximity of initial object states sensed via motion capture.
While both generalization methods leave room for improvement, our work shows that (i) local trajectory-based controllers for complex non-prehensile manipulation tasks can be constructed from surprisingly small amounts of training data, and (ii) collections of such controllers can be interpolated to form more global controllers.
Results are summarized in the supplementary video: https://youtu.be/E0wmO6deqjo
We introduce a deep residual recurrent neural network (DR-RNN) as an efficient model reduction technique for nonlinear dynamical systems.
The developed DR-RNN is inspired by the iterative steps of line search methods in finding the residual minimiser of numerically discretized differential equations.
We formulate this iterative scheme as a stacked recurrent neural network (RNN) embedded with the dynamical structure of the emulated differential equations.
Numerical examples demonstrate that DR-RNN can effectively emulate the full order models of nonlinear physical systems with a significantly lower number of parameters in comparison to standard RNN architectures.
Further, we combined DR-RNN with Proper Orthogonal Decomposition (POD) for model reduction of time dependent partial differential equations.
The presented numerical results show the stability of the proposed DR-RNN as an explicit reduced order technique.
We also show significant gains in accuracy by increasing the depth of the proposed DR-RNN, similar to other applications of deep learning.
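A toy sketch of the residual-driven iteration that DR-RNN mimics (learned weights are replaced here by a fixed illustrative step size w): for the test ODE dy/dt = -y discretized with implicit Euler, each "RNN layer" damps the discrete residual r(y) = y - y_prev + dt * y.

```python
def dr_rnn_step(y_prev, dt=0.1, layers=20, w=0.5):
    y = y_prev                      # initial guess for the new state
    for _ in range(layers):
        r = y - y_prev + dt * y     # residual of the implicit Euler equation
        y = y - w * r               # stacked residual-update layers
    return y

def simulate(y0=1.0, steps=10):
    y = y0
    for _ in range(steps):
        y = dr_rnn_step(y)
    return y

y_final = simulate()
```

In the actual DR-RNN the damping factors are trainable per layer, so far fewer layers suffice than this fixed-step line-search caricature needs.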
Cybercrime forums enable modern criminal entrepreneurs to collaborate with other criminals into increasingly efficient and sophisticated criminal endeavors.
Understanding the connections between different products and services can often illuminate effective interventions.
However, generating this understanding of supply chains currently requires time-consuming manual effort.
In this paper, we propose a language-agnostic method to automatically extract supply chains from cybercrime forum posts and replies.
Our supply chain detection algorithm can identify 36% and 58% of relevant chains within major English and Russian forums, respectively, improving over baselines of 13% and 36%, respectively.
Our analysis of the automatically generated supply chains demonstrates underlying connections between products and services within these forums.
For example, the extracted supply chain illuminated the connection between hack-for-hire services and the selling of rare and valuable `OG' accounts, which has only recently been reported.
The understanding of connections between products and services exposes potentially effective intervention points.
Millimeter-wave (mmWave), with its large available spectrum, is considered the most promising frequency band for future wireless communications.
The IEEE 802.11ad and IEEE 802.11ay operating on 60 GHz mmWave are the two most expected wireless local area network (WLAN) technologies for ultra-high-speed communications.
For the IEEE 802.11ay standard still under development, there are plenty of proposals from companies and researchers who are involved with the IEEE 802.11ay task group.
In this survey, we conduct a comprehensive review of medium access control (MAC) layer issues for the IEEE 802.11ay; some cross-layer technologies between the physical layer (PHY) and MAC are also included.
We start with MAC related technologies in the IEEE 802.11ad and discuss design challenges on mmWave communications, leading to some MAC related technologies for the IEEE 802.11ay.
We then elaborate on important design issues for IEEE 802.11ay.
Specifically, we review the channel bonding and aggregation for the IEEE 802.11ay, and point out the major differences between the two technologies.
Then, we describe channel access and channel allocation in the IEEE 802.11ay, including spatial sharing and interference mitigation technologies.
After that, we present an in-depth survey on beamforming training (BFT), beam tracking, single-user multiple-input-multiple-output (SU-MIMO) beamforming and multi-user multiple-input-multiple-output (MU-MIMO) beamforming.
Finally, we discuss some open design issues and future research directions for mmWave WLANs.
We hope that this paper provides a good introduction to this exciting research area for future wireless systems.
Feature representation of different modalities is the main focus of current cross-modal information retrieval research.
Existing models typically project texts and images into the same embedding space.
In this paper, we explore the multitude of textual relationships in text modeling.
Specifically, texts are represented by a graph generated using various textual relationships including semantic relations, statistical co-occurrence, and a predefined knowledge base.
A joint neural model is proposed to learn feature representation individually in each modality.
We use Graph Convolutional Network (GCN) to capture relation-aware representations of texts and Convolutional Neural Network (CNN) to learn image representations.
Comprehensive experiments are conducted on two benchmark datasets.
The results show that our model outperforms the state-of-the-art models significantly by 6.3% on the CMPlaces data and 3.4% on English Wikipedia, respectively.
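For the text modality, the GCN propagation rule can be sketched as H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W). The toy 3-node word graph and weights below are illustrative, not the trained model.

```python
import math

def gcn_layer(A, H, W):
    n = len(A)
    # add self-loops, then symmetrically normalize by node degree
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in A_hat]
    A_norm = [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
              for i in range(n)]
    out = []
    for i in range(n):
        # aggregate neighbor features, then apply the linear map and ReLU
        agg = [sum(A_norm[i][k] * H[k][d] for k in range(n))
               for d in range(len(H[0]))]
        row = [max(0.0, sum(agg[d] * W[d][o] for d in range(len(agg))))
               for o in range(len(W[0]))]
        out.append(row)
    return out

A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]          # word co-occurrence graph
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]       # initial word features
W = [[1.0, 0.0], [0.0, 1.0]]
H2 = gcn_layer(A, H, W)
```

Stacking such layers lets each word's representation absorb its graph neighborhood, which is what makes the text features "relation-aware" before they are aligned with the CNN image features.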
Malignant Pleural Mesothelioma (MPM) or malignant mesothelioma (MM) is an atypical, aggressive tumor that matures into cancer in the pleura, a stratum of tissue bordering the lungs.
Diagnosis of MPM is difficult and it accounts for about seventy-five percent of all mesothelioma diagnosed yearly in the United States of America.
Being a fatal disease, early identification of MPM is crucial for patient survival.
Our study implements logistic regression and develops association rules to identify early stage symptoms of MM.
We retrieved medical reports generated by Dicle University and implemented logistic regression to measure the model accuracy.
We conducted (a) logistic correlation, (b) Omnibus test and (c) Hosmer and Lemeshow test for model evaluation.
Moreover, we also developed association rules by confidence, rule support, lift, condition support and deployability.
Categorical logistic regression increases the training accuracy from 72.30% to 81.40% with a testing accuracy of 63.46%.
The study also shows the top 5 symptoms that most likely indicate the presence of MM.
This study concludes that using predictive modeling can enhance primary presentation and diagnosis of MM.
Many people use Yelp to find a good restaurant.
Nonetheless, with only an overall rating for each restaurant, Yelp does not offer enough information for independently judging its various aspects such as environment, service or flavor.
In this paper, we introduced a machine learning based method to characterize such aspects for particular types of restaurants.
Our main approach is to use a support vector machine (SVM) model to decipher the sentiment tendency of each review from word frequency.
Word scores generated from the SVM models are further processed into a polarity index indicating the significance of each word for special types of restaurant.
Customers overall tend to express more sentiment regarding service.
As for the distinction between different cuisines, results that match common sense are obtained: Japanese cuisine is usually fresh, some French cuisine is overpriced, while Italian restaurants are often famous for their pizzas.
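The word-polarity idea can be illustrated with a simple frequency-based score; a log-odds ratio stands in here for the SVM-derived word weights, and the example reviews are invented, not drawn from Yelp.

```python
import math
from collections import Counter

def word_polarity(pos_reviews, neg_reviews, smoothing=1.0):
    """Score each word by its smoothed log-odds of appearing in positive
    versus negative reviews -- a simple stand-in for SVM word scores."""
    pos_counts = Counter(w for r in pos_reviews for w in r.lower().split())
    neg_counts = Counter(w for r in neg_reviews for w in r.lower().split())
    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values()) + smoothing * len(vocab)
    neg_total = sum(neg_counts.values()) + smoothing * len(vocab)
    polarity = {}
    for w in vocab:
        p_pos = (pos_counts[w] + smoothing) / pos_total
        p_neg = (neg_counts[w] + smoothing) / neg_total
        polarity[w] = math.log(p_pos / p_neg)   # >0 leans positive, <0 negative
    return polarity

pos = ["fresh fish great service", "great pizza fresh dough"]
neg = ["overpriced bland fish", "slow service overpriced menu"]
scores = word_polarity(pos, neg)
```

Words concentrated in one class get strong signed scores, which is the intuition behind the polarity index described above.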
Identifying potential abuses of human rights through imagery is a novel and challenging task in the field of computer vision that will enable the exposure of human rights violations in large-scale data that may otherwise be impossible.
While standard databases for object and scene categorisation contain hundreds of different classes, the largest available dataset of human rights violations contains only 4 classes.
Here, we introduce the `Human Rights Archive Database' (HRA), a verified-by-experts repository of 3050 human rights violations photographs, labelled with human rights semantic categories, comprising a list of the types of human rights abuses encountered at present.
With the HRA dataset and a two-phase transfer learning scheme, we fine-tuned the state-of-the-art deep convolutional neural networks (CNNs) to provide human rights violations classification CNNs (HRA-CNNs).
We also present extensive experiments refined to evaluate how well object-centric and scene-centric CNN features can be combined for the task of recognising human rights abuses.
With this, we show that HRA database poses a challenge at a higher level for the well studied representation learning methods, and provide a benchmark in the task of human rights violations recognition in visual context.
We expect this dataset can help to open up new horizons on creating systems capable of recognising rich information about human rights violations.
Our dataset, codes and trained models are available online at https://github.com/GKalliatakis/Human-Rights-Archive-CNNs.
Human activity recognition based on video streams has received considerable attention in recent years.
Due to lack of depth information, RGB video based activity recognition performs poorly compared to RGB-D video based solutions.
On the other hand, acquiring depth information, inertial data, etc. is costly and requires special equipment, whereas RGB video streams are available in ordinary cameras.
Hence, our goal is to investigate whether similar or even higher accuracy can be achieved with RGB-only modality.
In this regard, we propose a novel framework that couples skeleton data extracted from RGB video and deep Bidirectional Long Short Term Memory (BLSTM) model for activity recognition.
A big challenge of training such a deep network is the limited training data, and exploring the RGB-only stream significantly exacerbates the difficulty.
We therefore propose a set of algorithmic techniques to train this model effectively, e.g., data augmentation, dynamic frame dropout and gradient injection.
The experiments demonstrate that our RGB-only solution surpasses the state-of-the-art approaches that all exploit RGB-D video streams by a notable margin.
This makes our solution widely deployable with ordinary cameras.
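The frame-dropout augmentation idea mentioned above can be sketched as a random thinning of the training sequence; the dropout rate and the guarantee of keeping at least one frame are illustrative assumptions, not the paper's exact schedule.

```python
import random

def dynamic_frame_dropout(frames, drop_rate, rng=None):
    """Randomly drop frames from a training sequence, preserving order --
    a sketch of the frame-dropout augmentation idea."""
    rng = rng or random.Random()
    kept = [f for f in frames if rng.random() >= drop_rate]
    # Always keep at least one frame so the sequence is never empty.
    return kept if kept else [frames[rng.randrange(len(frames))]]

sequence = list(range(20))   # stand-in for 20 skeleton frames
augmented = dynamic_frame_dropout(sequence, drop_rate=0.3, rng=random.Random(0))
```

Each training epoch sees a different thinned view of the same clip, which effectively multiplies the limited training data.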
People with profound motor deficits could perform useful physical tasks for themselves by controlling robots that are comparable to the human body.
Whether this is possible without invasive interfaces has been unclear, due to the robot's complexity and the person's limitations.
We developed a novel, augmented reality interface and conducted two studies to evaluate the extent to which it enabled people with profound motor deficits to control robotic body surrogates.
Fifteen novice users achieved meaningful improvements on a clinical manipulation assessment when controlling the robot in Atlanta from locations across the United States.
Also, one expert user performed 59 distinct tasks in his own home over seven days, including self-care tasks such as feeding.
Our results demonstrate that people with profound motor deficits can effectively control robotic body surrogates without invasive interfaces.
The advent of language implementation tools such as PyPy and Truffle/Graal has reinvigorated and broadened interest in topics related to automatic compiler generation and optimization.
Given this broader interest, we revisit the Futamura Projections using a novel diagram scheme.
Through these diagrams we emphasize the recurring patterns in the Futamura Projections while addressing their complexity and abstract nature.
We anticipate that this approach will improve the accessibility of the Futamura Projections and help foster analysis of those new tools through the lens of partial evaluation.
The share of videos in internet traffic has been growing; therefore, understanding how videos capture attention on a global scale is also of growing importance.
Most current research focuses on modeling the number of views, but we argue that video engagement, or time spent watching, is a more appropriate measure for resource allocation problems in attention, networking, and promotion activities.
In this paper, we present a first large-scale measurement of video-level aggregate engagement from publicly available data streams, on a collection of 5.3 million YouTube videos published over two months in 2016.
We study a set of metrics including time and the average percentage of a video watched.
We define a new metric, relative engagement, that is calibrated against video properties and strongly correlates with recognized notions of quality.
Moreover, we find that engagement measures of a video are stable over time, thus separating the concerns for modeling engagement and those for popularity -- the latter is known to be unstable over time and driven by external promotions.
We also find engagement metrics predictable from a cold-start setup, with most of their variance explained by video context, topics, and channel information (R2 = 0.77).
Our observations imply several prospective uses of engagement metrics -- choosing engaging topics for video production, or promoting engaging videos in recommender systems.
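One way to realize a calibrated engagement metric is to rank a video's average watch percentage against videos of comparable duration; the duration band and the toy reference data below are illustrative assumptions, not the paper's exact calibration.

```python
from bisect import bisect_left

def relative_engagement(duration, watch_pct, reference):
    """Rank-percentile of a video's average watch percentage among reference
    videos of similar duration -- a sketch of the calibration idea."""
    peers = sorted(p for d, p in reference if 0.5 * duration <= d <= 2 * duration)
    if not peers:
        return 0.5   # no peers: fall back to a neutral score
    return bisect_left(peers, watch_pct) / len(peers)

# (duration_seconds, average_percentage_watched) for hypothetical videos
reference = [(60, 0.9), (90, 0.7), (120, 0.5), (100, 0.6), (80, 0.8)]
score = relative_engagement(100, 0.75, reference)
```

Because longer videos are naturally watched to a smaller fraction, comparing only against similar-duration peers removes that bias from the engagement score.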
Android Inter-Component Communication (ICC) is complex, largely unconstrained, and hard for developers to understand.
As a consequence, ICC is a common source of security vulnerability in Android apps.
To promote secure programming practices, we have reviewed related research, and identified avoidable ICC vulnerabilities in Android-run devices and the security code smells that indicate their presence.
We explain the vulnerabilities and their corresponding smells, and we discuss how they can be eliminated or mitigated during development.
We present a lightweight static analysis tool on top of Android Lint that analyzes the code under development and provides just-in-time feedback within the IDE about the presence of such smells in the code.
Moreover, with the help of this tool we study the prevalence of security code smells in more than 700 open-source apps, and manually inspect around 15% of the apps to assess the extent to which identifying such smells uncovers ICC security vulnerabilities.
In this paper, we attempt to advance the research work done in human action recognition to a rather specialized application namely Indian Classical Dance (ICD) classification.
The variation in such dance forms in terms of hand and body postures, facial expressions or emotions and head orientation makes pose estimation an extremely challenging task.
To circumvent this problem, we construct a pose-oblivious shape signature which is fed to a sequence learning framework.
The pose signature representation is done in a two-fold process.
First, we represent person-pose in first frame of a dance video using symmetric Spatial Transformer Networks (STN) to extract good person object proposals and CNN-based parallel single person pose estimator (SPPE).
Next, the pose bases are converted to pose flows by assigning a similarity score between successive poses, followed by non-maximal suppression.
Instead of feeding a simple chain of joints into the sequence learner, which generally hinders network performance, we constitute a feature vector of the normalized distance vectors, flow, and angles between anchor joints, which captures the adjacency configuration in the skeletal pattern.
Thus, the kinematic relationship amongst the body joints across the frames using pose estimation helps in better establishing the spatio-temporal dependencies.
We present an exhaustive empirical evaluation of state-of-the-art deep network based methods for dance classification on ICD dataset.
Replikativ is a replication middleware supporting a new kind of confluent replicated datatype resembling a distributed version control system.
It retains the order of write operations at the trade-off of reduced availability with after-the-fact conflict resolution.
The system allows applications with distributed state to be developed in a similar fashion to native applications with exclusive local state, while transparently exposing the necessary compromises in terms of the CAP theorem.
In this paper, we give a specification of the replicated datatype and discuss its usage in the replikativ middleware.
Experiments with the implementation show the feasibility of the concept as a foundation for replication as a service (RaaS).
In this paper, we investigate the strength of six different SAT solvers in attacking various obfuscation schemes.
Our investigation revealed that Glucose and Lingeling SAT solvers are generally suited for attacking small-to-midsize obfuscated circuits, while the MapleGlucose, if the system is not memory bound, is best suited for attacking mid-to-difficult obfuscation methods.
Our experimental result indicates that when dealing with extremely large circuits and very difficult obfuscation problems, the SAT solver may be memory bound, and Lingeling, which has the most memory-efficient implementation, is the best-suited solver for such problems.
Additionally, our investigation revealed that SAT solver execution times may vary widely across different SAT solvers.
Hence, when testing the hardness of an obfuscation method, although the increase in difficulty could be verified by one SAT solver, the pace of increase in difficulty is dependent on the choice of a SAT solver.
Many real-world problems involve massive amounts of data.
Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue to be addressed.
A common approach is to perform sampling to reduce the size of the dataset and enable efficient learning.
Alternatively, one customizes learning algorithms to achieve scalability.
In either case, the key challenge is to obtain algorithmic efficiency without compromising the quality of the results.
In this paper we discuss a meta-learning algorithm (PSBML) which combines features of parallel algorithms with concepts from ensemble and boosting methodologies to achieve the desired scalability property.
We present both theoretical and empirical analyses which show that PSBML preserves a critical property of boosting, specifically, convergence to a distribution centered around the margin.
We then present additional empirical analyses showing that this meta-level algorithm provides a general and effective framework that can be used in combination with a variety of learning classifiers.
We perform extensive experiments to investigate the tradeoff achieved between scalability and accuracy, and robustness to noise, on both synthetic and real-world data.
These empirical results corroborate our theoretical analysis, and demonstrate the potential of PSBML in achieving scalability without sacrificing accuracy.
We present an iterative overlap estimation technique to augment existing point cloud registration algorithms that can achieve high performance in difficult real-world situations where large pose displacement and non-overlapping geometry would otherwise cause traditional methods to fail.
Our approach estimates overlapping regions through an iterative Expectation Maximization procedure that encodes the sensor field-of-view into the registration process.
The proposed technique, Expected Overlap Estimation (EOE), is derived from the observation that differences in field-of-view violate the iid assumption implicitly held by all maximum likelihood based registration techniques.
We demonstrate how our approach can augment many popular registration methods with minimal computational overhead.
Through experimentation on both synthetic and real-world datasets, we find that adding an explicit overlap estimation step can aid robust outlier handling and increase the accuracy of both ICP-based and GMM-based registration methods, especially in large unstructured domains and where the amount of overlap between point clouds is very small.
Researchers find weaknesses in current strategies for protecting privacy in large datasets.
Many anonymized datasets are reidentifiable, and norms for offering data subjects notice and consent overemphasize individual responsibility.
Based on fieldwork with data managers in the City of Seattle, I identify ways that these conventional approaches break down in practice.
Drawing on work from theorists in sociocultural anthropology, I propose that a Human Centered Data Science move beyond concepts like dataset identifiability and sensitivity toward a broader ontology of who is implicated by a dataset, and new ways of anticipating how data can be combined and used.
Statistical learning on biological data can be challenging due to confounding variables in sample collection and processing.
Confounders can cause models to generalize poorly and result in inaccurate prediction performance metrics if models are not validated thoroughly.
In this paper, we propose methods to control for confounding factors and further improve prediction performance.
We introduce OrthoNormal basis construction In cOnfounding factor Normalization (ONION) to remove confounding covariates and use the Domain-Adversarial Neural Network (DANN) to penalize models for encoding confounder information.
We apply the proposed methods to simulated and empirical patient data and show significant improvements in generalization.
Knowledge-based question answering relies on the availability of facts, the majority of which cannot be found in structured sources (e.g., Wikipedia info-boxes, Wikidata).
One of the major components of extracting facts from unstructured text is Relation Extraction (RE).
In this paper we propose a novel method for creating distant (weak) supervision labels for training a large-scale RE system.
We also provide new evidence about the effectiveness of neural network approaches by decoupling the model architecture from the feature design of a state-of-the-art neural network system.
Surprisingly, a much simpler classifier trained on similar features performs on par with the highly complex neural network system (with a 75x reduction in training time), suggesting that the features are a bigger contributor to the final performance.
A cellular automaton (CA) is a parallel synchronous computing model, which consists of a juxtaposition of finite automata (cells) whose state evolves according to that of their neighbors.
Its trace is the set of infinite words representing the sequence of states taken by some particular cell.
In this paper we study the ultimate trace of CA and partial CA (a CA restricted to a particular subshift).
The ultimate trace is the trace observed after a long time run of the CA.
We give sufficient conditions for a set of infinite words to be the trace of some CA and prove the undecidability of all properties over traces that are stable by ultimate coincidence.
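The trace notion can be made concrete with a minimal elementary CA; the rule number, initial configuration, and observed cell below are illustrative choices, not from the paper.

```python
def step(cells, rule):
    """One synchronous update of an elementary CA with periodic boundaries.
    Each cell's next state is the rule bit indexed by its 3-cell neighborhood."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def trace(cells, rule, steps, cell=0):
    """Sequence of states taken by one particular cell -- the CA's trace."""
    out = []
    for _ in range(steps):
        out.append(cells[cell])
        cells = step(cells, rule)
    return out

config = [0, 0, 0, 1, 0, 0, 0, 0]
t = trace(config, rule=110, steps=8)
```

Running the CA long enough and discarding the transient prefix of such a sequence yields the ultimate trace studied in the paper.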
We are experiencing a growing trend of using head-mounted display systems in games and serious games, which is likely to become an established practice in the near future.
While these systems provide highly immersive experiences, many users have been reporting discomfort symptoms, such as nausea, sickness, and headaches, among others.
When using VR for health applications, this is more critical, since the discomfort may significantly interfere with treatments.
In this work we discuss possible causes of these issues, and present possible solutions as design guidelines that may mitigate them.
In this context, we go deeper into a dynamic focus solution to reduce discomfort in immersive virtual environments when using first-person navigation.
This solution applies a heuristic model of visual attention that works in real time.
This work also discusses a case study (as a first-person spatial shooter demo) that applies this solution and the proposed design guidelines.
Much research has been devoted to optimizing algorithms of the Lempel-Ziv (LZ) 77 family, both in terms of speed and memory requirements.
Binary search trees and suffix trees (ST) are data structures that have been often used for this purpose, as they allow fast searches at the expense of memory usage.
In recent years, there has been interest on suffix arrays (SA), due to their simplicity and low memory requirements.
One key advantage is that an SA can solve the sub-string problem almost as efficiently as an ST, using less memory.
This paper proposes two new SA-based algorithms for LZ encoding, which require no modifications on the decoder side.
Experimental results on standard benchmarks show that our algorithms, though not faster, use 3 to 5 times less memory than the ST counterparts.
Another important feature of our SA-based algorithms is that the amount of memory is independent of the text to search, thus the memory that has to be allocated can be defined a priori.
These features of low and predictable memory requirements are of the utmost importance in several scenarios, such as embedded systems, where memory is at a premium and speed is not critical.
Finally, we point out that the new algorithms are general, in the sense that they are adequate for applications other than LZ compression, such as text retrieval and forward/backward sub-string search.
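The SA-based sub-string search that underpins these algorithms can be sketched as a binary search over sorted suffixes; this is the generic technique, not the paper's LZ encoder, and the naive O(n^2 log n) construction is for illustration only.

```python
def suffix_array(text):
    """Indices of all suffixes of `text`, in lexicographic order
    (naive construction; linear-time algorithms exist)."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, sa, pattern):
    """All positions of `pattern` in `text`, via two binary searches
    over the suffix array."""
    m = len(pattern)
    # Leftmost suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Leftmost suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])

text = "abracadabra"
sa = suffix_array(text)
positions = find_occurrences(text, sa, "abra")
```

Each query costs O(m log n) character comparisons over a structure whose size depends only on the text length, which is the predictable-memory property emphasized above.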
The big breakthrough on the ImageNet challenge in 2012 was partially due to the `dropout' technique used to avoid overfitting.
Here, we introduce a new approach called `Spectral Dropout' to improve the generalization ability of deep neural networks.
We cast the proposed approach in the form of regular Convolutional Neural Network (CNN) weight layers using a decorrelation transform with fixed basis functions.
Our spectral dropout method prevents overfitting by eliminating weak and `noisy' Fourier domain coefficients of the neural network activations, leading to remarkably better results than the current regularization methods.
Furthermore, the proposed method is very efficient due to the fixed basis functions used for spectral transformation.
In particular, compared to Dropout and Drop-Connect, our method significantly speeds up the network convergence rate during the training process (roughly x2), with considerably higher neuron pruning rates (an increase of ~ 30%).
We demonstrate that the spectral dropout can also be used in conjunction with other regularization approaches resulting in additional performance gains.
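The frequency-domain filtering idea can be sketched in plain Python with a naive discrete Fourier transform; the 1-D setting and the threshold value are illustrative assumptions, not the paper's CNN-layer formulation.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(coeffs):
    """Inverse DFT, returning the real part of the reconstruction."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def spectral_dropout(activations, threshold):
    """Zero out weak ('noisy') frequency coefficients and transform back --
    the core filtering idea behind spectral dropout."""
    coeffs = dft(activations)
    kept = [c if abs(c) >= threshold else 0.0 for c in coeffs]
    return idft(kept)

# A clean periodic activation pattern survives the dropout unchanged,
# because its energy sits in a few strong coefficients.
activations = [1.0, 0.0] * 4
filtered = spectral_dropout(activations, threshold=0.1)
```

Because the basis functions are fixed rather than learned, the transform adds no trainable parameters, which is the efficiency argument made above.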
Objectives: This paper presents an up-to-date overview of research performed in the Virtual Reality (VR) environment, covering definitions, its presence in various fields, and existing market players and their projects in VR technology.
Further, an attempt is made to gain insight into the psychological mechanisms underlying the experience of using VR devices.
Methods: Our literature survey is based on the research articles, analysis of the projects of various companies and their findings for different areas of interest.
Findings: In our literature survey we observed that recent advances in virtual reality enabling technologies have led to a variety of virtual devices that facilitate people's interaction with the digital world.
In fact, in the past two decades researchers have tried to integrate reality and VR in the form of intuitive computer interfaces.
Improvements: This has led to a variety of potential benefits of VR in many applications such as News, Healthcare, Entertainment, Tourism, Military and Defence.
However, despite the extensive research efforts in creating virtual system environments, it has yet to become apparent in normal daily life.
Vertex centrality measures are a multi-purpose analysis tool, commonly used in many application environments to retrieve information and unveil knowledge from the graphs and network structural properties.
However, the algorithms for such metrics are expensive in terms of computational resources when running real-time applications or on massive real-world networks.
Thus, approximation techniques have been developed and used to compute the measures in such scenarios.
In this paper, we demonstrate and analyze the use of neural network learning algorithms to tackle such task and compare their performance in terms of solution quality and computation time with other techniques from the literature.
Our work offers several contributions.
We highlight both the pros and cons of approximating centralities through neural learning.
By empirical means and statistics, we then show that the regression model generated with a feedforward neural network trained by the Levenberg-Marquardt algorithm is not only the best option considering computational resources, but also achieves the best solution quality for relevant applications and large-scale networks.
Keywords: Vertex Centrality Measures, Neural Networks, Complex Network Models, Machine Learning, Regression Model
Clustering scientific publications is an important problem in bibliometric research.
We demonstrate how two software tools, CitNetExplorer and VOSviewer, can be used to cluster publications and to analyze the resulting clustering solutions.
CitNetExplorer is used to cluster a large set of publications in the field of astronomy and astrophysics.
The publications are clustered based on direct citation relations.
CitNetExplorer and VOSviewer are used together to analyze the resulting clustering solutions.
Both tools use visualizations to support the analysis of the clustering solutions, with CitNetExplorer focusing on the analysis at the level of individual publications and VOSviewer focusing on the analysis at an aggregate level.
The demonstration provided in this paper shows how a clustering of publications can be created and analyzed using freely available software tools.
Using the approach presented in this paper, bibliometricians are able to carry out sophisticated cluster analyses without the need to have a deep knowledge of clustering techniques and without requiring advanced computer skills.
Recent advances in saliency detection have utilized deep learning to obtain high level features to detect salient regions in a scene.
These advances have demonstrated superior results over previous works that utilize hand-crafted low level features for saliency detection.
In this paper, we demonstrate that hand-crafted features can provide complementary information to enhance performance of saliency detection that utilizes only high level features.
Our method utilizes both high level and low level features for saliency detection under a unified deep learning framework.
The high level features are extracted using the VGG-net, and the low level features are compared with other parts of an image to form a low level distance map.
The low level distance map is then encoded using a convolutional neural network (CNN) with multiple 1x1 convolutional and ReLU layers.
We concatenate the encoded low level distance map and the high level features, and connect them to a fully connected neural network classifier to evaluate the saliency of a query region.
Our experiments show that our method can further improve the performance of state-of-the-art deep learning-based saliency detection methods.
Outlier detection plays an essential role in many data-driven applications to identify isolated instances that are different from the majority.
While many statistical learning and data mining techniques have been used for developing more effective outlier detection algorithms, the interpretation of detected outliers does not receive much attention.
Interpretation is becoming increasingly important to help people trust and evaluate the developed models by providing intrinsic reasons why certain outliers are chosen.
It is difficult, if not impossible, to simply apply feature selection for explaining outliers due to the distinct characteristics of various detection models, complicated structures of data in certain applications, and imbalanced distribution of outliers and normal instances.
In addition, the role of the contrastive contexts where outliers are located, as well as the relation between outliers and contexts, is usually overlooked in interpretation.
To tackle the issues above, in this paper, we propose a novel Contextual Outlier INterpretation (COIN) method to explain the abnormality of existing outliers spotted by detectors.
The interpretability for an outlier is achieved from three aspects: outlierness score, attributes that contribute to the abnormality, and contextual description of its neighborhoods.
Experimental results on various types of datasets demonstrate the flexibility and effectiveness of the proposed framework compared with existing interpretation approaches.
Artificial neural networks are powerful models, which have been widely applied into many aspects of machine translation, such as language modeling and translation modeling.
Though notable improvements have been made in these areas, the reordering problem still remains a challenge in statistical machine translation.
In this paper, we present a novel neural reordering model that directly models word pairs and alignment.
By utilizing LSTM recurrent neural networks, much longer context could be learned for reordering prediction.
Experimental results on NIST OpenMT12 Arabic-English and Chinese-English 1000-best rescoring task show that our LSTM neural reordering feature is robust and achieves significant improvements over various baseline systems.
City-scale sensing holds the promise of enabling a deeper understanding of our urban environments.
However, a city-scale deployment requires physical installation, power management, and communications---all challenging tasks standing between a good idea and a realized one.
This indicates the need for a platform that enables easy deployment and experimentation for applications operating at city scale.
To address these challenges, we present Signpost, a modular, energy-harvesting platform for city-scale sensing.
Signpost simplifies deployment by eliminating the need for connection to wired infrastructure and instead harvesting energy from an integrated solar panel.
The platform furnishes the key resources necessary to support multiple, pluggable sensor modules while providing fair, safe, and reliable sharing in the face of dynamic energy constraints.
We deploy Signpost with several sensor modules, showing the viability of an energy-harvesting, multi-tenant, sensing system, and evaluate its ability to support sensing applications.
We believe Signpost reduces the difficulty inherent in city-scale deployments, enables new experimentation, and provides improved insights into urban health.
Programmers make rich use of natural language in the source code they write through identifiers and comments.
Source code identifiers are selected from a pool of tokens which are strongly related to the meaning, naming conventions, and context.
These tokens are often combined to produce more precise and obvious designations.
Such multi-part identifiers account for 97% of all naming tokens in the Public Git Archive - the largest dataset of Git repositories to date.
We introduce a bidirectional LSTM recurrent neural network to detect subtokens in source code identifiers.
We trained that network on 41.7 million distinct splittable identifiers collected from 182,014 open source projects in Public Git Archive, and show that it outperforms several other machine learning models.
The proposed network can be used to improve upstream models that are based on source code identifiers, as well as to improve the developer experience by allowing code to be written without switching the keyboard case.
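The conventional baseline that such a learned splitter improves on is a rule-based camelCase / snake_case heuristic, which can be sketched in a few lines; the regex below is a common convention, not the paper's model.

```python
import re

# Matches: an all-caps run before a capitalized word (HTTP in HTTPResponse),
# a capitalized or lowercase word, a trailing all-caps run, or a digit run.
_TOKEN = re.compile(r'[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+')

def split_identifier(name):
    """Heuristic camelCase / snake_case splitter -- the rule-based baseline
    that a learned, character-level model aims to outperform."""
    return [t.lower() for t in _TOKEN.findall(name)]

subtokens = split_identifier("parseHTTPResponse_v2")
```

Cases like `parseHTTPResponse` are handled by the lookahead, but genuinely ambiguous concatenations (e.g. identifiers written entirely in lowercase with no separators) are exactly where a learned model has room to improve.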
Growing consumer awareness as well as manufacturers' internal quality requirements lead to novel demands on supply chain traceability.
Existing centralized solutions suffer from isolated data storage and lacking trust when multiple parties are involved.
Decentralized blockchain-based approaches attempt to overcome these shortcomings by creating digital representations of physical goods to facilitate tracking across multiple entities.
However, they currently do not capture the transformation of goods in manufacturing processes.
Therefore, the relation between ingredients and product is lost, limiting the ability to trace a product's provenance.
We propose a blockchain-based supply chain traceability system using smart contracts.
In such contracts, manufacturers define the composition of products in the form of recipes.
Each ingredient of the recipe is a non-fungible token that corresponds to a batch of physical goods.
When the recipe is applied, its ingredients are consumed and a new token is produced.
This mechanism preserves the traceability of product transformations.
The system is implemented for the Ethereum Virtual Machine and is applicable to any blockchain configuration that supports it.
Our evaluation reveals that the gas costs scale linearly with the number of products considered in the system.
This leads to the conclusion that the solution can handle complex use cases.
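The consume-and-mint mechanism can be sketched as an in-memory ledger; this plain-Python model only illustrates the token semantics described above, with none of Ethereum's contract machinery, and all batch names are invented.

```python
import itertools

class TraceabilityLedger:
    """Sketch of the token/recipe mechanism: applying a recipe consumes
    ingredient tokens and mints a product token linking back to them,
    preserving provenance across transformations."""

    def __init__(self):
        self._ids = itertools.count(1)
        # token_id -> {"name": ..., "inputs": [...], "consumed": bool}
        self.tokens = {}

    def mint(self, name, inputs=()):
        tid = next(self._ids)
        self.tokens[tid] = {"name": name, "inputs": list(inputs), "consumed": False}
        return tid

    def apply_recipe(self, product_name, ingredient_ids):
        # Non-fungible tokens may be consumed exactly once.
        for tid in ingredient_ids:
            if self.tokens[tid]["consumed"]:
                raise ValueError(f"token {tid} already consumed")
            self.tokens[tid]["consumed"] = True
        return self.mint(product_name, ingredient_ids)

    def provenance(self, tid):
        """Names of all ancestor tokens reachable from a product token."""
        out, stack = [], list(self.tokens[tid]["inputs"])
        while stack:
            t = stack.pop()
            out.append(self.tokens[t]["name"])
            stack.extend(self.tokens[t]["inputs"])
        return sorted(out)

ledger = TraceabilityLedger()
flour = ledger.mint("flour batch 17")
tomato = ledger.mint("tomato batch 3")
dough = ledger.apply_recipe("dough batch 1", [flour])
pizza = ledger.apply_recipe("pizza batch 1", [dough, tomato])
```

Walking the input links from the final product recovers every ingredient batch it was transformed from, which is the traceability property the recipe mechanism preserves.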
Supporting programming on touchscreen devices requires effective text input and editing methods.
Unfortunately, the virtual keyboard can be inefficient and uses valuable screen space on already small devices.
Recent advances in stylus input make handwriting a potentially viable text input solution for programming on touchscreen devices.
The primary barrier, however, is that handwriting recognition systems are built to take advantage of the rules of natural language, not those of a programming language.
In this paper, we explore this particular problem of handwriting recognition for source code.
We collect and make publicly available a dataset of handwritten Python code samples from 15 participants and we characterize the typical recognition errors for this handwritten Python source code when using a state-of-the-art handwriting recognition tool.
We present an approach to improve the recognition accuracy by augmenting a handwriting recognizer with the programming language grammar rules.
Our experiment on the collected dataset shows an 8.6% word error rate and a 3.6% character error rate which outperforms standard handwriting recognition systems and compares favorably to typing source code on virtual keyboards.
We present a method for transferring neural representations from label-rich source domains to unlabeled target domains.
Recent adversarial methods proposed for this task learn to align features across domains by fooling a special domain critic network.
However, a drawback of this approach is that the critic simply labels the generated features as in-domain or not, without considering the boundaries between classes.
This can lead to ambiguous features being generated near class boundaries, reducing target classification accuracy.
We propose a novel approach, Adversarial Dropout Regularization (ADR), to encourage the generator to output more discriminative features for the target domain.
Our key idea is to replace the critic with one that detects non-discriminative features, using dropout on the classifier network.
The generator then learns to avoid these areas of the feature space and thus creates better features.
We apply our ADR approach to the problem of unsupervised domain adaptation for image classification and semantic segmentation tasks, and demonstrate significant improvement over the state of the art.
We also show that our approach can be used to train Generative Adversarial Networks for semi-supervised learning.
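The key quantity in ADR is the sensitivity of the classifier's prediction to dropout: features whose predictions diverge under different dropout masks lie near class boundaries. A minimal pure-Python sketch of that measurement (a linear classifier and all numbers here are illustrative assumptions, not the paper's network):

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def dropout(vec, p, rng):
    """Zero each element with probability p, with inverted scaling."""
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in vec]

def symmetric_kl(p, q, eps=1e-12):
    kl = lambda a, b: sum(ai * math.log((ai + eps) / (bi + eps))
                          for ai, bi in zip(a, b))
    return 0.5 * (kl(p, q) + kl(q, p))

def adr_sensitivity(features, weights, p=0.5, seed=0):
    """Divergence between class predictions under two dropout masks.
    Large values flag ambiguous, non-discriminative features -- the regions
    the ADR generator is trained to avoid."""
    rng = random.Random(seed)
    def predict(dropped):
        return softmax([sum(w * f for w, f in zip(row, dropped))
                        for row in weights])
    p1 = predict(dropout(features, p, rng))
    p2 = predict(dropout(features, p, rng))
    return symmetric_kl(p1, p2)
```

In the full method this divergence serves as the adversarial signal: the critic maximizes it while the generator learns to produce features for which it stays small.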
We propose robust methods for estimating camera egomotion in noisy, real-world monocular image sequences in the general case of unknown observer rotation and translation with two views and a small baseline.
This is a difficult problem because of the nonconvex cost function of the perspective camera motion equation and because of non-Gaussian noise arising from noisy optical flow estimates and scene non-rigidity.
To address this problem, we introduce the expected residual likelihood method (ERL), which estimates confidence weights for noisy optical flow data using likelihood distributions of the residuals of the flow field under a range of counterfactual model parameters.
We show that ERL is effective at identifying outliers and recovering appropriate confidence weights in many settings.
We compare ERL to a novel formulation of the perspective camera motion equation using a lifted kernel, a recently proposed optimization framework for joint parameter and confidence weight estimation with good empirical properties.
We incorporate these strategies into a motion estimation pipeline that avoids falling into local minima.
We find that ERL outperforms the lifted kernel method and baseline monocular egomotion estimation strategies on the challenging KITTI dataset, while adding almost no runtime cost over baseline egomotion methods.
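The core of ERL is turning residuals under counterfactual model parameters into confidence weights. A schematic pure-Python version (the Laplace likelihood and MAD scale estimate here are our simplifying assumptions):

```python
import math

def erl_weights(residual_matrix):
    """Schematic expected-residual-likelihood weights.
    residual_matrix[m][i] is the residual of data point i under
    counterfactual model m.  A point whose residuals are consistently
    probable across models gets a high weight; outliers are down-weighted."""
    n_points = len(residual_matrix[0])
    weights = [0.0] * n_points
    for residuals in residual_matrix:
        # Robust location/scale for this model's residuals (median and MAD).
        med = sorted(residuals)[len(residuals) // 2]
        mad = sorted(abs(r - med) for r in residuals)[len(residuals) // 2] + 1e-9
        for i, r in enumerate(residuals):
            # Laplace likelihood of the residual under the fitted scale.
            weights[i] += math.exp(-abs(r - med) / mad) / (2 * mad)
    m = max(weights) or 1.0
    return [w / m for w in weights]  # normalise so the best point has weight 1
```

Feeding these weights into a weighted least-squares motion solve then down-weights flow vectors corrupted by noise or scene non-rigidity.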
Sampling efficiency in a highly constrained environment has long been a major challenge for sampling-based planners.
In this work, we propose Rapidly-exploring Random disjointed-Trees* (RRdT*), an incremental optimal multi-query planner.
RRdT* uses multiple disjointed-trees to exploit local-connectivity of spaces via Markov Chain random sampling, which utilises neighbourhood information derived from previous successful and failed samples.
To balance local exploitation, RRdT* actively explores unseen global spaces when local-connectivity exploitation is unsuccessful.
The active trade-off between local exploitation and global exploration is formulated as a multi-armed bandit problem.
We argue that the active balancing of global exploration and local exploitation is the key to improving sample efficiency in sampling-based motion planners.
We provide rigorous proofs of completeness and optimal convergence for this novel approach.
Furthermore, we demonstrate experimentally the effectiveness of RRdT*'s locally exploring trees in granting improved visibility for planning.
Consequently, RRdT* outperforms existing state-of-the-art incremental planners, especially in highly constrained environments.
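The multi-armed bandit formulation above can be made concrete with a standard UCB1 selection rule for choosing which local tree (arm) to grow next; this is an illustrative policy, not necessarily the exact one used by RRdT*:

```python
import math

def ucb1_select(successes, pulls, total_pulls, c=math.sqrt(2)):
    """Pick the arm (local tree) maximising mean reward plus an exploration
    bonus.  An arm that has never been pulled is always tried first."""
    best, best_score = 0, float("-inf")
    for arm, (s, n) in enumerate(zip(successes, pulls)):
        if n == 0:
            return arm
        score = s / n + c * math.sqrt(math.log(total_pulls) / n)
        if score > best_score:
            best, best_score = arm, score
    return best
```

After each sampling attempt the chosen tree receives reward 1 if the sample successfully extended it, so trees in well-connected regions are exploited while starved trees still receive exploratory samples.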
This paper describes a massively parallel code for a state-of-the-art thermal lattice-Boltzmann method.
Our code has been carefully optimized for performance on a single GPU and for good scaling behavior when extended to a large number of GPUs.
Versions of this code have been already used for large-scale studies of convective turbulence.
GPUs are becoming increasingly popular in HPC applications, as they are able to deliver higher performance than traditional processors.
Writing efficient programs for large clusters is not an easy task as codes must adapt to increasingly parallel architectures, and the overheads of node-to-node communications must be properly handled.
We describe the structure of our code, discussing several key design choices that were guided by theoretical models of performance and experimental benchmarks.
We present an extensive set of performance measurements and identify the corresponding main bottlenecks; finally we compare the results of our GPU code with those measured on other currently available high performance processors.
Our results are a production-grade code able to deliver a sustained performance of several tens of Tflops as well as a design and optimization methodology that can be used for the development of other high performance applications for computational physics.
Since the introduction of the Fortran programming language some 60 years ago, there has been little progress in making error messages more user-friendly.
A first step in this direction is to translate them into the natural language of the students.
In this paper we propose a simple script for Linux systems which gives word-by-word translations of error messages.
It works for most programming languages and for all natural languages.
Understanding the error messages generated by compilers is a major hurdle for students who are learning programming, particularly for non-native English speakers.
Not only may they never become "fluent" in programming but many give up programming altogether.
Whereas programming is a tool which can be useful in many human activities, e.g. history, genealogy, astronomy, entomology, in many countries the skill of programming remains confined to a narrow fringe of professional programmers.
In all societies, besides professional violinists there are also amateurs.
It should be the same for programming.
It is our hope that once translated and explained the error messages will be seen by the students as an aid rather than as an obstacle and that in this way more students will enjoy learning and practising programming.
They should see it as a fun game.
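The word-by-word translation idea can be sketched in a few lines of Python (the paper proposes a shell script for Linux; the French dictionary entries below are illustrative examples, not the tool's actual data):

```python
"""Translate compiler error messages word by word using a small dictionary.
Intended usage:  gcc prog.c 2>&1 | python3 translate_errors.py"""
import sys

DICTIONARY = {  # English -> French (illustrative target language)
    "error": "erreur",
    "undeclared": "non déclaré",
    "expected": "attendu",
    "missing": "manquant",
    "before": "avant",
}

def translate_line(line, dictionary=DICTIONARY):
    # Translate known words only; identifiers and unknown words pass through.
    return " ".join(dictionary.get(w.strip(";:,").lower(), w)
                    for w in line.split())

def main(stream=sys.stdin):
    for line in stream:
        print(translate_line(line.rstrip("\n")))
```

Because the translation is purely lexical, the same script works for most programming languages; only the dictionary changes per natural language.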
The e-commerce share of global retail spend has shown a steady increase over the years, indicating an evident shift of consumer attention from bricks and mortar to clicks in the retail sector.
In recent years, online marketplaces have become one of the key contributors to this growth.
As the business model matures, the number and types of frauds reported in the area are also growing on a daily basis.
Fraudulent e-commerce buyers and their transactions are being studied in detail and multiple strategies to control and prevent them are discussed.
Another area of fraud in marketplaces is on the seller side, known as merchant fraud.
Goods or services offered and sold at cheap rates but never shipped are a simple example of this type of fraud.
This paper attempts to suggest a framework to detect such fraudulent sellers with the help of machine learning techniques.
The model leverages historic data from the marketplace to detect possible fraudulent behaviours from sellers and alert the marketplace.
Web applications are permanently being exposed to attacks that exploit their vulnerabilities.
In this work we investigate the application of machine learning techniques to leverage the Web Application Firewall (WAF), a technology used to detect and prevent attacks.
We propose a combined approach of machine learning models, based on one-class classification and n-gram analysis, to enhance the detection and accuracy capabilities of MODSECURITY, an open source and widely used WAF.
The results are promising and outperform MODSECURITY when configured with the OWASP Core Rule Set, the baseline configuration setting of a widely deployed, rule-based WAF technology.
The proposed solution, combining both approaches, allows us to deploy a WAF when no training data for the application is available (using one-class classification), and an improved one using n-grams when training data is available.
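A schematic of the n-gram one-class component (the value of n, the scoring rule, and any threshold are our illustrative assumptions, not the paper's exact configuration):

```python
def char_ngrams(text, n=3):
    """All character n-grams of a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

class NGramAnomalyScorer:
    """One-class n-gram model: learn the set of character n-grams seen in
    legitimate requests; score a new request by the fraction of its n-grams
    never seen in training (0 = entirely familiar, 1 = entirely novel)."""
    def __init__(self, n=3):
        self.n = n
        self.known = set()

    def fit(self, legitimate_requests):
        for req in legitimate_requests:
            self.known |= char_ngrams(req, self.n)
        return self

    def score(self, request):
        grams = char_ngrams(request, self.n)
        if not grams:
            return 0.0
        return sum(g not in self.known for g in grams) / len(grams)
```

A WAF could then flag requests whose score exceeds a tuned threshold, complementing rule-based detection such as the OWASP Core Rule Set.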
We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks (BwK) and combinatorial semi-bandits.
The former concerns limited "resources" consumed by the algorithm, e.g., limited supply in dynamic pricing.
The latter allows a huge number of actions but assumes combinatorial structure and additional feedback to make the problem tractable.
We define a common generalization, support it with several motivating examples, and design an algorithm for it.
Our regret bounds are comparable with those for BwK and combinatorial semi-bandits.
We present a real-time object-based SLAM system that leverages the largest object database to date.
Our approach comprises two main components: 1) a monocular SLAM algorithm that exploits object rigidity constraints to improve the map and find its real scale, and 2) a novel object recognition algorithm based on bags of binary words, which provides live detections with a database of 500 3D objects.
The two components work together and benefit each other: the SLAM algorithm accumulates information from the observations of the objects, anchors object features to special map landmarks and sets constraints on the optimization.
At the same time, objects partially or fully located within the map are used as a prior to guide the recognition algorithm, achieving higher recall.
We evaluate our proposal on five real environments showing improvements on the accuracy of the map and efficiency with respect to other state-of-the-art techniques.
We introduce a novel approach for building language models based on a systematic, recursive exploration of skip n-gram models which are interpolated using modified Kneser-Ney smoothing.
Our approach generalizes language models as it contains the classical interpolation with lower order models as a special case.
In this paper we motivate, formalize and present our approach.
In an extensive empirical experiment over English text corpora we demonstrate that our generalized language models lead to a substantial reduction of perplexity between 3.1% and 12.7% in comparison to traditional language models using modified Kneser-Ney smoothing.
Furthermore, we investigate the behaviour over three other languages and a domain specific corpus where we observed consistent improvements.
Finally, we also show that the strength of our approach lies in its ability to cope in particular with sparse training data.
Using a very small training data set of only 736 KB of text, we achieve an even larger reduction of perplexity of 25.7%.
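The enumeration of skip n-grams underlying the approach can be sketched as follows (the placeholder token and recursion over skipped inner positions are our illustration; the actual models interpolate these counts with modified Kneser-Ney smoothing):

```python
from itertools import combinations

def skip_ngrams(tokens, n, placeholder="_"):
    """Enumerate all n-grams over `tokens` in which any subset of the n-2
    inner positions may be skipped (replaced by a placeholder).  The
    classical n-gram is the special case where nothing is skipped."""
    grams = set()
    inner = list(range(1, n - 1))  # first and last positions are never skipped
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        for k in range(len(inner) + 1):
            for skipped in combinations(inner, k):
                grams.add(tuple(placeholder if j in skipped else window[j]
                                for j in range(n)))
    return grams
```

For example, the trigrams of `a b c` are `(a, b, c)` and the skip form `(a, _, c)`, which is what lets the model generalize when the middle word was never observed in training.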
Deep generative models such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) have recently been applied to style and domain transfer for images, and in the case of VAEs, music.
GAN-based models employing several generators and some form of cycle consistency loss have been among the most successful for image domain transfer.
In this paper we apply such a model to symbolic music and show the feasibility of our approach for music genre transfer.
Evaluations using separate genre classifiers show that the style transfer works well.
In order to improve the fidelity of the transformed music, we add additional discriminators that cause the generators to keep the structure of the original music mostly intact, while still achieving strong genre transfer.
Visual and audible results further show the potential of our approach.
To the best of our knowledge, this paper represents the first application of GANs to symbolic music domain transfer.
We consider a multitask learning problem, in which several predictors are learned jointly.
Prior research has shown that learning the relations between tasks, and between the input features, together with the predictor, can lead to better generalization and interpretability, which proved to be useful for applications in many domains.
In this paper, we consider a formulation of multitask learning that learns the relationships both between tasks and between features, represented through a task covariance and a feature covariance matrix, respectively.
First, we demonstrate that existing methods proposed for this problem present an issue that may lead to ill-posed optimization.
We then propose an alternative formulation, as well as an efficient algorithm to optimize it.
Using ideas from optimization and graph theory, we propose an efficient coordinate-wise minimization algorithm that has a closed form solution for each block subproblem.
Our experiments show that the proposed optimization method is orders of magnitude faster than its competitors.
We also provide a nonlinear extension that is able to achieve better generalization than existing methods.
In this paper, we first address adverse effects of cyber-physical attacks on distributed synchronization of multi-agent systems, by providing necessary and sufficient conditions under which an attacker can destabilize the underlying network, as well as another set of necessary and sufficient conditions under which local neighborhood tracking errors of intact agents converge to zero.
Based on this analysis, we propose a Kullback-Leibler divergence based criterion in view of which each agent detects its neighbors' misbehavior and, consequently, forms a self-belief about the trustworthiness of the information it receives.
Agents continuously update their self-beliefs and communicate them with their neighbors to inform them of the significance of their outgoing information.
Moreover, if its self-belief is low, an agent forms trust in its neighbors.
Agents incorporate their neighbors' self-beliefs and their own trust values into their control protocols to slow down and mitigate attacks.
We show that using the proposed resilient approach, an agent discards the information it receives from a neighbor only if its neighbor is compromised, and not solely based on the discrepancy among neighbors' information, which might be caused by legitimate changes, and not attacks.
The proposed approach is guaranteed to work under mild connectivity assumptions.
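A minimal sketch of the divergence-based detection quantity (the exponential belief mapping and the gain `gamma` are our illustrative choices, not the paper's exact update law):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def self_belief(expected_dist, observed_dist, gamma=1.0):
    """Map the divergence between what an agent expects from a neighbor and
    what it observes into a belief in (0, 1]: small divergence -> belief
    near 1, large divergence (possible attack) -> belief near 0."""
    return math.exp(-gamma * kl_divergence(observed_dist, expected_dist))
```

An agent would broadcast this belief alongside its state so that neighbors can discount information coming from likely-compromised sources.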
This paper studies the geodesic diameter of polygonal domains having h holes and n corners.
For simple polygons (i.e., h = 0), the geodesic diameter is determined by a pair of corners of a given polygon and can be computed in linear time, as shown by Hershberger and Suri.
For general polygonal domains with h >= 1, however, no algorithm for computing the geodesic diameter was known prior to this paper.
In this paper, we present the first algorithms that compute the geodesic diameter of a given polygonal domain in worst-case time O(n^7.73) or O(n^7 (log n + h)).
The main difficulty, unlike the simple polygon case, lies in the following observation revealed in this paper: the geodesic diameter can be determined by two interior points, in which case there exist at least five distinct shortest paths between them.
This paper proposes a new optimal control synthesis algorithm for multi-robot systems under global temporal logic tasks.
Existing planning approaches under global temporal goals rely on graph search techniques applied to a product automaton constructed among the robots.
In this paper, we propose a new sampling-based algorithm that incrementally builds trees that approximate the state-space and transitions of the synchronous product automaton.
By approximating the product automaton by a tree rather than representing it explicitly, we require much fewer memory resources to store it and motion plans can be found by tracing sequences of parent nodes without the need for sophisticated graph search methods.
This significantly increases the scalability of our algorithm compared to existing optimal control synthesis methods.
We also show that the proposed algorithm is probabilistically complete and asymptotically optimal.
Finally, we present numerical experiments showing that our approach can synthesize optimal plans from product automata with billions of states, which is not possible using standard optimal control synthesis algorithms or off-the-shelf model checkers.
We consider a new Steiner tree problem, called vertex-cover-weighted Steiner tree problem.
This problem defines the weight of a Steiner tree as the minimum weight of vertex covers in the tree, and seeks a minimum-weight Steiner tree in a given vertex-weighted undirected graph.
Since it is a special case of the Steiner tree activation problem, the problem admits an O(log n)-approximation algorithm in general graphs with n vertices.
This approximation factor is tight up to a constant because it is NP-hard to achieve an o(log n)-approximation for the vertex-cover-weighted Steiner tree problem on general graphs even if the given vertex weights are uniform and a spanning tree is required instead of a Steiner tree.
In this paper, we present constant-factor approximation algorithms for the problem with unit disk graphs and with graphs excluding a fixed minor.
For the latter graph class, our algorithm can also be applied to the Steiner tree activation problem.
How to tell if a review is real or fake?
What does the underworld of fraudulent reviewing look like?
Detecting suspicious reviews has become a major issue for many online services.
We propose the use of a clique-finding approach to discover well-organized suspicious reviewers.
From a Yelp dataset with over one million reviews, we construct multiple Reviewer Similarity graphs to link users that have unusually similar behavior: two reviewers are connected in the graph if they have reviewed the same set of venues within a few days.
From these graphs, our algorithms extracted many large cliques and quasi-cliques, the largest one containing a striking 11 users who coordinated their review activities in identical ways.
Among the detected cliques, a large portion contain Yelp Scouts who are paid by Yelp to review venues in new areas.
Our work sheds light on their little-known operation.
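The clique-extraction step over a Reviewer Similarity graph can be sketched with the classic Bron-Kerbosch algorithm for maximal cliques (a textbook version without pivoting; the paper's quasi-clique extraction would extend this):

```python
def bron_kerbosch(adj, r=None, p=None, x=None, cliques=None):
    """Enumerate all maximal cliques of an undirected graph given as a dict
    mapping each node to the set of its neighbours.  In the reviewer graph,
    nodes are users and edges link users who reviewed the same venues within
    a few days of each other."""
    if r is None:
        r, p, x, cliques = set(), set(adj), set(), []
    if not p and not x:
        cliques.append(set(r))  # r is maximal: nothing can extend it
    for v in list(p):
        bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v], cliques)
        p.remove(v)
        x.add(v)
    return cliques
```

Large cliques in this graph correspond to groups of reviewers with lock-step behavior, the signature of coordinated fraudulent reviewing.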
Within a fairly short amount of time, the Islamic State of Iraq and Syria (ISIS) has managed to put large swaths of land in Syria and Iraq under their control.
To many observers, the sheer speed at which this "state" was established was dumbfounding.
To better understand the roots of this organization and its supporters we present a study using data from Twitter.
We start by collecting large amounts of Arabic tweets referring to ISIS and classify them into pro-ISIS and anti-ISIS.
This classification turns out to be easily done simply using the name variants used to refer to the organization: the full name and the description as "state" is associated with support, whereas abbreviations usually indicate opposition.
We then "go back in time" by analyzing the historic timelines of both supporting and opposing users, looking at their pre-ISIS periods to gain insights into the antecedents of support.
To achieve this, we build a classifier using pre-ISIS data to "predict", in retrospect, who will support or oppose the group.
The key story that emerges is one of frustration with failed Arab Spring revolutions.
ISIS supporters largely differ from ISIS opposition in that they refer a lot more to Arab Spring uprisings that failed.
We also find temporal patterns in the support and opposition which seem to be linked to major news, such as reported territorial gains, reports on gruesome acts of violence, and reports on airstrikes and foreign intervention.
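The name-variant classification rule described above is simple enough to sketch directly (the transliterated keywords below stand in for the Arabic variants and are our assumptions, not the study's exact lexicon):

```python
def classify_stance(tweet):
    """Rule from the study: the full name or the description as 'state'
    signals support, whereas common abbreviations signal opposition."""
    text = tweet.lower()
    # 'dawla' transliterates the Arabic for 'state' (illustrative keyword).
    if "islamic state" in text or "dawla" in text:
        return "pro"
    if any(abbr in text for abbr in ("isis", "isil", "daesh")):
        return "anti"
    return "unknown"
```

In the actual study the check runs over Arabic name variants, but the structure of the classifier is the same: support and opposition are separated almost entirely by which variant of the name a user chooses.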
Researchers are increasingly incorporating numeric high-order data, i.e., numeric tensors, within their practice.
Just like the matrix/vector (MV) paradigm, the development of multi-purpose, but high-performance, sparse data structures and algorithms for arithmetic calculations, e.g., those found in Einstein-like notation, is crucial for the continued adoption of tensors.
We use the example of high-order differential operators to illustrate this need.
As sparse tensor arithmetic is an emerging research topic, with challenges distinct from the MV paradigm, many aspects require further articulation.
We focus on three core facets.
First, aligning with prominent voices in the field, we emphasise the importance of data structures able to accommodate the operational complexity of tensor arithmetic.
However, we describe a linearised coordinate (LCO) data structure that provides faster and more memory-efficient sorting performance.
Second, flexible data structures, like the LCO, rely heavily on sorts and permutations.
We introduce an innovative permutation algorithm, based on radix sort, that is tailored to rearrange already-sorted sparse data, producing significant performance gains.
Third, we introduce a novel poly-algorithm for sparse tensor products, where hyper-sparsity is a possibility.
Different manifestations of hyper-sparsity demand their own approach, which our poly-algorithm is the first to provide.
These developments are incorporated within our LibNT and NTToolbox software libraries.
Benchmarks, frequently drawn from the high-order differential operators example, demonstrate the practical impact of our routines, with speed-ups of 40% or higher compared to alternative high-performance implementations.
Comparisons against the MATLAB Tensor Toolbox show over 10 times speed improvements.
Thus, these advancements produce significant practical improvements for sparse tensor arithmetic.
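The core of a linearised coordinate (LCO) layout can be sketched as mapping each sparse entry's multi-index to a single integer, so entries sort and permute as a flat list of (key, value) pairs (a schematic of the idea; the LibNT layout and its radix-sort-based permutation routine are more elaborate):

```python
def linearize(index, shape):
    """Row-major linearisation of a multi-index into one integer key.
    Sorting keys then reproduces lexicographic order of the indices."""
    key = 0
    for i, s in zip(index, shape):
        key = key * s + i
    return key

def delinearize(key, shape):
    """Inverse mapping: recover the multi-index from a linear key."""
    index = []
    for s in reversed(shape):
        index.append(key % s)
        key //= s
    return tuple(reversed(index))
```

Because a mode permutation of the tensor becomes a re-linearisation of integer keys, rearranging already-sorted data reduces to integer sorting, which is where a tailored radix sort pays off.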
We study the problem of motion-planning for free-flying multi-link robots and develop a sampling-based algorithm that is specifically tailored for the task.
Our work is based on the simple observation that the set of configurations for which the robot is self-collision free is independent of the obstacles or of the exact placement of the robot.
This allows us to eliminate the need to perform costly self-collision checks online when solving motion-planning problems, assuming some offline preprocessing.
In particular, given a specific robot type our algorithm precomputes a tiling roadmap, which efficiently and implicitly encodes the self-collision free (sub-)space over the entire configuration space, where the latter can be infinite for that matter.
To answer any query, in any given scenario, we traverse the tiling roadmap while only testing for collisions with obstacles.
Our algorithm offers more flexibility than the prevailing paradigm, in which a precomputed roadmap depends both on the robot and on the scenario at hand.
We show through various simulations the effectiveness of this approach on open and closed-chain multi-link robots, where in some settings our algorithm is more than fifty times faster than the state-of-the-art.
The k-nearest-neighbor method performs classification tasks for a query sample based on the information contained in its neighborhood.
Previous studies of the k-nearest-neighbor algorithm usually derived the decision value for a class by combining the support of each sample in the neighborhood.
They have generally considered the nearest neighbors separately, so potentially integral neighborhood information important for classification, e.g. the distribution information, was lost.
This article proposes a novel local learning method that organizes the information in the neighborhood through local distribution.
In the proposed method, additional distribution information in the neighborhood is estimated and then organized; the classification decision is made based on maximum posterior probability which is estimated from the local distribution in the neighborhood.
Additionally, based on the local distribution, we generate a generalized local classification form that can be effectively applied to various datasets through tuning the parameters.
We use both synthetic and real datasets to evaluate the classification performance of the proposed method; the experimental results demonstrate the dimensional scalability, efficiency, effectiveness and robustness of the proposed method compared to some other state-of-the-art classifiers.
The results indicate that the proposed method is effective and promising in a broad range of domains.
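A simplified stand-in for the local-distribution idea is to estimate class posteriors from distance-weighted neighborhood evidence rather than from separate votes (the Gaussian kernel and bandwidth `h` are our illustrative choices, not the article's exact estimator):

```python
import math
from collections import defaultdict

def local_posterior(query, data, k=5, h=1.0):
    """Distance-weighted estimate of class posteriors in the neighbourhood
    of `query`.  `data` is a list of (point, label) pairs."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    neighbours = sorted(data, key=lambda s: dist(query, s[0]))[:k]
    weights = defaultdict(float)
    for point, label in neighbours:
        weights[label] += math.exp(-(dist(query, point) / h) ** 2)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def classify(query, data, k=5, h=1.0):
    """Maximum-posterior decision over the local distribution estimate."""
    posterior = local_posterior(query, data, k, h)
    return max(posterior, key=posterior.get)
```

Tuning `k` and `h` plays the role of the generalized-form parameters described above, trading off locality against the amount of distributional evidence used.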
The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples with many challenges.
In this process, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state-of-the-art in various vision-based action recognition systems.
Recently, the introduction of residual connections in conjunction with a more traditional CNN model in a single architecture called Residual Network (ResNet) has shown impressive performance and great potential for image recognition tasks.
In this paper, we investigate and apply deep ResNets for human action recognition using skeletal data provided by depth sensors.
Firstly, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images.
These color images are able to capture the spatial-temporal evolutions of 3D motions from skeleton sequences and can be efficiently learned by D-CNNs.
We then propose a novel deep learning architecture based on ResNets to learn features from obtained color-based representations and classify them into action classes.
The proposed method is evaluated on three challenging benchmark datasets including MSR Action 3D, KARD, and NTU-RGB+D datasets.
Experimental results demonstrate that our method achieves state-of-the-art performance for all these benchmarks whilst requiring fewer computational resources.
In particular, the proposed method surpasses previous approaches by a significant margin of 3.4% on MSR Action 3D dataset, 0.67% on KARD dataset, and 2.5% on NTU-RGB+D dataset.
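The skeleton-to-image encoding described above can be sketched as follows (a schematic version: here all coordinates share one normalisation range, whereas per-channel normalisation is also common):

```python
def skeleton_to_image(sequence):
    """Encode a skeleton sequence as an RGB image: pixel (row=joint,
    col=frame) holds the joint's (x, y, z) coordinates normalised to 0..255
    as the (R, G, B) channels, so a D-CNN can learn the spatio-temporal
    evolution of the motion.  `sequence` is a list of frames, each frame a
    list of (x, y, z) joint tuples."""
    coords = [c for frame in sequence for joint in frame for c in joint]
    lo, hi = min(coords), max(coords)
    scale = 255.0 / ((hi - lo) or 1.0)  # guard against a degenerate range
    n_joints = len(sequence[0])
    return [[tuple(int(round((c - lo) * scale)) for c in frame[j])
             for frame in sequence]
            for j in range(n_joints)]
```

The resulting image has one row per joint and one column per frame, turning temporal evolution into spatial structure a convolutional network can exploit.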
Aerial robots are becoming popular among the general public, and with the development of artificial intelligence (AI), there is a trend to equip aerial robots with a natural user interface (NUI).
Hand/arm gestures are an intuitive way to communicate for humans, and various research works have focused on controlling an aerial robot with natural gestures.
However, the techniques in this area are still far from mature.
Many issues in this area have been poorly addressed, such as the principles of choosing gestures from the design point of view, hardware requirements from an economic point of view, considerations of data availability, and algorithm complexity from a practical perspective.
Our work focuses on building an economical monocular system particularly designed for gesture-based piloting of an aerial robot.
Natural arm gestures are mapped to rich target directions and convenient fine adjustment is achieved.
Practical piloting scenarios, hardware cost and algorithm applicability are jointly considered in our system design.
The entire system is successfully implemented in an aerial robot and various properties of the system are tested.
Human actions comprise joint motions of articulated body parts, or `gestures'.
Human skeleton is intuitively represented as a sparse graph with joints as nodes and natural connections between them as edges.
Graph convolutional networks have been used to recognize actions from skeletal videos.
We introduce a part-based graph convolutional network (PB-GCN) for this task, inspired by Deformable Part-based Models (DPMs).
We divide the skeleton graph into four subgraphs with joints shared across them and learn a recognition model using a part-based graph convolutional network.
We show that such a model improves performance of recognition, compared to a model using entire skeleton graph.
Instead of using 3D joint coordinates as node features, we show that using relative coordinates and temporal displacements boosts performance.
Our model achieves state-of-the-art performance on two challenging benchmark datasets NTURGB+D and HDM05, for skeletal action recognition.
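The node features described above (relative coordinates plus temporal displacements, in place of raw 3D joint positions) can be sketched as (the choice of root joint is our illustrative assumption):

```python
def joint_features(frames, root=0):
    """Per-joint 6-D features: coordinates relative to a root joint within
    each frame, concatenated with the joint's displacement from the previous
    frame.  `frames` is a list of frames, each a list of (x, y, z) tuples."""
    feats = []
    for t, frame in enumerate(frames):
        rx, ry, rz = frame[root]
        frame_feats = []
        for j, (x, y, z) in enumerate(frame):
            rel = (x - rx, y - ry, z - rz)
            if t > 0:
                px, py, pz = frames[t - 1][j]
                disp = (x - px, y - py, z - pz)
            else:
                disp = (0, 0, 0)  # no previous frame for the first one
            frame_feats.append(rel + disp)
        feats.append(frame_feats)
    return feats
```

Relative coordinates make the representation invariant to global translation, while displacements inject motion cues directly into each node.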
Although deep recurrent neural networks (RNNs) demonstrate strong performance in text classification, training RNN models is often expensive and requires an extensive collection of annotated data which may not be available.
To overcome the data limitation issue, existing approaches leverage either pre-trained word embedding or sentence representation to lift the burden of training RNNs from scratch.
In this paper, we show that jointly learning sentence representations from multiple text classification tasks and combining them with pre-trained word-level and sentence level encoders result in robust sentence representations that are useful for transfer learning.
Extensive experiments and analyses using a wide range of transfer and linguistic tasks endorse the effectiveness of our approach.
Predicting unseen weather phenomena is an important issue for disaster management.
In this paper, we suggest a model for a convolutional sequence-to-sequence autoencoder for predicting undiscovered weather situations from previous satellite images.
We also propose a symmetric skip connection between encoder and decoder modules to produce more comprehensive image predictions.
To examine our model performance, we conducted experiments for each suggested model to predict future satellite images from historical satellite images.
A specific combination of skip connections and sequence-to-sequence autoencoder was able to generate the prediction closest to the ground-truth image.
During the last years, Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in image classification.
Their architectures have largely drawn inspiration from models of the primate visual system.
However, while recent research in neuroscience demonstrates the existence of non-linear operations in the responses of complex visual cells, little effort has been devoted to extending the convolution technique to non-linear forms.
Typical convolutional layers are linear systems, hence their expressiveness is limited.
To overcome this, various non-linearities have been used as activation functions inside CNNs, while also many pooling strategies have been applied.
We address the issue of developing a convolution method in the context of a computational model of the visual cortex, exploring quadratic forms through the Volterra kernels.
Such forms, constituting a richer function space, are used as approximations of the response profiles of visual cells.
Our proposed second-order convolution is tested on CIFAR-10 and CIFAR-100.
We show that a network which combines linear and non-linear filters in its convolutional layers, can outperform networks that use standard linear filters with the same architecture, yielding results competitive with the state-of-the-art on these datasets.
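The second-order response of a single filter on a flattened image patch can be written out directly (a minimal sketch of the quadratic Volterra form; a real layer would slide this over the image and learn `w1`, `w2`):

```python
def volterra_response(patch, w1, w2):
    """Second-order Volterra response of a filter on a flattened patch:
    the usual linear term plus a quadratic term over all pixel pairs,
        y = sum_i w1[i]*x[i] + sum_{i<=j} w2[i][j]*x[i]*x[j].
    `w2` is used as an upper-triangular matrix of pairwise weights."""
    linear = sum(w * x for w, x in zip(w1, patch))
    quadratic = sum(w2[i][j] * patch[i] * patch[j]
                    for i in range(len(patch))
                    for j in range(i, len(patch)))
    return linear + quadratic
```

With `w2` set to zero this reduces to the standard linear convolution response, which is why linear and quadratic filters can be mixed within the same layer.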
The impressive success of modern deep neural networks on computer vision tasks has been achieved through models of very large capacity compared to the number of available training examples.
This overparameterization is often said to be controlled with the help of different regularization techniques, mainly weight decay and dropout.
However, since these techniques reduce the effective capacity of the model, typically even deeper and wider architectures are required to compensate for the reduced capacity.
Therefore, there seems to be a waste of capacity in this practice.
In this paper we build upon recent research suggesting that explicit regularization may not be as important as widely believed, and carry out an ablation study which concludes that weight decay and dropout may not be necessary for object recognition if enough data augmentation is introduced.
Real-time simultaneous tracking of hands manipulating and interacting with external objects has many potential applications in augmented reality, tangible computing, and wearable computing.
However, due to difficult occlusions, fast motions, and uniform hand appearance, jointly tracking hand and object pose is more challenging than tracking either of the two separately.
Many previous approaches resort to complex multi-camera setups to remedy the occlusion problem and often employ expensive segmentation and optimization steps which makes real-time tracking impossible.
In this paper, we propose a real-time solution that uses a single commodity RGB-D camera.
The core of our approach is a 3D articulated Gaussian mixture alignment strategy tailored to hand-object tracking that allows fast pose optimization.
The alignment energy uses novel regularizers to address occlusions and hand-object contacts.
For added robustness, we guide the optimization with discriminative part classification of the hand and segmentation of the object.
We conduct extensive experiments on several existing datasets and introduce a new annotated hand-object dataset.
Quantitative and qualitative results show the key advantages of our method: speed, accuracy, and robustness.
Multi-label classification (MLC) is an important learning problem that expects the learning algorithm to take the hidden correlation of the labels into account.
Extracting the hidden correlation is generally a challenging task.
In this work, we propose a novel deep learning framework to better extract the hidden correlation with the help of the memory structure within recurrent neural networks.
The memory stores the temporary guesses on the labels and effectively allows the framework to rethink the goodness and correlation of the guesses before making the final prediction.
Furthermore, the rethinking process makes it easy to adapt to different evaluation criteria to match real-world application needs.
Experimental results across many real-world data sets justify that the rethinking process indeed improves MLC performance across different evaluation criteria and leads to superior performance over state-of-the-art MLC algorithms.
The ability to control a complex network towards a desired behavior relies on our understanding of the complex nature of these social and technological networks.
The existence of numerous control schemes in a network prompts us to wonder: what is the underlying relationship among all possible input nodes?
Here we introduce input graph, a simple geometry that reveals the complex relationship between all control schemes and input nodes.
We prove that a node adjacent to an input node in the input graph will appear in another control scheme, and that the connected nodes in the input graph have the same type in control, i.e., they are either all possible input nodes or all not.
Furthermore, we find that giant components emerge in the input graphs of many real networks, which provides a clear topological explanation of the bifurcation phenomenon emerging in dense networks and prompts us to design an efficient method to alter a node's type in control.
The findings provide an insight into control principles of complex networks and offer a general mechanism to design a suitable control scheme for different purposes.
In this article we test the accuracy of three platforms used in computational modelling: MATLAB, Octave and Scilab, running on i386 architecture and three operating systems (Windows, Ubuntu and Mac OS).
We submitted them to numerical tests using standard data sets and using the functions provided by each platform.
A Monte Carlo study was conducted in some of the datasets in order to verify the stability of the results with respect to small departures from the original input.
We propose a set of operations which include the computation of matrix determinants and eigenvalues, whose results are known.
We also used data provided by NIST (National Institute of Standards and Technology), following a protocol which includes the computation of basic univariate statistics (mean, standard deviation and first-lag correlation), linear regression and extremes of probability distributions.
The assessment was made comparing the results computed by the platforms with certified values, that is, known results, computing the number of correct significant digits.
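A common way to score such a comparison is the log relative error (LRE), which estimates the number of correct significant digits of a computed value against a certified one. A minimal sketch (the function name and the cap of 15 digits, reflecting double-precision limits, are illustrative choices):

```python
import math

def correct_digits(computed, certified, cap=15.0):
    """Log relative error (LRE): estimated number of correct significant
    digits of `computed` with respect to a certified value."""
    if computed == certified:
        return cap  # exact agreement: report the precision cap
    if certified == 0.0:
        # fall back to log absolute error when the certified value is zero
        lre = -math.log10(abs(computed))
    else:
        lre = -math.log10(abs(computed - certified) / abs(certified))
    return max(0.0, min(lre, cap))
```

For instance, a value agreeing with the certified result to about one part in ten million scores roughly 7 correct digits.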
Recent innovations in the design of computer viruses have led to new trade-offs for the attacker.
Multiple variants of a malware may spread at different rates and have different levels of visibility to the network.
In this work we examine the optimal strategies for the attacker so as to trade off the extent of spread of the malware against the need for stealth.
We show that in the mean-field deterministic regime, this spread-stealth trade-off is optimized by computationally simple single-threshold policies.
Specifically, we show that only one variant of the malware is spread by the attacker at each time, as there exists a time up to which the attacker prioritizes maximizing the spread of the malware, and after which she prioritizes stealth.
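As a rough illustration of such a single-threshold policy, the sketch below integrates logistic mean-field SI dynamics in which the attacker spreads a fast, visible variant before a threshold time and a slow, stealthy one after it. The rates, step size, and function name are illustrative assumptions, not the paper's actual model:

```python
def simulate(threshold, horizon=100.0, dt=0.01, beta_fast=0.5, beta_slow=0.1):
    """Final infected fraction under a single-threshold policy:
    spread the fast variant before `threshold`, the stealthy one after."""
    x = 0.01  # initially infected fraction
    t = 0.0
    while t < horizon:
        beta = beta_fast if t < threshold else beta_slow
        x += dt * beta * x * (1.0 - x)  # Euler step of logistic SI dynamics
        t += dt
    return x
```

Delaying the switch to stealth (a larger threshold) yields a larger final spread, at the cost of longer exposure of the visible variant.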
Recurrent neural networks (RNNs) sequentially process data by updating their state with each new data point, and have long been the de facto choice for sequence modeling tasks.
However, their inherently sequential computation makes them slow to train.
Feed-forward and convolutional architectures have recently been shown to achieve superior results on some sequence modeling tasks such as machine translation, with the added advantage that they concurrently process all inputs in the sequence, leading to easy parallelization and faster training times.
Despite these successes, however, popular feed-forward sequence models like the Transformer fail to generalize in many simple tasks that recurrent models handle with ease, e.g.
copying strings or even simple logical inference when the string or formula lengths exceed those observed at training time.
We propose the Universal Transformer (UT), a parallel-in-time self-attentive recurrent sequence model which can be cast as a generalization of the Transformer model and which addresses these issues.
UTs combine the parallelizability and global receptive field of feed-forward sequence models like the Transformer with the recurrent inductive bias of RNNs.
We also add a dynamic per-position halting mechanism and find that it improves accuracy on several tasks.
In contrast to the standard Transformer, under certain assumptions, UTs can be shown to be Turing-complete.
Our experiments show that UTs outperform standard Transformers on a wide range of algorithmic and language understanding tasks, including the challenging LAMBADA language modeling task where UTs achieve a new state of the art, and machine translation where UTs achieve a 0.9 BLEU improvement over Transformers on the WMT14 En-De dataset.
In the framework of finite games in extensive form with perfect information and strict preferences, this paper introduces a new equilibrium concept: the Perfect Prediction Equilibrium (PPE).
In the Nash paradigm, rational players consider that the opponent's strategy is fixed while maximizing their payoff.
The PPE, on the other hand, models the behavior of agents with an alternate form of rationality that involves a Stackelberg competition with the past.
Agents with this form of rationality integrate in their reasoning that they have such accurate logical and predictive skills, that the world is fully transparent: all players share the same knowledge and know as much as an omniscient external observer.
In particular, there is common knowledge of the solution of the game including the reached outcome and the thought process leading to it.
The PPE is stable given each player's knowledge of its actual outcome and uses no assumptions at unreached nodes.
This paper gives the general definition and construction of the PPE as a fixpoint problem, proves its existence, uniqueness and Pareto optimality, and presents two algorithms to compute it.
Finally, the PPE is put in perspective with existing literature (Newcomb's Problem, Superrationality, Nash Equilibrium, Subgame Perfect Equilibrium, Backward Induction Paradox, Forward Induction).
Text mining can be applied to many fields.
One application is political sentiment analysis of digital newspapers.
In this paper, sentiment analysis is applied to extract the positive or negative sentiment of digital news articles regarding a particular politician.
This paper suggests a simple model for analyzing the sentiment polarity of digital newspapers using the naive Bayes classifier.
The model starts from an initial data set, which is updated as new information appears.
The model showed promising results when tested and can be applied to other sentiment analysis problems.
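A naive Bayes classifier of this kind, with counts that can be updated as new labeled articles arrive, can be sketched as follows. Class names, smoothing, and tokenization are illustrative assumptions, not the paper's exact model:

```python
import math
from collections import defaultdict

class IncrementalNB:
    """Multinomial naive Bayes with Laplace smoothing; `update` folds in
    new labeled documents so the model evolves with new information."""
    def __init__(self):
        self.word_counts = {"pos": defaultdict(int), "neg": defaultdict(int)}
        self.class_counts = {"pos": 0, "neg": 0}
        self.vocab = set()

    def update(self, words, label):
        self.class_counts[label] += 1
        for w in words:
            self.word_counts[label][w] += 1
            self.vocab.add(w)

    def predict(self, words):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for c in ("pos", "neg"):
            # smoothed log-prior plus smoothed log-likelihoods
            score = math.log((self.class_counts[c] + 1) / (total_docs + 2))
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[c][w] + 1) / denom)
            scores[c] = score
        return max(scores, key=scores.get)
```

Calling `update` with each newly labeled article realizes the "updated when new information appears" behavior without retraining from scratch.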
One use of EEG-based brain-computer interfaces (BCIs) in rehabilitation is the detection of movement intention.
In this paper we investigate for the first time the instantaneous phase of movement related cortical potential (MRCP) and its application to the detection of gait intention.
We demonstrate the utility of MRCP phase in two independent datasets, in which 10 healthy subjects and 9 chronic stroke patients executed a self-initiated gait task in three sessions.
Phase features were compared to more conventional amplitude and power features.
The neurophysiology analysis showed that phase features have higher signal-to-noise ratio than the other features.
Also, BCI detectors of gait intention based on phase, amplitude, and their combination were evaluated under three conditions: session specific calibration, intersession transfer, and intersubject transfer.
Results show that the phase based detector is the most accurate for session specific calibration (movement intention was correctly detected in 66.5% of trials in healthy subjects, and in 63.3% in stroke patients).
However, in intersession and intersubject transfer, the detector that combines amplitude and phase features is the most accurate one and the only one that retains its accuracy (62.5% in healthy subjects and 59% in stroke patients) w.r.t. session specific calibration.
Thus, MRCP phase features improve the detection of gait intention and could be used in practice to remove time-consuming BCI recalibration.
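The instantaneous phase of a signal is conventionally obtained from the analytic signal via the Hilbert transform. A generic FFT-based sketch (not the paper's exact MRCP pipeline, and without the band-pass filtering a real EEG analysis would need):

```python
import numpy as np

def instantaneous_phase(x):
    """Instantaneous phase of a real signal via the analytic signal,
    computed with an FFT-based Hilbert transform."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)          # spectral weights for the analytic signal
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0    # keep positive frequencies, doubled
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(X * h)
    return np.angle(analytic)
```

For a pure cosine the recovered phase advances linearly at the signal's angular frequency, which is the sanity check usually applied to such implementations.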
Nowadays, robots are becoming companions in everyday life.
To be well accepted by humans, robots should efficiently understand the meaning of their partners' motions and body language, and respond accordingly.
Learning concepts by imitation brings them this ability in a user-friendly way.
This paper presents a fast and robust model for Incremental Learning of Concepts by Imitation (ILoCI).
In ILoCI, observed multimodal spatio-temporal demonstrations are incrementally abstracted and generalized based on both their perceptual and functional similarities during the imitation.
In this method, perceptually similar demonstrations are abstracted by a dynamic model of mirror neuron system.
An incremental method is proposed to learn their functional similarities through a limited number of interactions with the teacher.
Learning all concepts together through the proposed memory rehearsal enables the robot to utilize the common structural relations among concepts, which not only expedites the learning process, especially at the initial stages, but also improves the generalization ability and the robustness against discrepancies between observed demonstrations.
Performance of ILoCI is assessed using standard LASA handwriting benchmark data set.
The results show efficiency of ILoCI in concept acquisition, recognition and generation in addition to its robustness against variability in demonstrations.
The automated segmentation of cells in microscopic images is an open research problem that has important implications for studies of the developmental and cancer processes based on in vitro models.
In this paper, we present an approach for segmenting DIC images of cultured cells using G-neighbor smoothing followed by Kuwahara filtering, with a local standard deviation approach for boundary detection.
NIH FIJI/ImageJ tools are used to create the ground truth dataset.
The results of this work indicate that detecting cell boundaries with a segmentation approach remains a challenging problem, even under realistic measurement conditions.
Some notions in mathematics can be considered relative.
"Relative" is a term used to denote that a variation in the position of an observer implies a variation in the properties or measures of the observed object.
We know, from Skolem's theorem, that there are first-order models where the set of real numbers is countable and some where it is not.
This fact depends on the position of the observer and on the instrument/language the observer uses, i.e., on whether he/she is inside the model or not and, in this particular case, on the use of first-order logic.
In this article, we assume that computation is based on finiteness rather than natural numbers and discuss Turing-machine-computable morphisms defined on top of the sole notion of finiteness.
We explore the relativity of finiteness in models provided by toposes where the Axiom of Choice (AC) does not hold, since Tarski proved that if AC holds then all finiteness notions are equivalent.
Our toposes do not have a natural numbers object (NNO) either, since in a topos with an NNO these finiteness notions are equivalent to Peano finiteness, going back to computation on top of the natural numbers.
The main contribution of this article is to show that although from inside every topos, with the properties previously stated, the computation model is standard, from outside some of these toposes, unexpected properties on the computation arise, e.g., infinitely long programs, finite computations containing infinitely long ones, infinitely branching computations.
We mainly consider Dedekind and Kuratowski notions of finiteness in this article.
This paper reports on ongoing research investigating more expressive approaches to spatial-temporal trajectory clustering.
Spatial-temporal data is increasingly becoming universal as a result of widespread use of GPS and mobile devices, which makes mining and predictive analyses based on trajectories a critical activity in many domains.
Trajectory analysis methods based on clustering techniques often rely heavily on a similarity definition to provide proper insights.
However, although trajectories are currently described in terms of their two dimensions (space and time), their representation is not expressive enough to capture, in a combined way, the structure of space and time as well as the contextual and semantic properties of trajectories.
Moreover, the massive amounts of available trajectory data make trajectory mining and analyses very challenging.
In this paper, we briefly discuss (i) an improved trajectory representation that takes into consideration space-time structures, context and semantic properties of trajectories; (ii) new forms of relations between the dimensions of a pair of trajectories; and (iii) big data approaches that can be used to develop a novel spatial-temporal clustering framework.
Our goal is to answer questions about paragraphs describing processes (e.g., photosynthesis).
Texts of this genre are challenging because the effects of actions are often implicit (unstated), requiring background knowledge and inference to reason about the changing world states.
To supply this knowledge, we leverage VerbNet to build a rulebase (called the Semantic Lexicon) of the preconditions and effects of actions, and use it along with commonsense knowledge of persistence to answer questions about change.
Our evaluation shows that our system, ProComp, significantly outperforms two strong reading comprehension (RC) baselines.
Our contributions are two-fold: the Semantic Lexicon rulebase itself, and a demonstration of how a simulation-based approach to machine reading can outperform RC methods that rely on surface cues alone.
Since this work was performed, we have developed neural systems that outperform ProComp, described elsewhere (Dalvi et al., NAACL'18).
However, the Semantic Lexicon remains a novel and potentially useful resource, and its integration with neural systems remains a currently unexplored opportunity for further improvements in machine reading about processes.
This paper applies energy conservation principles to the Daala video codec using gain-shape vector quantization to encode a vector of AC coefficients as a length (gain) and direction (shape).
The technique originates from the CELT mode of the Opus audio codec, where it is used to conserve the spectral envelope of an audio signal.
Conserving energy in video has the potential to preserve textures rather than low-passing them.
Explicitly quantizing a gain allows a simple contrast masking model with no signaling cost.
Vector quantizing the shape keeps the number of degrees of freedom the same as scalar quantization, avoiding redundancy in the representation.
We demonstrate how to predict the vector by transforming the space it is encoded in, rather than subtracting off the predictor, which would make energy conservation impossible.
We also derive an encoding of the vector-quantized codewords that takes advantage of their non-uniform distribution.
We show that the resulting technique outperforms scalar quantization by an average of 0.90 dB on still images, equivalent to a 24.8% reduction in bitrate at equal quality, while for videos, the improvement averages 0.83 dB, equivalent to a 13.7% reduction in bitrate.
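The gain-shape split described above can be sketched as follows: the gain is the vector's norm (quantized with a scalar quantizer), and the shape is a unit vector (quantized by nearest codeword when a codebook is given). This is a simplified illustration, not Daala's actual PVQ codebook or prediction transform:

```python
import numpy as np

def gain_shape_quantize(v, gain_step=0.5, codebook=None):
    """Split a coefficient vector into a scalar gain and a unit-norm
    shape, quantizing each separately (illustrative parameters)."""
    gain = np.linalg.norm(v)
    if gain == 0.0:
        return 0.0, np.zeros_like(v)
    shape = v / gain
    q_gain = round(gain / gain_step) * gain_step  # uniform scalar quantizer
    if codebook is not None:
        # nearest codeword on the unit sphere: maximize the inner product
        shape = codebook[int(np.argmax(codebook @ shape))]
    return q_gain, shape

def reconstruct(q_gain, shape):
    return q_gain * shape
```

Because the reconstruction's norm is exactly the quantized gain, the vector's energy is conserved up to gain-quantizer resolution, which is the property that preserves texture energy.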
A recent independent study resulted in a ranking system which ranked Astronomy and Computing (ASCOM) much higher than most of the older journals highlighting the niche prominence of the particular journal.
We investigate the remarkable ascendancy in reputation of ASCOM by proposing a novel differential equation based modeling.
The modeling is a consequence of knowledge discovery from big-data-centric methods, namely L1-SVD.
Given that the study was post facto, it is understandable that the ranking method cannot explain the growth in reputation of ASCOM.
Thus, we propose a growth model by accounting for the behavior of parameters that contribute to the growth of a field.
It is worthwhile to spend some time analysing the cause and control variables behind the rapid rise in reputation of a journal in a niche area.
We intend to probe and bring out the parameters responsible for its growing influence.
Delay differential equations are used to model the change of influence on a journal's status by exploiting the effects of historical data.
Iris recognition technology, used to identify individuals by photographing the iris of their eye, has become popular in security applications because of its ease of use, accuracy, and safety in controlling access to high-security areas.
Fusion of multiple algorithms for biometric verification performance improvement has received considerable attention.
The proposed method combines zero-crossing of the 1D wavelet, the Euler number, and a genetic-algorithm-based method for feature extraction.
The outputs of these three algorithms are normalized and their scores are fused to decide whether the user is genuine or an impostor.
This new strategy for computing a multimodal combined score is discussed in this paper.
This paper addresses the problem of Human-Aware Navigation (HAN), using multi camera sensors to implement a vision-based person tracking system.
The main contributions of this paper are as follows: a novel and efficient Deep Learning person detection approach, and a standardization of human-aware constraints.
In the first stage of the approach, we propose to cascade the Aggregate Channel Features (ACF) detector with a deep Convolutional Neural Network (CNN) to achieve fast and accurate Pedestrian Detection (PD).
Regarding the human awareness (that can be defined as constraints associated with the robot's motion), we use a mixture of asymmetric Gaussian functions, to define the cost functions associated to each constraint.
Both methods proposed herein are evaluated individually to measure the impact of each of the components.
The final solution (including both the proposed pedestrian detection and the human-aware constraints) is tested in a typical domestic indoor scenario, in four distinct experiments.
The results show that the robot is able to cope with human-aware constraints, defined after common proxemics and social rules.
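The asymmetric Gaussian cost mentioned above can be sketched as a function that penalizes robot positions near a person more strongly behind them than in front, in the person's own frame. All parameter values and names below are illustrative assumptions, not the paper's calibrated constraints:

```python
import math

def asymmetric_gaussian_cost(dx, dy, theta, sigma_front=2.0,
                             sigma_rear=1.0, sigma_side=1.2):
    """Proxemics cost of a robot offset (dx, dy) relative to a person
    facing direction `theta`; variance differs in front of vs. behind."""
    # rotate the offset into the person's frame
    a = math.cos(theta) * dx + math.sin(theta) * dy    # along gaze direction
    b = -math.sin(theta) * dx + math.cos(theta) * dy   # lateral offset
    sigma_a = sigma_front if a >= 0 else sigma_rear
    return math.exp(-0.5 * ((a / sigma_a) ** 2 + (b / sigma_side) ** 2))
```

With a larger front variance, the cost decays more slowly in the gaze direction, so the planner keeps more distance in front of the person than behind, matching common proxemics rules.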
We examine the non-Markovian nature of human mobility by exposing the inability of Markov models to capture criticality in human mobility.
In particular, the assumed Markovian nature of mobility was used to establish a theoretical upper bound on the predictability of human mobility (expressed as a minimum error probability limit), based on temporally correlated entropy.
Since its inception, this bound has been widely used and empirically validated using Markov chains.
We show that recurrent-neural architectures can achieve significantly higher predictability, surpassing this widely used upper bound.
In order to explain this anomaly, we shed light on several underlying assumptions in previous research that have resulted in this bias.
By evaluating the mobility predictability on real-world datasets, we show that human mobility exhibits scale-invariant long-range correlations, bearing similarity to a power-law decay.
This is in contrast to the initial assumption that human mobility follows an exponential decay.
This assumption of exponential decay coupled with Lempel-Ziv compression in computing Fano's inequality has led to an inaccurate estimation of the predictability upper bound.
We show that this approach inflates the entropy, consequently lowering the upper bound on human mobility predictability.
We finally highlight that this approach tends to overlook long-range correlations in human mobility.
This explains why recurrent-neural architectures that are designed to handle long-range structural correlations surpass the previously computed upper bound on mobility predictability.
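The entropy-based bound discussed above is commonly computed in two steps: a Lempel-Ziv estimate of the entropy rate from a location sequence, followed by inverting Fano's inequality for the maximum predictability. A minimal sketch of that standard pipeline (a quadratic-time estimator for illustration, not the paper's implementation):

```python
import math

def _contains(hist, sub):
    m = len(sub)
    return any(hist[j:j + m] == sub for j in range(len(hist) - m + 1))

def lz_entropy_rate(seq):
    """Lempel-Ziv entropy-rate estimate (bits/symbol):
    S ~ n*log2(n) / sum(Lambda_i), where Lambda_i is the length of the
    shortest substring starting at i not seen in the preceding history."""
    n = len(seq)
    total = 0
    for i in range(n):
        hist = seq[:i]
        k = 1
        while i + k <= n and _contains(hist, seq[i:i + k]):
            k += 1
        total += k
    return n * math.log2(n) / total

def fano_predictability(S, N):
    """Invert Fano's inequality S = H(Pi) + (1 - Pi) * log2(N - 1)
    by bisection to obtain the predictability upper bound Pi_max."""
    def fano(p):
        h = 0.0
        if 0.0 < p < 1.0:
            h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return h + ((1 - p) * math.log2(N - 1) if N > 1 else 0.0)
    lo, hi = 1.0 / N, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if fano(mid) > S:   # fano() is decreasing in p on [1/N, 1]
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A highly regular sequence yields a low entropy estimate and hence a predictability bound close to 1; the paper's point is that this estimate is biased when the sequence has long-range correlations.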
In the current Named Data Networking implementation, forwarding a data request requires finding an exact match between the prefix of the name carried in the request and a forwarding table entry.
However, consumers may not always know the exact naming, or an exact prefix, of their desired data.
The current approach to this problem-establishing naming conventions and performing name lookup-can be infeasible in highly ad hoc, heterogeneous, and dynamic environments: the same data can be named using different terms or even languages, naming conventions may be minimal if they exist at all, and name lookups can be costly.
In this paper, we present a fuzzy Interest forwarding approach that exploits semantic similarities between the names carried in Interest packets and the names of potentially matching data in the Content Store (CS) and entries in the Forwarding Information Base (FIB).
We describe the fuzzy Interest forwarding approach, outline the semantic understanding function that determines the name matching, and present our simulation study along with extended evaluation results.
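A simple stand-in for such a name-matching function is token-level similarity between hierarchical names, with a threshold deciding whether a fuzzy match is accepted. The Jaccard measure, threshold value, and function names below are illustrative assumptions, not the paper's semantic understanding function or the NDN forwarder's API:

```python
def name_similarity(name_a, name_b):
    """Token-level Jaccard similarity between two hierarchical names."""
    ta = set(name_a.strip('/').lower().split('/'))
    tb = set(name_b.strip('/').lower().split('/'))
    return len(ta & tb) / len(ta | tb)

def fuzzy_match(interest_name, table, threshold=0.5):
    """Return the table entry most similar to the Interest name,
    or None when no entry clears the threshold."""
    best = max(table, key=lambda n: name_similarity(interest_name, n))
    return best if name_similarity(interest_name, best) >= threshold else None
```

A real semantic understanding function would also account for synonyms and cross-language terms, which plain token overlap cannot capture.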
Are human perception and decision biases grounded in a form of rationality?
You return to your camp after hunting or gathering.
You see the grass moving.
You do not know the probability that a snake is in the grass.
Should you cross the grass - at the risk of being bitten by a snake - or make a long, hence costly, detour?
Based on this storyline, we consider a rational decision maker maximizing expected discounted utility with learning.
We show that his optimal behavior displays three biases: status quo, salience, overestimation of small probabilities.
Biases can be the product of rational behavior.
The objective of deep learning methods based on encoder-decoder architectures for music source separation is to approximate either ideal time-frequency masks or spectral representations of the target music source(s).
The spectral representations are then used to derive time-frequency masks.
In this work we introduce a method to directly learn time-frequency masks from an observed mixture magnitude spectrum.
We employ recurrent neural networks and train them using prior knowledge only for the magnitude spectrum of the target source.
To assess the performance of the proposed method, we focus on the task of singing voice separation.
The results from an objective evaluation show that our proposed method provides comparable results to deep learning based methods which operate over complicated signal representations.
Compared to previous methods that approximate time-frequency masks, our method has increased performance of signal to distortion ratio by an average of 3.8 dB.
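For reference, the time-frequency masks that prior methods approximate are typically ratio masks derived from magnitude spectrograms. A generic oracle-mask sketch (shown only for context; the paper's contribution is to *learn* the mask directly rather than compute it this way):

```python
import numpy as np

def ratio_mask(target_mag, mixture_mag, eps=1e-8):
    """Ideal ratio mask: element-wise ratio of target to mixture
    magnitude, clipped to [0, 1]; `eps` avoids division by zero."""
    return np.clip(target_mag / (mixture_mag + eps), 0.0, 1.0)

def apply_mask(mask, mixture_mag):
    """Estimate the source magnitude by masking the mixture."""
    return mask * mixture_mag
```

Applying the oracle mask to the mixture recovers the target magnitude wherever the target does not exceed the mixture, which is the upper bound mask-based methods aim for.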
We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs.
We also develop a crowdsourcing scheme to show that QAMRs can be labeled with very little training, and gather a dataset with over 5,000 sentences and 100,000 questions.
A detailed qualitative analysis demonstrates that the crowd-generated question-answer pairs cover the vast majority of predicate-argument relationships in existing datasets (including PropBank, NomBank, QA-SRL, and AMR) along with many previously under-resourced ones, including implicit arguments and relations.
The QAMR data and annotation code is made publicly available to enable future work on how best to model these complex phenomena.
Virtual reality 360-degree videos will become the first prosperous online VR application.
VR 360 videos are data-hungry and latency-sensitive, posing unique challenges to the networking infrastructure.
In this paper, we focus on the ultimate VR 360 that satisfies human eye fidelity.
The ultimate VR 360 requires downlink 1.5 Gbps for viewing and uplink 6.6 Gbps for live broadcasting, with round-trip time of less than 8.3 ms. On the other hand, wireless access to VR 360 services is preferred over wire-line transmission because of the better user experience and the safety concern (e.g., tripping hazard).
We explore in this paper whether the most advanced wireless technologies from both cellular communications and WiFi communications support the ultimate VR 360.
Specifically, we consider 5G in cellular communications, IEEE 802.11ac (operating in 5GHz) and IEEE 802.11ad (operating in 60GHz) in WiFi communications.
According to their performance specified in their standards and/or empirical measurements, we have the following findings: (1) Only 5G has the potential to support both the ultimate VR 360 viewing and live broadcasting.
However, it is difficult for 5G to support multiple users of the ultimate VR live broadcasting at home; (2) IEEE 802.11ac supports the ultimate VR 360 viewing but fails to support the ultimate VR 360 live broadcasting because it does not meet the data rate requirement of the ultimate VR 360 live broadcasting; (3) IEEE 802.11ad fails to support the ultimate VR 360, because its current implementation incurs very high latency.
Our preliminary results indicate that more advanced wireless technologies are needed to fully support multiple ultimate VR 360 users at home.
When we have knowledge of the positions of nearby walls and buildings, estimating the source location becomes a very efficient way of characterizing and estimating a radio channel.
We consider localization performance with and without this knowledge.
We treat the multipath channel as a set of "virtual receivers" whose positions can be pre-stored in a channel database.
Using wall knowledge, we develop a generalized MUSIC algorithm that treats the wall reflection parameter as a nuisance variable.
We compare this to a classic MVDR direct positioning algorithm that lacks wall knowledge.
In a simple scenario, we find that lack of wall knowledge can increase location error by 7-100x, depending on the number of antennas, SNR, and true reflection parameter.
Interestingly, as the number of antennas increases, the value of wall knowledge decreases.
A key challenge in modern computing is to develop systems that address complex, dynamic problems in a scalable and efficient way, because the increasing complexity of software makes designing and maintaining efficient and flexible systems increasingly difficult.
Biological systems are thought to possess robust, scalable processing paradigms that can automatically manage complex, dynamic problem spaces, possessing several properties that may be useful in computer systems.
The biological properties of self-organisation, self-replication, self-management, and scalability are addressed in an interesting way by autopoiesis, a descriptive theory of the cell founded on the concept of a system's circular organisation to define its boundary with its environment.
In this paper, therefore, we review the main concepts of autopoiesis and then discuss how they could be related to fundamental concepts and theories of computation.
The paper is conceptual in nature and the emphasis is on the review of other people's work in this area as part of a longer-term strategy to develop a formal theory of autopoietic computing.
In this paper we discuss some reasons why temporal logic might not be suitable to model real life norms.
To show this, we present a novel deontic logic contrary-to-duty/derived permission paradox based on the interaction of obligations, permissions and contrary-to-duty obligations.
The paradox is inspired by real life norms.
Access to the cloud has the potential to provide scalable and cost effective enhancements of physical devices through the use of advanced computational processes run on apparently limitless cyber infrastructure.
On the other hand, cyber-physical systems and cloud-controlled devices are subject to numerous design challenges; among them is that of security.
In particular, recent advances in adversary technology pose Advanced Persistent Threats (APTs) which may stealthily and completely compromise a cyber system.
In this paper, we design a framework for the security of cloud-based systems that specifies when a device should trust commands from the cloud which may be compromised.
This interaction can be considered as a game between three players: a cloud defender/administrator, an attacker, and a device.
We use traditional signaling games to model the interaction between the cloud and the device, and we use the recently proposed FlipIt game to model the struggle between the defender and attacker for control of the cloud.
Because attacks upon the cloud can occur without knowledge of the defender, we assume that strategies in both games are picked according to prior commitment.
This framework requires a new equilibrium concept, which we call Gestalt Equilibrium, a fixed-point that expresses the interdependence of the signaling and FlipIt games.
We present the solution to this fixed-point problem under certain parameter cases, and illustrate an example application of cloud control of an unmanned vehicle.
Our results contribute to the growing understanding of cloud-controlled systems.
Scarcity of labeled data is one of the most frequent problems faced in machine learning.
This is particularly true of relation extraction in text mining, where large corpora of text exist in many application domains, while labeling text data requires an expert to invest much time reading the documents.
Overall, state-of-the art models, like the convolutional neural network used in this paper, achieve great results when trained on large enough amounts of labeled data.
However, from a practical point of view the question arises whether this is the most efficient approach when one takes the manual effort of the expert into account.
In this paper, we report on an alternative approach where we first construct a relation extraction model using distant supervision, and only later make use of a domain expert to refine the results.
Distant supervision provides a means of labeling data given known relations in a knowledge base, but it suffers from noisy labeling.
We introduce an active learning based extension, that allows our neural network to incorporate expert feedback and report on first results on a complex data set.
Skeleton-based human action recognition has attracted a lot of research attention during the past few years.
Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data.
The proposed work extends this idea to the spatial domain as well as the temporal domain, to better analyze the hidden sources of action-related information within human skeleton sequences in both domains simultaneously.
Based on the pictorial structure of Kinect's skeletal data, an effective tree-structure based traversal framework is also proposed.
In order to deal with the noise in the skeletal data, a new gating mechanism is introduced within the LSTM module, with which the network can learn the reliability of the sequential data and accordingly adjust the effect of the input data on the update of the long-term context representation stored in the unit's memory cell.
Moreover, we introduce a novel multi-modal feature fusion strategy within the LSTM unit in this paper.
The comprehensive experimental results on seven challenging benchmark datasets for human action recognition demonstrate the effectiveness of the proposed method.
A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language.
Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories.
We used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords.
We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither.
We train a multi-class classifier to distinguish between these different categories.
Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult.
We find that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive.
Tweets without explicit hate keywords are also more difficult to classify.
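As a minimal illustration of three-way classification over tweet tokens: the abstract does not name the classifier, so a multinomial Naive Bayes with add-one smoothing stands in here, and the tiny training documents are invented toy examples:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns log-priors and per-class
    log-likelihoods with add-one (Laplace) smoothing."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    V = len(vocab)
    log_prior = {c: math.log(n / len(docs)) for c, n in class_counts.items()}
    log_lik = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        log_lik[c] = {w: math.log((word_counts[c][w] + 1) / (total + V))
                      for w in vocab}
    return log_prior, log_lik, vocab

def classify(tokens, log_prior, log_lik, vocab):
    """Predict the class with the highest posterior log-probability."""
    def score(c):
        return log_prior[c] + sum(log_lik[c][w] for w in tokens if w in vocab)
    return max(log_prior, key=score)

docs = [
    (["you", "people", "are", "slur"], "hate"),
    (["that", "movie", "was", "crap"], "offensive"),
    (["nice", "weather", "today"], "neither"),
]
model = train_nb(docs)
label = classify(["slur", "people"], *model)
```

The real task would of course use a much larger labeled sample and richer features; the sketch only shows how a multi-class probabilistic model separates the three label sets.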
This paper presents a new multi-view RGB-D dataset of nine kitchen scenes, each containing several objects in realistic cluttered environments including a subset of objects from the BigBird dataset.
The viewpoints of the scenes are densely sampled and objects in the scenes are annotated with bounding boxes and in the 3D point cloud.
Also, an approach for detection and recognition is presented, which comprises two parts: i) a new multi-view 3D proposal generation method and ii) the development of several recognition baselines using AlexNet to score our proposals, which is trained either on crops of the dataset or on synthetically composited training images.
Finally, we compare the performance of the object proposals and a detection baseline to the Washington RGB-D Scenes (WRGB-D) dataset and demonstrate that our Kitchen scenes dataset is more challenging for object detection and recognition.
The dataset is available at: http://cs.gmu.edu/~robot/gmu-kitchens.html.
We study the problem of finding and monitoring fixed-size subgraphs in a continually changing large-scale graph.
We present the first approach that (i) performs worst-case optimal computation and communication, (ii) maintains a total memory footprint linear in the number of input edges, and (iii) scales down per-worker computation, communication, and memory requirements linearly as the number of workers increases, even on adversarially skewed inputs.
Our approach is based on worst-case optimal join algorithms, recast as a data-parallel dataflow computation.
We describe the general algorithm and modifications that make it robust to skewed data, prove theoretical bounds on its resource requirements in the massively parallel computing model, and implement and evaluate it on graphs containing as many as 64 billion edges.
The underlying algorithm and ideas generalize from finding and monitoring subgraphs to the more general problem of computing and maintaining relational equi-joins over dynamic relations.
In this paper, we present our deep attention-based classification (DABC) network for robust single image depth prediction, in the context of the Robust Vision Challenge 2018 (ROB 2018).
Unlike conventional depth prediction, our goal is to design a model that can perform well in both indoor and outdoor scenes with a single parameter set.
However, robust depth prediction suffers from two challenging problems: a) How to extract more discriminative features for different scenes (compared to a single scene)?
b) How to handle the large differences of depth ranges between indoor and outdoor datasets?
To address these two problems, we first formulate depth prediction as a multi-class classification task and apply a softmax classifier to classify the depth label of each pixel.
We then introduce a global pooling layer and a channel-wise attention mechanism to adaptively select the discriminative channels of features and to update the original features by assigning important channels with higher weights.
Further, to reduce the influence of quantization errors, we employ a soft-weighted sum inference strategy for the final prediction.
Experimental results on both indoor and outdoor datasets demonstrate the effectiveness of our method.
It is worth mentioning that we won second place in the single-image depth prediction entry of ROB 2018, held in conjunction with the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018.
In many applications, ultra-wideband (UWB) systems experience impulse noise due to surrounding physical noise sources.
Therefore, a conventional receiver (correlator or matched filter) designed for an additive Gaussian noise system is not optimal for a communication channel affected by impulse noise.
In this paper, we propose a new robust receiver design that utilizes the received UWB signal cluster sparsity to mitigate impulse noise.
Further, multipath channel diversity enhances the signal-to-noise ratio compared to the single-path case after impulse-noise removal in the proposed receiver design.
The proposed receiver is analyzed in a time-hopping binary phase shift keying UWB system and is compared with the popular blanking-nonlinearity-based receiver under Bernoulli-Gaussian impulse noise over both single-path and multipath IEEE 802.15.4a channels.
Unlike existing designs, the proposed receiver does not require any training sequence.
The proposed receiver is observed to be robust with improved bit error rate performance as compared to a blanking receiver in the presence of impulse noise.
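The blanking baseline that the proposed receiver is compared against can be sketched in a few lines: samples whose magnitude exceeds a threshold are zeroed before correlation. This is a generic illustration, not the paper's receiver; the threshold, toy samples, and template are assumptions:

```python
def blank(samples, threshold):
    """Blanking nonlinearity: zero out samples whose magnitude exceeds the
    threshold, so impulse-noise hits do not dominate the correlator."""
    return [0.0 if abs(s) > threshold else s for s in samples]

def correlate(samples, template):
    """Correlator output after blanking (matched-filter style inner product)."""
    return sum(s * t for s, t in zip(samples, template))

received = [0.4, -0.2, 7.5, 0.1, -6.0, 0.3]   # two impulse-noise hits
template = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0]   # toy spreading template
cleaned = blank(received, threshold=1.0)
score = correlate(cleaned, template)
```

A drawback the paper alludes to is visible even here: picking a good threshold requires knowledge of the noise statistics, whereas the proposed sparsity-based receiver needs no training sequence.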
This work presents our research on driving skill modeling using artificial neural networks for haptic assistance.
In this paper, we present a haptic driving training simulator with performance-based, error-corrective haptic feedback.
One key component of our simulator is the ability to learn an optimized driving skill model from the driving data of expert drivers.
To this end, we obtain a model utilizing artificial neural networks to extract the desired movement of a steering wheel and an accelerator pedal from the expert drivers' data.
Then, we can deliver haptic assistance based on a driver's performance error, i.e., the difference between the current and the desired movement.
We validate the performance of our framework in two user experiments, recruiting expert and novice drivers, to show the feasibility and applicability of employing neural networks for performance-based haptic driving skill transfer.
Visual question answering (VQA) has witnessed great progress since May 2015 as a classic problem unifying visual and textual data in a single system.
Many enlightening VQA works delve deep into image and question encodings and their fusion methods, of which attention is the most effective and widely adopted mechanism.
Current attention-based methods focus on adequate fusion of visual and textual features, but pay no attention to where people focus when asking questions about the image.
Traditional attention-based methods attach a single value to the feature at each spatial location, which loses much useful information.
To remedy these problems, we propose a general method to perform saliency-like pre-selection on overlapped region features by the interrelation of bidirectional LSTM (BiLSTM), and use a novel element-wise multiplication based attention method to capture more competent correlation information between visual and textual features.
We conduct experiments on the large-scale COCO-VQA dataset and analyze the effectiveness of our model demonstrated by strong empirical results.
A recent trend in object oriented (OO) programming languages is the use of Access Permissions (APs) as an abstraction for controlling concurrent executions of programs.
The use of AP source code annotations defines a protocol specifying how object references can access the mutable state of objects.
Although the use of APs simplifies the task of writing concurrent code, an unsystematic use of them can lead to subtle problems.
This paper presents a declarative interpretation of APs as Linear Concurrent Constraint Programs (lcc).
We represent APs as constraints (i.e., formulas in logic) in an underlying constraint system whose entailment relation models the transformation rules of APs.
Moreover, we use processes in lcc to model the dependencies imposed by APs, thus allowing the faithful representation of their flow in the program.
We verify relevant properties about AP programs by taking advantage of the interpretation of lcc processes as formulas in Girard's intuitionistic linear logic (ILL).
Properties include deadlock detection, program correctness (whether programs adhere to their AP specifications or not), and the ability of methods to run concurrently.
By relying on a focusing discipline for ILL, we provide a complexity measure for proofs of the above mentioned properties.
The effectiveness of our verification techniques is demonstrated by implementing the Alcove tool that includes an animator and a verifier.
The former executes the lcc model, observing the flow of APs and quickly finding inconsistencies of the APs vis-a-vis the implementation.
The latter is an automatic theorem prover based on ILL.
This paper is under consideration for publication in Theory and Practice of Logic Programming (TPLP).
One of the most significant 5G technology enablers will be Device-to-Device (D2D) communications.
D2D communications constitute a promising way to improve spectral, energy and latency performance, exploiting the physical proximity of communicating devices and increasing resource utilization.
Furthermore, network infrastructure densification has been considered as one of the most substantial methods to increase system performance, taking advantage of base station proximity and spatial reuse of system resources.
However, could we improve system performance by leveraging these two 5G enabling technologies together in a multi-cell environment?
How does spectrum sharing affect performance enhancement?
This article investigates the implications of interference, densification and spectrum sharing in D2D performance gain.
The in-band D2D approach, where legacy users coexist with potential D2D pairs, is considered in a multi-cell system.
Overlay and underlay spectrum sharing approaches are employed in order for the potential D2D pairs to access the spectrum.
Given that two of the most critical problems in the D2D concept are mode selection and user scheduling, we jointly address them, aiming at maximizing the total system uplink throughput.
Thus, we present a radio resource management mechanism for intra-cell and cross-cell overlay/underlay D2D communications enabled in a multi-cell system.
System-level simulations are executed to evaluate the system performance and examine the trends of D2D communication gain for the different spectrum sharing approaches and various densification scenarios.
Finally, real-world SDR-based experiments are performed to test and assess D2D communications for overlay and underlay spectrum sharing.
Biclustering has proven useful in areas such as data mining and bioinformatics.
The term biclustering refers to searching for subsets of observations and features that form a coherent structure.
This can be interpreted in different ways, such as spatial closeness or relations between features for selected observations.
This paper discusses different properties, objectives and approaches of biclustering algorithms.
We also present an algorithm which detects feature relation based biclusters using density based techniques.
Here we use relative density of regions to identify biclusters embedded in the data.
Properties of this algorithm are discussed and demonstrated using artificial datasets.
The proposed method is shown to give better results on these datasets using a paired right-tailed t-test.
The usefulness of the proposed method is also demonstrated on real-life datasets.
We consider the design of wireless queueing network control policies with particular focus on combining stability with additional application-dependent requirements.
To this end, we pursue a cost-function-based approach that provides the flexibility to incorporate constraints and requirements of particular services or applications.
As typical examples of such requirements, we consider the reduction of buffer underflows in case of streaming traffic, and energy efficiency in networks of battery powered nodes.
Compared to the classical throughput optimal control problem, such requirements significantly complicate the control problem.
We provide easily verifiable theoretical conditions for stability and, additionally, compare various candidate cost functions applied to wireless networks with streaming media traffic.
Moreover, we demonstrate how the framework can be applied to the problem of energy-efficient routing, and we demonstrate the application of our framework in cross-layer control problems for wireless multihop networks, using an advanced power control scheme for interference mitigation based on successive convex approximation.
In all scenarios, the performance of our control framework is evaluated using extensive numerical simulations.
Delay Tolerant Networks (DTNs) are sparse mobile networks that experience frequent disruptions in connectivity among nodes.
A DTN usually follows a store-carry-and-forward mechanism for message forwarding, in which a node stores and carries a message until it finds an appropriate relay node to forward it further into the network.
The efficiency of a DTN routing protocol therefore relies on the intelligent selection of a relay node from a set of encountered nodes.
Although there are plenty of DTN routing schemes proposed in the literature based on different strategies of relay selection, there are not many mathematical models proposed to study the behavior of message forwarding in DTN.
In this paper, we propose a novel epidemic model, called the CISER model, for message propagation in DTNs, based on the propagation of the Amoebiasis disease in human populations.
The proposed CISER model is an extension of the SIR epidemic model with additional states to represent the resource-constrained behavior of nodes in DTNs.
Experimental results using both synthetic and real-world traces show that the proposed model improves routing performance metrics, such as delivery ratio, overhead ratio and delivery delay, compared to the SIR model.
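The abstract does not enumerate CISER's additional states, so as background, here is a minimal forward-Euler sketch of the classical SIR compartmental model that CISER extends; the rates, initial conditions, and step size are illustrative assumptions:

```python
def simulate_sir(beta, gamma, s0, i0, r0, steps, dt=0.01):
    """Forward-Euler integration of the classical SIR compartments.
    beta: infection rate, gamma: recovery rate; populations are fractions
    of the node population, so s + i + r stays constant."""
    s, i, r = s0, i0, r0
    for _ in range(steps):
        new_inf = beta * s * i * dt       # S -> I flow in this step
        new_rec = gamma * i * dt          # I -> R flow in this step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return s, i, r

# Illustrative run: 1% of nodes initially "infected" (carrying the message).
s, i, r = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01, r0=0.0,
                       steps=5000)
```

In the DTN reading, "infected" nodes carry a copy of the message and "recovered" nodes have delivered or dropped it; CISER would add states for resource-constrained behavior on top of this skeleton.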
This paper introduces a generalization of Convolutional Neural Networks (CNNs) to graphs with irregular linkage structures, especially heterogeneous graphs with typed nodes and schemas.
We propose a novel spatial convolution operation to model the key properties of local connectivity and translation invariance, using high-order connection patterns or motifs.
We develop a novel deep architecture Motif-CNN that employs an attention model to combine the features extracted from multiple patterns, thus effectively capturing high-order structural and feature information.
Our experiments on semi-supervised node classification on real-world social networks and multiple representative heterogeneous graph datasets indicate significant gains of 6-21% over existing graph CNNs and other state-of-the-art techniques.
There has been tremendous effort in improving wireless LANs to support demanding multimedia applications.
Many new protocols and ideas have been proposed and validated using mathematical models or simulation programs.
That is satisfactory, but these proposed designs might not work in real-world situations.
A testbed is an option to alleviate this gap and presents the opportunity to observe real problems and ensure that a design works.
This paper presents a framework architecture for building a testbed to test a new concept or design.
The framework is designed in a modular style such that components can be easily exchanged or modified.
A testbed based on the framework, implementing a polling-based mechanism, has been created, and the results show that the QoS of real-time traffic can be maintained in the presence of heavy non-real-time traffic.
The blooming availability of traces for social, biological, and communication networks opens up unprecedented opportunities in analyzing diffusion processes in networks.
However, the sheer sizes of the nowadays networks raise serious challenges in computational efficiency and scalability.
In this paper, we propose a new hyper-graph sketching framework for influence dynamics in networks.
The core of our sketching framework, called SKIS, is an efficient importance sampling algorithm that returns only non-singular reverse cascades in the network.
Compared to previously developed sketches such as RIS and SKIM, our sketch significantly enhances estimation quality while substantially reducing processing time and memory footprint.
Further, we present general strategies of using SKIS to enhance existing algorithms for influence estimation and influence maximization which are motivated by practical applications like viral marketing.
Using SKIS, we design a high-quality influence oracle for seed sets with average estimation error up to 10x smaller than with RIS and 6x smaller than with SKIM.
In addition, our influence maximization using SKIS substantially improves the quality of solutions for greedy algorithms.
It achieves up to 10x speed-up and 4x memory reduction for the fastest RIS-based DSSA algorithm, while maintaining the same theoretical guarantees.
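For context, the RIS-style sketching that SKIS improves upon can be sketched as follows: sample reverse reachable (RR) sets under the independent cascade model and estimate a seed set's influence from the fraction of RR sets it intersects. This is a generic illustration on a toy graph, not the SKIS algorithm itself:

```python
import random

def reverse_reachable_set(nodes, in_edges, p, rng):
    """Sample one RR set: pick a random target and collect all nodes that
    reach it through edges kept independently with probability p (IC model)."""
    target = rng.choice(nodes)
    rr, frontier = {target}, [target]
    while frontier:
        v = frontier.pop()
        for u in in_edges.get(v, []):
            if u not in rr and rng.random() < p:
                rr.add(u)
                frontier.append(u)
    return rr

def estimate_influence(seeds, nodes, in_edges, p, samples, seed=0):
    """Influence estimate: n times the fraction of RR sets hit by the seeds."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if seeds & reverse_reachable_set(nodes, in_edges, p, rng))
    return len(nodes) * hits / samples

# Tiny line graph 0 -> 1 -> 2 with deterministic edges (p = 1).
nodes = [0, 1, 2]
in_edges = {1: [0], 2: [1]}
inf_est = estimate_influence({0}, nodes, in_edges, p=1.0, samples=200)
```

SKIS's contribution, per the abstract, is to sample only non-singular cascades via importance sampling, which sharpens exactly this kind of estimator.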
In the encoder-decoder architecture for neural machine translation (NMT), the hidden states of the recurrent structures in the encoder and decoder carry crucial information about the sentence. These vectors are generated by parameters that are updated by back-propagating translation errors through time.
We argue that propagating errors through the end-to-end recurrent structures is not a direct way of controlling the hidden vectors.
In this paper, we propose to use word predictions as a mechanism for direct supervision.
More specifically, we require these vectors to be able to predict the vocabulary of the target sentence.
Our simple mechanism ensures better representations in the encoder and decoder without using any extra data or annotation.
It is also helpful in reducing the target side vocabulary and improving the decoding efficiency.
Experiments on Chinese-English and German-English machine translation tasks show improvements of 4.53 and 1.3 BLEU, respectively.
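The word-prediction idea can be illustrated as an auxiliary loss: a hidden vector is projected onto the vocabulary and penalized for assigning low probability to the words of the target sentence. The toy vocabulary, dimensions, and projection below are invented for illustration and do not reproduce the paper's exact formulation:

```python
import math

def softmax(logits):
    m = max(logits)                       # shift for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def word_prediction_loss(hidden, W, target_ids):
    """Auxiliary supervision: the hidden vector must predict every word that
    occurs in the target sentence. W is a |V| x d projection matrix; the loss
    is the summed cross-entropy over the target word set."""
    logits = [sum(wi * hi for wi, hi in zip(row, hidden)) for row in W]
    probs = softmax(logits)
    return -sum(math.log(probs[t]) for t in target_ids)

W = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]   # toy 3-word vocabulary, d = 2
loss = word_prediction_loss([0.5, -0.2], W, target_ids=[0, 2])
```

During training this term would be added to the usual translation loss, giving the encoder and decoder states a direct gradient signal about the target vocabulary.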
Objective image quality assessment (IQA) is imperative in the current multimedia-intensive world, in order to assess the visual quality of an image at close to a human level of ability.
Many parameters such as color intensity, structure, sharpness, contrast, presence of an object, etc., draw human attention to an image.
Psychological vision research suggests that human vision is biased to the center area of an image and display screen.
As a result, if the center part contains any visually salient information, it draws human attention even more and any distortion in that part will be better perceived than other parts.
To the best of our knowledge, previous IQA methods have not considered this fact.
In this paper, we propose a full reference image quality assessment (FR-IQA) approach using visual saliency and contrast; however, we give extra attention to the center by increasing the sensitivity of the similarity maps in that region.
We evaluated our method on three large-scale popular benchmark databases used by most current IQA researchers (TID2008, CSIQ and LIVE), containing a total of 3345 distorted images with 28 different kinds of distortions.
Our method is compared with 13 state-of-the-art approaches.
This comparison reveals the stronger correlation of our method with human-evaluated values.
The quality-prediction score is consistent for distortion-specific as well as distortion-independent cases.
Moreover, faster processing makes it applicable to any real-time application.
The MATLAB code is publicly available to test the algorithm and can be found online at http://layek.khu.ac.kr/CEQI.
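The center-bias idea can be illustrated with a Gaussian center prior that up-weights similarity-map values near the image center before pooling. The weighting function and sigma below are assumptions for illustration, not the paper's exact formulation:

```python
import math

def center_weights(h, w, sigma=0.5):
    """Gaussian center prior: pixels near the image center get higher weight,
    reflecting the center bias of human attention. sigma is relative to the
    image size, so the map scales with resolution."""
    cy, cx = (h - 1) / 2, (w - 1) / 2
    return [[math.exp(-(((y - cy) / (sigma * h)) ** 2 +
                        ((x - cx) / (sigma * w)) ** 2) / 2)
             for x in range(w)] for y in range(h)]

def weighted_pool(sim_map, weights):
    """Pool a per-pixel similarity map into a single quality score, giving
    distortions near the center more influence on the result."""
    num = sum(s * wt for row_s, row_w in zip(sim_map, weights)
              for s, wt in zip(row_s, row_w))
    den = sum(wt for row in weights for wt in row)
    return num / den

weights = center_weights(5, 5)
```

With this weighting, a distortion in the center of a similarity map lowers the pooled score more than the same distortion in a corner, which is the behavior the abstract motivates.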
This paper concerns a deep learning approach to relevance ranking in information retrieval (IR).
Existing deep IR models such as DSSM and CDSSM directly apply neural networks to generate ranking scores, without explicit understandings of the relevance.
According to the human judgement process, a relevance label is generated by the following three steps: 1) relevant locations are detected, 2) local relevances are determined, 3) local relevances are aggregated to output the relevance label.
In this paper we propose a new deep learning architecture, namely DeepRank, to simulate the above human judgment process.
Firstly, a detection strategy is designed to extract the relevant contexts.
Then, a measure network is applied to determine the local relevances by utilizing a convolutional neural network (CNN) or two-dimensional gated recurrent units (2D-GRU).
Finally, an aggregation network with sequential integration and term gating mechanism is used to produce a global relevance score.
DeepRank well captures important IR characteristics, including exact/semantic matching signals, proximity heuristics, query term importance, and diverse relevance requirement.
Experiments on both the benchmark LETOR dataset and a large-scale clickthrough dataset show that DeepRank can significantly outperform learning-to-rank methods and existing deep learning methods.
Monoidal computer is a categorical model of intensional computation, where many different programs correspond to the same input-output behavior.
The upshot of yet another model of computation is that a categorical formalism should provide a much needed high level language for theory of computation, flexible enough to allow abstracting away the low level implementation details when they are irrelevant, or taking them into account when they are genuinely needed.
A salient feature of the approach through monoidal categories is the formal graphical language of string diagrams, which supports visual reasoning about programs and computations.
In the present paper, we provide a coalgebraic characterization of monoidal computer.
It turns out that the availability of interpreters and specializers, which make a monoidal category into a monoidal computer, is equivalent to the existence of a *universal state space*, which carries a weakly final state machine for any pair of input and output types.
Being able to program state machines in monoidal computers allows us to represent Turing machines, to capture their execution, count their steps, as well as, e.g., the memory cells that they use.
The coalgebraic view of monoidal computer thus provides a convenient diagrammatic language for studying computability and complexity.
The idea of video super resolution is to use different view points of a single scene to enhance the overall resolution and quality.
Classical energy minimization approaches first establish a correspondence of the current frame to all its neighbors in some radius and then use this temporal information for enhancement.
In this paper, we propose the first variational super resolution approach that computes several super resolved frames in one batch optimization procedure by incorporating motion information between the high-resolution image frames themselves.
As a consequence, the number of motion estimation problems grows linearly in the number of frames, as opposed to the quadratic growth of classical methods, and temporal consistency is enforced naturally.
We use infimal convolution regularization as well as an automatic parameter balancing scheme to automatically determine the reliability of the motion information and reweight the regularization locally.
We demonstrate that our approach yields state-of-the-art results and even is competitive with machine learning approaches.
The Internet facilitates large-scale collaborative projects and the emergence of Web 2.0 platforms, where producers and consumers of content unify, has drastically changed the information market.
On the one hand, the promise of the "wisdom of the crowd" has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages.
On the other hand, the decentralized and often un-monitored environment of such projects may make them susceptible to low quality content.
In this work, we focus on Urban Dictionary, a crowd-sourced online dictionary.
We combine computational methods with qualitative annotation and shed light on the overall features of Urban Dictionary in terms of growth, coverage and types of content.
We measure a high presence of opinion-focused entries, as opposed to the meaning-focused entries that we expect from traditional dictionaries.
Furthermore, Urban Dictionary covers many informal, unfamiliar words as well as proper nouns.
Urban Dictionary also contains offensive content, but highly offensive content tends to receive lower scores through the dictionary's voting system.
The low threshold to include new material in Urban Dictionary enables quick recording of new words and new meanings, but the resulting heterogeneous content can pose challenges in using Urban Dictionary as a source to study language innovation.
Proliferation of touch-based devices has made sketch-based image retrieval practical.
While many methods exist for sketch-based object detection/image retrieval on small datasets, relatively less work has been done on large (web)-scale image retrieval.
In this paper, we present an efficient approach for image retrieval from millions of images based on user-drawn sketches.
Unlike existing methods for this problem, which are sensitive to even translation or scale variations, our method handles rotation, translation, scale (i.e. a similarity transformation) and small deformations.
The object boundaries are represented as chains of connected segments and the database images are pre-processed to obtain such chains that have a high chance of containing the object.
This is accomplished using two approaches in this work: a) extracting long chains in contour segment networks and b) extracting boundaries of segmented object proposals.
These chains are then represented by similarity-invariant variable length descriptors.
Descriptor similarities are computed by a fast Dynamic Programming-based partial matching algorithm.
This matching mechanism is used to generate a hierarchical k-medoids based indexing structure for the extracted chains of all database images in an offline process which is used to efficiently retrieve a small set of possible matched images for query chains.
Finally, a geometric verification step is employed to test geometric consistency of multiple chain matches to improve results.
Qualitative and quantitative results clearly demonstrate superiority of the approach over existing methods.
Generative neural models have recently achieved state-of-the-art results for constituency parsing.
However, without a feasible search procedure, their use has so far been limited to reranking the output of external parsers in which decoding is more tractable.
We describe an alternative to the conventional action-level beam search used for discriminative neural models that enables us to decode directly in these generative models.
We then show that by improving our basic candidate selection strategy and using a coarse pruning function, we can improve accuracy while exploring significantly less of the search space.
Applied to the model of Choe and Charniak (2016), our inference procedure obtains 92.56 F1 on section 23 of the Penn Treebank, surpassing prior state-of-the-art results for single-model systems.
Although the performance of commodity computers has improved drastically with the introduction of multicore processors and GPU computing, the standard R distribution is still based on a single-threaded model of computation, using only a small fraction of the computational power now available on most desktops and laptops.
Modern statistical software packages rely on high-performance implementations of the linear algebra routines that are at the core of several important leading-edge statistical methods.
In this paper we present a GPU implementation of the GMRES iterative method for solving linear systems.
We compare the performance of this implementation with a pure single-threaded CPU version.
We also investigate the performance of our implementation using different GPU packages available now for R such as gmatrix, gputools or gpuR which are based on CUDA or OpenCL frameworks.
In this research, a new indicator of disciplinarity-multidisciplinarity is developed, discussed and applied.
EBDI is based on the combination of the frequency distribution of subject categories of journals citing or cited by the unit of analysis, and the spread and diversity of the citations among subject categories measured with Shannon-Wiener entropy.
Its reproducibility, robustness and consistence are discussed.
Four of the combinations of its values, when applied to the cited and citing dimensions, lead to a suggested taxonomy of the role that the studied unit might play in the transformation of knowledge from different disciplines within the scientific communication system, and of its position with respect to a hypothetical thematic core of the discipline in which it has been classified.
The indicator is applied to the journals belonging to the first quartile of JCR-SSCI 2011 Library and Information Science and an indicator-based taxonomy is applied and discussed, pointing to differential thematic roles of the journals analyzed.
Recently, a number of existing blockchain systems have witnessed major bugs and vulnerabilities within smart contracts.
Although the literature features a number of proposals for securing smart contracts, these proposals mostly focus on proving the correctness or absence of a certain type of vulnerability within a contract, but cannot protect deployed (legacy) contracts from being exploited.
In this paper, we address this problem in the context of re-entrancy exploits and propose a novel smart contract security technology, dubbed Sereum (Secure Ethereum), which protects existing, deployed contracts against re-entrancy attacks in a backwards compatible way based on run-time monitoring and validation.
Sereum requires neither modification of nor semantic knowledge about existing contracts.
By means of implementation and evaluation using the Ethereum blockchain, we show that Sereum covers the actual execution flow of a smart contract to accurately detect and prevent attacks with a false positive rate as small as 0.06% and with negligible run-time overhead.
As a by-product, we develop three advanced re-entrancy attacks to demonstrate the limitations of existing offline vulnerability analysis tools.
Naive Bayes Nearest Neighbour (NBNN) is a simple and effective framework which addresses many of the pitfalls of K-Nearest Neighbour (KNN) classification.
It has yielded competitive results on several computer vision benchmarks.
Its central tenet is that, during NN search, a query should not be compared to every example in the database while ignoring class information.
Instead, NN searches are performed within each class, generating a score per class.
A key problem with NN techniques, including NBNN, is that they fail when the data representation does not capture perceptual (e.g. class-based) similarity.
NBNN circumvents this by using independent engineered descriptors (e.g. SIFT).
To extend its applicability outside of image-based domains, we propose to learn a metric which captures perceptual similarity.
Similar to how Neighbourhood Components Analysis optimizes a differentiable form of KNN classification, we propose "Class Conditional" metric learning (CCML), which optimizes a soft form of the NBNN selection rule.
Typical metric learning algorithms learn either a global or local metric.
However, our proposed method can be adjusted to a particular level of locality by tuning a single parameter.
An empirical evaluation on classification and retrieval tasks demonstrates that our proposed method clearly outperforms existing learned distance metrics across a variety of image and non-image datasets.
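The hard NBNN selection rule that CCML softens can be sketched as follows. This is a minimal plain-Python illustration with toy two-dimensional descriptors; the actual method additionally learns the metric under which these distances are computed.

```python
# Minimal NBNN classification sketch (illustrative): each class keeps a
# pool of local descriptors; a query image's descriptors are matched to
# their nearest neighbour *within each class*, and the class with the
# smallest total distance wins.

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nbnn_classify(query_descriptors, class_pools):
    """class_pools: dict mapping class label -> list of descriptors."""
    scores = {}
    for label, pool in class_pools.items():
        # per-class NN search: no cross-class comparisons are made
        scores[label] = sum(min(sq_dist(d, x) for x in pool)
                            for d in query_descriptors)
    return min(scores, key=scores.get)

pools = {"cat": [(0.0, 0.0), (0.1, 0.2)],
         "dog": [(1.0, 1.0), (0.9, 1.1)]}
print(nbnn_classify([(0.05, 0.1), (0.0, 0.2)], pools))  # cat
```

CCML replaces the hard `min` operations with differentiable soft versions so that the underlying metric can be optimized by gradient descent.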
This paper investigates how secure information sharing with external vendors can be achieved in an Industrial Internet of Things (IIoT).
It also identifies necessary security requirements for secure information sharing based on identified security challenges stated by the industry.
The paper then proposes a roadmap for improving security in IIoT which investigates both short-term and long-term solutions for protecting IIoT devices.
The short-term solution is mainly based on integrating existing good practices.
The paper also outlines a long term solution for protecting IIoT devices with fine-grained access control for sharing data between external entities that would support cloud-based data storage.
We present a simple algorithm for computing the document array given the string collection and its suffix array as input.
Our algorithm runs in linear time using constant workspace for large collections of short strings.
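For concreteness, the object being computed can be illustrated with the straightforward approach below; note that it uses linear extra workspace, unlike the paper's constant-workspace algorithm.

```python
# The document array DA[i] gives the document to which suffix SA[i] of the
# concatenated collection belongs. Straightforward computation
# (illustrative; the paper's algorithm avoids this extra workspace).

from bisect import bisect_right

def document_array(docs, sa, sep="$"):
    # ends[j] = position just past document j (including its separator)
    ends, pos = [], 0
    for d in docs:
        pos += len(d) + 1
        ends.append(pos)
    # bisect_right(ends, p) is the document containing text position p
    return [bisect_right(ends, p) for p in sa]

docs = ["ab", "c"]
text = "$".join(docs) + "$"
sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive suffix array
print(document_array(docs, sa))  # [1, 0, 0, 0, 1]
```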
We present a key recovery attack against Y. Wang's Random Linear Code Encryption (RLCE) scheme recently submitted to the NIST call for post-quantum cryptography.
This attack recovers the secret key for all the short key parameters proposed by the author.
Functioning is gaining recognition as an important indicator of global health, but remains under-studied in medical natural language processing research.
We present the first analysis of automatically extracting descriptions of patient mobility, using a recently-developed dataset of free text electronic health records.
We frame the task as a named entity recognition (NER) problem, and investigate the applicability of NER techniques to mobility extraction.
As text corpora focused on patient functioning are scarce, we explore domain adaptation of word embeddings for use in a recurrent neural network NER system.
We find that embeddings trained on a small in-domain corpus perform nearly as well as those learned from large out-of-domain corpora, and that domain adaptation techniques yield additional improvements in both precision and recall.
Our analysis identifies several significant challenges in extracting descriptions of patient mobility, including the length and complexity of annotated entities and high linguistic variability in mobility descriptions.
Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc.
Recently, by combining with policy gradient, Generative Adversarial Nets (GAN) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy has shown promising results in text generation.
However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process.
This limits its success when the generated text samples are long (more than 20 words).
In this paper, we propose a new framework, called LeakGAN, to address the problem for long text generation.
We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance.
The generator incorporates such informative signals into all generation steps through an additional Manager module, which takes the extracted features of current generated words and outputs a latent vector to guide the Worker module for next-word generation.
Our extensive experiments on synthetic data and various real-world tasks with Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios.
More importantly, without any supervision, LeakGAN would be able to implicitly learn sentence structures only through the interaction between Manager and Worker.
A thorough comprehension of image content demands a complex grasp of the interactions that may occur in the natural world.
One of the key issues is to describe the visual relationships between objects.
When dealing with real world data, capturing these very diverse interactions is a difficult problem.
It can be alleviated by incorporating common sense in a network.
For this, we propose a framework that makes use of semantic knowledge and estimates the relevance of object pairs during both training and test phases.
Extracted from precomputed models and training annotations, this information is distilled into the neural network dedicated to this task.
Using this approach, we observe a significant improvement on all classes of Visual Genome, a challenging visual relationship dataset.
A 68.5% relative gain on the recall at 100 is directly related to the relevance estimate and a 32.7% gain to the knowledge distillation.
There have been numerous studies on the problem of flocking control for multiagent systems whose simplified models are presented in terms of point-mass elements.
Meanwhile, full dynamic models pose some challenging problems in addressing the flocking control problem of mobile robots due to their nonholonomic dynamic properties.
Taking practical constraints into consideration, we propose a novel approach to distributed flocking control of nonholonomic mobile robots by bounded feedback.
The flocking control objectives consist of velocity consensus, collision avoidance, and cohesion maintenance among mobile robots.
A flocking control protocol which is based on the information of neighbor mobile robots is constructed.
The theoretical analysis is conducted with the help of a Lyapunov-like function and graph theory.
Simulation results are shown to demonstrate the efficacy of the proposed distributed flocking control scheme.
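The velocity-consensus objective alone can be illustrated with a toy point-mass iteration. This is only a sketch of the consensus idea; the paper's contribution is handling full nonholonomic dynamics with bounded feedback, which the snippet does not model.

```python
# Toy velocity-consensus iteration (point-mass simplification): each robot
# steers toward the average velocity of its neighbours on the graph.

def consensus_step(velocities, neighbours, gain=0.2):
    new = []
    for i, v in enumerate(velocities):
        avg = sum(velocities[j] for j in neighbours[i]) / len(neighbours[i])
        new.append(v + gain * (avg - v))
    return new

v = [1.0, 3.0, 5.0]
nbrs = {0: [1], 1: [0, 2], 2: [1]}   # three robots on a line graph
for _ in range(100):
    v = consensus_step(v, nbrs)
print(v)  # all three velocities are now (numerically) equal
```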
This paper describes three programming problems that are simple enough to be used in the beginning of a CS undergraduate program but also interesting enough to be worth exploring with different approaches.
We are able to apply a mixture of programming practices, abstraction and algebraic approaches to the problems, so that these subjects may be presented as complementary and allowing for clear and elegant solutions.
This work aims to assess the state of the art of data parallel deep neural network training, trying to identify potential research tracks to be exploited for performance improvement.
Besides, it presents the design of a practical C++ library dedicated to implementing and unifying the current state-of-the-art methodologies for parallel training in a performance-conscious framework, allowing users to explore novel strategies without departing significantly from their usual workflow.
Automatic License Plate Recognition (ALPR) has been a frequent topic of research due to many practical applications.
However, many of the current solutions are still not robust in real-world situations, commonly depending on many constraints.
This paper presents a robust and efficient ALPR system based on the state-of-the-art YOLO object detector.
The Convolutional Neural Networks (CNNs) are trained and fine-tuned for each ALPR stage so that they are robust under different conditions (e.g., variations in camera, lighting, and background).
Especially for character segmentation and recognition, we design a two-stage approach employing simple data augmentation tricks such as inverted License Plates (LPs) and flipped characters.
The resulting ALPR approach achieved impressive results in two datasets.
First, in the SSIG dataset, composed of 2,000 frames from 101 vehicle videos, our system achieved a recognition rate of 93.53% and 47 Frames Per Second (FPS), performing better than both Sighthound and OpenALPR commercial systems (89.80% and 93.03%, respectively) and considerably outperforming previous results (81.80%).
Second, targeting a more realistic scenario, we introduce a larger public dataset, called the UFPR-ALPR dataset, designed for ALPR.
This dataset contains 150 videos and 4,500 frames captured when both camera and vehicles are moving and also contains different types of vehicles (cars, motorcycles, buses and trucks).
In our proposed dataset, the trial versions of commercial systems achieved recognition rates below 70%.
On the other hand, our system performed better, with a recognition rate of 78.33% and 35 FPS.
Many computer vision applications, such as object recognition and segmentation, increasingly build on superpixels.
However, there have been so far few superpixel algorithms that systematically deal with noisy images.
We propose to first decompose the image into equal-sized rectangular patches, which also sets the maximum superpixel size.
Within each patch, a Potts model for simultaneous segmentation and denoising is applied, which guarantees connected and non-overlapping superpixels and also produces a denoised image.
The corresponding optimization problem is formulated as a mixed integer linear program (MILP), and solved by a commercial solver.
In extensive experiments on BSDS500 images corrupted with noise, we compare our method with other state-of-the-art superpixel methods.
Our method achieves the best result in terms of a combined score (OP) composed of the under-segmentation error, boundary recall and compactness.
Capabilities of inference and prediction are significant components of visual systems.
In this paper, we address an important and challenging task of them: visual path prediction.
Its goal is to infer the future path for a visual object in a static scene.
This task is complicated as it needs high-level semantic understandings of both the scenes and motion patterns underlying video sequences.
In practice, cluttered situations have also raised higher demands on the effectiveness and robustness of the considered models.
Motivated by these observations, we propose a deep learning framework which simultaneously performs deep feature learning for visual representation in conjunction with spatio-temporal context modeling.
After that, we propose a unified path planning scheme to make accurate future path prediction based on the analytic results of the context models.
The highly effective visual representation and deep context models ensure that our framework makes a deep semantic understanding of the scene and motion pattern, consequently improving the performance of the visual path prediction task.
In order to comprehensively evaluate the model's performance on the visual path prediction task, we construct two large benchmark datasets from the adaptation of video tracking datasets.
The qualitative and quantitative experimental results show that our approach outperforms the existing approaches and owns a better generalization capability.
Being able to automatically and quickly understand the user context during a session is a main issue for recommender systems.
As a first step toward achieving that goal, we propose a model that observes in real time the diversity brought by each item relative to a short sequence of consultations, corresponding to the recent user history.
Our model has a complexity in constant time, and is generic since it can apply to any type of items within an online service (e.g. profiles, products, music tracks) and any application domain (e-commerce, social network, music streaming), as long as we have partial item descriptions.
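One way to realize such a constant-time diversity observation is sketched below. The dissimilarity measure and window size are our illustrative choices, not necessarily the paper's exact formulation: the diversity an item brings is its average dissimilarity (one minus Jaccard similarity over partial item descriptions) to the last k consulted items, so each update is O(1) in the history length.

```python
# Illustrative sliding-window diversity monitor (assumed formulation):
# diversity of a new item = average (1 - Jaccard) against the last k items.

from collections import deque

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

class DiversityMonitor:
    def __init__(self, k=3):
        self.window = deque(maxlen=k)   # fixed-size recent history

    def consult(self, item_features):
        if not self.window:
            div = 0.0
        else:
            div = sum(1 - jaccard(item_features, past)
                      for past in self.window) / len(self.window)
        self.window.append(item_features)
        return div

m = DiversityMonitor(k=2)
print(m.consult({"rock", "90s"}))        # 0.0 -- empty history
print(m.consult({"rock", "2000s"}))      # 1 - 1/3, moderately diverse
print(m.consult({"classical", "piano"})) # 1.0 -- disjoint features
```

A sustained jump in the returned values, as in the last consultation, is the kind of signal the model uses to detect an implicit change of context.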
The observation of the diversity level over time allows us to detect implicit changes.
In the long term, we plan to characterize the context, i.e. to find common features among a contiguous sub-sequence of items between two changes of context determined by our model.
This will allow us to make context-aware and privacy-preserving recommendations, to explain them to users.
As this is an ongoing research, the first step consists here in studying the robustness of our model while detecting changes of context.
In order to do so, we use a music corpus of 100 users and more than 210,000 consultations (number of songs played in the global history).
We validate the relevancy of our detections by finding connections between changes of context and events, such as ends of session.
Of course, these events are a subset of the possible changes of context, since there might be several contexts within a session.
We altered the quality of our corpus in several manners, so as to test the performances of our model when confronted with sparsity and different types of items.
The results show that our model is robust and constitutes a promising approach.
Social graphs, representing online friendships among users, are one of the fundamental types of data for many applications, such as recommendation, virality prediction and marketing in social media.
However, this data may be unavailable due to the privacy concerns of users, or kept private by social network operators, which makes such applications difficult.
Inferring user interests and discovering user connections through their shared multimedia content has attracted more and more attention in recent years.
This paper proposes a Gaussian relational topic model for connection discovery using user shared images in social media.
The proposed model not only models user interests as latent variables through their shared images, but also considers the connections between users as a result of their shared images.
It explicitly relates user shared images to user connections in a hierarchical, systematic and supervisory way and provides an end-to-end solution for the problem.
This paper also derives efficient variational inference and learning algorithms for the posterior of the latent variables and model parameters.
It is demonstrated through experiments with over 200k images from Flickr that the proposed method significantly outperforms the methods in previous works.
Sharing data from various sources and of diverse kinds, and fusing them together for sophisticated analytics and mash-up applications are emerging trends, and are prerequisites for grand visions such as that of cyber-physical systems enabled smart cities.
Cloud infrastructure can enable such data sharing both because it can scale easily to an arbitrary volume of data and computation needs on demand, as well as because of natural collocation of diverse such data sets within the infrastructure.
However, in order to convince data owners that their data are well protected while being shared among cloud users, the cloud platform needs to provide flexible mechanisms for the users to express the constraints (access rules) subject to which the data should be shared, and likewise, enforce them effectively.
We study a comprehensive set of practical scenarios where data sharing needs to be enforced by methods such as aggregation, windowed frames, and value constraints, and observe that existing basic access control mechanisms do not provide adequate flexibility to enable effective data sharing in a secure and controlled manner.
In this paper, we thus propose a framework for the cloud that significantly extends the popular XACML model by integrating flexible access control decisions and data access in a seamless fashion.
We have prototyped the framework and deployed it on commercial cloud environment for experimental runs to test the efficacy of our approach and evaluate the performance of the implemented prototype.
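The flavor of rules that go beyond basic allow/deny can be sketched as follows. This is our own simplified model for illustration, not XACML syntax or the framework's actual API: a rule may grant raw access, or grant access only through an aggregate over a window, so the decision and the data transformation are made together.

```python
# Illustrative access rule evaluation (simplified, hypothetical model):
# a rule can release raw records, or only an aggregate over a window.

def evaluate(rule, request, records):
    if rule["subject"] != request["subject"]:
        return "Deny", None
    if rule["mode"] == "raw":
        return "Permit", records
    if rule["mode"] == "aggregate":
        window = records[-rule["window"]:]       # share only a window average
        return "Permit", sum(window) / len(window)
    return "Deny", None

readings = [10, 12, 11, 30, 29]                  # e.g. sensor data
rule = {"subject": "vendor", "mode": "aggregate", "window": 2}
decision, value = evaluate(rule, {"subject": "vendor"}, readings)
print(decision, value)  # Permit 29.5 -- raw readings stay hidden
```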
In this paper we present a method for automatically planning optimal paths for a group of robots that satisfy a common high level mission specification.
Each robot's motion in the environment is modeled as a weighted transition system.
The mission is given as a Linear Temporal Logic formula.
In addition, an optimizing proposition must repeatedly be satisfied.
The goal is to minimize the maximum time between satisfying instances of the optimizing proposition.
Our method is guaranteed to compute an optimal set of robot paths.
We utilize a timed automaton representation in order to capture the relative position of the robots in the environment.
We then obtain a bisimulation of this timed automaton as a finite transition system that captures the joint behavior of the robots and apply our earlier algorithm for the single robot case to optimize the group motion.
We present a simulation of a persistent monitoring task in a road network environment.
This paper describes a new method of data encoding which may be used in various modern digital, computer and telecommunication systems and devices.
The method permits the compression of data for storage or transmission, allowing the exact original data to be reconstructed without any loss of content.
The method is characterized by the simplicity of implementation, as well as high speed and compression ratio.
The method is based on a unique scheme of binary-ternary prefix-free encoding of characters of the original data.
This scheme does not require transmitting code tables from encoder to decoder; allows for a linear presentation of the code lists; permits the use of computable indexes of the prefix codes in a linear list for decoding; makes it possible to estimate the compression ratio prior to encoding; eliminates the need for multiplication, division, and floating-point operations; proves effective for both static and adaptive coding; is applicable to character sets of any size; and allows for repeated compression to improve the ratio.
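The paper's specific binary-ternary code is not reproduced here; the sketch below only illustrates the prefix-free property such schemes rely on, using an arbitrary example code: no codeword is a prefix of another, so a left-to-right scan decodes unambiguously without any delimiters.

```python
# Prefix-free coding illustration (generic example code, not the paper's
# binary-ternary scheme): decoding is a single unambiguous forward scan.

def is_prefix_free(codes):
    # in sorted order, a codeword and any extension of it are adjacent
    words = sorted(codes.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

def decode(bits, codes):
    inverse = {w: ch for ch, w in codes.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in inverse:        # match is unique by prefix-freeness
            out.append(inverse[cur])
            cur = ""
    return "".join(out)

codes = {"e": "0", "t": "10", "a": "110", "o": "111"}
print(is_prefix_free(codes))     # True
print(decode("0101100", codes))  # etae
```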
Many deep models have been recently proposed for anomaly detection.
This paper presents comparison of selected generative deep models and classical anomaly detection methods on an extensive number of non--image benchmark datasets.
We provide a statistical comparison of the selected models across many configurations, architectures and hyperparameters.
We arrive at the conclusion that the performance of the generative models is determined by the process of selecting their hyperparameters.
Specifically, the performance of the deep generative models deteriorates with a decreasing number of anomalous samples used in hyperparameter selection.
In practical scenarios of anomaly detection, none of the deep generative models systematically outperforms the kNN.
The use of floating bipolar electrodes in electrowinning cells of copper constitutes a nonconventional technology that promises economic and operational impacts.
This paper presents a computational tool for the simulation and analysis of such electrochemical cells.
A new model is developed for floating electrodes, and a finite-difference method is used to obtain the three-dimensional distribution of the potential and the current density field inside the cell.
The analysis of the results is based on a technique for the interactive visualization of three-dimensional vector fields as flow lines.
This paper presents an overview of an assembler driven verification methodology (ADVM) that was created and implemented for a chip card project at Infineon Technologies AG.
The primary advantage of this methodology is that it enables rapid porting of directed tests to new targets and derivatives, with only a minimum amount of code refactoring.
As a consequence, considerable verification development time and effort was saved.
This paper addresses the problem of localizing an unknown number of targets, all having the same radar signature, by a distributed MIMO radar consisting of single antenna transmitters and receivers that cannot determine directions of departure and arrival.
Furthermore, we consider the presence of multipath propagation, and the possible (correlated) blocking of the direct paths (going from the transmitter and reflecting off a target to the receiver).
In its most general form, this problem can be cast as a Bayesian estimation problem where every multipath component is accounted for.
However, when the environment map is unknown, this problem is ill-posed and hence, a tractable approximation is derived where only direct paths are accounted for.
In particular, we take into account the correlated blocking by scatterers in the environment which appears as a prior term in the Bayesian estimation framework.
A sub-optimal polynomial-time algorithm to solve the Bayesian multi-target localization problem with correlated blocking is proposed and its performance is evaluated using simulations.
We find that when correlated blocking is severe, assuming the blocking events to be independent with constant probability (as was done in previous papers) results in poor detection performance, with false alarms more likely to occur than detections.
Weakly supervised semantic segmentation and localization have a problem of focusing only on the most important parts of an image, since they use only image-level annotations.
In this paper, we solve this problem fundamentally via two-phase learning.
Our networks are trained in two steps.
In the first step, a conventional fully convolutional network (FCN) is trained to find the most discriminative parts of an image.
In the second step, the activations on the most salient parts are suppressed by inference conditional feedback, and then the second learning is performed to find the area of the next most important parts.
By combining the activations of both phases, the entire portion of the target object can be captured.
Our proposed training scheme is novel and can be utilized in well-designed techniques for weakly supervised semantic segmentation, salient region detection, and object location prediction.
Detailed experiments demonstrate the effectiveness of our two-phase learning in each task.
When a large collection of objects (e.g., robots, sensors, etc.) has to be deployed in a given environment, it is often required to plan a coordinated motion of the objects from their initial position to a final configuration enjoying some global property.
In such a scenario, the problem of minimizing some function of the distance travelled, and therefore energy consumption, is of vital importance.
In this paper we study several motion planning problems that arise when the objects must be moved on a graph, in order to reach certain goals which are of interest for several network applications.
Among the others, these goals include broadcasting messages and forming connected or interference-free networks.
We study these problems with the aim of minimizing a number of natural measures such as the average/overall distance travelled, the maximum distance travelled, or the number of objects that need to be moved.
In this respect, we provide several approximability and inapproximability results, most of which are tight.
This paper targets at learning to score the figure skating sports videos.
To address this task, we propose a deep architecture that includes two complementary components, i.e., Self-Attentive LSTM and Multi-scale Convolutional Skip LSTM.
These two components can efficiently learn the local and global sequential information in each video.
Furthermore, we present a large-scale figure skating sports video dataset -- FisV dataset.
This dataset includes 500 figure skating videos with the average length of 2 minutes and 50 seconds.
Each video is annotated with two scores given by nine different referees: the Total Element Score (TES) and the Total Program Component Score (PCS).
Our proposed model is validated on FisV and MIT-skate datasets.
The experimental results show the effectiveness of our models in learning to score the figure skating videos.
A deep learning network was used to predict future blood glucose levels, as this can permit diabetes patients to take action before imminent hyperglycaemia and hypoglycaemia.
A sequential model with one long short-term memory (LSTM) layer, one bidirectional LSTM layer and several fully connected layers was used to predict blood glucose levels for different prediction horizons.
The method was trained and tested on 26 datasets from 20 real patients.
The proposed network outperforms the baseline methods in terms of all evaluation criteria.
We propose and study a novel continuous space-time model for wireless networks which takes into account the stochastic interactions in both space through interference and in time due to randomness in traffic.
Our model consists of an interacting particle birth-death dynamics incorporating information-theoretic spectrum sharing.
Roughly speaking, particles (or more generally wireless links) arrive according to a Poisson Point Process on space-time, and stay for a duration governed by the local configuration of points present and then exit the network after completion of a file transfer.
We analyze this particle dynamics to derive an explicit condition for time ergodicity (i.e. stability) which is tight.
We also prove that when the dynamics is ergodic, the steady-state point process of links (or particles) exhibits a form of statistical clustering.
Based on the clustering, we propose a conjecture which we leverage to derive approximations, bounds and asymptotics on performance characteristics such as delay and mean number of links per unit-space in the stationary regime.
The mathematical analysis is combined with discrete event simulation to study the performance of this type of networks.
Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions.
There are two common scan orders for the variables: random scan and systematic scan.
Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan.
While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that the mixing times do not differ by more than a polynomial factor under mild conditions.
To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance.
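The two scan orders can be contrasted on a toy model. The sketch below uses a pair of correlated binary spins with joint distribution p(x, y) ∝ exp(J·x·y); both chains target this same distribution, differing only in how the variable to resample is chosen.

```python
# Gibbs sampling on a toy model p(x, y) ∝ exp(J * x * y), x, y in {-1, +1},
# contrasting systematic scan (fixed order) with random scan.

import math, random

J = 1.0

def cond_sample(other, rng):
    # p(v = +1 | other) from the model's conditional distribution
    p = 1.0 / (1.0 + math.exp(-2.0 * J * other))
    return 1 if rng.random() < p else -1

def gibbs(n_iters, scan, seed=0):
    rng = random.Random(seed)
    x, y = 1, 1
    agree = 0
    for _ in range(n_iters):
        if scan == "systematic":        # fixed order: x, then y
            x = cond_sample(y, rng)
            y = cond_sample(x, rng)
        else:                           # random scan: one variable per step
            if rng.random() < 0.5:
                x = cond_sample(y, rng)
            else:
                y = cond_sample(x, rng)
        agree += (x == y)
    return agree / n_iters

# Both chains estimate P(x == y) = e^J / (e^J + e^-J) ≈ 0.8808.
print(gibbs(20000, "systematic"), gibbs(20000, "random"))
```

On such small, well-behaved models the two orders give indistinguishable answers; the paper's counterexample shows their mixing times can nevertheless differ by more than a logarithmic factor.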
We present a technique for estimating the similarity between objects such as movies or foods whose proper representation depends on human perception.
Our technique combines a modest number of human similarity assessments to infer a pairwise similarity function between the objects.
This similarity function captures some human notion of similarity which may be difficult or impossible to automatically extract, such as which movie from a collection would be a better substitute when the desired one is unavailable.
In contrast to prior techniques, our method does not assume that all similarity questions on the collection can be answered or that all users perceive similarity in the same way.
When combined with a user model, we find how each assessor's tastes vary, affecting their perception of similarity.
Variation graphs, which represent genetic variation within a population, are replacing sequences as reference genomes.
Path indexes are one of the most important tools for working with variation graphs.
They generalize text indexes to graphs, allowing one to find the paths matching the query string.
We propose using de Bruijn graphs as path indexes, compressing them by merging redundant subgraphs, and encoding them with the Burrows-Wheeler transform.
The resulting fast, space-efficient, and versatile index is used in the variation graph toolkit vg.
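The core indexing idea can be illustrated with a toy, uncompressed version. This sketch is not the compressed Burrows-Wheeler encoding used in vg; it only shows how k-mers drawn from the graph's paths support path queries.

```python
# Toy de Bruijn path index (illustrative, uncompressed): index every k-mer
# occurring on the graph's paths; a query can match a path only if all of
# its k-mers are indexed.

def build_index(paths, k=3):
    kmers = set()
    for p in paths:
        for i in range(len(p) - k + 1):
            kmers.add(p[i:i + k])
    return kmers

def may_match(query, index, k=3):
    return all(query[i:i + k] in index
               for i in range(len(query) - k + 1))

paths = ["GATTACA", "GATCACA"]   # two haplotype paths through a graph
idx = build_index(paths)
print(may_match("ATTAC", idx))   # True: substring of the first path
print(may_match("ATGAC", idx))   # False: no path contains it
```

Note this containment test is only a necessary condition, since the matched k-mers could come from different paths; a real path index resolves matches exactly while compressing the redundant subgraphs shared between paths.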
The study of representations invariant to common transformations of the data is important to learning.
Most techniques have focused on local approximate invariance implemented within expensive optimization frameworks lacking explicit theoretical guarantees.
In this paper, we study kernels that are invariant to the unitary group while having theoretical guarantees in addressing practical issues such as (1) unavailability of transformed versions of labelled data and (2) not observing all transformations.
We present a theoretically motivated alternate approach to the invariant kernel SVM.
Unlike previous approaches to the invariant SVM, the proposed formulation solves both issues mentioned.
We also present a kernel extension of a recent technique to extract linear unitary-group invariant features addressing both issues and extend some guarantees regarding invariance and stability.
We present experiments on the UCI ML datasets to illustrate and validate our methods.
We present a method that learns to answer visual questions by selecting image regions relevant to the text-based query.
Our method exhibits significant improvements in answering questions such as "what color," where it is necessary to evaluate a specific location, and "what room," where it selectively identifies informative image regions.
Our model is tested on the VQA dataset which is the largest human-annotated visual question answering dataset to our knowledge.
This paper presents the InScript corpus (Narrative Texts Instantiating Script structure).
InScript is a corpus of 1,000 stories centered around 10 different scenarios.
Verbs and noun phrases are annotated with event and participant types, respectively.
Additionally, the text is annotated with coreference information.
The corpus shows rich lexical variation and will serve as a unique resource for the study of the role of script knowledge in natural language processing.
This paper focuses on the temporal aspect for recognizing human activities in videos; an important visual cue that has long been either disregarded or ill-used.
We revisit the conventional definition of an activity and restrict it to "Complex Action": a set of one-actions with a weak temporal pattern that serves a specific purpose.
Related works use spatiotemporal 3D convolutions with fixed kernel size, too rigid to capture the varieties in temporal extents of complex actions, and too short for long-range temporal modeling.
In contrast, we use multi-scale temporal convolutions, and we reduce the complexity of 3D convolutions.
The outcome is Timeception convolution layers, which reason about minute-long temporal patterns, a factor of 8 longer than the best related works.
As a result, Timeception achieves impressive accuracy in recognizing the human activities of Charades.
Further, we conduct an analysis demonstrating that Timeception learns long-range temporal dependencies and tolerates variations in the temporal extents of complex actions.
We introduce Mix&Match (M&M) - a training framework designed to facilitate rapid and effective learning in RL agents, especially those that would be too slow or too challenging to train otherwise.
The key innovation is a procedure that allows us to automatically form a curriculum over agents.
Through such a curriculum we can progressively train more complex agents by, effectively, bootstrapping from solutions found by simpler agents.
In contradistinction to typical curriculum learning approaches, we do not gradually modify the tasks or environments presented, but instead use a process to gradually alter how the policy is represented internally.
We show the broad applicability of our method by demonstrating significant performance gains in three different experimental setups: (1) We train an agent able to control more than 700 actions in a challenging 3D first-person task; using our method to progress through an action-space curriculum we achieve both faster training and better final performance than one obtains using traditional methods.
(2) We further show that M&M can be used successfully to progress through a curriculum of architectural variants defining an agent's internal state.
(3) Finally, we illustrate how a variant of our method can be used to improve agent performance in a multitask setting.
Recurrent neural network models with an attention mechanism have proven to be extremely effective on a wide variety of sequence-to-sequence problems.
However, the fact that soft attention mechanisms perform a pass over the entire input sequence when producing each element in the output sequence precludes their use in online settings and results in a quadratic time complexity.
Based on the insight that the alignment between input and output sequence elements is monotonic in many problems of interest, we propose an end-to-end differentiable method for learning monotonic alignments which, at test time, enables computing attention online and in linear time.
We validate our approach on sentence summarization, machine translation, and online speech recognition problems and achieve results competitive with existing sequence-to-sequence models.
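As an illustration of the linear-time test-time behavior, here is a minimal sketch of hard monotonic alignment decoding; the selection probabilities below are invented for illustration, whereas in the actual model they come from a learned energy function:

```python
# Sketch of test-time hard monotonic attention: for each output step, scan
# forward from the previously attended input index and stop at the first
# position whose selection probability exceeds 0.5. Because the scan never
# moves backwards, total work is linear in input plus output length.

def monotonic_attend(select_prob, num_outputs):
    """select_prob[i][j]: probability of stopping at input j for output i."""
    attended, j = [], 0
    for i in range(num_outputs):
        while j < len(select_prob[i]) - 1 and select_prob[i][j] < 0.5:
            j += 1  # advance monotonically; never revisit earlier inputs
        attended.append(j)
    return attended

probs = [[0.1, 0.9, 0.2, 0.3],   # output 0 stops at input 1
         [0.0, 0.2, 0.8, 0.1],   # output 1 resumes at input 1, stops at 2
         [0.0, 0.0, 0.1, 0.4]]   # output 2 never fires -> last input
print(monotonic_attend(probs, 3))  # [1, 2, 3]
```

The key property is that, unlike soft attention, no output step looks at the whole input sequence, which is what enables online use.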
The washing machine is a domestic necessity, as it frees us from the burden of washing clothes and saves ample time.
This paper covers the design and development of a fuzzy-logic-based smart washing machine.
The regular (timer-based) washing machine uses a multi-turn mechanical timer for its start-stop mechanism, which is prone to breakage.
In addition to its starting and stopping issues, the mechanical timer is not efficient with respect to maintenance and electricity usage.
Recent developments have shown that integrating digital electronics into the machine's core functionality is possible and is nowadays in practice.
A number of internationally renowned companies have developed machines that incorporate artificial intelligence.
Such a machine uses sensors to smartly calculate the run-time (washing time) of the main motor.
Real-time calculations and processing are also employed to optimize the machine's run-time.
The obvious result is smart time management, better electricity economy, and improved work efficiency.
This paper deals with the indigenization of an FLC (Fuzzy Logic Controller) based washing machine, capable of automating the inputs and producing the desired output (wash time).
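A minimal sketch of the kind of fuzzy inference involved, with hypothetical triangular membership functions, a four-rule base, and weighted-average defuzzification; the paper's actual FLC design may differ:

```python
# Illustrative fuzzy logic controller mapping dirtiness and load to a wash
# time. Membership functions, rules, and output times are all made up.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def wash_time(dirtiness, load):
    """Map dirtiness and load (both on a 0..10 scale) to minutes of washing."""
    # Fuzzify the two crisp inputs.
    d_low, d_high = tri(dirtiness, -1, 0, 6), tri(dirtiness, 4, 10, 11)
    l_low, l_high = tri(load, -1, 0, 6), tri(load, 4, 10, 11)
    # Rule base: (firing strength, consequent wash time in minutes).
    rules = [
        (min(d_low, l_low), 15),    # lightly soiled, small load -> short wash
        (min(d_low, l_high), 30),
        (min(d_high, l_low), 40),
        (min(d_high, l_high), 60),  # heavily soiled, full load -> long wash
    ]
    num = sum(w * t for w, t in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0

print(wash_time(0, 0), wash_time(10, 10))  # 15.0 60.0
```

Intermediate inputs fire several rules at once and the defuzzified wash time interpolates smoothly between the rule consequents.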
Generating graphs that are similar to real ones is an open problem, while the similarity notion is quite elusive and hard to formalize.
In this paper, we focus on sparse digraphs and propose SDG, an algorithm that aims at generating graphs similar to real ones.
Since real graphs are evolving and this evolution is important to study in order to understand the underlying dynamical system, we tackle the problem of generating series of graphs.
We propose SEDGE, an algorithm meant to generate series of graphs similar to a real series.
SEDGE is an extension of SDG.
We consider graphs that are representations of software programs and show experimentally that our approach outperforms other existing approaches.
Experiments show the performance of both algorithms.
In previous studies, much attention from multidisciplinary fields has been devoted to understanding the mechanisms underlying scholarly networks, including bibliographic, citation, and co-citation networks.
These works focus particularly on networks constructed from either author affinities or shared content.
They miss a valuable dimension of such networks: the audience of scholarly papers.
In this paper, we aim to assess the impact that social networks and media can have on scholarly papers.
We also examine the process of information flow in such networks.
Finally, we report some observations of interesting phenomena revealed by our proposed network model.
In wireless sensor networks (WSNs), security has a vital importance.
Recently, there has been huge interest in proposing security solutions for WSNs because of their applications in both civilian and military domains.
Adversaries can launch different types of attacks, and cryptography is used to counter them.
This paper presents challenges of security and a classification of the different possible attacks in WSNs.
The problems of security in each layer of the network's OSI model are discussed.
In this paper, we review multi-agent collective behavior algorithms in the literature and classify them according to their underlying mathematical structure.
For each mathematical technique, we identify the multi-agent coordination tasks it can be applied to, and we analyze its scalability, bandwidth use, and demonstrated maturity.
We highlight how versatile techniques such as artificial potential functions can be used for applications ranging from low-level position control to high-level coordination and task allocation, we discuss possible reasons for the slow adoption of complex distributed coordination algorithms in the field, and we highlight areas for further research and development.
In the last two decades, the number of Higher Education Institutions (HEIs) has grown by leaps and bounds.
This has caused cut-throat competition among these institutions in attracting students to seek admission.
To reach prospective students, institutions invest in advertisement.
Similarly, institutions in both developing and developed regions launch a range of services to attract students.
Most of these institutions operate in self-financed mode.
They therefore constantly feel short-handed in expenditure.
Nowadays, a number of advertisement methods are available.
It is therefore difficult for an institution to advertise through all modes and launch all services at the same time, owing to various constraints.
In this paper, we use the support and confidence method to find the best mode of advertisement.
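The support and confidence measures can be sketched as follows, on hypothetical admission "transactions"; the advertisement modes are invented for illustration:

```python
# Minimal support/confidence computation over itemset transactions.

def support(transactions, itemset):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(A union B) / support(A) for the rule A -> B."""
    return (support(transactions, set(antecedent) | set(consequent))
            / support(transactions, antecedent))

# Each transaction: advertisement modes/services linked to one admission.
data = [{"web_ad", "scholarship"}, {"web_ad", "tv_ad"},
        {"web_ad", "scholarship"}, {"tv_ad"}]
print(support(data, {"web_ad"}))                      # 0.75
print(confidence(data, {"web_ad"}, {"scholarship"}))  # ~0.67
```

Rules whose support and confidence exceed chosen thresholds then indicate the most effective advertisement modes.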
The position we advocate in this paper is that relational algebra can provide a unified language for both representing and computing with statistical-relational objects, much as linear algebra does for traditional single-table machine learning.
Relational algebra is implemented in the Structured Query Language (SQL), which is the basis of relational database management systems.
To support our position, we have developed the FACTORBASE system, which uses SQL as a high-level scripting language for statistical-relational learning of a graphical model structure.
The design philosophy of FACTORBASE is to manage statistical models as first-class citizens inside a database.
Our implementation shows how our SQL constructs in FACTORBASE facilitate fast, modular, and reliable program development.
Empirical evidence from six benchmark databases indicates that leveraging database system capabilities achieves scalable model structure learning.
Controller synthesis techniques for continuous systems with respect to temporal logic specifications typically use a finite-state symbolic abstraction of the system.
Constructing this abstraction for the entire system is computationally expensive, and does not exploit natural decompositions of many systems into interacting components.
We have recently introduced a new relation, called (approximate) disturbance bisimulation for compositional symbolic abstraction to help scale controller synthesis for temporal logic to larger systems.
In this paper, we extend the results to stochastic control systems modeled by stochastic differential equations.
Given any stochastic control system satisfying a stochastic version of the incremental input-to-state stability property and a positive error bound, we show how to construct a finite-state transition system (if one exists) that is disturbance bisimilar to the given stochastic control system.
Given a network of stochastic control systems, we give conditions on the simultaneous existence of disturbance bisimilar abstractions to every component allowing for compositional abstraction of the network system.
Given a set of data, biclustering aims at finding simultaneous partitions in biclusters of its samples and of the features which are used for representing the samples.
Consistent biclusterings make it possible to obtain correct classifications of the samples from the known classification of the features, and vice versa, and they are very useful for performing supervised classification.
The problem of finding consistent biclusterings can be seen as a feature selection problem, where the features that are not relevant for classification purposes are removed from the set of data, while the total number of features is maximized in order to preserve information.
This feature selection problem can be formulated as a linear fractional 0-1 optimization problem.
We propose a reformulation of this problem as a bilevel optimization problem, and we present a heuristic algorithm for an efficient solution of the reformulated problem.
Computational experiments show that the presented algorithm is able to find better solutions than those obtained by previously proposed heuristic algorithms.
The image description task has been invariably examined in a static manner, with qualitative presumptions held to be universally applicable regardless of the scope or target of the description.
In practice, however, different viewers may pay attention to different aspects of the image, and yield different descriptions or interpretations under various contexts.
Such diversity in perspectives is difficult to derive with conventional image description techniques.
In this paper, we propose a customized image narrative generation task, in which the users are interactively engaged in the generation process by providing answers to the questions.
We further attempt to learn the user's interest via repeating such interactive stages, and to automatically reflect the interest in descriptions for new images.
Experimental results demonstrate that our model can generate a variety of descriptions from a single image that cover a wider range of topics than conventional models, while being customizable to the target user of the interaction.
Multipath routing is a straightforward way to exploit path diversity and improve network throughput.
Technologies such as OSPF ECMP use all the available paths in the network to forward traffic; however, we argue that it is not necessary to do so to load-balance the network.
In this paper, we consider multipath routing with only a limited number of end-to-end paths for each source and destination, and find that this can still load-balance the traffic.
We devise an algorithm to select a few paths for each source-destination pair so that, when all traffic is forwarded over these paths, we achieve a balanced load in the sense that the maximum link utilization is comparable to that of ECMP forwarding.
When the constraint of using only shortest (i.e., equal-cost) paths is relaxed, we can even outperform ECMP in certain cases.
As a result, a few end-to-end tunnels between each source-destination pair suffice to load-balance the traffic.
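The balanced-load criterion can be sketched as a maximum-link-utilization computation over a few fixed tunnels; the topology, split ratios, and demand values below are illustrative only:

```python
# Toy computation of maximum link utilization when each demand is split over
# a small fixed set of end-to-end tunnels.

def max_link_utilization(capacity, tunnels, demands):
    """tunnels: demand id -> list of (path, share); a path is a link list."""
    load = {link: 0.0 for link in capacity}
    for d, paths in tunnels.items():
        for path, share in paths:
            for link in path:
                load[link] += demands[d] * share
    return max(load[l] / capacity[l] for l in capacity)

capacity = {"AB": 10.0, "AC": 10.0, "CB": 10.0}
# Demand A->B split 50/50 over the direct link and a two-hop path via C.
tunnels = {"A->B": [(["AB"], 0.5), (["AC", "CB"], 0.5)]}
print(max_link_utilization(capacity, tunnels, {"A->B": 8.0}))  # 0.4
```

A path-selection algorithm would search over candidate tunnel sets to drive this maximum utilization down toward the ECMP baseline.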
In this paper, we consider the problem of leveraging existing fully labeled categories to improve the weakly supervised detection (WSD) of new object categories, which we refer to as mixed supervised detection (MSD).
Different from previous MSD methods that directly transfer the pre-trained object detectors from existing categories to new categories, we propose a more reasonable and robust objectness transfer approach for MSD.
In our framework, we first learn domain-invariant objectness knowledge from the existing fully labeled categories.
The knowledge is modeled based on invariant features that are robust to the distribution discrepancy between the existing categories and new categories; therefore the resulting knowledge would generalize well to new categories and could assist detection models to reject distractors (e.g., object parts) in weakly labeled images of new categories.
Under the guidance of learned objectness knowledge, we utilize multiple instance learning (MIL) to model the concepts of both objects and distractors and to further improve the ability of rejecting distractors in weakly labeled images.
Our robust objectness transfer approach outperforms the existing MSD methods, and achieves state-of-the-art results on the challenging ILSVRC2013 detection dataset and the PASCAL VOC datasets.
Data analytics and data science play a significant role in today's society.
In the context of Smart Grids (SG), the collection of vast amounts of data has seen the emergence of a plethora of data analysis approaches.
In this paper, we conduct a Systematic Mapping Study (SMS) aimed at getting insights about different facets of SG data analysis: application sub-domains (e.g., power load control), aspects covered (e.g., forecasting), used techniques (e.g., clustering), tool-support, research methods (e.g., experiments/simulations), replicability/reproducibility of research.
The final goal is to provide a view of the current status of research.
Overall, we found that each sub-domain has its peculiarities in terms of techniques, approaches and research methodologies applied.
Simulations and experiments play a crucial role in many areas.
The replicability of studies is limited mainly by the lack of provided algorithm implementations and, to a lesser extent, by the use of private datasets.
Future advancements in robot autonomy and sophistication of robotics tasks rest on robust, efficient, and task-dependent semantic understanding of the environment.
Semantic segmentation is the problem of simultaneous segmentation and categorization of a partition of sensory data.
The majority of current approaches tackle this using multi-class segmentation and labeling in a Conditional Random Field (CRF) framework or by generating multiple object hypotheses and combining them sequentially.
In practical settings, the subset of semantic labels that is needed depends on the task and the particular scene, and labeling every single pixel is not always necessary.
We pursue these observations in developing a more modular and flexible approach to multi-class parsing of RGBD data based on learning strategies for combining independent binary object-vs-background segmentations in place of the usual monolithic multi-label CRF approach.
Parameters for the independent binary segmentation models can be learned very efficiently, and the combination strategy---learned using reinforcement learning---can be set independently and can vary over different tasks and environments.
Accuracy is comparable to state-of-art methods on a subset of the NYU-V2 dataset of indoor scenes, while providing additional flexibility and modularity.
Understanding driving situations regardless of the conditions of the traffic scene is a cornerstone on the path towards autonomous vehicles; however, although common sensor setups already include complementary devices such as LiDAR or radar, most research on perception systems has traditionally focused on computer vision.
We present a LiDAR-based 3D object detection pipeline comprising three stages.
First, laser information is projected into a novel cell encoding for bird's eye view projection.
Later, both object location on the plane and its heading are estimated through a convolutional neural network originally designed for image processing.
Finally, 3D oriented detections are computed in a post-processing phase.
Experiments on the KITTI dataset show that the proposed framework achieves state-of-the-art results among comparable methods.
Further tests with different LiDAR sensors in real scenarios assess the multi-device capabilities of the approach.
Digital image forensics is a young but maturing field, encompassing key areas such as camera identification, detection of forged images, and steganalysis.
However, large gaps exist between academic results and applications used by practicing forensic analysts.
To move academic discoveries closer to real-world implementations, it is important to use data that represent "in the wild" scenarios.
For detection of stego images created from steganography apps, images generated from those apps are ideal to use.
In this paper, we present our work to perform steg detection on images from mobile apps using two different approaches: "signature" detection, and machine learning methods.
A principal challenge of the ML task is to create a great many stego images from different apps at certain embedding rates.
One of our main contributions is a procedure for generating a large image database by using Android emulators and reverse engineering techniques.
We develop algorithms and tools for signature detection on stego apps, and provide solutions to issues encountered when creating ML classifiers.
No two people are alike.
We usually ignore this diversity as we have the capability to adapt and, without noticing, become experts in interfaces that were probably misadjusted to begin with.
This adaptation is not always at the user's reach.
One neglected group is the blind.
Spatial ability, memory, and tactile sensitivity are some characteristics that diverge between users.
Regardless, all are presented with the same methods ignoring their capabilities and needs.
Interaction with mobile devices is highly visually demanding, which widens the gap for blind people.
Our research goal is to identify the individual attributes that influence mobile interaction, considering the blind, and match them with mobile interaction modalities in a comprehensive and extensible design space.
We aim to provide knowledge for device design, device prescription, and interface adaptation.
State-of-the-art deep reading comprehension models are dominated by recurrent neural nets.
Their sequential nature is a natural fit for language, but it also precludes parallelization within an instance and often becomes the bottleneck when deploying such models to latency-critical scenarios.
This is particularly problematic for longer texts.
Here we present a convolutional architecture as an alternative to these recurrent architectures.
Using simple dilated convolutional units in place of recurrent ones, we achieve results comparable to the state of the art on two question answering tasks, while at the same time achieving up to two orders of magnitude speedups for question answering.
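A minimal sketch of the dilated convolution primitive underlying this kind of architecture (pure Python, 1-d, "valid" padding); the actual model applies such units over learned representations, and stacking them grows the receptive field exponentially with depth:

```python
# 1-d dilated convolution: taps are spaced `dilation` positions apart, so a
# small kernel covers a wide input span without extra parameters.

def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1-d convolution with gaps of size `dilation` between taps."""
    k = len(kernel)
    span = (k - 1) * dilation  # input positions covered beyond the first tap
    return [sum(kernel[t] * x[i + t * dilation] for t in range(k))
            for i in range(len(x) - span)]

x = [1, 2, 3, 4, 5, 6]
print(dilated_conv1d(x, [1, 1], 1))  # adjacent pairs: [3, 5, 7, 9, 11]
print(dilated_conv1d(x, [1, 1], 2))  # taps 2 apart:   [4, 6, 8, 10]
```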
Why deep neural networks (DNNs) capable of overfitting often generalize well in practice is a mystery in deep learning.
Existing works indicate that this observation holds for both complicated real datasets and simple datasets of one-dimensional (1-d) functions.
In this work, for fitting low-frequency-dominant 1-d functions, memorizing natural images, and solving classification problems, we empirically find that a DNN (a fully-connected network or a convolutional neural network with common settings) first quickly captures the dominant low-frequency components, and then relatively slowly captures the high-frequency ones.
We call this phenomenon the Frequency Principle (F-Principle).
In our experiments, the F-Principle can be observed across various DNN setups with different activation functions, layer structures, and training algorithms.
The F-Principle can be used to understand (i) the behavior of DNN training in the information plane and (ii) why DNNs often generalize well despite their capacity to overfit.
The F-Principle can potentially provide insights into the general principles underlying DNN optimization and generalization on real datasets.
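The frequency-wise comparison behind this kind of analysis can be sketched with a DFT; here the early-training model is simulated by a fit that captures only the low-frequency component, which is an assumption for illustration rather than an actual trained DNN:

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of each DFT coefficient, normalized by the sample count."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))) / n
            for k in range(n // 2 + 1)]

n = 64
xs = [2 * math.pi * t / n for t in range(n)]
target = [math.sin(x) + 0.5 * math.sin(8 * x) for x in xs]
early = [math.sin(x) for x in xs]   # stands in for an early-training DNN fit
err = [abs(a - b) for a, b in zip(dft_magnitudes(target), dft_magnitudes(early))]
print(err[1] < 1e-9, err[8] > 0.1)  # low frequency matched, high not yet
```

Tracking such per-frequency errors over training iterations is one way to make the low-to-high capturing order visible.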
Named Entity Recognition (NER) is a key NLP task, which is all the more challenging on Web and user-generated content with their diverse and continuously changing language.
This paper aims to quantify how this diversity impacts state-of-the-art NER methods, by measuring named entity (NE) and context variability, feature sparsity, and their effects on precision and recall.
In particular, our findings indicate that NER approaches struggle to generalise in diverse genres with limited training data.
Unseen NEs in particular play an important role; they have a higher incidence in diverse genres such as social media than in more regular genres such as newswire.
Coupled with a higher incidence of unseen features more generally and the lack of large training corpora, this leads to significantly lower F1 scores for diverse genres as compared to more regular ones.
We also find that leading systems rely heavily on surface forms found in training data, having problems generalising beyond these, and offer explanations for this observation.
This paper presents a new connection between the generalized Marcum-Q function and the confluent hypergeometric function of two variables, phi3.
This result is then applied to the closed-form characterization of the bivariate Nakagami-m distribution and of the distribution of the minimum eigenvalue of correlated non-central Wishart matrices, both important in communication theory.
New expressions for the corresponding cumulative distributions are obtained and a number of communication-theoretic problems involving them are pointed out.
We propose a method to leapfrog pixel-wise, semantic segmentation of (aerial) images and predict objects in a vector representation directly.
PolyMapper predicts maps of cities from aerial images as collections of polygons with a learnable framework.
Instead of the usual multi-step procedure of semantic segmentation, shape improvement, conversion to polygons, and polygon refinement, our approach learns mappings with a single network architecture and directly outputs maps.
We demonstrate that our method is capable of drawing polygons of buildings and road networks that very closely approximate the structure of existing online maps such as OpenStreetMap, and it does so in a fully automated manner.
Validation on existing and novel large-scale datasets of several cities shows that our approach achieves good levels of performance.
The modeling of cascade processes in multi-agent systems in the form of complex networks has in recent years become an important topic of study due to its many applications: the adoption of commercial products, spread of disease, the diffusion of an idea, etc.
In this paper, we begin by identifying a desiderata of seven properties that a framework for modeling such processes should satisfy: the ability to represent attributes of both nodes and edges, an explicit representation of time, the ability to represent non-Markovian temporal relationships, representation of uncertain information, the ability to represent competing cascades, allowance of non-monotonic diffusion, and computational tractability.
We then present the MANCaLog language, a formalism based on logic programming that satisfies all these desiderata, and focus on algorithms for finding minimal models (from which the outcome of cascades can be obtained) as well as how this formalism can be applied in real world scenarios.
We are not aware of any other formalism in the literature that meets all of the above requirements.
Embarrassingly parallel problems can be split into parts characterized by a really low (or sometimes absent) exchange of information during their parallel computation.
As a consequence they can be effectively computed in parallel exploiting commodity hardware, hence without particularly sophisticated interconnection networks.
Basically, this means Clusters, Networks of Workstations and Desktops as well as Computational Clouds.
Despite the simplicity of this computational model, it can be exploited to compute a quite large range of problems.
This paper describes JJPF, a tool for developing task-parallel applications based on Java and Jini, which has proven to be an effective and efficient solution in environments like clusters and networks of workstations and desktops.
Recent work has shown that fast, compact low-bitwidth neural networks can be surprisingly accurate.
These networks use homogeneous binarization: all parameters in each layer or (more commonly) the whole model have the same low bitwidth (e.g., 2 bits).
However, modern hardware allows efficient designs where each arithmetic instruction can have a custom bitwidth, motivating heterogeneous binarization, where every parameter in the network may have a different bitwidth.
In this paper, we show that it is feasible and useful to select bitwidths at the parameter granularity during training.
For instance, a heterogeneously quantized version of modern networks such as AlexNet and MobileNet, with the right mix of 1-, 2-, and 3-bit parameters averaging just 1.4 bits, can equal the accuracy of homogeneous 2-bit versions of these networks.
Further, we provide analyses to show that the heterogeneously binarized systems yield FPGA- and ASIC-based implementations that are correspondingly more efficient in both circuit area and energy efficiency than their homogeneous counterparts.
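As a sanity check of the bit-averaging arithmetic, here is one possible mix of 1-, 2- and 3-bit parameters that averages 1.4 bits; the fractions are our own illustrative choice, not taken from the paper:

```python
# Average bitwidth of a heterogeneous mix of quantized parameters.

def average_bitwidth(fractions):
    """fractions: bitwidth -> fraction of parameters at that bitwidth."""
    assert abs(sum(fractions.values()) - 1.0) < 1e-9
    return sum(bits * frac for bits, frac in fractions.items())

mix = {1: 0.7, 2: 0.2, 3: 0.1}  # 70% 1-bit, 20% 2-bit, 10% 3-bit
print(round(average_bitwidth(mix), 2))  # 1.4
```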
Semi-autonomous vehicles are increasingly serving critical functions in various settings from mining to logistics to defence.
A key characteristic of such systems is the presence of the human (drivers) in the control loop.
To ensure safety, the driver needs to be aware of both the autonomous aspects of the vehicle and the automated features built to enable safer control.
In this paper we propose a framework to combine empirical models describing human behaviour with the environment and system models.
We then analyse, via model checking, interaction between the models for desired safety properties.
The aim is to analyse the design for safe vehicle-driver interaction.
We demonstrate the applicability of our approach using a case study involving semi-autonomous vehicles, where driver fatigue is a factor critical to a safe journey.
We address the recognition of agent-in-place actions, which are associated with agents who perform them and places where they occur, in the context of outdoor home surveillance.
We introduce a representation of the geometry and topology of scene layouts so that a network can generalize from the layouts observed in the training set to unseen layouts in the test set.
This Layout-Induced Video Representation (LIVR) abstracts away low-level appearance variance and encodes geometric and topological relationships of places in a specific scene layout.
LIVR partitions the semantic features of a video clip into different places to force the network to learn place-based feature descriptions; to predict the confidence of each action, LIVR aggregates features from the place associated with an action and its adjacent places on the scene layout.
We introduce the Agent-in-Place Action dataset to show that our method allows neural network models to generalize significantly better to unseen scenes.
Molecular communication is an expanding body of research.
Recent advances in biology have encouraged using genetically engineered bacteria as the main component in molecular communication.
This has stimulated a new line of research that attempts to study molecular communication among bacteria from an information-theoretic point of view.
Due to high randomness in the individual behavior of the bacterium, reliable communication between two bacteria is almost impossible.
Therefore, we recently proposed that a population of bacteria in a cluster be considered a node capable of molecular transmission and reception.
This proposition enables us to form a reliable node out of many unreliable bacteria.
The bacteria inside a node sense the environment and respond accordingly.
In this paper, we study the communication between two nodes, one acting as the transmitter and the other as the receiver.
We consider the case in which the information is encoded in the concentration of molecules by the transmitter.
The molecules produced by the bacteria in the transmitter node propagate in the environment via the diffusion process.
The bacteria in the receiver node then sense this concentration and decode the information.
The randomness in the communication is caused by both the error in the molecular production at the transmitter and the reception of molecules at the receiver.
We study the theoretical limits of the information transfer rate in such a setup versus the number of bacteria per node.
Finally, we consider M-ary modulation schemes and study the achievable rates and their error probabilities.
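A textbook sketch of M-ary amplitude (concentration) modulation with midpoint decision thresholds under additive Gaussian noise; this is a generic model for illustration, not the paper's exact molecular channel:

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def symbol_error_prob(m, spacing, sigma):
    """M equally spaced levels with midpoint thresholds and noise std sigma."""
    # Interior symbols can err on both sides; the two edge symbols on one.
    return 2 * (m - 1) / m * q_func(spacing / (2 * sigma))

p4 = symbol_error_prob(4, 1.0, 0.1)  # 4 levels, wide spacing
p8 = symbol_error_prob(8, 0.5, 0.1)  # 8 levels packed into the same range
print(p4 < p8)  # True: packing more symbols raises the error probability
```

This captures the basic trade-off studied in such setups: higher-order modulation raises the achievable rate but degrades the error probability unless the effective noise (here, driven by the number of bacteria per node) is reduced.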
Many projects have applied knowledge patterns (KPs) to the retrieval of specialized information.
Yet terminologists still rely on manual analysis of concordance lines to extract semantic information, since there are no user-friendly publicly available applications enabling them to find knowledge rich contexts (KRCs).
To fill this void, we have created the KP-based EcoLexicon Semantic SketchGrammar (ESSG) in the well-known corpus query system Sketch Engine.
For the first time, the ESSG is now publicly available in Sketch Engine to query the EcoLexicon English Corpus.
Additionally, reusing the ESSG in any English corpus uploaded by the user enables Sketch Engine to extract KRCs codifying generic-specific, part-whole, location, cause and function relations, because most of the KPs are domain-independent.
The information is displayed in the form of summary lists (word sketches) containing the pairs of terms linked by a given semantic relation.
This paper describes the process of building a KP-based sketch grammar with special focus on the last stage, namely, the evaluation with refinement purposes.
We conducted an initial shallow precision and recall evaluation of the 64 English sketch grammar rules created so far for hyponymy, meronymy and causality.
Precision was measured based on a random sample of concordances extracted from each word sketch type.
Recall was assessed based on a random sample of concordances where known term pairs are found.
The results are necessary for the improvement and refinement of the ESSG.
The noise of false positives helped us further specify the rules, whereas the silence of false negatives allowed us to find useful new patterns.
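The shallow precision/recall protocol can be sketched as follows; the concordance labels and term pairs below are invented for illustration:

```python
# Precision over a sample of extracted concordances, recall over a sample of
# concordances known to contain valid term pairs.

def precision(sampled_extractions):
    """sampled_extractions: booleans, True = extraction encodes the relation."""
    return sum(sampled_extractions) / len(sampled_extractions)

def recall(known_pair_sample, retrieved):
    """Fraction of known term pairs the grammar rules actually retrieved."""
    return (sum(pair in retrieved for pair in known_pair_sample)
            / len(known_pair_sample))

sample = [True, True, False, True]              # 3 of 4 sampled hits correct
known = [("oak", "tree"), ("delta", "landform")]
found = {("oak", "tree")}
print(precision(sample), recall(known, found))  # 0.75 0.5
```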
Automating the detection of anomalous events within long video sequences is challenging due to the ambiguity of how such events are defined.
We approach the problem by learning generative models that can identify anomalies in videos using limited supervision.
We propose end-to-end trainable composite Convolutional Long Short-Term Memory (Conv-LSTM) networks that are able to predict the evolution of a video sequence from a small number of input frames.
Regularity scores are derived from the reconstruction errors of a set of predictions with abnormal video sequences yielding lower regularity scores as they diverge further from the actual sequence over time.
The models utilize a composite structure and examine the effects of conditioning in learning more meaningful representations.
The best model is chosen based on the reconstruction and prediction accuracy.
The Conv-LSTM models are evaluated both qualitatively and quantitatively, demonstrating competitive results on anomaly detection datasets.
Conv-LSTM units are shown to be an effective tool for modeling and predicting video sequences.
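One common way to turn per-frame reconstruction errors into regularity scores is min-max normalization; a minimal sketch follows (the exact formulation used by the model may differ):

```python
# Regularity scores normalized to [0, 1]: frames whose reconstruction error
# is highest (likely anomalies) receive scores near 0.

def regularity_scores(errors):
    lo, hi = min(errors), max(errors)
    return [1.0 - (e - lo) / (hi - lo) for e in errors]

frame_errors = [2.0, 2.5, 9.0, 8.5, 3.0]   # spike = likely anomalous frames
scores = regularity_scores(frame_errors)
print([round(s, 2) for s in scores])        # [1.0, 0.93, 0.0, 0.07, 0.86]
```

Thresholding the resulting score sequence then flags the anomalous segments of the video.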
Active learning identifies data points to label that are expected to be the most useful in improving a supervised model.
Opportunistic active learning incorporates active learning into interactive tasks that constrain possible queries during interactions.
Prior work has shown that opportunistic active learning can be used to improve grounding of natural language descriptions in an interactive object retrieval task.
In this work, we use reinforcement learning for such an object retrieval task, to learn a policy that effectively trades off task completion with model improvement that would benefit future tasks.
I present a method for lossy transform coding of digital audio that uses the Weyl symbol calculus for constructing the encoding and decoding transformation.
The method establishes a direct connection between a time-frequency representation of the signal dependent threshold of masked noise and the encode/decode pair.
The formalism also offers a time-frequency measure of perceptual entropy.
Despite the recent deep learning (DL) revolution, kernel machines still remain powerful methods for action recognition.
DL has brought the use of large datasets, which is typically a problem for kernel approaches, since these do not scale up efficiently due to the cost of the kernel Gram matrices.
Nevertheless, kernel methods remain attractive and more generally applicable, since they can handle datasets of any size equally well, including cases where DL techniques show some limitations.
This work investigates these issues by proposing an explicit approximated representation that, together with a linear model, is an equivalent, yet scalable, implementation of a kernel machine.
Our approximation is directly inspired by the exact feature map that is induced by an RBF Gaussian kernel but, unlike the latter, it is finite dimensional and very compact.
We justify the soundness of our idea with a theoretical analysis which proves the unbiasedness of the approximation, and provides a vanishing bound for its variance, which is shown to decrease much more rapidly than in alternative methods in the literature.
In a broad experimental validation, we assess the superiority of our approximation in terms of 1) ease and speed of training, 2) compactness of the model, and 3) improvements with respect to the state-of-the-art performance.
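The paper's compact feature map is not reproduced here, but the general idea of replacing an RBF Gaussian kernel with an explicit finite-dimensional map can be sketched with random Fourier features, a related but distinct classical construction; all names and parameters below are illustrative.

```python
import numpy as np

def rff_map(X, D, gamma, rng):
    """Random Fourier feature map approximating an RBF Gaussian kernel:

        k(x, y) = exp(-gamma * ||x - y||^2)  ~=  z(x) . z(y)

    X: (n, d) data matrix; D: number of random features.
    """
    n, d = X.shape
    # Frequencies drawn from the kernel's spectral density N(0, 2*gamma I).
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```

A linear model trained on `z(x)` then behaves like an approximate kernel machine, while training cost scales linearly in the number of samples rather than quadratically through a Gram matrix.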
We introduce an exact reformulation of a broad class of neighborhood filters, among which the bilateral filters, in terms of two functional rearrangements: the decreasing and the relative rearrangements.
Independently of the image spatial dimension (one-dimensional signal, image, volume of images, etc.), we reformulate these filters as integral operators defined in a one-dimensional space corresponding to the level set measures.
We prove the equivalence between the usual pixel-based version and the rearranged version of the filter.
When restricted to the discrete setting, our reformulation of bilateral filters extends previous results for the so-called fast bilateral filtering.
We, in addition, prove that the solution of the discrete setting, understood as constant-wise interpolators, converges to the solution of the continuous setting.
Finally, we numerically illustrate computational aspects concerning quality approximation and execution time provided by the rearranged formulation.
Cellular Automata (CA) have been considered among the most prominent parallel computational tools in the recent era of nature- and bio-inspired computing.
Taking advantage of their local connectivity, the simplicity of their design and their inherent parallelism, CA can be effectively applied to many image processing tasks.
In this paper, a CA approach for efficient salt-and-pepper noise filtering in grayscale images is presented.
Using a 2D Moore neighborhood, the classified "noisy" cells are corrected by averaging the non-noisy neighboring cells.
While keeping the computational burden really low, the proposed approach succeeds in removing high-noise levels from various images and yields promising qualitative and quantitative results, compared to state-of-the-art techniques.
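A minimal sketch of the rule described above, correcting cells flagged as noisy by averaging their non-noisy Moore neighbours; the extreme-value noise test and function name are assumptions for illustration.

```python
import numpy as np

def ca_denoise_step(img, low=0, high=255):
    """One CA update: cells at the extreme values (salt/pepper) are
    replaced by the mean of their non-noisy 2D Moore neighbours."""
    out = img.astype(float).copy()
    noisy = (img == low) | (img == high)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            if not noisy[i, j]:
                continue  # non-noisy cells are left untouched
            # Moore neighbourhood: the 3x3 window around (i, j), clipped at borders.
            window = img[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            good = ~((window == low) | (window == high))
            if good.any():
                out[i, j] = window[good].mean()
    return out.astype(img.dtype)
```

Because each cell's update depends only on its local neighbourhood, the step is trivially parallelizable, which is precisely the appeal of the CA formulation.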
We present the computation of the extended gcd of two quadratic integers.
The ring of integers considered is principal, but may or may not be Euclidean.
The method relies on principal ideal rings and on the reduction of binary quadratic forms.
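In the Euclidean case Z[i] (the Gaussian integers), the extended gcd can already be computed with nearest-integer division; the sketch below illustrates that special case only and does not cover the non-Euclidean principal rings handled via quadratic-form reduction.

```python
def nearest_quotient(a, b):
    """Nearest Gaussian-integer quotient of a / b (complex numbers with
    integer real and imaginary parts, b != 0)."""
    q = a / b
    return complex(round(q.real), round(q.imag))

def xgcd_gaussian(a, b):
    """Extended Euclidean algorithm in Z[i]: returns (g, x, y) with
    g = a*x + b*y, where g is a gcd of a and b (unique up to the
    units 1, -1, i, -i)."""
    x0, x1, y0, y1 = 1, 0, 0, 1
    while b != 0:
        q = nearest_quotient(a, b)
        a, b = b, a - q * b          # remainder has strictly smaller norm
        x0, x1 = x1, x0 - q * x1     # update Bezout coefficients
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0
```

Nearest-integer division keeps the remainder's norm below that of the divisor, so the loop terminates, which is exactly the Euclidean property that fails in the more general rings the paper targets.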
Making personalized and context-aware suggestions of venues to the users is very crucial in venue recommendation.
These suggestions are often based on matching the venues' features with the users' preferences, which can be collected from previously visited locations.
In this paper we present a novel user-modeling approach which relies on a set of scoring functions for making personalized suggestions of venues based on venues content and reviews as well as users context.
Our experiments, conducted on the dataset of the TREC Contextual Suggestion Track, show that our methodology outperforms state-of-the-art approaches by a significant margin.
We propose a novel approach for deformation-aware neural networks that learn the weighting and synthesis of dense volumetric deformation fields.
Our method specifically targets the space-time representation of physical surfaces from liquid simulations.
Liquids exhibit highly complex, non-linear behavior under changing simulation conditions such as different initial conditions.
Our algorithm captures these complex phenomena in two stages: a first neural network computes a weighting function for a set of pre-computed deformations, while a second network directly generates a deformation field for refining the surface.
Key for successful training runs in this setting is a suitable loss function that encodes the effect of the deformations, and a robust calculation of the corresponding gradients.
To demonstrate the effectiveness of our approach, we showcase our method with several complex examples of flowing liquids with topology changes.
Our representation makes it possible to rapidly generate the desired implicit surfaces.
We have implemented a mobile application to demonstrate that real-time interactions with complex liquid effects are possible with our approach.
Effective emergency and natural disaster management depends on efficient mission-critical voice and data communication between first responders and victims.
Land Mobile Radio System (LMRS) is a legacy narrowband technology used for critical voice communications with limited use for data applications.
Recently, Long Term Evolution (LTE) has emerged as a broadband communication technology that has the potential to transform the capabilities of public safety technologies by providing broadband, ubiquitous, and mission-critical voice and data support.
For example, in the United States, FirstNet is building a nationwide coast-to-coast public safety network based on LTE broadband technology.
This paper presents a comparative survey of legacy and the LTE-based public safety networks, and discusses the LMRS-LTE convergence as well as mission-critical push-to-talk over LTE.
A simulation study of LMRS and LTE band class 14 technologies is provided using the NS-3 open source tool.
An experimental study of APCO-25 and LTE band class 14 is also conducted using software-defined radio, to enhance the understanding of the public safety systems.
Finally, emerging technologies that may have strong potential for use in public safety networks are reviewed.
Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling.
The hidden layers in RNNs can be regarded as the memory units, which are helpful in storing information in sequential contexts.
However, when dealing with high dimensional input data, such as video and text, the input-to-hidden linear transformation in RNNs brings high memory usage and huge computational cost.
This makes the training of RNNs unscalable and difficult.
To address this challenge, we propose a novel compact LSTM model, named as TR-LSTM, by utilizing the low-rank tensor ring decomposition (TRD) to reformulate the input-to-hidden transformation.
Compared with other tensor decomposition methods, TR-LSTM is more stable.
In addition, TR-LSTM can complete an end-to-end training and also provide a fundamental building block for RNNs in handling large input data.
Experiments on real-world action recognition datasets have demonstrated the promising performance of the proposed TR-LSTM compared with the tensor train LSTM and other state-of-the-art competitors.
We introduce "Unspeech" embeddings, which are based on unsupervised learning of context feature representations for spoken language.
The embeddings were trained on up to 9500 hours of crawled English speech data without transcriptions or speaker information, by using a straightforward learning objective based on context and non-context discrimination with negative sampling.
We use a Siamese convolutional neural network architecture to train Unspeech embeddings and evaluate them on speaker comparison, utterance clustering and as a context feature in TDNN-HMM acoustic models trained on TED-LIUM, comparing it to i-vector baselines.
In particular, decoding out-of-domain speech data from the recently released Common Voice corpus shows consistent WER reductions.
We release our source code and pre-trained Unspeech models under a permissive open source license.
This paper presents a detailed study of the energy consumption of the different Java Collections Framework (JCF) implementations.
For each method of an implementation in this framework, we present its energy consumption when handling different amounts of data.
Knowing the greenest methods for each implementation, we present an energy optimization approach for Java programs: based on calls to JCF methods in the source code of a program, we select the greenest implementation.
Finally, we present preliminary results of optimizing a set of Java programs where we obtained 6.2% energy savings.
The attitude space has been parameterized in various ways for practical purposes.
Different representations gain preferences over others based on their intuitive understanding, ease of implementation, formulaic simplicity, and physical as well as mathematical complications involved in using them.
This technical note gives a brief overview of quaternions, which are four-dimensional extensions of complex numbers used to represent orientation.
Their relationship to other modes of attitude representation such as Euler angles and Axis-Angle representation is also explored and conversion from one representation to another is explained.
The conventions, intuitive understanding and formulas most frequently used and indispensable to any quaternion application are stated and wherever possible, derived.
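As one frequently used conversion of this kind, a roll-pitch-yaw (ZYX) Euler-angle triple maps to a unit quaternion as sketched below; the particular rotation-order convention is an assumption, and the note's own derivations remain authoritative.

```python
import math

def quat_from_euler(roll, pitch, yaw):
    """Unit quaternion (w, x, y, z) from ZYX Euler angles in radians."""
    cr, sr = math.cos(roll / 2), math.sin(roll / 2)
    cp, sp = math.cos(pitch / 2), math.sin(pitch / 2)
    cy, sy = math.cos(yaw / 2), math.sin(yaw / 2)
    # Product of the three half-angle rotations about Z, Y, X.
    w = cr * cp * cy + sr * sp * sy
    x = sr * cp * cy - cr * sp * sy
    y = cr * sp * cy + sr * cp * sy
    z = cr * cp * sy - sr * sp * cy
    return (w, x, y, z)
```

The result is a unit quaternion by construction, so no renormalization is needed after the conversion.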
We generalize previous studies on critical phenomena in communication networks by adding computational capabilities to the nodes to better describe real-world situations such as cloud computing.
A set of tasks with random origin and destination with a multi-tier computational structure is distributed on a network modeled as a graph.
The execution time (or latency) of each task is statically computed and the sum is used as the energy in a Monte Carlo simulation in which the temperature parameter controls the resource allocation optimality.
We study the transition to congestion by varying temperature and system load.
A method to approximately recover the time-evolution of the system by interpolating the latency probability distributions is presented.
This allows us to study the standard transition to the congested phase by varying the task production rate.
We are able to reproduce the main known results on network congestion and to gain a deeper insight over the maximum theoretical performance of a system and its sensitivity to routing and load balancing errors.
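The temperature-controlled allocation step can be sketched with a standard Metropolis acceptance rule, where the energy is the summed task latency; the concrete `energy` and `neighbour` functions below are placeholders, not the paper's model.

```python
import math
import random

def metropolis(energy, neighbour, state, T, steps, rng):
    """Metropolis sampling of resource allocations at temperature T.

    energy(state): total latency of an allocation (the 'energy').
    neighbour(state, rng): a random perturbation of the allocation.
    Lower T pushes the sampler towards optimal allocations; higher T
    tolerates sub-optimal ones.
    """
    e = energy(state)
    for _ in range(steps):
        cand = neighbour(state, rng)
        e_cand = energy(cand)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / T):
            state, e = cand, e_cand
    return state, e
```

Sweeping `T` and the system load in such a loop is what exposes the transition between the optimal-allocation and congested regimes.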
In a previous paper, we have shown that any Boolean formula can be encoded as a linear programming problem in the framework of Bayesian probability theory.
When applied to NP-complete algorithms, this leads to the fundamental conclusion that P = NP.
Now, we implement this concept in elementary arithmetic and especially in multiplication.
This provides a polynomial-time deterministic factoring algorithm, while no such algorithm is known to date.
This result clearly calls for a re-evaluation of the current cryptosystems.
The Bayesian arithmetic environment can also be regarded as a toy model for quantum mechanics.
Unmanned aerial vehicles mounted base stations (UAV-BSs) are expected to become one of the significant components of the Next Generation Wireless Networks (NGWNs).
Rapid deployment, mobility, higher chances of unobstructed propagation path, and flexibility features of UAV-BSs have attracted significant attention.
Despite the potentially high gains brought by UAV-BSs in NGWNs, they also introduce many challenges.
Optimal location assignment to UAV-BSs, arguably, is the most widely investigated problem in the literature on UAV-BSs in NGWNs.
This paper presents a comprehensive survey of the literature on the location optimization of UAV-BSs in NGWNs.
A generic optimization framework through a universal Mixed Integer Non-Linear Programming (MINLP) formulation is constructed and the specifications of its constituents are elaborated.
The generic problem is classified according to a novel taxonomy.
Due to the highly challenging nature of the optimization problem, a range of solution approaches adopted in the literature are also covered under the aforementioned classification.
Furthermore, future research directions on UAV-BS location optimization in 5G and beyond non-terrestrial aerial communication systems are discussed.
Given the great interest in creating keyframe summaries from video, it is surprising how little has been done to formalise their evaluation and comparison.
User studies are often carried out to demonstrate that a proposed method generates a more appealing summary than one or two rival methods.
But larger comparison studies cannot feasibly use such user surveys.
Here we propose a discrimination capacity measure as a formal way to quantify the improvement over the uniform baseline, assuming that one or more ground truth summaries are available.
Using the VSUMM video collection, we examine 10 video feature types, including CNN and SURF, and 6 methods for matching frames from two summaries.
Our results indicate that a simple frame representation through hue histograms suffices for the purposes of comparing keyframe summaries.
We subsequently propose a formal protocol for comparing summaries when ground truth is available.
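The hue-histogram frame representation can be made concrete with a small sketch; the bin count and the histogram-intersection similarity below are illustrative choices, not the paper's fixed protocol.

```python
import colorsys

def hue_histogram(pixels, bins=16):
    """Normalized hue histogram of a frame given as (r, g, b) tuples in [0, 255]."""
    hist = [0.0] * bins
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(h * bins), bins - 1)] += 1
    total = sum(hist)
    return [v / total for v in hist]

def hist_intersection(h1, h2):
    """Similarity in [0, 1]: 1 means identical hue distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Matching frames between two summaries then reduces to comparing such low-dimensional histograms, which is far cheaper than CNN or SURF descriptors.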
The work takes another look at the number of runs that a string might contain and provides an alternative proof for the bound.
We also propose another stronger conjecture that states that, for a fixed order on the alphabet, within every factor of a word there are at most as many occurrences of Lyndon roots corresponding to runs in a word as the length of the factor (only first such occurrences for each run are considered).
In this research, a segmentation and classification method was used to recognize threats in human scanner images from airport security.
The Department of Homeland Security (DHS) in the USA has a high false alarm rate produced by the algorithms used with today's scanners at the airports.
To address this problem, they started a new competition on the Kaggle site, asking the science community to improve their detection with new algorithms.
The dataset used in this research comes from DHS at https://www.kaggle.com/c/passenger-screening-algorithm-challenge/data. According to DHS: "This dataset contains a large number of body scans acquired by a new generation of millimeter wave scanner called the High Definition-Advanced Imaging Technology (HD-AIT) system.
They are comprised of volunteers wearing different clothing types (from light summer clothes to heavy winter clothes), different body mass indices, different genders, different numbers of threats, and different types of threats".
Using Python as the principal language, preprocessing of the dataset images extracted features from 200 bodies using intensity, intensity differences, and local neighbourhoods to produce segmentation regions, which were labelled to serve as ground truth in training and test datasets.
The regions are subsequently given to a CNN deep learning classifier to predict 17 classes (representing the body zones): zone1, zone2, ... zone17, plus zones with threats, for a total of 34 classes.
The classifier achieved an accuracy of 98.2863% and a loss of 0.091319, as well as an average of 100% for recall and precision.
Autonomous robot manipulation often involves both estimating the pose of the object to be manipulated and selecting a viable grasp point.
Methods using RGB-D data have shown great success in solving these problems.
However, there are situations where cost constraints or the working environment may limit the use of RGB-D sensors.
When limited to monocular camera data only, both the problem of object pose estimation and of grasp point selection are very challenging.
In the past, research has focused on solving these problems separately.
In this work, we introduce a novel method called SilhoNet that bridges the gap between these two tasks.
We use a Convolutional Neural Network (CNN) pipeline that takes in ROI proposals to simultaneously predict an intermediate silhouette representation for objects with an associated occlusion mask.
The 3D pose is then regressed from the predicted silhouettes.
Grasp points from a precomputed database are filtered by back-projecting them onto the occlusion mask to find which points are visible in the scene.
We show that our method achieves better overall performance than the state-of-the-art PoseCNN network for 3D pose estimation on the YCB-video dataset.
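The grasp-filtering step can be sketched as a simple visibility test against the predicted occlusion mask, assuming the database grasp points have already been back-projected to pixel coordinates; the names below are illustrative.

```python
import numpy as np

def visible_grasps(grasp_px, occlusion_mask):
    """Keep only precomputed grasp points that land on visible (unoccluded)
    object pixels.

    grasp_px: iterable of (u, v) pixel coordinates of back-projected grasps.
    occlusion_mask: boolean array, True where the object is visible.
    """
    H, W = occlusion_mask.shape
    return [(u, v) for (u, v) in grasp_px
            if 0 <= u < W and 0 <= v < H and occlusion_mask[v, u]]
```

Grasps that project onto occluded or out-of-frame regions are discarded, leaving only points the robot can actually reach in the current scene.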
This is the preprint version of our paper on ICONIP2015.
The proposed platform supports integrated VRGIS functions, including 3D spatial analysis and 3D visualization of spatial processes, and serves 3D globe and digital city applications.
The 3D analysis and visualization of the concerned city massive information are conducted in the platform.
The amount of information that can be visualized with this platform is vast, and the GIS-based navigational scheme allows great flexibility in accessing the different available data sources.
Vlogs provide a rich public source of data in a novel setting.
This paper examined the continuous sentiment styles employed in 27,333 vlogs using a dynamic intra-textual approach to sentiment analysis.
Using unsupervised clustering, we identified seven distinct continuous sentiment trajectories characterized by fluctuations of sentiment throughout a vlog's narrative time.
We provide a taxonomy of these seven continuous sentiment styles and find that vlogs whose sentiment builds up towards a positive ending are the most prevalent in our sample.
Gender was associated with preferences for different continuous sentiment trajectories.
This paper discusses the findings with respect to previous work and concludes with an outlook towards possible uses of the corpus, method and findings of this paper for related areas of research.
A low carbon society aims at fighting global warming by stimulating synergic efforts from governments, industry and scientific communities.
Decision support systems should be adopted to provide policy makers with possible scenarios, options for prompt countermeasures in case of side effects on environment, economy and society due to low carbon society policies, and also options for information management.
A necessary precondition to fulfill this agenda is to face the complexity of this multi-disciplinary domain and to reach a common understanding on it as a formal specification.
Ontologies are widely accepted means to share knowledge.
Together with semantic rules, they enable advanced semantic services to manage knowledge in a smarter way.
Here we address the European Emissions Trading System (EU-ETS) and we present a knowledge base consisting of the EREON ontology and a catalogue of rules.
Then we describe two innovative semantic services to manage ETS data and information on ETS scenarios.
The economic model of the Internet of Things (IoT) consists of end users, advertisers and three different kinds of providers--IoT service provider (IoTSP), Wireless service provider (WSP) and cloud service provider (CSP).
We investigate three different kinds of interactions among the providers.
First, we consider that the IoTSP prices a bundled service to the end-users, and the WSP and CSP pay the IoTSP (push model).
Next, we consider the model where the end-users independently pay each provider (pull model).
Finally, we consider a hybrid model of the above two where the IoTSP and WSP quote their prices to the end-users, but the CSP quotes its price to the IoTSP.
We characterize and quantify the impact of the advertisement revenue on the equilibrium pricing strategy and payoff of providers, and corresponding demands of end users in each of the above interaction models.
Our analysis reveals that the demand of end-users and the payoffs of the providers are non-decreasing functions of the advertisement revenue.
For sufficiently high advertisement revenue, the IoTSP will offer its service free of cost in each interaction model.
However, the payoffs of the providers, and the demand of end-users vary across different interaction models.
Our analysis shows that the demand of end-users, and the payoff of the WSP are the highest in the pull (push, resp.) model in the low (high, resp.) advertisement revenue regime.
The payoff of the IoTSP is always higher in the pull model irrespective of the advertisement revenue.
The payoff of the CSP is the highest in the hybrid model in the low advertisement revenue regime.
However, in the high advertisement revenue regime the payoff of the CSP in the hybrid model or in the push model can be higher depending on the equilibrium chosen in the push model.
In this paper we combine one method for hierarchical reinforcement learning - the options framework - with deep Q-networks (DQNs) through the use of different "option heads" on the policy network, and a supervisory network for choosing between the different options.
We utilise our setup to investigate the effects of architectural constraints in subtasks with positive and negative transfer, across a range of network capacities.
We empirically show that our augmented DQN has lower sample complexity when simultaneously learning subtasks with negative transfer, without degrading performance when learning subtasks with positive transfer.
A phaser is an expressive synchronization construct that unifies collective and point-to-point coordination with dynamic task parallelism.
Each task can participate in a phaser as a signaler, a waiter, or both.
The participants in a phaser may change over time as dynamic tasks are added and deleted.
In this poster, we present a highly concurrent and scalable design of phasers for a distributed memory environment that is suitable for use with asynchronous partitioned global address space programming models.
Our design for a distributed phaser employs a pair of skip lists augmented with the ability to collect and propagate synchronization signals.
To enable a high degree of concurrency, addition and deletion of participant tasks are performed in two phases: a "fast single-link-modify" step followed by multiple hand-over-hand "lazy multi-link-modify" steps.
We show that the cost of synchronization and structural operations on a distributed phaser scales logarithmically, even in the presence of concurrent structural modifications.
To verify the correctness of our design for distributed phasers, we employ the SPIN model checker.
To address the resulting issue of state-space explosion, we describe how we decompose the state space to separately verify correct handling of different kinds of messages, which enables complete model checking of our phaser design.
ALPINE is, to our knowledge, the first anytime algorithm for mining frequent itemsets and closed frequent itemsets.
It guarantees that all itemsets with support exceeding the current checkpoint's support have been found before it proceeds further.
Thus, it is very attractive for extremely long mining tasks with very high dimensional data (for example in genetics) because it can offer intermediate meaningful and complete results.
This anytime feature is the most important contribution of ALPINE, which is also fast, though not necessarily the fastest algorithm around.
Another critical advantage of ALPINE is that it does not require an a priori minimum support value.
Increased accuracy in predictive models for handwritten character recognition will open up new frontiers for optical character recognition.
Major drawbacks of predictive machine learning models include the long training time taken by some models, and the requirement that training and test data be in the same feature space and follow the same distribution.
In this study, these obstacles are minimized by presenting a model for transferring knowledge from one task to another.
This model is presented for the recognition of handwritten numerals in Indic languages.
The model utilizes convolutional neural networks with backpropagation for error reduction and dropout to prevent overfitting.
The output performance of the proposed neural network is shown to closely match other state-of-the-art methods while using only a fraction of the time required by those methods.
Designing an optimal network topology while balancing multiple, possibly conflicting objectives like cost, performance, and resiliency to viruses is a challenging endeavor, let alone in the case of decentralized network formation.
We therefore propose a game-formation technique where each player aims to minimize its cost in installing links, the probability of being infected by a virus and the sum of hopcounts on its shortest paths to all other nodes.
In this article, we (1) determine the Nash Equilibria and the Price of Anarchy for our novel network formation game, (2) demonstrate that the Price of Anarchy (PoA) is usually low, which suggests that (near-)optimal topologies can be formed in a decentralized way, and (3) give suggestions for practitioners for those cases where the PoA is high and some centralized control/incentives are advisable.
This work considers the secure and reliable information transmission in two-hop relay wireless networks without the information of both eavesdropper channels and locations.
While previous work on this problem mainly studied infinite networks, their asymptotic behavior, and scaling-law results, this paper focuses on a more practical network with a finite number of system nodes and explores the corresponding exact results on the number of eavesdroppers the network can tolerate while ensuring a desired secrecy and reliability.
For achieving secure and reliable information transmission in a finite network, two transmission protocols are considered in this paper: one adopts an optimal but complex relay selection process with less load-balancing capacity, while the other adopts a random but simple relay selection process with good load-balancing capacity.
Theoretical analysis is further provided to determine the exact and maximum number of independent and also uniformly distributed eavesdroppers one network can tolerate to satisfy a specified requirement in terms of the maximum secrecy outage probability and maximum transmission outage probability allowed.
This article is an empirical contribution to the field of educational technology but also - and above all - a methodological contribution to the analysis of the activities enacted in this field.
It reports on a pilot study conducted within the framework of doctoral research, which consisted in describing, analysing and modelling the activity of a trainee teacher in a situation of autonomous use of a video-based digital learning environment (DLE).
We were particularly careful to describe the method in great detail.
Two types of data were collected and processed within the framework of "course-of-action": (i) activity observation data (dynamic screen capture) and (ii) data from resituating interviews supported by digital traces of that activity.
The findings (i) validate the method's relevance in relation to the object and issues of the research, (ii) show different levels of organization in the activity deployed in the situation of use, and (iii) highlight four registers of concerns orienting use of the DLE.
We conclude from a perspective of educational technology, by discussing how, according to certain conditions and different time scales, the findings inform a process of continuous DLE design.
Joint privacy-cost optimization is studied for a smart grid consumer, whose electricity consumption is monitored in almost real time by the utility provider (UP).
It is assumed that an energy storage device, e.g., an electrical battery, is available to the consumer, which can be utilized both to achieve privacy and to reduce the energy cost by modifying the electricity consumption.
Privacy is measured via the mean squared distance between the smart meter readings and a target load profile, while time-of-use pricing is considered to compute the electricity cost.
The consumer also has the possibility to sell electricity back to the UP to further improve the privacy-cost trade-off.
Two privacy-preserving energy management policies (EMPs) are proposed, which differ in the way the target load profile is characterized.
Additionally, a simplified and more practical EMP, which optimizes the energy management less frequently, is considered.
Numerical results are presented to compare the performances of these EMPs in terms of the privacy-cost trade-off they achieve, considering a number of privacy indicators.
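The two objective terms can be made concrete with a small sketch; the function names and the sign convention for selling electricity back are assumptions for illustration.

```python
def privacy_leakage(meter_readings, target_profile):
    """Privacy measure: mean squared distance between the smart meter
    readings and a target load profile (lower = more private)."""
    assert len(meter_readings) == len(target_profile)
    return sum((m - t) ** 2
               for m, t in zip(meter_readings, target_profile)) / len(meter_readings)

def energy_cost(grid_energy, tou_prices):
    """Time-of-use electricity cost; a negative grid_energy entry means
    selling electricity back to the UP in that time slot."""
    return sum(e * p for e, p in zip(grid_energy, tou_prices))
```

An EMP then schedules battery charging and discharging to jointly minimize a weighted combination of these two quantities.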
In blind motion deblurring, leading methods today tend towards highly non-convex approximations of the l0-norm, especially in the image regularization term.
In this paper, we propose a simple, effective and fast approach for the estimation of the motion blur-kernel, through a bi-l0-l2-norm regularization imposed on both the intermediate sharp image and the blur-kernel.
Compared with existing methods, the proposed regularization is shown to be more effective and robust, leading to a more accurate motion blur-kernel and a better final restored image.
A fast numerical scheme is deployed for alternatingly computing the sharp image and the blur-kernel, by coupling the operator splitting and augmented Lagrangian methods.
Experimental results on both a benchmark image dataset and real-world motion blurred images show that the proposed approach is highly competitive with state-of-the-art methods in both deblurring effectiveness and computational efficiency.
Game logic is a dynamic modal logic which models strategic two person games; it contains propositional dynamic logic (PDL) as a fragment.
We propose an interpretation of game logic based on stochastic effectivity functions.
A definition of these functions is proposed, and some algebraic properties of effectivity functions such as congruences are investigated.
The relationship to stochastic relations is characterized through a deduction system.
Logical and behavioral equivalence of game models is investigated.
Finally the completion of models receives some attention.
Processors may find some elementary operations to be faster than the others.
Although an operation may be conceptually as simple as some other operation, the processing speeds of the two can vary.
A clever programmer will always try to choose the faster instructions for the job.
This paper presents an algorithm to display the squares of the first N natural numbers without using multiplication (the * operator).
Instead, the same work can be done using addition (the + operator).
The results can also be used to compute the sum of those squares.
If we compare the normal method of computing the squares of the first N natural numbers with this method, we can conclude that the algorithm discussed in the paper is more optimized in terms of time complexity.
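The addition-only scheme rests on the identity k² = (k−1)² + (2k − 1): each square is the previous square plus the next odd number. A direct sketch (the function name is illustrative):

```python
def squares_and_sum(n):
    """Squares of the first n natural numbers, and their sum,
    using only addition (no * operator)."""
    squares = []
    sq, odd, total = 0, 1, 0
    for _ in range(n):
        sq += odd      # k^2 = (k-1)^2 + (2k - 1)
        odd += 2       # next odd number, again by addition only
        total += sq    # running sum of squares
        squares.append(sq)
    return squares, total
```

Each iteration performs a constant number of additions, so all n squares and their sum are produced in O(n) additions overall.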
Traditional visual speech recognition systems consist of two stages, feature extraction and classification.
Recently, several deep learning approaches have been presented which automatically extract features from the mouth images and aim to replace the feature extraction stage.
However, research on joint learning of features and classification is very limited.
In this work, we present an end-to-end visual speech recognition system based on Long Short-Term Memory (LSTM) networks.
To the best of our knowledge, this is the first model which simultaneously learns to extract features directly from the pixels and perform classification and also achieves state-of-the-art performance in visual speech classification.
The model consists of two streams which extract features directly from the mouth and difference images, respectively.
The temporal dynamics in each stream are modelled by an LSTM and the fusion of the two streams takes place via a Bidirectional LSTM (BLSTM).
An absolute improvement of 9.7% over the baseline is reported on the OuluVS2 database, and of 1.5% on the CUAVE database, compared with other methods that use a similar visual front-end.
Image-based virtual try-on systems, which fit new in-shop clothes onto a person image, have attracted increasing research attention, yet the task remains challenging.
A desirable pipeline should not only transform the target clothes into the most fitting shape seamlessly but also preserve the identity of the clothes in the generated image, that is, the key characteristics (e.g., texture, logo, embroidery) that depict the original clothes.
However, previous image-conditioned generation works fail to meet these critical requirements for plausible virtual try-on performance because they cannot handle large spatial misalignment between the input image and the target clothes.
Prior work explicitly tackled spatial deformation using shape context matching, but failed to preserve clothing details due to its coarse-to-fine strategy.
In this work, we propose a new fully-learnable Characteristic-Preserving Virtual Try-On Network (CP-VTON) to address all of these real-world challenges.
First, CP-VTON learns a thin-plate spline transformation that warps the in-shop clothes to fit the body shape of the target person via a new Geometric Matching Module (GMM), rather than computing correspondences of interest points as prior works did.
Second, to alleviate boundary artifacts of warped clothes and make the results more realistic, we employ a Try-On Module that learns a composition mask to integrate the warped clothes and the rendered image to ensure smoothness.
Extensive experiments on a fashion dataset demonstrate that CP-VTON achieves state-of-the-art virtual try-on performance both qualitatively and quantitatively.
This paper describes our approach on Query Word Labeling as an attempt in the shared task on Mixed Script Information Retrieval at Forum for Information Retrieval Evaluation (FIRE) 2015.
The queries were written in Roman script, and the words were either English or transliterated from Indian regional languages.
A total of eight Indian languages were present in addition to English.
We also identified the Named Entities and special symbols as part of our task.
A CRF based machine learning framework was used for labeling the individual words with their corresponding language labels.
We used a dictionary based approach for language identification.
We also took into account the context of the word while identifying the language.
Our system demonstrated an overall accuracy of 75.5% for token level language identification.
The strict F-measure scores for the identification of token level language labels for Bengali, English and Hindi are 0.7486, 0.892 and 0.7972 respectively.
The overall weighted F-measure of our system was 0.7498.
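As a minimal illustration of the dictionary-plus-context approach described above (the word lists and the tie-breaking rule here are hypothetical stand-ins, not the system's actual resources):

```python
# Hypothetical word lists standing in for the dictionaries used by the system.
DICTIONARIES = {
    "en": {"the", "is", "movie", "new"},
    "hi": {"gaana", "naya", "accha"},
    "bn": {"gaan", "notun", "bhalo"},
}

def label_word(word, prev_label=None):
    """Tag one Roman-script token with a language label."""
    hits = [lang for lang, vocab in DICTIONARIES.items()
            if word.lower() in vocab]
    if len(hits) == 1:
        return hits[0]
    if prev_label in hits:           # use context to break dictionary ties
        return prev_label
    if hits:
        return hits[0]
    return prev_label or "en"        # unknown word: back off to context

def label_query(query):
    """Label every token in a query, carrying the previous label as context."""
    labels, prev = [], None
    for token in query.split():
        prev = label_word(token, prev)
        labels.append(prev)
    return labels
```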
Anomaly detectors are often used to produce a ranked list of statistical anomalies, which are examined by human analysts in order to extract the actual anomalies of interest.
Unfortunately, in real-world applications, this process can be exceedingly difficult for the analyst, since a large fraction of high-ranking anomalies are false positives that are not interesting from the application perspective.
In this paper, we aim to make the analyst's job easier by allowing for analyst feedback during the investigation process.
Ideally, the feedback influences the ranking of the anomaly detector in a way that reduces the number of false positives that must be examined before discovering the anomalies of interest.
In particular, we introduce a novel technique for incorporating simple binary feedback into tree-based anomaly detectors.
We focus on the Isolation Forest algorithm as a representative tree-based anomaly detector, and show that we can significantly improve its performance by incorporating feedback, when compared with the baseline algorithm that does not incorporate feedback.
Our technique is simple and scales well as the size of the data increases, which makes it suitable for interactive discovery of anomalies in large datasets.
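The abstract does not spell out the feedback mechanism, so the following is only a toy sketch of the general idea: a from-scratch isolation forest in which binary analyst feedback multiplicatively reweights the trees. All names and the update rule are illustrative, not the paper's method:

```python
import random

def build_tree(points, depth=0, max_depth=8):
    """Grow an isolation tree with random axis-aligned splits;
    a leaf stores its depth."""
    if depth >= max_depth or len(points) <= 1:
        return depth
    dim = random.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return depth
    split = random.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    if not left or not right:        # degenerate split, stop here
        return depth
    return (dim, split,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def path_length(tree, p):
    """Depth at which the tree isolates point p."""
    while isinstance(tree, tuple):
        dim, split, left, right = tree
        tree = left if p[dim] < split else right
    return tree

class FeedbackForest:
    """Toy tree-based anomaly detector with multiplicative-weight feedback."""
    def __init__(self, data, n_trees=50, seed=0):
        random.seed(seed)
        self.trees = [build_tree(data) for _ in range(n_trees)]
        self.weights = [1.0] * n_trees

    def score(self, p):
        """Weighted mean path length; SHORTER paths mean more anomalous."""
        total = sum(self.weights)
        return sum(w * path_length(t, p)
                   for w, t in zip(self.weights, self.trees)) / total

    def feedback(self, p, interesting, eta=0.5):
        """Analyst marks p as a true anomaly (True) or false positive
        (False); reward trees whose verdict agreed with the analyst."""
        depths = [path_length(t, p) for t in self.trees]
        mean_d = sum(depths) / len(depths)
        for i, d in enumerate(depths):
            agreed = (d < mean_d) == interesting
            self.weights[i] *= (1 + eta) if agreed else (1 - eta)
```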
Recurrent neural networks have proven to be very effective for natural language inference tasks.
We build on top of one such model, namely BiLSTM with max pooling, and show that adding a hierarchy of BiLSTM and max pooling layers yields state-of-the-art results among sentence encoding-based models on SNLI and on the SciTail dataset, as well as strong results on the MultiNLI dataset.
We also show that our sentence embeddings can be utilized in a wide variety of transfer learning tasks, outperforming InferSent on 7 out of 10 and SkipThought on 8 out of 9 SentEval sentence embedding evaluation tasks.
Furthermore, our model beats the InferSent model in 8 out of 10 recently published SentEval probing tasks designed to evaluate sentence embeddings' ability to capture some of the important linguistic properties of sentences.
Fine-grained object recognition, which aims to identify the type of an object among a large number of subcategories, is an emerging application as increasing resolution exposes new details in image data.
Traditional fully supervised algorithms fail to handle this problem where there is low between-class variance and high within-class variance for the classes of interest with small sample sizes.
We study an even more extreme scenario named zero-shot learning (ZSL) in which no training example exists for some of the classes.
ZSL aims to build a recognition model for new unseen categories by relating them to seen classes that were previously learned.
We establish this relation by learning a compatibility function between image features extracted via a convolutional neural network and auxiliary information that describes the semantics of the classes of interest by using training samples from the seen classes.
Then, we show how knowledge transfer can be performed for the unseen classes by maximizing this function during inference.
We introduce a new data set that contains 40 different types of street trees in 1-ft spatial resolution aerial data, and evaluate the performance of this model with manually annotated attributes, a natural language model, and a scientific taxonomy as auxiliary information.
The experiments show that the proposed model achieves 14.3% recognition accuracy for the classes with no training examples, which is significantly better than a random guess accuracy of 6.3% for 16 test classes, and three other ZSL algorithms.
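As a sketch of the compatibility-based inference step described above (with toy dimensions and hypothetical class names; in the paper, W is learned from seen-class training samples):

```python
def matvec(W, a):
    """Multiply matrix W (list of rows) by attribute vector a."""
    return [sum(w * x for w, x in zip(row, a)) for row in W]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def compatibility(x, W, attrs):
    """Bilinear compatibility F(x, c) = x^T W a_c between an image
    feature x and the attribute vector a_c of class c."""
    return dot(x, matvec(W, attrs))

def predict(x, W, class_attrs):
    """Inference: assign the unseen class whose semantics are most
    compatible with the image feature."""
    return max(class_attrs, key=lambda c: compatibility(x, W, class_attrs[c]))
```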
Process mining has emerged as a way to analyze the behavior of an organization by extracting knowledge from event logs and by offering techniques to discover, monitor and enhance real processes.
In the discovery of process models, retrieving a complex one, i.e., a hardly readable process model, can hinder the extraction of information.
Even in well-structured process models, there is information that cannot be obtained with the current techniques.
In this paper, we present WoMine, an algorithm to retrieve frequent behavioural patterns from the model.
Our approach searches process models, extracting structures with sequences, selections, parallels, and loops that are frequently executed in the logs.
This proposal has been validated with a set of process models, including some from the BPI Challenges, and compared with state-of-the-art techniques.
Experiments validate that WoMine can find all types of patterns, extracting information that cannot be mined with state-of-the-art techniques.
A long-standing practical challenge in the optimization of higher-order languages is inlining functions with free variables.
Inlining code statically at a function call site is safe if the compiler can guarantee that the free variables have the same bindings at the inlining point as they do at the point where the function is bound as a closure (code and free variables).
There have been many attempts to create a heuristic to check this correctness condition, from Shivers' kCFA-based reflow analysis to Might's Delta-CFA and anodization, but all of those have performance unsuitable for practical compiler implementations.
In practice, modern language implementations rely on a series of tricks to capture some common cases (e.g., closures whose free variables are only top-level identifiers such as +) and rely on hand-inlining by the programmer for anything more complicated.
This work provides the first practical, general approach for inlining functions with free variables.
We also provide a proof of correctness, an evaluation of both the execution time and performance impact of this optimization, and some tips and tricks for implementing an efficient and precise control-flow analysis.
Convolutional neural networks with spatio-temporal 3D kernels (3D CNNs) have an ability to directly extract spatio-temporal features from videos for action recognition.
Although the 3D kernels tend to overfit because of a large number of their parameters, the 3D CNNs are greatly improved by using recent huge video databases.
However, the architectures of 3D CNNs are relatively shallow compared with the very deep 2D CNNs, such as residual networks (ResNets), that have proved so successful.
In this paper, we propose 3D CNNs based on ResNets toward a better action representation.
We describe the training procedure of our 3D ResNets in detail.
We experimentally evaluate the 3D ResNets on the ActivityNet and Kinetics datasets.
The 3D ResNets trained on the Kinetics did not suffer from overfitting despite the large number of parameters of the model, and achieved better performance than relatively shallow networks, such as C3D.
Our code and pretrained models (e.g., Kinetics and ActivityNet) are publicly available at https://github.com/kenshohara/3D-ResNets.
We extend the Multi-lane Spatial Logic MLSL, introduced in previous work for proving the safety (collision freedom) of traffic maneuvers on a multi-lane highway, by length measurement and dynamic modalities.
We investigate the proof theory of this extension, called EMLSL.
To this end, we prove the undecidability of EMLSL but nevertheless present a sound proof system which allows for reasoning about the safety of traffic situations.
We illustrate the latter by giving a formal proof for the reservation lemma we could only prove informally before.
Furthermore we prove a basic theorem showing that the length measurement is independent from the number of lanes on the highway.
A map merging component is crucial for the proper functionality of a multi-robot system performing exploration, since it provides the means to integrate and distribute the most important information carried by the agents: the explored-covered space and its exact (depending on the SLAM accuracy) morphology.
Map merging is a prerequisite for an intelligent multi-robot team aiming to deploy a smart exploration technique.
In the current work, a metric map merging approach based on environmental information is proposed, in conjunction with spatially scattered RFID tags localization.
This approach is divided into the following parts: calculation of the maps' approximate rotation via the obstacle poses and the localized RFID tags, translation using the best-localized common RFID tag, and finally refinement of the transformation using an ICP algorithm.
We develop a thermodynamic framework for modeling nonlinear ultrasonic damage sensing and prognosis in materials undergoing progressive damage.
The framework is based on the internal variable approach and relies on the construction of a pseudo-elastic strain energy function that captures the energetics associated with the damage progression.
The pseudo-elastic strain energy function is composed of two energy functions: one describes how the material stores energy elastically, and the other how it dissipates energy or stores it inelastically.
Experimental motivation for the choice of these two functions is discussed, and some specific choices pertaining to damage progression during fatigue and creep are presented.
The thermodynamic framework is employed to model the nonlinear response of material undergoing stress relaxation and creep-like degradation.
For each of the above cases, evolution of the nonlinearity parameter with damage as well as with macroscopic measurables like accumulated plastic strain are obtained.
Visualizations have a potentially enormous influence on how data are used to make decisions across all areas of human endeavor.
However, it is not clear how this power connects to ethical duties: what obligations do we have when it comes to visualizations and visual analytics systems, beyond our duties as scientists and engineers?
Drawing on historical and contemporary examples, I address the moral components of the design and use of visualizations, identify some ongoing areas of visualization research with ethical dilemmas, and propose a set of additional moral obligations that we have as designers, builders, and researchers of visualizations.
Standard Time-to-Live (TTL) cache management prescribes the storage of entire files, or possibly fractions thereof, for a given amount of time after a request.
As a generalization of this approach, this work proposes the storage of a time-varying, diminishing, fraction of a requested file.
Accordingly, the cache progressively evicts parts of the file over an interval of time following a request.
The strategy, which is referred to as soft-TTL, is justified by the fact that traffic traces are often characterized by arrival processes that display a decreasing, but non-negligible, probability of observing a request as the time elapsed since the last request increases.
An optimization-based analysis of soft-TTL is presented, demonstrating the important role played by the hazard function of the inter-arrival request process, which measures the likelihood of observing a request as a function of the time since the most recent request.
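As a minimal illustration (not the paper's optimization-based analysis), the hazard function can be estimated empirically from observed inter-arrival times, and the cached fraction made to follow it; the function names are illustrative:

```python
def empirical_hazard(inter_arrivals, t, dt=1.0):
    """h(t): probability of a request in [t, t+dt) given no request
    before t, estimated from a sample of observed inter-arrival times."""
    at_risk = [x for x in inter_arrivals if x >= t]
    if not at_risk:
        return 0.0
    events = sum(1 for x in at_risk if x < t + dt)
    return events / (len(at_risk) * dt)

def soft_ttl_fraction(inter_arrivals, t, dt=1.0):
    """Fraction of the file to keep cached t seconds after the last
    request: proportional to the hazard, normalised to 1 at t = 0."""
    h0 = empirical_hazard(inter_arrivals, 0.0, dt) or 1.0
    return min(1.0, empirical_hazard(inter_arrivals, t, dt) / h0)
```

With a decreasing hazard (requests cluster shortly after the previous one), the cached fraction diminishes as time since the last request grows, exactly the soft-TTL behavior.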
The aim of this paper is to provide a general mathematical framework for group equivariance in the machine learning context.
The framework builds on a synergy between persistent homology and the theory of group actions.
We define group-equivariant non-expansive operators (GENEOs), which are maps between function spaces associated with groups of transformations.
We study the topological and metric properties of the space of GENEOs to evaluate their approximating power and set the basis for general strategies to initialise and compose operators.
We begin by defining suitable pseudo-metrics for the function spaces, the equivariance groups, and the set of non-expansive operators.
Building on these pseudo-metrics, we prove that the space of GENEOs is compact and convex, under the assumption that the function spaces are compact and convex.
These results provide fundamental guarantees from a machine learning perspective.
We show examples on the MNIST and fashion-MNIST datasets.
By considering isometry-equivariant non-expansive operators, we describe a simple strategy to select and sample operators, and show how the selected and sampled operators can be used to perform both classical metric learning and an effective initialisation of the kernels of a convolutional neural network.
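The two defining properties named above, group equivariance and non-expansiveness, can be written compactly. Here $\Phi$, $\Psi$ are the function spaces, $G$ the group of transformations, and the sup-norm form of non-expansiveness is one common choice:

```latex
F \colon \Phi \to \Psi \text{ is a GENEO if, for all } \varphi, \varphi' \in \Phi
\text{ and } g \in G,
\begin{align*}
  &F(\varphi \circ g) = F(\varphi) \circ g
    && \text{(equivariance)}\\
  &\|F(\varphi) - F(\varphi')\|_\infty \le \|\varphi - \varphi'\|_\infty
    && \text{(non-expansiveness)}
\end{align*}
```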
Most search engines sell slots to place advertisements on the search results page through keyword auctions.
Advertisers offer bids for how much they are willing to pay when someone enters a search query, sees the search results, and then clicks on one of their ads.
Search engines typically order the advertisements for a query by a combination of the bids and expected clickthrough rates for each advertisement.
In this paper, we extend a model of Yahoo's and Google's advertising auctions to include an effect where repeatedly showing less relevant ads has a persistent impact on all advertising on the search engine, an impact we designate as the pollution effect.
In Monte-Carlo simulations using distributions fitted to Yahoo data, we show that a modest pollution effect is sufficient to dramatically change the advertising rank order that yields the optimal advertising revenue for a search engine.
In addition, if a pollution effect exists, it is possible to maximize revenue while also increasing advertiser and publisher utility.
Our results suggest that search engines could benefit from making relevant advertisements less expensive and irrelevant advertisements more costly for advertisers than is the current practice.
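The baseline ordering described above (bids combined with expected clickthrough rates) is commonly parameterized with a relevance exponent; a minimal sketch of such a ranking rule, not the paper's simulation model:

```python
def rank_ads(ads, alpha=1.0):
    """Order ads by bid * ctr**alpha (descending).
    alpha = 0 ranks purely by bid; alpha = 1 ranks by expected revenue
    per impression; larger alpha penalises irrelevant ads more."""
    return sorted(ads, key=lambda ad: ad["bid"] * ad["ctr"] ** alpha,
                  reverse=True)
```

Raising alpha is one way a search engine could make irrelevant advertisements effectively more costly, in the spirit of the suggestion above.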
Due to their numerous advantages, communications over multicarrier schemes constitute an appealing approach for broadband wireless systems.
Especially, the strong penetration of orthogonal frequency division multiplexing (OFDM) into the communications standards has triggered heavy investigation on multicarrier systems, leading to re-consideration of different approaches as an alternative to OFDM.
The goal of the present survey is not only to provide a unified review of waveform design options for multicarrier schemes, but also to pave the way for the evolution of the multicarrier schemes from the current state of the art to future technologies.
In particular, a generalized framework on multicarrier schemes is presented, based on what to transmit, i.e., symbols, how to transmit, i.e., filters, and where/when to transmit, i.e., lattice.
Capitalizing on this framework, different variations of orthogonal, bi-orthogonal, and nonorthogonal multicarrier schemes are discussed.
In addition, filter design for various multicarrier systems is reviewed considering four different design perspectives: energy concentration, rapid decay, spectrum nulling, and channel/hardware characteristics.
Subsequently, evaluation tools which may be used to compare different filters in multicarrier schemes are studied.
Finally, multicarrier schemes are evaluated from the view of the practical implementation issues, such as lattice adaptation, equalization, synchronization, multiple antennas, and hardware impairments.
There has been a paradigm shift in the industrial wireless sensor domain caused by the Internet of Things (IoT).
IoT is a thriving technology leading the way in short range and fixed wireless sensing.
One of the issues in Industrial Wireless Sensor Networks (IWSNs) is finding the optimal solution for minimizing the defect time in superframe scheduling.
This paper proposes a method for optimizing superframe scheduling using evolutionary algorithms, namely particle swarm optimization (PSO), orthogonal learning PSO, genetic algorithms (GA), and a modified GA.
We have also evaluated a contemporary method, deadline monotonic scheduling, on ISA 100.11a.
Using this standard as a case study, we present object-oriented simulations with numerous variations in the number of timeslots and wireless sensor nodes.
The simulation results show that the use of GA and modified GA can provide better performance for idle and missed deadlines.
A comprehensive and detailed performance evaluation is given in the paper.
Will a new smartphone application diffuse deeply in the population or will it sink into oblivion soon?
To predict this, we argue that common models of spread of innovations based on cascade dynamics or epidemics may not be fully adequate.
We therefore propose novel stochastic network dynamics that model the spread of a new technological asset, in which adoption is driven by word-of-mouth and the persuasion strength increases as the product diffuses.
In this paper we analyze large-scale graphs to show how the parameters of the model, the topology of the graph and, possibly, the initial diffusion of the asset determine whether the spread of the asset is successful.
In particular, by means of stochastic dominations and deterministic approximations, we provide some general results for a large class of expansive graphs.
Finally, we present numerical simulations that aim to extend the analytical results to even more general topologies.
In this paper, a point-to-point Orthogonal Frequency Division Multiplexing (OFDM) system with a decode-and-forward (DF) relay is considered.
The transmission consists of two hops.
The source transmits in the first hop, and the relay transmits in the second hop.
Each hop occupies one time slot.
The relay is half-duplex, and capable of decoding the message on a particular subcarrier in one time slot, and re-encoding and forwarding it on a different subcarrier in the next time slot.
Thus each message is transmitted on a pair of subcarriers in two hops.
It is assumed that the destination is capable of combining the signals from the source and the relay pertaining to the same message.
The goal is to maximize the weighted sum rate of the system by jointly optimizing subcarrier pairing and power allocation on each subcarrier in each hop.
The weighting of the rates is to take into account the fact that different subcarriers may carry signals for different services.
Both total and individual power constraints for the source and the relay are investigated.
For the situations where the relay does not transmit on some subcarriers because doing so does not improve the weighted sum rate, we further allow the source to transmit new messages on these idle subcarriers.
To the best of our knowledge, such a joint optimization inclusive of the destination combining has not been discussed in the literature.
The problem is first formulated as a mixed integer programming problem.
It is then transformed to a convex optimization problem by continuous relaxation, and solved in the dual domain.
Based on the optimization results, algorithms to achieve feasible solutions are also proposed.
Simulation results show that the proposed algorithms almost achieve the optimal weighted sum rate, and outperform the existing methods in various channel conditions.
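For intuition, a classical sorted-pairing heuristic for two-hop DF relaying pairs the k-th strongest first-hop subcarrier with the k-th strongest second-hop subcarrier; this sketch omits the destination combining, weighting, and power allocation that the paper jointly optimizes:

```python
import math

def sorted_pairing(snr_hop1, snr_hop2):
    """Pair the k-th best source->relay subcarrier with the k-th best
    relay->destination subcarrier; the DF rate of each pair is limited
    by its weaker hop (the 1/2 accounts for the two time slots)."""
    order1 = sorted(range(len(snr_hop1)), key=snr_hop1.__getitem__, reverse=True)
    order2 = sorted(range(len(snr_hop2)), key=snr_hop2.__getitem__, reverse=True)
    pairs = list(zip(order1, order2))
    rate = sum(0.5 * math.log2(1 + min(snr_hop1[i], snr_hop2[j]))
               for i, j in pairs)
    return pairs, rate
```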
Active learning (AL) is a learning paradigm in which an active learner must train a model (e.g., a classifier) that is in principle trained in a supervised way, but from a data set whose samples are initially unlabeled.
To get labels for these samples, the active learner has to ask an oracle (e.g., a human expert) for labels.
The goal is to maximize the performance of the model and to minimize the number of queries at the same time.
In this article, we first briefly discuss the state of the art and our own preliminary work in the field of AL.
Then, we propose the concept of collaborative active learning (CAL).
With CAL, we will overcome some of the harsh limitations of current AL.
In particular, we envision scenarios where an expert may be wrong for various reasons, there might be several or even many experts with different expertise, the experts may label not only samples but also knowledge at a higher level such as rules, and we consider that the labeling costs depend on many conditions.
Moreover, in a CAL process human experts will profit by improving their own knowledge, too.
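The basic AL query loop that CAL generalizes can be sketched as follows; the nearest-centroid model and margin-based uncertainty measure are illustrative choices of ours, not the article's:

```python
import math

def centroid(points):
    """Mean of a list of points (tuples of coordinates)."""
    return [sum(p[d] for p in points) / len(points)
            for d in range(len(points[0]))]

def margin(x, centroids):
    """Uncertainty: gap between the two nearest class centroids
    (a small gap means an uncertain sample, worth querying)."""
    ds = sorted(math.dist(x, c) for c in centroids.values())
    return ds[1] - ds[0] if len(ds) > 1 else ds[0]

def active_learning(pool, oracle, seed_labels, budget):
    """Repeatedly query the oracle (human expert) for the label of the
    most uncertain unlabeled sample."""
    labeled = dict(seed_labels)                 # sample index -> label
    for _ in range(budget):
        by_class = {}
        for idx, lab in labeled.items():
            by_class.setdefault(lab, []).append(pool[idx])
        centroids = {lab: centroid(pts) for lab, pts in by_class.items()}
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        query = min(unlabeled, key=lambda i: margin(pool[i], centroids))
        labeled[query] = oracle(query)          # ask the expert
    return labeled
```

CAL would extend this loop with possibly erroneous experts, multiple experts, higher-level labels such as rules, and condition-dependent labeling costs.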
Among the many biometrics, such as face, iris, and fingerprint, the periocular region has advantages because it is non-intrusive and strikes a balance between the iris or eye region (very stringent, small area) and the whole face region (very relaxed, large area).
Research has shown that this region is not much affected by pose, aging, expression, facial changes, and other artifacts, which would otherwise cause large variations.
Active research has been carried out on this topic over the past few years due to its obvious advantages over face and iris biometrics in unconstrained and uncooperative scenarios.
Many researchers have explored periocular biometrics involving both visible (VIS) and infra-red (IR) spectrum images.
For a system to work 24/7 (such as in surveillance scenarios), registration may rely on daytime VIS periocular images (or any mug shot image), while testing or recognition may occur at night using only IR periocular images.
This gives rise to a challenging research problem called the cross-spectral matching of images where VIS images are used for registration or as gallery images and IR images are used for testing or recognition process and vice versa.
After intensive research of more than two decades on face and iris biometrics in cross-spectral domain, a number of researchers have now focused their work on matching heterogeneous (cross-spectral) periocular images.
Though a number of surveys have been made on existing periocular biometric research, no study has been done on its cross-spectral aspect.
This paper analyses and reviews current state-of-the-art techniques in cross-spectral periocular recognition, including various methodologies, databases and their protocols, and state-of-the-art recognition performances.
Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax.
For example, long-range dependencies induced by using the same variable or function in distant locations are often not considered.
We propose to use graphs to represent both the syntactic and semantic structure of code and use graph-based deep learning methods to learn to reason over program structures.
In this work, we present how to construct graphs from source code and how to scale Gated Graph Neural Networks training to such large graphs.
We evaluate our method on two tasks: VarNaming, in which a network attempts to predict the name of a variable given its usage, and VarMisuse, in which the network learns to reason about selecting the correct variable that should be used at a given program location.
Our comparison to methods that use less structured program representations shows the advantages of modeling known structure, and suggests that our models learn to infer meaningful names and to solve the VarMisuse task in many cases.
Additionally, our testing showed that VarMisuse identifies a number of bugs in mature open-source projects.
We investigate the characteristics of factual and emotional argumentation styles observed in online debates.
Using an annotated set of "factual" and "feeling" debate forum posts, we extract patterns that are highly correlated with factual and emotional arguments, and then apply a bootstrapping methodology to find new patterns in a larger pool of unannotated forum posts.
This process automatically produces a large set of patterns representing linguistic expressions that are highly correlated with factual and emotional language.
Finally, we analyze the most discriminating patterns to better understand the defining characteristics of factual and emotional arguments.
This paper introduces a method to capture network traffic from medical IoT devices and automatically detect cleartext information that may reveal sensitive medical conditions and behaviors.
The research follows a three-step approach involving traffic collection, cleartext detection, and metadata analysis.
We analyze four popular consumer medical IoT devices, including one smart medical device that leaks sensitive health information in cleartext.
We also present a traffic capture and analysis system that seamlessly integrates with a home network and offers a user-friendly interface for consumers to monitor and visualize data transmissions of IoT devices in their homes.
Taking advantage of the rolling shutter effect of CMOS cameras in smartphones is a common practice to increase the transferred data rate in visible light communication (VLC) without employing external equipment such as photodiodes.
VLC can then be used as a replacement for other marker-based techniques for object identification in Augmented Reality and ubiquitous computing applications.
However, the rolling shutter effect only allows data to be transmitted over a single dimension, which considerably limits the available bandwidth.
In this article we propose a new method exploiting spatial interference detection to enable parallel transmission, and design a protocol that enables easy identification of interference between two signals.
By introducing a second dimension, we are not only able to significantly increase the available bandwidth, but also identify and isolate light sources in close proximity.
Over the past few years, mobile operators have faced enormous challenges.
Among these challenges are evolving user demands for personalized applications.
The telecommunications industry as well as the research community have paid enormous attention to Next Generation Networks (NGN) to address this challenge.
NGN is perceived as a sophisticated platform where both application developers and mobile operators cooperate to develop user applications with enhanced quality of experience.
The objective of this paper is twofold: first, we present an introduction to a state-of-the-art NGN testbed to be developed at KAU, and second, we provide an initial analysis of deploying a mobile application on top of the testbed.
With the ubiquity of Internet technologies and growing demands for transparency and open data policies, the role of social networking and online deliberation tools for public engagement in decision-making has increased substantially in the last decades.
In this paper, we present the analysis of how social media are used by different public bodies to enhance public participation in deliberative democracy.
We collected and reviewed published information on the subject and carried out a field-based assessment, involving structured interviews with different government representatives and urban policymakers.
In order to compare collected data, we used a framework for systematic analysis and comparison of e-participation platforms called the participatory cube.
Our findings are as follows.
Participatory decision-making on matters of public concern rightly consumes time and resources; therefore, online tools should be applied with consideration of scale and efficiency, i.e., on burning issues for a majority of citizens or on small-scale local platforms, and in combination with meetings in real time and space.
The budget and workforce allocated to managing online engagement tools should be proportionate to other political and administrative efforts to bring to execution proposed ideas and act on collected feedback in order to satisfy the needs expressed by the communities and not undermine their beliefs about their power to influence decisions.
RNN language models have achieved state-of-the-art perplexity results and have proven useful in a suite of NLP tasks, but it is as yet unclear what syntactic generalizations they learn.
Here we investigate whether state-of-the-art RNN language models represent long-distance filler-gap dependencies and constraints on them.
Examining RNN behavior on experimentally controlled sentences designed to expose filler-gap dependencies, we show that RNNs can represent the relationship in multiple syntactic positions and over large spans of text.
Furthermore, we show that RNNs learn a subset of the known restrictions on filler-gap dependencies, known as island constraints: RNNs show evidence for wh-islands, adjunct islands, and complex NP islands.
These studies demonstrate that state-of-the-art RNN models are able to learn and generalize about empty syntactic positions.
Curvy lanes and blind turns are among the leading causes of road accidents.
Detecting curvy lanes, multiple lanes, and lanes with substantial discontinuity and noise is likewise one of the biggest hurdles for autonomous vehicles.
This paper presents an efficient algorithm for detecting curves with desired slopes (especially curvy lanes in real time) and for detecting curves (lanes) affected by heavy noise, discontinuity, and disturbances.
The overall aim is to develop a robust method for this task that remains applicable even in adverse conditions.
Even widely used libraries such as OpenCV and MATLAB provide no function for detecting curves with desired slopes, shapes, or discontinuities.
Only a few predefined shapes, such as circles and ellipses, can be detected with the currently available functions.
The proposed algorithm can not only detect curves with discontinuity, noise, and a desired slope, but also perform shadow and illumination correction and detect and differentiate between different curves.
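As a minimal illustration of one common building block in curved-lane detection (not the paper's algorithm itself), a quadratic y = ax² + bx + c can be fitted to noisy lane points by least squares; the normal-equation solver below is a self-contained sketch.

```python
# Minimal sketch: least-squares fit of y = a*x^2 + b*x + c to lane points.
# This is an illustrative building block, not the paper's proposed method.

def fit_quadratic(points):
    """Fit y = a*x^2 + b*x + c via the 3x3 normal equations."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sx2 = sum(x * x for x, _ in points)
    sx3 = sum(x ** 3 for x, _ in points)
    sx4 = sum(x ** 4 for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sx2y = sum(x * x * y for x, y in points)
    # Normal equations: A @ [a, b, c] = rhs
    A = [[sx4, sx3, sx2], [sx3, sx2, sx], [sx2, sx, n]]
    rhs = [sx2y, sxy, sy]
    # Gaussian elimination with partial pivoting on the 3x3 system.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        coeffs[r] = (rhs[r] - sum(A[r][c] * coeffs[c]
                                  for c in range(r + 1, 3))) / A[r][r]
    return coeffs  # (a, b, c)
```

In a real pipeline such a fit would be applied per lane segment after edge extraction; the algorithm described above additionally handles discontinuity and illumination effects, which this sketch does not.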
Breast Cancer is a major cause of death worldwide among women.
Hematoxylin and Eosin (H&E) stained breast tissue samples from biopsies are observed under microscopes for the primary diagnosis of breast cancer.
In this paper, we propose a deep learning-based method for classification of H&E stained breast tissue images released for the BACH challenge 2018, fine-tuning the Inception-v3 convolutional neural network (CNN) proposed by Szegedy et al.
These images are to be classified into four classes namely, i) normal tissue, ii) benign tumor, iii) in-situ carcinoma and iv) invasive carcinoma.
Our strategy is to extract patches based on nuclei density rather than random or grid sampling, and to reject patches that are not rich in nuclei (non-epithelial regions) during both training and testing.
Every patch (nuclei-dense region) in an image is classified into one of the four above-mentioned categories.
The class of the entire image is determined using majority voting over the nuclear classes.
We obtained an average four class accuracy of 85% and an average two class (non-cancer vs. carcinoma) accuracy of 93%, which improves upon a previous benchmark by Araujo et al.
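The image-level decision described above can be sketched as a majority vote over per-patch predictions; the class names follow the paper's four categories, while the tie-breaking rule (fixed class order) is our own assumption.

```python
# Illustrative sketch of the image-level decision: each nuclei-dense patch
# gets one of the four class labels, and the whole image is labeled by
# majority vote. Ties are broken by the fixed class order (an assumption).
from collections import Counter

CLASSES = ("normal", "benign", "in_situ", "invasive")

def image_label(patch_labels):
    """Majority vote over per-patch class predictions."""
    counts = Counter(patch_labels)
    return max(CLASSES, key=lambda c: counts[c])
```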
Periodic nonuniform sampling is a known method to sample spectrally sparse signals below the Nyquist rate.
This strategy relies on the implicit assumption that the individual samplers are exposed to the entire frequency range.
This assumption becomes impractical for wideband sparse signals.
The current paper proposes an alternative sampling stage that does not require a full-band front end.
Instead, signals are captured with an analog front end that consists of a bank of multipliers and lowpass filters whose cutoff is much lower than the Nyquist rate.
The problem of recovering the original signal from the low-rate samples can be studied within the framework of compressive sampling.
An appropriate parameter selection ensures that the samples uniquely determine the analog input.
Moreover, the analog input can be stably reconstructed with digital algorithms.
Numerical experiments support the theoretical analysis.
A prerequisite for any organization is sustained viability in a dynamic and competitive industrial environment.
The development of high-quality software is therefore an inevitable requirement for any software company.
Since defect management is among the most influential factors in producing high-quality software, software organizations must orient themselves toward effective defect management.
Since the beginnings of software development, testing has been regarded as a promising defect-management technique across IT industries.
This paper provides an empirical investigation of several projects through a case study comprising four software companies with varying production capabilities.
The aim of this investigation is to analyze the efficiency of the test team during the software development process.
The study indicates very low test efficiency at the requirements-analysis phase and even lower test efficiency at the design phase of software development.
Consequently, the study identifies a strong need to improve testing approaches, for example through dynamic testing of design solutions in lieu of static testing of design documents.
Dynamic testing techniques improve the detection and elimination of design flaws at the inception phase, thereby reducing the cost and time of rework.
This in turn improves the productivity, quality, and sustainability of the software industry.
Identity recognition from ear images is an active field of research within the biometric community.
The ability to capture ear images from a distance and in a covert manner makes ear recognition technology an appealing choice for surveillance and security applications as well as related application domains.
In contrast to other biometric modalities, where large datasets captured in uncontrolled settings are readily available, datasets of ear images are still limited in size and mostly of laboratory-like quality.
As a consequence, ear recognition technology has not yet benefited from advances in deep learning and convolutional neural networks (CNNs) and is still lagging behind other modalities that experienced significant performance gains owing to deep recognition technology.
In this paper we address this problem and aim at building a CNN-based ear recognition model.
We explore different strategies towards model training with limited amounts of training data and show that by selecting an appropriate model architecture, using aggressive data augmentation and selective learning on existing (pre-trained) models, we are able to learn an effective CNN-based model using a little more than 1300 training images.
The result of our work is the first CNN-based approach to ear recognition that is also made publicly available to the research community.
With our model we are able to improve on the rank one recognition rate of the previous state-of-the-art by more than 25% on a challenging dataset of ear images captured from the web (a.k.a. in the wild).
We study robust distributed learning that involves minimizing a non-convex loss function with saddle points.
We consider the Byzantine setting where some worker machines have abnormal or even arbitrary and adversarial behavior.
In this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used.
We develop ByzantinePGD, a robust first-order algorithm that can provably escape saddle points and fake local minima, and converge to an approximate true local minimizer with low iteration complexity.
As a by-product, we give a simpler algorithm and analysis for escaping saddle points in the usual non-Byzantine setting.
We further discuss three robust gradient estimators that can be used in ByzantinePGD, including median, trimmed mean, and iterative filtering.
We characterize their performance in concrete statistical settings, and argue for their near-optimality in low and high dimensional regimes.
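Two of the robust gradient estimators mentioned above, the coordinate-wise median and the coordinate-wise trimmed mean, can be sketched directly over the gradients reported by the workers; the iterative-filtering estimator is more involved and is not shown.

```python
# Minimal sketch of two robust gradient estimators usable in ByzantinePGD:
# coordinate-wise median and coordinate-wise trimmed mean over the gradient
# vectors reported by worker machines (some possibly Byzantine).
from statistics import median

def coordinate_median(grads):
    """Coordinate-wise median of a list of gradient vectors."""
    return [median(col) for col in zip(*grads)]

def trimmed_mean(grads, beta):
    """Coordinate-wise mean after discarding the beta largest and beta
    smallest entries in each coordinate."""
    agg = []
    for col in zip(*grads):
        kept = sorted(col)[beta:len(col) - beta]
        agg.append(sum(kept) / len(kept))
    return agg
```

Both estimators bound the influence a single adversarial worker can exert on any coordinate, which is the property the convergence analysis relies on.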
Series elastic actuators (SEA) are playing an increasingly important role in the fields of physical human-robot interaction.
This paper focuses on the modeling and control of a cable-driven SEA.
First, the scheme of the cable-driven SEA has been proposed, and a velocity controlled DC motor has been used as its power source.
Based on this, the model of the cable-driven SEA has been built up.
Further, a two degrees of freedom (2-DOF) control approach has been employed to control the output torque.
Simulation results have shown that the 2-DOF method has achieved better robust performance than the PD method.
In this work, we propose an analysis of the presence of gender bias associated with professions in Portuguese word embeddings.
The objective of this work is to study gender implications related to stereotyped professions for women and men in the context of the Portuguese language.
Recently, deep learning (DL) methods have been introduced very successfully into human activity recognition (HAR) scenarios in ubiquitous and wearable computing.
In particular, the prospect of overcoming the need for manual feature design, combined with superior classification capabilities, renders deep neural networks very attractive for real-life HAR applications.
Even though DL-based approaches now outperform the state of the art in a number of recognition tasks in the field, substantial challenges remain.
Most prominently, issues with real-life datasets, typically including imbalanced datasets and problematic data quality, still limit the effectiveness of activity recognition using wearables.
In this paper we tackle such challenges through Ensembles of deep Long Short Term Memory (LSTM) networks.
We have developed modified training procedures for LSTM networks and combine sets of diverse LSTM learners into classifier collectives.
We demonstrate, both formally and empirically, that Ensembles of deep LSTM learners outperform the individual LSTM networks.
Through an extensive experimental evaluation on three standard benchmarks (Opportunity, PAMAP2, Skoda) we demonstrate the excellent recognition capabilities of our approach and its potential for real-life applications of human activity recognition.
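One common way to combine a set of diverse sequence classifiers into a collective is to average their per-class probability scores and take the argmax; this fusion rule is an assumption for illustration, not necessarily the exact combination scheme of the paper.

```python
# Hedged sketch of classifier-collective fusion: per-class probability
# scores from each ensemble member are averaged and the argmax is taken.
# Plain score averaging is an assumed fusion rule, shown for illustration.

def ensemble_predict(member_probs):
    """member_probs: list of per-model probability vectors for one sample."""
    n_classes = len(member_probs[0])
    avg = [sum(p[c] for p in member_probs) / len(member_probs)
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])
```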
The formalism of active integrity constraints was introduced as a way to specify particular classes of integrity constraints over relational databases together with preferences on how to repair existing inconsistencies.
The rule-based syntax of such integrity constraints also provides algorithms for finding such repairs that achieve the best asymptotic complexity.
However, the different semantics that have been proposed for these integrity constraints all exhibit some counter-intuitive examples.
In this work, we look at active integrity constraints using ideas from algebraic fixpoint theory.
We show how database repairs can be modeled as fixpoints of particular operators on databases, and study how the notion of grounded fixpoint induces a corresponding notion of grounded database repair that captures several natural intuitions, and in particular avoids the problems of previous alternative semantics.
In order to study grounded repairs in their full generality, we need to generalize the notion of grounded fixpoint to non-deterministic operators.
We propose such a definition and illustrate its plausibility in the database context.
Fog Radio Access Network (F-RAN) architectures can leverage both cloud processing and edge caching for content delivery to the users.
To this end, F-RAN utilizes caches at the edge nodes (ENs) and fronthaul links connecting a cloud processor to ENs.
Assuming time-invariant content popularity, existing information-theoretic analyses of content delivery in F-RANs rely on offline caching with separate content placement and delivery phases.
In contrast, this work focuses on the scenario in which the set of popular content is time-varying, hence necessitating the online replenishment of the ENs' caches along with the delivery of the requested files.
The analysis is centered on the characterization of the long-term Normalized Delivery Time (NDT), which captures the temporal dependence of the coding latencies accrued across multiple time slots in the high signal-to-noise ratio regime.
Online edge caching and delivery schemes are investigated for both serial and pipelined transmission modes across fronthaul and edge segments.
Analytical results demonstrate that, in the presence of time-varying content popularity, the rate of the fronthaul links sets a fundamental limit on the long-term NDT of the F-RAN system.
Analytical results are further verified by numerical simulation, yielding important design insights.
The Brazilian Ministry of Health has selected the openEHR model as a standard for electronic health record systems.
This paper presents a set of archetypes to represent the main data from the Brazilian Public Hospital Information System and the High Complexity Procedures Module of the Brazilian public Outpatient Health Information System.
The archetypes from the public openEHR Clinical Knowledge Manager (CKM) were examined in order to select those that could represent the data of the above-mentioned systems.
For several concepts, it was necessary to specialize the CKM archetypes, or design new ones.
A total of 22 archetypes were used: 8 new, 5 specialized and 9 reused from CKM.
This set of archetypes can be used not only for information exchange, but also for generating a big anonymized dataset for testing openEHR-based systems.
In this paper, we consider the automated planning of optimal paths for a robotic team satisfying a high level mission specification.
Each robot in the team is modeled as a weighted transition system where the weights have associated deviation values that capture the non-determinism in the traveling times of the robot during its deployment.
The mission is given as a Linear Temporal Logic (LTL) formula over a set of propositions satisfied at the regions of the environment.
Additionally, we have an optimizing proposition capturing some particular task that must be repeatedly completed by the team.
The goal is to minimize the maximum time between successive satisfying instances of the optimizing proposition while guaranteeing that the mission is satisfied even under non-deterministic traveling times.
Our method relies on the communication capabilities of the robots to guarantee correctness and maintain performance during deployment.
After computing a set of optimal satisfying paths for the members of the team, we also compute a set of synchronization sequences for each robot to ensure that the LTL formula is never violated during deployment.
We implement and experimentally evaluate our method considering a persistent monitoring task in a road network environment.
Generative Adversarial Networks are a new family of generative models, frequently used for generating photorealistic images.
The theory promises that a GAN will eventually reach an equilibrium in which the generator produces images indistinguishable from the training set.
In practice, however, a range of problems frequently prevents the system from reaching this equilibrium, with training not progressing ahead due to instabilities or mode collapse.
This paper describes a series of experiments trying to identify patterns in regard to the effect of the training set on the dynamics and eventual outcome of the training.
The emerging Software Defined Networking (SDN) paradigm separates the data plane from the control plane and centralizes network control in an SDN controller.
Applications interact with controllers to implement network services, such as network transport with Quality of Service (QoS).
SDN facilitates the virtualization of network functions so that multiple virtual networks can operate over a given installed physical network infrastructure.
Due to the specific characteristics of optical (photonic) communication components and the high optical transmission capacities, SDN based optical networking poses particular challenges, but holds also great potential.
In this article, we comprehensively survey studies that examine the SDN paradigm in optical networks; in brief, we survey the area of Software Defined Optical Networks (SDONs).
We mainly organize the SDON studies into studies focused on the infrastructure layer, the control layer, and the application layer.
Moreover, we cover SDON studies focused on network virtualization, as well as SDON studies focused on the orchestration of multilayer and multidomain networking.
Based on the survey, we identify open challenges for SDONs and outline future directions.
Web developers routinely rely on third-party JavaScript libraries such as jQuery to enhance the functionality of their sites.
However, if not properly maintained, such dependencies can create attack vectors allowing a site to be compromised.
In this paper, we conduct the first comprehensive study of client-side JavaScript library usage and the resulting security implications across the Web.
Using data from over 133k websites, we show that 37% of them include at least one library with a known vulnerability; the time lag behind the newest release of a library is measured in the order of years.
In order to better understand why websites use so many vulnerable or outdated libraries, we track causal inclusion relationships and quantify different scenarios.
We observe sites including libraries in ad hoc and often transitive ways, which can lead to different versions of the same library being loaded into the same document at the same time.
Furthermore, we find that libraries included transitively, or via ad and tracking code, are more likely to be vulnerable.
This demonstrates that not only website administrators, but also the dynamic architecture and developers of third-party services are to blame for the Web's poor state of library management.
The results of our work underline the need for more thorough approaches to dependency management, code maintenance and third-party code inclusion on the Web.
Using the matrix factorization technique in machine learning is very common mainly in areas like recommender systems.
Despite its high prediction accuracy and its ability to avoid over-fitting of the data, the Bayesian Probabilistic Matrix Factorization algorithm (BPMF) has not been widely used on large scale data because of the prohibitive cost.
In this paper, we propose a distributed high-performance parallel implementation of the BPMF using Gibbs sampling on shared and distributed architectures.
We show that by using efficient load balancing via work stealing on a single node, and asynchronous communication in the distributed version, we beat state-of-the-art implementations.
Traditional automatic evaluation measures for natural language generation (NLG) use costly human-authored references to estimate the quality of a system output.
In this paper, we propose a referenceless quality estimation (QE) approach based on recurrent neural networks, which predicts a quality score for an NLG system output by comparing it to the source meaning representation only.
Our method outperforms traditional metrics and a constant baseline in most respects; we also show that synthetic data helps to increase correlation results by 21% compared to the base system.
Our results are comparable to results obtained in similar QE tasks despite the more challenging setting.
We propose a lightly-supervised approach for information extraction, in particular named entity classification, which combines the benefits of traditional bootstrapping, i.e., use of limited annotations and interpretability of extraction patterns, with the robust learning approaches proposed in representation learning.
Our algorithm iteratively learns custom embeddings for both the multi-word entities to be extracted and the patterns that match them from a few example entities per category.
We demonstrate that this representation-based approach outperforms three other state-of-the-art bootstrapping approaches on two datasets: CoNLL-2003 and OntoNotes.
Additionally, using these embeddings, our approach outputs a globally-interpretable model consisting of a decision list, by ranking patterns based on their proximity to the average entity embedding in a given class.
We show that this interpretable model performs close to our complete bootstrapping model, proving that representation learning can be used to produce interpretable models with small loss in performance.
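The interpretable decision-list step described above can be sketched as ranking patterns by cosine similarity between their embedding and the average embedding of the entities in a class; the embedding values in the test below are toy numbers, whereas in the paper the embeddings are learned iteratively.

```python
# Sketch of the decision-list construction: rank extraction patterns by
# cosine proximity to the average (centroid) entity embedding of a class.
from math import sqrt

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def rank_patterns(pattern_embs, entity_embs):
    """Return pattern names sorted by proximity to the class centroid."""
    dim = len(next(iter(entity_embs.values())))
    centroid = [sum(e[i] for e in entity_embs.values()) / len(entity_embs)
                for i in range(dim)]
    return sorted(pattern_embs,
                  key=lambda p: cosine(pattern_embs[p], centroid),
                  reverse=True)
```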
Online Social Networks (OSNs) attract billions of users to share information and communicate where viral marketing has emerged as a new way to promote the sales of products.
An OSN provider is often hired by an advertiser to conduct viral marketing campaigns.
The OSN provider generates revenue from the commission paid by the advertiser which is determined by the spread of its product information.
Meanwhile, to propagate influence, the activities performed by users, such as viewing video ads, normally induce a diffusion cost for the OSN provider.
In this paper, we aim to find a seed set to optimize a new profit metric that combines the benefit of influence spread with the cost of influence propagation for the OSN provider.
Under many diffusion models, our profit metric is the difference between two submodular functions which is challenging to optimize as it is neither submodular nor monotone.
We design a general two-phase framework to select seeds for profit maximization and develop several bounds to measure the quality of the seed set constructed.
Experimental results with real OSN datasets show that our approach can achieve high approximation guarantees and significantly outperform the baseline algorithms, including state-of-the-art influence maximization algorithms.
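A greatly simplified sketch of profit-driven seed selection: greedily add the candidate with the largest marginal profit (spread benefit minus propagation cost) and stop when no candidate improves the objective. The paper's two-phase framework and quality bounds are not reproduced; the profit function passed in is a stand-in.

```python
# Toy sketch of seed selection for profit maximization: greedy hill
# climbing on an arbitrary profit(seed_set) objective. Not the paper's
# two-phase framework; shown only to illustrate the optimization target.

def greedy_profit(candidates, profit):
    """profit: callable mapping a seed set to a number."""
    seeds = set()
    current = profit(seeds)
    while True:
        best, best_gain = None, 0.0
        for v in candidates - seeds:
            gain = profit(seeds | {v}) - current
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:
            return seeds  # no candidate improves profit
        seeds.add(best)
        current += best_gain
```

Because the profit metric is neither submodular nor monotone, plain greedy carries no approximation guarantee here; that is precisely the difficulty the paper's bounds address.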
Observable operator models (OOMs) and related models are one of the most important and powerful tools for modeling and analyzing stochastic systems.
They exactly describe dynamics of finite-rank systems and can be efficiently and consistently estimated through spectral learning under the assumption of identically distributed data.
In this paper, we investigate the properties of spectral learning without this assumption due to the requirements of analyzing large-time scale systems, and show that the equilibrium dynamics of a system can be extracted from nonequilibrium observation data by imposing an equilibrium constraint.
In addition, we propose a binless extension of spectral learning for continuous data.
In comparison with the other continuous-valued spectral algorithms, the binless algorithm can achieve consistent estimation of equilibrium dynamics with only linear complexity.
We propose a method for revealing subcommunity structure in scientific networks of relatively small size by analyzing publication databases.
Research relationships between the network members can be visualized as a graph with vertices corresponding to authors and with edges indicating joint authorship.
Using a fast clustering algorithm combined with a graph layout algorithm, we demonstrate how to display these clustering results in an attractive and informative way.
The small size of the graph allows us to develop tools that keep track of how these research subcommunities evolve in time, as well as to present the research articles that create the links between the network members.
These tools are included in a web app, where the visitor can easily identify the various subcommunities, also providing valuable information for administrative purposes.
Our method was developed for the GEAR mathematical network and it can be applied to other networks.
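The graph-construction step described above can be sketched as follows: joint authorship defines the edges, and connected components serve here as a crude stand-in for the subcommunities (the actual method applies a fast clustering algorithm, which is not shown).

```python
# Simplified sketch: build a co-authorship graph from publication author
# lists and extract connected components as a crude proxy for research
# subcommunities (the paper uses a proper clustering algorithm instead).
from itertools import combinations

def coauthor_components(papers):
    """papers: list of author-name lists -> list of author sets."""
    adj = {}
    for authors in papers:
        for a, b in combinations(authors, 2):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
        for a in authors:
            adj.setdefault(a, set())
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one component
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps
```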
The project presented in this article aims to formalize criteria and procedures in order to extract semantic information from parsed dictionary glosses.
The actual purpose of the project is the generation of a semantic network (nearly an ontology) issued from a monolingual Italian dictionary, through unsupervised procedures.
Since the project involves rule-based parsing, semantic tagging, and word-sense-disambiguation techniques, its outcomes may be of interest beyond this immediate aim.
The cooperation of syntactic and semantic features in meaning construction is investigated, and procedures that translate syntactic dependencies into semantic relations are discussed.
The procedures arising from this project can also be applied to text types other than dictionary glosses, as they convert the output of a parsing process into a semantic representation.
In addition, some mechanisms are sketched that may lead to a kind of procedural semantics, through which multiple paraphrases of a given expression can be generated.
This means that these techniques may also find application in query-expansion strategies, of interest to information retrieval, search engines, and question-answering systems.
We introduce a novel method for robust and accurate 3D object pose estimation from a single color image under large occlusions.
Following recent approaches, we first predict the 2D projections of 3D points related to the target object and then compute the 3D pose from these correspondences using a geometric method.
Unfortunately, as the results of our experiments show, predicting these 2D projections using a regular CNN or a Convolutional Pose Machine is highly sensitive to partial occlusions, even when these methods are trained with partially occluded examples.
Our solution is to predict heatmaps from multiple small patches independently and to accumulate the results to obtain accurate and robust predictions.
Training subsequently becomes challenging because patches with similar appearances but different positions on the object correspond to different heatmaps.
However, we provide a simple yet effective solution to deal with such ambiguities.
We show that our approach outperforms existing methods on two challenging datasets: The Occluded LineMOD dataset and the YCB-Video dataset, both exhibiting cluttered scenes with highly occluded objects.
Project website: https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/robust-object-pose-estimation/
Cold-start is a very common and still open problem in the Recommender Systems literature.
Since cold-start items do not have any interactions, collaborative algorithms are not applicable.
One of the main strategies is to use pure or hybrid content-based approaches, which usually yield lower recommendation quality than collaborative ones.
Some techniques to optimize the performance of this type of approach have been studied in the recent past.
One of them is called feature weighting, which assigns to every feature a real value, called weight, that estimates its importance.
Statistical techniques for feature weighting commonly used in Information Retrieval, like TF-IDF, have been adapted for Recommender Systems, but they often do not provide sufficient quality improvements.
More recent approaches, FBSM and LFW, estimate weights by leveraging collaborative information via machine learning, in order to learn the importance of a feature based on other users' opinions.
This type of model has shown promising results compared to the classic statistical analyses cited previously.
We propose a novel graph, feature-based machine learning model to face the cold-start item scenario, learning the relevance of features from probabilities of item-based collaborative filtering algorithms.
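As a point of reference, the classic statistical TF-IDF weighting mentioned above can be sketched for item features: each feature is weighted by its frequency within the item, discounted by how many items carry it.

```python
# Minimal sketch of TF-IDF feature weighting as adapted to item content:
# weight(item, feature) = term frequency within the item * log inverse
# document (item) frequency. Shown as the statistical baseline only.
from math import log

def tfidf_weights(item_features):
    """item_features: dict item -> list of feature tokens."""
    n_items = len(item_features)
    df = {}
    for feats in item_features.values():
        for f in set(feats):
            df[f] = df.get(f, 0) + 1
    weights = {}
    for item, feats in item_features.items():
        for f in set(feats):
            tf = feats.count(f) / len(feats)
            weights[(item, f)] = tf * log(n_items / df[f])
    return weights
```

A feature present in every item receives weight zero, which illustrates why such purely statistical weights often fail to capture collaborative notions of feature importance.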
In this paper, we apply the scattering transform (ST), a nonlinear map structured like a convolutional neural network (CNN), to the classification of underwater objects using sonar signals.
The ST formalizes the observation that the filters learned by a CNN have wavelet-like structure.
We achieve effective binary classification on both a real dataset of unexploded ordnance (UXO) and synthetically generated examples.
We also explore the effects on the waveforms of changes in the object domain (e.g., translation, rotation, and acoustic impedance), and examine the consequences of theoretical results for the scattering transform.
We show that the scattering transform is capable of excellent classification on both the synthetic and real problems, thanks to having more quasi-invariance properties that are well-suited to translation and rotation of the object.
Makespan minimization in tasks scheduling of infrastructure as a service (IaaS) cloud is an NP-hard problem.
A number of techniques have been used in the past to optimize the makespan of scheduled tasks in IaaS clouds, which is proportional to the execution cost billed to customers.
In this paper, we propose a League Championship Algorithm (LCA)-based makespan minimization scheduling technique for IaaS clouds.
The LCA is a sports-inspired, population-based algorithmic framework for global optimization over a continuous search space.
Three existing algorithms, namely First Come First Served (FCFS), Last Job First (LJF), and Best Effort First (BEF), were used to evaluate the performance of the proposed algorithm.
All algorithms under consideration are assumed to be non-preemptive.
The results obtained show that the LCA scheduling technique performs moderately better than the other algorithms in minimizing the makespan of scheduled tasks in IaaS clouds.
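The makespan objective being minimized can be sketched as follows: tasks are assigned greedily to the least-loaded of m machines, with the task order set by the scheduling policy. The policy names echo the abstract's FCFS and LJF; interpreting LJF as a longest-job-first ordering is our assumption, and this sketch is not the LCA itself.

```python
# Illustrative sketch of the makespan metric: greedy assignment of tasks
# to the least-loaded of m machines, with task order fixed by the policy.
# Interpreting "LJF" as longest-job-first ordering is an assumption.
import heapq

def makespan(task_times, m, policy="FCFS"):
    order = (sorted(task_times, reverse=True) if policy == "LJF"
             else list(task_times))
    loads = [0.0] * m          # current finish time of each machine
    heapq.heapify(loads)
    for t in order:
        least = heapq.heappop(loads)   # least-loaded machine
        heapq.heappush(loads, least + t)
    return max(loads)
```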
Information transmission over channels with transceiver distortion is investigated via generalized mutual information (GMI) under Gaussian input distribution and nearest-neighbor decoding.
A canonical transceiver structure in which the channel output is processed by a minimum mean-squared error estimator before decoding is established to maximize the GMI, and the well-known Bussgang's decomposition is shown to be a heuristic that is consistent with the GMI under linear output processing.
The problem of unicity and reidentifiability of records in large-scale databases has been studied in different contexts and approaches, with focus on preserving privacy or matching records from different data sources.
With an increasing number of service providers nowadays routinely collecting location traces of their users on unprecedented scales, there is a pronounced interest in the possibility of matching records and datasets based on spatial trajectories.
Extending previous work on reidentifiability of spatial data and trajectory matching, we present the first large-scale analysis of user matchability in real mobility datasets on realistic scales, i.e. among two datasets that consist of several million people's mobility traces, coming from a mobile network operator and transportation smart card usage.
We extract the relevant statistical properties which influence the matching process and analyze their impact on the matchability of users.
We show that for individuals with typical activity in the transportation system (those making 3-4 trips per day on average), a matching algorithm based on the co-occurrence of their activities is expected to achieve a 16.8% success rate after only one week of observation of their mobility traces, and over 55% after four weeks.
We show that the main determinant of matchability is the expected number of co-occurring records in the two datasets.
Finally, we discuss different scenarios in terms of data collection frequency and give estimates of matchability over time.
We show that with higher frequency data collection becoming more common, we can expect much higher success rates in even shorter intervals.
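The co-occurrence idea behind the matching algorithm above can be sketched simply: records are binned into (time slot, location) pairs, and the candidate in one dataset sharing the most co-occurring records with a given user in the other dataset is selected. The binning granularity here is arbitrary and illustrative.

```python
# Simplified sketch of co-occurrence-based trajectory matching: score a
# candidate by the number of shared (time_slot, location) records, and
# pick the highest-scoring candidate. Binning granularity is arbitrary.

def cooccurrence_score(trace_a, trace_b):
    """traces: iterables of (time_slot, location_id) records."""
    return len(set(trace_a) & set(trace_b))

def best_match(trace_a, candidates_b):
    """candidates_b: dict user_id -> trace; returns the best-scoring user."""
    return max(candidates_b,
               key=lambda u: cooccurrence_score(trace_a, candidates_b[u]))
```

This also makes the paper's main finding concrete: the expected number of co-occurring records directly drives the score gap between the true match and the other candidates.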
Consumers with low demand, like households, are generally supplied single-phase power by connecting their service mains to one of the phases of a distribution transformer.
The distribution companies face the problem of keeping a record of consumer connectivity to a phase due to uninformed changes that happen.
The exact phase connectivity information is important for the efficient operation and control of distribution system.
We propose a new data driven approach to the problem based on Principal Component Analysis (PCA) and its Graph Theoretic interpretations, using energy measurements in equally timed short intervals, generated from smart meters.
We propose an algorithm for inferring phase connectivity from noisy measurements.
The algorithm is demonstrated using simulated data for phase connectivities in distribution networks.
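A much-simplified, correlation-based variant of the phase-identification idea can be sketched as follows; note this is a plainly swapped-in technique for illustration, not the paper's PCA-based method with its graph-theoretic interpretation. Each consumer's interval-energy series is assigned to the phase whose aggregate series it correlates with most.

```python
# Simplified correlation-based variant of phase identification (the paper
# uses PCA, not reproduced here): assign each consumer to the phase whose
# aggregate interval-energy series has the highest Pearson correlation.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def assign_phase(consumer_series, phase_series):
    """phase_series: dict phase_name -> aggregate energy series."""
    return max(phase_series,
               key=lambda p: pearson(consumer_series, phase_series[p]))
```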
Automated emotion recognition in the wild from facial images remains a challenging problem.
Although recent advances in Deep Learning have supposed a significant breakthrough in this topic, strong changes in pose, orientation and point of view severely harm current approaches.
In addition, the acquisition of labeled datasets is costly, and current state-of-the-art deep learning algorithms cannot model all the aforementioned difficulties.
In this paper, we propose to apply a multi-task learning loss function to share a common feature representation with other related tasks.
Particularly we show that emotion recognition benefits from jointly learning a model with a detector of facial Action Units (collective muscle movements).
The proposed loss function addresses the problem of learning multiple tasks with heterogeneously labeled data, improving previous multi-task approaches.
We validate the proposal using two datasets acquired in non controlled environments, and an application to predict compound facial emotion expressions.
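The heterogeneous-label issue addressed by the loss above can be sketched with masking: when a sample lacks the label for one task (e.g. Action Units annotated but the emotion missing), that task's term is simply dropped from the combined loss. The cross-entropy form and the weighting scheme below are our assumptions for illustration.

```python
# Sketch of a masked multi-task loss for heterogeneously labeled data:
# tasks without a ground-truth label for the current sample contribute
# nothing. Cross-entropy form and weighting are illustrative assumptions.
from math import log

def masked_multitask_loss(true_class_probs, weights):
    """true_class_probs: dict task -> model probability assigned to the
    ground-truth class, or None when the task is unlabeled for the sample."""
    total, norm = 0.0, 0.0
    for task, p in true_class_probs.items():
        if p is None:
            continue  # missing label: this task's term is masked out
        total += -weights[task] * log(p)
        norm += weights[task]
    return total / norm if norm else 0.0
```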
Point pair features are a popular representation for free-form 3D object detection and pose estimation.
In this paper, their performance in an industrial random bin picking context is investigated.
A new method to generate representative synthetic datasets is proposed.
This allows us to investigate the influence of a high degree of clutter and the presence of self-similar features, which are typical of our application.
We provide an overview of solutions proposed in literature and discuss their strengths and weaknesses.
A simple heuristic method to drastically reduce the computational complexity is introduced, which results in improved robustness, speed and accuracy compared to the naive approach.
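For reference, the classical four-dimensional point pair feature that Drost-style detectors build on can be computed as below (distances in model units, angles in radians); this states the standard definition, not the paper's heuristic.

```python
import numpy as np

def point_pair_feature(p1, n1, p2, n2):
    # F = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)),
    # where d is the vector between the two oriented points.
    d = p2 - p1

    def angle(a, b):
        cos = np.clip(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)), -1.0, 1.0)
        return np.arccos(cos)

    return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])
```

Quantizing these four values gives the hash key used to index model point pairs at detection time.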
The European Space Agency (ESA) defines an Earth Observation (EO) Level 2 product as a multispectral (MS) image corrected for geometric, atmospheric, adjacency and topographic effects, stacked with its scene classification map (SCM), whose legend includes quality layers such as cloud and cloud-shadow.
No ESA EO Level 2 product has ever been systematically generated at the ground segment.
To contribute toward filling an information gap from EO big data to the ESA EO Level 2 product, an original Stage 4 validation (Val) of the Satellite Image Automatic Mapper (SIAM) lightweight computer program was conducted by independent means on an annual Web-Enabled Landsat Data (WELD) image composite time-series of the conterminous U.S.
The core of SIAM is a one-pass, prior-knowledge-based decision tree for MS reflectance space hyperpolyhedralization into static color names, presented in the literature in recent years.
For the sake of readability, this paper is split into two parts.
The present Part 1 Theory provides the multidisciplinary background of a priori color naming in cognitive science, from linguistics to computer vision.
To cope with dictionaries of MS color names and land cover class names that do not coincide and must be harmonized, an original hybrid guideline is proposed to identify a categorical variable pair relationship.
An original quantitative measure of categorical variable pair association is also proposed.
The subsequent Part 2 Validation discusses Stage 4 Val results collected by an original protocol for wall-to-wall thematic map quality assessment without sampling where the test and reference map legends can differ.
Conclusions are that the SIAM-WELD maps instantiate a Level 2 SCM product whose legend is the 4 class taxonomy of the FAO Land Cover Classification System at the Dichotomous Phase Level 1 vegetation/nonvegetation and Level 2 terrestrial/aquatic.
The ultimate goal of any software developer seeking a competitive edge is to meet stakeholders' needs and expectations.
To achieve this, it is necessary to effectively and accurately manage stakeholders' system requirements.
The paper proposes a systematic way of classifying stakeholders and then describes a novel method for calculating stakeholder priority, taking into consideration the fact that different stakeholders have different importance levels and different requirement preferences.
Finally, the requirement preference calculation is performed, where stakeholders choose the best requirements based on two factors: the value and the urgency of the requirement.
The proposed method actively involves stakeholders in the requirement elicitation process.
Aboria is a powerful and flexible C++ library for the implementation of particle-based numerical methods.
The particles in such methods can represent actual particles (e.g., Molecular Dynamics) or abstract particles used to discretise a continuous function over a domain (e.g., Radial Basis Functions).
Aboria provides a particle container, compatible with the Standard Template Library, spatial search data structures, and a Domain Specific Language to specify non-linear operators on the particle set.
This paper gives an overview of Aboria's design, an example of use, and a performance benchmark.
In 'An asymptotic result on compressed sensing matrices', a new construction for compressed sensing matrices using combinatorial design theory was introduced.
In this paper, we use deterministic and probabilistic methods to analyse the performance of matrices obtained from this construction.
We provide new theoretical results and detailed simulations.
These simulations indicate that the construction is competitive with Gaussian random matrices, and that recovery is tolerant to noise.
A new recovery algorithm tailored to the construction is also given.
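As an illustration of the recovery setting (not the paper's tailored algorithm), a standard Orthogonal Matching Pursuit baseline for noiseless k-sparse recovery looks like the following; the Gaussian sensing matrix stands in for the combinatorial construction.

```python
import numpy as np

def omp(A, y, k):
    # Orthogonal Matching Pursuit: greedily pick the column most
    # correlated with the residual, then re-fit on the chosen support.
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(40, 100)) / np.sqrt(40)   # normalized Gaussian sensing matrix
x0 = np.zeros(100)
x0[[5, 17, 60]] = [1.0, -2.0, 0.5]             # 3-sparse signal
x_hat = omp(A, A @ x0, k=3)
```

In the noiseless regime with enough measurements, the 3-sparse signal is recovered exactly; simulations like the paper's then add noise and vary the sensing matrix.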
Evaluating the effectiveness and benefits of driver assistance systems is crucial for improving the system performance.
In this paper, we propose a novel framework for testing and evaluating lane departure correction systems at a low cost by using lane departure events reproduced from naturalistic driving data.
First, 529,096 lane departure events were extracted from the Safety Pilot Model Deployment (SPMD) database collected by the University of Michigan Transportation Research Institute.
Second, a stochastic lane departure model consisting of eight random key variables was developed to reduce the dimension of the data description and improve the computational efficiency.
As such, we used a bounded Gaussian mixture (BGM) model to describe drivers' stochastic lane departure behaviors.
Then, a lane departure correction system with an aim point controller was designed, and a batch of lane departure events were reproduced from the learned stochastic driver model.
Finally, we assessed the developed evaluation approach by comparing the lateral departure areas of vehicles with and without the correction controller.
The simulation results show that the proposed method can effectively evaluate lane departure correction systems.
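The event-reproduction idea can be illustrated with a toy sampler that draws a lane departure variable from a truncated (bounded) Gaussian mixture by rejection. The one-dimensional distribution and its bounds below are hypothetical; the paper's BGM model jointly covers eight variables.

```python
import numpy as np

def sample_bounded_gmm(weights, means, stds, low, high, n, rng):
    # Rejection sampling: draw from the unbounded mixture and keep only
    # samples inside the physically plausible interval [low, high].
    out = []
    while len(out) < n:
        comp = rng.choice(len(weights), p=weights)
        x = rng.normal(means[comp], stds[comp])
        if low <= x <= high:
            out.append(x)
    return np.array(out)

rng = np.random.default_rng(2)
# hypothetical lateral-speed-at-departure distribution (m/s)
s = sample_bounded_gmm([0.7, 0.3], [0.3, 0.9], [0.1, 0.2], 0.0, 1.5, 5000, rng)
```

Sampled events of this kind can then be replayed against the correction controller in batch simulation.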
ConcurORAM is a parallel, multi-client oblivious RAM (ORAM) that eliminates waiting for concurrent stateless clients and allows overall throughput to scale gracefully, without requiring trusted third party components (proxies) or direct inter-client coordination.
A key insight behind ConcurORAM is the fact that, during multi-client data access, only a subset of the concurrently-accessed server-hosted data structures require access privacy guarantees.
Everything else can be safely implemented as oblivious data structures that are later synced securely and efficiently during an ORAM "eviction".
Further, since a major contributor to latency is the eviction - in which client-resident data is reshuffled and reinserted back encrypted into the main server database - ConcurORAM also enables multiple concurrent clients to evict asynchronously, in parallel (without compromising consistency), and in the background without having to block ongoing queries.
As a result, throughput scales well with increasing number of concurrent clients and is not significantly impacted by evictions.
For example, about 65 queries per second can be executed in parallel by 30 concurrent clients, a 2x speedup over the state-of-the-art.
The query access time for individual clients increases by only 2x when compared to a single-client deployment.
Coexistence of Wi-Fi and LTE Unlicensed (LTE-U) in shared or unlicensed bands has drawn growing attention from both academia and industry.
An important consideration is fairness between Wi-Fi and duty cycled LTE-U, which is often defined in terms of channel access time, as adopted by the LTE-U Forum.
Despite many studies on duty cycle adaptation design for fair sharing, one crucial fact has often been neglected: LTE-U systems unilaterally control LTE-U duty cycles; hence, as self-interested users, they have incentives to misbehave, e.g., transmitting with a larger duty cycle that exceeds a given limit, so as to gain a greater share of channel access time and throughput.
In this paper, we propose a scheme that allows the spectrum manager managing the shared bands to estimate the duty cycle of a target LTE-U cell based on PHY layer observations from a nearby Wi-Fi AP, without interrupting normal Wi-Fi operations.
We further propose a thresholding scheme to detect duty cycling misbehavior (i.e., determining if the duty cycle exceeds the assigned limit), and analyze its performance in terms of detection and false alarm probabilities.
The proposed schemes are implemented in ns3 and evaluated with extensive simulations.
Our results show that the proposed scheme provides an estimate within +/- 1% of the true duty cycle, and detects misbehavior with a duty cycle 2.8% higher than the limit with a detection probability of at least 95%, while keeping the false alarm probability less than or equal to 1%.
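A toy version of the thresholding detector, under the simplifying assumption that each observation is an independent 1/0 "LTE-U transmitting" sample rather than the paper's PHY-layer Wi-Fi measurements; the 50% limit and 1% threshold are illustrative values.

```python
import numpy as np

def detect_misbehavior(busy_samples, limit, threshold):
    # Estimate the duty cycle as the fraction of 'on' observations and
    # flag the cell if the estimate exceeds the limit by more than the
    # detection threshold.
    est = busy_samples.mean()
    return est, est > limit + threshold

rng = np.random.default_rng(3)
honest = rng.random(100_000) < 0.50      # cell operating exactly at the 50% limit
cheater = rng.random(100_000) < 0.528    # cell 2.8% above the limit
_, flag_honest = detect_misbehavior(honest, 0.50, 0.01)
est_cheat, flag_cheat = detect_misbehavior(cheater, 0.50, 0.01)
```

The threshold trades off detection probability against false alarms, which is exactly the trade-off the abstract's 95%/1% figures quantify.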
We leverage stochastic geometry to characterize key performance metrics for neighboring Wi-Fi and LTE networks in unlicensed spectrum.
Our analysis focuses on a single unlicensed frequency band, where the locations for the Wi-Fi access points (APs) and LTE eNodeBs (eNBs) are modeled as two independent homogeneous Poisson point processes.
Three LTE coexistence mechanisms are investigated: (1) LTE with continuous transmission and no protocol modifications; (2) LTE with discontinuous transmission; and (3) LTE with listen-before-talk (LBT) and random back-off (BO).
For each scenario, we have derived the medium access probability (MAP), the signal-to-interference-plus-noise ratio (SINR) coverage probability, the density of successful transmissions (DST), and the rate coverage probability for both Wi-Fi and LTE.
Compared to the baseline scenario where one Wi-Fi network coexists with an additional Wi-Fi network, our results show that Wi-Fi performance is severely degraded when LTE transmits continuously.
However, LTE is able to improve the DST and rate coverage probability of Wi-Fi while maintaining acceptable data rate performance when it adopts one or more of the following coexistence features: a shorter transmission duty cycle, lower channel access priority, or more sensitive clear channel assessment (CCA) thresholds.
The problem of understanding people's participation in real-world events has been a subject of active research and can offer valuable insights for human behavior analysis and event-related recommendation/advertisement.
In this work, we study the latent factors determining event popularity using large-scale datasets collected from the popular event-based social network (EBSN) Meetup.com in three major cities around the world.
We have conducted modeling analysis of four contextual factors (spatial, group, temporal, and semantic), and also developed a group-based social influence propagation network to model group-specific influences on events.
By combining the Contextual features And Social Influence NetwOrk, our integrated prediction framework CASINO can capture the diverse influential factors of event participation and can be used by event organizers to predict/improve the popularity of their events.
Evaluations demonstrate that our CASINO framework achieves high prediction accuracy with contributions from all the latent features we capture.
In this paper, we present a novel deep learning based approach for addressing the problem of interaction recognition from a first person perspective.
The proposed approach uses a pair of convolutional neural networks, whose parameters are shared, for extracting frame level features from successive frames of the video.
The frame level features are then aggregated using a convolutional long short-term memory.
The hidden state of the convolutional long short-term memory, after all the input video frames are processed, is used for classification into the respective categories.
The two branches of the convolutional neural network perform feature encoding on a short time interval whereas the convolutional long short term memory encodes the changes on a longer temporal duration.
In our network, the spatio-temporal structure of the input is preserved until the very final processing stage.
Experimental results show that our method outperforms the state of the art on most recent first person interactions datasets that involve complex ego-motion.
In particular, on UTKinect-FirstPerson it competes with methods that use depth image and skeletal joints information along with RGB images, while it surpasses all previous methods that use only RGB images by more than 20% in recognition accuracy.
In this article we introduce the concept and the first implementation of a lightweight client-server framework as middleware for distributed computing.
On the client side an installation without administrative rights or privileged ports can turn any computer into a worker node.
Only a Java runtime environment and the JAR files comprising the workflow client are needed.
To connect all clients to the engine one open server port is sufficient.
The engine submits data to the clients and orchestrates their work by workflow descriptions from a central database.
Clients request new task descriptions periodically; thus, the system is robust against network failures.
In the basic set-up, data up- and downloads are handled via HTTP communication with the server.
The performance of the modular system could additionally be improved using dedicated file servers or distributed network file systems.
We demonstrate the design features of the proposed engine in real-world applications from mechanical engineering.
We have used this system on a compute cluster in design-of-experiment studies, parameter optimisations and robustness validations of finite element structures.
The paper presents a thermal predictive analysis of electric power system security for the day ahead.
This predictive analysis is set as a thermal computation of the expected security.
This computation is obtained by cointegrating the daily electric power system load and the weather, by finding the daily electric power system thermodynamics, and by introducing tests for these thermodynamics.
The predictive analysis made shows the electricity consumers' wisdom.
Most of existing image denoising methods assume the corrupted noise to be additive white Gaussian noise (AWGN).
However, the realistic noise in real-world noisy images is much more complex than AWGN and is hard to model with simple analytical distributions.
As a result, many state-of-the-art denoising methods in literature become much less effective when applied to real-world noisy images captured by CCD or CMOS cameras.
In this paper, we develop a trilateral weighted sparse coding (TWSC) scheme for robust real-world image denoising.
Specifically, we introduce three weight matrices into the data and regularisation terms of the sparse coding framework to characterise the statistics of realistic noise and image priors.
TWSC can be reformulated as a linear equality-constrained problem and can be solved by the alternating direction method of multipliers.
The existence and uniqueness of the solution and convergence of the proposed algorithm are analysed.
Extensive experiments demonstrate that the proposed TWSC scheme outperforms state-of-the-art denoising methods on removing realistic noise.
A central goal of unsupervised learning is to acquire representations from unlabeled data or experience that can be used for more effective learning of downstream tasks from modest amounts of labeled data.
Many prior unsupervised learning works aim to do so by developing proxy objectives based on reconstruction, disentanglement, prediction, and other metrics.
Instead, we develop an unsupervised meta-learning method that explicitly optimizes for the ability to learn a variety of tasks from small amounts of data.
To do so, we construct tasks from unlabeled data in an automatic way and run meta-learning over the constructed tasks.
Surprisingly, we find that, when integrated with meta-learning, relatively simple task construction mechanisms, such as clustering embeddings, lead to good performance on a variety of downstream, human-specified tasks.
Our experiments across four image datasets indicate that our unsupervised meta-learning approach acquires a learning algorithm without any labeled data that is applicable to a wide range of downstream classification tasks, improving upon the embedding learned by four prior unsupervised learning methods.
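The task-construction mechanism can be sketched as follows: run k-means on unlabeled embeddings, treat the cluster assignments as pseudo-labels, and sample N-way, k-shot episodes from them. This is a simplified sketch of the "clustering embeddings" idea; the paper's pipeline includes details omitted here.

```python
import numpy as np

def make_tasks(embeddings, n_clusters, n_way, k_shot, n_tasks, rng, iters=20):
    # -- plain k-means over the unlabeled embeddings --
    centers = embeddings[rng.choice(len(embeddings), n_clusters, replace=False)]
    for _ in range(iters):
        d = ((embeddings[:, None] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(n_clusters):
            if (labels == c).any():
                centers[c] = embeddings[labels == c].mean(0)
    # -- sample n_way / k_shot episodes using clusters as pseudo-classes --
    nonempty = [c for c in range(n_clusters) if (labels == c).any()]
    tasks = []
    for _ in range(n_tasks):
        classes = rng.choice(nonempty, n_way, replace=False)
        support = np.array([rng.choice(np.flatnonzero(labels == c), k_shot)
                            for c in classes])
        tasks.append((classes, support))   # indices into `embeddings`
    return labels, tasks

rng = np.random.default_rng(5)
blobs = np.concatenate([rng.normal(c, 0.1, size=(30, 2)) for c in (0.0, 5.0, 10.0)])
labels, tasks = make_tasks(blobs, n_clusters=3, n_way=2, k_shot=5, n_tasks=4, rng=rng)
```

A meta-learner is then trained on these automatically constructed episodes exactly as it would be on human-labeled few-shot tasks.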
In the past decade, Convolutional Neural Networks (CNNs) have been demonstrated to be successful for object detection.
However, the size of network input is limited by the amount of memory available on GPUs.
Moreover, performance degrades when detecting small objects.
To alleviate the memory usage and improve the performance of detecting small traffic signs, we propose an approach for detecting small traffic signs from large images under real-world conditions.
In particular, large images are broken into small patches as input to a Small-Object-Sensitive-CNN (SOS-CNN) modified from a Single Shot Multibox Detector (SSD) framework with a VGG-16 network as the base network to produce patch-level object detection results.
Scale invariance is achieved by applying the SOS-CNN on an image pyramid.
Then, image-level object detection is obtained by projecting all the patch-level detection results to the image at the original scale.
Experimental results on a real-world traffic sign dataset have demonstrated the effectiveness of the proposed method in terms of detection accuracy and recall, especially for signs of small sizes.
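The tiling-and-projection step can be sketched as below, with `detect_fn` standing in for the SOS-CNN; the helper names and the toy nonzero-pixel detector are hypothetical illustrations of the mechanism, not the paper's implementation.

```python
import numpy as np

def tile_and_project(image, patch, stride, detect_fn):
    # Slide a patch window over the large image, run the patch-level
    # detector, and shift each returned box back to image coordinates.
    H, W = image.shape[:2]
    boxes = []
    for y in range(0, max(H - patch, 0) + 1, stride):
        for x in range(0, max(W - patch, 0) + 1, stride):
            for (x1, y1, x2, y2, score) in detect_fn(image[y:y + patch, x:x + patch]):
                boxes.append((x1 + x, y1 + y, x2 + x, y2 + y, score))
    return boxes

def toy_detect(p):
    # Stand-in detector: bounding box of nonzero pixels, if any.
    ys, xs = np.nonzero(p)
    if len(ys) == 0:
        return []
    return [(xs.min(), ys.min(), xs.max() + 1, ys.max() + 1, 1.0)]

img = np.zeros((100, 100))
img[60:70, 30:40] = 1.0                        # a 10x10 "sign"
boxes = tile_and_project(img, patch=50, stride=25, detect_fn=toy_detect)
```

In the full pipeline, overlapping patch detections would additionally be merged by non-maximum suppression at the original image scale.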
The Web and its Semantic extension (i.e., Linked Open Data) contain open global-scale knowledge and make it available to potentially intelligent machines that want to benefit from it.
Nevertheless, most of Linked Open Data lack ontological distinctions and have sparse axiomatisation.
For example, distinctions such as whether an entity is inherently a class or an individual, or whether it is a physical object or not, are hardly expressed in the data, although they have been largely studied and formalised by foundational ontologies (e.g., DOLCE, SUMO).
These distinctions belong to common sense too, which is relevant for many artificial intelligence tasks such as natural language understanding, scene recognition, and the like.
There is a gap between foundational ontologies, which often formalise or are inspired by pre-existing philosophical theories and are developed with a top-down approach, and Linked Open Data, which mostly derive from existing databases or crowd-based efforts (e.g., DBpedia, Wikidata).
We investigate whether machines can learn foundational distinctions over Linked Open Data entities, and if they match common sense.
We want to answer questions such as "does the DBpedia entity for dog refer to a class or to an instance?".
We report on a set of experiments based on machine learning and crowdsourcing that show promising results.
Automated Program Repair (APR) is an emerging research field.
Many APR techniques, for different programming language and platforms, have been proposed and evaluated on several Benchmarks.
However, to the best of our knowledge, no well-defined benchmark based on mobile projects exists; consequently, there is a gap in leveraging APR methods for mobile development.
Therefore, given the large number of Android applications around the world, we present DroidBugs, an introductory benchmark based on the analysis of 360 open-source Android projects, each with more than 5,000 downloads.
DroidBugs contains 13 single bugs from five applications, classified by the type of test that exposed them.
Using an APR tool called Astor4Android and two common fault localization strategies, we observed how challenging it is to find and fix mobile bugs.
The emergence of low-power wide area networks (LPWANs) as a new agent in the Internet of Things (IoT) will result in the incorporation into the digital world of low-automated processes from a wide variety of sectors.
The single-hop conception of typical LPWAN deployments, though simple and robust, overlooks the self-organization capabilities of network devices, suffers from lack of scalability in crowded scenarios, and pays little attention to energy consumption.
Aiming to make the most of devices' capabilities, the HARE protocol stack is proposed in this paper as a new LPWAN technology flexible enough to adopt uplink multi-hop communications when they prove more energy efficient.
In this way, results from a real testbed show energy savings of up to 15% when using a multi-hop approach while keeping the same network reliability.
The system's self-organizing capability and resilience have also been validated after performing numerous iterations of the association mechanism and deliberately switching off network devices.
Adapted from biological sequence alignment, trace alignment is a process mining technique used to visualize and analyze workflow data.
Any analysis done with this method, however, is affected by the alignment quality.
The best existing trace alignment techniques use progressive guide-trees to heuristically approximate the optimal alignment in O(N^2 L^2) time.
These algorithms are heavily dependent on the selected guide-tree metric, often return sum-of-pairs-score-reducing errors that interfere with interpretation, and are computationally intensive for large datasets.
To alleviate these issues, we propose process-oriented iterative multiple alignment (PIMA), which contains specialized optimizations to better handle workflow data.
We demonstrate that PIMA is a flexible framework capable of achieving better sum-of-pairs score than existing trace alignment algorithms in only O(NL^2) time.
We applied PIMA to analyzing medical workflow data, showing how iterative alignment can better represent the data and facilitate the extraction of insights from data visualization.
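The pairwise building block that progressive and iterative multiple-alignment methods repeatedly invoke is global sequence alignment. Below is a standard Needleman-Wunsch implementation over activity strings with illustrative scoring; PIMA's specialized workflow optimizations are not reproduced here.

```python
def align_pair(a, b, gap=-1, match=1, mismatch=-1):
    # Dynamic-programming score table.
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub, S[i - 1][j] + gap, S[i][j - 1] + gap)
    # Traceback, preferring substitutions over gaps on ties.
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = (match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch)
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and S[i][j] == S[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append('-'); i -= 1
        else:
            out_a.append('-'); out_b.append(b[j - 1]); j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b))
```

An iterative aligner applies this repeatedly, realigning each trace against the current profile until the sum-of-pairs score stops improving.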
Optical coherence tomography (OCT) is used for non-invasive diagnosis of diabetic macular edema assessing the retinal layers.
In this paper, we propose a new fully convolutional deep architecture, termed ReLayNet, for end-to-end segmentation of retinal layers and fluid masses in eye OCT scans.
ReLayNet uses a contracting path of convolutional blocks (encoders) to learn a hierarchy of contextual features, followed by an expansive path of convolutional blocks (decoders) for semantic segmentation.
ReLayNet is trained to optimize a joint loss function comprising weighted logistic regression and Dice overlap loss.
The framework is validated on a publicly available benchmark dataset with comparisons against five state-of-the-art segmentation methods including two deep learning based approaches to substantiate its effectiveness.
Higher-dimensional analogs of the predictable degree property and column reducedness are defined, and it is proved that the two properties are equivalent.
It is shown that every multidimensional convolutional code has what is called a minimal reduced polynomial resolution.
It is uniquely determined (up to isomorphism) and leads to a number of important integer invariants of the code, generalizing the classical Forney indices.
A few decades of work in the AI field have focused on developing a new generation of systems which can acquire knowledge via interaction with the world.
Yet, until very recently, most such attempts were underpinned by research which predominantly regarded linguistic phenomena as separated from the brain and body.
This could lead one into believing that to emulate linguistic behaviour, it suffices to develop 'software' operating on abstract representations that will work on any computational machine.
This picture is inaccurate for several reasons, which are elucidated in this paper and extend beyond sensorimotor and semantic resonance.
Beginning with a review of research, I list several heterogeneous arguments against disembodied language, in an attempt to draw conclusions for developing embodied multisensory agents which communicate verbally and non-verbally with their environment.
Without taking into account both the architecture of the human brain, and embodiment, it is unrealistic to replicate accurately the processes which take place during language acquisition, comprehension, production, or during non-linguistic actions.
While robots are far from isomorphic with humans, they could benefit from strengthened associative connections in the optimization of their processes and their reactivity and sensitivity to environmental stimuli, and in situated human-machine interaction.
The concept of multisensory integration should be extended to cover linguistic input and the complementary information combined from temporally coincident sensory impressions.
Until now, mean-field-type game theory has not focused on cognitively plausible models of choice in the strategic interactions of humans, animals, machines, robots, and software-defined and mobile devices.
This work presents some effects of users' psychology in mean-field-type games.
In addition to the traditional "material" payoff modelling, psychological patterns are introduced in order to better capture and understand behaviors that are observed in engineering practice or in experimental settings.
The psychological payoff value depends upon choices, mean-field states, mean-field actions, empathy and beliefs.
It is shown that affective empathy enforces mean-field equilibrium payoff equity and improves fairness between the players.
This work establishes equilibrium systems for such interactive decision-making problems.
Basic empathy concepts are illustrated in several important problems in engineering including resource sharing, packet collision minimization, energy markets, and forwarding in Device-to-Device communications.
The work conducts also an experiment with 47 people who have to decide whether to cooperate or not.
The basic Interpersonal Reactivity Index empathy metrics were used to measure the empathy distribution of each participant.
An Android app called Empathizer was developed to systematically analyze the data obtained from the participants.
The experimental results reveal that the dominated strategies of classical game theory are no longer dominated when users' psychology is involved, and a significant level of cooperation is observed among users who are positively partially empathetic.
Learning by integrating multiple heterogeneous data sources is a common requirement in many tasks.
Collective Matrix Factorization (CMF) is a technique to learn shared latent representations from arbitrary collections of matrices.
It can be used to simultaneously complete one or more matrices, for predicting the unknown entries.
Classical CMF methods assume linearity in the interactions of latent factors, which can be restrictive and fails to capture complex non-linear interactions.
In this paper, we develop the first deep-learning based method, called dCMF, for unsupervised learning of multiple shared representations, that can model such non-linear interactions, from an arbitrary collection of matrices.
We address optimization challenges that arise due to dependencies between shared representations through Multi-Task Bayesian Optimization and design an acquisition function adapted for collective learning of hyperparameters.
Our experiments show that dCMF significantly outperforms previous CMF algorithms in integrating heterogeneous data for predictive modeling.
Further, on two tasks - recommendation and prediction of gene-disease association - dCMF outperforms state-of-the-art matrix completion algorithms that can utilize auxiliary sources of information.
Learning about the social structure of hidden and hard-to-reach populations, such as drug users and sex workers, is a major goal of epidemiological and public health research on risk behaviors and disease prevention.
Respondent-driven sampling (RDS) is a peer-referral process widely used by many health organizations, where research subjects recruit other subjects from their social network.
In such surveys, researchers observe who recruited whom, along with the time of recruitment and the total number of acquaintances (network degree) of respondents.
However, due to privacy concerns, the identities of acquaintances are not disclosed.
In this work, we show how to reconstruct the underlying network structure through which the subjects are recruited.
We formulate the dynamics of RDS as a continuous-time diffusion process over the underlying graph and derive the likelihood for the recruitment time series under an arbitrary recruitment time distribution.
We develop an efficient stochastic optimization algorithm called RENDER (REspoNdent-Driven nEtwork Reconstruction) that finds the network that best explains the collected data.
We support our analytical results through an exhaustive set of experiments on both synthetic and real data.
Estimating the influence of a given feature to a model prediction is challenging.
We introduce ROAR, RemOve And Retrain, a benchmark to evaluate the accuracy of interpretability methods that estimate input feature importance in deep neural networks.
We remove a fraction of input features deemed to be most important according to each estimator and measure the change to the model accuracy upon retraining.
The most accurate estimator will identify inputs as important whose removal causes the most damage to model performance relative to all other estimators.
This evaluation produces thought-provoking results: we find that several estimators are less accurate than a random assignment of feature importance.
However, averaging a set of squared noisy estimators (a variant of a technique proposed by Smilkov et al. (2017)) leads to significant gains in accuracy for each method considered and far outperforms such a random guess.
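The ROAR procedure itself is simple to sketch: replace the top-ranked features with an uninformative value, retrain from scratch, and compare accuracies. The toy nearest-centroid model and synthetic data below are stand-ins for the paper's large-scale deep-network setup.

```python
import numpy as np

def roar_score(X, y, importance, frac, train_fn, eval_fn):
    # Remove (replace with the column mean) the top-`frac` features ranked
    # by `importance`, retrain from scratch, and return the new accuracy.
    k = max(1, int(frac * X.shape[1]))
    top = np.argsort(importance)[::-1][:k]
    Xr = X.copy()
    Xr[:, top] = X[:, top].mean(0)
    return eval_fn(train_fn(Xr, y), Xr, y)

# toy setup: only feature 0 carries the label
rng = np.random.default_rng(4)
y = rng.integers(0, 2, 400)
X = rng.normal(size=(400, 10))
X[:, 0] += 3 * y

def train_fn(X, y):                         # nearest-centroid "model"
    return X[y == 0].mean(0), X[y == 1].mean(0)

def eval_fn(model, X, y):
    c0, c1 = model
    pred = np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)
    return (pred == y).mean()

faithful = np.array([1.0] + [0.0] * 9)      # ranks the informative feature first
unfaithful = np.array([0.0] * 9 + [1.0])    # ranks a noise feature first
acc_faithful = roar_score(X, y, faithful, 0.1, train_fn, eval_fn)
acc_unfaithful = roar_score(X, y, unfaithful, 0.1, train_fn, eval_fn)
```

The faithful estimator destroys the informative feature, so retrained accuracy collapses; the unfaithful one leaves it intact, which is exactly the ordering ROAR uses to rank estimators.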
Efficient symbol detection algorithms carry critical importance for achieving the spatial multiplexing gains promised by multi-input multi-output (MIMO) systems.
In this paper, we consider a maximum a posteriori probability (MAP) based symbol detection algorithm, called M-BLAST, over uncoded quasi-static MIMO channels.
Relying on the successive interference cancellation (SIC) receiver, the M-BLAST algorithm offers a superior error performance over its predecessor V-BLAST, with a signal-to-noise ratio (SNR) gain of as much as 2 dB under various settings of recent interest.
Performance analysis of the M-BLAST algorithm is very complicated since the proposed detection order depends on the decision errors dynamically, which makes an already complex analysis of the conventional ordered SIC receivers even more difficult.
To this end, a rigorous analytical framework is proposed to analyze the outage behavior of the M-BLAST algorithm over binary complex alphabets and two transmitting antennas, which has a potential to be generalized to multiple transmitting antennas and multidimensional constellation sets.
The numerical results show a very good match between the analytical and simulation data under various SNR values and modulation alphabets.
We study how a predictor can be adapted to a non-stationary environment with advice from multiple experts.
We study the problem under complete feedback when the best expert changes over time from a decision theoretic point of view.
The proposed algorithm is based on the popular exponential weighting method with exponential discounting.
We provide theoretical results bounding regret under the exponential discounting setting.
An upper bound on regret is derived for the finite-time-horizon problem.
Numerical experiments on different real-life datasets are provided to show the utility of the proposed algorithm.
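A minimal sketch of the discounted exponential weighting idea: past losses are downweighted geometrically so the learner can track the best expert as it changes. The square loss and the learning-rate/discount values are illustrative assumptions, not the paper's tuned parameters.

```python
import numpy as np

def discounted_exp_weights(expert_preds, outcomes, eta=2.0, gamma=0.99):
    # expert_preds: (T, n_experts) predictions; outcomes: (T,) targets.
    cum_loss = np.zeros(expert_preds.shape[1])
    preds = []
    for t, y in enumerate(outcomes):
        shifted = cum_loss - cum_loss.min()        # numerical stability
        w = np.exp(-eta * shifted)
        w /= w.sum()
        preds.append(w @ expert_preds[t])          # weighted-average forecast
        # Exponentially discount old losses before adding the new one.
        cum_loss = gamma * cum_loss + (expert_preds[t] - y) ** 2
    return np.array(preds)

experts = np.stack([np.zeros(400), np.ones(400)], axis=1)  # two constant experts
y = np.concatenate([np.zeros(200), np.ones(200)])          # best expert switches at t=200
p = discounted_exp_weights(experts, y)
```

Without discounting (gamma = 1), the forecaster would need many more rounds to recover after the switch, which is the non-stationarity the abstract targets.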
An accurate knowledge of the per-unit length impedance of power cables is necessary to correctly predict electromagnetic transients in power systems.
In particular, skin, proximity, and ground return effects must be properly estimated.
In many applications, the medium that surrounds the cable is not uniform and can consist of multiple layers of different conductivity, such as dry and wet soil, water, or air.
We introduce a multilayer ground model for the recently-proposed MoM-SO method, suitable to accurately predict ground return effects in such scenarios.
The proposed technique precisely accounts for skin, proximity, ground and tunnel effects, and is applicable to a variety of cable configurations, including underground and submarine cables.
Numerical results show that the proposed method is more accurate than analytic formulas typically employed for transient analyses, and delivers an accuracy comparable to the finite element method (FEM).
With respect to FEM, however, MoM-SO is over 1000 times faster, and can calculate the impedance of a submarine cable inside a three-layer medium in 0.10~s per frequency point.
We investigate the scenario in which a robot needs to reach a designated goal after taking a sequence of appropriate actions in a non-static environment that is partially structured.
One application example is to control a marine vehicle to move in the ocean.
The ocean environment is dynamic, and oftentimes the ocean waves produce strong disturbances that can perturb the vehicle's motion.
Modeling such a dynamic environment is non-trivial, and integrating such a model into robotic motion control is particularly difficult.
Fortunately, the ocean currents usually form some local patterns (e.g. vortex) and thus the environment is partially structured.
The historically observed data can be used to train the robot to learn to interact with the ocean tidal disturbances.
In this paper we propose a method that applies the deep reinforcement learning framework to learn such partially structured complex disturbances.
Our results show that, by training the robot under artificial and real ocean disturbances, the robot is able to successfully act in complex and spatiotemporal environments.
In this paper, we propose an efficient coding scheme for the two-link binary Chief Executive Officer (CEO) problem under logarithmic loss criterion.
The exact rate-distortion bound for a two-link binary CEO problem under the logarithmic loss has been obtained by Courtade and Weissman.
We propose an encoding scheme based on compound LDGM-LDPC codes to achieve the theoretical bounds.
In the proposed encoding, a binary quantizer using LDGM codes and a syndrome-coding employing LDPC codes are applied.
An iterative joint decoding is also designed as a fusion center.
The proposed CEO decoder is based on the sum-product algorithm and a soft estimator.
Recent breakthroughs in Neural Architectural Search (NAS) have achieved state-of-the-art performance in many tasks such as image classification and language understanding.
However, most existing works only optimize for model accuracy and largely ignore other important factors imposed by the underlying hardware and devices, such as latency and energy, when making inference.
In this paper, we first introduce the problem of NAS and provide a survey on recent works.
Then we deep dive into two recent advancements on extending NAS into multiple-objective frameworks: MONAS and DPP-Net.
Both MONAS and DPP-Net are capable of optimizing accuracy and other objectives imposed by devices, searching for neural architectures that can be best deployed on a wide spectrum of devices: from embedded systems and mobile devices to workstations.
Experimental results show that the architectures found by MONAS and DPP-Net achieve Pareto optimality w.r.t. the given objectives for various devices.
Generalized Fibonacci-like sequences appear in finite difference approximations of partial differential equations, which are obtained by replacing the differential equations with finite difference equations.
This paper studies properties of the generalized Fibonacci-like sequence F_(n+2)=A+BF_(n+1)+CF_n.
It is shown that this sequence is periodic with period T>2 if C=-1 and |B|<2.
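The periodicity claim can be checked numerically for integer parameter choices, where exact arithmetic applies; the concrete values of A, B and the seeds used below are arbitrary examples, not taken from the paper. The period is detected by tracking the state pair (F_n, F_{n+1}):

```python
def find_period(A, B, C, f0, f1, max_n=1000):
    """Return the smallest period T of F_{n+2} = A + B*F_{n+1} + C*F_n,
    detected by the first recurrence of the state pair (F_n, F_{n+1}),
    or None if no period is found within max_n steps."""
    seen = {}
    state = (f0, f1)
    for n in range(max_n):
        if state in seen:
            return n - seen[state]
        seen[state] = n
        # advance the recurrence: (F_n, F_{n+1}) -> (F_{n+1}, F_{n+2})
        state = (state[1], A + B * state[1] + C * state[0])
    return None
```

For instance, with C=-1 the integer choices B=1, B=0 and B=-1 (all satisfying |B|<2) yield periods 6, 4 and 3, respectively, each greater than 2 as the paper asserts.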
This chapter reviews the purpose and use of models from the field of complex systems and, in particular, the implications of trying to use models to understand or make decisions within complex situations, such as policy makers usually face.
A discussion of the different dimensions along which one can formalise situations, the different purposes for models, and the different kinds of relationship they can have with the policy making process is followed by an examination of the compromises forced by the complexity of the target issues.
Several modelling approaches from complexity science are briefly described, with notes as to their abilities and limitations.
These approaches include system dynamics, network theory, information theory, cellular automata, and agent-based modelling.
Some examples of policy models are presented and discussed in the context of the previous analysis.
Finally we conclude by outlining some of the major pitfalls facing those wishing to use such models for policy evaluation.
Object recognition in video sequences or images is one of the sub-fields of computer vision.
Moving object recognition from a video sequence is an appealing topic with applications in various areas such as airport safety, intrusion surveillance, video monitoring, intelligent highway, etc.
Moving object recognition is the most challenging task in intelligent video surveillance system.
In this regard, many techniques have been proposed based on different methods.
Despite its importance, moving object recognition in complex environments is still far from completely solved for low-resolution, foggy, and dim video sequences.
All in all, these make it necessary to develop exceedingly robust techniques.
This paper introduces multiple moving object recognition in video sequences based on a LoG Gabor-PCA approach and angle-based distance similarity measures, used to recognize objects such as humans and vehicles.
A number of experiments are conducted on indoor and outdoor video sequences from standard datasets, as well as our own collection of video sequences comprising partial night-vision footage.
Experimental results show that our proposed approach achieves an excellent recognition rate.
The results obtained are satisfactory and competitive.
While deep learning models and techniques have achieved great empirical success, our understanding of the source of success in many aspects remains very limited.
In an attempt to bridge the gap, we investigate the decision boundary of a production deep learning architecture with weak assumptions on both the training data and the model.
We demonstrate, both theoretically and empirically, that the last weight layer of a neural network converges to a linear SVM trained on the output of the last hidden layer, for both the binary case and the multi-class case with the commonly used cross-entropy loss.
Furthermore, we show empirically that training a neural network as a whole, instead of only fine-tuning the last weight layer, may result in a better bias constant for the last weight layer, which is important for generalization.
In addition to facilitating the understanding of deep learning, our result can be helpful for solving a broad range of practical problems of deep learning, such as catastrophic forgetting and adversarial attacking.
The experiment codes are available at https://github.com/lykaust15/NN_decision_boundary
Question answering (QA) systems are sensitive to the many different ways natural language expresses the same information need.
In this paper we turn to paraphrases as a means of capturing this knowledge and present a general framework which learns felicitous paraphrases for various QA tasks.
Our method is trained end-to-end using question-answer pairs as a supervision signal.
A question and its paraphrases serve as input to a neural scoring model which assigns higher weights to linguistic expressions most likely to yield correct answers.
We evaluate our approach on QA over Freebase and answer sentence selection.
Experimental results on three datasets show that our framework consistently improves performance, achieving competitive results despite the use of simple QA models.
The Arbitrary Pattern Formation problem asks to design a distributed algorithm that allows a set of autonomous mobile robots to form any specific but arbitrary geometric pattern given as input.
The problem has been extensively studied in the literature in continuous domains.
This paper investigates a discrete version of the problem where the robots are operating on a two dimensional infinite grid.
The robots are assumed to be autonomous, identical, anonymous and oblivious.
They operate in Look-Compute-Move cycles under a fully asynchronous scheduler.
The robots do not agree on any common global coordinate system or chirality.
We have shown that a set of robots can form any arbitrary pattern, if their starting configuration is asymmetric.
The OpenAI Gym provides researchers and enthusiasts with simple to use environments for reinforcement learning.
Even the simplest environments have a level of complexity that can obfuscate the inner workings of RL approaches and make debugging difficult.
This whitepaper describes a Python framework that makes it very easy to create simple Markov-Decision-Process environments programmatically by specifying state transitions and rewards of deterministic and non-deterministic MDPs in a domain-specific language in Python.
It then presents results and visualizations created with this MDP framework.
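A hypothetical, stripped-down version of such a specification (the actual framework's API is not reproduced here) could describe an MDP as a mapping from state-action pairs to weighted outcomes:

```python
import random

# Hypothetical minimal DSL: an MDP as a dict mapping (state, action)
# to a list of (probability, next_state, reward) triples.
MDP = {
    ("s0", "a"): [(1.0, "s1", 0.0)],
    ("s0", "b"): [(0.5, "s0", 0.0), (0.5, "s2", 1.0)],
    ("s1", "a"): [(1.0, "s2", 5.0)],
}

def step(mdp, state, action, rng=random):
    """Sample one transition from the MDP specification."""
    outcomes = mdp[(state, action)]
    r = rng.random()
    acc = 0.0
    for p, nxt, reward in outcomes:
        acc += p
        if r <= acc:
            return nxt, reward
    # guard against floating-point round-off in the probabilities
    return outcomes[-1][1], outcomes[-1][2]
```

Deterministic transitions are simply single-outcome lists, so the same interface covers both deterministic and non-deterministic MDPs.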
Training deep recurrent neural network (RNN) architectures is complicated due to the increased network complexity.
This disrupts the learning of higher order abstracts using deep RNN.
In the case of feed-forward networks, training deep structures is simpler and faster, but learning long-term temporal information is not possible.
In this paper we propose a residual memory neural network (RMN) architecture to model short-time dependencies using deep feed-forward layers having residual and time delayed connections.
The residual connection paves the way to construct deeper networks by enabling unhindered gradient flow, while the time-delay units capture temporal information with shared weights.
The number of layers in RMN signifies both the hierarchical processing depth and temporal depth.
The computational complexity in training RMN is significantly less when compared to deep recurrent networks.
RMN is further extended as bi-directional RMN (BRMN) to capture both past and future information.
Experimental analysis is done on AMI corpus to substantiate the capability of RMN in learning long-term information and hierarchical information.
Recognition performance of RMN trained with 300 hours of Switchboard corpus is compared with various state-of-the-art LVCSR systems.
The results indicate that RMN and BRMN gain 6% and 3.8% relative improvement over LSTM and BLSTM networks, respectively.
Hierarchical Classification (HC) is a supervised learning problem where unlabeled instances are classified into a taxonomy of classes.
Several methods that utilize the hierarchical structure have been developed to improve the HC performance.
However, in most cases the hierarchical structure defined a priori by domain experts is inconsistent; as a consequence, the performance improvement is not noticeable in comparison to flat classification methods.
We propose a scalable data-driven filter based rewiring approach to modify an expert-defined hierarchy.
Experimental comparisons of top-down HC with our modified hierarchy, on a wide range of datasets, show classification performance improvement over the baseline hierarchy (i.e., defined by experts), clustered hierarchies, and flattening-based hierarchy modification approaches.
In comparison to existing rewiring approaches, our developed method (rewHier) is computationally efficient, enabling it to scale to datasets with large numbers of classes, instances and features.
We also show that our modified hierarchy leads to improved classification performance for classes with few training samples in comparison to flat and state-of-the-art HC approaches.
Logit-response dynamics (Alos-Ferrer and Netzer, Games and Economic Behavior 2010) are a rich and natural class of noisy best-response dynamics.
In this work we revisit the price of anarchy and the price of stability by considering the quality of long-run equilibria in these dynamics.
Our results show that prior studies on simpler dynamics of this type can strongly depend on a synchronous schedule of the players' moves.
In particular, small noise by itself is not enough to improve the quality of equilibria as soon as other very natural schedules are used.
The expansion of electronic commerce, together with an increasing confidence of customers in electronic payments, makes fraud detection a critical factor.
Detecting frauds in (nearly) real time setting demands the design and the implementation of scalable learning techniques able to ingest and analyse massive amounts of streaming data.
Recent advances in analytics and the availability of open source solutions for Big Data storage and processing open new perspectives to the fraud detection field.
In this paper we present a SCAlable Real-time Fraud Finder (SCARFF) which integrates Big Data tools (Kafka, Spark and Cassandra) with a machine learning approach which deals with imbalance, nonstationarity and feedback latency.
Experimental results on a massive dataset of real credit card transactions show that this framework is scalable, efficient and accurate over a big stream of transactions.
Inverse imaging problems are inherently under-determined, and hence it is important to employ appropriate image priors for regularization.
One recent popular prior---the graph Laplacian regularizer---assumes that the target pixel patch is smooth with respect to an appropriately chosen graph.
However, the mechanisms and implications of imposing the graph Laplacian regularizer on the original inverse problem are not well understood.
To address this problem, in this paper we interpret neighborhood graphs of pixel patches as discrete counterparts of Riemannian manifolds and perform analysis in the continuous domain, providing insights into several fundamental aspects of graph Laplacian regularization for image denoising.
Specifically, we first show the convergence of the graph Laplacian regularizer to a continuous-domain functional, integrating a norm measured in a locally adaptive metric space.
Focusing on image denoising, we derive an optimal metric space assuming non-local self-similarity of pixel patches, leading to an optimal graph Laplacian regularizer for denoising in the discrete domain.
We then interpret graph Laplacian regularization as an anisotropic diffusion scheme to explain its behavior during iterations, e.g., its tendency to promote piecewise smooth signals under certain settings.
To verify our analysis, an iterative image denoising algorithm is developed.
Experimental results show that our algorithm performs competitively with state-of-the-art denoising methods such as BM3D for natural images, and outperforms them significantly for piecewise smooth images.
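As a small, self-contained illustration of the discrete prior itself (not of the paper's continuous-domain analysis), the graph Laplacian regularizer evaluates the quadratic form x^T L x, which penalizes signal differences across graph edges:

```python
def laplacian_quadratic(weights, x):
    """Compute x^T L x = sum over edges (i, j) of w_ij * (x_i - x_j)^2.

    weights: dict mapping an edge (i, j) to its weight w_ij.
    x: list of signal values (e.g., pixel intensities of a patch).
    Signals that are smooth w.r.t. the graph yield small values.
    """
    return sum(w * (x[i] - x[j]) ** 2 for (i, j), w in weights.items())
```

A constant patch scores zero, while a patch with large differences across strongly weighted edges is penalized heavily; graph-based denoising minimizes a data term plus this regularizer.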
Progressive efforts have been continuously evolving for the betterment of the services of Information Technology for Educational Management (ITEM).
These services require data intensive and communication intensive applications.
Due to the massive growth of information, it becomes difficult to manage these services.
Here the role of the Information and Communication Technology (ICT) infrastructure particularly data centre with communication components becomes important to facilitate these services.
The present paper discusses related issues, such as competent staff, appropriate ICT infrastructure, and ICT acceptance level, required for an ITEM competence-building framework, considering the earlier approach to core competences for ITEM.
In this connection, it is also necessary to consider the procurement of standard and appropriate ICT facilities.
This will help in the integration of these facilities for the future expansion.
This will also enable creating and foreseeing the impact of pairing management with the information, technology, and education components, individually and in combination.
These efforts will establish a strong coupling between the ITEM activities and resource management for effective implementation of the framework.
The problem of computing the Betweenness Centrality (BC) is important in analyzing graphs in many practical applications like social networks, biological networks, transportation networks, electrical circuits, etc.
Since this problem is computation intensive, researchers have been developing algorithms using high performance computing resources like supercomputers, clusters, and Graphics Processing Units (GPUs).
Current GPU algorithms for computing BC employ Brandes' sequential algorithm with different trade-offs for thread scheduling, data structures, and atomic operations.
In this paper, we study three GPU algorithms for computing BC of unweighted, directed, scale-free networks.
We discuss and measure the trade-offs of their design choices about balanced thread scheduling, atomic operations, synchronizations and latency hiding.
Our program is written in NVIDIA CUDA C and was tested on an NVIDIA Tesla M2050 GPU.
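For reference, Brandes' algorithm that these GPU implementations parallelize can be sketched for unweighted directed graphs as follows (a plain CPU version for clarity, not the CUDA code):

```python
from collections import deque

def brandes_bc(adj):
    """Brandes' betweenness centrality for an unweighted directed graph.

    adj: dict mapping each node to a list of its successor nodes.
    Returns a dict mapping each node to its BC score.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma)
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        preds = {v: [] for v in adj}
        order = []
        q = deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies in reverse BFS order
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

The GPU variants discussed in the paper differ mainly in how the BFS and accumulation phases are mapped onto threads, and in their use of atomic operations for the sigma and delta updates.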
We present a prototype of an integrated reasoning environment for educational purposes.
The presented tool is a fragment of a proof assistant and automated theorem prover.
We describe the existing and planned functionality of the theorem prover and especially the functionality of the educational fragment.
This currently supports working with terms of the untyped lambda calculus and addresses both undergraduate students and researchers.
We show how the tool can be used to support the students' understanding of functional programming and discuss general problems related to the process of building theorem proving software that aims at supporting both research and education.
Machine vision for plant phenotyping is an emerging research area for producing high throughput in agriculture and crop science applications.
Since 2D based approaches have their inherent limitations, 3D plant analysis is becoming state of the art for current phenotyping technologies.
We present an automated system for analyzing plant growth in indoor conditions.
A gantry robot system is used to perform scanning tasks in an automated manner throughout the lifetime of the plant.
A 3D laser scanner mounted as the robot's payload captures the surface point cloud data of the plant from multiple views.
The plant is monitored from the vegetative to reproductive stages in light/dark cycles inside a controllable growth chamber.
An efficient 3D reconstruction algorithm is used, by which multiple scans are aligned together to obtain a 3D mesh of the plant, followed by surface area and volume computations.
The whole system, including the programmable growth chamber, robot, scanner, data transfer and analysis is fully automated in such a way that a naive user can, in theory, start the system with a mouse click and get back the growth analysis results at the end of the lifetime of the plant with no intermediate intervention.
As evidence of its functionality, we show and analyze quantitative results of the rhythmic growth patterns of the dicot Arabidopsis thaliana (L.) and the monocot barley (Hordeum vulgare L.) plants under their diurnal light/dark cycles.
In this work we propose a novel approach to remove undesired objects from RGB-D sequences captured with freely moving cameras, which enables static 3D reconstruction.
Our method jointly uses existing information from multiple frames and generates new information via inpainting techniques.
We use balanced rules to select source frames, a local-homography-based image warping method for alignment, and a Markov random field (MRF) based approach for combining existing information.
For the remaining holes, we employ an exemplar-based multi-view inpainting method to deal with the color image and coherently use it as guidance to complete the corresponding depth.
Experiments show that our approach is qualified for removing the undesired objects and inpainting the holes.
In this paper, we study how to fold a specified origami crease pattern in order to minimize the impact of paper thickness.
Specifically, origami designs are often expressed by a mountain-valley pattern (plane graph of creases with relative fold orientations), but in general this specification is consistent with exponentially many possible folded states.
We analyze the complexity of finding the best consistent folded state according to two metrics: minimizing the total number of layers in the folded state (so that a "flat folding" is indeed close to flat), and minimizing the total amount of paper required to execute the folding (where "thicker" creases consume more paper).
We prove both problems strongly NP-complete even for 1D folding.
On the other hand, we prove the first problem fixed-parameter tractable in 1D with respect to the number of layers.
We present a new model for singing synthesis based on a modified version of the WaveNet architecture.
Instead of modeling raw waveform, we model features produced by a parametric vocoder that separates the influence of pitch and timbre.
This allows conveniently modifying pitch to match any target melody, facilitates training on more modest dataset sizes, and significantly reduces training and generation times.
Our model makes frame-wise predictions using mixture density outputs rather than categorical outputs in order to reduce the required parameter count.
As we found overfitting to be an issue with the relatively small datasets used in our experiments, we propose a method to regularize the model and make the autoregressive generation process more robust to prediction errors.
Using a simple multi-stream architecture, harmonic, aperiodic and voiced/unvoiced components can all be predicted in a coherent manner.
We compare our method to existing parametric statistical and state-of-the-art concatenative methods using quantitative metrics and a listening test.
While naive implementations of the autoregressive generation algorithm tend to be inefficient, a smarter algorithm greatly speeds up the process and yields a system that is competitive in both speed and quality.
Convolutional neural network (CNN) based methods have recently achieved great success for image super-resolution (SR).
However, most deep CNN based SR models attempt to improve distortion measures (e.g., PSNR, SSIM, IFC, VIF) while resulting in poor quantified perceptual quality (e.g., human opinion score, no-reference quality measures such as NIQE).
Few works have attempted to improve the perceptual quality at the cost of performance reduction in distortion measures.
A very recent study has revealed that distortion and perceptual quality are at odds with each other and there is always a trade-off between the two.
Often the restoration algorithms that are superior in terms of perceptual quality, are inferior in terms of distortion measures.
Our work attempts to analyze the trade-off between distortion and perceptual quality for the problem of single image SR. To this end, we use the well-known enhanced deep super-resolution (EDSR) network and show that it can be adapted to achieve better perceptual quality for a specific range of the distortion measure.
While the original network of EDSR was trained to minimize the error defined based on per-pixel accuracy alone, we train our network using a generative adversarial network framework with EDSR as the generator module.
Our proposed network, called enhanced perceptual super-resolution network (EPSR), is trained with a combination of mean squared error loss, perceptual loss, and adversarial loss.
Our experiments reveal that EPSR achieves the state-of-the-art trade-off between distortion and perceptual quality while the existing methods perform well in either of these measures alone.
The problem of knowing who knows what is multi-faceted.
Knowledge and expertise lie on a spectrum and one's expertise in one topic area may have little bearing on one's knowledge in a disparate topic area.
In addition, we continue to learn new things over time.
Each of us sees but a sliver of our acquaintances' and co-workers' areas of expertise.
By making explicit and visible many individual perceptions of cognitive authority, this work shows that a group can know what its members know about in a relatively efficient and inexpensive manner.
Structure from motion is an important theme in computer vision.
Although great progress has been made both in theory and applications, most of the algorithms only work for static scenes and rigid objects.
In recent years, structure and motion recovery of non-rigid objects and dynamic scenes have received a lot of attention.
In this paper, the state-of-the-art techniques for structure and motion factorization of non-rigid objects are reviewed and discussed.
First, an introduction of the structure from motion problem is presented, followed by a general formulation of non-rigid structure from motion.
Second, an augmented affine factorization framework, using a homogeneous representation, is presented to solve the registration issue in the presence of outlying and missing data.
Third, based on the observation that the reprojection residuals of outliers are significantly larger than those of inliers, a robust factorization strategy with outlier rejection is proposed by means of the reprojection residuals, followed by some comparative experimental evaluations.
Finally, some future research topics in non-rigid structure from motion are discussed.
Increasing distributed energy resources (DERs) may result in reactive power imbalance in a transmission power system (TPS).
An active distribution power system (DPS) having DERs reportedly can work as a reactive power prosumer to help balance the reactive power in the TPS.
The reactive power potential (RPP) of a DPS, which is the range between the maximal inductive and capacitive reactive power the DPS can reliably provide, should be accurately estimated.
However, an accurate estimation is difficult because of the network constraints, mixed discrete and continuous variables, and the nonnegligible uncertainty in the DPS.
To solve this problem, this paper proposes a robust RPP estimation method based on two-stage robust optimization, where the uncertainty in DERs and the boundary-bus voltage is considered.
In this two-stage robust model, the RPP is pre-estimated in the first stage and its robust feasibility for any possible instance of the uncertainty is checked via a tractable problem in the second stage.
The column-and-constraint generation algorithm is adopted, which solves this model in finite iterations.
Case studies show that this robust method excels in yielding a completely reliable RPP, and also that a DPS, even under the uncertainty, is still an effective reactive power prosumer for the TPS.
Handwritten mathematical expression recognition is a challenging problem due to the complicated two-dimensional structures, ambiguous handwriting input and variant scales of handwritten math symbols.
To address this problem, we utilize an attention-based encoder-decoder model that translates mathematical expression images from two-dimensional layouts into one-dimensional LaTeX strings.
We improve the encoder by employing densely connected convolutional networks as they can strengthen feature extraction and facilitate gradient propagation especially on a small training set.
We also present a novel multi-scale attention model which is employed to deal with the recognition of math symbols in different scales and save the fine-grained details that will be dropped by pooling operations.
Validated on the CROHME competition task, the proposed method significantly outperforms the state-of-the-art methods with an expression recognition accuracy of 52.8% on CROHME 2014 and 50.1% on CROHME 2016, by only using the official training dataset.
The ConditionaL Neural Network (CLNN) exploits the nature of the temporal sequencing of the sound signal represented in a spectrogram, and its variant the Masked ConditionaL Neural Network (MCLNN) induces the network to learn in frequency bands by embedding a filterbank-like sparseness over the network's links using a binary mask.
Additionally, the masking automates the exploration of different feature combinations concurrently analogous to handcrafting the optimum combination of features for a recognition task.
We have evaluated the MCLNN performance using the Urbansound8k dataset of environmental sounds.
Additionally, we present a collection of manually recorded sounds for rail and road traffic, YorNoise, to investigate the confusion rates among machine generated sounds possessing low-frequency components.
MCLNN has achieved competitive results without augmentation and using 12% of the trainable parameters utilized by an equivalent model based on state-of-the-art Convolutional Neural Networks on the Urbansound8k.
We extended the Urbansound8k dataset with YorNoise, where experiments have shown that common tonal properties affect the classification performance.
Enterprise databases usually contain large and complex schemas.
Authoring complete schema mapping queries in this case requires deep knowledge about the source and target schemas and is thereby very challenging to programmers.
Sample-driven schema mapping allows the user to describe the schema mapping using data records.
However, real data records are still harder to specify than other useful insights the user might have about the desired schema mapping.
In this project, we develop a schema mapping system, PRISM, that enables multiresolution schema mapping.
The end user is not limited to providing high-resolution constraints like exact data records but may also provide constraints of various resolutions, like incomplete data records, value ranges, and data types.
This new interaction paradigm gives the user more flexibility in describing the desired schema mapping.
This demonstration showcases how to use PRISM for schema mapping in a real database.
Testing in Continuous Integration (CI) involves test case prioritization, selection, and execution at each cycle.
Selecting the most promising test cases to detect bugs is hard if there are uncertainties about the impact of committed code changes or if traceability links between code and tests are not available.
This paper introduces Retecs, a new method for automatically learning test case selection and prioritization in CI with the goal to minimize the round-trip time between code commits and developer feedback on failed test cases.
The Retecs method uses reinforcement learning to select and prioritize test cases according to their duration, previous last execution and failure history.
In a constantly changing environment, where new test cases are created and obsolete test cases are deleted, the Retecs method learns to prioritize error-prone test cases higher under guidance of a reward function and by observing previous CI cycles.
By applying Retecs on data extracted from three industrial case studies, we show for the first time that reinforcement learning enables fruitful automatic adaptive test case selection and prioritization in CI and regression testing.
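As a toy hand-crafted stand-in for the ranking that Retecs learns (the feature names and weights below are illustrative assumptions, not the paper's learned policy), a priority score over duration, execution recency, and failure history might look like:

```python
def prioritize(test_cases):
    """Order test cases by a heuristic priority score.

    Each test case is a dict with 'duration' (seconds), 'last_exec'
    (cycles since last execution), and 'failures' (list of 0/1 outcomes
    for recent runs). Retecs instead learns such a ranking from a
    reward signal over past CI cycles.
    """
    def score(tc):
        fail_rate = sum(tc["failures"]) / max(len(tc["failures"]), 1)
        # favor failure-prone tests, long-unexecuted tests, short tests
        return 2.0 * fail_rate + 0.1 * tc["last_exec"] - 0.01 * tc["duration"]
    return sorted(test_cases, key=score, reverse=True)
```

The point of the learned approach is precisely that such weights need not be fixed by hand: they adapt as test cases are added, deleted, and change their failure behavior.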
We focus on the problem of language modeling for code-switched language, in the context of automatic speech recognition (ASR).
Language modeling for code-switched language is challenging for (at least) three reasons: (1) lack of available large-scale code-switched data for training; (2) lack of a replicable evaluation setup that is ASR directed yet isolates language modeling performance from the other intricacies of the ASR system; and (3) the reliance on generative modeling.
We tackle these three issues: we propose an ASR-motivated evaluation setup which is decoupled from an ASR system and the choice of vocabulary, and provide an evaluation dataset for English-Spanish code-switching.
This setup lends itself to a discriminative training approach, which we demonstrate to work better than generative language modeling.
Finally, we present an effective training protocol that integrates small amounts of code-switched data with large amounts of monolingual data, for both the generative and discriminative cases.
We present PEC, an Event Calculus (EC) style action language for reasoning about probabilistic causal and narrative information.
It has an action-language-style syntax similar to that of the EC variant Modular-E. Its semantics is given in terms of possible worlds that constitute possible evolutions of the domain, and builds on that of EFEC, an epistemic extension of EC.
We also describe an ASP implementation of PEC and show the sense in which this is sound and complete.
In this paper, we analyze the presence and possibilities of altmetrics for bibliometric and performance analysis.
Using the web-based tool Impact Story, we collected metrics for 20,000 random publications from the Web of Science.
We studied the presence and frequency of altmetrics in the set of publications, across fields, document types and also through the years.
The main result of the study is that less than 50% of the publications have some kind of altmetrics.
The source that provides most metrics is Mendeley, with metrics on readerships for around 37% of all the publications studied.
Other sources only provide marginal information.
Possibilities and limitations of these indicators are discussed and future research lines are outlined.
We also assessed the accuracy of the data retrieved through Impact Story by focusing on the analysis of the accuracy of data from Mendeley; in a follow up study, the accuracy and validity of other data sources not included here will be assessed.
Despite the progress achieved by deep learning in face recognition (FR), racial bias has been found to clearly degrade the performance of realistic FR systems.
Given that existing training and testing databases consist almost entirely of Caucasian subjects, there are still no independent testing databases to evaluate racial bias, and no training databases or methods to reduce it.
To facilitate research towards overcoming these fairness issues, this paper contributes a new dataset, the Racial Faces in-the-Wild (RFW) database, with two important uses: 1) racial bias testing: four testing subsets, namely Caucasian, Asian, Indian, and African, are constructed, each containing about 3000 individuals with 6000 image pairs for face verification; and 2) racial bias reduction: one labeled training subset with Caucasians and three unlabeled training subsets with Asians, Indians, and Africans are offered to encourage FR algorithms to transfer recognition knowledge from Caucasians to other races.
To the best of our knowledge, RFW is the first database for measuring racial bias in FR algorithms.
After demonstrating the existence of a domain gap among different races and of racial bias in FR algorithms, we further propose a deep information maximization adaptation network (IMAN) to bridge the domain gap, and comprehensive experiments show that racial bias can be narrowed by our algorithm.
The paper addresses the problem of vehicle rollover avoidance using reference governors applied to modify the driver steering input in vehicles with an active steering system.
Several reference governor designs are presented and tested with a detailed nonlinear simulation model.
The vehicle dynamics are highly nonlinear for large steering angles, including the conditions where the vehicle approaches a rollover onset, which necessitates reference governor design changes.
Simulation results show that reference governor designs are effective in avoiding rollover.
The results also demonstrate that the controllers are not overly conservative, adjusting the driver steering input only for very high steering angles.
Clustering is one of the most fundamental problems in data analysis and it has been studied extensively in the literature.
Though many clustering algorithms have been proposed, clustering theories that justify the use of these clustering algorithms are still unsatisfactory.
In particular, one of the fundamental challenges is to address the following question:   What is a cluster in a set of data points?
In this paper, we make an attempt to address such a question by considering a set of data points associated with a distance measure (metric).
We first propose a new cohesion measure in terms of the distance measure.
Using the cohesion measure, we define a cluster as a set of points that are cohesive to themselves.
For such a definition, we show there are various equivalent statements that have intuitive explanations.
We then consider the second question:   How do we find clusters and good partitions of clusters under such a definition?
For such a question, we propose a hierarchical agglomerative algorithm and a partitional algorithm.
Unlike standard hierarchical agglomerative algorithms, our hierarchical agglomerative algorithm has a specific stopping criterion and it stops with a partition of clusters.
Our partitional algorithm, called the K-sets algorithm in the paper, appears to be a new iterative algorithm.
Unlike the Lloyd iteration that needs two-step minimization, our K-sets algorithm only takes one-step minimization.
One of the most interesting findings of our paper is the duality result between a distance measure and a cohesion measure.
Such a duality result leads to a dual K-sets algorithm for clustering a set of data points with a cohesion measure.
The dual K-sets algorithm converges in the same way as a sequential version of the classical kernel K-means algorithm.
The key difference is that a cohesion measure does not need to be positive semi-definite.
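The one-step minimization of the partitional algorithm can be sketched as a single sweep in which every point moves to the set it is closest to overall. Note the sketch uses a plain average distance to each set as a simplified stand-in for the paper's actual point-to-set measure.

```python
# Simplified sketch of the one-step minimization in a K-sets-style
# partitional loop: each point moves to the set minimizing its average
# distance (a stand-in for the paper's actual point-to-set measure).

def avg_dist(x, S, d):
    return sum(d(x, y) for y in S) / len(S)

def k_sets_step(points, sets, d):
    # one sweep: reassign every point; iterating this converges to a partition
    new_sets = [[] for _ in sets]
    for x in points:
        k = min(range(len(sets)), key=lambda i: avg_dist(x, sets[i], d))
        new_sets[k].append(x)
    return new_sets

d = lambda a, b: abs(a - b)  # a metric on the real line
points = [0, 1, 2, 10, 11, 12]
new_sets = k_sets_step(points, [[0, 1, 2, 10], [11, 12]], d)  # 10 moves to the right set
```

Unlike the Lloyd iteration, no separate centroid-update step is needed: the assignment is computed directly against the current sets.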
Template-based code generation (TBCG) is a synthesis technique that produces code from high-level specifications, called templates.
TBCG is a popular technique in model-driven engineering (MDE) given that they both emphasize abstraction and automation.
Given the diversity of tools and approaches, it is necessary to classify existing TBCG techniques to better guide developers in their choices.
The goal of this article is to better understand the characteristics of TBCG techniques and associated tools, identify research trends, and assess the importance of the role of MDE in this code synthesis approach.
We conducted a systematic mapping study of the literature to paint an interesting picture about the trends and uses of TBCG.
Our study shows that the community has been diversely using TBCG over the past 15 years.
TBCG has greatly benefited from MDE.
It has favored an output-based template style and high-level modeling languages as input.
TBCG is mainly used to generate source code and has been applied in a variety of domains.
Furthermore, both MDE and non-MDE tools are becoming effective development resources in industry.
Nearly all previous work on geo-locating latent states and activities from social media confounds general discussions about activities, self-reports of users participating in those activities at times in the past or future, and self-reports made at the immediate time and place the activity occurs.
Activities, such as alcohol consumption, may occur at different places and types of places, and it is important not only to detect the local regions where these activities occur, but also to analyze the degree of participation in them by local residents.
In this paper, we develop new machine learning based methods for fine-grained localization of activities and home locations from Twitter data.
We apply these methods to discover and compare alcohol consumption patterns in a large urban area, New York City, and a more suburban and rural area, Monroe County.
We find positive correlations between the rate of alcohol consumption reported among a community's Twitter users and the density of alcohol outlets, demonstrating that the degree of correlation varies significantly between urban and suburban areas.
While our experiments are focused on alcohol use, our methods for locating homes and distinguishing temporally-specific self-reports are applicable to a broad range of behaviors and latent states.
In this paper, we propose a novel neural approach for paraphrase generation.
Conventional paraphrase generation methods either leverage hand-written rules and thesauri-based alignments, or use statistical machine learning principles.
To the best of our knowledge, this work is the first to explore deep learning models for paraphrase generation.
Our primary contribution is a stacked residual LSTM network, where we add residual connections between LSTM layers.
This allows for efficient training of deep LSTMs.
We evaluate our model and other state-of-the-art deep learning models on three different datasets: PPDB, WikiAnswers and MSCOCO.
Evaluation results demonstrate that our model outperforms sequence-to-sequence, attention-based, and bidirectional LSTM models on BLEU, METEOR, TER and an embedding-based sentence similarity metric.
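The residual connections between stacked layers follow the standard pattern out = f(x) + x, which eases gradient flow in deep stacks. A minimal conceptual sketch, with placeholder functions standing in for LSTM layers:

```python
# Conceptual sketch of residual connections between stacked layers:
# each layer's output is added element-wise to its input (out = f(x) + x).
# The "layers" here are placeholder functions standing in for LSTM layers.

def stack_with_residuals(layers, x):
    for f in layers:
        x = [fi + xi for fi, xi in zip(f(x), x)]  # residual add
    return x

double = lambda v: [2.0 * e for e in v]  # toy stand-in for one LSTM layer
out = stack_with_residuals([double, double], [1.0, 2.0])
```

In the actual model the residual additions sit between recurrent layers of a deep LSTM, but the skip-connection arithmetic is exactly this.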
This paper presents a new approach in understanding how deep neural networks (DNNs) work by applying homomorphic signal processing techniques.
Focusing on the task of multi-pitch estimation (MPE), this paper demonstrates the equivalence relation between a generalized cepstrum and a DNN in terms of their structures and functionality.
Such an equivalence relation, together with pitch perception theories and the recently established rectified-correlations-on-a-sphere (RECOS) filter analysis, provide an alternative way in explaining the role of the nonlinear activation function and the multi-layer structure, both of which exist in a cepstrum and a DNN.
To validate the efficacy of this new approach, a new feature designed in the same fashion is proposed for pitch salience function.
The new feature outperforms the one-layer spectrum in the MPE task and, as predicted, it addresses the issue of the missing fundamental effect and also achieves better robustness to noise.
We consider the task of evaluating a policy for a Markov decision process (MDP).
The standard unbiased technique for evaluating a policy is to deploy the policy and observe its performance.
We show that the data collected from deploying a different policy, commonly called the behavior policy, can be used to produce unbiased estimates with lower mean squared error than this standard technique.
We derive an analytic expression for the optimal behavior policy --- the behavior policy that minimizes the mean squared error of the resulting estimates.
Because this expression depends on terms that are unknown in practice, we propose a novel policy evaluation sub-problem, behavior policy search: searching for a behavior policy that reduces mean squared error.
We present a behavior policy search algorithm and empirically demonstrate its effectiveness in lowering the mean squared error of policy performance estimates.
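The mechanism that makes data from a different behavior policy usable is importance sampling: returns observed under the behavior policy are reweighted by the ratio of evaluation-policy to behavior-policy probabilities. A minimal single-step (bandit) sketch, with toy policies and returns:

```python
# Minimal importance-sampling sketch for off-policy evaluation: returns
# collected under a behavior policy pi_b are reweighted by pi_e/pi_b to
# yield an unbiased estimate of the evaluation policy pi_e's expected
# return. Single-step case for brevity; the numbers are toy assumptions.

def is_estimate(samples, pi_e, pi_b):
    # samples: list of (action, return) pairs collected under pi_b
    return sum((pi_e[a] / pi_b[a]) * r for a, r in samples) / len(samples)

pi_b = {"a": 0.5, "b": 0.5}   # behavior policy
pi_e = {"a": 0.9, "b": 0.1}   # evaluation policy
samples = [("a", 1.0), ("b", 0.0), ("a", 1.0), ("b", 0.0)]
est = is_estimate(samples, pi_e, pi_b)
```

Behavior policy search then asks which pi_b makes the variance (and hence the mean squared error) of this estimator smallest, rather than simply deploying pi_e itself.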
The Universal Dependencies (UD) and Universal Morphology (UniMorph) projects each present schemata for annotating the morphosyntactic details of language.
Each project also provides corpora of annotated text in many languages - UD at the token level and UniMorph at the type level.
As each corpus is built by different annotators, language-specific decisions hinder the goal of universal schemata.
With compatibility of tags, each project's annotations could be used to validate the other's.
Additionally, the availability of both type- and token-level resources would be a boon to tasks such as parsing and homograph disambiguation.
To ease this interoperability, we present a deterministic mapping from Universal Dependencies v2 features into the UniMorph schema.
We validate our approach by lookup in the UniMorph corpora and find a macro-average of 64.13% recall.
We also note incompatibilities due to paucity of data on either side.
Finally, we present a critical evaluation of the foundations, strengths, and weaknesses of the two annotation projects.
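The deterministic mapping can be pictured as a feature-by-feature lookup table from UD v2 features to UniMorph tags. The entries below are a small hypothetical fragment for illustration, not the paper's full table:

```python
# Illustrative fragment of a deterministic UD-feature -> UniMorph-tag
# mapping. The table entries are a small hypothetical subset chosen for
# illustration, not the actual mapping from the paper.

UD_TO_UNIMORPH = {
    "Number=Sing": "SG",
    "Number=Plur": "PL",
    "Tense=Past": "PST",
    "Tense=Pres": "PRS",
}

def convert(ud_features):
    # unmapped features are dropped; real incompatibilities need care
    return ";".join(UD_TO_UNIMORPH[f] for f in ud_features if f in UD_TO_UNIMORPH)

tag = convert(["Number=Plur", "Tense=Past"])
```

Validation then amounts to looking up the converted tag bundles in the UniMorph corpora and measuring recall per language.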
Kernel audit logs are an invaluable source of information in the forensic investigation of a cyber-attack.
However, the coarse granularity of dependency information in audit logs leads to the construction of huge attack graphs which contain false or inaccurate dependencies.
To overcome this problem, we propose a system, called ProPatrol, which leverages the open compartmentalized design in families of enterprise applications used in security-sensitive contexts (e.g., browser, chat client, email client).
To achieve its goal, ProPatrol infers a model for an application's high-level tasks as input-processing compartments using purely the audit log events generated by that application.
The main benefit of this approach is that it does not rely on source code or binary instrumentation, but only on a preliminary and general knowledge of an application's architecture to bootstrap the analysis.
Our experiments with enterprise-level attacks demonstrate that ProPatrol significantly cuts down the forensic investigation effort and quickly pinpoints the root cause of attacks.
ProPatrol incurs less than 2% runtime overhead on a commodity operating system.
Extracting text objects from the PDF images is a challenging problem.
The text data present in PDF images contains useful information for automatic annotation, indexing, etc.
However, variations in text style, font, size, orientation, and alignment, as well as complex document structure, make automatic text extraction an extremely difficult and challenging task.
This paper presents two techniques under block-based classification.
After a brief introduction to the classification methods, two methods are enhanced and their results evaluated.
Both models are assessed on segmentation performance and time consumption.
Identifying the factors that influence academic performance is an essential part of educational research.
Previous studies have documented the importance of personality traits, class attendance, and social network structure.
Because most of these analyses were based on a single behavioral aspect and/or small sample sizes, there is currently no quantification of the interplay of these factors.
Here, we study the academic performance among a cohort of 538 undergraduate students forming a single, densely connected social network.
Our work is based on data collected using smartphones, which the students used as their primary phones for two years.
The availability of multi-channel data from a single population allows us to directly compare the explanatory power of individual and social characteristics.
We find that the most informative indicators of performance are based on social ties and that network indicators result in better model performance than individual characteristics (including both personality and class attendance).
We confirm earlier findings that class attendance is the most important predictor among individual characteristics.
Finally, our results suggest the presence of strong homophily and/or peer effects among university students.
Considering the level of competition prevailing in Business-to-Consumer (B2C) E-Commerce domain and the huge investments required to attract new customers, firms are now giving more focus to reduce their customer churn rate.
Churn rate is the ratio of customers who part ways with the firm in a specific time period.
One of the best mechanisms to retain current customers is to identify any potential churn and respond quickly to prevent it.
Detecting early signs of a potential churn, recognizing what the customer is looking for from their behaviour, and automating personalized win-back campaigns are essential to sustain business in this era of competition.
E-Commerce firms normally possess large volumes of data pertaining to their existing customers, such as transaction history, search history, periodicity of purchases, etc.
Data mining techniques can be applied to analyse customer behaviour and to predict the potential customer attrition so that special marketing strategies can be adopted to retain them.
This paper proposes an integrated model that can predict customer churn and also recommend personalized win back actions.
This paper introduces a novel anchor design to support anchor-based face detection for superior scale-invariant performance, especially on tiny faces.
To achieve this, we explicitly address the problem that anchor-based detectors drop performance drastically on faces with tiny sizes, e.g. less than 16x16 pixels.
In this paper, we investigate why this is the case.
We discover that current anchor design cannot guarantee high overlaps between tiny faces and anchor boxes, which increases the difficulty of training.
The new Expected Max Overlapping (EMO) score is proposed which can theoretically explain the low overlapping issue and inspire several effective strategies of new anchor design leading to higher face overlaps, including anchor stride reduction with new network architectures, extra shifted anchors, and stochastic face shifting.
Comprehensive experiments show that our proposed method significantly outperforms the baseline anchor-based detector, while consistently achieving state-of-the-art results on challenging face detection datasets with competitive runtime speed.
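The low-overlap problem that the EMO score formalizes can be seen numerically: the intersection-over-union between a tiny face box and the best-matching anchor on a regular grid improves when the anchor stride is reduced. A small sketch with toy box coordinates:

```python
# Sketch of the overlap issue motivating the EMO score: IoU between a
# tiny face box and the best anchor on a regular grid. Reducing the
# anchor stride raises the achievable overlap. Boxes are (x1, y1, x2, y2);
# all values are toy numbers, not from the paper.

def iou(a, b):
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def best_anchor_iou(face, size, stride, grid=8):
    # place one size x size anchor at every grid point and keep the best IoU
    best = 0.0
    for gy in range(grid):
        for gx in range(grid):
            cx, cy = gx * stride, gy * stride
            anchor = (cx - size / 2, cy - size / 2, cx + size / 2, cy + size / 2)
            best = max(best, iou(face, anchor))
    return best

face = (5, 5, 21, 21)  # a 16x16 face off the grid centers
coarse = best_anchor_iou(face, 16, stride=16)
fine = best_anchor_iou(face, 16, stride=4)  # denser anchors overlap better
```

With stride 16 the best anchor here falls below the usual 0.5 matching threshold, so the tiny face gets no positive anchor at all; stride reduction, shifted anchors, and stochastic face shifting all attack exactly this gap.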
Image-to-image translation has recently received significant attention due to advances in deep learning.
Most works focus on learning either a one-to-one mapping in an unsupervised way or a many-to-many mapping in a supervised way.
However, a more practical setting is many-to-many mapping in an unsupervised way, which is harder due to the lack of supervision and the complex inner- and cross-domain variations.
To alleviate these issues, we propose the Exemplar Guided & Semantically Consistent Image-to-image Translation (EGSC-IT) network which conditions the translation process on an exemplar image in the target domain.
We assume that an image comprises a content component, which is shared across domains, and a style component specific to each domain.
Under the guidance of an exemplar from the target domain we apply Adaptive Instance Normalization to the shared content component, which allows us to transfer the style information of the target domain to the source domain.
To avoid semantic inconsistencies during translation that naturally appear due to the large inner- and cross-domain variations, we introduce the concept of feature masks that provide coarse semantic guidance without requiring the use of any semantic labels.
Experimental results on various datasets show that EGSC-IT does not only translate the source image to diverse instances in the target domain, but also preserves the semantic consistency during the process.
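The Adaptive Instance Normalization step applied to the shared content component follows the standard AdaIN formula: the content features are renormalized to the mean and standard deviation of the exemplar's style statistics. A minimal sketch on 1-D feature lists:

```python
# Minimal sketch of Adaptive Instance Normalization (AdaIN): content
# features are shifted and scaled to match the exemplar's style mean and
# standard deviation. 1-D feature lists are used here for brevity; in the
# network this is applied per channel of the shared content features.

from statistics import mean, pstdev

def adain(content, style, eps=1e-5):
    mc, sc = mean(content), pstdev(content)
    ms, ss = mean(style), pstdev(style)
    return [ss * (x - mc) / (sc + eps) + ms for x in content]

out = adain([0.0, 2.0], [10.0, 14.0])  # takes on the style's statistics
```

After the transfer, the output's mean and spread match the exemplar's, which is how style information from the target domain is injected into the source content.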
We present a hierarchical regression framework for estimating hand joint positions from single depth images based on local surface normals.
The hierarchical regression follows the tree-structured topology of the hand from wrist to fingertips.
We propose a conditional regression forest, i.e., the Frame Conditioned Regression Forest (FCRF) which uses a new normal difference feature.
At each stage of the regression, the frame of reference is established from either the local surface normal or previously estimated hand joints.
By making the regression with respect to the local frame, the pose estimation is more robust to rigid transformations.
We also introduce a new efficient approximation to estimate surface normals.
We verify the effectiveness of our method by conducting experiments on two challenging real-world datasets and show consistent improvements over previous discriminative pose estimation methods.
Existing visual tracking methods usually localize a target object with a bounding box, in which the performance of the foreground object trackers or detectors is often affected by the inclusion of background clutter.
To handle this problem, we learn a patch-based graph representation for visual tracking.
The tracked object is modeled with a graph whose nodes are a set of non-overlapping image patches; the weight of each node indicates how likely it is to belong to the foreground, and edge weights indicate the appearance compatibility of two neighboring nodes.
This graph is dynamically learned and applied in object tracking and model updating.
During the tracking process, the proposed algorithm performs three main steps in each frame.
First, the graph is initialized by assigning binary weights of some image patches to indicate the object and background patches according to the predicted bounding box.
Second, the graph is optimized to refine the patch weights by using a novel alternating direction method of multipliers.
Third, the object feature representation is updated by imposing the weights of patches on the extracted image features.
The object location is predicted by maximizing the classification score in the structured support vector machine.
Extensive experiments show that the proposed tracking algorithm performs well against the state-of-the-art methods on large-scale benchmark datasets.
Social networks offer a ready channel for fake and misleading news to spread and exert influence.
This paper examines the performance of different reputation algorithms when applied to a large and statistically significant portion of the news spread via Twitter.
Our main result is that simple algorithms based on the identity of the users spreading the news, as well as the words appearing in the titles and descriptions of the linked articles, are able to identify a large portion of fake or misleading news, while incurring only very low (<1%) false positive rates for mainstream websites.
We believe that these algorithms can be used as the basis of practical, large-scale systems for indicating to consumers which news sites deserve careful scrutiny and skepticism.
This paper introduces an automated heuristic process able to achieve high accuracy when matching graphical user interface widgets across multiple versions of a target application.
The proposed implementation is flexible as it allows full customization of the process and easy integration with existing tools for long term graphical user interface test case maintenance, software visualization and analysis.
Social capital has been studied in economics, sociology and political science as one of the key elements that promote the development of modern societies.
It can be defined as the source of capital that facilitates cooperation through shared social norms.
In this work, we investigate whether and to what extent synchronization aspects of mobile communication patterns are associated with social capital metrics.
Interestingly, our results show that our synchronization-based approach correlates well with existing social capital metrics (i.e., Referendum turnout, Blood donations, and Association density), and is also able to characterize the different roles played by high synchronization within a close proximity-based community and high synchronization among different communities.
Hence, the proposed approach can provide timely, effective analysis at a limited cost over a large territory.
In this paper we describe the implementation of a convolutional neural network (CNN) used to assess online review helpfulness.
To our knowledge, this is the first use of this architecture to address this problem.
We explore the impact of two related factors impacting CNN performance: different word embedding initializations and different input review lengths.
We also propose an approach to combining rating star information with review text to further improve prediction accuracy.
We demonstrate that this can improve the overall accuracy by 2%.
Finally, we evaluate the method on a benchmark dataset and show an improvement in accuracy relative to published results for traditional methods of 2.5% for a model trained using only review text and 4.24% for a model trained on a combination of rating star information and review text.
The sharing of network traces is an important prerequisite for the development and evaluation of efficient anomaly detection mechanisms.
Unfortunately, privacy concerns and data protection laws prevent network operators from sharing these data.
Anonymization is a promising solution in this context; however, it is unclear if the sanitization of data preserves the traffic characteristics or introduces artifacts that may falsify traffic analysis results.
In this paper, we examine the utility of anonymized flow traces for anomaly detection.
We quantitatively evaluate the impact of IP address anonymization, namely variations of permutation and truncation, on the detectability of large-scale anomalies.
Specifically, we analyze three weeks of un-sampled and non-anonymized network traces from a medium-sized backbone network.
We find that all anonymization techniques, except prefix-preserving permutation, degrade the utility of data for anomaly detection.
We show that the degree of degradation depends to a large extent on the nature and mix of anomalies present in a trace.
Moreover, we present a case study that illustrates how traffic characteristics of individual hosts are distorted by anonymization.
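The two anonymization families whose variations are evaluated, truncation and permutation, can be sketched directly on IPv4 addresses. Prefix-preserving permutation (e.g. Crypto-PAn) is more involved and is not reproduced here; the keyed affine map below is only an illustrative stand-in for a real cipher-based permutation.

```python
# Sketch of two IP-address anonymization families: truncation (zeroing
# low-order bits) and full-space permutation. The affine map with an odd
# multiplier is a bijection mod 2^32 and stands in for a real keyed
# cipher; prefix-preserving permutation is not reproduced here.

import ipaddress
import random

def truncate(ip, keep_bits=24):
    # zero the host bits, keeping only the /keep_bits prefix
    net = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(net.network_address)

def permute(ip, seed=42):
    rng = random.Random(seed)          # keyed offset; toy stand-in for a cipher
    n = int(ipaddress.ip_address(ip))
    m = (n * 2654435761 + rng.getrandbits(32)) % 2**32  # odd multiplier -> bijective
    return str(ipaddress.ip_address(m))

anon = truncate("192.168.17.42")  # hosts in the same /24 become indistinguishable
```

Truncation collapses all hosts of a prefix into one address, which is precisely the information loss that degrades host-level anomaly detection, while permutation preserves distinctness but destroys prefix structure.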
Currently, software industries are using different SDLC (software development life cycle) models which are designed for specific purposes.
The use of technology is booming in every perspective of life and the software behind the technology plays an enormous role.
As the technical complexities are increasing, successful development of software solely depends on the proper management of development processes.
It is therefore essential to introduce improved methodologies in the industry so that modern human-centred software applications can be developed, managed, and delivered to users successfully.
In this paper, we explore the characteristics of different SDLC models and perform a comparative analysis of them.
Deep neural networks require large amounts of resources which makes them hard to use on resource constrained devices such as Internet-of-things devices.
Offloading the computations to the cloud can circumvent these constraints but introduces a privacy risk since the operator of the cloud is not necessarily trustworthy.
We propose a technique that obfuscates the data before sending it to the remote computation node.
The obfuscated data is unintelligible for a human eavesdropper but can still be classified with a high accuracy by a neural network trained on unobfuscated images.
The reuse of code fragments by copying and pasting is widely practiced in software development and results in code clones.
Cloning is considered an anti-pattern as it negatively affects program correctness and increases maintenance efforts.
Programmable Logic Controller (PLC) software is no exception in the code clone discussion as reuse in development and maintenance is frequently achieved through copy, paste, and modification.
Even though the presence of code clones may not necessarily be a problem per se, it is important to detect, track and manage clones as the software system evolves.
Unfortunately, tool support for clone detection and management is not commonly available for PLC software systems or limited to generic tools with a reduced set of features.
In this paper, we investigate code clones in a real-world PLC software system based on IEC 61131-3 Structured Text and C/C++.
We extended a widely used tool for clone detection with normalization support.
Furthermore, we evaluated the different types and natures of code clones in the studied system and their relevance for refactoring.
Results shed light on the applicability and usefulness of clone detection in the context of industrial automation systems and demonstrate the benefit of adapting detection and management tools for IEC 61131-3 languages.
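The normalization support added to the clone detector serves to make copy-paste-modify clones (renamed variables, changed constants) match again. A simplified sketch on Structured-Text-like lines; the token patterns are an illustration, not the tool's actual normalizer:

```python
# Sketch of normalization for clone detection: identifiers and literals
# are replaced with placeholders so renamed/re-parameterized clones still
# match. The token patterns are a simplification for illustration only.

import re

def normalize(line):
    line = re.sub(r'"[^"]*"|\d+', "LIT", line)  # string and numeric literals
    # replace identifiers, keeping a few keywords and the LIT placeholder
    line = re.sub(r"\b(?!(?:IF|THEN|LIT)\b)[A-Za-z_]\w*\b", "ID", line)
    return line

a = normalize("IF speed > 100 THEN alarm := 1;")
b = normalize("IF temp > 75 THEN fault := 2;")  # a modified copy of the first line
```

After normalization both lines reduce to the same token sequence, so a textual clone detector reports them as a clone pair even though every identifier and constant was changed.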
We investigate optimal geographical caching in heterogeneous cellular networks, where different types of base stations (BSs) have different cache capacities.
The content library contains files with different popularities.
The performance metric is the total hit probability.
The problem of optimally placing content in all BSs jointly is not convex in general.
However, we show that when BSs are deployed according to homogeneous Poisson point processes (PPP), independently for each type, we can formulate the problem as a convex problem.
We give the optimal solution to the joint problem for PPP deployment.
For the general case, we provide a distributed local optimization algorithm (LOA) that finds the optimal placement policies for different types of BSs.
We find the optimal placement policy of the small BSs (SBSs) depending on the placement policy of the macro BSs (MBSs).
We show that storing the most popular content in the MBSs is almost optimal if the SBSs are using an optimal placement policy.
Also, for the SBSs no such heuristic can be used; the optimal placement is significantly better than storing the most popular content.
Finally, we numerically verify that LOA gives the same hit probability as the joint optimal solution for the PPP model.
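The hit-probability objective can be illustrated numerically: with Zipf-like file popularities and a per-file caching probability, the total hit probability is the popularity-weighted chance that a request finds its file cached. The numbers below are illustrative, not from the paper:

```python
# Numeric sketch of the hit-probability objective: total hit probability
# is the popularity-weighted probability that a requested file is cached.
# Zipf exponent and cache placements are illustrative toy values.

def zipf_popularity(n, gamma=0.8):
    w = [1 / (i + 1) ** gamma for i in range(n)]
    s = sum(w)
    return [x / s for x in w]

def hit_probability(popularity, cache_prob):
    # cache_prob[i]: probability that file i is stored at the serving BS
    return sum(p * c for p, c in zip(popularity, cache_prob))

pop = zipf_popularity(4)
top2 = hit_probability(pop, [1, 1, 0, 0])      # cache the two most popular files
tail = hit_probability(pop, [0, 0, 1, 1])      # cache the two least popular files
```

In the heterogeneous setting the interesting structure is that this most-popular-content placement is nearly optimal only for the MBSs, while the SBS placement must be optimized jointly.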
In the analysis of logic programs, abstract domains for detecting sharing and linearity information are widely used.
Devising abstract unification algorithms for such domains has proved to be rather hard.
At the moment, the available algorithms are correct but not optimal, i.e., they cannot fully exploit the information conveyed by the abstract domains.
In this paper, we define a new (infinite) domain ShLin-w which can be thought of as a general framework from which other domains can be easily derived by abstraction.
ShLin-w makes the interaction between sharing and linearity explicit.
We provide a constructive characterization of the optimal abstract unification operator on ShLin-w and lift it to two well-known abstractions of ShLin-w: the classical Sharing X Lin abstract domain and the more precise ShLin-2 abstract domain by Andy King.
In the case of single binding substitutions, we obtain optimal abstract unification algorithms for such domains.
To appear in Theory and Practice of Logic Programming (TPLP).
Software-Defined Networks (SDN) have seen increasing deployment because they offer better network manageability than traditional networks.
Despite their immense success and popularity, various security issues in SDN remain open problems for research.
Particularly, the problem of securing the controllers in distributed environment is still short of any solutions.
This paper proposes a scheme to identify any rogue/malicious controller(s) in a distributed environment.
Our scheme is based on trust and reputation system which is centrally managed.
As such, our scheme identifies any controllers acting maliciously by comparing the state of installed flows/policies with policies that should be installed.
Controllers rate each other on this basis and report the results to a central entity, which reports it to the network administrator.
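The comparison step at the heart of the scheme, rating a controller by checking the flows it actually installed against the flows the agreed policy prescribes, can be sketched as a simple set comparison. Names and the flow encoding are hypothetical:

```python
# Sketch of the rating step in the proposed scheme: a controller's
# installed flows are compared against the flows the agreed policy says
# should be installed. Flow identifiers and names are hypothetical.

def rate_controller(installed, expected):
    expected, installed = set(expected), set(installed)
    # fraction of expected flows actually in place; extra/rogue flows
    # simply fail to contribute to the agreement score
    return len(installed & expected) / len(expected)

good = rate_controller({"f1", "f2", "f3"}, {"f1", "f2", "f3"})
rogue = rate_controller({"f1", "evil"}, {"f1", "f2", "f3"})  # missing flows, plus one injected
```

In the full scheme each controller computes such ratings about its peers and reports them to the central trust and reputation entity, which flags controllers whose scores fall below expectation to the network administrator.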
In this contribution, a direct comparison of the Offset-QAM-OFDM (OQAM-OFDM) and the Cyclic Prefix OFDM (CP-OFDM) scheme is given for an 802.11a based system.
To this end, the chosen algorithms and design choices are described and evaluated as a whole system in terms of bit and frame error rate (BER/FER) performance, as well as spectral efficiency and complexity, in the presence of multipath propagation for different modulation orders.
The results show that the OQAM-OFDM scheme exhibits similar BER and FER performance at a 24% higher spectral efficiency and achievable throughput at the cost of an up to five times increased computational complexity.
This paper focuses on numeric data, with emphasis on distinct characteristics such as varying significance, unstructured format, massive volume, and real-time processing.
We propose a novel, context-dependent valuation framework specifically devised to assess quality in numeric datasets.
Our framework uses eight relevant data quality dimensions and provides a simple metric to evaluate dataset quality along each dimension.
We argue that the proposed set of dimensions and corresponding metrics adequately captures the unique quality antipatterns that are typically associated with numerical data.
The introduction of our framework is part of a wider research effort that aims at developing an articulated numerical data quality improvement approach for Oil and Gas exploration and production workflows that is based on artificial intelligence techniques.
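As a hedged illustration of what a simple per-dimension metric could look like (the dimension chosen and the formula below are our own example, not taken from the framework), a completeness score for a numeric column can be computed as the fraction of present, finite values:

```python
# Illustrative completeness metric for a numeric column: the share of
# entries that are present, numeric, and finite (no None, no NaN/inf).
import math

def completeness(values):
    """Fraction of entries that are present, finite numbers."""
    valid = [v for v in values if v is not None
             and isinstance(v, (int, float)) and math.isfinite(v)]
    return len(valid) / len(values) if values else 0.0

readings = [3.2, None, 7.1, float("nan"), 5.0]
print(completeness(readings))  # 0.6
```

Normalizing every dimension to [0, 1] in this way makes scores comparable across dimensions and datasets.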
The digitalization of the legal domain has been ongoing for a couple of years.
In that process, the application of different machine learning (ML) techniques is crucial.
Tasks such as the classification of legal documents or contract clauses as well as the translation of those are highly relevant.
On the other side, digitized documents are barely accessible in this field, particularly in Germany.
Today, deep learning (DL) is one of the hot topics with many publications and various applications.
Sometimes it provides results outperforming the human level.
Hence this technique may be feasible for the legal domain as well.
However, DL requires thousands of samples to provide decent results.
A potential solution to this problem is multi-task DL to enable transfer learning.
This approach may be able to overcome the data scarcity problem in the legal domain, specifically for the German language.
We applied the state-of-the-art multi-task model to three tasks: translation, summarization, and multi-label classification.
The experiments were conducted on legal document corpora utilizing several task combinations as well as various model parameters.
The goal was to find the optimal configuration for the tasks at hand within the legal domain.
The multi-task DL approach outperformed the state-of-the-art results on all three tasks.
This opens a new direction to integrate DL technology more efficiently in the legal domain.
Intelligent code completion has become an essential research task to accelerate modern software development.
To facilitate effective code completion for dynamically-typed programming languages, we apply neural language models by learning from large codebases, and develop a tailored attention mechanism for code completion.
However, standard neural language models, even with an attention mechanism, cannot correctly predict out-of-vocabulary (OoV) words, which restricts code completion performance.
In this paper, inspired by the prevalence of locally repeated terms in program source code, and the recently proposed pointer copy mechanism, we propose a pointer mixture network for better predicting OoV words in code completion.
Based on the context, the pointer mixture network learns to either generate a within-vocabulary word through an RNN component, or regenerate an OoV word from local context through a pointer component.
Experiments on two benchmarked datasets demonstrate the effectiveness of our attention mechanism and pointer mixture network on the code completion task.
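The mixture can be sketched numerically (our own simplification, not the authors' implementation): a learned gate blends the RNN's within-vocabulary distribution with a pointer distribution over words in the local context, so an OoV identifier can still be "copied" from nearby code.

```python
# Gated mixture of a vocabulary distribution and a copy distribution;
# all probabilities below are made-up illustrative values.

def pointer_mixture(vocab, vocab_probs, context_words, copy_probs, gate):
    """Return word -> probability under the gated mixture distribution."""
    mixed = {w: gate * p for w, p in zip(vocab, vocab_probs)}
    for w, p in zip(context_words, copy_probs):
        mixed[w] = mixed.get(w, 0.0) + (1 - gate) * p
    return mixed

vocab = ["if", "return", "UNK"]
vocab_probs = [0.5, 0.3, 0.2]          # RNN softmax over the vocabulary
context_words = ["my_var", "return"]   # local context; my_var is OoV
copy_probs = [0.9, 0.1]                # pointer attention over the context
probs = pointer_mixture(vocab, vocab_probs, context_words, copy_probs, gate=0.4)
best = max(probs, key=probs.get)
print(best)  # my_var: the OoV identifier wins through the copy path
```

Note how "return" accumulates mass from both paths, while "my_var" is reachable only through the pointer component.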
I describe how real quantum annealers may be used to perform local (in state space) searches around specified states, rather than the global searches traditionally implemented in the quantum annealing algorithm.
The quantum annealing algorithm is an analogue of simulated annealing, a classical numerical technique which is now obsolete.
Hence, I explore strategies to use an annealer in a way that takes advantage of modern classical optimization algorithms, and that additionally should be less sensitive to problem mis-specification than the traditional quantum annealing algorithm.
We compute on a computer the best linear unbiased generalized statistics of a random field for an unknown constant mean, and derive the numerical generalized least-squares estimator of that mean.
We derive the third constraint of spatial statistics and show that the classic generalized least-squares estimator of an unknown constant mean of the field is only an asymptotic disjunction of the numerical one.
Quaternion-valued wireless communication systems have been studied in the past.
Although progress has been made in this promising area, a crucial missing link is the lack of effective and efficient quaternion-valued signal processing algorithms for channel equalisation and beamforming.
With the most recent developments in quaternion-valued signal processing, in this work we fill this gap and further derive the quaternion-valued Wiener solution for block-based calculation.
Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry.
Here we introduce a powerful new approach for learning generative models over graphs, which can capture both their structure and attributes.
Our approach uses graph neural networks to express probabilistic dependencies among a graph's nodes and edges, and can, in principle, learn distributions over any arbitrary graph.
In a series of experiments our results show that once trained, our models can generate good quality samples of both synthetic graphs as well as real molecular graphs, both unconditionally and conditioned on data.
Compared to baselines that do not use graph-structured representations, our models often perform far better.
We also explore key challenges of learning generative models of graphs, such as how to handle symmetries and ordering of elements during the graph generation process, and offer possible solutions.
Our work is the first and most general approach for learning generative models over arbitrary graphs, and opens new directions for moving away from restrictions of vector- and sequence-like knowledge representations, toward more expressive and flexible relational data structures.
Building self-adaptive and self-organizing (SASO) systems is a challenging problem, in part because SASO principles are not yet well understood and few platforms exist for exploring them.
Cellular automata (CA) are a well-studied approach to exploring the principles underlying self-organization.
A CA comprises a lattice of cells whose states change over time based on a discrete update function.
One challenge to developing CA is that the relationship of an update function, which describes the local behavior of each cell, to the global behavior of the entire CA is often unclear.
As a result, many researchers have used stochastic search techniques, such as evolutionary algorithms, to automatically discover update functions that produce a desired global behavior.
However, these update functions are typically defined in a way that does not provide for self-adaptation.
Here we describe an approach to discovering CA update functions that are both self-adaptive and self-organizing.
Specifically, we use a novel evolutionary algorithm-based approach to discover finite state machines (FSMs) that implement update functions for CA.
We show how this approach is able to evolve FSM-based update functions that perform well on the density classification task for 1-, 2-, and 3-dimensional CA.
Moreover, we show that these FSMs are self-adaptive, self-organizing, and highly scalable, often performing well on CA that are orders of magnitude larger than those used to evaluate performance during the evolutionary search.
These results demonstrate that CA are a viable platform for studying the integration of self-adaptation and self-organization, and strengthen the case for using evolutionary algorithms as a component of SASO systems.
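To make the setup concrete, here is a minimal 1-D binary CA (not one of the paper's evolved FSMs): every cell applies the same local update function, here a simple three-cell majority vote. On the density classification task this naive rule typically freezes in mixed blocks instead of converging to the majority state, which illustrates why the local-to-global mapping is hard and why search for better update functions is needed.

```python
# 1-D binary CA on a ring; each cell takes the majority of its
# 3-cell neighbourhood at every step (a naive density-classification rule).

def step(cells):
    n = len(cells)
    return [1 if cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n] >= 2 else 0
            for i in range(n)]

def run(cells, steps):
    for _ in range(steps):
        cells = step(cells)
    return cells

initial = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1]   # majority of 1s
final = run(initial, 20)
print(final)  # stuck at a mixed fixed point rather than all 1s
```

A correct classifier would drive this lattice to all 1s; the evolved FSM rules in the paper are searched for precisely because hand-written local rules like this one fail.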
Deep reinforcement learning, and especially the Asynchronous Advantage Actor-Critic algorithm, has been successfully used to achieve super-human performance in a variety of video games.
Starcraft II is a new challenge for the reinforcement learning community with the release of pysc2 learning environment proposed by Google Deepmind and Blizzard Entertainment.
Despite the environment being a target for several AI developers, few have achieved human-level performance.
In this project we explain the complexities of this environment and discuss the results from our experiments on the environment.
We have compared various architectures and shown that transfer learning can be an effective paradigm in reinforcement learning research for complex scenarios requiring skill transfer.
Deep Neural Networks (DNNs) have advanced the state-of-the-art in a variety of machine learning tasks and are deployed in increasing numbers of products and services.
However, the computational requirements of training and evaluating large-scale DNNs are growing at a much faster pace than the capabilities of the underlying hardware platforms that they are executed upon.
In this work, we propose Dynamic Variable Effort Deep Neural Networks (DyVEDeep) to reduce the computational requirements of DNNs during inference.
Previous efforts propose specialized hardware implementations for DNNs, statically prune the network, or compress the weights.
Complementary to these approaches, DyVEDeep is a dynamic approach that exploits the heterogeneity in the inputs to DNNs to improve their compute efficiency with comparable classification accuracy.
DyVEDeep equips DNNs with dynamic effort mechanisms that, in the course of processing an input, identify how critical a group of computations are to classify the input.
DyVEDeep dynamically focuses its compute effort only on the critical computations, while skipping or approximating the rest.
We propose 3 effort knobs that operate at different levels of granularity viz. neuron, feature and layer levels.
We build DyVEDeep versions for 5 popular image recognition benchmarks - one for CIFAR-10 and four for ImageNet (AlexNet, OverFeat, VGG-16, and weight-compressed AlexNet).
Across all benchmarks, DyVEDeep achieves 2.1x-2.6x reduction in the number of scalar operations, which translates to 1.8x-2.3x performance improvement over a Caffe-based implementation, with < 0.5% loss in accuracy.
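As a hedged sketch of what a neuron-level effort knob can look like (our own simplification; DyVEDeep's actual mechanisms, thresholds, and parameters differ), a partial dot product can reveal early that a ReLU neuron is headed for zero, letting the remaining multiply-accumulates be skipped:

```python
# Toy neuron-level dynamic-effort knob: peek at the first few terms of the
# dot product and skip the rest when the ReLU output is confidently zero.
# The peek size and margin are illustrative hyperparameters.

def relu_neuron_dynamic(x, w, bias, peek=4, margin=5.0):
    """Approximate max(0, w.x + b), skipping work for clearly-negative neurons."""
    partial = bias + sum(wi * xi for wi, xi in zip(w[:peek], x[:peek]))
    if partial < -margin:          # confidently negative: skip remaining terms
        return 0.0
    full = partial + sum(wi * xi for wi, xi in zip(w[peek:], x[peek:]))
    return max(0.0, full)

x = [1.0, 2.0, 0.5, 1.5, 0.25, 0.75]
w = [-3.0, -2.0, -1.0, -2.0, 0.1, 0.2]
print(relu_neuron_dynamic(x, w, bias=0.0))  # 0.0 without touching the tail
```

The margin controls the accuracy/effort trade-off: a larger margin skips less but never flips a neuron that would have been positive.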
Application Programming Interfaces (APIs) often have usage constraints, such as restrictions on call order or call conditions.
API misuses, i.e., violations of these constraints, may lead to software crashes, bugs, and vulnerabilities.
Though researchers developed many API-misuse detectors over the last two decades, recent studies show that API misuses are still prevalent.
Therefore, we need to understand the capabilities and limitations of existing detectors in order to advance the state of the art.
In this paper, we present the first-ever qualitative and quantitative evaluation that compares static API-misuse detectors along the same dimensions, and with original author validation.
To accomplish this, we develop MUC, a classification of API misuses, and MUBenchPipe, an automated benchmark for detector comparison, on top of our misuse dataset, MUBench.
Our results show that the capabilities of existing detectors vary greatly and that existing detectors, though capable of detecting misuses, suffer from extremely low precision and recall.
A systematic root-cause analysis reveals that, most importantly, detectors need to go beyond the naive assumption that a deviation from the most-frequent usage corresponds to a misuse and need to obtain additional usage examples to train their models.
We present possible directions towards more-powerful API-misuse detectors.
Semiconstrained systems were recently suggested as a generalization of constrained systems, commonly used in communication and data-storage applications that require certain offending subsequences be avoided.
In an attempt to apply techniques from constrained systems, we study sequences of constrained systems that are contained in, or contain, a given semiconstrained system, while approaching its capacity.
In the case of contained systems, we describe two such sequences, resulting in constant-to-constant bit-rate block encoders and sliding-block encoders.
Surprisingly, in the case of containing systems we show that a "generic" semiconstrained system is never contained in a proper fully-constrained system.
Transfer learning using pre-trained Convolutional Neural Networks (CNNs) has been successfully applied to images for different classification tasks.
In this paper, we propose a new pipeline for pain expression recognition in neonates using transfer learning.
Specifically, we propose to exploit a pre-trained CNN that was originally trained on a relatively similar dataset for face recognition (VGG Face) as well as CNNs that were pre-trained on a relatively different dataset for image classification (iVGG F,M, and S) to extract deep features from neonates' faces.
In the final stage, several supervised machine learning classifiers are trained to classify neonates' facial expression into pain or no pain expression.
The proposed pipeline achieved, on a testing dataset, 0.841 AUC and 90.34% accuracy, which is approximately 7% higher than the accuracy of handcrafted traditional features.
We also propose to combine deep features with traditional features and hypothesize that the mixed features would improve pain classification performance.
Combining deep features with traditional features achieved 92.71% accuracy and 0.948 AUC.
These results show that transfer learning, which is a faster and more practical option than training a CNN from scratch, can be used to extract useful features for pain expression recognition in neonates.
It also shows that combining deep features with traditional handcrafted features is a good practice to improve the performance of pain expression recognition and possibly the performance of similar applications.
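The feature-combination step reduces to simple concatenation before classification. The sketch below illustrates that step with made-up feature values and a toy nearest-centroid classifier standing in for the supervised classifiers; none of the numbers or class boundaries come from the paper.

```python
# Concatenate deep and handcrafted feature vectors, then classify with a
# toy nearest-centroid rule (all feature values fabricated for illustration).

def combine(deep_feats, handcrafted_feats):
    return deep_feats + handcrafted_feats

def nearest_centroid(train, labels, sample):
    """Classify by squared distance to per-class mean feature vectors."""
    centroids = {}
    for lbl in set(labels):
        rows = [f for f, l in zip(train, labels) if l == lbl]
        centroids[lbl] = [sum(col) / len(rows) for col in zip(*rows)]
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda lbl: dist(centroids[lbl], sample))

train = [combine([0.9, 0.1], [0.8]), combine([0.8, 0.2], [0.7]),   # pain
         combine([0.1, 0.9], [0.2]), combine([0.2, 0.8], [0.1])]   # no pain
labels = ["pain", "pain", "no pain", "no pain"]
sample = combine([0.85, 0.15], [0.75])
print(nearest_centroid(train, labels, sample))  # pain
```

In the actual pipeline the concatenated vectors would feed standard supervised classifiers (e.g. SVMs or random forests) rather than this toy rule.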
In this paper, we leverage the efficiency of Binarized Neural Networks (BNNs) to learn complex state transition models of planning domains with discretized factored state and action spaces.
In order to directly exploit this transition structure for planning, we present two novel compilations of the learned factored planning problem with BNNs based on reductions to Weighted Partial Maximum Boolean Satisfiability (FD-SAT-Plan+) as well as Binary Linear Programming (FD-BLP-Plan+).
Theoretically, we show that our SAT-based Bi-Directional Neuron Activation Encoding is asymptotically the most compact encoding in the literature and maintains the generalized arc-consistency property through unit propagation -- an important property that facilitates efficiency in SAT solvers.
Experimentally, we validate the computational efficiency of our Bi-Directional Neuron Activation Encoding in comparison to an existing neuron activation encoding and demonstrate the effectiveness of learning complex transition models with BNNs.
We test the runtime efficiency of both FD-SAT-Plan+ and FD-BLP-Plan+ on the learned factored planning problem showing that FD-SAT-Plan+ scales better with increasing BNN size and complexity.
Finally, we present a finite-time incremental constraint generation algorithm based on generalized landmark constraints to improve the planning accuracy of our encodings through simulated or real-world interaction.
The main aim of this paper is to discuss how the combination of Web 2.0, social media and geographic technologies can provide opportunities for learning and new forms of participation in an urban design studio.
This discussion is mainly based on our recent findings from two experimental urban design studio setups as well as former research and literature studies.
In brief, the web platform enabled us to extend the learning that took place in the design studio beyond the studio hours, to represent the design information in novel ways and allocate multiple communication forms.
We found that the student activity in the introduced web platform was related to their progress up to a certain extent.
Moreover, the students perceived the platform as a convenient medium and addressed it as a valuable resource for learning.
This study should be conceived as a continuation of a series of our Design Studio 2.0 experiments which involve the exploitation of opportunities provided by novel socio-geographic information and communication technologies for the improvement of the design learning processes.
We formalize the problem of multi-agent path finding with deadlines (MAPF-DL).
The objective is to maximize the number of agents that can reach their given goal vertices from their given start vertices within a given deadline, without colliding with each other.
We first show that the MAPF-DL problem is NP-hard to solve optimally.
We then present an optimal MAPF-DL algorithm based on a reduction of the MAPF-DL problem to a flow problem and a subsequent compact integer linear programming formulation of the resulting reduced abstracted multi-commodity flow network.
The considered problem is how to optimally allocate a set of jobs to technicians of different skills such that the number of technicians of each skill does not exceed the number of persons with that skill designation.
The key motivation is the quick sensitivity analysis in terms of the workforce size which is quite necessary in many industries in the presence of unexpected work orders.
A time-indexed mathematical model is proposed to minimize the total weighted completion time of the jobs.
The proposed model is decomposed into a number of single-skill sub-problems so that each one is a combination of a series of nested binary Knapsack problems.
A heuristic procedure is proposed to solve the problem.
Our experimental results, based on a real-world case study, reveal that the proposed method quickly produces a schedule statistically close to the optimal one while the classical optimal procedure is very time-consuming.
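Since each single-skill sub-problem reduces to nested binary knapsacks, the classic 0/1 knapsack dynamic program is the relevant building block; the sketch below uses illustrative weights and values, not data from the case study.

```python
# Classic 0/1 knapsack DP in O(n * capacity): the backward scan over
# capacities ensures each item is used at most once.

def knapsack(capacity, weights, values):
    """Maximum total value achievable within the given capacity."""
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(knapsack(10, [5, 4, 6, 3], [10, 40, 30, 50]))  # 90
```

In the scheduling context, capacity would correspond to available technician time for one skill and item values to weighted completion-time gains.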
We introduce a new artificial intelligence (AI) approach called the 'Digital Synaptic Neural Substrate' (DSNS).
It uses selected attributes from objects in various domains (e.g. chess problems, classical music, renowned artworks) and recombines them in such a way as to generate new attributes that can then, in principle, be used to create novel objects of creative value to humans relating to any one of the source domains.
This allows some of the burden of creative content generation to be passed from humans to machines.
The approach was tested in the domain of chess problem composition.
We used it to automatically compose numerous sets of chess problems based on attributes extracted and recombined from chess problems and tournament games by humans, renowned paintings, computer-evolved abstract art, photographs of people, and classical music tracks.
The quality of these generated chess problems was then assessed automatically using an existing and experimentally-validated computational chess aesthetics model.
They were also assessed by human experts in the domain.
The results suggest that attributes collected and recombined from chess and other domains using the DSNS approach can indeed be used to automatically generate chess problems of reasonably high aesthetic quality.
In particular, a low quality chess source (i.e. tournament game sequences between weak players) used in combination with actual photographs of people was able to produce three-move chess problems of comparable or better quality than those generated using a high quality chess source (i.e. published compositions by human experts), and more efficiently as well.
Why information from a foreign domain can be integrated and functional in this way remains an open question for now.
The DSNS approach is, in principle, scalable and applicable to any domain in which objects have attributes that can be represented using real numbers.
Content based Document Classification is one of the biggest challenges in the context of free text mining.
Current algorithms for document classification mostly rely on cluster analysis based on the bag-of-words approach.
Nevertheless, that method is still being applied to many modern scientific problems.
It has established a strong enough presence in fields like economics and social science to merit serious attention from researchers.
In this paper we propose and explore an alternative grounded more securely in dictionary classification and the correlatedness of words and phrases.
It is expected that application of our existing knowledge about the underlying classification structure may lead to improvement of the classifier's performance.
Theories for visually guided action account for online control in the presence of reliable sources of visual information, and predictive control to compensate for visuomotor delay and temporary occlusion.
In this study, we characterize the temporal relationship between information integration window and prediction distance using computational models.
Subjects were immersed in a simulated environment and attempted to catch virtual balls that were transiently "blanked" during flight.
Recurrent neural networks were trained to reproduce subjects' gaze and hand movements during the blank.
The models successfully predict gaze behavior within 3 degrees and hand movements within 8.5 cm as far as 500 ms ahead, with an integration window as short as 27 ms.
Furthermore, we quantified the contribution of each input source of information to motor output through an ablation study.
The model is a proof of concept for prediction as a discrete mapping between information integrated over time and a temporally distant motor output.
We outline a program in the area of formalization of mathematics to automate theorem proving in algebra and algebraic geometry.
We propose a construction of a dictionary between automated theorem provers and (La)TeX exploiting syntactic parsers.
We describe its application to a repository of human-written facts and definitions in algebraic geometry (The Stacks Project).
We use deep learning techniques.
We introduce a class of causal video understanding models that aims to improve efficiency of video processing by maximising throughput, minimising latency, and reducing the number of clock cycles.
Leveraging operation pipelining and multi-rate clocks, these models perform a minimal amount of computation (e.g. as few as four convolutional layers) for each frame per timestep to produce an output.
The models are still very deep, with dozens of such operations being performed but in a pipelined fashion that enables depth-parallel computation.
We illustrate the proposed principles by applying them to existing image architectures and analyse their behaviour on two video tasks: action recognition and human keypoint localisation.
The results show that a significant degree of parallelism, and implicitly speedup, can be achieved with little loss in performance.
In a voice-controlled smart home, a controller must respond not only to the user's requests but also according to the interaction context.
This paper describes Arcades, a system that uses deep reinforcement learning to extract context from a graphical representation of the home automation system and to continuously adapt its behavior to the user's.
This system is robust to changes in the environment (sensor breakdown or addition) thanks to its graphical representation (which scales well) and the reinforcement mechanism (which adapts well).
The experiments on realistic data demonstrate that this method is a promising route to long-term context-aware control of smart homes.
The main thrust of the article is to provide interesting examples, useful for students, of using bitwise operations in the programming languages C++ and Java.
As an example, we describe an algorithm for obtaining a Latin square of arbitrary order.
We will outline some techniques for the use of bitwise operations.
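The article works in C++ and Java; the same flavour of bitwise trick can be sketched in Python. For orders n that are powers of two (a restriction of this particular construction, not a claim from the article, whose algorithm handles arbitrary order), L[i][j] = i XOR j yields a Latin square, because XOR with a fixed index is a permutation of 0..n-1:

```python
# XOR construction of a Latin square: for n a power of two, row i is the
# permutation j -> i ^ j, so every symbol appears once per row and column.

def xor_latin_square(n):
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    return [[i ^ j for j in range(n)] for i in range(n)]

for row in xor_latin_square(4):
    print(row)
# [0, 1, 2, 3]
# [1, 0, 3, 2]
# [2, 3, 0, 1]
# [3, 2, 1, 0]
```

The power-of-two check itself uses a classic bitwise idiom: n & (n - 1) clears the lowest set bit, so it is zero exactly when n has a single set bit.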
The interconnection network comprises a significant portion of the cost of large parallel computers, both in economic terms and power consumption.
Several previous proposals exploit large-radix routers to build scalable low-distance topologies with the aim of minimizing these costs.
However, they fail to consider potential unbalance in the network utilization, which in some cases results in suboptimal designs.
Based on an appropriate cost model, this paper advocates the use of networks based on incidence graphs of projective planes, broadly denoted as Projective Networks.
Projective Networks rely on highly symmetric generalized Moore graphs and encompass several proposed direct (PN and demi-PN) and indirect (OFT) topologies under a common mathematical framework.
Compared to other proposals with average distance between 2 and 3 hops, these networks provide very high scalability while preserving a balanced network utilization, resulting in low network costs.
Overall, Projective Networks constitute a competitive alternative for exascale-level interconnection network design.
The main goal of the paper is to provide Pepper with a near real-time object recognition system based on deep neural networks.
The proposed system is based on YOLO (You Only Look Once), a deep neural network that is able to detect and recognize objects robustly and at a high speed.
In addition, considering that YOLO cannot be run in the Pepper's internal computer in near real-time, we propose to use a Backpack for Pepper, which holds a Jetson TK1 card and a battery.
By using this card, Pepper is able to robustly detect and recognize objects in images of 320x320 pixels at about 5 frames per second.
We study two-receiver Poisson channels using tools derived from stochastic calculus.
We obtain a general formula for the mutual information over the Poisson channel that allows for conditioning and the use of auxiliary random variables.
We then use this formula to compute necessary and sufficient conditions under which one Poisson channel is less noisy and/or more capable than another, which turn out to be distinct from the conditions under which this ordering holds for the discretized versions of the channels.
We also use the general formula to determine the capacity region of the more capable Poisson broadcast channel with independent message sets, the more capable Poisson wiretap channel, and the general two-decoder Poisson broadcast channel with degraded message sets.
Pre-operative Abdominal Aortic Aneurysm (AAA) 3D shape is critical for customized stent-graft design in Fenestrated Endovascular Aortic Repair (FEVAR).
Traditional segmentation approaches implement expert-designed feature extractors while recent deep neural networks extract features automatically with multiple non-linear modules.
Usually, a large training dataset is essential for applying deep learning on AAA segmentation.
In this paper, the AAA was segmented using U-net with a small number (two) of training subjects.
Firstly, Computed Tomography Angiography (CTA) slices were augmented with gray value variation and translation to avoid the overfitting caused by the small number of training subjects.
Then, U-net was trained to segment the AAA.
Dice Similarity Coefficients (DSCs) over 0.8 were achieved on the testing subjects.
The proximal landing zone (PLZ), distal landing zone (DLZ), and aortic branches are all reconstructed reasonably, which will facilitate stent graft customization and help shape instantiation for intra-operative surgery navigation in FEVAR.
Recent work on loglinear models in probabilistic constraint logic programming is applied to first-order probabilistic reasoning.
Probabilities are defined directly on the proofs of atomic formulae, and by marginalisation on the atomic formulae themselves.
We use Stochastic Logic Programs (SLPs) composed of labelled and unlabelled definite clauses to define the proof probabilities.
We obtain a conservative extension of first-order reasoning, so that, for example, there is a one-to-one mapping between logical and random variables.
We show how, in this framework, Inductive Logic Programming (ILP) can be used to induce the features of a loglinear model from data.
We also compare the presented framework with other approaches to first-order probabilistic reasoning.
Recommender systems in academia are not widely available.
This may be in part due to the difficulty and cost of developing and maintaining recommender systems.
Many operators of academic products such as digital libraries and reference managers avoid this effort, although a recommender system could provide significant benefits to their users.
In this paper, we introduce Mr. DLib's "Recommendations as-a-Service" (RaaS) API that allows operators of academic products to easily integrate a scientific recommender system into their products.
Mr. DLib generates recommendations for research articles, but in the future recommendations may include calls for papers, grants, etc.
Operators of academic products can request recommendations from Mr. DLib and display these recommendations to their users.
Mr. DLib can be integrated in just a few hours or days; creating an equivalent recommender system from scratch would require several months for an academic operator.
Mr. DLib has been used by GESIS Sowiport and by the reference manager JabRef.
Mr. DLib is open source and its goal is to facilitate the application of, and research on, scientific recommender systems.
In this paper, we present the motivation for Mr. DLib, its architecture, and details of its effectiveness.
Mr. DLib has delivered 94m recommendations over a span of two years with an average click-through rate of 0.12%.
Android is an open software platform for mobile devices with a large market share in the smartphone sector.
The openness of the system as well as its wide adoption have led to an increasing amount of malware developed for this platform.
ANANAS is an expandable and modular framework for analyzing Android applications.
It takes care of common needs for dynamic malware analysis and provides an interface for the development of plugins.
Adaptability and expandability have been main design goals during the development process.
An abstraction layer for simple user interaction and phone event simulation is also part of the framework.
It allows an analyst to script the required user simulation or phone events on demand or adjust the simulation to his needs.
Six plugins have been developed for ANANAS.
They represent well known techniques for malware analysis, such as system call hooking and network traffic analysis.
The focus clearly lies on dynamic analysis, as five of the six plugins are dynamic analysis methods.
In this paper, we bridge the gap between hyperparameter optimization and ensemble learning by performing Bayesian optimization of an ensemble with regards to its hyperparameters.
Our method consists of building a fixed-size ensemble, optimizing the configuration of one classifier of the ensemble at each iteration of the hyperparameter optimization algorithm, taking into account its interaction with the other models when evaluating potential performance.
We also consider the case where the ensemble is to be reconstructed at the end of the hyperparameter optimization phase, through a greedy selection over the pool of models generated during the optimization.
We study the performance of our proposed method on three different hyperparameter spaces, showing that our approach is better than both the best single model and a greedy ensemble construction over the models produced by a standard Bayesian optimization.
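The post-hoc reconstruction step can be sketched as a Caruana-style greedy selection with replacement over a pool of validation predictions; the models, labels, and probabilities below are fabricated for illustration and are not from the paper.

```python
# Greedy ensemble selection: repeatedly add the pool model whose inclusion
# most improves the averaged-ensemble validation accuracy.

def ensemble_acc(members, pool_preds, y):
    votes = [sum(pool_preds[m][i] for m in members) / len(members)
             for i in range(len(y))]
    return sum((v >= 0.5) == bool(t) for v, t in zip(votes, y)) / len(y)

def greedy_select(pool_preds, y, size):
    members = []
    for _ in range(size):
        best = max(pool_preds,
                   key=lambda m: ensemble_acc(members + [m], pool_preds, y))
        members.append(best)   # selection with replacement
    return members

y = [1, 1, 1, 0, 0]
pool_preds = {                 # per-model probability of class 1
    "m1": [0.9, 0.9, 0.3, 0.1, 0.1],
    "m2": [0.3, 0.6, 0.9, 0.2, 0.6],
    "m3": [0.6, 0.2, 0.8, 0.4, 0.3],
}
print(greedy_select(pool_preds, y, size=2))  # ['m1', 'm2']
```

Note that the weaker model m2 is still selected second because its errors complement m1's, which is the effect the interaction-aware optimization in the paper exploits.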
Reasoning and inference require both processing and memory skills in humans.
Neural networks can handle tasks like image recognition (better than humans) but remain limited in memory aspects (attention mechanisms, memory size).
Recurrent Neural Networks (RNNs) and their modified version, LSTMs, can handle small memory contexts, but once the context grows beyond a threshold they become difficult to use.
The solution is to use a large external memory.
Still, this poses many challenges, such as how to train neural networks with discrete memory representations and how to capture long-term dependencies in sequential data.
The most prominent neural architectures for such tasks are Memory Networks (inference components combined with a long-term memory) and Neural Turing Machines (neural networks using external memory resources).
Additional techniques, such as attention mechanisms and end-to-end gradient descent on discrete memory representations, are also needed to support these solutions.
Preliminary results of the above neural architectures on simple algorithms (sorting, copying) and question answering (based on stories and dialogs) are comparable with the state of the art.
In this paper, I explain these architectures in general, the additional techniques used, and the results of their application.
In this project we propose a new approach for emotion recognition using web-based similarity (e.g. confidence, PMI and PMING).
We aim to extract basic emotions from short sentences with emotional content (e.g. news titles, tweets, captions), performing a web-based quantitative evaluation of semantic proximity between each word of the analyzed sentence and each emotion of a psychological model (e.g. Plutchik, Ekman, Lovheim).
The phases of the extraction include: text preprocessing (tokenization, stop words, filtering), search engine automated query, HTML parsing of results (i.e. scraping), estimation of semantic proximity, ranking of emotions according to proximity measures.
The main idea is that, since similar concepts co-occur in documents indexed by search engines, semantic similarity can be generalized; emotions can then be generalized in the same way, through the tags or terms that express them in a particular language, and ranked accordingly.
Training results are compared to human evaluation, and additional comparative tests are performed, both for the global ranking correlation (e.g. Kendall, Spearman, Pearson) and for the evaluation of the emotion linked to each single word.
Different from sentiment analysis, our approach works at a deeper level of abstraction, aiming at recognizing specific emotions and not only the positive/negative sentiment, in order to predict emotions as semantic data.
We compute the free energy of the planar monomer-dimer model.
Unlike the classical planar dimer model, an exact solution is not known in this case.
Even the computation of the low-density power series expansion requires heavy and nontrivial computations.
Despite the exponential computational complexity, we compute almost three times more terms than were previously known.
Such an expansion provides both lower and upper bounds for the free energy, and allows us to obtain more accurate numerical values than previously possible.
We expect that our methods can be applied to other similar problems.
Cyber-physical systems of today are generating large volumes of time-series data.
As manual inspection of such data is not tractable, the need for learning methods to help discover logical structure in the data has increased.
We propose a logic-based framework that allows domain-specific knowledge to be embedded into formulas in a parametric logical specification over time-series data.
The key idea is to then map a time series to a surface in the parameter space of the formula.
Given this mapping, we identify the Hausdorff distance between boundaries as a natural distance metric between two time-series data under the lens of the parametric specification.
This enables embedding non-trivial domain-specific knowledge into the distance metric and then using off-the-shelf machine learning tools to label the data.
After labeling the data, we demonstrate how to extract a logical specification for each label.
Finally, we showcase our technique on real world traffic data to learn classifiers/monitors for slow-downs and traffic jams.
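The distance at the heart of this framework can be illustrated with a minimal sketch; representing the two validity boundaries as finite sampled point sets, and the function names, are illustrative assumptions:

```python
import math

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets,
    here standing in for sampled validity boundaries in the parameter
    space of a parametric logical specification."""
    def directed(X, Y):
        # Largest distance from any point of X to its nearest point in Y.
        return max(min(math.dist(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))
```

Once each time series is mapped to its boundary point set, this metric can be fed directly to off-the-shelf distance-based learners (e.g. nearest-neighbor classifiers).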
Flow is a new computational framework, built to support a key need triggered by the rapid growth of autonomy in ground traffic: controllers for autonomous vehicles in the presence of complex nonlinear dynamics in traffic.
Leveraging recent advances in deep Reinforcement Learning (RL), Flow enables the use of RL methods such as policy gradient for traffic control and enables benchmarking the performance of classical (including hand-designed) controllers with learned policies (control laws).
Flow integrates the traffic microsimulator SUMO with the deep reinforcement learning library rllab and enables the easy design of traffic tasks, including different network configurations and vehicle dynamics.
We use Flow to develop reliable controllers for complex problems, such as controlling mixed-autonomy traffic (involving both autonomous and human-driven vehicles) in a ring road.
For this, we first show that state-of-the-art hand-designed controllers excel when in-distribution, but fail to generalize; then, we show that even simple neural network policies can solve the stabilization task across density settings and generalize to out-of-distribution settings.
Image caption generation systems are typically evaluated against reference outputs.
We show that it is possible to predict output quality without generating the captions, based on the probability assigned by the neural model to the reference captions.
Such pre-gen metrics are strongly correlated to standard evaluation metrics.
The BRDF of most real-world materials has two components: the surface BRDF, due to light reflecting at the surface of the material, and the subsurface BRDF, due to light entering the material and undergoing many scattering events inside it.
Each of these events modifies the light's path, power, and polarization state.
Computing polarized subsurface BRDF of a material requires simulating the light transport inside the material.
The transport of polarized light is modeled by the Vector Radiative Transfer Equation (VRTE), an integro-differential equation.
Computing a solution to that equation is expensive.
The Discrete Ordinate Method (DOM) is a common approach to solving the VRTE.
Such solvers are very time consuming for complex uses such as BRDF computation, where one must solve VRTE for surface radiance distribution due to light incident from every direction of the hemisphere above the surface.
In this paper, we present a GPU based DOM solution of the VRTE to expedite the subsurface BRDF computation.
As in other DOM based solutions, our solution is based on Fourier expansions of the phase function and the radiance function.
This allows us to independently solve the VRTE for each order of expansion.
We take advantage of the repetitions across these per-order solves and of the repetitions in each of the sub-steps of the solution process.
Our solver is implemented to run mainly on graphics hardware using the OpenCL library and runs up to seven times faster than its CPU equivalent, allowing the computation of subsurface BRDF in a matter of minutes.
We compute and present the subsurface BRDF lobes due to powders and paints of a few materials.
We also show the rendering of objects with the computed BRDF.
The solver is available for public use through the authors' web site.
Disjunctive Logic Programming (DLP) is a very expressive formalism: it allows for expressing every property of finite structures that is decidable in the complexity class SigmaP2 (= NP^NP).
Despite this high expressiveness, there are some simple properties, often arising in real-world applications, which cannot be encoded in a simple and natural manner.
Especially properties that require the use of arithmetic operators (like sum, times, or count) on a set or multiset of elements, which satisfy some conditions, cannot be naturally expressed in classic DLP.
To overcome this deficiency, we extend DLP by aggregate functions in a conservative way.
In particular, we avoid the introduction of constructs with disputed semantics, by requiring aggregates to be stratified.
We formally define the semantics of the extended language (called DLP^A), and illustrate how it can be profitably used for representing knowledge.
Furthermore, we analyze the computational complexity of DLP^A, showing that the addition of aggregates does not bring a higher cost in that respect.
Finally, we provide an implementation of DLP^A in DLV -- a state-of-the-art DLP system -- and report on experiments which confirm the usefulness of the proposed extension also for the efficiency of computation.
White-Fi refers to WiFi deployed in the TV white spaces.
Unlike its ISM band counterparts, White-Fi must obey requirements that protect TV reception.
As a result, optimization of citywide White-Fi networks faces the challenges of heterogeneous channel availability and link quality, over location.
The former is because, at any location, channels in use by TV networks are not available for use by White-Fi.
The latter is because the link quality achievable at a White-Fi receiver is determined by not only its link gain to its transmitter but also by its link gains to TV transmitters and its transmitter's link gains to TV receivers.
In this work, we model the medium access control (MAC) throughput of a White-Fi network.
We propose heuristic algorithms to optimize the throughput, given the described heterogeneity.
The algorithms assign power, access probability, and channels to nodes in the network, under the constraint that reception at TV receivers is not compromised.
We evaluate the efficacy of our approach over example city-wide White-Fi networks deployed over Denver and Columbus (respectively, low and high channel availability) in the USA, and compare with assignments that are cognizant of the heterogeneity to a lesser degree, for example, assignments akin to FCC regulations.
There has been an increasing interest in the millimeter wave (mmW) frequency regime in the design of next-generation wireless systems.
The focus of this work is on understanding mmW channel properties that have an important bearing on the feasibility of mmW systems in practice and have a significant impact on physical (PHY) layer design.
In this direction, simultaneous channel sounding measurements at 2.9, 29 and 61 GHz are performed at a number of transmit-receive location pairs in indoor office, shopping mall and outdoor environments.
Based on these measurements, this paper first studies large-scale properties such as path loss and delay spread across different carrier frequencies in these scenarios.
Towards the goal of understanding the feasibility of outdoor-to-indoor coverage, material measurements corresponding to mmW reflection and penetration are studied and significant notches in signal reception spread over a few GHz are reported.
Finally, implications of these measurements on system design are discussed and multiple solutions are proposed to overcome these impairments.
There has been significant interest in parallel graph processing recently due to the need to quickly analyze the large graphs available today.
Many graph codes have been designed for distributed memory or external memory.
However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server.
Nevertheless, most experimental work in the literature reports results on much smaller graphs, and the work that does use the Hyperlink graph does so in distributed or external memory.
Therefore it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory.
This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes.
We give implementations of theoretically-efficient parallel algorithms for 13 important graph problems.
We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly.
We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs.
For many of the problems that we consider, this is the first time they have been solved on graphs at this scale.
We have created a problem-based benchmark suite containing these problems that will be made publicly available.
Difference sets and their generalisations to difference families arise from the study of designs and many other applications.
Here we give a brief survey of some of these applications, noting in particular the diverse definitions of difference families and the variations in priorities in constructions.
We propose a definition of disjoint difference families that encompasses these variations and allows a comparison of the similarities and disparities.
We then focus on two constructions of disjoint difference families arising from frequency hopping sequences and show that they are in fact the same.
We conclude with a discussion of the notion of equivalence for frequency hopping sequences and for disjoint difference families.
Literature search is critical for any scientific research.
Different from Web or general domain search, a large portion of queries in scientific literature search are entity-set queries, that is, multiple entities of possibly different types.
Entity-set queries reflect users' need to find documents that contain multiple entities and reveal inter-entity relationships, and thus pose non-trivial challenges to existing search algorithms that model each entity separately.
However, entity-set queries are usually sparse (i.e., not so repetitive), which makes ineffective many supervised ranking models that rely heavily on associated click history.
To address these challenges, we introduce SetRank, an unsupervised ranking framework that models inter-entity relationships and captures entity type information.
Furthermore, we develop a novel unsupervised model selection algorithm, based on the technique of weighted rank aggregation, to automatically choose the parameter settings in SetRank without resorting to a labeled validation set.
We evaluate our proposed unsupervised approach using datasets from TREC Genomics Tracks and Semantic Scholar's query log.
The experiments demonstrate that SetRank significantly outperforms the baseline unsupervised models, especially on entity-set queries, and our model selection algorithm effectively chooses suitable parameter settings.
Social network platforms can use the data produced by their users to serve them better.
One of the services these platforms provide is recommendation service.
Recommendation systems can predict the future preferences of users using their past preferences.
In the recommendation systems literature there are various techniques, such as neighborhood based methods, machine-learning based methods and matrix-factorization based methods.
In this work, a set of well-known methods from the natural language processing domain, namely Word2Vec, is applied to the recommendation systems domain.
Unlike previous works that use Word2Vec for recommendation, this work uses non-textual features, namely check-ins, and recommends venues to visit/check in to the target users.
For the experiments, a Foursquare check-in dataset is used.
The results show that use of continuous vector space representations of items modeled by techniques of Word2Vec is promising for making recommendations.
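A simplified stand-in for this pipeline can be sketched as follows, with raw window co-occurrence counts replacing the dense vectors a trained Word2Vec model would produce; the venue names, window size, and function names are illustrative assumptions:

```python
from collections import defaultdict
import math

def venue_vectors(checkin_seqs, window=2):
    """Treat each user's check-in sequence as a 'sentence' and build a
    co-occurrence count vector per venue -- a crude stand-in for the
    embeddings a Word2Vec model would learn from the same input."""
    co = defaultdict(lambda: defaultdict(int))
    for seq in checkin_seqs:
        for i, v in enumerate(seq):
            for j in range(max(0, i - window), min(len(seq), i + window + 1)):
                if j != i:
                    co[v][seq[j]] += 1
    return co

def recommend(co, venue, k=1):
    """Rank other venues by cosine similarity to `venue`'s vector."""
    def cos(a, b):
        dot = sum(a.get(x, 0) * b.get(x, 0) for x in set(a) | set(b))
        na = math.sqrt(sum(c * c for c in a.values()))
        nb = math.sqrt(sum(c * c for c in b.values()))
        return dot / (na * nb) if na and nb else 0.0
    scores = [(cos(co[venue], co[u]), u) for u in co if u != venue]
    return [u for _, u in sorted(scores, reverse=True)[:k]]
```

In the paper's setting the count vectors would be replaced by learned continuous vectors, but the recommendation step (nearest venues in vector space) is the same idea.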
This paper presents preliminary results of our work with a major financial company, where we use methods of plan recognition to investigate the interactions of a customer with the company's online interface.
We present the first steps of integrating a plan recognition algorithm into a real-world application for detecting and analyzing customer interactions.
It uses a novel approach for plan recognition from bare-bone UI data, which reasons about the plan library at the lowest recognition level in order to define the relevancy of actions in our domain, and then uses it to perform plan recognition.
We present preliminary results of inference on three different use cases modeled by domain experts from the company, and show that this approach decreases the overload of information required from an analyst to evaluate a customer's session - whether the session is malicious or benign, whether the intended tasks were completed, and if not, what actions are expected next.
The first cluster-based public computing system for Monte Carlo simulation in Indonesia is introduced.
The system has been developed to enable the public to perform Monte Carlo simulations on a parallel computer through an integrated and user-friendly dynamic web interface.
The beta version, so called publicMC@BATAN, has been released and implemented for internal users at the National Nuclear Energy Agency (BATAN).
In this paper the concept and architecture of publicMC@BATAN are presented.
Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images.
We tackle this problem by releasing the HAM10000 ("Human Against Machine with 10000 training images") dataset.
We collected dermatoscopic images from different populations acquired and stored by different modalities.
Given this diversity we had to apply different acquisition and cleaning methods and developed semi-automatic workflows utilizing specifically trained neural networks.
The final dataset consists of 10015 dermatoscopic images which are released as a training set for academic machine learning purposes and are publicly available through the ISIC archive.
This benchmark dataset can be used for machine learning and for comparisons with human experts.
Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions.
More than 50% of lesions have been confirmed by pathology, while the ground truth for the rest of the cases was either follow-up, expert consensus, or confirmation by in-vivo confocal microscopy.
Active learning algorithms propose which unlabeled objects should be queried for their labels to improve a predictive model the most.
We study active learners that minimize generalization bounds and uncover relationships between these bounds that lead to an improved approach to active learning.
In particular we show the relation between the bound of the state-of-the-art Maximum Mean Discrepancy (MMD) active learner, the bound of the Discrepancy, and a new and looser bound that we refer to as the Nuclear Discrepancy bound.
We motivate this bound by a probabilistic argument: we show it considers situations which are more likely to occur.
Our experiments indicate that active learning using the tightest Discrepancy bound performs the worst in terms of the squared loss.
Overall, our proposed Nuclear Discrepancy generalization bound, the loosest of the three, performs the best.
We confirm our probabilistic argument empirically: the other bounds focus on more pessimistic scenarios that are rarer in practice.
We conclude that tightness of bounds is not always of main importance and that active learning methods should concentrate on realistic scenarios in order to improve performance.
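The empirical MMD quantity underlying the MMD active learner's bound can be sketched in a few lines; the biased estimator, the RBF kernel, and the choice of `gamma` are illustrative assumptions rather than the paper's exact setup:

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Biased empirical estimate of the squared Maximum Mean Discrepancy
    between samples X and Y (rows are points) under the RBF kernel
    k(a, b) = exp(-gamma * ||a - b||^2)."""
    def k(A, B):
        # Pairwise squared distances via broadcasting, then the kernel.
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```

An MMD-based active learner queries points that shrink this discrepancy between the labeled subsample and the full pool.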
Split and rephrase is the task of breaking down a sentence into shorter ones that together convey the same meaning.
We extract a rich new dataset for this task by mining Wikipedia's edit history: WikiSplit contains one million naturally occurring sentence rewrites, providing sixty times more distinct split examples and a ninety times larger vocabulary than the WebSplit corpus introduced by Narayan et al. (2017) as a benchmark for this task.
Incorporating WikiSplit as training data produces a model with qualitatively better predictions that score 32 BLEU points above the prior best result on the WebSplit benchmark.
A large part of modern day communications are carried out through the medium of E-mails, especially corporate communications.
More and more people are using E-mail for personal purposes too.
Companies also send notifications to their customers via E-mail.
In fact, in the multinational business scenario, E-mail is the most convenient and sought-after method of communication.
Important features of E-mail such as its speed, reliability, efficient storage options and a large number of added facilities make it highly popular among people from all sectors of business and society.
But being largely popular has its negative aspects too.
E-mails are the preferred medium for a large number of attacks over the internet.
Some of the most popular attacks over the internet include spams, and phishing mails.
Both spammers and phishers utilize E-mail services quite efficiently in spite of a large number of detection and prevention techniques already in place.
Very few methods are actually good at detecting or preventing spam and phishing mails, and even those suffer from high false-positive rates.
These techniques are implemented at the server and, in addition to producing a higher number of false positives, they add to the processing load on the server.
This paper outlines a novel approach to detect not only spam, but also scams, phishing and advertisement related mails.
In this method, we overcome the limitations of server-side detection techniques by utilizing some intelligence on the part of users.
Keyword parsing, token separation and knowledge bases are used in the background to detect almost all E-mail attacks.
The proposed methodology, if implemented, can help protect E-mail users from almost all kinds of unwanted mails with enhanced efficiency and a reduced number of false positives, while not increasing the load on E-mail servers.
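The keyword-parsing and knowledge-base idea could look roughly like the following; the categories, keyword sets, tokenization, and scoring rule are hypothetical placeholders for illustration, not the paper's actual knowledge bases:

```python
# Hypothetical keyword knowledge bases -- illustrative only.
KNOWLEDGE_BASES = {
    "spam": {"winner", "lottery", "free", "prize"},
    "phishing": {"verify", "account", "password", "suspended"},
    "advertisement": {"sale", "discount", "offer"},
}

def classify_mail(text):
    """Tokenize the mail body and score it against each keyword
    knowledge base; return the best-matching category, or 'ham'
    if no category keyword appears at all."""
    tokens = set(text.lower().split())
    scores = {cat: len(tokens & kb) for cat, kb in KNOWLEDGE_BASES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "ham"
```

Running such checks on the client side, as the paper proposes, keeps the scoring load off the server.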
Today's cloud networks are shared among many tenants.
Bandwidth guarantees and work conservation are two key properties to ensure predictable performance for tenant applications and high network utilization for providers.
Despite significant efforts, very little prior work achieves both properties simultaneously, even though some claim to do so.
In this paper, we present QShare, an in-network based solution to achieve bandwidth guarantees and work conservation simultaneously.
QShare leverages weighted fair queuing on commodity switches to slice network bandwidth for tenants, and solves the challenge of queue scarcity through balanced tenant placement and dynamic tenant-queue binding.
QShare is readily implementable with existing switching chips.
We have implemented a QShare prototype and evaluated it via both testbed experiments and simulations.
Our results show that QShare ensures bandwidth guarantees while driving network utilization to over 91% even under unpredictable traffic demands.
The relationship between the complexity classes P and NP is an unsolved question in the field of theoretical computer science.
In this paper, we look at the link between the P - NP question and the "Deterministic" versus "Non Deterministic" nature of a problem, and more specifically at the temporal nature of the complexity within the NP class of problems.
Recall that the NP class is called the class of "Non-Deterministic Polynomial" languages.
Using the meta-argument that results in Mathematics should be "time independent" because they are reproducible, the paper shows that the P != NP assertion is impossible to prove in the a-temporal framework of Mathematics.
In a previous version of the report, we used a similar argument based on randomness to show that the P = NP assertion was also impossible to prove, but that part of the paper was shown to be incorrect, so this version omits it.
In fact, this paper highlights the time dependence of the complexity for any NP problem, linked to some pseudo-randomness in its heart.
Frame duplication consists of duplicating a sequence of consecutive frames and inserting or substituting it to conceal or imitate a specific event/content in the same source video.
To automatically detect the duplicated frames in a manipulated video, we propose a coarse-to-fine deep convolutional neural network framework to detect and localize the frame duplications.
We first run an I3D network to obtain the most likely candidate duplicated frame sequences and the corresponding selected frame sequences, and then run a Siamese network with a ResNet backbone to identify each pair of a duplicated frame and the corresponding selected frame.
We also propose a heuristic strategy to formulate the video-level score.
We then apply our inconsistency detector fine-tuned on the I3D network to distinguish duplicated frames from selected frames.
With the experimental evaluation conducted on two video datasets, we demonstrate that our proposed method outperforms the current state-of-the-art methods.
Constraint Programming (CP) is a powerful declarative programming paradigm combining inference and search in order to find solutions to various type of constraint systems.
Dealing with highly disjunctive constraint systems is notoriously difficult in CP.
Apart from trying to solve each disjunct independently of the others, there has been little hope of, and little effort toward, constructing intermediate results that combine the knowledge originating from several disjuncts.
In this paper, we propose If Then Else (ITE), a lightweight approach for implementing stratified constructive disjunction and negation on top of an existing CP solver, namely SICStus Prolog clp(FD).
Although constructive disjunction has been known for more than three decades, it has no straightforward implementation in most CP solvers.
ITE is a freely available library proposing stratified and constructive reasoning for various operators, including disjunction and negation, implication and conditional.
Our preliminary experimental results show that ITE is competitive with existing approaches that handle disjunctive constraint systems.
The analysis of algorithms mostly relies on counting classic elementary operations like additions, multiplications, comparisons, swaps etc.
This approach is often sufficient to quantify an algorithm's efficiency.
In some cases, however, features of modern processor architectures like pipelined execution and memory hierarchies have significant impact on running time and need to be taken into account to get a reliable picture.
One such example is Quicksort: it has been demonstrated experimentally that, under certain conditions on the hardware, the classically optimal balanced choice of the pivot as the median of a sample becomes harmful.
The reason lies in mispredicted branches, whose rollback costs become dominant.
In this paper, we give the first precise analytical investigation of the influence of pipelining and the resulting branch mispredictions on the efficiency of (classic) Quicksort and Yaroslavskiy's dual-pivot Quicksort as implemented in Oracle's Java 7 library.
For the latter, it is still not fully understood why experiments show it to be 10% faster than a highly engineered implementation of a classic single-pivot version.
For different branch prediction strategies, we give precise asymptotics for the expected number of branch misses caused by the aforementioned Quicksort variants when their pivots are chosen from a sample of the input.
We conclude that the difference in branch misses is too small to explain the superiority of the dual-pivot algorithm.
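For reference, a minimal Python sketch of dual-pivot quicksort in the style of Yaroslavskiy's algorithm, the variant analyzed above (simplified: pivots come from the endpoints rather than from a sample, and no small-array cutoff is used):

```python
def dual_pivot_quicksort(a, lo=0, hi=None):
    """In-place dual-pivot quicksort: two pivots p <= q split the
    array into three parts (< p, between p and q, > q), which are
    then sorted recursively."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    p, q = a[lo], a[hi]
    lt, gt, i = lo + 1, hi - 1, lo + 1
    while i <= gt:
        if a[i] < p:                      # goes to the left part
            a[i], a[lt] = a[lt], a[i]
            lt += 1
        elif a[i] > q:                    # goes to the right part
            while a[gt] > q and i < gt:
                gt -= 1
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
            if a[i] < p:
                a[i], a[lt] = a[lt], a[i]
                lt += 1
        i += 1
    lt -= 1
    gt += 1
    a[lo], a[lt] = a[lt], a[lo]           # place pivot p
    a[hi], a[gt] = a[gt], a[hi]           # place pivot q
    dual_pivot_quicksort(a, lo, lt - 1)
    dual_pivot_quicksort(a, lt + 1, gt - 1)
    dual_pivot_quicksort(a, gt + 1, hi)
    return a
```

The branches in the partitioning loop (`a[i] < p`, `a[i] > q`) are exactly the comparisons whose misprediction costs the analysis counts.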
Drones are driving numerous and evolving use cases, and creating transformative socio-economic benefits.
Drone operation needs wireless connectivity for communication between drones and ground control systems, among drones, and between drones and air traffic management systems.
Mobile networks are well positioned to identify, track, and control the growing fleet of drones.
The wide-area, quality, and secure connectivity provided by mobile networks can enhance the efficiency and effectiveness of drone operations beyond visual line-of-sight range.
In this article, we elaborate how the drone ecosystem can benefit from mobile technologies, summarize key capabilities required by drone applications, and analyze the service requirements on mobile networks.
We present field trial results collected in LTE-Advanced networks to gain insights into the capabilities of the current 4G+ networks for connected drones and share our vision on how 5G networks can further support diversified drone applications.
Recent works on single-image super-resolution have concentrated on improving performance by enhancing spatial encoding between convolutional layers.
In this paper, we focus on modeling the correlations between channels of convolutional features.
We present an effective deep residual network based on squeeze-and-excitation blocks (SEBlock) to reconstruct high-resolution (HR) image from low-resolution (LR) image.
SEBlock is used to adaptively recalibrate channel-wise feature mappings.
Further, short connections between each SEBlock are used to remedy information loss.
Extensive experiments show that our model achieves state-of-the-art performance and recovers finer texture details.
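The channel-wise recalibration an SEBlock performs can be sketched as a plain NumPy forward pass; the weight shapes, bottleneck layout, and function name are illustrative assumptions, not the paper's network:

```python
import numpy as np

def se_block(feat, w1, b1, w2, b2):
    """Squeeze-and-excitation recalibration of a feature map `feat`
    with shape (C, H, W): global-average-pool each channel ('squeeze'),
    pass through a small two-layer bottleneck with ReLU then sigmoid
    ('excitation'), and rescale the channels by the resulting gates."""
    z = feat.mean(axis=(1, 2))                    # squeeze: (C,)
    s = np.maximum(w1 @ z + b1, 0.0)              # bottleneck + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ s + b2)))      # sigmoid gates: (C,)
    return feat * s[:, None, None]                # channel-wise rescale
```

With `w1` of shape (C/r, C) and `w2` of shape (C, C/r), the reduction ratio r keeps the block cheap relative to the convolutions it modulates.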
Indexing the Web is becoming a laborious task for search engines as the Web exponentially grows in size and distribution.
Presently, the most effective known approach to overcome this problem is the use of focused crawlers.
A focused crawler applies a proper algorithm in order to detect the pages on the Web that relate to its topic of interest.
For this purpose, we propose a custom method that uses specific HTML elements of a page to predict the topical focus of all the pages reachable through unvisited links within the current page.
These recognized on-topic pages have to be sorted later based on their relevance to the main topic of the crawler for further actual downloads.
In the Treasure-Crawler, we use a hierarchical structure called the T-Graph which is an exemplary guide to assign appropriate priority score to each unvisited link.
These URLs will later be downloaded based on this priority.
This paper outlines the architectural design and embodies the implementation, test results and performance evaluation of the Treasure-Crawler system.
The Treasure-Crawler is evaluated in terms of information retrieval criteria such as recall and precision, both with values close to 0.5.
This outcome asserts the significance of the proposed approach.
Data stored in a data warehouse are inherently multidimensional, but most data-pruning techniques (such as iceberg and top-k queries) are unidimensional.
However, analysts need to issue multidimensional queries.
For example, an analyst may need to select not just the most profitable stores or--separately--the most profitable products, but simultaneous sets of stores and products fulfilling some profitability constraints.
To fill this need, we propose a new operator, the diamond dice.
Because of the interaction between dimensions, the computation of diamonds is challenging.
We present the first diamond-dicing experiments on large data sets.
Experiments show that we can compute diamond cubes over fact tables containing 100 million facts in less than 35 minutes using a standard PC.
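The interaction between dimensions that makes diamonds challenging can be seen in a minimal two-dimensional sketch: pruning a store can push a product below its threshold and vice versa, so the computation iterates to a fixed point. The fixpoint formulation, thresholds, and data layout below are a simplification for illustration:

```python
def diamond(facts, row_k, col_k):
    """Two-dimensional 'diamond' of a fact table: repeatedly discard
    rows whose remaining measure-sum falls below row_k and columns
    below col_k, until a fixed point is reached.
    `facts` maps (row, col) -> measure."""
    facts = dict(facts)
    changed = True
    while changed:
        rows, cols = {}, {}
        for (r, c), m in facts.items():
            rows[r] = rows.get(r, 0) + m
            cols[c] = cols.get(c, 0) + m
        keep = {(r, c): m for (r, c), m in facts.items()
                if rows[r] >= row_k and cols[c] >= col_k}
        changed = len(keep) != len(facts)
        facts = keep
    return facts
```

For example, with rows as stores and columns as products, the result is a simultaneous set of stores and products all meeting their profitability constraints.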
This paper illustrates the results of applying semantic technology to ease the use and reuse of digital contents exposed as Linked Data on the Web.
It focuses on the specific issue of explorative research for resource selection: a context-dependent semantic similarity assessment is proposed to compare datasets annotated with terminologies exposed as Linked Data (e.g. habitats, species).
Semantic similarity is shown to be a building-block technology for sifting Linked Data resources.
From the application of semantic similarity, we derive a set of recommendations highlighting open issues in scaling similarity assessment up to the Web of Data.
This work proposes a novel learning objective to train a deep neural network to perform end-to-end image pixel clustering.
We applied the approach to instance segmentation, which is at the intersection of image semantic segmentation and object detection.
We utilize the most fundamental property of instance labeling -- the pairwise relationship between pixels -- as the supervision to formulate the learning objective, then apply it to train a fully convolutional network (FCN) for learning to perform pixel-wise clustering.
The resulting clusters can be used as the instance labeling directly.
To support labeling of an unlimited number of instances, we further incorporate ideas from graph coloring theory into the proposed learning objective.
The evaluation on the Cityscapes dataset demonstrates strong performance and thus provides a proof of concept.
Moreover, our approach won the second place in the lane detection competition of 2017 CVPR Autonomous Driving Challenge, and was the top performer without using external data.
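The pairwise supervision idea above can be illustrated with a small sketch (an assumption-laden simplification, not the paper's FCN training code): if each pixel carries a soft cluster-assignment vector, the probability that two pixels fall in the same cluster is the inner product of their assignment vectors, which can be penalized with binary cross-entropy against the ground-truth pairwise labels.

```python
import numpy as np

def pairwise_loss(probs, same):
    """probs: (N, K) soft cluster assignments per pixel (rows sum to 1).
    same: (N, N) binary matrix, 1 if two pixels share an instance.
    Penalizes the pairwise same-cluster probability with BCE."""
    p_same = probs @ probs.T              # (N, N) same-cluster probabilities
    eps = 1e-12                           # numerical guard for log
    bce = -(same * np.log(p_same + eps)
            + (1 - same) * np.log(1 - p_same + eps))
    return bce.mean()
```

A perfect clustering (up to a permutation of cluster labels) drives the loss to zero, which is why the resulting clusters can serve as the instance labeling directly.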
The image biomarker standardisation initiative (IBSI) is an independent international collaboration which works towards standardising the extraction of image biomarkers from acquired imaging for the purpose of high-throughput quantitative image analysis (radiomics).
Lack of reproducibility and validation of high-throughput quantitative image analysis studies is considered to be a major challenge for the field.
Part of this challenge lies in the scantiness of consensus-based guidelines and definitions for the process of translating acquired imaging into high-throughput image biomarkers.
The IBSI therefore seeks to provide image biomarker nomenclature and definitions, benchmark data sets, and benchmark values to verify image processing and image biomarker calculations, as well as reporting guidelines, for high-throughput image analysis.
A public decision-making problem consists of a set of issues, each with multiple possible alternatives, and a set of competing agents, each with a preferred alternative for each issue.
We study adaptations of market economies to this setting, focusing on binary issues.
Issues have prices, and each agent is endowed with artificial currency that she can use to purchase probability for her preferred alternatives (we allow randomized outcomes).
We first show that when each issue has a single price that is common to all agents, market equilibria can be arbitrarily bad.
This negative result motivates a different approach.
We present a novel technique called "pairwise issue expansion", which transforms any public decision-making instance into an equivalent Fisher market, the simplest type of private goods market.
This is done by expanding each issue into many goods: one for each pair of agents who disagree on that issue.
We show that the equilibrium prices in the constructed Fisher market yield a "pairwise pricing equilibrium" in the original public decision-making problem which maximizes Nash welfare.
More broadly, pairwise issue expansion uncovers a powerful connection between the public decision-making and private goods settings; this immediately yields several interesting results about public decisions markets, and furthers the hope that we will be able to find a simple iterative voting protocol that leads to near-optimum decisions.
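The expansion step itself is mechanical and can be sketched directly from the description above: one good per (issue, unordered pair of disagreeing agents). This is only the construction of the good set, not the Fisher market equilibrium computation.

```python
from itertools import combinations

def pairwise_issue_expansion(prefs):
    """prefs: dict mapping issue -> dict mapping agent -> preferred
    alternative (0 or 1 for binary issues).  Returns the goods of the
    constructed Fisher market: one good for each pair of agents that
    disagree on an issue."""
    goods = []
    for issue, votes in prefs.items():
        for a, b in combinations(sorted(votes), 2):
            if votes[a] != votes[b]:
                goods.append((issue, a, b))
    return goods
```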
Growing interest in automatic speaker verification (ASV) systems has led to significant quality improvements in spoofing attacks on them.
Many research works confirm that, despite low equal error rates (EER), ASV systems are still vulnerable to spoofing attacks.
In this work we overview different acoustic feature spaces and classifiers to determine reliable and robust countermeasures against spoofing attacks.
We compared several spoofing detection systems, presented so far, on the development and evaluation datasets of the Automatic Speaker Verification Spoofing and Countermeasures (ASVspoof) Challenge 2015. Experimental results presented in this paper demonstrate that combining magnitude and phase information contributes substantially to the efficiency of spoofing detection systems.
Wavelet-based features also show impressive results in terms of equal error rate.
In our overview we compare spoofing detection performance for systems based on different classifiers.
Comparison results demonstrate that the linear SVM classifier outperforms the conventional GMM approach.
However, many researchers, inspired by the great success of deep neural network (DNN) approaches in automatic speech recognition, have applied DNNs to the spoofing detection task and obtained quite low EERs for both known and unknown types of spoofing attacks.
This paper reconstructs the Freebase data dumps to understand the underlying ontology behind Google's semantic search feature.
The Freebase knowledge base was a major Semantic Web and linked data technology that was acquired by Google in 2010 to support the Google Knowledge Graph, the backend for Google search results that include structured answers to queries instead of a series of links to external resources.
After its shutdown in 2016, Freebase is contained in a data dump of 1.9 billion Resource Description Format (RDF) triples.
A recomposition of the Freebase ontology will be analyzed in relation to concepts and insights from the literature on classification by Bowker and Star.
This paper will explore how the Freebase ontology is shaped by many of the forces that also shape classification systems through a deep dive into the ontology and a small correlational study.
These findings will provide a glimpse into the proprietary blackbox Knowledge Graph and what is meant by Google's mission to "organize the world's information and make it universally accessible and useful".
The distribution of impact factors has been modeled in the recent informetric literature using the two-exponent law proposed by Mansilla et al. (2007).
This paper shows that two distributions widely-used in economics, namely the Dagum and Singh-Maddala models, possess several advantages over the two-exponent model.
Compared to the latter, the former give an equally good or slightly better fit to data on impact factors in eight important scientific fields.
In contrast to the two-exponent model, both proposed distributions have closed-form probability density functions and cumulative distribution functions, which facilitates fitting these distributions to data and deriving their statistical properties.
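The closed-form CDFs mentioned above are standard: for scale b and shape parameters a, p (Dagum) or a, q (Singh-Maddala), they can be evaluated directly, which is what makes fitting straightforward. A minimal sketch:

```python
def dagum_cdf(x, a, b, p):
    """Dagum CDF: F(x) = (1 + (x/b)^(-a))^(-p), for x > 0."""
    return (1.0 + (x / b) ** (-a)) ** (-p)

def singh_maddala_cdf(x, a, b, q):
    """Singh-Maddala (Burr XII) CDF: F(x) = 1 - (1 + (x/b)^a)^(-q), x > 0."""
    return 1.0 - (1.0 + (x / b) ** a) ** (-q)
```

At x = b with unit shape parameter p = q = 1, both CDFs equal 1/2, a convenient sanity check when fitting.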
We analyze opportunistic relaying based on HARQ transmission over the block-fading channel in the absence of channel state information (CSI) at the transmitter nodes.
We assume that both the source and the relay are allowed to vary their transmission rate between the HARQ transmission rounds.
We solve the problem of throughput maximization with respect to the transmission rates using double-recursive Dynamic Programming.
Simplifications are also proposed to diminish the complexity of the optimization.
The numerical results confirm that variable-rate HARQ can increase the throughput significantly compared to its fixed-rate counterpart.
The propagation of unreliable information is on the rise in many places around the world.
This expansion is facilitated by the rapid spread of information and anonymity granted by the Internet.
The spread of unreliable information is a well-studied issue and is associated with negative social impacts.
In a previous work, we have identified significant differences in the structure of news articles from reliable and unreliable sources in the US media.
Our goal in this work was to explore such differences in the Brazilian media.
We found significant features in two data sets: one with Brazilian news in Portuguese and another one with US news in English.
Our results show that features related to the writing style were prominent in both data sets and, despite the language difference, some features have a universal behavior, being significant to both US and Brazilian news articles.
Finally, we combined both data sets and used the universal features to build a machine learning classifier to predict the source type of a news article as reliable or unreliable.
We introduce autoregressive implicit quantile networks (AIQN), a fundamentally different approach to generative modeling than those commonly used, that implicitly captures the distribution using quantile regression.
AIQN is able to achieve superior perceptual quality and improvements in evaluation metrics, without incurring a loss of sample diversity.
The method can be applied to many existing models and architectures.
In this work we extend the PixelCNN model with AIQN and demonstrate results on CIFAR-10 and ImageNet using Inception score, FID, non-cherry-picked samples, and inpainting results.
We consistently observe that AIQN yields a highly stable algorithm that improves perceptual quality while maintaining a highly diverse distribution.
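The quantile-regression core of the approach is the pinball loss; the sketch below shows that loss in isolation (the autoregressive network around it is not shown, and the vectorized form here is an illustration rather than the paper's code).

```python
import numpy as np

def quantile_loss(pred, target, tau):
    """Pinball (quantile regression) loss: underestimates are weighted
    by tau and overestimates by (1 - tau), so the minimizer is the
    tau-quantile of the target distribution."""
    u = target - pred
    return np.mean(np.maximum(tau * u, (tau - 1.0) * u))
```

With tau = 0.5 this reduces to half the absolute error; high tau penalizes under-prediction more heavily, which is how sampling random tau values lets a network trace out the whole distribution.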
This is the preprint version of our paper in Advances in Engineering Software.
With characteristics such as large scale, diverse predictability, and timeliness, city traffic data falls within the definition of Big Data.
A Virtual Reality GIS-based traffic analysis and visualization system is proposed as a promising and inspiring approach to managing and exploiting traffic big data.
In addition to the basic GIS interaction functions, the proposed system also includes some intelligent visual analysis and forecasting functions.
The passenger flow forecasting algorithm is introduced in detail.
Today dropshipping is rapidly conquering the Internet and has become one of the basic marketing tools in e-commerce.
This article reveals the marketing features, mechanisms, and value of dropshipping under the conditions of the networked economy of the XXI century.
The author carries out a comparative analysis of the institutional development of dropshipping in the USA, China, and Russia.
We present two new large-scale datasets aimed at evaluating systems designed to comprehend a natural language query and extract its answer from a large corpus of text.
The Quasar-S dataset consists of 37000 cloze-style (fill-in-the-gap) queries constructed from definitions of software entity tags on the popular website Stack Overflow.
The posts and comments on the website serve as the background corpus for answering the cloze questions.
The Quasar-T dataset consists of 43000 open-domain trivia questions and their answers obtained from various internet sources.
ClueWeb09 serves as the background corpus for extracting these answers.
We pose these datasets as a challenge for two related subtasks of factoid Question Answering: (1) searching for relevant pieces of text that include the correct answer to a query, and (2) reading the retrieved text to answer the query.
We also describe a retrieval system for extracting relevant sentences and documents from the corpus given a query, and include these in the release for researchers wishing to only focus on (2).
We evaluate several baselines on both datasets, ranging from simple heuristics to powerful neural models, and show that these lag behind human performance by 16.4% and 32.1% for Quasar-S and -T respectively.
The datasets are available at https://github.com/bdhingra/quasar .
The objectives of cyber attacks are becoming more sophisticated, and attackers conceal their identities by disguising their characteristics as those of others.
Cyber Threat Intelligence (CTI) analysis is gaining attention to generate meaningful knowledge for understanding the intention of an attacker and, eventually, to make predictions.
Developing the analysis technique requires a high volume and fine quality dataset.
However, the organizations which have useful data do not release it to the research community because they do not want to disclose threats toward them and the data assets they have.
Due to data inaccessibility, academic research tends to be biased towards techniques for the steps of the CTI process other than the analysis and production step.
In this paper, we propose the automated dataset generation system named CTIMiner.
The system collects threat data from publicly available security reports and malware repositories.
The data is stored in a structured format.
We publicly release the source code and the dataset, which includes about 628,000 records from 423 security reports published from 2008 to 2017.
We also present statistical features of the dataset and the techniques that can be developed using it.
Moreover, we demonstrate one application example of the dataset that analyzes the correlation and characteristics of incidents.
We believe our dataset promotes collaborative research of the threat information analysis to generate CTI.
Large-scale quantum computation is likely to require massive quantum error correction (QEC).
QEC codes and circuits are described via the stabilizer formalism, which represents stabilizer states by keeping track of the operators that preserve them.
Such states are obtained by stabilizer circuits (consisting of CNOT, Hadamard and Phase only) and can be represented compactly on conventional computers using Omega(n^2) bits, where n is the number of qubits.
Although techniques for the efficient simulation of stabilizer circuits have been studied extensively, techniques for efficient manipulation of stabilizer states are not currently available.
To this end, we design new algorithms for: (i) obtaining canonical generators for stabilizer states, (ii) obtaining canonical stabilizer circuits, and (iii) computing the inner product between stabilizer states.
Our inner-product algorithm takes O(n^3) time in general, but observes quadratic behavior for many practical instances relevant to QECC (e.g., GHZ states).
We prove that each n-qubit stabilizer state has exactly 4(2^n - 1) nearest-neighbor stabilizer states, and verify this claim experimentally using our algorithms.
We design techniques for representing arbitrary quantum states using stabilizer frames and generalize our algorithms to compute the inner product between two such frames.
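A core subroutine behind canonical stabilizer generators is Gaussian elimination over GF(2) on the binary representation of the generators. The sketch below handles only the binary row reduction, ignoring the phase bits and the X/Z block structure that a full stabilizer canonicalization must track; it is an illustration of the idea, not the authors' algorithm.

```python
import numpy as np

def gf2_row_echelon(M):
    """Reduced row echelon form of a binary matrix over GF(2).
    Rows are combined with XOR, the GF(2) addition."""
    M = M.copy() % 2
    r = 0
    for c in range(M.shape[1]):
        pivot = next((i for i in range(r, M.shape[0]) if M[i, c]), None)
        if pivot is None:
            continue
        M[[r, pivot]] = M[[pivot, r]]     # move pivot row into place
        for i in range(M.shape[0]):
            if i != r and M[i, c]:
                M[i] ^= M[r]              # eliminate column c elsewhere
        r += 1
        if r == M.shape[0]:
            break
    return M
```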
The Internet of Things (IoT) is an evolving face of technology that provides state-of-the-art services using ubiquitously connected smart objects.
These smart objects are capable of sensing, processing, collaborating, communicating events, and providing services.
The IoT is a collection of heterogeneous technologies such as sensors, RFID, communication, and nanotechnology.
These technologies enable smart objects to identify other objects, collect information about their status, and communicate the collected information so that desired actions can be taken.
Widespread adoption of IoT-based devices and services has raised ethical challenges for their users.
In this paper we highlight ethical challenges raised by IoT and discuss the solutions and methods for encouraging people to properly use these technologies according to Islamic teachings.
We consider the effects of decoding costs in energy harvesting communication systems.
In our setting, receivers, in addition to transmitters, rely solely on energy harvested from nature, and need to spend some energy in order to decode their intended packets.
We model the decoding energy as an increasing convex function of the rate of the incoming data.
In this setting, in addition to the traditional energy causality constraints at the transmitters, we have the decoding causality constraints at the receivers, where energy spent by the receiver for decoding cannot exceed its harvested energy.
We first consider the point-to-point single-user problem where the goal is to maximize the total throughput by a given deadline subject to both energy and decoding causality constraints.
We show that decoding costs at the receiver can be represented as generalized data arrivals at the transmitter, and thereby moving all system constraints to the transmitter side.
Then, we consider several multi-user settings.
We start with a two-hop network where the relay and the destination have decoding costs, and show that separable policies, where the transmitter's throughput is maximized irrespective of the relay's transmission energy profile, are optimal.
Next, we consider the multiple access channel (MAC) and the broadcast channel (BC) where the transmitters and the receivers harvest energy from nature, and characterize the maximum departure region.
In all multi-user settings considered, we decompose our problems into inner and outer problems.
We solve the inner problems by exploiting the structure of the particular model, and solve the outer problems by water-filling algorithms.
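As a rough illustration of the water-filling step used for the outer problems, the classic single-constraint version allocates a power budget P across channels with gains g_i by raising a common water level mu (found here by bisection). The energy-harvesting variants in the paper add causality constraints that this sketch does not model.

```python
import numpy as np

def water_filling(gains, P, iters=100):
    """Maximize sum log(1 + g_i * p_i) subject to sum p_i = P:
    p_i = max(0, mu - 1/g_i), with the water level mu found by
    bisection on the total-power constraint."""
    gains = np.asarray(gains, dtype=float)
    lo, hi = 0.0, P + 1.0 / gains.min()   # hi guarantees overshoot
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        p = np.maximum(0.0, mu - 1.0 / gains)
        if p.sum() > P:
            hi = mu
        else:
            lo = mu
    return np.maximum(0.0, 0.5 * (lo + hi) - 1.0 / gains)
```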
We propose a method to learn a distribution of shape trajectories from longitudinal data, i.e. the collection of individual objects repeatedly observed at multiple time-points.
The method allows to compute an average spatiotemporal trajectory of shape changes at the group level, and the individual variations of this trajectory both in terms of geometry and time dynamics.
First, we formulate a non-linear mixed-effects statistical model as the combination of a generic statistical model for manifold-valued longitudinal data, a deformation model defining shape trajectories via the action of a finite-dimensional set of diffeomorphisms with a manifold structure, and an efficient numerical scheme to compute parallel transport on this manifold.
Second, we introduce a MCMC-SAEM algorithm with a specific approach to shape sampling, an adaptive scheme for proposal variances, and a log-likelihood tempering strategy to estimate our model.
Third, we validate our algorithm on 2D simulated data, and then estimate a scenario of alteration of the shape of the hippocampus 3D brain structure during the course of Alzheimer's disease.
The method shows for instance that hippocampal atrophy progresses more quickly in female subjects, and occurs earlier in APOE4 mutation carriers.
We finally illustrate the potential of our method for classifying pathological trajectories versus normal ageing.
Convolutional neural networks (CNN) for medical imaging are constrained by the number of annotated data required in the training stage.
Usually, manual annotation is considered to be the "gold standard".
However, medical imaging datasets that include expert manual segmentation are scarce as this step is time-consuming, and therefore expensive.
Moreover, single-rater manual annotation is most often used in data-driven approaches making the network optimal with respect to only that single expert.
In this work, we propose a CNN for brain extraction in magnetic resonance (MR) imaging, that is fully trained with what we refer to as silver standard masks.
Our method consists of 1) developing a dataset with "silver standard" masks as input, and implementing both 2) a tri-planar method using parallel 2D U-Net-based CNNs (referred to as CONSNet) and 3) an auto-context implementation of CONSNet.
The term CONSNet refers to our integrated approach, i.e., training with silver standard masks and using a 2D U-Net-based architecture.
Our results showed that we outperformed (i.e., obtained larger Dice coefficients than) the current state-of-the-art skull-stripping (SS) methods.
Our use of silver standard masks reduced the cost of manual annotation, decreased inter- and intra-rater variability, and avoided CNN segmentation super-specialization towards one specific manual annotation guideline that can occur when gold standard masks are used.
Moreover, the usage of silver standard masks greatly enlarges the volume of input annotated data because we can relatively easily generate labels for unlabeled data.
In addition, our method has the advantage that, once trained, it takes only a few seconds to process a typical brain image volume using modern hardware, such as a high-end graphics processing unit.
In contrast, many of the other competitive methods have processing times in the order of minutes.
A wide array of dynamic bandwidth allocation (DBA) mechanisms have recently been proposed for improving bandwidth utilization and reducing idle times and packets delays in passive optical networks (PONs).
The DBA evaluation studies commonly assumed that the report message for communicating the bandwidth demands of the distributed optical network units (ONUs) to the central optical line terminal (OLT) is scheduled for the end of an ONU's upstream transmission, after the ONU's payload data transmissions.
In this article, we conduct a detailed investigation of the impact of report message scheduling (RMS), either at the beginning (i.e., before the payload data) or at the end of an ONU upstream transmission, on PON performance.
We analytically characterize the reduction in channel idle time with reporting at the beginning of an upstream transmission compared to reporting at the end.
Our extensive simulation experiments consider both the Ethernet Passive Optical Networking (EPON) standard and the Gigabit PON (GPON) standard.
We find that for DBAs with offline sizing and scheduling of ONU upstream transmission grants at the end of a polling cycle, which process requests from all ONUs together, reporting at the beginning gives substantial reductions of mean packet delay at high loads.
For high-performing DBAs with online grant sizing and scheduling, which immediately process individual ONU requests, or with interleaving of ONU groups, reporting at the beginning or at the end gives essentially the same average packet delays.
Real-world image recognition is often challenged by the variability of visual styles including object textures, lighting conditions, filter effects, etc.
Although these variations have been deemed to be implicitly handled by more training data and deeper networks, recent advances in image style transfer suggest that it is also possible to explicitly manipulate the style information.
Extending this idea to general visual recognition problems, we present Batch-Instance Normalization (BIN) to explicitly normalize unnecessary styles from images.
Considering certain style features play an essential role in discriminative tasks, BIN learns to selectively normalize only disturbing styles while preserving useful styles.
The proposed normalization module is easily incorporated into existing network architectures such as Residual Networks, and surprisingly improves the recognition performance in various scenarios.
Furthermore, experiments verify that BIN effectively adapts to completely different tasks like object classification and style transfer, by controlling the trade-off between preserving and removing style variations.
BIN can be implemented with only a few lines of code using popular deep learning frameworks.
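In that spirit, a NumPy sketch of the core computation might look as follows, assuming a learnable per-channel gate rho in [0, 1] that interpolates between batch-norm and instance-norm statistics (the gradient clipping that keeps rho in range during training is omitted):

```python
import numpy as np

def batch_instance_norm(x, rho, gamma, beta, eps=1e-5):
    """x: (N, C, H, W); rho, gamma, beta: (C,) per-channel parameters.
    rho = 1 keeps batch-norm statistics (styles preserved);
    rho = 0 uses instance-norm statistics (styles removed)."""
    # Batch norm: statistics over (N, H, W) for each channel.
    mu_b = x.mean(axis=(0, 2, 3), keepdims=True)
    var_b = x.var(axis=(0, 2, 3), keepdims=True)
    x_bn = (x - mu_b) / np.sqrt(var_b + eps)
    # Instance norm: statistics over (H, W) per sample and channel.
    mu_i = x.mean(axis=(2, 3), keepdims=True)
    var_i = x.var(axis=(2, 3), keepdims=True)
    x_in = (x - mu_i) / np.sqrt(var_i + eps)
    r = rho.reshape(1, -1, 1, 1)
    y = r * x_bn + (1.0 - r) * x_in
    return y * gamma.reshape(1, -1, 1, 1) + beta.reshape(1, -1, 1, 1)
```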
Understanding and interacting with everyday physical scenes requires rich knowledge about the structure of the world, represented either implicitly in a value or policy function, or explicitly in a transition model.
Here we introduce a new class of learnable models--based on graph networks--which implement an inductive bias for object- and relation-centric representations of complex, dynamical systems.
Our results show that as a forward model, our approach supports accurate predictions from real and simulated data, and surprisingly strong and efficient generalization, across eight distinct physical systems which we varied parametrically and structurally.
We also found that our inference model can perform system identification.
Our models are also differentiable, and support online planning via gradient-based trajectory optimization, as well as offline policy optimization.
Our framework offers new opportunities for harnessing and exploiting rich knowledge about the world, and takes a key step toward building machines with more human-like representations of the world.
With the availability of vast amounts of user visitation history on location-based social networks (LBSN), the problem of Point-of-Interest (POI) prediction has been extensively studied.
However, much of the research has been conducted solely on voluntary check-in datasets collected from social apps such as Foursquare or Yelp.
While these data contain rich information about recreational activities (e.g., restaurants, nightlife, and entertainment), information about more prosaic aspects of people's lives is sparse.
This not only limits our understanding of users' daily routines, but more importantly the modeling assumptions developed based on characteristics of recreation-based data may not be suitable for richer check-in data.
In this work, we present an analysis of education "check-in" data using WiFi access logs collected at Purdue University.
We propose a heterogeneous graph-based method to encode the correlations between users, POIs, and activities, and then jointly learn embeddings for the vertices.
We evaluate our method compared to previous state-of-the-art POI prediction methods, and show that the assumptions made by previous methods significantly degrade performance on our data with dense(r) activity signals.
We also show how our learned embeddings could be used to identify similar students (e.g., for friend suggestions).
This work investigates the cost behaviors of binary classification measures in the context of class-imbalanced problems.
Twelve performance measures are studied, including the F measure, G-means in terms of accuracy rates and in terms of recall and precision, balanced error rate (BER), Matthews correlation coefficient (MCC), and the Kappa coefficient.
A new perspective is presented for those measures by revealing their cost functions with respect to the class imbalance ratio.
Basically, they are described by four types of cost functions.
These functions provide a theoretical understanding of why some measures are suitable for dealing with class-imbalanced problems.
Based on their cost functions, we are able to conclude that G-means of accuracy rates and BER are suitable measures because they show "proper" cost behaviors in terms of "a misclassification from a small class will cause a greater cost than that from a large class".
On the contrary, the F1 measure, G-means of recall and precision, MCC, and the Kappa coefficient do not exhibit such behaviors and are therefore unsuitable for dealing with these problems properly.
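The contrast above can be made concrete with a small sketch of the measures computed from a confusion matrix (tp, fn on the small positive class; tn, fp on the large negative class). Holding the per-class rates fixed while growing the imbalance leaves G-means and BER unchanged but shifts F1, illustrating why the latter is cost-sensitive to the imbalance ratio.

```python
import math

def gmean_acc(tp, fn, tn, fp):
    """G-mean of the per-class accuracy rates (sensitivity, specificity)."""
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

def ber(tp, fn, tn, fp):
    """Balanced error rate: mean of the per-class error rates."""
    return 0.5 * (fn / (tp + fn) + fp / (tn + fp))

def f1(tp, fn, tn, fp):
    """F1: harmonic mean of precision and recall; never looks at tn."""
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)
```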
We consider how fair treatment in society for people with disabilities might be impacted by the rise in the use of artificial intelligence, and especially machine learning methods.
We argue that fairness for people with disabilities is different to fairness for other protected attributes such as age, gender or race.
One major difference is the extreme diversity of ways disabilities manifest, and people adapt.
Secondly, disability information is highly sensitive and not always shared, precisely because of the potential for discrimination.
Given these differences, we explore definitions of fairness and how well they work in the disability space.
Finally, we suggest ways of approaching fairness for people with disabilities in AI applications.
Hybrid driving-stepping locomotion is an effective approach for navigating in a variety of environments.
Long, sufficiently even distances can be quickly covered by driving while obstacles can be overcome by stepping.
Our quadruped robot Momaro, with steerable pairs of wheels located at the end of each of its compliant legs, allows such locomotion.
Planning such hybrid paths has attracted only little attention so far.
We propose a navigation planning method which generates hybrid locomotion paths.
The planner chooses driving mode whenever possible and takes into account the detailed robot footprint.
If steps are required, the planner includes them.
To accelerate planning, steps are planned first as abstract manoeuvres and are expanded afterwards into detailed motion sequences.
Our method ensures at all times that the robot stays stable.
Experiments show that the proposed planner is capable of providing paths in feasible time, even for challenging terrain.
This summary of the doctoral thesis is created to emphasize the close connection of the proposed spectral analysis method with the Discrete Fourier Transform (DFT), the most extensively studied and frequently used approach in the history of signal processing.
It is shown that in a typical application case, where uniform data readings are transformed to the same number of uniformly spaced frequencies, the results of the classical DFT and proposed approach coincide.
The difference in performance appears when the length of the DFT is selected to be greater than the length of the data.
The DFT solves the unknown data problem by padding readings with zeros up to the length of the DFT, while the proposed Extended DFT (EDFT) deals with this situation in a different way, it uses the Fourier integral transform as a target and optimizes the transform basis in the extended frequency range without putting such restrictions on the time domain.
Consequently, the Inverse DFT (IDFT) applied to the result of the EDFT returns not only the known readings but also extrapolated data, where the classical DFT can give back only zeros, and higher resolution is achieved at frequencies where the data has been successfully extended.
It has been demonstrated that the EDFT is able to process data with missing readings or internal gaps, and even nonuniformly distributed data.
Thus, EDFT significantly extends the usability of the DFT-based methods, where previously these approaches have been considered as not applicable.
The EDFT finds the solution iteratively and requires repeated calculations to obtain the adaptive basis, which makes its numerical complexity much higher than that of the DFT.
This disadvantage was a serious problem in the 1990s, when the method was proposed.
Fortunately, since then the power of computers has increased so much that nowadays the EDFT can be considered a real alternative.
Visual tracking evaluation features a large variety of performance measures and largely suffers from a lack of consensus about which measures should be used in experiments.
This makes cross-paper tracker comparison difficult.
Furthermore, as some measures may be less effective than others, the tracking results may be skewed or biased towards particular tracking aspects.
In this paper we revisit the popular performance measures and tracker performance visualizations and analyze them theoretically and experimentally.
We show that several measures are equivalent in terms of the information they provide for tracker comparison and, crucially, that some are more brittle than others.
Based on our analysis we narrow down the set of potential measures to only two complementary ones, describing accuracy and robustness, thus pushing towards homogenization of the tracker evaluation methodology.
These two measures can be intuitively interpreted and visualized and have been employed by the recent Visual Object Tracking (VOT) challenges as the foundation for the evaluation methodology.
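As a rough sketch of the two retained measures, accuracy is typically computed as the mean bounding-box overlap (intersection over union) on frames without re-initialization, while robustness is derived from the failure count; the helper below shows only the overlap computation and a trivially simplified summary, under those assumptions.

```python
def overlap(a, b):
    """Intersection-over-union of two axis-aligned boxes (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union

def accuracy_robustness(pred, gt, failures):
    """Accuracy = mean overlap over the evaluated frames;
    robustness reported here simply as the failure count."""
    overlaps = [overlap(p, g) for p, g in zip(pred, gt)]
    return sum(overlaps) / len(overlaps), failures
```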
Phase retrieval refers to recovering a signal from its Fourier magnitude.
This problem arises naturally in many scientific applications, such as ultra-short laser pulse characterization and diffraction imaging.
Unfortunately, phase retrieval is ill-posed for almost all one-dimensional signals.
In order to characterize a laser pulse and overcome the ill-posedness, it is common to use a technique called Frequency-Resolved Optical Gating (FROG).
In FROG, the measured data, referred to as FROG trace, is the Fourier magnitude of the product of the underlying signal with several translated versions of itself.
The FROG trace results in a system of phaseless quartic Fourier measurements.
In this paper, we prove that it suffices to consider only three translations of the signal to determine almost all bandlimited signals, up to trivial ambiguities.
In practice, one usually also has access to the signal's Fourier magnitude.
We show that in this case only two translations suffice.
Our results significantly improve upon earlier work.
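The FROG trace described above can be sketched for a discrete signal: for each (here, cyclic) translation m, take the squared Fourier magnitude of the product of the signal with its shifted copy. This is an illustration only; the paper's analysis concerns bandlimited signals and specific translation choices.

```python
import numpy as np

def frog_trace(x):
    """Discrete FROG-style trace: rows indexed by the translation m,
    columns by frequency; entry (m, k) = |FFT(x[n] * x[n + m])[k]|^2."""
    n = len(x)
    return np.array([np.abs(np.fft.fft(x * np.roll(x, -m))) ** 2
                     for m in range(n)])
```

A quick check of one trivial ambiguity: a global phase rotation of the signal leaves the trace unchanged, since only magnitudes are measured.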
In this paper, we present a simple and efficient method for training deep neural networks in a semi-supervised setting where only a small portion of training data is labeled.
We introduce self-ensembling, where we form a consensus prediction of the unknown labels using the outputs of the network-in-training on different epochs, and most importantly, under different regularization and input augmentation conditions.
This ensemble prediction can be expected to be a better predictor for the unknown labels than the output of the network at the most recent training epoch, and can thus be used as a target for training.
Using our method, we set new records for two standard semi-supervised learning benchmarks, reducing the (non-augmented) classification error rate from 18.44% to 7.05% in SVHN with 500 labels and from 18.63% to 16.55% in CIFAR-10 with 4000 labels, and further to 5.12% and 12.16% by enabling the standard augmentations.
We additionally obtain a clear improvement in CIFAR-100 classification accuracy by using random images from the Tiny Images dataset as unlabeled extra inputs during training.
Finally, we demonstrate good tolerance to incorrect labels.
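The consensus-target idea can be illustrated with a minimal sketch of one plausible variant: an exponential moving average of per-epoch network outputs, bias-corrected to form training targets. The momentum value and the startup bias correction here are illustrative assumptions, not necessarily the paper's exact formulation.

```python
def temporal_ensemble_update(Z, z_epoch, alpha, epoch):
    """One epoch of an ensemble-target update: accumulate a moving
    average Z of per-sample outputs, then bias-correct it to obtain
    consensus targets for the unlabeled samples."""
    Z_new, targets = [], []
    for Z_i, z_i in zip(Z, z_epoch):
        acc = [alpha * a + (1 - alpha) * b for a, b in zip(Z_i, z_i)]
        Z_new.append(acc)
        # startup bias correction, as in exponential moving averages
        targets.append([a / (1 - alpha ** epoch) for a in acc])
    return Z_new, targets
```

The corrected targets would then enter an unsupervised consistency loss alongside the usual supervised loss on the labeled subset.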
Agile software development (ASD) methods were introduced as a reaction to traditional software development methods.
Because their principles differ from those of traditional methods, agile methods involve different processes and activities, and therefore require different measurement practices.
Agile teams tend to carry out their projects in the simplest and most effective way, so measurement practices are even more important in agile methods than in traditional ones: the lack of appropriate and effective measurement practices increases project risk.
The aims of this paper are to investigate current measurement practices in ASD methods, to collect them together in one study, and to review the agile version of the Common Software Measurement International Consortium (COSMIC) publication.
Smoothed analysis is a new way of analyzing algorithms introduced by Spielman and Teng (J. ACM, 2004).
Classical methods like worst-case or average-case analysis have accompanying complexity classes, like P and AvgP, respectively.
While worst-case and average-case analysis give us a means to talk about the running time of a particular algorithm, complexity classes allow us to talk about the inherent difficulty of problems.
Smoothed analysis is a hybrid of worst-case and average-case analysis and compensates some of their drawbacks.
Despite its success for the analysis of single algorithms and problems, there is no embedding of smoothed analysis into computational complexity theory, which is necessary to classify problems according to their intrinsic difficulty.
We propose a framework for smoothed complexity theory, define the relevant classes, and prove some first hardness results (of bounded halting and tiling) and tractability results (binary optimization problems, graph coloring, satisfiability).
Furthermore, we discuss extensions and shortcomings of our model and relate it to semi-random models.
Making the right decision in traffic is a challenging task that is highly dependent on individual preferences as well as the surrounding environment.
Therefore, it is hard to model based solely on expert knowledge.
In this work we use Deep Reinforcement Learning to learn maneuver decisions based on a compact semantic state representation.
This ensures a consistent model of the environment across scenarios as well as a behavior adaptation function, enabling on-line changes of desired behaviors without re-training.
The input for the neural network is a simulated object list similar to that of Radar or Lidar sensors, superimposed by a relational semantic scene description.
The state as well as the reward are extended by a behavior adaptation function and a parameterization respectively.
With little expert knowledge and a set of mid-level actions, the agent proves capable of adhering to traffic rules and learns to drive safely in a variety of situations.
A growing issue in the modern cyberspace world is the direct identification of malicious activity over network connections.
The boom of the machine learning industry in the past few years has led to the increasing usage of machine learning technologies, which are especially prevalent in the network intrusion detection research community.
When utilizing these fairly contemporary techniques, the community has realized that datasets are pivotal for identifying malicious packets and connections, particularly datasets that include labeling information from which learning models can be constructed.
However, there exists a shortage of publicly available, relevant datasets to researchers in the network intrusion detection community.
Thus, in this paper, we introduce a method to construct labeled flow data by combining the packet meta-information with IDS logs to infer labels for intrusion detection research.
Specifically, we designed a NetFlow-compatible format because a large body of network devices, such as routers and switches, can export NetFlow records from raw traffic.
In doing so, the introduced method helps researchers obtain relevant network flow datasets along with label information.
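As a rough sketch of the label-inference step, flow records can be joined with IDS alerts on the connection 5-tuple. All field names below (`src_ip`, `signature`, etc.) are hypothetical, and a real implementation would additionally match on time windows and handle bidirectional flows.

```python
def label_flows(flows, ids_alerts):
    """Join NetFlow-style records with IDS alerts on the 5-tuple to
    infer a label for each flow (field names are illustrative)."""
    alert_index = {}
    for a in ids_alerts:
        key = (a["src_ip"], a["src_port"], a["dst_ip"], a["dst_port"], a["proto"])
        alert_index[key] = a["signature"]
    labeled = []
    for f in flows:
        key = (f["src_ip"], f["src_port"], f["dst_ip"], f["dst_port"], f["proto"])
        # flows with no matching alert default to a benign label
        labeled.append({**f, "label": alert_index.get(key, "benign")})
    return labeled
```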
Tensegrity mechanisms are composed of rigid and tensile parts that are in equilibrium.
They are interesting alternative designs for some applications, such as modelling musculo-skeleton systems.
Tensegrity mechanisms are more difficult to analyze than classical mechanisms as the static equilibrium conditions that must be satisfied generally result in complex equations.
A class of planar one-degree-of-freedom tensegrity mechanisms with three linear springs is analyzed in detail for the sake of systematic solution classifications.
The kinetostatic equations are derived and solved under several loading and geometric conditions.
It is shown that these mechanisms exhibit up to six equilibrium configurations, of which one or two are stable, depending on the geometric and loading conditions.
Discriminant varieties and cylindrical algebraic decomposition combined with Groebner base elimination are used to classify solutions as a function of the geometric, loading and actuator input parameters.
We present FoamGrid, a new implementation of the DUNE grid interface.
FoamGrid implements one- and two-dimensional grids in a physical space of arbitrary dimension, which allows for grids for curved domains.
Moreover, the grids are not required to have a manifold structure, i.e., more than two elements may share a common facet.
This makes FoamGrid the grid data structure of choice for simulating structures such as foams, discrete fracture networks, or network flow problems.
FoamGrid implements adaptive non-conforming refinement with element parametrizations.
As an additional feature it allows removal and addition of elements in an existing grid, which makes FoamGrid suitable for network growth problems.
We show how to use FoamGrid, with particular attention to the extensions of the grid interface needed to handle non-manifold topology and grid growth.
Three numerical examples demonstrate the possibilities offered by FoamGrid.
This article describes the biopolitical implications of interfaces and data for design from psychological, cultural, legal, functional and aesthetic/perceptive perspectives, in the framework of hyperconnectivity: the condition in which person-to-person, person-to-machine and machine-to-machine communication progressively shifts to networked and digital means.
A definition is given for the terms of "interface biopolitics" and "data biopolitics", as well as evidence supporting these definitions and a description of the technological, theoretical and practice-based innovations bringing them into meaningful existence.
Interfaces, algorithms, artificial intelligences of various types, the tendency in quantified self and the concept of "information bubbles" will be examined in terms of interface and data biopolitics, from the point of view of design, and for their implications in terms of freedoms, transparency, justice and accessibility to human rights.
A working hypothesis is described for technologically relevant design practices and education processes, in order to confront these issues in critical, ethical and inclusive ways.
Finite volume methods (FVMs) constitute a popular class of methods for the numerical simulation of fluid flows.
Among the various components of these methods, the discretisation of the gradient operator has received less attention despite its fundamental importance with regards to the accuracy of the FVM.
The most popular gradient schemes are the divergence theorem (DT) (or Green-Gauss) scheme, and the least-squares (LS) scheme.
Both are widely believed to be second-order accurate, but the present study shows that in fact the common variant of the DT gradient is second-order accurate only on structured meshes whereas it is zeroth-order accurate on general unstructured meshes, and the LS gradient is second-order and first-order accurate, respectively.
This is explained through a theoretical analysis and is confirmed by numerical tests.
The schemes are then used within a FVM to solve a simple diffusion equation on unstructured grids generated by several methods; the results reveal that the zeroth-order accuracy of the DT gradient is inherited by the FVM as a whole, and the discretisation error does not decrease with grid refinement.
On the other hand, use of the LS gradient leads to second-order accurate results, as does the use of alternative, consistent, DT gradient schemes, including a new iterative scheme that makes the common DT gradient consistent at almost no extra cost.
The numerical tests are performed using both an in-house code and the popular public domain PDE solver OpenFOAM.
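For illustration, a minimal unweighted least-squares gradient in 2-D can be written as follows. This is one common variant of the LS scheme, fitting a linear model over the neighbour set; actual FVM codes often add inverse-distance weighting.

```python
def least_squares_gradient(c0, phi0, neighbours):
    """Unweighted least-squares gradient at a 2-D cell centre:
    fit phi_i ~ phi_0 + g . (c_i - c_0) over the neighbours,
    solving the 2x2 normal equations directly."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x, y), phi in neighbours:
        dx, dy, dphi = x - c0[0], y - c0[1], phi - phi0
        a11 += dx * dx; a12 += dx * dy; a22 += dy * dy
        b1 += dx * dphi; b2 += dy * dphi
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det,
            (a11 * b2 - a12 * b1) / det)
```

Because the fit is exact for linear fields regardless of mesh geometry, this construction is (at least) first-order accurate on arbitrary unstructured meshes, in line with the comparison above.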
Recently, the hybrid convolutional neural network hidden Markov model (CNN-HMM) has been introduced for offline handwritten Chinese text recognition (HCTR) and has achieved state-of-the-art performance.
In a CNN-HMM system, a handwritten text line is modeled by a series of cascading HMMs, each representing one character, and the posterior distributions of HMM states are calculated by CNN.
However, modeling each character in the large Chinese vocabulary with a uniform, fixed number of hidden states incurs high memory and computational costs and makes the tens of thousands of HMM state classes hard to distinguish.
Another key issue of CNN-HMM for HCTR is the diversified writing style, which leads to model strain and a significant performance decline for specific writers.
To address these issues, we propose a writer-aware CNN based on parsimonious HMM (WCNN-PHMM).
Validated on the ICDAR 2013 competition of CASIA-HWDB database, the more compact WCNN-PHMM of a 7360-class vocabulary can achieve a relative character error rate (CER) reduction of 16.6% over the conventional CNN-HMM without considering language modeling.
Moreover, the state-tying results of PHMM explicitly show the information sharing among similar characters and the confusion reduction of tied state classes.
Finally, we visualize the learned writer codes and demonstrate the strong relationship with the writing styles of different writers.
To the best of our knowledge, WCNN-PHMM yields the best results on the ICDAR 2013 competition set, demonstrating its power when enlarging the size of the character vocabulary.
A survey of dictionary models and formats is presented, together with an overview of corresponding recent standardisation activities.
Ontologies are widely used as a means of solving information heterogeneity problems on the web because of their capability to provide explicit meaning to information.
They have become an efficient tool for knowledge representation in a structured manner.
There is always more than one ontology for the same domain.
Furthermore, there is no standard method for building ontologies, and there are many ontology building tools using different ontology languages.
Because of these reasons, interoperability between the ontologies is very low.
Current ontology tools mostly provide functions to build, edit and perform inference over ontologies.
Methods for merging heterogeneous domain ontologies are not included in most tools.
This paper presents an ontology merging methodology for building a single global ontology from heterogeneous eXtensible Markup Language (XML) data sources, capturing and maintaining all the knowledge those sources can contain.
We introduce a novel, simple convolution neural network (CNN) architecture - multi-group norm constraint CNN (MGNC-CNN) that capitalizes on multiple sets of word embeddings for sentence classification.
MGNC-CNN extracts features from input embedding sets independently and then joins these at the penultimate layer in the network to form a final feature vector.
We then adopt a group regularization strategy that differentially penalizes weights associated with the subcomponents generated from the respective embedding sets.
This model is much simpler than comparable alternative architectures and requires substantially less training time.
Furthermore, it is flexible in that it does not require input word embeddings to be of the same dimensionality.
We show that MGNC-CNN consistently outperforms baseline models.
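A minimal sketch of the two ideas, per-embedding-set features joined at the penultimate layer plus a per-group penalty, might look like this. Modelling the norm constraint as an l2 penalty term is an assumption for illustration, not necessarily the paper's exact projection scheme.

```python
import math

def mgnc_features(feature_groups):
    """Join features extracted independently from each embedding set
    at the penultimate layer by simple concatenation."""
    joined = []
    for g in feature_groups:
        joined.extend(g)
    return joined

def group_penalty(weight_groups, lambdas):
    """Group regularizer: each embedding set's weight block gets its
    own penalty strength, so groups are penalized differentially."""
    return sum(lam * math.sqrt(sum(w * w for w in ws))
               for ws, lam in zip(weight_groups, lambdas))
```

Because each group is processed independently before concatenation, the embedding sets can have different dimensionalities, as the abstract notes.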
The rapid development of Internet of Things (IoT) technology, an interconnection of networks over an insecure public channel, i.e. the Internet, demands authentication of remote users trying to access secure network resources.
In 2013, Ankita et al. proposed an improved three-factor remote user authentication scheme.
In this poster we show that Ankita et al.'s scheme is vulnerable to a known session-specific temporary information attack; on successfully performing this attack, the adversary can mount all other major cryptographic attacks.
As part of our contribution, we propose an improved scheme that is resistant to all major cryptographic attacks and overcomes the defects in Ankita et al.'s scheme.
Human-swarm interaction (HSI) involves a number of human factors impacting human behaviour throughout the interaction.
As the technologies used within HSI advance, it is more tempting to increase the level of swarm autonomy within the interaction to reduce the workload on humans.
Yet, the prospective negative effects of high levels of autonomy on human situational awareness can hinder this process.
Flexible autonomy aims to trade off these effects by changing the level of autonomy within the interaction when required, with mixed-initiative systems combining human preferences and automation recommendations to select an appropriate level of autonomy at a given point in time.
However, the effective implementation of mixed-initiative systems raises fundamental questions on how to combine human preferences and automation recommendations, how to realise the selected level of autonomy, and what the future impacts on the cognitive states of a human are.
We explore open challenges that hamper the process of developing effective flexible autonomy.
We then highlight the potential benefits of using system modelling techniques in HSI by illustrating how they provide HSI designers with an opportunity to evaluate different strategies for assessing the state of the mission and for adapting the level of autonomy within the interaction to maximise mission success metrics.
We present Wasserstein introspective neural networks (WINN) that are both a generator and a discriminator within a single model.
WINN provides a significant improvement over the recent introspective neural networks (INN) method by enhancing INN's generative modeling capability.
WINN has three interesting properties: (1) A mathematical connection between the formulation of the INN algorithm and that of Wasserstein generative adversarial networks (WGAN) is made.
(2) The explicit adoption of the Wasserstein distance into INN results in a large enhancement to INN, achieving compelling results even with a single classifier: e.g., providing nearly a 20 times reduction in model size over INN for unsupervised generative modeling.
(3) When applied to supervised classification, WINN also gives rise to improved robustness against adversarial examples in terms of the error reduction.
In the experiments, we report encouraging results on unsupervised learning problems including texture, face, and object modeling, as well as a supervised classification task against adversarial attacks.
We propose to consider ensembles of cycles (quadrics), which are interconnected through conformally invariant geometric relations (e.g. "to be orthogonal", "to be tangent", etc.), as new objects in an extended Moebius--Lie geometry.
It was recently demonstrated in several related papers that such ensembles of cycles naturally parameterise many other conformally invariant objects, e.g. loxodromes or continued fractions.
The paper describes a method that reduces a collection of conformally invariant geometric relations to a system of linear equations, possibly accompanied by one fixed quadratic relation.
To show its usefulness, the method is implemented as a C++ library.
It operates with numeric and symbolic data of cycles in spaces of arbitrary dimensionality and metrics with any signatures.
Numeric calculations can be done in exact or approximate arithmetic.
In the two- and three-dimensional cases illustrations and animations can be produced.
An interactive Python wrapper of the library is provided as well.
With the rapid development of information technology and multimedia, the use of digital data is increasing day by day.
It has therefore become essential, and challenging, to protect multimedia information from piracy.
Many copyright owners are concerned about preventing any kind of illegal copying of their information.
Hence, developing techniques to face these problems is very important.
Digital watermarking is considered a solution for protecting multimedia data.
In this paper, a watermarking scheme is proposed and implemented.
In the proposed method, the original image is rearranged using a zigzag sequence and the DWT is applied to the rearranged image.
Then the DCT and SVD are applied to the three high-frequency bands LH, HL and HH.
The watermark is embedded by modifying the singular values of these bands.
Watermark extraction is performed by inverting the embedding process.
Choosing these three bands combines mid-band and pure high-band content, which ensures good imperceptibility and greater robustness against different kinds of attacks.
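The zigzag rearrangement step can be sketched as follows. A JPEG-style diagonal traversal is assumed here; the subsequent DWT/DCT/SVD stages are omitted.

```python
def zigzag_indices(n):
    """JPEG-style zigzag traversal order of an n x n matrix: walk the
    anti-diagonals, alternating direction on odd/even diagonals."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def zigzag_rearrange(img):
    """Rearrange a square image into a new square following the zigzag
    sequence, as a pre-scrambling step before the transforms."""
    n = len(img)
    flat = [img[i][j] for i, j in zigzag_indices(n)]
    return [flat[r * n:(r + 1) * n] for r in range(n)]
```

The rearrangement is a fixed permutation, so the extraction side can invert it exactly.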
Scenarios for the emergence or bootstrap of a lexicon involve the repeated interaction between at least two agents who must reach a consensus on how to name N objects using H words.
Here we consider minimal models of two types of learning algorithms: cross-situational learning, in which the individuals determine the meaning of a word by looking for something in common across all observed uses of that word, and supervised operant conditioning learning, in which there is strong feedback between individuals about the intended meaning of the words.
Despite the stark differences between these learning schemes, we show that they yield the same communication accuracy in the realistic limits of large N and H, which coincides with the result of the classical occupancy problem of randomly assigning N objects to H words.
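The random-assignment baseline can be checked with a small Monte-Carlo sketch. The accuracy definition used here, a hearer guessing uniformly among the objects sharing the spoken word, is an illustrative assumption rather than the paper's exact formulation.

```python
import random

def communication_accuracy(N, H, trials=2000, seed=0):
    """Monte-Carlo estimate of communication accuracy when N objects
    are assigned uniformly at random to H words: a hearer who picks
    uniformly among the objects sharing the spoken word succeeds with
    probability 1/k, where k objects share that word."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        words = [rng.randrange(H) for _ in range(N)]
        counts = {}
        for w in words:
            counts[w] = counts.get(w, 0) + 1
        # average, over objects, of the chance of being identified
        total += sum(1.0 / counts[w] for w in words) / N
    return total / trials
```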
The matching function for the problem of stereo reconstruction or optical flow has been traditionally designed as a function of the distance between the features describing matched pixels.
This approach works under the assumption that the appearance of pixels in two stereo cameras or in two consecutive video frames does not change dramatically.
However, this might not be the case, if we try to match pixels over a large interval of time.
In this paper we propose a method that learns a matching function which automatically finds the space of allowed changes in visual appearance, such as those due to motion blur, chromatic distortions, different colour calibration or seasonal changes.
Furthermore, it automatically learns the importance of matching scores of contextual features at different relative locations and scales.
The proposed classifier gives reliable estimates of pixel disparities even without any form of regularization.
We evaluated our method on two standard problems, stereo matching on the KITTI outdoor dataset and optical flow on the Sintel dataset, as well as on the newly introduced TimeLapse change detection dataset.
Our algorithm obtained very promising results comparable to the state-of-the-art.
This paper presents a novel decentralized control strategy for a multi-robot system that enables parallel multi-target exploration while ensuring a time-varying connected topology in cluttered 3D environments.
Flexible continuous connectivity is guaranteed by building upon a recent connectivity maintenance method, in which limited range, line-of-sight visibility, and collision avoidance are taken into account at the same time.
Completeness of the decentralized multi-target exploration algorithm is guaranteed by dynamically assigning the robots with different motion behaviors during the exploration task.
One major group is subject to a suitable downscaling of the main traveling force based on the traveling efficiency of the current leader and the direction alignment between traveling and connectivity force.
This ensures that the leader always reaches its current target and, over a longer time horizon, that the whole team completes the overall task in finite time.
Extensive Monte~Carlo simulations with a group of several quadrotor UAVs show the scalability and effectiveness of the proposed method and experiments validate its practicability.
The volume of text-based documents is increasing day by day, and medical documents make up part of this growing body of text.
In this study, text classification techniques were applied to medical documents and their classification performance was evaluated.
The data sets used are multi-class and multi-label.
The Chi-Square (CHI) technique was used for feature selection, and the SMO, NB, C4.5, RF and KNN algorithms were used for classification.
The aim of this study is to evaluate the success of various classifiers on multi-class, multi-label data sets consisting of medical documents.
With the first 400 features, KNN was the most successful classifier; beyond 400 features, SMO became the most successful.
In this work, we consider a two-level hierarchical MIMO antenna array system, where each antenna of the upper level is made up of a subarray on the lower one.
The concept of spatial multiplexing is applied twice in this situation: Firstly, the spatial multiplexing of a Line-of-Sight (LoS) MIMO system is exploited.
It is based on appropriate (sub-)array distances and achieves multiplexing gain due to phase differences among the signals at the receive (sub-)arrays.
Secondly, one or more additional reflected paths of different angles (separated from the LoS path by different spatial beams at the subarrays) are used to exploit spatial multiplexing between paths.
By exploiting these two kinds of multiplexing simultaneously, a high-dimensional system with maximum spatial multiplexing is proposed, jointly using 'phase differences' within paths and 'angular differences' between paths.
The system includes an advanced hybrid beamforming architecture with large subarray separation, which could occur in millimeter wave backhaul scenarios.
The possible gains of the system w.r.t. a pure LoS MIMO system are illustrated by evaluating the capacities under total transmit power constraints.
Mining social media messages for health and drug related information has received significant interest in pharmacovigilance research.
Social media sites (e.g., Twitter), have been used for monitoring drug abuse, adverse reactions of drug usage and analyzing expression of sentiments related to drugs.
Most of these studies are based on aggregated results from a large population rather than specific sets of individuals.
In order to conduct studies at an individual level or specific cohorts, identifying posts mentioning intake of medicine by the user is necessary.
Towards this objective, we train different deep neural network classification models on a publicly available annotated dataset and study their performances on identifying mentions of personal intake of medicine in tweets.
We also design and train a new architecture: a stacked ensemble of shallow convolutional neural networks (CNNs).
We use random search for tuning the hyperparameters of the models and share the details of the values taken by the hyperparameters for the best learnt model in different deep neural network architectures.
Our system produces state-of-the-art results, with a micro-averaged F-score of 0.693.
We present a technique that uses images, videos and sensor data taken from first-person point-of-view devices to perform egocentric field-of-view (FOV) localization.
We define egocentric FOV localization as capturing the visual information from a person's field-of-view in a given environment and transferring this information onto a reference corpus of images and videos of the same space, hence determining what a person is attending to.
Our method matches images and video taken from the first-person perspective with the reference corpus and refines the results using the first-person's head orientation information obtained using the device sensors.
We demonstrate single and multi-user egocentric FOV localization in different indoor and outdoor environments with applications in augmented reality, event understanding and studying social interactions.
Message Passing Interface (MPI) is the most commonly used paradigm in writing parallel programs since it can be employed not only within a single processing node but also across several connected ones.
Data flow analysis concepts, techniques and tools are needed to understand and analyze MPI-based programs and to detect the bugs that arise in them.
In this paper we propose two automated techniques to analyze and debug MPI-based programs source codes.
Kernel methods have produced state-of-the-art results for a number of NLP tasks such as relation extraction, but suffer from poor scalability due to the high cost of computing kernel similarities between natural language structures.
A recently proposed technique, kernelized locality-sensitive hashing (KLSH), can significantly reduce the computational cost, but is only applicable to classifiers operating on kNN graphs.
Here we propose to use random subspaces of KLSH codes for efficiently constructing an explicit representation of NLP structures suitable for general classification methods.
Further, we propose an approach for optimizing the KLSH model for classification problems by maximizing an approximation of mutual information between the KLSH codes (feature vectors) and the class labels.
We evaluate the proposed approach on biomedical relation extraction datasets, and observe significant and robust improvements in accuracy w.r.t. state-of-the-art classifiers, along with drastic (orders-of-magnitude) speedup compared to conventional kernel methods.
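A rough sketch of building an explicit representation from random subspaces of binary KLSH codes is given below. The subspace sizes and the one-hot encoding of each subspace's bit pattern are illustrative assumptions.

```python
import random

def subspace_features(code, n_subspaces, bits_per_subspace, seed=0):
    """Map a binary KLSH code to an explicit feature vector: sample
    random subsets of bit positions and one-hot encode the bit
    pattern observed in each subset."""
    rng = random.Random(seed)
    n_bits = len(code)
    patterns_per = 2 ** bits_per_subspace
    features = [0] * (n_subspaces * patterns_per)
    for s in range(n_subspaces):
        positions = rng.sample(range(n_bits), bits_per_subspace)
        # interpret the selected bits as an integer pattern index
        pattern = sum(code[p] << i for i, p in enumerate(positions))
        features[s * patterns_per + pattern] = 1
    return features
```

The resulting sparse vectors can be fed to any linear or kernelized classifier, rather than being restricted to kNN graphs.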
Recognising persons in everyday photos presents major challenges (occluded faces, different clothing, locations, etc.) for machine vision.
We propose a convnet based person recognition system on which we provide an in-depth analysis of informativeness of different body cues, impact of training data, and the common failure modes of the system.
In addition, we discuss the limitations of existing benchmarks and propose more challenging ones.
Our method is simple and is built on open source and open data, yet it improves the state of the art results on a large dataset of social media photos (PIPA).
In order to react properly to the information collected from Internet of Things (IoT) devices, the location information of those devices should be available at the data center.
One challenge in massive IoT networks is to identify the location map of all sensor nodes from partially observed distance information.
In this paper, we propose a matrix completion based localization algorithm to reconstruct the location map of sensors using partially observed distance information.
From the numerical experiments, we show that the proposed method based on the modified conjugate gradient is effective in recovering the Euclidean distance matrix.
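As a simplified stand-in for the paper's matrix-completion solver, a location map can be recovered directly by gradient descent on the stress over the observed distances. This sketch is not the modified conjugate gradient method of the paper; it only illustrates recovery from partial distance information.

```python
import math
import random

def localize(n, observed, dim=2, steps=3000, lr=0.01, seed=0):
    """Recover sensor positions from partially observed pairwise
    distances by gradient descent on the stress
        sum over observed (i, j) of (|x_i - x_j| - d_ij)^2."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(n)]
        for (i, j), d in observed.items():
            diff = [a - b for a, b in zip(X[i], X[j])]
            dist = math.sqrt(sum(c * c for c in diff)) or 1e-12
            coef = 2.0 * (dist - d) / dist
            for k in range(dim):
                grad[i][k] += coef * diff[k]
                grad[j][k] -= coef * diff[k]
        X = [[x - lr * g for x, g in zip(Xi, Gi)]
             for Xi, Gi in zip(X, grad)]
    return X
```

As with any distance-only method, positions are recovered only up to rotation, translation and reflection.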
Dozens of new models on fixation prediction are published every year and compared on open benchmarks such as MIT300 and LSUN.
However, progress in the field can be difficult to judge because models are compared using a variety of inconsistent metrics.
Here we show that no single saliency map can perform well under all metrics.
Instead, we propose a principled approach to solve the benchmarking problem by separating the notions of saliency models, maps and metrics.
Inspired by Bayesian decision theory, we define a saliency model to be a probabilistic model of fixation density prediction and a saliency map to be a metric-specific prediction derived from the model density which maximizes the expected performance on that metric given the model density.
We derive these optimal saliency maps for the most commonly used saliency metrics (AUC, sAUC, NSS, CC, SIM, KL-Div) and show that they can be computed analytically or approximated with high precision.
We show that this leads to consistent rankings in all metrics and avoids the penalties of using one saliency map for all metrics.
Our method allows researchers to have their model compete on many different metrics with state-of-the-art in those metrics: "good" models will perform well in all metrics.
Resources such as labeled corpora are necessary to train automatic models within the natural language processing (NLP) field.
Historically, a large number of resources covering a broad range of problems have been available mostly in English.
One such problem is known as Personality Identification, where, based on a psychological model (e.g., the Big Five Model), the goal is to find the traits of a subject's personality given, for instance, a text written by that subject.
In this paper we introduce a new corpus in Spanish called Texts for Personality Identification (TxPI).
This corpus will help to develop models to automatically assign a personality trait to an author of a text document.
Our corpus, TxPI-u, contains information on 416 Mexican undergraduate students along with demographic information such as age, gender, and the academic program in which they are enrolled.
Finally, as an additional contribution, we present a set of baselines to provide a comparison scheme for further research.
Since a tweet is limited to 140 characters, it is ambiguous and difficult for traditional Natural Language Processing (NLP) tools to analyse.
This research presents KeyXtract which enhances the machine learning based Stanford CoreNLP Part-of-Speech (POS) tagger with the Twitter model to extract essential keywords from a tweet.
The system was developed using rule-based parsers and two corpora.
The data for the research was obtained from a Twitter profile of a telecommunication company.
The system development consisted of two stages.
At the initial stage, a domain specific corpus was compiled after analysing the tweets.
The POS tagger extracted the Noun Phrases and Verb Phrases while the parsers removed noise and extracted any other keywords missed by the POS tagger.
The system was evaluated using the Turing Test.
After it was tested and compared against Stanford CoreNLP, the second stage of the system was developed addressing the shortcomings of the first stage.
It was enhanced using Named Entity Recognition and Lemmatization.
The second stage was also tested using the Turing Test, and its pass rate increased from 50.00% to 83.33%.
The performance of the final system output was measured using the F1 score.
Stanford CoreNLP with the Twitter model had an average F1 of 0.69 while the improved system had an F1 of 0.77.
The accuracy of the system could be improved by using a complete domain specific corpus.
Since the system used linguistic features of a sentence, it could be applied to other NLP tools.
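The noun-phrase extraction step can be approximated with a simple rule-based chunker over (token, POS) pairs. Penn Treebank tags are assumed, and this is only a stand-in for the CoreNLP-based pipeline described above.

```python
def extract_noun_phrases(tagged):
    """Collect maximal runs of adjectives/nouns that contain at least
    one noun from (token, POS) pairs (Penn Treebank tagset assumed)."""
    phrases, run = [], []
    for tok, pos in tagged + [("", ".")]:   # sentinel flushes the last run
        if pos.startswith("NN") or pos.startswith("JJ"):
            run.append((tok, pos))
        else:
            if any(p.startswith("NN") for _, p in run):
                phrases.append(" ".join(t for t, _ in run))
            run = []
    return phrases
```

In the full system, the parsers would then filter noise from these candidates and add keywords the tagger missed.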
Inspired by the great success of recurrent neural networks (RNNs) in sequential modeling, we introduce a novel RNN system to improve the performance of online signature verification.
The training objective is to directly minimize intra-class variations and to push the distances between skilled forgeries and genuine samples above a given threshold.
By back-propagating the training signals, our RNN produces discriminative features with the desired metric properties.
Additionally, we propose a novel descriptor, called the length-normalized path signature (LNPS), and apply it to online signature verification.
LNPS has interesting properties, such as scale invariance and rotation invariance after linear combination, and shows promising results in online signature verification.
Experiments on the publicly available SVC-2004 dataset yielded state-of-the-art performance of 2.37% equal error rate (EER).
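The equal error rate reported above is the operating point where the false-reject and false-accept rates coincide. A minimal stdlib computation from verification scores (the scores below are made up for illustration):

```python
def equal_error_rate(genuine, forgery):
    """Sweep thresholds over all observed scores (higher = more genuine)
    and return the point where the false-reject rate (genuine samples
    rejected) meets the false-accept rate (forgeries accepted)."""
    best = (1.0, None)
    for t in sorted(set(genuine) | set(forgery)):
        frr = sum(g < t for g in genuine) / len(genuine)
        far = sum(f >= t for f in forgery) / len(forgery)
        if abs(frr - far) < best[0]:
            best = (abs(frr - far), (frr + far) / 2)
    return best[1]

# Made-up similarity scores for illustration only.
genuine = [0.9, 0.8, 0.85, 0.7, 0.95]
forgery = [0.3, 0.5, 0.72, 0.4, 0.2]
eer = equal_error_rate(genuine, forgery)
```

In practice the EER is read off a finely sampled DET curve; sweeping the observed scores is sufficient for small sets.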
Single image super resolution is a very important computer vision task, with a wide range of applications.
In recent years, super-resolution models have grown steadily deeper, but the small gains in performance have come at the cost of a huge amount of computation and memory consumption.
In this work, to make super-resolution models more effective, we propose a novel single image super-resolution method via recursive squeeze and excitation networks (SESR).
By introducing the squeeze and excitation module, SESR can model the interdependencies and relationships between channels, which makes the model more efficient.
In addition, the recursive structure and progressive reconstruction method in our model minimize the number of layers and parameters and enable SESR to train multi-scale super resolution simultaneously in a single model.
Evaluations on four benchmark test sets show that our model surpasses state-of-the-art methods in terms of both speed and accuracy.
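The squeeze-and-excitation mechanism mentioned above can be sketched without a deep learning framework: pool each channel to a scalar (squeeze), pass the scalars through a small bottleneck producing per-channel gates (excitation), and rescale the channels. The weights below are fixed toy values; a trained SE block learns them.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def squeeze_excite(feature_maps, w1, w2):
    """feature_maps: C channels, each an HxW nested list.
    w1 (C/r x C) and w2 (C x C/r) are toy bottleneck weights standing in
    for the two learned fully-connected layers of a real SE block."""
    C = len(feature_maps)
    # Squeeze: global average pooling per channel.
    s = [sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
         for fm in feature_maps]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w1[i][c] * s[c] for c in range(C)))
              for i in range(len(w1))]
    gates = [sigmoid(sum(w2[c][i] * hidden[i] for i in range(len(hidden))))
             for c in range(C)]
    # Scale: reweight each channel by its gate.
    return [[[v * gates[c] for v in row] for row in feature_maps[c]]
            for c in range(C)]

# Two 2x2 channels, reduction to a single hidden unit (toy weights).
fm = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
out = squeeze_excite(fm, w1=[[0.5, 0.5]], w2=[[1.0], [1.0]])
```

The gates depend on global channel statistics, which is what lets the block model cross-channel interdependencies cheaply.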
In this paper, we address the symbol level precoding (SLP) design problem under max-min SINR criterion in the downlink of multiuser multiple-input single-output (MISO) channels.
First, we show that the distance preserving constructive interference regions (DPCIR) are always polyhedral angles (shifted pointed cones) for any given constellation point with unbounded decision region.
Then we prove that any signal in a given unbounded DPCIR has a norm larger than the norm of the corresponding vertex if and only if the convex hull of the constellation contains the origin.
Using these properties, we show that the power of the noiseless received signal lying on an unbounded DPCIR is a strictly increasing function of two parameters.
This allows us to reformulate the originally non-convex SLP max-min SINR as a convex optimization problem.
We discuss the loss due to our proposed convex reformulation and provide some simulation results.
In order to convey the most content in their limited space, advertisements embed references to outside knowledge via symbolism.
For example, a motorcycle stands for adventure (a positive property the ad wants associated with the product being sold), and a gun stands for danger (a negative property to dissuade viewers from undesirable behaviors).
We show how to use symbolic references to better understand the meaning of an ad.
We further show how anchoring ad understanding in general-purpose object recognition and image captioning improves results.
We formulate the ad understanding task as matching the ad image to human-generated statements that describe the action that the ad prompts, and the rationale it provides for taking this action.
Our proposed method outperforms the state of the art on this task, and on an alternative formulation of question-answering on ads.
We show additional applications of our learned representations for matching ads to slogans, and clustering ads according to their topic, without extra training.
Home automation platforms provide a new level of convenience by enabling consumers to automate various aspects of physical objects in their homes.
While the convenience is beneficial, security flaws in the platforms or integrated third-party products can have serious consequences for the integrity of a user's physical environment.
In this paper we perform a systematic security evaluation of two popular smart home platforms, Google's Nest platform and Philips Hue, that implement home automation "routines" (i.e., trigger-action programs involving apps and devices) via manipulation of state variables in a centralized data store.
Our semi-automated analysis examines, among other things, platform access control enforcement, the rigor of non-system enforcement procedures, and the potential for misuse of routines.
This analysis results in ten key findings with serious security implications.
For instance, we demonstrate the potential for the misuse of smart home routines in the Nest platform to perform a lateral privilege escalation, illustrate how Nest's product review system is ineffective at preventing multiple stages of this attack that it examines, and demonstrate how emerging platforms may fail to provide even bare-minimum security by allowing apps to arbitrarily add/remove other apps from the user's smart home.
Our findings draw attention to the unique security challenges of platforms that execute routines via centralized data stores and highlight the importance of enforcing security by design in emerging home automation platforms.
Several researchers have proposed solutions for secure data outsourcing on the public clouds based on encryption, secret-sharing, and trusted hardware.
Existing approaches, however, exhibit many limitations including high computational complexity, imperfect security, and information leakage.
This chapter describes an emerging trend in secure data processing that recognizes that an entire dataset may not be sensitive, and hence, non-sensitivity of data can be exploited to overcome some of the limitations of existing encryption-based approaches.
In particular, data and computation can be partitioned into sensitive or non-sensitive datasets - sensitive data can either be encrypted prior to outsourcing or stored/processed locally on trusted servers.
The non-sensitive dataset, on the other hand, can be outsourced and processed in the cleartext.
While partitioned computing can bring new efficiencies since it does not incur (expensive) encrypted data processing costs on non-sensitive data, it can lead to information leakage.
We study partitioned computing in two contexts - first, in the context of the hybrid cloud where local resources are integrated with public cloud resources to form an effective and secure storage and computational platform for enterprise data.
In the hybrid cloud, sensitive data is stored on the private cloud to prevent leakage and a computation is partitioned between private and public clouds.
Care must be taken that the public cloud cannot infer any information about sensitive data from inter-cloud data access during query processing.
We then consider partitioned computing in a public cloud only setting, where sensitive data is encrypted before outsourcing.
We formally define a partitioned security criterion that any approach to partitioned computing on public clouds must ensure in order to not introduce any new vulnerabilities to the existing secure solution.
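The sensitive/non-sensitive split at the heart of partitioned computing can be sketched as follows. The sensitivity predicate, the record layout, and especially the "cipher" are placeholders: real deployments use a proper encryption scheme or keep sensitive records on the trusted side entirely.

```python
def partition_and_outsource(records, is_sensitive, encrypt):
    """Split a dataset for outsourcing: sensitive records are encrypted
    (or kept on the private/trusted side); the rest go out in cleartext,
    avoiding encrypted-data processing costs on non-sensitive data."""
    clear, enc = [], []
    for r in records:
        if is_sensitive(r):
            enc.append(encrypt(r))
        else:
            clear.append(r)
    return clear, enc

# Toy stand-ins: a record is "sensitive" if it carries a salary field;
# the encrypt lambda is a placeholder, NOT real encryption.
records = [{"name": "A", "dept": "HR"}, {"name": "B", "salary": 90000}]
clear, enc = partition_and_outsource(
    records,
    is_sensitive=lambda r: "salary" in r,
    encrypt=lambda r: {"blob": str(sorted(r.items()))[::-1]})
```

The partitioned security criterion discussed above constrains exactly this split: the cleartext partition and the access pattern across partitions must not leak information about the encrypted one.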
We give a simple polynomial time approximation scheme for the weighted matroid matching problem on strongly base orderable matroids.
We also show that even the unweighted version of this problem is NP-complete and not in oracle-coNP.
Describing the color and textural information of a person image is one of the most crucial aspects of person re-identification (re-id).
In this paper, we present novel meta-descriptors based on a hierarchical distribution of pixel features.
Although hierarchical covariance descriptors have been successfully applied to image classification, the mean information of pixel features, which is absent from the covariance, tends to be the major discriminative information for person re-id.
To solve this problem, we describe a local region in an image via hierarchical Gaussian distribution in which both means and covariances are included in their parameters.
More specifically, the region is modeled as a set of multiple Gaussian distributions in which each Gaussian represents the appearance of a local patch.
The characteristics of the set of Gaussians are again described by another Gaussian distribution.
In both steps, we embed the parameters of the Gaussian into a point of Symmetric Positive Definite (SPD) matrix manifold.
By changing the way to handle mean information in this embedding, we develop two hierarchical Gaussian descriptors.
Additionally, we develop feature norm normalization methods with the ability to alleviate the biased trends that exist in the descriptors.
The experimental results conducted on five public datasets indicate that the proposed descriptors achieve remarkably high performance on person re-id.
We present Edina, the University of Edinburgh's social bot for the Amazon Alexa Prize competition.
Edina is a conversational agent whose responses utilize data harvested from Amazon Mechanical Turk (AMT) through an innovative new technique we call self-dialogues.
These are conversations in which a single AMT Worker plays both participants in a dialogue.
Such dialogues are surprisingly natural, efficient to collect and reflective of relevant and/or trending topics.
These self-dialogues provide training data for a generative neural network as well as a basis for soft rules used by a matching score component.
Each match of a soft rule against a user utterance is associated with a confidence score which we show is strongly indicative of reply quality, allowing this component to self-censor and be effectively integrated with other components.
Edina's full architecture features a rule-based system backing off to a matching score, backing off to a generative neural network.
Our hybrid data-driven methodology thus addresses both coverage limitations of a strictly rule-based approach and the lack of guarantees of a strictly machine-learning approach.
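Edina's backoff architecture (rules, then matching score, then generative model) can be sketched as a small dispatcher. The components below are hypothetical stand-ins: the real system's rules, matcher, and neural generator are far richer.

```python
def respond(utterance, rules, matcher, generator, threshold=0.5):
    """Edina-style backoff (sketch): exact rules first, then the matching
    score component if its confidence clears a threshold, else the
    generative neural network as a last resort."""
    if utterance in rules:
        return rules[utterance]
    reply, confidence = matcher(utterance)
    if confidence >= threshold:      # self-censoring via the match score
        return reply
    return generator(utterance)

# Hypothetical components for illustration.
rules = {"hello": "Hi there!"}
matcher = lambda u: ("Tell me more about that.",
                     0.9 if "music" in u else 0.1)
generator = lambda u: "Interesting, go on."
```

The confidence threshold is what lets the matching component self-censor, so only the utterances it handles well ever reach the user.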
Datacenters running on-line, data-intensive applications (OLDIs) consume significant amounts of energy.
However, reducing their energy is challenging due to their tight response time requirements.
A key aspect of OLDIs is that each user query goes to all or many of the nodes in the cluster, so that the overall time budget is dictated by the tail of the replies' latency distribution; replies see latency variations both in the network and compute.
Previous work proposes to achieve load-proportional energy by slowing down the computation at lower datacenter loads based directly on response times (i.e., at lower loads, the proposal exploits the average slack in the time budget provisioned for the peak load).
In contrast, we propose TimeTrader to reduce energy by exploiting the latency slack in the sub-critical replies which arrive before the deadline (e.g., 80% of replies are 3-4x faster than the tail).
This slack is present at all loads and subsumes the previous work's load-related slack.
While the previous work shifts the leaves' response time distribution to consume the slack at lower loads, TimeTrader reshapes the distribution at all loads by slowing down individual sub-critical nodes without increasing missed deadlines.
TimeTrader exploits slack in both the network and compute budgets.
Further, TimeTrader leverages Earliest Deadline First scheduling to largely decouple critical requests from the queuing delays of sub-critical requests which can then be slowed down without hurting critical requests.
A combination of real-system measurements and at-scale simulations shows that without adding to missed deadlines, TimeTrader saves 15-19% and 41-49% energy at 90% and 30% loading, respectively, in a datacenter with 512 nodes, whereas previous work saves 0% and 31-37%.
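The Earliest Deadline First policy TimeTrader leverages is simply a priority queue keyed on deadlines, as in this stdlib sketch (request names and deadlines are made up):

```python
import heapq

def edf_order(requests):
    """requests: (deadline, request_id) pairs. Return the service order
    under Earliest Deadline First: always pop the minimum deadline, so
    tight-deadline (critical) work never waits behind slack-rich work."""
    heap = list(requests)
    heapq.heapify(heap)
    order = []
    while heap:
        _, rid = heapq.heappop(heap)
        order.append(rid)
    return order

# A critical request (tight deadline) jumps ahead of sub-critical ones.
order = edf_order([(40, "sub1"), (10, "critical"), (35, "sub2")])
```

Because critical requests are served first, the sub-critical ones behind them can be slowed down to save energy without adding missed deadlines, which is the core of TimeTrader's argument.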
Prediction of new drug-target interactions is extremely important as it can lead the researchers to find new uses for old drugs and to realize the therapeutic profiles or side effects thereof.
However, experimental prediction of drug-target interactions is expensive and time-consuming.
As a result, computational methods for prediction of new drug-target interactions have gained much interest in recent times.
We present iDTI-ESBoost, a prediction model for identification of drug-target interactions using evolutionary and structural features.
Our proposed method uses a novel balancing technique and a boosting technique for the binary classification problem of drug-target interaction.
On four benchmark datasets taken from gold standard data, iDTI-ESBoost outperforms the state-of-the-art methods in terms of area under the receiver operating characteristic (auROC) curve. iDTI-ESBoost also outperforms the latest and best-performing method in the literature to date in terms of area under the precision-recall (auPR) curve.
This is significant as auPR curves are argued to be more appropriate as a metric for comparison for imbalanced datasets, like the one studied in this research.
Further, our experiments establish the effectiveness of the classifier, the balancing methods, and the novel features incorporated in iDTI-ESBoost. iDTI-ESBoost is the first prediction method to exploit structural features along with evolutionary features to predict drug-protein interactions.
We believe the excellent performance of iDTI-ESBoost both in terms of auROC and auPR would motivate the researchers and practitioners to use it to predict drug-target interactions.
To facilitate that, iDTI-ESBoost is readily available for use at: http://farshidrayhan.pythonanywhere.com/iDTI-ESBoost/
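The abstract mentions a novel balancing technique without detailing it here; random undersampling of the majority class, sketched below, illustrates the generic idea such techniques build on (drug-target pairs are typically heavily skewed toward negatives).

```python
import random

def undersample(X, y, seed=0):
    """Balance a binary dataset by randomly undersampling the majority
    class. Generic illustration only; iDTI-ESBoost's own balancing
    technique is described in the paper, not reproduced here."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    major, minor = (neg, pos) if len(neg) > len(pos) else (pos, neg)
    rng = random.Random(seed)
    keep = minor + rng.sample(major, len(minor))
    return [X[i] for i in keep], [y[i] for i in keep]

X = [[i] for i in range(10)]
y = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]   # 2 positives, 8 negatives
Xb, yb = undersample(X, y)
```

Balancing before boosting is also why auPR, which the abstract argues for, is the right lens: it is far more sensitive than auROC to performance on the rare positive class.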
We investigate the complexity of bounding the uncertainty of graphical games, and we provide new insight into the intrinsic difficulty of computing Nash equilibria.
In particular, we show that, if one adds very simple and natural additional requirements to a graphical game, the existence of Nash equilibria is no longer guaranteed, and computing an equilibrium is an intractable problem.
Moreover, if stronger equilibrium conditions are required for the game, we get hardness results for the second level of the polynomial hierarchy.
Our results offer a clear picture of the complexity of mixed Nash equilibria in graphical games, and answer some open research questions posed by Conitzer and Sandholm (2003).
This book consists of the chapters describing novel approaches to integrating fault tolerance into software development process.
They cover a wide range of topics focusing on fault tolerance during the different phases of the software development, software engineering techniques for verification and validation of fault tolerance means, and languages for supporting fault tolerance specification and implementation.
Accordingly, the book is structured into the following three parts: Part A: Fault tolerance engineering: from requirements to code; Part B: Verification and validation of fault tolerant systems; Part C: Languages and Tools for engineering fault tolerant systems.
One of the novelties brought by 5G is that wireless system design has increasingly turned its focus to guaranteeing reliability and latency.
This shifts the design objective of random access protocols from throughput optimization towards constraints based on reliability and latency.
For this purpose, we use frameless ALOHA, which relies on successive interference cancellation (SIC), and derive its exact finite-length analysis of the statistics of the unresolved users (reliability) as a function of the contention period length (latency).
The presented analysis can be used to derive the reliability-latency guarantees.
We also optimize the scheme parameters in order to maximize the reliability within a given latency.
Our approach represents an important step towards the general area of design and analysis of access protocols with reliability-latency guarantees.
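The successive interference cancellation at the core of frameless ALOHA can be sketched as an iterative peeling process: resolve any slot with exactly one transmission, cancel that user's replicas everywhere, and repeat until no singleton remains. The slot contents below are made up.

```python
def sic_decode(slots):
    """slots: list of sets of user ids transmitting in each slot.
    Iteratively resolve singleton slots and cancel the resolved user's
    replicas from every other slot (the peeling decoder behind SIC)."""
    slots = [set(s) for s in slots]
    resolved = set()
    progress = True
    while progress:
        progress = False
        for s in slots:
            if len(s) == 1:
                (user,) = s
                resolved.add(user)
                for t in slots:
                    t.discard(user)   # interference cancellation
                progress = True
    return resolved

# Slot 0 is a singleton; cancelling user 1 turns slot 1 into a new
# singleton, which in turn unlocks slot 2.
resolved = sic_decode([{1}, {1, 2}, {2, 3}])
```

The finite-length analysis in the paper characterizes exactly the statistics of the users this peeling process leaves unresolved, as a function of how many slots (latency) the contention period spans.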
Music, speech, and acoustic scene sound are often handled separately in the audio domain because of their different signal characteristics.
However, as the image domain grows rapidly thanks to versatile image classification models, it is necessary to study extensible classification models in the audio domain as well.
In this study, we approach this problem using two types of sample-level deep convolutional neural networks that take raw waveforms as input and use filters with small granularity.
One is a basic model that consists of convolution and pooling layers.
The other is an improved model that additionally has residual connections, squeeze-and-excitation modules and multi-level concatenation.
We show that the sample-level models reach state-of-the-art performance levels for the three different categories of sound.
Also, we visualize the filters along layers and compare the characteristics of learned filters.
The future of Internet of Things (IoT) is already upon us.
IoT applications have been widely used in many fields of social production and social living, such as healthcare, energy, and industrial automation.
While we enjoy the convenience and efficiency that IoT brings, new threats from IoT have also emerged.
An increasing number of research works aim to mitigate these threats, but many problems remain open.
To better understand the essential reasons of new threats and the challenges in current research, this survey first proposes the concept of "IoT features".
Then, the security and privacy effects of eight new IoT features are discussed, including the threats they cause, existing solutions, and challenges yet to be solved.
To help researchers follow the up-to-date works in this field, this paper finally illustrates the developing trend of IoT security research and reveals how IoT features affect existing security research by investigating most existing research works related to IoT security from 2013 to 2017.
With the proliferation of web technologies, it is becoming increasingly important to make the traditional negotiation pricing mechanism automated and intelligent.
The behaviour of software agents which negotiate on behalf of humans is determined by their tactics in the form of decision functions.
Prediction of partners' behaviour in negotiation has been an active research direction in recent years, as it improves the utility gain of an adaptive negotiation agent and helps it reach agreement more quickly or secure higher benefits.
In this paper we review various negotiation methods and existing architectures.
Although negotiation is a very complex activity to automate without human intervention, we propose an architecture for predicting the opponent's behaviour that takes into consideration the various factors affecting the negotiation process.
The basic concept is that the information about negotiators, their individual actions and dynamics can be used by software agents equipped with adaptive capabilities to learn from past negotiations and assist in selecting appropriate negotiation tactics.
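A widely cited example of the decision functions mentioned above is the time-dependent tactic of Faratin et al., where an agent concedes from its initial offer toward its reservation value as the deadline nears, with a rate parameter β. The numbers below are illustrative.

```python
def time_dependent_offer(t, deadline, initial, reservation, beta):
    """Time-dependent negotiation tactic: concede from `initial` toward
    `reservation` as t approaches the deadline. beta < 1 concedes late
    (Boulware), beta > 1 concedes early (Conceder)."""
    alpha = min(1.0, t / deadline) ** (1.0 / beta)
    return initial + alpha * (reservation - initial)

# A seller conceding linearly from 100 down to 60 over 10 rounds.
opening = time_dependent_offer(0, 10, 100.0, 60.0, beta=1.0)
final = time_dependent_offer(10, 10, 100.0, 60.0, beta=1.0)
```

An adaptive agent that can estimate its opponent's β from past offers, as the proposed architecture envisions, can time its own concessions to capture more of the surplus.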
Mel Frequency Cepstral Coefficients (MFCCs) are the most popularly used speech features in most speech and speaker recognition applications.
In this work, we propose a modified Mel filter bank to extract MFCCs from subsampled speech.
We also propose a stronger metric which effectively captures the correlation between the MFCCs of original speech and the MFCCs of resampled speech.
It is found that the proposed method of filter bank construction performs remarkably well and gives recognition performance on resampled speech close to the recognition accuracy on original speech.
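The modified filter bank for subsampled speech is not specified in this abstract, but it builds on the standard mel filter bank, whose edge frequencies are computed by spacing filters uniformly on the mel scale:

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_edges(n_filters, f_low, f_high):
    """Edge frequencies (Hz) of triangular mel filters spaced uniformly
    on the mel scale between f_low and f_high; edge i-1, i, i+1 delimit
    the i-th triangular filter."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]

# e.g. a 10-filter bank covering 0-4 kHz (8 kHz sampled speech).
edges = mel_filter_edges(10, 0.0, 4000.0)
```

Subsampling shrinks the usable band (f_high halves with each factor-of-two subsampling), which is exactly the regime the paper's modified filter bank addresses.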
This paper performs the first investigation into depth for large-scale human action recognition in video where the depth cues are estimated from the videos themselves.
We develop a new framework called depth2action and thoroughly investigate how best to incorporate the depth information.
We introduce spatio-temporal depth normalization (STDN) to enforce temporal consistency in our estimated depth sequences.
We also propose modified depth motion maps (MDMM) to capture the subtle temporal changes in depth.
These two components significantly improve the action recognition performance.
We evaluate our depth2action framework on three large-scale action recognition video benchmarks.
Our model achieves state-of-the-art performance when combined with appearance and motion information thus demonstrating that depth2action is indeed complementary to existing approaches.
This paper studies a spectrum estimation method for the case that the samples are obtained at a rate lower than the Nyquist rate.
The method is referred to as the correlogram for undersampled data.
The algorithm partitions the spectrum into a number of segments and estimates the average power within each spectral segment.
This method is able to estimate the power spectrum density of a signal from undersampled data without essentially requiring the signal to be sparse.
We derive the bias and the variance of the spectrum estimator, and show that there is a tradeoff between the accuracy of the estimation, the frequency resolution, and the complexity of the estimator.
A closed-form approximation of the estimation variance is also derived, which clearly shows how the variance is related to different parameters.
The asymptotic behavior of the estimator is also investigated, and it is proved that this spectrum estimator is consistent.
Moreover, the estimation made for different spectral segments becomes uncorrelated as the signal length tends to infinity.
Finally, numerical examples and simulation results are provided, which support the theoretical conclusions.
Everyday place descriptions often contain place names of fine-grained features, such as buildings or businesses, that are more difficult to disambiguate than names referring to larger places, for example cities or natural geographic features.
Fine-grained places are often significantly more frequent and more similar to each other, and disambiguation heuristics developed for larger places, such as those based on population or containment relationships, are often not applicable in these cases.
In this research, we address the disambiguation of fine-grained place names from everyday place descriptions.
For this purpose, we evaluate the performance of different existing clustering-based approaches, since clustering approaches require no knowledge other than the locations of the ambiguous place names.
We consider not only approaches developed specifically for place name disambiguation, but also clustering algorithms developed for general data mining that could potentially be leveraged.
We compare these methods with a novel algorithm, and show that the novel algorithm outperforms the other algorithms in terms of disambiguation precision and distance error over several tested datasets.
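A minimal example of the clustering-based idea: candidate interpretations of the place names in one description tend to lie near each other, so linking candidates within a distance threshold groups the spatially coherent interpretation. This is a generic single-link sketch (planar coordinates for brevity), not the paper's novel algorithm.

```python
import math

def cluster_locations(points, max_dist):
    """Single-link threshold clustering via union-find: points closer
    than max_dist end up in the same cluster. A simple stand-in for the
    clustering step in place-name disambiguation."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_dist:
                parent[find(i)] = find(j)
    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(points[i])
    return list(clusters.values())

# Two tight groups of candidate coordinates, far apart from each other.
pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
groups = cluster_locations(pts, max_dist=2.0)
```

The densest or most coherent cluster is then taken as the joint disambiguation of the names in the description.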
In this work we introduce the application of black-box quantum control as an interesting reinforcement learning problem to the machine learning community.
We analyze the structure of the reinforcement learning problems arising in quantum physics and argue that agents parameterized by long short-term memory (LSTM) networks trained via stochastic policy gradients yield a general method for solving them.
In this context we introduce a variant of the proximal policy optimization (PPO) algorithm called the memory proximal policy optimization (MPPO) which is based on this analysis.
We then show how it can be applied to specific learning tasks and present results of numerical experiments showing that our method achieves state-of-the-art results for several learning tasks in quantum control with discrete and continuous control parameters.
Storms and other severe weather events can result in fatalities, injuries, and property damage.
Therefore, preventing such outcomes to the extent possible is a key concern, and the scientific community faces an increasing demand for regularly updated appraisals of evolving climate conditions and extreme weather.
NOAA's Storm Events Database is undoubtedly an invaluable resource to the general public, to the professional, and to the researcher.
Due to such importance, the primary objective of this study was to explore this database and get clues about its reliability.
A complete investigation of the damage estimates and injury or fatality figures is unfeasible due to the size of the database.
However, an exploratory data analysis with the resources of the R statistical data analysis language found that damage reports are missing in more than half of the records, that part of the damage values are incorrect, and that, despite all efforts of standardizations, non-standard event type names are still finding their way into the database.
These few results are enough to demonstrate that the database suffers from incompleteness and inconsistencies and should not be used without reservations and appropriate precautions before advancing any inferences from the data.
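The missing-damage check described above was done in R; the same kind of exploratory count translates directly to a few lines of Python. The records below are a made-up excerpt mimicking the database's shape (the real Storm Events field is DAMAGE_PROPERTY, but these values are invented).

```python
import csv
import io

# Hypothetical excerpt standing in for NOAA Storm Events records.
raw = """EVENT_TYPE,DAMAGE_PROPERTY
Thunderstorm Wind,25K
Hail,
Flash Flood,
Tornado,1.5M
"""

rows = list(csv.DictReader(io.StringIO(raw)))
missing = sum(1 for r in rows if not r["DAMAGE_PROPERTY"].strip())
missing_share = missing / len(rows)
```

Applied to the full database, this is the style of check that revealed damage reports missing in more than half of the records.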
This paper presents a parallel memetic algorithm for solving the vehicle routing problem with time windows (VRPTW).
The VRPTW is a well-known NP-hard discrete optimization problem with two objectives.
The main objective is to minimize the number of vehicles serving customers scattered on the map, and the second one is to minimize the total distance traveled by the vehicles.
Here, the fleet size is minimized in the first phase of the proposed method using the parallel heuristic algorithm (PHA), and the traveled distance is minimized in the second phase by the parallel memetic algorithm (PMA).
In both parallel algorithms, the parallel components co-operate periodically in order to exchange the best solutions found so far.
An extensive experimental study performed on the Gehring and Homberger's benchmark proves the high convergence capabilities and robustness of both PHA and PMA.
Also, we present the speedup analysis of the PMA.
It is generally believed that the preference ranking method PROMETHEE has a quadratic time complexity.
In this paper, however, we present an exact algorithm that computes PROMETHEE's net flow scores in time O(qn log(n)), where q represents the number of criteria and n the number of alternatives.
The method is based on first sorting the alternatives after which the unicriterion flow scores of all alternatives can be computed in one scan over the sorted list of alternatives while maintaining a sliding window.
This method works with the linear and level criterion preference functions.
The algorithm we present is exact and, due to the sub-quadratic time complexity, vastly extends the applicability of the PROMETHEE method.
Experiments show that with the new algorithm, PROMETHEE can scale up to millions of tuples.
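The sort-then-scan idea can be sketched for the linear preference function: after sorting, alternatives fully below the window contribute +1 each, those fully above contribute -1 each, and the window in between is handled with prefix sums, giving O(n log n) per criterion. This is an illustrative reconstruction consistent with the described approach, not the paper's exact code.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate

def net_flows_linear(values, p):
    """Unicriterion net flows under the linear preference function with
    threshold p, in O(n log n): for each alternative, the pairwise
    preference difference is clamp((v - v')/p, -1, 1), summed via
    binary search and prefix sums instead of the naive O(n^2) loop."""
    n = len(values)
    srt = sorted(values)
    prefix = [0.0] + list(accumulate(srt))
    flows = []
    for v in values:
        lo = bisect_right(srt, v - p)   # fully preferred: +1 each
        hi = bisect_left(srt, v + p)    # fully dominated: -1 each
        k = hi - lo                     # partial window (v itself adds 0)
        partial = (k * v - (prefix[hi] - prefix[lo])) / p
        flows.append((lo + partial - (n - hi)) / (n - 1))
    return flows

flows = net_flows_linear([0.0, 1.0, 3.0], p=2.0)
```

Multicriterion net flow scores are then weighted sums of these unicriterion flows, so the whole PROMETHEE ranking stays sub-quadratic.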
Lipschitz extensions were recently proposed as a tool for designing node differentially private algorithms.
However, efficiently computable Lipschitz extensions were known only for 1-dimensional functions (that is, functions that output a single real value).
In this paper, we study efficiently computable Lipschitz extensions for multi-dimensional (that is, vector-valued) functions on graphs.
We show that, unlike for 1-dimensional functions, Lipschitz extensions of higher-dimensional functions on graphs do not always exist, even with a non-unit stretch.
We design Lipschitz extensions with small stretch for the sorted degree list and for the degree distribution of a graph.
Crucially, our extensions are efficiently computable.
We also develop new tools for employing Lipschitz extensions in the design of differentially private algorithms.
Specifically, we generalize the exponential mechanism, a widely used tool in data privacy.
The exponential mechanism is given a collection of score functions that map datasets to real values.
It attempts to return the name of the function with nearly minimum value on the data set.
Our generalized exponential mechanism provides better accuracy when the sensitivity of an optimal score function is much smaller than the maximum sensitivity of score functions.
We use our Lipschitz extension and the generalized exponential mechanism to design a node-differentially private algorithm for releasing an approximation to the degree distribution of a graph.
Our algorithm is much more accurate than algorithms from previous work.
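For reference, the standard exponential mechanism the paper generalizes can be sketched in a few lines; the scores below are made up, and the generalized variant's per-function sensitivity handling is not reproduced here.

```python
import math
import random

def exponential_mechanism(scores, sensitivity, epsilon, rng=None):
    """Standard exponential mechanism for (approximately) minimizing a
    score: return index i with probability proportional to
    exp(-epsilon * scores[i] / (2 * sensitivity))."""
    rng = rng or random.Random(0)
    weights = [math.exp(-epsilon * s / (2.0 * sensitivity))
               for s in scores]
    r = rng.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(scores) - 1

# Made-up candidate scores; lower score = better answer.
choice = exponential_mechanism([0.1, 5.0, 9.0],
                               sensitivity=1.0, epsilon=2.0)
```

The generalization described above improves on this by exploiting cases where the optimal score function has much smaller sensitivity than the worst one, rather than paying for the maximum sensitivity everywhere.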
Machine translation has recently achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale parallel corpora.
There have been numerous attempts to extend these successes to low-resource language pairs, yet requiring tens of thousands of parallel sentences.
In this work, we take this research direction to the extreme and investigate whether it is possible to learn to translate even without any parallel data.
We propose a model that takes sentences from monolingual corpora in two different languages and maps them into the same latent space.
By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data.
We demonstrate our model on two widely used datasets and two language pairs, reporting BLEU scores of 32.8 and 15.1 on the Multi30k and WMT English-French datasets, without using even a single parallel sentence at training time.
In this work we present a novel approach for single depth map super-resolution.
Modern consumer depth sensors, especially Time-of-Flight sensors, produce dense depth measurements, but are affected by noise and have a low lateral resolution.
We propose a method that combines the benefits of recent advances in machine learning based single image super-resolution, i.e. deep convolutional networks, with a variational method to recover accurate high-resolution depth maps.
In particular, we integrate a variational method that models the piecewise affine structures apparent in depth data via an anisotropic total generalized variation regularization term on top of a deep network.
We call our method ATGV-Net and train it end-to-end by unrolling the optimization procedure of the variational method.
To train deep networks, a large corpus of training data with accurate ground-truth is required.
We demonstrate that it is feasible to train our method solely on synthetic data that we generate in large quantities for this task.
Our evaluations show that we achieve state-of-the-art results on three different benchmarks, as well as on a challenging Time-of-Flight dataset, all without utilizing an additional intensity image as guidance.
Relational cost analysis aims at formally establishing bounds on the difference in the evaluation costs of two programs.
As a particular case, one can also use relational cost analysis to establish bounds on the difference in the evaluation cost of the same program on two different inputs.
One way to perform relational cost analysis is to use a relational type-and-effect system that supports reasoning about relations between two executions of two programs.
Building on this basic idea, we present a type-and-effect system, called ARel, for reasoning about the relative cost of array-manipulating, higher-order functional-imperative programs.
The key ingredient of our approach is a new lightweight type refinement discipline that we use to track relations (differences) between two arrays.
This discipline combined with Hoare-style triples built into the types allows us to express and establish precise relative costs of several interesting programs which imperatively update their data.
By providing substantial amounts of data and standardized evaluation protocols, datasets in computer vision have helped fuel advances across all areas of visual recognition.
But even in light of breakthrough results on recent benchmarks, it is still fair to ask if our recognition algorithms are doing as well as we think they are.
The vision sciences at large make use of a very different evaluation regime known as Visual Psychophysics to study visual perception.
Psychophysics is the quantitative examination of the relationships between controlled stimuli and the behavioral responses they elicit in experimental test subjects.
Instead of using summary statistics to gauge performance, psychophysics directs us to construct item-response curves made up of individual stimulus responses to find perceptual thresholds, thus allowing one to identify the exact point at which a subject can no longer reliably recognize the stimulus class.
In this article, we introduce a comprehensive evaluation framework for visual recognition models that is underpinned by this methodology.
Over millions of procedurally rendered 3D scenes and 2D images, we compare the performance of well-known convolutional neural networks.
Our results bring into question recent claims of human-like performance, and provide a path forward for correcting newly surfaced algorithmic deficiencies.
The unrestricted block relocation problem is an important optimization problem encountered at terminals, where containers are stored in stacks.
It consists in determining the minimum number of container moves so as to empty the considered bay following a certain retrieval sequence.
A container move can be either the retrieval of a container or the relocation of a certain container on top of a stack to another stack.
The latter types of moves are necessary so as to provide access to containers which are currently not on top of a stack.
They might also be useful to prepare future removals.
In this paper, we propose the first local search type improvement heuristic for the block relocation problem.
It relies on a clever definition of the state space which is explored by means of a dynamic programming algorithm so as to identify the locally optimal sequence of moves of a given container.
Our results on large benchmark instances reveal unexpectedly high improvement potentials (up to 50%) compared to results obtained by state-of-the-art constructive heuristics.
For a graph formed by vertices and weighted edges, a generalized minimum dominating set (MDS) is a vertex set of smallest cardinality such that the summed weight of edges from each outside vertex to vertices in this set is equal to or larger than certain threshold value.
This generalized MDS problem reduces to the conventional MDS problem in the limiting case of all the edge weights being equal to the threshold value.
We treat the generalized MDS problem in the present paper by a replica-symmetric spin glass theory and derive a set of belief-propagation equations.
As a practical application we consider the problem of extracting a set of sentences that best summarize a given input text document.
We carry out a preliminary test of the statistical physics-inspired method to this automatic text summarization problem.
Feature engineering is a crucial step in the process of predictive modeling.
It involves the transformation of given feature space, typically using mathematical functions, with the objective of reducing the modeling error for a given target.
However, there is no well-defined basis for performing effective feature engineering.
It involves domain knowledge, intuition, and most of all, a lengthy process of trial and error.
The human attention involved in overseeing this process significantly influences the cost of model generation.
We present a new framework to automate feature engineering.
It is based on performance driven exploration of a transformation graph, which systematically and compactly enumerates the space of given options.
A highly efficient exploration strategy is derived through reinforcement learning on past examples.
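The performance-driven exploration of transforms can be illustrated with a minimal one-step sketch. Here the absolute Pearson correlation with the target stands in for the reduction in modeling error, and the transform set is a small hand-picked one; both are illustrative assumptions, not the framework's actual operators or exploration policy:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

# Hypothetical unary transforms; a real system enumerates many more.
TRANSFORMS = {
    "identity": lambda v: v,
    "log1p": math.log1p,
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
}

def best_transform(feature, target):
    """One step of performance-driven exploration: pick the transform
    whose output correlates most strongly (in absolute value) with the
    target."""
    scores = {name: abs(pearson([t(v) for v in feature], target))
              for name, t in TRANSFORMS.items()}
    return max(scores, key=scores.get)
```

A quadratic target is recovered by the `square` transform, since its output correlates perfectly with the target while the raw feature does not.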
Leadership games provide a powerful paradigm to model many real-world settings.
Most literature focuses on games with a single follower who acts optimistically, breaking ties in favour of the leader.
Unfortunately, for real-world applications, this is unlikely.
In this paper, we look for efficiently solvable games with multiple followers who play either optimistically or pessimistically, i.e., breaking ties in favour or against the leader.
We study the computational complexity of finding or approximating an optimistic or pessimistic leader-follower equilibrium in specific classes of succinct games---polymatrix like---which are equivalent to 2-player Bayesian games with uncertainty over the follower, with interdependent or independent types.
Furthermore, we provide an exact algorithm to find a pessimistic equilibrium for those game classes.
Finally, we show that in general polymatrix games the computation is harder even when players are forced to play pure strategies.
Recently, many methods to interpret and visualize deep neural network predictions have been proposed and significant progress has been made.
However, a more class-discriminative and visually pleasing explanation is required.
Thus, this paper proposes a region-based approach that estimates feature importance in terms of appropriately segmented regions.
By fusing the saliency maps generated from multi-scale segmentations, a more class-discriminative and visually pleasing map is obtained.
We incorporate this regional multi-scale concept into a prediction difference method that is model-agnostic.
An input image is segmented in several scales using the super-pixel method, and exclusion of a region is simulated by sampling a normal distribution constructed using the boundary prior.
The experimental results demonstrate that the regional multi-scale method produces much more class-discriminative and visually pleasing saliency maps.
We describe an adaptation and application of a search-based structured prediction algorithm "Searn" to unsupervised learning problems.
We show that it is possible to reduce unsupervised learning to supervised learning and demonstrate a high-quality unsupervised shift-reduce parsing model.
We additionally show a close connection between unsupervised Searn and expectation maximization.
Finally, we demonstrate the efficacy of a semi-supervised extension.
The key idea that enables this is an application of the predict-self idea for unsupervised learning.
DeepPrior is a simple approach based on Deep Learning that predicts the joint 3D locations of a hand given a depth map.
Since its publication in early 2015, it has been outperformed by several impressive works.
Here we show that with simple improvements: adding ResNet layers, data augmentation, and better initial hand localization, we achieve better or similar performance than more sophisticated recent methods on the three main benchmarks (NYU, ICVL, MSRA) while keeping the simplicity of the original method.
Our new implementation is available at https://github.com/moberweger/deep-prior-pp.
Given a binary nonlinear code, we provide a deterministic algorithm to compute its weight and distance distribution, and in particular its minimum weight and its minimum distance, which takes advantage of fast Fourier techniques.
This algorithm's performance is similar to that of the best-known algorithms for the average case, while it is especially efficient for codes with low information rate.
We provide complexity estimates for several cases of interest.
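The Fourier idea can be illustrated with the Walsh-Hadamard transform: the distance distribution of a (possibly nonlinear) binary code is the dyadic autocorrelation of its indicator function, which the transform diagonalizes. A minimal sketch under this textbook formulation (not necessarily the paper's exact algorithm; the toy code below is illustrative):

```python
def wht(a):
    """In-place Walsh-Hadamard transform of a list of length 2**n."""
    h = 1
    while h < len(a):
        for i in range(0, len(a), h * 2):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a

def distance_distribution(code, n):
    """Distance distribution B_i of a length-n binary code given as a
    set of integers (bitmask codewords)."""
    size = 1 << n
    f = [1 if x in code else 0 for x in range(size)]  # indicator of C
    F = wht(f[:])
    conv = wht([v * v for v in F])  # equals 2^n * (f star f)
    B = [0.0] * (n + 1)
    for x in range(size):
        # conv[x]/size counts codeword pairs whose XOR is x
        B[bin(x).count("1")] += conv[x] / size / len(code)
    return B
```

For the repetition code {000, 111}, each codeword pairs with itself at distance 0 and with the other at distance 3, so B = [1, 0, 0, 1] and the minimum distance is 3.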
Given a point set P in 2D, the problem of finding the smallest set of unit disks that cover all of P is NP-hard.
We present a simple algorithm for this problem with an approximation factor of 25/6 in the Euclidean norm and 2 in the max norm, by restricting the disk centers to lie on parallel lines.
The run time and space of this algorithm is O(n log n) and O(n) respectively.
This algorithm extends to any Lp norm and is asymptotically faster than known alternative approximation algorithms for the same approximation factor.
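The line-restriction idea is easiest to see in the max norm, where a unit disk is an axis-aligned square of side 2: restricting centers to horizontal lines spaced 2 apart turns each strip into a 1D greedy interval cover. A sketch of that reduction (the paper's exact placement rules may differ):

```python
def strip_cover(points):
    """Cover points with L-infinity unit disks (squares of side 2)
    whose centers lie on the horizontal lines y = 2k + 1.  Within each
    strip [2k, 2k+2) the problem is a greedy 1D interval cover."""
    strips = {}
    for x, y in points:
        strips.setdefault(int(y // 2), []).append(x)
    centers = []
    for k, xs in strips.items():
        xs.sort()
        last = None
        for x in xs:
            if last is None or x > last + 1:   # outside current square
                last = x + 1                   # new center covers [x, x+2]
                centers.append((x + 1, 2 * k + 1))
    return centers
```

Sorting dominates, giving the O(n log n) time and O(n) space mentioned above.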
For future mmWave mobile communication systems, the use of analog/hybrid beamforming is envisioned to be a key aspect.
Beam synthesis is a key technology for enabling the best possible operation during beam search, data transmission, and MU-MIMO operation.
The developed method for synthesizing beams is based on previous work in radar technology that considered only phased array antennas.
With this technique it is possible to generate a desired beam of any shape within the constraints of the target transceiver antenna front end.
It is not constrained to a certain antenna array geometry, but can handle 1D, 2D, and even 3D antenna array geometries such as cylindrical arrays.
The numerical examples show that the method can synthesize beams by considering a user defined tradeoff between gain, transition width and passband ripples.
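The object being shaped here is the array factor: the complex weights applied to the elements determine the beam's gain pattern over angle. A minimal sketch for a uniform linear array (standard textbook formula, not the paper's synthesis method; spacing and weights are illustrative):

```python
import cmath
import math

def array_factor(weights, theta, d=0.5):
    """Magnitude of the array factor of a uniform linear array with
    element spacing d (in wavelengths) at angle theta (radians from
    broadside)."""
    k = 2 * math.pi  # wavenumber in wavelength units
    return abs(sum(w * cmath.exp(1j * k * d * n * math.sin(theta))
                   for n, w in enumerate(weights)))

# Uniform weights steer the main beam at broadside (theta = 0).
weights = [1.0] * 8
angles = [math.radians(a) for a in range(-90, 91)]
gains = [array_factor(weights, t) for t in angles]
```

A synthesis method like the one described searches over the weights to trade off gain, transition width, and passband ripple of this pattern.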
This report describes an initial replication study of the PRECISE system and develops a clearer, more formal description of the approach.
Based on our evaluation, we conclude that the PRECISE results do not fully replicate.
However, the formalization developed here suggests a road map to further enhance and extend the approach pioneered by PRECISE.
After a long, productive discussion with Ana-Maria Popescu (one of the authors of PRECISE) we got more clarity on the PRECISE approach and how the lexicon was authored for the GEO evaluation.
Based on this we built a more direct implementation over a repaired formalism.
Although our new evaluation is not yet complete, it is clear that the system is performing much better now.
We will continue developing our ideas and implementation and produce a future report/publication that more accurately evaluates PRECISE-like approaches.
Distributed representations of words have been shown to capture lexical semantics, as demonstrated by their effectiveness in word similarity and analogical relation tasks.
But, these tasks only evaluate lexical semantics indirectly.
In this paper, we study whether it is possible to utilize distributed representations to generate dictionary definitions of words, as a more direct and transparent representation of the embeddings' semantics.
We introduce definition modeling, the task of generating a definition for a given word and its embedding.
We present several definition model architectures based on recurrent neural networks, and experiment with the models over multiple data sets.
Our results show that a model that controls dependencies between the word being defined and the definition words performs significantly better, and that a character-level convolution layer designed to leverage morphology can complement word-level embeddings.
Finally, an error analysis suggests that the errors made by a definition model may provide insight into the shortcomings of word embeddings.
One way to analyze Cyber-Physical Systems is by modeling them as hybrid automata.
Since reachability analysis for hybrid nonlinear automata is a very challenging and computationally expensive problem, in practice, engineers try to solve the requirements falsification problem.
In one method, the falsification problem is solved by minimizing a robustness metric induced by the requirements.
This optimization problem is usually a non-convex non-smooth problem that requires heuristic and analytical guidance to be solved.
In this paper, functional gradient descent for hybrid systems is utilized for locally decreasing the robustness metric.
The local descent method is combined with Simulated Annealing as a global optimization method to search for unsafe behaviors.
Semantic segmentation is challenging as it requires both object-level information and pixel-level accuracy.
Recently, FCN-based systems gained great improvement in this area.
Unlike classification networks, combining features of different layers plays an important role in these dense prediction models, as these features contain information at different levels.
A number of models have been proposed to show how to use these features.
However, the best architecture for making use of features from different layers remains an open question.
In this paper, we propose a module, called mixed context network, and show that our presented system outperforms most existing semantic segmentation systems by making use of this module.
Deformable image registration is a fundamental task in medical image analysis, aiming to establish a dense and non-linear correspondence between a pair of images.
Previous deep-learning studies usually employ supervised neural networks to directly learn the spatial transformation from one image to another, requiring task-specific ground-truth registration for model training.
Due to the difficulty in collecting precise ground-truth registration, implementation of these supervised methods is practically challenging.
Although several unsupervised networks have been recently developed, these methods usually ignore the inherent inverse-consistent property (essential for diffeomorphic mapping) of transformations between a pair of images.
Also, existing approaches usually encourage the to-be-estimated transformation to be locally smooth via a smoothness constraint only, which could not completely avoid folding in the resulting transformation.
To this end, we propose an Inverse-Consistent deep Network (ICNet) for unsupervised deformable image registration.
Specifically, we develop an inverse-consistent constraint to encourage that a pair of images are symmetrically deformed toward one another, until both warped images are matched.
Besides using the conventional smoothness constraint, we also propose an anti-folding constraint to further avoid folding in the transformation.
The proposed method does not require any supervision information, while encouraging the diffeomorphic property of the transformation via the proposed inverse-consistent and anti-folding constraints.
We evaluate our method on T1-weighted brain magnetic resonance imaging (MRI) scans for tissue segmentation and anatomical landmark detection, with results demonstrating the superior performance of our ICNet over several state-of-the-art approaches for deformable image registration.
Our code will be made publicly available.
The concepts of MIMO MC-CDMA are not new, but the new technologies to improve their functioning are an emerging area of research.
In general, most mobile communication systems transmit bits of information in the radio space to the receiver.
The radio channels in mobile radio systems are usually multipath fading channels, which cause inter-symbol interference (ISI) in the received signal.
To remove ISI from the signal, a strong equalizer is needed.
In this thesis we focus on simulating MIMO MC-CDMA systems in MATLAB and designing the channel estimation for them.
Given a pseudoword over suitable pseudovarieties, we associate to it a labeled linear order determined by the factorizations of the pseudoword.
We show that, in the case of the pseudovariety of aperiodic finite semigroups, the pseudoword can be recovered from the labeled linear order.
Recent work has shown that optical flow estimation can be formulated as a supervised learning task and can be successfully solved with convolutional networks.
Training of the so-called FlowNet was enabled by a large synthetically generated dataset.
The present paper extends the concept of optical flow estimation via convolutional networks to disparity and scene flow estimation.
To this end, we propose three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks.
Our datasets are the first large-scale datasets to enable training and evaluating scene flow methods.
Besides the datasets, we present a convolutional network for real-time disparity estimation that provides state-of-the-art results.
By combining a flow and disparity estimation network and training it jointly, we demonstrate the first scene flow estimation with a convolutional network.
We propose a straightforward method that simultaneously reconstructs the 3D facial structure and provides dense alignment.
To achieve this, we design a 2D representation called UV position map which records the 3D shape of a complete face in UV space, then train a simple Convolutional Neural Network to regress it from a single 2D image.
We also integrate a weight mask into the loss function during training to improve the performance of the network.
Our method does not rely on any prior face model, and can reconstruct full facial geometry along with semantic meaning.
Meanwhile, our network is very lightweight and spends only 9.8 ms to process an image, which is much faster than previous works.
Experiments on multiple challenging datasets show that our method surpasses other state-of-the-art methods on both reconstruction and alignment tasks by a large margin.
In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text.
Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models.
We propose a novel technique called dual language models, which involves building two complementary monolingual language models and combining them using a probabilistic model for switching between the two.
We evaluate the efficacy of our approach using a conversational Mandarin-English speech corpus.
We prove the robustness of our model by showing significant improvements in perplexity measures over the standard bilingual language model without the use of any external information.
Similar consistent improvements are also reflected in automatic speech recognition error rates.
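The dual-LM combination can be sketched as a mixture: with the switching probability, the next word is drawn from the other language's model instead of the current one. Unigram models stand in here for the full monolingual LMs, and all names and probabilities are illustrative:

```python
def dual_lm_prob(word, prev_lang, lm, p_switch):
    """P(word | history) under a dual language model: stay in the
    current language with probability 1 - p_switch, or switch to the
    other language with probability p_switch.
    `lm` maps language -> (word -> unigram probability)."""
    other = "en" if prev_lang == "zh" else "zh"
    return ((1 - p_switch) * lm[prev_lang].get(word, 0.0)
            + p_switch * lm[other].get(word, 0.0))
```

Because the two components are proper distributions, the mixture remains a proper distribution over the union vocabulary for any p_switch in [0, 1].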
Understanding the fundamental end-to-end delay performance in mobile ad hoc networks (MANETs) is of great importance for supporting Quality of Service (QoS) guaranteed applications in such networks.
While upper bounds and approximations for end-to-end delay in MANETs have been developed in the literature, they usually introduce significant errors in delay analysis, and the modeling of exact end-to-end delay in MANETs remains a technical challenge.
This is partially due to the highly dynamic behavior of MANETs, but also due to the lack of an efficient theoretical framework to capture such dynamics.
This paper demonstrates the potential application of the powerful Quasi-Birth-and-Death (QBD) theory in tackling the challenging issue of exact end-to-end delay modeling in MANETs.
We first apply the QBD theory to develop an efficient theoretical framework for capturing the complex dynamics in MANETs.
We then show that with the help of this framework, closed form models can be derived for the analysis of exact end-to-end delay and also per node throughput capacity in MANETs.
Simulation and numerical results are further provided to illustrate the efficiency of these QBD theory-based models as well as our theoretical findings.
There are many local texture features, each varying in the way it is implemented, and each algorithm tries to improve performance.
In this paper, an attempt is made to present a theoretically simple and computationally effective approach for face recognition.
In our implementation the face image is divided into 3x3 sub-regions from which the features are extracted using the Local Binary Pattern (LBP) over a window, fuzzy membership function and at the central pixel.
The LBP features possess the texture discriminative property and their computational cost is very low.
By utilising the information from the LBP, the membership function, and the central pixel, the limitations of the traditional LBP are eliminated.
Benchmark databases such as ORL and Sheffield are used for the evaluation of the proposed features with an SVM classifier.
For the proposed approach, K-fold cross-validation and ROC curves are obtained and the results are compared.
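The basic LBP operator underlying these features thresholds each 3x3 neighbourhood against its centre pixel to form an 8-bit code (the fuzzy-membership and central-pixel extensions described above are omitted from this sketch):

```python
def lbp_code(patch):
    """8-bit Local Binary Pattern of a 3x3 patch (list of 3 rows):
    each neighbour, visited clockwise from the top-left, contributes a
    set bit if its value is >= the centre pixel."""
    c = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum(1 << i for i, v in enumerate(neighbours) if v >= c)
```

Histograms of these codes over sub-regions give the texture-discriminative descriptor, and the computation is just comparisons and bit shifts, which is why the cost is so low.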
Resource leak bugs in Android apps are pervasive and can cause serious performance degradation and system crashes.
In recent years, several resource leak detection techniques have been proposed to assist Android developers in correctly managing system resources.
Yet, there exist no common bug benchmarks for effectively and reliably comparing such techniques and quantitatively evaluating their strengths and weaknesses.
This paper describes our initial contribution towards constructing such a benchmark.
To locate real resource leak bugs, we mined 124,215 code revisions of 34 large-scale open-source Android apps.
We successfully found 298 fixed resource leaks, which cover a diverse set of resource classes, from 32 out of the 34 apps.
To understand the characteristics of these bugs, we conducted an empirical study, which revealed the root causes of frequent resource leaks in Android apps and common patterns of faults made by developers.
With our findings, we further implemented a static checker to detect a common pattern of resource leaks in Android apps.
Experiments showed that the checker can effectively locate real resource leaks in popular Android apps, confirming the usefulness of our work.
In a previous paper, we discussed whether Bitcoin and/or its blockchain could be considered a complex system and, if so, whether a chaotic one; a positive answer would raise concerns about the likelihood of Bitcoin/blockchain entering a chaotic regime, with catastrophic consequences for financial systems based on it.
This paper intends to simplify and extend that analysis to other PoW, PoS, and hybrid protocol-based cryptocurrencies.
As before, this study was carried out with the help of Information Theory of Complex Systems, in general, and Crutchfield's Statistical Complexity measure, in particular.
This paper is a work-in-progress.
We intend to uncover other measures that capture the qualitative notion of complexity of systems, which can be applied to these cryptocurrencies and compared with the results obtained here.
Recently, the millimeter wave (mmWave) bands have been investigated as a means to support the foreseen extreme data rate demands of next-generation cellular networks (5G).
However, in order to overcome the severe isotropic path loss and the harsh propagation experienced at such high frequencies, a dense base station deployment is required, which may be infeasible because of the unavailability of fiber drops to provide wired backhauling.
To address this challenge, the 3GPP is investigating the concept of Integrated Access and Backhaul (IAB), i.e., the possibility of providing wireless backhaul to the mobile terminals.
In this paper, we (i) extend the capabilities of the existing mmWave module for ns-3 to support advanced IAB functionalities, and (ii) evaluate the end-to-end performance of the IAB architecture through system-level full-stack simulations in terms of experienced throughput and communication latency.
We finally provide guidelines on how to design optimal wireless backhaul solutions in the presence of resource-constrained and traffic-congested mmWave scenarios.
The multi-armed bandit problem has been extensively studied under the stationary assumption.
However in reality, this assumption often does not hold because the distributions of rewards themselves may change over time.
In this paper, we propose a change-detection (CD) based framework for multi-armed bandit problems under the piecewise-stationary setting, and study a class of change-detection based UCB (Upper Confidence Bound) policies, CD-UCB, that actively detects change points and restarts the UCB indices.
We then develop CUSUM-UCB and PHT-UCB, that belong to the CD-UCB class and use cumulative sum (CUSUM) and Page-Hinkley Test (PHT) to detect changes.
We show that CUSUM-UCB obtains the best known regret upper bound under mild assumptions.
We also demonstrate the regret reduction of the CD-UCB policies over arbitrary Bernoulli rewards and Yahoo! datasets of webpage click-through rates.
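The change-detection component can be sketched with a one-sided CUSUM test: a CD-UCB policy would run one such detector per arm and restart that arm's UCB statistics when it fires. Thresholds and drift parameters below are illustrative, not the paper's tuned values:

```python
class Cusum:
    """One-sided CUSUM test for an upward mean shift away from mu0.
    Fires when the accumulated positive drift exceeds threshold h."""
    def __init__(self, mu0, eps=0.05, h=2.0):
        self.mu0, self.eps, self.h, self.g = mu0, eps, h, 0.0

    def update(self, x):
        # Accumulate evidence of rewards exceeding mu0 + eps.
        self.g = max(0.0, self.g + x - self.mu0 - self.eps)
        if self.g > self.h:   # change detected: reset and signal restart
            self.g = 0.0
            return True
        return False

# Stationary rewards around 0.2, then an abrupt shift to 0.9.
det = Cusum(mu0=0.2)
alarms = [det.update(x) for x in [0.2] * 20 + [0.9] * 10]
```

Before the change point the statistic stays at zero; after the shift it grows by roughly 0.65 per sample and crosses the threshold within a few observations, which is the restart signal the UCB indices would consume.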
We propose an inexact method for the graph Fourier transform of a graph signal, as defined by the signal decomposition over the Jordan subspaces of the graph adjacency matrix.
This method projects the signal over the generalized eigenspaces of the adjacency matrix, which accelerates the transform computation over large, sparse, and directed adjacency matrices.
The trade-off between execution time and fidelity to the original graph structure is discussed.
In addition, properties such as a generalized Parseval's identity and total variation ordering of the generalized eigenspaces are discussed.
The method is applied to 2010-2013 NYC taxi trip data to identify traffic hotspots on the Manhattan grid.
Our results show that identical highly expressed geolocations can be identified with the inexact method and the method based on eigenvector projections, while reducing computation time by a factor of 26,000 and reducing energy dispersal among the spectral components corresponding to the multiple zero eigenvalue.
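The projection idea is easiest to see on a symmetric adjacency matrix, where the eigenspaces are orthogonal and the projection is exact (the paper's method targets the harder directed case with Jordan subspaces). A small sketch with an illustrative 4-node path graph:

```python
import numpy as np

# 4-node path graph: symmetric adjacency, orthogonal eigenvectors.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
vals, V = np.linalg.eigh(A)            # eigenvalues and eigenvectors

s = np.array([1.0, -2.0, 0.5, 3.0])    # graph signal
s_hat = V.T @ s                        # projections onto eigenspaces
s_rec = V @ s_hat                      # reconstruction from projections
```

In this symmetric case the projections satisfy Parseval's identity exactly; the generalized eigenspace projections discussed above extend this to defective, directed adjacency matrices at reduced cost.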
The clustering ensemble paradigm has emerged as an effective tool for community detection in multilayer networks, which allows for producing consensus solutions that are designed to be more robust to the algorithmic selection and configuration bias.
However, one limitation is related to the dependency on a co-association threshold that controls the degree of consensus in the community structure solution.
The goal of this work is to overcome this limitation with a new framework of ensemble-based multilayer community detection, which features parameter-free identification of consensus communities based on generative models of graph pruning that are able to filter out noisy co-associations.
We also present an enhanced version of the modularity-driven ensemble-based multilayer community detection method, in which community memberships of nodes are reconsidered to optimize the multilayer modularity of the consensus solution.
Experimental evidence on real-world networks confirms the beneficial effect of using model-based filtering methods and also shows the superiority of the proposed method over state-of-the-art multilayer community detection approaches.
User engagement refers to the amount of interaction an instance (e.g., a tweet, news article, or forum post) achieves.
Ranking items on social media websites by the amount of user participation they attract can be useful in different applications, such as recommender systems.
In this paper, we treat a tweet containing a rating for a movie as an instance and focus on ranking each user's instances by engagement, i.e., the total number of retweets and favorites each will gain.
For this task, we define several features which can be extracted from the meta-data of each tweet.
The features are partitioned into three categories: user-based, movie-based, and tweet-based.
We show that in order to obtain good results, features from all categories should be considered.
We exploit regression and learning to rank methods to rank the tweets and propose to aggregate the results of regression and learning to rank methods to achieve better performance.
We have run our experiments on an extended version of MovieTweeting dataset provided by ACM RecSys Challenge 2014.
The results show that the learning-to-rank approach outperforms most of the regression models and that the combination can improve performance significantly.
Automatic profiling of social media users is an important task for supporting a multitude of downstream applications.
While a number of studies have used social media content to extract and study collective social attributes, there is a lack of substantial research that addresses the detection of a user's industry.
We frame this task as classification using both feature engineering and ensemble learning.
Our industry-detection system uses both posted content and profile information to detect a user's industry with 64.3% accuracy, significantly outperforming the majority baseline in a taxonomy of fourteen industry classes.
Our qualitative analysis suggests that a person's industry not only affects the words used and their perceived meanings, but also the number and type of emotions being expressed.
The spread of online reviews, ratings, and opinions, and their growing influence on people's behavior and decisions, has boosted the interest in extracting meaningful information from this data deluge.
Hence, crowdsourced ratings of products and services gained a critical role in business, governments, and others.
We propose a new reputation-based ranking system utilizing multipartite rating subnetworks, that clusters users by their similarities, using Kolmogorov complexity.
Our system is novel in that it reflects a diversity of opinions/preferences by assigning possibly distinct rankings, for the same item, for different groups of users.
We prove the convergence and efficiency of the system and show that it copes better with spamming/spurious users, and it is more robust to attacks than state-of-the-art approaches.
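Since Kolmogorov complexity is uncomputable, similarity measures of this kind are usually approximated with a real compressor, as in the normalized compression distance. A sketch of that standard approximation (zlib stands in for the complexity measure; the system's actual clustering criterion may differ):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: compressed length approximates
    the (uncomputable) Kolmogorov complexity, so similar inputs yield
    a small distance."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the cat sat on the mat " * 20   # highly regular rating history
b = bytes(range(256)) * 2             # dissimilar, high-entropy data
```

Users whose rating histories compress well together get a small pairwise distance and land in the same cluster.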
Future 5G and Internet of Things (IoT) applications will heavily rely on long-range communication technologies such as low-power wireless area networks (LPWANs).
In particular, LoRaWAN built on LoRa physical layer is gathering increasing interests, both from academia and industries, for enabling low-cost energy efficient IoT wireless sensor networks for, e.g., environmental monitoring over wide areas.
While its communication range may go up to 20 kilometers, the achievable bit rates in LoRaWAN are limited to a few kilobits per second.
In the event of collisions, the perceived rate is further reduced due to packet loss and retransmissions.
Firstly, to alleviate the harmful impacts of collisions, we propose a decoding algorithm that enables to resolve several superposed LoRa signals.
Our proposed method exploits the slight desynchronization of superposed signals and specific features of LoRa physical layer.
Secondly, we design a full MAC protocol enabling collision resolution.
The simulation results demonstrate that the proposed method outperforms conventional LoRaWAN jointly in terms of system throughput, energy efficiency as well as delay.
These results show that our scheme is well suited for 5G and IoT systems, as one of their major goals is to provide the best trade-off among these performance objectives.
We study nonlinear power systems consisting of generators, generator buses, and non-generator buses.
First, looking at a generator and its bus' variables jointly, we introduce a synchronization concept for a pair of such joint generators and buses.
We show that this concept is related to graph symmetry.
Next, we extend, in two ways, the synchronization from a pair to a partition of all generators in the networks and show that they are related to either graph symmetry or equitable partitions.
Finally, we show how an exact reduced model can be obtained by aggregating the generators and associated buses in the network when the original system is synchronized with respect to a partition, provided that the initial condition respects the partition.
Additionally, the aggregation-based reduced model is again a power system.
In this article we introduce the principles to detect leakage using a mathematical model based on machine learning and domestic water consumption monitoring in real time.
The model uses data which is measured from a water meter, analyzes the water consumption, and uses two criteria simultaneously: deviation from the average consumption, and comparison of steady water consumptions over a period of time.
The model was simulated on a regular household consumer using Antileaks, a device we built to transfer consumption information from an analogue water meter into digital form in real time.
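The two criteria can be sketched as follows: flag a leak when recent consumption deviates far from the long-run average, or when flow never drops to zero over a window (a household should show idle periods; steady nonzero flow suggests a leak). Window size and deviation factor are illustrative, not the model's calibrated parameters:

```python
def leak_suspected(readings, window=6, dev_factor=2.0):
    """Apply both criteria to a series of meter readings (flow per
    interval).  Criterion 1: recent average exceeds dev_factor times
    the long-run average.  Criterion 2: no zero-flow interval in the
    last `window` readings."""
    avg = sum(readings) / len(readings)
    recent = readings[-window:]
    high_deviation = (sum(recent) / len(recent) > dev_factor * avg
                      if avg > 0 else False)
    steady_flow = all(r > 0 for r in recent)
    return high_deviation or steady_flow

normal = [0, 5, 0, 3, 0, 4, 0, 2, 0, 0, 0, 6]   # idle periods present
leaky = [0, 5, 0, 3, 0, 4, 2, 2, 2, 2, 2, 2]    # constant residual flow
```

The normal pattern is not flagged, while the constant residual flow trips the steady-consumption criterion even though its average is unremarkable.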
In this paper, we present a communication-free algorithm for distributed coverage of an arbitrary network by a group of mobile agents with local sensing capabilities.
The network is represented as a graph, and the agents are arbitrarily deployed on some nodes of the graph.
Any node of the graph is covered if it is within the sensing range of at least one agent.
The agents are mobile devices that aim to explore the graph and to optimize their locations in a decentralized fashion by relying only on their sensory inputs.
We formulate this problem in a game theoretic setting and propose a communication-free learning algorithm for maximizing the coverage.
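One communication-free scheme in this spirit is payoff-based learning with marginal-contribution utilities: an agent tentatively hops to a neighbouring node and keeps the move only if its own coverage contribution strictly improves. A sketch under those assumptions (the paper's actual learning rule and utility may differ); with this utility, total coverage is a potential function and never decreases:

```python
import random

def covered(graph, positions):
    """Nodes within sensing range (one hop) of at least one agent."""
    cov = set()
    for p in positions:
        cov.add(p)
        cov.update(graph[p])
    return cov

def marginal(graph, positions, i):
    """Agent i's utility: nodes it alone adds to the coverage."""
    rest = positions[:i] + positions[i + 1:]
    return len(covered(graph, positions)) - len(covered(graph, rest))

def learn(graph, positions, steps=200, seed=0):
    """Each round, a random agent tries a random neighbouring node and
    reverts unless its marginal coverage strictly improves."""
    rng = random.Random(seed)
    for _ in range(steps):
        i = rng.randrange(len(positions))
        old, u_old = positions[i], marginal(graph, positions, i)
        positions[i] = rng.choice(graph[old])
        if marginal(graph, positions, i) <= u_old:
            positions[i] = old  # revert: no strict improvement
    return positions

# 6-node path graph; both agents start stacked on node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
agents = learn(path, [0, 0])
```

An accepted move changes total coverage by exactly the agent's utility gain, which is the potential-game argument behind convergence of such dynamics.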
We discuss the suitability of spreadsheet processors as tools for programming streaming systems.
We argue that, while spreadsheets can function as powerful models for stream operators, their fundamental boundedness limits their scope of application.
We propose two extensions to the spreadsheet model and argue their utility in the context of programming streaming systems.
Finding the product of two polynomials is an essential and basic problem in computer algebra.
While most previous results have focused on the worst-case complexity, we instead employ the technique of adaptive analysis to give an improvement in many "easy" cases.
We present two adaptive measures and methods for polynomial multiplication, and also show how to effectively combine them to gain both advantages.
One useful feature of these algorithms is that they essentially provide a gradient between existing "sparse" and "dense" methods.
We prove that these approaches provide significant improvements in many cases but in the worst case are still comparable to the fastest existing algorithms.
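The sparse/dense gradient mentioned above can be illustrated with a toy dispatcher that picks a representation based on input density. The paper's actual adaptive measures are more refined, so the density measure and threshold here should be read as assumptions.

```python
def mul_dense(a, b):
    """Schoolbook product of coefficient lists (a[i] = coeff of x^i)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def mul_sparse(a, b):
    """Product of {exponent: coeff} dicts; cost scales with #terms."""
    out = {}
    for e1, c1 in a.items():
        for e2, c2 in b.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c}

def mul_adaptive(a, b, density_threshold=0.5):
    """Pick a representation from the inputs' density, a crude stand-in
    for the paper's adaptive measures."""
    def density(p):
        return len(p) / (max(p) + 1) if p else 0.0
    if density(a) < density_threshold and density(b) < density_threshold:
        return mul_sparse(a, b)
    da = [a.get(i, 0) for i in range(max(a) + 1)]
    db = [b.get(i, 0) for i in range(max(b) + 1)]
    return {i: c for i, c in enumerate(mul_dense(da, db)) if c}

# (x^100 + 1) * (x^50 + 2): sparse inputs take the sparse path
print(mul_adaptive({100: 1, 0: 1}, {50: 1, 0: 2}))
```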
Learning social media content is the basis of many real-world applications, including information retrieval and recommendation systems, among others.
In contrast with previous works that focus mainly on single-modal or bi-modal learning, we propose to learn social media content by jointly fusing textual, acoustic, and visual information (JTAV).
Effective strategies, namely attBiGRU and DCRNN, are proposed to extract fine-grained features of each modality.
We also introduce cross-modal fusion and attentive pooling techniques to integrate multi-modal information comprehensively.
Extensive experimental evaluation conducted on real-world datasets demonstrates that our proposed model outperforms the state-of-the-art approaches by a large margin.
Automatic Offline Handwritten Signature Verification has been researched over the last few decades from several perspectives, drawing insights from graphology, computer vision, and signal processing, among others.
In spite of the advancements in the field, building classifiers that can separate genuine signatures from skilled forgeries (forgeries made targeting a particular signature) is still hard.
We propose approaching the problem from a feature learning perspective.
Our hypothesis is that, in the absence of a good model of the data generation process, it is better to learn the features from data, instead of using hand-crafted features that have no resemblance to the signature generation process.
To this end, we use Deep Convolutional Neural Networks to learn features in a writer-independent format, and use this model to obtain a feature representation on another set of users, where we train writer-dependent classifiers.
We tested our method on two datasets: GPDS-960 and Brazilian PUC-PR.
Our experimental results show that the features learned in a subset of the users are discriminative for the other users, including across different datasets, reaching close to the state-of-the-art in the GPDS dataset, and improving the state-of-the-art in the Brazilian PUC-PR dataset.
Empirical evidence shows that co-authored publications achieve higher visibility and impact.
The aim of the current work is to test for the existence of a similar correlation for Italian publications.
We also verify whether such correlation differs: i) by subject category and macro-area; ii) by document type; iii) over the course of time.
The results confirm world-level evidence, showing a consistent and significant linear growth in the citability of a publication with number of co-authors, in almost all subject categories.
The effects are more remarkable in the fields of Social Sciences and Art & Humanities than in the Sciences, a finding not evident from previous studies.
Moreover, our results partly disavow the positive association between number of authors and prestige of the journal, as measured by its impact factor.
We consider the problem of constructing fast and small parallel prefix adders for non-uniform input arrival times.
This problem arises whenever the adder is embedded into a more complex circuit, e.g., a multiplier.
Most previous results are based on representing binary carry-propagate adders as so-called parallel prefix graphs, in which pairs of generate and propagate signals are combined using complex gates known as prefix gates.
Adders constructed in this model usually minimize the delay in terms of these prefix gates.
However, the delay in terms of logic gates can be worse by a factor of two.
In contrast, we aim to minimize the delay of the underlying logic circuit directly.
We prove a lower bound on the delay of a carry bit computation achievable by any prefix carry bit circuit and develop an algorithm that computes a prefix carry bit circuit with optimum delay up to a small additive constant.
Furthermore, we use this algorithm to construct a small parallel prefix adder.
Compared to existing algorithms we simultaneously improve the delay and size guarantee, as well as the running time for constructing prefix carry bit and adder circuits.
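For readers unfamiliar with the prefix-graph formulation, the following sketch shows the generate/propagate operator that prefix gates implement; a sequential scan stands in for the parallel prefix tree, which computes the same carry values.

```python
# Sketch of the generate/propagate formulation underlying parallel
# prefix adders: the prefix operator "o" combines (g, p) pairs, and a
# prefix scan over it yields every carry bit.
def prefix_combine(left, right):
    """(g, p) o (g', p') = (g or (p and g'), p and p')."""
    g1, p1 = left
    g2, p2 = right
    return (g1 or (p1 and g2), p1 and p2)

def carries(a_bits, b_bits, c_in=False):
    """Carry into each position, LSB first.  A real prefix adder
    evaluates this scan as a parallel tree of prefix gates."""
    gp = [(a and b, a != b) for a, b in zip(a_bits, b_bits)]  # (generate, propagate)
    out, acc = [c_in], (c_in, True)
    for g, p in gp:   # sequential scan standing in for the prefix tree
        acc = prefix_combine((g, p), acc)
        out.append(acc[0])
    return out

# 0b0111 + 0b0001: the carry ripples through the three low bits
a = [True, True, True, False]    # 7, LSB first
b = [True, False, False, False]  # 1
print(carries(a, b))
```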
In this paper, we present a method for instance ranking and retrieval at fine-grained level based on the global features extracted from a multi-attribute recognition model which is not dependent on landmarks information or part-based annotations.
Further, we make this architecture suitable for mobile-device application by adopting the bilinear CNN to make the multi-attribute recognition model smaller (in terms of the number of parameters).
The experiments run on the Dress category of DeepFashion In-Shop Clothes Retrieval and CUB200 datasets show that the results of instance retrieval at fine-grained level are promising for these datasets, specially in terms of texture and color.
The nonstationary nature of signals and nonlinear systems requires a time-frequency representation.
For a time-domain signal, frequency information is derived from the phase of Gabor's analytic signal, which is obtained in practice via the inverse Fourier transform.
This study presents time-frequency analysis by the Fourier transform which maps the time-domain signal into the frequency-domain.
In this study, we derive the time information from the phase of the frequency-domain signal and obtain the time-frequency representation.
In order to obtain the time information in the Fourier domain, we define the concept of 'frequentaneous time', which is the frequency derivative of the phase.
This is very similar to the group delay, which is also defined as the frequency derivative of the phase and provides physical meaning only when it is positive.
The frequentaneous time is always positive or negative depending upon whether signal is defined for only positive or negative times, respectively.
If a signal is defined for both positive and negative times, then we divide the signal into two parts, signal for positive times and signal for negative times.
The proposed frequentaneous time and Fourier transform based time-frequency distribution contains only those frequencies which are present in the Fourier spectrum.
Simulations and numerical results, on simulated as well as real data, demonstrate the efficacy of the proposed method for the time-frequency analysis of a signal.
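A minimal numerical illustration of frequentaneous time: for a pure delay, the frequency derivative of the DFT phase recovers the delay exactly. The adjacent-bin phase-difference estimator below is an illustrative choice that avoids explicit phase unwrapping, not the paper's exact procedure.

```python
import cmath

def dft(x):
    """Direct DFT, adequate for a tiny illustration."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def frequentaneous_time(x):
    """Per-bin time localization from the frequency derivative of the
    phase, estimated via adjacent-bin phase differences."""
    X = dft(x)
    N = len(x)
    times = []
    for k in range(N - 1):
        dphi = cmath.phase(X[k + 1] * X[k].conjugate())  # phase step
        times.append(-dphi * N / (2 * cmath.pi))
    return times

# x[n] = delta[n - 5]: every bin localizes the signal at time 5
n0 = 5
x = [0.0] * 16
x[n0] = 1.0
print([round(t, 6) for t in frequentaneous_time(x)])  # all ~5.0
```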
Training complex machine learning models for prediction often requires a large amount of data that is not always readily available.
Leveraging these external datasets from related but different sources is therefore an important task if good predictive models are to be built for deployment in settings where data can be rare.
In this paper we propose a novel approach to the problem in which we use multiple GAN architectures to learn to translate from one dataset to another, thereby allowing us to effectively enlarge the target dataset, and therefore learn better predictive models than if we simply used the target dataset.
We show the utility of such an approach, demonstrating that our method improves prediction performance on the target domain over using just the target dataset, and show that our framework outperforms several other benchmarks on a collection of real-world medical datasets.
We consider the problem of communication over a network containing a hidden and malicious adversary that can control a subset of network resources, and aims to disrupt communications.
We focus on omniscient node-based adversaries, i.e., the adversaries can control a subset of nodes, and know the message, network code and packets on all links.
Characterizing information-theoretically optimal communication rates as a function of network parameters and bounds on the adversarially controlled network is in general open, even for unicast (single source, single destination) problems.
In this work we characterize the information-theoretically optimal randomized capacity of such problems, i.e., under the assumption that the source node shares (an asymptotically negligible amount of) independent common randomness with each network node a priori (for instance, as part of network design).
We propose a novel computationally-efficient communication scheme whose rate matches a natural information-theoretic "erasure outer bound" on the optimal rate.
Our schemes require no prior knowledge of network topology, and can be implemented in a distributed manner as an overlay on top of classical distributed linear network coding.
This paper considers linear discrete-time systems with additive disturbances, and designs a Model Predictive Control (MPC) law to minimise a quadratic cost function subject to a chance constraint.
The chance constraint is defined as a discounted sum of violation probabilities on an infinite horizon.
By penalising violation probabilities close to the initial time and ignoring violation probabilities in the far future, this form of constraint enables the feasibility of the online optimisation to be guaranteed without an assumption of boundedness of the disturbance.
A computationally convenient MPC optimisation problem is formulated using Chebyshev's inequality and we introduce an online constraint-tightening technique to ensure recursive feasibility based on knowledge of a suboptimal solution.
The closed loop system is guaranteed to satisfy the chance constraint and a quadratic stability condition.
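To make the role of Chebyshev's inequality concrete, the following sketch derives a deterministic tightened bound from a violation budget; the one-sided use of the two-sided bound is an illustrative simplification of the paper's formulation.

```python
import math

# Illustrative constraint tightening via Chebyshev's inequality: to
# enforce P(x > x_max) <= eps for a state with known mean and standard
# deviation sigma, it suffices (conservatively) to require
#   mean + k * sigma <= x_max   with   1 / k**2 = eps,
# since P(|x - mean| >= k * sigma) <= 1 / k**2.
def tightened_bound(x_max, sigma, eps):
    """Deterministic bound on the mean guaranteeing the chance
    constraint with violation probability at most eps."""
    k = 1.0 / math.sqrt(eps)
    return x_max - k * sigma

# With x_max = 10, sigma = 0.5 and a 5% violation budget:
print(round(tightened_bound(10.0, 0.5, 0.05), 3))  # 7.764
```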
We study the task of Byzantine gathering in a network modeled as a graph.
Despite the presence of Byzantine agents, all the other (good) agents, starting from possibly different nodes and applying the same deterministic algorithm, have to meet at the same node in finite time and stop moving.
An adversary chooses the initial nodes of the agents and assigns a different label to each of them.
The agents move in synchronous rounds and communicate with each other only when located at the same node.
Within the team, f of the agents are Byzantine.
A Byzantine agent acts in an unpredictable way: in particular it may forge the label of another agent or create a completely new one.
Besides its label, which corresponds to a local knowledge, an agent is assigned some global knowledge GK that is common to all agents.
In the literature, the Byzantine gathering problem has been analyzed in arbitrary n-node graphs by considering the scenario when GK=(n,f) and the scenario when GK=f.
In the first (resp. second) scenario, it has been shown that the minimum number of good agents guaranteeing deterministic gathering of all of them is f+1 (resp. f+2).
For both these scenarios, all the existing deterministic algorithms, whether or not they are optimal in terms of required number of good agents, have a time complexity that is exponential in n and L, where L is the largest label belonging to a good agent.
In this paper, we seek to design a deterministic solution for Byzantine gathering that makes a concession on the proportion of Byzantine agents within the team, but that offers a significantly lower complexity.
We also seek to use global knowledge whose binary representation is short.
Assuming that the agents form a strong team, i.e., a team in which the number of good agents is at least some prescribed value that is quadratic in f, we give both positive and negative results.
Recommender systems benefit us in tackling the problem of information overload by predicting our potential choices among diverse niche objects.
So far, a variety of personalized recommendation algorithms have been proposed and most of them are based on similarities, such as collaborative filtering and mass diffusion.
Here, we propose a novel vertex similarity index named CosRA, which combines advantages of both the cosine index and the resource-allocation (RA) index.
By applying the CosRA index to real recommender systems including MovieLens, Netflix and RYM, we show that the CosRA-based method has better performance in accuracy, diversity and novelty than some benchmark methods.
Moreover, the CosRA index is free of parameters, which is a significant advantage in real applications.
Further experiments show that introducing two tunable parameters does not remarkably improve the overall performance of the CosRA index.
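A sketch of one plausible form of such an index follows; the exact CosRA definition is given in the paper, so the particular combination here (the resource-allocation sum over common neighbors, normalized as in the cosine index) should be read as an assumption.

```python
from collections import defaultdict
import math

# Hedged sketch of a cosine/resource-allocation hybrid similarity on a
# user-object bipartite graph.  Assumed form: the RA sum over common
# neighbors, normalized by sqrt(k_x * k_y) as in the cosine index.
def cosra(adj, x, y):
    """adj: {user: set of objects} for one side of a bipartite graph."""
    common = adj[x] & adj[y]
    degree = defaultdict(int)          # object degrees
    for nbrs in adj.values():
        for z in nbrs:
            degree[z] += 1
    ra = sum(1.0 / degree[z] for z in common)          # RA term
    return ra / math.sqrt(len(adj[x]) * len(adj[y]))   # cosine normalization

# Toy example: three users rating four items
adj = {"u1": {"a", "b", "c"}, "u2": {"b", "c", "d"}, "u3": {"d"}}
print(round(cosra(adj, "u1", "u2"), 4))  # 0.3333
```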
A lower bound for the interleaving distance on persistence vector spaces is given in terms of rank invariants.
This offers an alternative proof of the stability of rank invariants.
This paper proposes a hybrid self-adaptive evolutionary algorithm for graph coloring that is hybridized with the following novel elements: heuristic genotype-phenotype mapping, a swap local search heuristic, and a neutral survivor selection operator.
This algorithm was compared with the evolutionary algorithm with the SAW method of Eiben et al., the Tabucol algorithm of Hertz and de Werra, and the hybrid evolutionary algorithm of Galinier and Hao.
The performance of these algorithms was tested on a test suite of randomly generated 3-colorable graphs with various structural features, such as graph size, type, edge density, and variability in the sizes of color classes.
Furthermore, the test graphs were generated including the phase transition where the graphs are hard to color.
The purpose of the extensive experimental work was threefold: to investigate the behavior of the tested algorithms in the phase transition, to identify what impact hybridization with the DSatur traditional heuristic has on the evolutionary algorithm, and to show how graph structural features influence the performance of the graph-coloring algorithms.
The results indicate that the performance of the hybrid self-adaptive evolutionary algorithm is comparable with, or better than, the performance of the hybrid evolutionary algorithm, which is one of the best graph-coloring algorithms today.
Moreover, the fact that all the considered algorithms performed poorly on flat graphs confirms that this type of graph is indeed the hardest to color.
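The DSatur heuristic referenced above can be stated compactly: repeatedly color the vertex whose neighbors already use the most distinct colors. A minimal sketch, where tie-breaking by ordinary degree is one common convention:

```python
# DSatur greedy coloring: pick the uncolored vertex with the highest
# saturation degree (number of distinct colors among its neighbors),
# then give it the smallest color not used by its neighbors.
def dsatur(graph):
    """graph: {vertex: set of neighbors}.  Returns {vertex: color}."""
    color = {}
    while len(color) < len(graph):
        def saturation(v):
            return len({color[u] for u in graph[v] if u in color})
        v = max((v for v in graph if v not in color),
                key=lambda v: (saturation(v), len(graph[v])))
        used = {color[u] for u in graph[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# A 5-cycle is not bipartite, so any proper coloring needs 3 colors
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = dsatur(c5)
print(max(coloring.values()) + 1)  # 3
```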
While both cost-sensitive learning and online learning have been studied extensively, the effort in simultaneously dealing with these two issues is limited.
To address this challenging task, a novel learning framework is proposed in this paper.
The key idea is based on the fusion of online ensemble algorithms and state-of-the-art batch-mode cost-sensitive bagging/boosting algorithms.
Within this framework, two separately developed research areas are bridged together, and a batch of theoretically sound online cost-sensitive bagging and online cost-sensitive boosting algorithms are first proposed.
Unlike other online cost-sensitive learning algorithms lacking theoretical analysis of asymptotic properties, the convergence of the proposed algorithms is guaranteed under certain conditions, and the experimental evidence with benchmark data sets also validates the effectiveness and efficiency of the proposed methods.
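The online-bagging building block that such a framework fuses with cost-sensitive methods can be sketched as follows; scaling the Poisson rate by a per-class cost is an illustrative variant here, not necessarily the paper's exact scheme.

```python
import math
import random

# Online bagging (Oza & Russell): each base learner sees each incoming
# example k times with k ~ Poisson(lambda).  An illustrative
# cost-sensitive variant scales lambda by the example's
# misclassification cost, so costly classes are oversampled.
def poisson(lam, rng):
    """Knuth's method; fine for small lambda."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def online_cost_sensitive_bagging(stream, learners, costs, seed=0):
    rng = random.Random(seed)
    for x, y in stream:
        lam = costs[y]                    # cost of misclassifying class y
        for learn in learners:
            for _ in range(poisson(lam, rng)):
                learn(x, y)               # incremental update

# Count how often each class reaches a (dummy) base learner
counts = {0: 0, 1: 0}
def counter(x, y):
    counts[y] += 1
stream = [((0.0,), 0)] * 50 + [((1.0,), 1)] * 50
online_cost_sensitive_bagging(stream, [counter], costs={0: 1.0, 1: 5.0})
print(counts[1] > counts[0])  # the costly class is sampled more often
```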
Object-oriented programming has long been regarded as too inefficient for SIMD high-performance computing, despite the fact that many important applications in HPC have an inherent object structure.
On SIMD accelerators including GPUs, this is mainly due to performance problems with memory allocation: there are a few libraries that support parallel memory allocation directly on accelerator devices, but all of them suffer from uncoalesced memory accesses.
In this work, we present DynaSOAr, a C++/CUDA data layout DSL for object-oriented programming, combined with a parallel dynamic object allocator.
DynaSOAr was designed for a class of object-oriented programs that we call Single-Method Multiple Objects (SMMO), in which parallelism is expressed over a set of objects.
DynaSOAr is the first GPU object allocator that provides a parallel do-all operation, which is the foundation of SMMO applications.
DynaSOAr improves the usage of allocated memory with a Structure of Arrays (SOA) data layout and achieves low memory fragmentation through efficient management of free and allocated memory blocks with lock-free, hierarchical bitmaps.
In our benchmarks, DynaSOAr achieves a significant speedup of application code of up to 3x over state-of-the-art allocators.
Moreover, DynaSOAr manages heap memory more efficiently than other allocators, allowing programmers to run up to 2x larger problem sizes with the same amount of memory.
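The Structure of Arrays layout at the heart of DynaSOAr can be illustrated in plain Python: each field lives in its own contiguous array, so a do-all over objects touches adjacent memory per field, the property that enables coalesced access on a GPU. (DynaSOAr itself is C++/CUDA; this is only a layout illustration.)

```python
from array import array

# SOA layout: instead of an array of objects (fields interleaved in
# memory), every field gets its own contiguous array, so threads
# reading field "x" of neighboring objects hit adjacent addresses.
class BodySoA:
    def __init__(self, n):
        self.x = array("d", [0.0] * n)    # one array per field
        self.vx = array("d", [0.0] * n)

    def move_all(self, dt):
        # a "parallel do-all" over all objects; sequential here for clarity
        for i in range(len(self.x)):
            self.x[i] += self.vx[i] * dt

bodies = BodySoA(4)
for i in range(4):
    bodies.vx[i] = float(i)
bodies.move_all(dt=2.0)
print(list(bodies.x))  # [0.0, 2.0, 4.0, 6.0]
```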
Submodular extensions of an energy function can be used to efficiently compute approximate marginals via variational inference.
The accuracy of the marginals depends crucially on the quality of the submodular extension.
To identify the best possible extension, we show an equivalence between the submodular extensions of the energy and the objective functions of linear programming (LP) relaxations for the corresponding MAP estimation problem.
This allows us to (i) establish the worst-case optimality of the submodular extension for Potts model used in the literature; (ii) identify the worst-case optimal submodular extension for the more general class of metric labeling; and (iii) efficiently compute the marginals for the widely used dense CRF model with the help of a recently proposed Gaussian filtering method.
Using synthetic and real data, we show that our approach provides comparable upper bounds on the log-partition function to those obtained using tree-reweighted message passing (TRW) in cases where the latter is computationally feasible.
Importantly, unlike TRW, our approach provides the first practical algorithm to compute an upper bound on the dense CRF model.
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia.
Special interest is around Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex, to form deep layers of convolutional operations, along with fully connected classifiers.
Hardware implementations of these deep CNN architectures are challenged by memory bottlenecks: the many convolution and fully-connected layers demand a large amount of communication for parallel computation.
Multi-core CPU based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism.
Many-core GPU architectures show superior performance but they consume high power and also have memory constraints due to inconsistencies between cache and main memory.
FPGA design solutions are also actively being explored, which allow implementing the memory hierarchy using embedded BlockRAM.
This boosts the parallel use of shared memory elements between multiple processing units, avoiding data replicability and inconsistencies.
This makes FPGAs potentially powerful solutions for real-time classification of CNNs.
Both Altera and Xilinx have adopted the OpenCL co-design framework from GPUs for FPGA designs as a pseudo-automatic development solution.
In this paper, a comprehensive evaluation and comparison of Altera and Xilinx OpenCL frameworks for a 5-layer deep CNN is presented.
Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed.
Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards.
Altera provides multi-platform tools, a mature design community, and better execution times.
Adaptive tracking-by-detection approaches are popular for tracking arbitrary objects.
They treat the tracking problem as a classification task and use online learning techniques to update the object model.
However, these approaches depend heavily on the efficiency and effectiveness of their detectors.
Evaluating a massive number of samples for each frame (e.g., obtained by a sliding window) forces the detector to trade the accuracy in favor of speed.
Furthermore, misclassification of borderline samples by the detector introduces accumulating errors in tracking.
In this study, we propose a co-tracking framework based on the efficient cooperation of two detectors: a rapid adaptive exemplar-based detector and another more sophisticated but slower detector with a long-term memory.
The sample labeling and co-learning of the detectors are conducted by an uncertainty sampling unit, which improves the speed and accuracy of the system.
We also introduce a budgeting mechanism which prevents the unbounded growth in the number of examples in the first detector to maintain its rapid response.
Experiments demonstrate the efficiency and effectiveness of the proposed tracker against its baselines and its superior performance against state-of-the-art trackers on various benchmark videos.
We study the problem of searching for and tracking a collection of moving targets using a robot with a limited Field-Of-View (FOV) sensor.
The actual number of targets present in the environment is not known a priori.
We propose a search and tracking framework based on the concept of Bayesian Random Finite Sets (RFSs).
Specifically, we generalize the Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter, previously applied to tracking problems, to allow for simultaneous search and tracking with a limited-FOV sensor.
The proposed framework can extract individual target tracks as well as estimate the number and the spatial density of targets.
We also show how to use the Gaussian Process (GP) regression to extract and predict non-linear target trajectories in this framework.
We demonstrate the efficacy of our techniques through representative simulations and real data collected from an aerial robot.
Electronic medical records (EMR) contain longitudinal information about patients that can be used to analyze outcomes.
Typically, studies on EMR data have worked with established variables that have already been acknowledged to be associated with certain outcomes.
However, EMR data may also contain hitherto unrecognized factors for risk association and prediction of outcomes for a disease.
In this paper, we present a scalable data-driven framework to analyze an EMR data corpus in a disease-agnostic way that systematically uncovers important factors influencing outcomes in patients, as supported by data and without expert guidance.
We validate the importance of such factors by using the framework to predict for the relevant outcomes.
Specifically, we analyze EMR data covering approximately 47 million unique patients to characterize renal failure (RF) among type 2 diabetic (T2DM) patients.
We propose a specialized L1 regularized Cox Proportional Hazards (CoxPH) survival model to identify the important factors from those available from patient encounter history.
To validate the identified factors, we use a specialized generalized linear model (GLM) to predict the probability of renal failure for individual patients within a specified time window.
Our experiments indicate that the factors identified via our data-driven method overlap with the patient characteristics recognized by experts.
Our approach allows for scalable, repeatable, and efficient utilization of the data available in EMRs, confirms prior medical knowledge, and can generate new hypotheses without expert supervision.
Cellular networks have special characteristics including highly variable channels, fast fluctuating capacities, deep per user buffers, self-inflicted queuing delays, radio uplink/downlink scheduling delays, etc.
These distinguishing properties make the problem of achieving low latency and high throughput in cellular networks more challenging than in wired networks.
As a result, TCP and its variants, which are generally designed for wired networks, perform poorly in this environment.
To cope with these challenges, we present C2TCP, a flexible end-to-end solution targeting interactive applications requiring high throughput and low delay in cellular networks.
C2TCP stands on top of loss-based TCP and brings delay sensitivity to it, without requiring any network state profiling, channel prediction, or complicated rate-adjustment mechanisms.
The key idea behind C2TCP is to absorb the dynamics of unpredictable cellular channels by investigating the local minimum delay of packets in a moving time window and reacting very quickly to changes in the cellular network's capacity.
Through extensive trace-based evaluations using traces from five commercial LTE and 3G networks, we have compared performance of C2TCP with various TCP variants, and state-of-the-art schemes including BBR, Verus, and Sprout.
Results show that on average, C2TCP outperforms these schemes and achieves lower average and 95th percentile delay for packets.
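The moving-window minimum-delay idea can be sketched in a few lines; the window length and back-off threshold below are illustrative assumptions, not C2TCP's actual parameters.

```python
from collections import deque

# Hedged sketch: track the minimum packet delay over a short moving
# window and treat a large departure from it as a sign that the
# cellular link's capacity dropped, so the sender should back off.
class MinDelayWindow:
    def __init__(self, window=10, threshold=2.0):
        self.samples = deque(maxlen=window)  # recent delay samples (ms)
        self.threshold = threshold

    def on_packet(self, delay_ms):
        self.samples.append(delay_ms)
        base = min(self.samples)             # local minimum delay
        # signal a back-off when the current delay inflates well
        # beyond the windowed minimum
        return delay_ms > self.threshold * base

w = MinDelayWindow()
signals = [w.on_packet(d) for d in [20, 22, 21, 80, 23]]
print(signals)  # [False, False, False, True, False]
```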
A robot operating in a real-world environment needs to perform reasoning over a variety of sensor modalities such as vision, language and motion trajectories.
However, it is extremely challenging to manually design features relating such disparate modalities.
In this work, we introduce an algorithm that learns to embed point-cloud, natural language, and manipulation trajectory data into a shared embedding space with a deep neural network.
To learn semantically meaningful spaces throughout our network, we use a loss-based margin to bring embeddings of relevant pairs closer together while driving less-relevant cases from different modalities further apart.
We use this both to pre-train the network's lower layers and to fine-tune our final embedding space, leading to a more robust representation.
We test our algorithm on the task of manipulating novel objects and appliances based on prior experience with other objects.
On a large dataset, we achieve significant improvements in both accuracy and inference time over the previous state of the art.
We also perform end-to-end experiments on a PR2 robot utilizing our learned embedding space.
Optical coherence tomography (OCT) is a powerful and noninvasive method for retinal imaging.
In this paper, we introduce a fast segmentation method based on a new variant of spectral graph theory named diffusion maps.
The research is performed on spectral domain (SD) OCT images depicting macular and optic nerve head appearance.
The presented approach does not require edge-based image information and relies on regional image texture.
Consequently, the proposed method demonstrates robustness in situations of low image contrast or poor layer-to-layer image gradients.
Diffusion mapping is applied to 2D and 3D OCT datasets in two steps: one partitions the data into important and less important sections, and the other localizes internal layers. In the first step, the pixels/voxels are grouped into rectangular/cubic sets to form graph nodes. The weights of the graph are calculated based on the geometric distances between pixels/voxels and the differences of their mean intensities. The first diffusion map clusters the data into three parts, the second of which is the area of interest.
The other two sections are eliminated from the remaining calculations.
In the second step, the remaining area is subjected to another diffusion map assessment, and the internal layers are localized based on their textural similarities. The proposed method was tested on 23 datasets from two patient groups (glaucoma and normal).
The mean unsigned border positioning errors (mean ± SD) were 8.52 ± 3.13 and 7.56 ± 2.95 micrometers for the 2D and 3D methods, respectively.
Learning to act in unstructured environments, such as cluttered piles of objects, poses a substantial challenge for manipulation robots.
We present a novel neural network-based approach that separates unknown objects in clutter by selecting favourable push actions.
Our network is trained from data collected through autonomous interaction of a PR2 robot with randomly organized tabletop scenes.
The model is designed to propose meaningful push actions based on over-segmented RGB-D images.
We evaluate our approach by singulating up to 8 unknown objects in clutter.
We demonstrate that our method enables the robot to perform the task with a high success rate and a low number of required push actions.
Our results based on real-world experiments show that our network is able to generalize to novel objects of various sizes and shapes, as well as to arbitrary object configurations.
Videos of our experiments can be viewed at http://robotpush.cs.uni-freiburg.de
We present an approach for building an active agent that learns to segment its visual observations into individual objects by interacting with its environment in a completely self-supervised manner.
The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.
The model learned from over 50K interactions generalizes to novel objects and backgrounds.
To deal with the noisy training signal for segmenting objects obtained by self-supervised interactions, we propose a robust set loss.
A dataset of the robot's interactions, along with a few human-labeled examples, is provided as a benchmark for future research.
We test the utility of the learned segmentation model by providing results on a downstream vision-based control task of rearranging multiple objects into target configurations from visual inputs alone.
Videos, code, and robotic interaction dataset are available at https://pathak22.github.io/seg-by-interaction/
In this paper, we present preliminary results of AFEL-REC, a recommender system for social learning environments.
AFEL-REC is built upon a scalable software architecture to provide recommendations of learning resources in near real-time.
Furthermore, AFEL-REC can cope with any kind of data that is present in social learning environments such as resource metadata, user interactions or social tags.
We provide a preliminary evaluation of three recommendation use cases implemented in AFEL-REC and find that utilizing social data in the form of tags improves not only recommendation accuracy but also coverage.
This paper should be valuable for both researchers and practitioners interested in providing resource recommendations in social learning environments.
Deep learning has demonstrated tremendous breakthroughs in image/video processing.
In this paper, a spatial-temporal residue network (STResNet) based in-loop filter is proposed to suppress visual artifacts such as blocking and ringing in video coding.
Specifically, the spatial and temporal information is jointly exploited by taking both current block and co-located block in reference frame into consideration during the processing of in-loop filter.
The architecture of STResNet consists of only four convolution layers, which keeps both the memory footprint and the coding complexity low.
Moreover, to fully adapt to the input content and improve the performance of the proposed in-loop filter, a coding tree unit (CTU) level control flag is applied in the sense of rate-distortion optimization.
Extensive experimental results show that our scheme provides up to 5.1% bit-rate reduction compared to the state-of-the-art video coding standard.
Researchers have extensively explored predictive control strategies for controlling heating, ventilation, and air conditioning (HVAC) units in commercial buildings.
Predictive control strategies, however, critically rely on weather and occupancy forecasts.
Existing state-of-the-art building simulators are incapable of analysing the influence of prediction errors (in weather and occupancy) on HVAC energy consumption and occupant comfort.
In this paper, we introduce ThermalSim, a building simulator that can quantify the effect of prediction errors on the HVAC operations.
ThermalSim has been implemented in C/C++ and MATLAB.
We describe its design, use, and input format.
We present a neural transducer model with visual attention that learns to generate LaTeX markup of a real-world math formula given its image.
Applying sequence modeling and transduction techniques that have been very successful across modalities such as natural language, image, handwriting, speech, and audio, we construct an image-to-markup model that learns to produce syntactically and semantically correct LaTeX markup code over 150 words long, achieving a BLEU score of 89% and improving upon the previous state of the art for the Im2Latex problem.
We also demonstrate with heat-map visualization how attention helps in interpreting the model and can pinpoint (detect and localize) symbols on the image accurately despite having been trained without any bounding box data.
Weakly Supervised Object Detection (WSOD), using only image-level annotations to train object detectors, is of growing importance in object recognition.
In this paper, we propose a novel deep network for WSOD.
Unlike previous networks that transfer the object detection problem to an image classification problem using Multiple Instance Learning (MIL), our strategy generates proposal clusters to learn refined instance classifiers by an iterative process.
The proposals in the same cluster are spatially adjacent and associated with the same object.
This prevents the network from concentrating too much on parts of objects instead of whole objects.
We first show that instances can be assigned object or background labels directly based on proposal clusters for instance classifier refinement, and then show that treating each cluster as a small new bag yields fewer ambiguities than directly assigning labels.
The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first is an MIL network and the others are for instance classifier refinement supervised by the preceding one.
Experiments are conducted on the PASCAL VOC, ImageNet detection, and MS-COCO benchmarks for WSOD.
Results show that our method outperforms the previous state of the art significantly.
Consider a movie studio aiming to produce a set of new movies for summer release: What types of movies should it produce?
Who would the movies appeal to?
How many movies should it make?
Similar issues are encountered by a variety of organizations, e.g., mobile-phone manufacturers and online magazines, who have to create new (non-existent) items to satisfy groups of users with different preferences.
In this paper, we present a joint problem formalization of these interrelated issues, and propose generative methods that address these questions simultaneously.
Specifically, we leverage the latent space obtained by training a deep generative model---the Variational Autoencoder (VAE)---via a loss function that incorporates both rating performance and item reconstruction terms.
We then apply a greedy search algorithm that utilizes this learned latent space to jointly obtain K plausible new items, and user groups that would find the items appealing.
An evaluation of our methods on a synthetic dataset indicates that our approach is able to generate novel items similar to highly-desirable unobserved items.
As case studies on real-world data, we applied our method to the MART abstract art and Movielens Tag Genome datasets, which yielded promising results: small and diverse sets of novel items.
We introduce an improved unsupervised clustering protocol specially suited for large-scale structured data.
The protocol follows three steps: a dimensionality reduction of the data, a density estimation over the low dimensional representation of the data, and a final segmentation of the density landscape.
For the dimensionality reduction step we introduce a parallelized implementation of the well-known t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm that significantly alleviates some inherent limitations, while improving its suitability for large datasets.
We also introduce a new adaptive Kernel Density Estimation particularly coupled with the t-SNE framework in order to get accurate density estimates out of the embedded data, and a variant of the rainfalling watershed algorithm to identify clusters within the density landscape.
The whole mapping protocol is wrapped in the bigMap R package, together with visualization and analysis tools to ease the qualitative and quantitative assessment of the clustering.
Volunteer computing is being used successfully for large scale scientific computations.
This research is in the context of Volpex, a programming framework that supports communicating parallel processes in a volunteer environment.
Redundancy and checkpointing are combined to ensure consistent forward progress with Volpex in this unique execution environment characterized by heterogeneous failure prone nodes and interdependent replicated processes.
An important parameter for optimizing performance with Volpex is the frequency of checkpointing.
The paper presents a mathematical model to minimize the completion time for inter-dependent parallel processes running in a volunteer environment by finding a suitable checkpoint interval.
Validation is performed with a sample real world application running on a pool of distributed volunteer nodes.
The results indicate that the performance with our predicted checkpoint interval is fairly close to the best performance obtained empirically by varying the checkpoint interval.
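The paper derives its own model for interdependent replicated processes. As a simple baseline for intuition, the classical first-order result for a single failure-prone process is Young's formula, which balances checkpoint overhead against expected rework after a failure. The sketch below is illustrative only (it is not the paper's model, and the numbers are made up):

```python
import math

def young_checkpoint_interval(checkpoint_cost_s, mtbf_s):
    """First-order optimal checkpoint interval (Young, 1974):
    tau = sqrt(2 * C * MTBF), valid when the checkpoint cost C
    is much smaller than the mean time between failures."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

# Example: 30 s to write a checkpoint, one failure per 6 hours on average.
tau = young_checkpoint_interval(30.0, 6 * 3600.0)
print(round(tau))  # 1138 (seconds)
```

Checkpointing much more often than this wastes time writing state; much less often wastes time recomputing lost work after failures.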
A definite Horn theory is a set of n-dimensional Boolean vectors whose characteristic function is expressible as a definite Horn formula, that is, as a conjunction of definite Horn clauses.
The class of definite Horn theories is known to be learnable under different query learning settings, such as learning from membership and equivalence queries or learning from entailment.
We propose yet another type of query: the closure query.
Closure queries are a natural extension of membership queries and also a variant, appropriate in the context of definite Horn formulas, of the so-called correction queries.
We present an algorithm that learns conjunctions of definite Horn clauses in polynomial time, using closure and equivalence queries, and show how it relates to the canonical Guigues-Duquenne basis for implicational systems.
We also show how the different query models mentioned relate to each other by either showing full-fledged reductions by means of query simulation (where possible), or by showing their connections in the context of particular algorithms that use them for learning definite Horn formulas.
This is the preprint version of our paper at REHAB 2015.
A balance measurement software system based on the Kinect2 sensor is evaluated by comparison to the Wii Balance Board at the numerical-analysis level, and further improved by taking the body fat percentage (BFP) of the user into account.
Several persons with different body types were involved in the test.
The algorithm is improved by comparing the body type of the user to a 'golden-standard' body type.
The evaluation results of the optimized algorithm preliminarily demonstrate the reliability of the software.
The Facebook News Feed personalization algorithm has a significant impact, on a daily basis, on the lifestyle, mood, and opinion of millions of Internet users.
Nonetheless, the behavior of such algorithms usually lacks transparency, motivating measurements, modeling, and analysis in order to understand and improve their properties.
In this paper, we propose a reproducible methodology encompassing measurements and an analytical model to capture the visibility of publishers over a News Feed.
First, measurements are used to parameterize and to validate the expressive power of the proposed model.
Then, we conduct a what-if analysis to assess the visibility bias incurred by the users against a baseline derived from the model.
Our results indicate that a significant bias exists and it is more prominent at the top position of the News Feed.
In addition, we found that the bias is non-negligible even for users that are deliberately set as neutral with respect to their political views.
We present Breakout, a group interaction platform for online courses that enables the creation and measurement of face-to-face peer learning groups in online settings.
Breakout is designed to help students easily engage in synchronous, video breakout session based peer learning in settings that otherwise force students to rely on asynchronous text-based communication.
The platform also offers data collection and intervention tools for studying the communication patterns inherent in online learning environments.
The goals of the system are twofold: to enhance student engagement in online learning settings and to create a platform for research into the relationship between distributed group interaction patterns and learning outcomes.
Multiagent systems where agents interact among themselves and with a stochastic environment can be formalized as stochastic games.
We study a subclass named Markov potential games (MPGs) that appear often in economic and engineering applications when the agents share a common resource.
We consider MPGs with continuous state-action variables, coupled constraints and nonconvex rewards.
Previous analysis followed a variational approach that is only valid for very simple cases (convex rewards, invertible dynamics, and no coupled constraints); or considered deterministic dynamics and provided open-loop (OL) analysis, studying strategies that consist in predefined action sequences, which are not optimal for stochastic environments.
We present a closed-loop (CL) analysis for MPGs and consider parametric policies that depend on the current state.
We provide easily verifiable, sufficient and necessary conditions for a stochastic game to be an MPG, even for complex parametric functions (e.g., deep neural networks); and show that a closed-loop Nash equilibrium (NE) can be found (or at least approximated) by solving a related optimal control problem (OCP).
This is useful since solving an OCP--which is a single-objective problem--is usually much simpler than solving the original set of coupled OCPs that form the game--which is a multiobjective control problem.
This is a considerable improvement over the previously standard approach for the CL analysis of MPGs, which gives no approximate solution if no NE belongs to the chosen parametric family, and which is practical only for simple parametric forms.
We illustrate the theoretical contributions with an example by applying our approach to a noncooperative communications engineering game.
We then solve the game with a deep reinforcement learning algorithm that learns policies that closely approximate an exact variational NE of the game.
Creating rankings might seem like a vain exercise in navel-gazing, all the more so for people as averse to that kind of thing as programmers.
However, in this paper we will try to show how creating city- (or province-) based rankings in Spain has led to all kinds of interesting effects, including increased productivity and community building.
We describe the methodology we have used to search for programmers residing in a particular province, focusing on those provinces where most of the population is concentrated, and apply different measures to show how these communities differ in structure, number, and productivity.
Most Information Retrieval models compute the relevance score of a document for a given query by summing term weights specific to a document or a query.
Heuristic approaches, like TF-IDF, or probabilistic models, like BM25, are used to specify how a term weight is computed.
In this paper, we propose to leverage learning-to-rank principles to learn how to compute a term weight for a given document based on the term occurrence pattern.
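As a point of reference for the hand-crafted schemes the learned weights are compared against, a minimal sketch of the BM25 term weight is given below (the `k1` and `b` values are the common defaults, not parameters from this paper):

```python
import math

def bm25_weight(tf, df, n_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """BM25 weight of a single term in a single document.
    tf: term frequency in the document; df: number of documents that
    contain the term; n_docs: collection size.  The weight grows with
    tf but saturates, and rare terms (low df) get a higher idf."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + length_norm)

# A rarer term (df=5) outweighs a common one (df=500), all else equal.
print(bm25_weight(3, 5, 1000, 120, 100) >
      bm25_weight(3, 500, 1000, 120, 100))  # True
```

The document's relevance score for a query is then simply the sum of these per-term weights, which is exactly the additive structure the learning-to-rank approach described above replaces with a learned function of the term occurrence pattern.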
A method is presented for solving the discrete-time finite-horizon Linear Quadratic Regulator (LQR) problem subject to auxiliary linear equality constraints, such as fixed end-point constraints.
The method explicitly determines an affine relationship between the control and state variables, as in standard Riccati recursion, giving rise to feedback control policies that account for constraints.
Since the linearly-constrained LQR problem arises commonly in robotic trajectory optimization, having a method that can efficiently compute these solutions is important.
We demonstrate some of the useful properties and interpretations of said control policies, and we compare the computation time of our method against existing methods.
Multi-kernel polar codes have recently been proposed to construct polar codes of lengths different from powers of two.
Decoder implementations for multi-kernel polar codes need to account for this feature, which becomes critical in memory management.
We propose an efficient, generalized memory-management framework for the implementation of successive-cancellation decoding of multi-kernel polar codes.
It can be used on many types of hardware architectures and different flavors of SC decoding algorithms.
We illustrate the proposed solution for small kernel sizes, and give complexity estimates for various kernel combinations and code lengths.
Modern Code Review (MCR) plays a key role in software quality practices.
In MCR process, a new patch (i.e., a set of code changes) is encouraged to be examined by reviewers in order to identify weaknesses in source code prior to an integration into main software repositories.
To mitigate the risk of having future defects, prior work suggests that MCR should be performed with sufficient review participation.
Indeed, recent work shows that a low number of participated reviewers is associated with poor software quality.
However, there is a likely case that a new patch still suffers from poor review participation even though reviewers were invited.
Hence, in this paper, we set out to investigate the factors that are associated with the participation decision of an invited reviewer.
Through a case study of 230,090 patches spread across the Android, LibreOffice, OpenStack and Qt systems, we find that (1) 16%-66% of patches have at least one invited reviewer who did not respond to the review invitation; (2) human factors play an important role in predicting whether or not an invited reviewer will participate in a review; (3) the review participation rate of an invited reviewer and the code authoring experience of an invited reviewer are highly associated with the participation decision of an invited reviewer.
These results can help practitioners better understand how human factors are associated with the participation decision of reviewers and serve as guidelines for inviting reviewers, leading to better invitation decisions and better reviewer participation.
Today computers have become an integral part of life.
However, most people's interaction with computers is at the end-user level.
Computer engineers are needed to design and develop computer systems, software, and hardware, and are also needed to implement these systems and to solve problems that arise while using them.
Training of qualified computer engineers is vital to have a say in future technology.
Recently, big data analysis, cloud technologies, wearable technologies, and mobile and online services have become popular.
For that reason, computer engineering education should update itself regularly and keep up with the latest improvements.
In this study, we touch on topics suggested for extending computer engineering curricula, such as big data analysis, wearable technologies, the Internet of Things, cloud technologies, identity management, and cyber security, which are expected to expand in scope and in which computer engineering students should be qualified.
Related topics will be described, their usage areas explained, developments and future roles mentioned, and expected achievements outlined.
The relevance of these achievements to the learning outcomes of departments accredited by MUDEK will also be defined.
We propose a Markov chain simulation method to generate simple connected random graphs with a specified degree sequence and level of clustering.
The networks generated by our algorithm are random in all other respects and can thus serve as generic models for studying the impacts of degree distributions and clustering on dynamical processes as well as null models for detecting other structural properties in empirical networks.
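The core move of such a Markov chain sampler is the degree-preserving double edge swap, repeated many times from an initial graph. The sketch below shows only this standard move (the paper's additional acceptance step that targets a specified clustering level is omitted, and the function name is illustrative):

```python
import random

def double_edge_swap(edges, n_swaps, seed=0):
    """Randomize a simple undirected graph while preserving every vertex
    degree.  `edges` is a list of (u, v) tuples with u < v.  Each move
    picks two edges (a, b), (c, d) and rewires them to (a, d), (c, b);
    moves that would create a self-loop or a duplicate edge are
    rejected, which keeps the chain on the space of simple graphs."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        e1 = tuple(sorted((a, d)))
        e2 = tuple(sorted((c, b)))
        if a == d or c == b or e1 in present or e2 in present:
            continue  # reject: swap would break simplicity
        present -= {edges[i], edges[j]}
        present |= {e1, e2}
        edges[i], edges[j] = e1, e2
    return edges

# Degrees are invariant under the swaps; only the wiring changes.
g = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (1, 4)]
print(double_edge_swap(g, 200))
```

Running enough accepted swaps mixes the chain toward a uniform sample over simple graphs with the given degree sequence, which is what makes such graphs usable as null models.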
Deep metrics have been shown effective as similarity measures in multi-modal image registration; however, the metrics are currently constructed from aligned image pairs in the training data.
In this paper, we propose a strategy for learning such metrics from roughly aligned training data.
Symmetrizing the data corrects bias in the metric that results from misalignment in the data (at the expense of increased variance), while random perturbations to the data, i.e., dithering, ensure that the metric has a single mode and is amenable to registration by optimization.
Evaluation is performed on the task of registration on separate unseen test image pairs.
The results demonstrate the feasibility of learning a useful deep metric from substantially misaligned training data; in some cases, the results are significantly better than those obtained with Mutual Information.
Data augmentation via dithering is, therefore, an effective strategy for discharging the need for well-aligned training data; this brings deep metric registration from the realm of supervised to semi-supervised machine learning.
The effectiveness and scalability of MapReduce-based implementations of complex data-intensive tasks depend on an even redistribution of data between map and reduce tasks.
In the presence of skewed data, sophisticated redistribution approaches thus become necessary to achieve load balancing among all reduce tasks to be executed in parallel.
For the complex problem of entity resolution, we propose and evaluate two approaches for such skew handling and load balancing.
The approaches support blocking techniques to reduce the search space of entity resolution, utilize a preprocessing MapReduce job to analyze the data distribution, and distribute the entities of large blocks among multiple reduce tasks.
The evaluation on a real cloud infrastructure shows the value and effectiveness of the proposed load balancing approaches.
Automated tests play an important role in software evolution because they can rapidly detect faults introduced during changes.
In practice, code-coverage metrics are often used as criteria to evaluate the effectiveness of test suites with focus on regression faults.
However, code coverage only expresses which portion of a system has been executed by tests, but not how effective the tests actually are in detecting regression faults.
Our goal was to evaluate the validity of code coverage as a measure for test effectiveness.
To do so, we conducted an empirical study in which we applied an extreme mutation testing approach to analyze the tests of open-source projects written in Java.
We assessed the ratio of pseudo-tested methods (those tested in a way such that faults would not be detected) to all covered methods and judged their impact on the software project.
The results show that the ratio of pseudo-tested methods is acceptable for unit tests but not for system tests (that execute large portions of the whole system).
Therefore, we conclude that the coverage metric is only a valid effectiveness indicator for unit tests.
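The extreme-mutation idea behind the study can be sketched in a few lines: strip a covered method down to a trivial body and rerun the test suite; if every test still passes, the method is pseudo-tested. This toy Python illustration is not the study's actual tooling (which targets Java) and all names are made up:

```python
import types

def is_pseudo_tested(module, func_name, test_suite, stub=lambda *a, **k: None):
    """Replace `func_name` in `module` with a do-nothing stub and run
    the test suite.  If every test still passes, the method is
    pseudo-tested: covered, but no test actually checks its effect."""
    original = getattr(module, func_name)
    setattr(module, func_name, stub)
    try:
        return test_suite()  # True => all tests passed despite the stub
    finally:
        setattr(module, func_name, original)  # always restore

# Toy example: a covered-but-unchecked function is flagged.
mod = types.SimpleNamespace(add=lambda a, b: a + b)
def weak_suite():
    mod.add(2, 2)                 # executes `add` but asserts nothing
    return True
def strong_suite():
    return mod.add(2, 2) == 4     # actually checks the result
print(is_pseudo_tested(mod, "add", weak_suite))    # True  (pseudo-tested)
print(is_pseudo_tested(mod, "add", strong_suite))  # False
```

Both suites give `add` full coverage, yet only the strong suite would catch a regression, which is exactly the gap between coverage and effectiveness the study measures.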
In this paper we propose a new parallel architecture based on Big Data technologies for real-time sentiment analysis on microblogging posts.
Polypus is a modular framework that provides the following functionalities: (1) massive text extraction from Twitter, (2) distributed non-relational storage optimized for time range queries, (3) memory-based intermodule buffering, (4) real-time sentiment classification, (5) near real-time keyword sentiment aggregation in time series, (6) an HTTP API to interact with the Polypus cluster and (7) a web interface to analyze results visually.
The whole architecture is self-deployable and based on Docker containers.
Due to the fact that Korean is a highly agglutinative, character-rich language, previous work on Korean morphological analysis typically employs the use of sub-character features known as graphemes or otherwise utilizes comprehensive prior linguistic knowledge (i.e., a dictionary of known morphological transformation forms, or actions).
These models have been created with the assumption that character-level, dictionary-less morphological analysis was intractable due to the number of actions required.
We present, in this study, a multi-stage action-based model that can perform morphological transformation and part-of-speech tagging using arbitrary units of input and apply it to the case of character-level Korean morphological analysis.
Among models that do not employ prior linguistic knowledge, we achieve state-of-the-art word and sentence-level tagging accuracy with the Sejong Korean corpus using our proposed data-driven Bi-LSTM model.
Fast development of sharing services becomes a crucial part of the process in constructing a cyber-enabled world, as sharing services reinvent how people exchange and obtain goods or services.
However, privacy leakage or disclosure is a key concern which may hinder the development of sharing services.
While significant efforts have been undertaken to address various privacy issues in recent years, there is a surprising lack of a review for privacy concerns in the cyber-enabled sharing world.
To bridge the gap, in this study, we survey and evaluate existing and emerging privacy issues relating to sharing services from various perspectives.
Differing from existing surveys of sharing practices in various fields, our work comprehensively covers six directions of sharing services in the cyber-enabled world and selects solutions mostly from the last five years.
Finally, we conclude the issues and solutions from three perspectives, namely, the user, platform and service provider perspectives.
We consider a wireless distributed computing system, in which multiple mobile users, connected wirelessly through an access point, collaborate to perform a computation task.
In particular, users communicate with each other via the access point to exchange their locally computed intermediate computation results, which is known as data shuffling.
We propose a scalable framework for this system, in which the required communication bandwidth for data shuffling does not increase with the number of users in the network.
The key idea is to utilize a particular repetitive pattern of placing the dataset (thus a particular repetitive pattern of intermediate computations), in order to provide coding opportunities at both the users and the access point, which reduce the required uplink communication bandwidth from users to access point and the downlink communication bandwidth from access point to users by factors that grow linearly with the number of users.
We also demonstrate that the proposed dataset placement and coded shuffling schemes are optimal (i.e., achieve the minimum required shuffling load) for both a centralized setting and a decentralized setting, by developing tight information-theoretic lower bounds.
Multigrid algorithms are among the fastest iterative methods known today for solving large linear and some non-linear systems of equations.
Although greatly optimized for serial operation, they still have great potential for parallelism that is not yet fully realized.
In this work, we present a novel multigrid algorithm designed to work entirely inside many-core architectures like graphics processing units (GPUs), without memory transfers between the GPU and the central processing unit (CPU), avoiding low-bandwidth communication.
The algorithm is denoted as the high occupancy multigrid (HOMG) because it makes use of entire grid operations with interpolations and relaxations fused into one task, providing useful work for every thread in the grid.
For a given accuracy, its number of operations scale linearly with the total number of nodes in the grid.
Perfect scalability is observed for a large number of processors.
Current advances in the development of autonomous cars suggest that driverless cars may see wide-scale deployment in the near future.
Research by both industry and academia is driven by potential benefits of this new technology, including reductions in fatalities and improvements in traffic and fuel efficiency, as well as greater mobility for people who will not or cannot drive cars themselves.
A deciding factor for the adoption of self-driving cars besides safety will be the comfort of the passengers.
This report looks at cost functions currently used in motion planning methods for autonomous on-road driving.
Specifically, it examines how the human perception of how comfortable a trajectory is can be formulated within cost functions.
Individuals have an intuitive perception of what makes a good coincidence.
Though the sensitivity to coincidences has often been presented as resulting from an erroneous assessment of probability, it appears to be a genuine competence, based on non-trivial computations.
The model presented here suggests that coincidences occur when subjects perceive complexity drops.
Co-occurring events are, together, simpler than if considered separately.
This model leads to a possible redefinition of subjective probability.
We first explain the notion of secret sharing as well as threshold schemes, which can be implemented with Shamir's secret sharing scheme.
Subsequently, we review social secret sharing (NSG'10, NS'10) and its trust function.
In a secret sharing scheme, a secret is shared among a group of players who can later recover the secret.
We review the construction of a social secret sharing scheme and its application for resource management in cloud, as explained in NS'12.
To clarify the social secret sharing scheme, we first review its trust function according to NL'06.
In this scheme, a secret is maintained by assigning a trust value to each player based on his behavior, i.e., availability.
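A minimal sketch of the underlying (t, n)-threshold scheme of Shamir: the secret is the constant term of a random degree-(t-1) polynomial over a prime field, and any t shares recover it by Lagrange interpolation at zero. The social/weighted aspects of the reviewed schemes are not shown, and the prime and function names are illustrative:

```python
import random

P = 2**127 - 1  # a Mersenne prime; all arithmetic is mod P

def make_shares(secret, t, n, rng=random.Random(42)):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(t - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial
            acc = (acc * x + c) % P
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(123456789, t=3, n=5)
print(reconstruct(shares[:3]))  # 123456789, from any 3 of the 5 shares
```

Fewer than t shares reveal nothing about the secret, since any candidate constant term is consistent with some degree-(t-1) polynomial through them; social secret sharing builds on this by adjusting each player's number of shares according to the trust value.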
Much of the recent progress in Vision-to-Language (V2L) problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs).
This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text.
We propose here a method of incorporating high-level concepts into the very successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art performance in both image captioning and visual question answering.
We also show that the same mechanism can be used to introduce external semantic information and that doing so further improves performance.
In doing so we provide an analysis of the value of high level semantic information in V2L problems.
The functional programming language Erlang is well-suited for concurrent and distributed applications.
Numerical computing, however, is not seen as one of its strengths.
The recent introduction of Federated Learning, a concept according to which client devices are leveraged for decentralized machine learning tasks, while a central server updates and distributes a global model, provided the motivation for exploring how well Erlang is suited to that problem.
We present ffl-erl, a framework for Federated Learning, written in Erlang, and explore how well it performs in two scenarios: one in which the entire system has been written in Erlang, and another in which Erlang is relegated to coordinating client processes that rely on performing numerical computations in the programming language C. There is a concurrent as well as a distributed implementation of each case.
Erlang incurs a performance penalty, but for certain use cases this may not be detrimental, considering the trade-off between conciseness of the language and speed of development (Erlang) versus performance (C).
Thus, Erlang may be a viable alternative to C for some practical machine learning tasks.
Machine learning methods play increasingly important roles in pre-procedural planning for complex surgeries and interventions.
Very often, however, researchers find the historical records of emerging surgical techniques, such as the transcatheter aortic valve replacement (TAVR), are highly scarce in quantity.
In this paper, we address this challenge by proposing novel generative invertible networks (GIN) to select features and generate high-quality virtual patients that may potentially serve as an additional data source for machine learning.
Combining a convolutional neural network (CNN) and generative adversarial networks (GAN), GIN discovers the pathophysiologic meaning of the feature space.
Moreover, a test of predicting the surgical outcome directly using the selected features results in a high accuracy of 81.55%, which suggests little pathophysiologic information has been lost while conducting the feature selection.
This demonstrates that GIN can generate virtual patients that are not only visually authentic but also pathophysiologically interpretable.
We present an analysis into the inner workings of Convolutional Neural Networks (CNNs) for processing text.
CNNs used for computer vision can be interpreted by projecting filters into image space, but for discrete sequence inputs CNNs remain a mystery.
We aim to understand the method by which the networks process and classify text.
We examine a common hypothesis about this problem: that filters, accompanied by global max-pooling, serve as ngram detectors.
We show that filters may capture several different semantic classes of ngrams by using different activation patterns, and that global max-pooling induces behavior which separates important ngrams from the rest.
Finally, we show practical use cases derived from our findings in the form of model interpretability (explaining a trained model by deriving a concrete identity for each filter, bridging the gap between visualization tools in vision tasks and NLP) and prediction interpretability (explaining predictions).
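The ngram-detector hypothesis can be made concrete with a toy sketch: a convolution filter scores every ngram window of the sentence, and global max-pooling keeps only the single best-matching ngram per filter. The vocabulary, one-hot "embeddings", and filter below are fabricated for illustration and have nothing to do with the paper's trained models:

```python
import numpy as np

vocab = {"not": 0, "good": 1, "bad": 2, "movie": 3, "very": 4}
emb = np.eye(len(vocab))  # toy one-hot 'embeddings', illustration only

def max_pooled_filter_response(tokens, filt):
    """Slide a bigram filter over the sentence and global-max-pool:
    what survives is the score and the identity of the one ngram that
    this filter 'detects' most strongly."""
    ids = [vocab[t] for t in tokens]
    scores = [float(np.sum(emb[ids[i:i + 2]] * filt))
              for i in range(len(ids) - 1)]
    best = int(np.argmax(scores))
    return scores[best], (tokens[best], tokens[best + 1])

# A filter aligned with the "not good" ngram fires exactly on that ngram.
filt = emb[[vocab["not"], vocab["good"]]]
score, ngram = max_pooled_filter_response(
    ["very", "bad", "not", "good", "movie"], filt)
print(score, ngram)  # 2.0 ('not', 'good')
```

Inverting this mapping, i.e., asking which ngrams maximally activate each learned filter, is the basis of the model-interpretability use case described above.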
As an important and challenging problem in computer vision, learning based optical flow estimation aims to discover the intrinsic correspondence structure between two adjacent video frames through statistical learning.
Therefore, a key issue to solve in this area is how to effectively model the multi-scale correspondence structure properties in an adaptive end-to-end learning fashion.
Motivated by this observation, we propose an end-to-end multi-scale correspondence structure learning (MSCSL) approach for optical flow estimation.
In principle, the proposed MSCSL approach is capable of effectively capturing the multi-scale inter-image-correlation correspondence structures within a multi-level feature space from deep learning.
Moreover, the proposed MSCSL approach builds a spatial Conv-GRU neural network model to adaptively model the intrinsic dependency relationships among these multi-scale correspondence structures.
Finally, the above procedures for correspondence structure learning and multi-scale dependency modeling are implemented in a unified end-to-end deep learning framework.
Experimental results on several benchmark datasets demonstrate the effectiveness of the proposed approach.
Modularisation, repetition, and symmetry are structural features shared by almost all biological neural networks.
These features are very unlikely to be found by means of structural evolution of artificial neural networks.
This paper introduces NMODE, which is specifically designed to operate on neuro-modules.
NMODE addresses a second problem in the context of evolutionary robotics, which is incremental evolution of complex behaviours for complex machines, by offering a way to interface neuro-modules.
The scenario in mind is a complex walking machine, for which a locomotion module is evolved first, that is then extended by other modules in later stages.
We show that NMODE is able to evolve a locomotion behaviour for a standard six-legged walking machine in approximately 10 generations and show how it can be used for incremental evolution of a complex walking machine.
The entire source code used in this paper is publicly available through GitHub.
We present the Mixed Likelihood Gaussian process latent variable model (GP-LVM), capable of modeling data with attributes of different types.
The standard formulation of GP-LVM assumes that each observation is drawn from a Gaussian distribution, which makes the model unsuited for data with e.g. categorical or nominal attributes.
Our model, for which we use sampling-based variational inference, instead assumes a separate likelihood for each observed dimension.
This formulation results in more meaningful latent representations and gives better predictive performance on real-world data with dimensions of different types.
Iterative Closest Point (ICP) is a widely used method for performing scan-matching and registration.
Although simple and robust, ICP is computationally expensive and can be challenging to use in real-time applications with limited resources on mobile platforms.
In this paper we propose a novel and effective method for accelerating ICP that does not require substantial modifications to existing code.
The method is based on Anderson acceleration, an iterative procedure for finding a fixed point of a contractive mapping, which is often faster than the standard Picard iteration usually used in ICP implementations.
We show that ICP, being a fixed-point problem, can be significantly accelerated by this method, enhanced with heuristics to improve overall robustness.
We implement the proposed approach in the Point Cloud Library (PCL) and make it available online.
Benchmarking on real-world data fully supports our claims.
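To make the fixed-point view concrete, here is a minimal Python sketch of depth-one Anderson acceleration on a scalar contractive mapping; the toy mapping and function names are our own illustration, not the paper's high-dimensional ICP update.

```python
import math

def picard(g, x0, tol=1e-10, max_iter=500):
    """Plain fixed-point (Picard) iteration: x <- g(x)."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

def anderson1(g, x0, tol=1e-10, max_iter=500):
    """Anderson acceleration with memory depth 1 for a scalar mapping.

    The two most recent residuals f = g(x) - x are mixed so that the
    extrapolated residual vanishes; in the scalar case this coincides
    with the secant method applied to f."""
    x_prev, f_prev = x0, g(x0) - x0
    x = g(x0)
    for k in range(1, max_iter + 1):
        f = g(x) - x
        if abs(f) < tol:
            return x, k
        denom = f_prev - f
        theta = f_prev / denom if denom != 0 else 1.0  # fall back to Picard
        x_next = theta * g(x) + (1.0 - theta) * g(x_prev)
        x_prev, f_prev = x, f
        x = x_next
    return x, max_iter
```

On g(x) = cos(x), the accelerated iteration reaches the fixed point near 0.739085 in a handful of steps, while plain Picard iteration needs dozens.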
In recent years two sets of planar (2D) shape attributes, provided with an intuitive physical meaning, were proposed to the remote sensing community by, respectively, Nagao & Matsuyama and Shackelford & Davis in their seminal works on the increasingly popular geographic object based image analysis (GEOBIA) paradigm.
These two published sets of intuitive geometric features were selected as initial conditions by the present R&D software project, whose multi-objective goal was to accomplish: (i) a minimally dependent and maximally informative design (knowledge/information representation) of a general purpose, user and application independent dictionary of 2D shape terms provided with a physical meaning intuitive to understand by human end users and (ii) an effective (accurate, scale invariant, easy to use) and efficient implementation of 2D shape descriptors.
To comply with the Quality Assurance Framework for Earth Observation guidelines, the proposed suite of geometric functions is validated by means of a novel quantitative quality assurance policy, centered on inter feature dependence (causality) assessment.
This innovative multivariate feature validation strategy is an alternative to traditional feature selection procedures based on either inductive data learning classification accuracy estimation, which is inherently case-specific, or cross-correlation estimation, because statistical cross-correlation does not imply causation.
The project deliverable is an original general purpose software suite of seven validated off-the-shelf 2D shape descriptors that are intuitive to use.
As an alternative to existing commercial or open source software libraries of tens of planar shape functions whose informativeness remains unknown, it is eligible for use in (GE)OBIA systems in operating mode, and is expected to mimic human reasoning based on a convergence-of-evidence approach.
Knowledge Management is a global process in companies.
It includes all the processes that allow capitalization, sharing and evolution of the Knowledge Capital of the firm, generally recognized as a critical resource of the organization.
Several approaches have been defined to capitalize knowledge but few of them study how to learn from this knowledge.
In this paper we present an approach that helps to enhance learning from professional knowledge in an organisation.
We apply our approach to the knitting industry.
One of the distinguishing aspects of human language is its compositionality, which allows us to describe complex environments with limited vocabulary.
Previously, it has been shown that neural network agents can learn to communicate in a highly structured, possibly compositional language based on disentangled input (e.g. hand-engineered features).
Humans, however, do not learn to communicate based on well-summarized features.
In this work, we train neural agents to simultaneously develop visual perception from raw image pixels, and learn to communicate with a sequence of discrete symbols.
The agents play an image description game where the image contains factors such as colors and shapes.
We train the agents using the obverter technique where an agent introspects to generate messages that maximize its own understanding.
Through qualitative analysis, visualization and a zero-shot test, we show that the agents can develop, out of raw image pixels, a language with compositional properties, given a proper pressure from the environment.
There is plenty of research in the field of object recognition, but object state recognition has not been addressed as much.
There are many important applications that can utilize object state recognition, such as deciding how to grasp an object in robotics.
A convolutional neural network was designed to classify an image of an object into one of its states.
The approach used for training is transfer learning, with the Inception v3 module of GoogLeNet as the pre-trained model.
The model was trained on images of 18 cooking objects and tested on another set of cooking objects.
The model was able to classify those images with 76% accuracy.
In this letter, we develop a converse bound on the asymptotic load threshold of coded slotted ALOHA (CSA) schemes with K-multi packet reception capabilities at the receiver.
Density evolution is used to track the average probability of packet segment loss and an area matching condition is applied to obtain the converse.
For any given CSA rate, the converse normalized to K increases with K, which is in contrast with the results obtained so far for slotted ALOHA schemes based on successive interference cancellation.
We show how the derived bound can be approached using spatially-coupled CSA.
Traditional fact checking by experts and analysts cannot keep pace with the volume of newly created information.
It is important and necessary, therefore, to enhance our ability to computationally determine whether some statement of fact is true or false.
We view this problem as a link-prediction task in a knowledge graph, and present a discriminative path-based method for fact checking in knowledge graphs that incorporates connectivity, type information, and predicate interactions.
Given a statement S of the form (subject, predicate, object), for example, (Chicago, capitalOf, Illinois), our approach mines discriminative paths that alternatively define the generalized statement (U.S. city, predicate, U.S. state) and uses the mined rules to evaluate the veracity of statement S. We evaluate our approach by examining thousands of claims related to history, geography, biology, and politics using a public, million node knowledge graph extracted from Wikipedia and PubMedDB.
Not only does our approach significantly outperform related models, we also find that the discriminative predicate path model is easily interpretable and provides sensible reasons for the final determination.
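As a toy illustration of the path-mining step (the miniature knowledge graph and predicate names below are invented; the paper's method additionally exploits type information and predicate interactions), a depth-bounded search can enumerate the predicate paths that connect a statement's subject and object:

```python
from collections import defaultdict

def predicate_paths(triples, src, dst, max_len=3):
    """Enumerate predicate-label paths from src to dst, up to max_len hops.

    In a discriminative path-based fact checker, such paths become
    features whose presence supports or undermines a candidate fact."""
    adj = defaultdict(list)
    for s, p, o in triples:
        adj[s].append((p, o))
    paths, stack = [], [(src, ())]
    while stack:
        node, preds = stack.pop()
        if node == dst and preds:
            paths.append(preds)
            continue
        if len(preds) < max_len:
            for p, o in adj[node]:
                stack.append((o, preds + (p,)))
    return paths
```

Paths mined for known true and false statements of the same generalized form can then be fed to a discriminative classifier.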
This paper extends our previous work on regularization of neural networks using Eigenvalue Decay by employing a soft approximation of the dominant eigenvalue in order to enable the calculation of its derivatives in relation to the synaptic weights, and therefore the application of back-propagation, which is a primary demand for deep learning.
Moreover, we extend our previous theoretical analysis to deep neural networks and multiclass classification problems.
Our method is implemented as an additional regularizer in Keras, a modular neural networks library written in Python, and evaluated in the benchmark data sets Reuters Newswire Topics Classification, IMDB database for binary sentiment classification, MNIST database of handwritten digits and CIFAR-10 data set for image classification.
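For intuition, the dominant eigenvalue that Eigenvalue Decay penalizes can be estimated with plain power iteration, sketched below on a small list-of-lists matrix; this is illustrative only and is not the differentiable soft approximation used in the paper.

```python
def dominant_eigenvalue(A, iters=100):
    """Estimate the dominant eigenvalue of a square matrix by power
    iteration, normalizing with the infinity norm at each step (valid
    here for matrices with a positive dominant eigenvalue)."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w) or 1.0
        v = [x / lam for x in w]
    return lam
```

The hard `max` in the iteration is non-smooth, which is exactly why a soft approximation is needed to differentiate through the estimate during back-propagation.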
Fault-based testing is a technique in which test cases are chosen to reveal certain classes of faults.
At present, testing professionals use their personal experience to select testing methods for fault classes considered the most likely to be present.
However, there is little empirical evidence available in the open literature to support these intuitions.
By examining the source code changes when faults were fixed in seven open source software artifacts, we have classified bug fix patterns into fault classes, and recorded the relative frequencies of the identified fault classes.
This paper reports our findings related to "if-conditional" fixes.
We have classified the "if-conditional" fixes into fourteen fault classes and calculated their frequencies.
We found that the most common fault class involved changes within a single "atom".
The next most common fault was the omission of an "atom".
We analysed these results in the context of Boolean specification testing.
Skin cancer is one of the major types of cancers and its incidence has been increasing over the past decades.
Skin lesions can arise from various dermatologic disorders and can be classified into various types according to their texture, structure, color and other morphological features.
The accuracy of diagnosis of skin lesions, specifically the discrimination of benign and malignant lesions, is paramount to ensure appropriate patient treatment.
Machine learning-based classification approaches are among popular automatic methods for skin lesion classification.
While there are many existing methods, convolutional neural networks (CNNs) have been shown to be superior to other classical machine learning methods for object detection and classification tasks.
In this work, a fully automatic computerized method is proposed, which employs well-established pre-trained convolutional neural networks and ensemble learning to classify skin lesions.
We trained the networks using 2000 skin lesion images available from the ISIC 2017 challenge, which has three main categories and includes 374 melanoma, 254 seborrheic keratosis and 1372 benign nevi images.
The trained classifier was then tested on 150 unlabeled images.
The results, evaluated by the challenge organizer and based on the area under the receiver operating characteristic curve (AUC), were 84.8% and 93.6% for Melanoma and seborrheic keratosis binary classification problem, respectively.
The proposed method achieved results competitive with those of experienced dermatologists.
Further improvement and optimization of the proposed method with a larger training dataset could lead to a more precise, reliable and robust method for skin lesion classification.
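A minimal sketch of an ensembling step, assuming simple soft voting across the pre-trained networks (the paper does not pin down this exact combination rule):

```python
def ensemble_predict(prob_lists):
    """Soft-voting ensemble: average the class-probability vectors
    produced by the individual models for one image."""
    n = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[i] for p in prob_lists) / n for i in range(n_classes)]
```

Averaging probabilities (rather than hard labels) lets confident models outweigh uncertain ones without any extra training.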
Permutation codes, in the form of rank modulation, have shown promise for applications such as flash memory.
One of the metrics recently suggested as appropriate for rank modulation is the Ulam metric, which measures the minimum translocation distance between permutations.
Multipermutation codes have also been proposed as a generalization of permutation codes that would improve code size (and consequently the code rate).
In this paper we analyze the Ulam metric in the context of multipermutations, noting some similarities and differences with the Ulam metric in the context of permutations.
We also consider sphere sizes for multipermutations under the Ulam metric and resulting bounds on code size.
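For ordinary permutations, the Ulam (translocation) distance equals the sequence length minus the length of a longest common subsequence, which the standard dynamic program computes directly; a small sketch (our own illustration):

```python
def ulam_distance(p, q):
    """Ulam distance between two sequences of equal length:
    length minus the longest-common-subsequence length, i.e. the
    minimum number of translocations turning p into q."""
    n, m = len(p), len(q)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if p[i] == q[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return n - dp[n][m]
```

For example, moving the last symbol of (1,2,3,4,5) to the front is a single translocation, so the distance to (5,1,2,3,4) is 1.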
Over the years the Artificial Intelligence (AI) community has produced several datasets which have given machine learning algorithms the opportunity to learn various skills across various domains.
However, a subclass of these machine learning algorithms that aims at learning logic programs, namely the Inductive Logic Programming algorithms, has often failed at the task due to the vastness of these datasets.
This has impacted the usability of knowledge representation and reasoning techniques in the development of AI systems.
In this research, we try to address this scalability issue for the algorithms that learn answer set programs.
We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient and more user controlled search for a solution.
We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible.
The system is publicly available at https://goo.gl/KdWAcV.
This paper is under consideration for acceptance in TPLP.
Convolutional neural networks (CNNs) are inherently equivariant to translation.
Efforts to embed other forms of equivariance have concentrated solely on rotation.
We expand the notion of equivariance in CNNs through the Polar Transformer Network (PTN).
PTN combines ideas from the Spatial Transformer Network (STN) and canonical coordinate representations.
The result is a network invariant to translation and equivariant to both rotation and scale.
PTN is trained end-to-end and composed of three distinct stages: a polar origin predictor, the newly introduced polar transformer module and a classifier.
PTN achieves state-of-the-art on rotated MNIST and the newly introduced SIM2MNIST dataset, an MNIST variation obtained by adding clutter and perturbing digits with translation, rotation and scaling.
The ideas of PTN are extensible to 3D which we demonstrate through the Cylindrical Transformer Network.
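The coordinate idea underlying PTN can be seen in two lines of algebra: in log-polar coordinates about the predicted origin, a rotation becomes a shift of the angular coordinate and a uniform scaling a shift of the log-radial coordinate, which a conventional (translation-equivariant) CNN then handles naturally. A small sketch of the map (our own illustration):

```python
import math

def log_polar(x, y, ox=0.0, oy=0.0):
    """Log-polar coordinates (log r, theta) of a point about an origin."""
    dx, dy = x - ox, y - oy
    return math.log(math.hypot(dx, dy)), math.atan2(dy, dx)

# Rotating (1, 0) by 90 degrees and scaling by e gives the point (0, e):
# log r shifts by exactly 1 and theta by exactly pi/2.
r1, t1 = log_polar(1.0, 0.0)
r2, t2 = log_polar(0.0, math.e)
```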
We present a new family of one-coincidence sequence sets suitable for frequency hopping code division multiple access (FH-CDMA) systems with dispersed (low density) sequence elements.
These sets are derived from one-coincidence prime sequence sets, such that for each one-coincidence prime sequence set there is a new one-coincidence set comprised of sequences with dispersed sequence elements, required in some circumstances, for FH-CDMA systems.
Crowding of sequence elements is eliminated by doubling the size of the sequence element alphabet.
In addition, this doubling process eases control over the distance between adjacent sequence elements.
Properties of the new sets are discussed.
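For background, a classical one-coincidence prime sequence set (the starting point of the derivation) can be generated as below; the paper's dispersed-element sets then double the alphabet to size 2p, which this sketch does not show.

```python
def prime_sequences(p):
    """One-coincidence prime sequence set over GF(p): sequence k is
    (k*0, k*1, ..., k*(p-1)) mod p for k = 1..p-1.  Any two distinct
    sequences agree in exactly one position (i = 0)."""
    return [[(k * i) % p for i in range(p)] for k in range(1, p)]

def coincidences(a, b):
    """Number of positions where two sequences agree (their Hamming
    cross-correlation at zero shift)."""
    return sum(x == y for x, y in zip(a, b))
```

The one-coincidence property follows because k1*i = k2*i (mod p) with k1 != k2 forces i = 0 when p is prime.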
Recent years have seen a proliferation of versatile mobile devices and an upsurge in the growth of data-consuming application services.
Orthogonal multiple access (OMA) technologies in today's mobile systems prove inefficient in the presence of such massive connectivity and traffic demands.
In this regard, non-orthogonal multiple access (NOMA) has been advocated by the research community to embrace these unprecedented requirements.
Current NOMA designs have been demonstrated to largely improve conventional system performance in terms of throughput and latency, while their impact on the end users' perceived experience has not yet been comprehensively understood.
We envision that quality-of-experience (QoE) awareness is a key pillar for NOMA designs to fulfill versatile user demands in the 5th generation (5G) wireless communication systems.
This article systematically investigates QoE-aware NOMA designs that translate the physical-layer benefits of NOMA to the improvement of users' perceived experience in upper layers.
We shed light on design principles and key challenges in realizing QoE-aware NOMA designs.
With these principles and challenges in mind, we develop a general architecture with a dynamic network scheduling scheme.
We provide some implications for future QoE-aware NOMA designs by conducting a case study in video streaming applications.
Morphisms to finite semigroups can be used for recognizing omega-regular languages.
The so-called strongly recognizing morphisms can be seen as a deterministic computation model which provides minimal objects (known as the syntactic morphism) and a trivial complementation procedure.
We give a quadratic-time algorithm for computing the syntactic morphism from any given strongly recognizing morphism, thereby showing that minimization is easy as well.
In addition, we give algorithms for efficiently solving various decision problems for weakly recognizing morphisms.
Weakly recognizing morphisms are often smaller than their strongly recognizing counterparts.
Finally, we describe the language operations needed for converting formulas in monadic second-order logic (MSO) into strongly recognizing morphisms, and we give some experimental results.
Software-Defined Networking (SDN) is an emerging paradigm that promises to change this state of affairs, by breaking vertical integration, separating the network's control logic from the underlying routers and switches, promoting (logical) centralization of network control, and introducing the ability to program the network.
The separation of concerns introduced between the definition of network policies, their implementation in switching hardware, and the forwarding of traffic, is key to the desired flexibility: by breaking the network control problem into tractable pieces, SDN makes it easier to create and introduce new abstractions in networking, simplifying network management and facilitating network evolution.
In this paper we present a comprehensive survey on SDN.
We start by introducing the motivation for SDN, explain its main concepts and how it differs from traditional networking, its roots, and the standardization activities regarding this novel paradigm.
Next, we present the key building blocks of an SDN infrastructure using a bottom-up, layered approach.
We provide an in-depth analysis of the hardware infrastructure, southbound and northbound APIs, network virtualization layers, network operating systems (SDN controllers), network programming languages, and network applications.
We also look at cross-layer problems such as debugging and troubleshooting.
In an effort to anticipate the future evolution of this new paradigm, we discuss the main ongoing research efforts and challenges of SDN.
In particular, we address the design of switches and control platforms -- with a focus on aspects such as resiliency, scalability, performance, security and dependability -- as well as new opportunities for carrier transport networks and cloud providers.
Last but not least, we analyze the position of SDN as a key enabler of a software-defined environment.
We present a sufficient condition for a non-injective function of a Markov chain to be a second-order Markov chain with the same entropy rate as the original chain.
This permits an information-preserving state space reduction by merging states or, equivalently, lossless compression of a Markov source on a sample-by-sample basis.
The cardinality of the reduced state space is bounded from below by the node degrees of the transition graph associated with the original Markov chain.
We also present an algorithm listing all possible information-preserving state space reductions, for a given transition graph.
We illustrate our results by applying the algorithm to a bi-gram letter model of an English text.
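For reference, the quantity being preserved is the entropy rate H = -∑_i π_i ∑_j P_ij log2 P_ij of the stationary chain; a small sketch computing it for a row-stochastic transition matrix (illustrative only, not the paper's reduction algorithm):

```python
import math

def stationary(P, iters=1000):
    """Stationary distribution of a row-stochastic matrix via pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def entropy_rate(P):
    """Entropy rate in bits per symbol of a stationary Markov chain."""
    pi = stationary(P)
    n = len(P)
    return -sum(pi[i] * P[i][j] * math.log2(P[i][j])
                for i in range(n) for j in range(n) if P[i][j] > 0)
```

A state-space reduction is information-preserving in this sense when the merged (second-order) chain has the same entropy rate as the original.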
This paper introduces a general simulation framework that can allow the simulation of crashes and the evaluation of consequences on existing microsimulation packages.
A specific family of simple and reproducible conflict indicators is proposed and applied to many case studies.
In this approach driver failures are simulated by assuming that a driver stops reacting to an external stimulus and keeps driving at the current speed for a given time.
The trajectory of the distracted driver's vehicle is thus evaluated and projected, for the given time steps and over the established distraction time, against the actual trajectories of the other vehicles.
Every occurring crash is then evaluated in terms of energy involved in the crash, or with any other severity index (which can be easily calculated since the accident dynamics can be accurately simulated).
Simulating a driver error makes it possible to include not only the crash typologies normally accounted for with surrogate safety measures, but also many other typical crashes that are impossible to simulate with microsimulation and traditional methodologies because they are caused by vehicles driving on non-conflicting trajectories, such as drivers speeding through a red light, taking the wrong lane or side of the street, or simply driving off the road in isolated accidents against external obstacles or traffic barriers.
The total crash energy of all crashes is proposed as an indicator of risk and adopted in the case studies.
Moreover, the concepts introduced in this paper allow scientists to define other relevant variables that can be used as surrogate safety indicators that consider driving errors.
Preliminary results on different case studies show strong agreement between the safety evaluations and statistical data, empirical expectations, and other traditional safety indicators commonly used in microsimulation.
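A minimal sketch of the two ingredients described above, constant-speed projection of the distracted vehicle and a simple crash-energy severity index; the reduced-mass formula below is one plausible choice of index, not necessarily the one adopted in the paper:

```python
def project(pos, vel, t):
    """Constant-speed projection of the distracted driver's vehicle,
    which stops reacting and keeps its current velocity for time t."""
    return (pos[0] + vel[0] * t, pos[1] + vel[1] * t)

def crash_energy(m1, v1, m2, v2):
    """A simple crash severity index: the kinetic energy of the relative
    motion, 0.5 * mu * |v1 - v2|^2, with mu the reduced mass (an assumed
    formula, used here only for illustration)."""
    mu = m1 * m2 / (m1 + m2)
    dvx, dvy = v1[0] - v2[0], v1[1] - v2[1]
    return 0.5 * mu * (dvx * dvx + dvy * dvy)
```

Summing such an index over every crash detected during the projected distraction window yields an aggregate risk indicator of the kind used in the case studies.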
Photo lineups play a significant role in the eyewitness identification process.
This method is used to provide evidence in the prosecution and subsequent conviction of suspects.
Unfortunately, there are many cases where lineups have led to the conviction of an innocent suspect.
One of the key factors affecting the incorrect identification of a suspect is the lack of lineup fairness, i.e. that the suspect differs significantly from all other candidates.
Although the process of assembling a fair lineup is both highly important and time-consuming, only a handful of tools are available to simplify the task.
In this paper, we describe our work towards using recommender systems for the photo lineup assembling task.
We propose and evaluate two complementary methods for item-based recommendation: one based on the visual descriptors of the deep neural network, the other based on the content-based attributes of persons.
The initial evaluation made by forensic technicians shows that although results favored visual descriptors over attribute-based similarity, both approaches are functional and highly diverse in terms of recommended objects.
Thus, future work should involve incorporating both approaches in a single prediction method, preference learning based on the feedback from forensic technicians and recommendation of assembled lineups instead of single candidates.
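The visual-descriptor variant of the item-based recommendation can be sketched as a nearest-neighbour ranking by cosine similarity; the descriptor vectors below are made up for illustration (in the actual system they come from a deep neural network):

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = (math.sqrt(sum(x * x for x in a)) *
           math.sqrt(sum(y * y for y in b)))
    return num / den if den else 0.0

def most_similar(query_vec, candidates, k=3):
    """Rank candidate photos by descriptor similarity to the suspect's
    photo and return the top-k names."""
    ranked = sorted(candidates.items(),
                    key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]
```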
Deep neural networks have been investigated for learning latent representations of medical images, yet most studies limit their approach to a single supervised convolutional neural network (CNN), which usually relies heavily on a large-scale annotated dataset for training.
To learn image representations with less supervision involved, we propose a deep Siamese CNN (SCNN) architecture that can be trained with only binary image pair information.
We evaluated the learned image representations on a task of content-based medical image retrieval using a publicly available multiclass diabetic retinopathy fundus image dataset.
The experimental results show that our proposed deep SCNN is comparable to the state-of-the-art single supervised CNN, and requires much less supervision for training.
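Siamese architectures of this kind are commonly trained with a pairwise contrastive loss on the embedding distance; the sketch below shows one standard form (a hinge with margin), which may differ from the exact objective used in the paper:

```python
def contrastive_loss(d, same, margin=1.0):
    """Contrastive loss for one image pair with embedding distance d:
    similar pairs (same=True) are pulled together, dissimilar pairs
    are pushed apart up to the margin."""
    if same:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2
```

Because only binary pair labels are needed, this objective requires far less supervision than per-image class annotations.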
Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent.
In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing across distributed workers to achieve speed-up and efficiency.
Several computational tasks are sequential in nature and involve multiple passes over the data.
At each iteration over the data, it is common practice to randomly re-shuffle the data at the master node, assigning different batches for each worker to process.
This random re-shuffling operation comes at the cost of extra communication overhead, since at each shuffle, new data points need to be delivered to the distributed workers.
In this paper, we focus on characterizing the information theoretically optimal communication overhead for the distributed data shuffling problem.
We propose a novel coded data delivery scheme for the case of no excess storage, where every worker can only store the assigned data batches under processing.
Our scheme exploits a new type of coding opportunity and is applicable to any arbitrary shuffle, and for any number of workers.
We also present an information theoretic lower bound on the minimum communication overhead for data shuffling, and show that the proposed scheme matches this lower bound for the worst-case communication overhead.
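The kind of coding opportunity exploited can be illustrated with the classic two-worker example: if worker 1 caches batch A and needs B while worker 2 caches B and needs A, one XOR-coded broadcast replaces two uncoded transmissions (a toy illustration, not the paper's general scheme):

```python
def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Worker 1 caches batch A and needs B; worker 2 caches B and needs A.
A, B = b"batchA", b"batchB"
coded = xor_bytes(A, B)            # the master broadcasts one coded message
recovered_B = xor_bytes(coded, A)  # worker 1 cancels its cached copy of A
recovered_A = xor_bytes(coded, B)  # worker 2 cancels its cached copy of B
```

Each worker decodes by XOR-ing the broadcast with what it already stores, halving the communication overhead in this toy case.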
Since 2010, we have built and maintained LensKit, an open-source toolkit for building, researching, and learning about recommender systems.
We have successfully used the software in a wide range of recommender systems experiments, to support education in traditional classroom and online settings, and as the algorithmic backend for user-facing recommendation services in movies and books.
This experience, along with community feedback, has surfaced a number of challenges with LensKit's design and environmental choices.
In response to these challenges, we are developing a new set of tools that leverage the PyData stack to enable the kinds of research experiments and educational experiences that we have been able to deliver with LensKit, along with new experimental structures that the existing code makes difficult.
The result is a set of research tools that should significantly increase research velocity and provide much smoother integration with other software such as Keras while maintaining the same level of reproducibility as a LensKit experiment.
In this paper, we reflect on the LensKit project, particularly on our experience using it for offline evaluation experiments, and describe the next-generation LKPY tools for enabling new offline evaluations and experiments with flexible, open-ended designs and well-tested evaluation primitives.
In this paper we propose a Deep Neural Network (DNN) based Speech Enhancement (SE) system that is designed to maximize an approximation of the Short-Time Objective Intelligibility (STOI) measure.
We formalize an approximate-STOI cost function and derive analytical expressions for the gradients required for DNN training and show that these gradients have desirable properties when used together with gradient based optimization techniques.
We show through simulation experiments that the proposed SE system achieves large improvements in estimated speech intelligibility, when tested on matched and unmatched natural noise types, at multiple signal-to-noise ratios.
Furthermore, we show that the SE system, when trained using an approximate-STOI cost function, performs on par with a system trained with a mean square error cost applied to short-time temporal envelopes.
Finally, we show that the proposed SE system performs on par with a traditional DNN based Short-Time Spectral Amplitude (STSA) SE system in terms of estimated speech intelligibility.
These results are important because they suggest that traditional DNN based STSA SE systems might be optimal in terms of estimated speech intelligibility.
This paper presents an investigation of the approximation property of neural networks with unbounded activation functions, such as the rectified linear unit (ReLU), which is the new de-facto standard of deep learning.
The ReLU network can be analyzed by the ridgelet transform with respect to Lizorkin distributions.
Via three reconstruction formulas, based on the Fourier slice theorem, the Radon transform, and Parseval's relation, it is shown that a neural network with unbounded activation functions still satisfies the universal approximation property.
As an additional consequence, the ridgelet transform, or the backprojection filter in the Radon domain, is what the network learns after backpropagation.
Subject to a constructive admissibility condition, the trained network can be obtained by simply discretizing the ridgelet transform, without backpropagation.
Numerical examples not only support the consistency of the admissibility condition but also imply that some non-admissible cases result in low-pass filtering.
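For reference, one common convention for the ridgelet transform of a function $f:\mathbb{R}^{m}\to\mathbb{R}$ with respect to a ridgelet function $\psi$ (the normalization used in the paper may differ) is

```latex
\mathcal{R}_{\psi}f(a,b)
  = \int_{\mathbb{R}^{m}} f(x)\,\overline{\psi(a \cdot x - b)}\,\mathrm{d}x,
  \qquad (a,b) \in \mathbb{R}^{m} \times \mathbb{R},
```

and, under an admissibility condition pairing $\psi$ with a dual ridgelet $\eta$, $f$ is recovered (up to a constant) by the dual transform $\int \mathcal{R}_{\psi}f(a,b)\,\eta(a\cdot x-b)\,\mathrm{d}a\,\mathrm{d}b$, matching the backprojection interpretation of what the network learns.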
The paper explores the topic of Facial Action Unit (FAU) detection in the wild.
In particular, we are interested in answering the following questions: (1) how useful are residual connections across dense blocks for face analysis?
(2) how useful is the information from a network trained for categorical Facial Expression Recognition (FER) for the task of FAU detection?
The proposed network (ResiDen) exploits dense blocks along with residual connections and uses auxiliary information from a FER network.
The experiments are performed on the EmotionNet and DISFA datasets.
The experiments show the usefulness of facial expression information for AU detection.
The proposed network achieves state-of-the-art results on the two databases.
Analysis of the results for cross database protocol shows the effectiveness of the network.
CAPTCHAs or reverse Turing tests are real-time assessments used by programs (or computers) to tell humans and machines apart.
This is achieved by assigning and assessing hard AI problems that can be solved easily by humans but not by machines.
Applications of such assessments range from stopping spammers from automatically filling online forms to preventing hackers from performing dictionary attacks.
Today, the race between makers and breakers of CAPTCHAs is at a juncture, where the CAPTCHAs proposed are not even answerable by humans.
We consider such CAPTCHAs non-user-friendly.
In this paper, we propose a novel technique for reverse Turing tests - we call it the Line CAPTCHA - that mainly focuses on user friendliness while not compromising the security that such a system is expected to provide.
Memories that exploit three-dimensional (3D) stacking technology, which integrates memory and logic dies in a single stack, are becoming popular.
These memories, such as Hybrid Memory Cube (HMC), utilize a network-on-chip (NoC) design for connecting their internal structural organizations.
This novel usage of NoC, in addition to aiding processing-in-memory capabilities, enables numerous benefits such as high bandwidth and memory-level parallelism.
However, the implications of NoCs on the characteristics of 3D-stacked memories in terms of memory access latency and bandwidth have not been fully explored.
This paper addresses this knowledge gap by (i) characterizing an HMC prototype on the AC-510 accelerator board and revealing its access latency behaviors, and (ii) by investigating the implications of such behaviors on system and software designs.
Building a voice conversion (VC) system from non-parallel speech corpora is challenging but highly valuable in real application scenarios.
In most situations, the source and the target speakers do not repeat the same texts or they may even speak different languages.
In this case, one possible, although indirect, solution is to build a generative model for speech.
Generative models focus on explaining the observations with latent variables instead of learning a pairwise transformation function, thereby bypassing the requirement of speech frame alignment.
In this paper, we propose a non-parallel VC framework with a variational autoencoding Wasserstein generative adversarial network (VAW-GAN) that explicitly considers a VC objective when building the speech model.
Experimental results corroborate the capability of our framework for building a VC system from unaligned data, and demonstrate improved conversion quality.
Deep learning stands at the forefront of many computer vision tasks.
However, deep neural networks are usually data-hungry and require a huge amount of well-annotated training samples.
Collecting sufficient annotated data is very expensive in many applications, especially for pixel-level prediction tasks such as semantic segmentation.
To solve this fundamental issue, we consider a new challenging vision task, Internetly supervised semantic segmentation, which only uses Internet data with noisy image-level supervision of corresponding query keywords for segmentation model training.
We address this task by proposing the following solution.
A class-specific attention model unifying multiscale forward and backward convolutional features is proposed to provide initial segmentation "ground truth".
The model trained with such noisy annotations is then improved by an online fine-tuning procedure.
It achieves state-of-the-art performance under the weakly-supervised setting on PASCAL VOC2012 dataset.
The proposed framework also paves a new way towards learning from the Internet without human interaction and could serve as a strong baseline therein.
Code and data will be released upon acceptance of the paper.
Multi-robot systems have the potential to be utilized in a variety of applications.
In most previous works, trajectory generation for multi-robot systems is implemented in known environments.
To overcome this limitation, we present an online trajectory optimization algorithm in which robots communicate their current states to account for one another, while local object-based maps are used to identify obstacles.
Based on this data, we predict the trajectories the robots are expected to traverse and use them to avoid collisions by formulating regions of free space in which a robot can move without colliding with other robots or obstacles.
A trajectory is then optimized under the constraint that the robot remain within this region. The proposed method is tested in Gazebo simulations using ROS.
This paper presents a low-power ECG recording system-on-chip (SoC) with on-chip low-complexity lossless ECG compression for data reduction in wireless/ambulatory ECG sensor devices.
The chip uses a linear slope predictor for data compression, and incorporates a novel low-complexity dynamic coding-packaging scheme to frame the prediction error into fixed-length 16-bit format.
The proposed technique achieves an average compression ratio of 2.25x on the MIT-BIH ECG database.
Implemented in a standard 0.35 um process, the compressor uses 0.565K gates/channel occupying 0.4 mm2 for four channels, and consumes 535 nW/channel at 2.4 V for ECG sampled at 512 Hz.
Its small size and ultra-low power consumption make the proposed technique suitable for wearable ECG sensor applications.
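As a rough illustration of the predictor stage (the function names below are ours, and the chip's dynamic coding-packaging scheme is more involved than this sketch), a linear slope predictor encodes each sample as its deviation from a straight-line extrapolation of the two preceding samples, which is losslessly invertible:

```python
def slope_predict_errors(samples):
    """Linear slope predictor: x_hat[n] = 2*x[n-1] - x[n-2].
    Returns the first two samples verbatim plus the prediction errors."""
    errors = list(samples[:2])  # seed values sent uncompressed
    for n in range(2, len(samples)):
        errors.append(samples[n] - (2 * samples[n - 1] - samples[n - 2]))
    return errors

def slope_reconstruct(errors):
    """Lossless inverse: rebuild the original samples from the errors."""
    x = list(errors[:2])
    for n in range(2, len(errors)):
        x.append(errors[n] + 2 * x[n - 1] - x[n - 2])
    return x

ecg = [100, 102, 104, 107, 109, 110, 108, 105]
e = slope_predict_errors(ecg)
assert slope_reconstruct(e) == ecg  # round-trip is exact
```

Because quasi-linear segments of the waveform yield near-zero errors, the error stream is far more compressible than the raw samples; the chip then frames these errors into a fixed-length 16-bit format.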
The most popular stability notion in games is arguably Nash equilibrium, which assumes rational players who individually maximize their own payoffs.
In contrast, in many scenarios, players can be (partly) irrational with some unpredictable factors.
Hence a strategy profile can be more robust if it is resilient against certain irrational behaviors.
In this paper, we propose a stability notion that is resilient against envy.
A strategy profile is said to be envy-proof if no player can, by deviating, gain a competitive edge over the other players in terms of the change in utility.
Together with Nash equilibrium and another stability notion called immunity, we show how these separate notions are related to each other, whether they exist in games, and whether and when a strategy profile satisfying these notions can be efficiently found.
We answer these questions starting with general two-player games and then extend the discussion to approximate stability and to the corresponding fault-tolerance notions in multi-player games.
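For pure strategy profiles in a finite two-player game, such a condition can be checked by direct enumeration. The sketch below assumes one natural reading of the definition (a deviation must not increase the deviator's utility change strictly beyond the change it induces in the opponent's utility); the paper's formal definition may differ in detail:

```python
def is_envy_proof(A, B, i, j):
    """One reading of envy-proofness for the pure profile (i, j) of a
    bimatrix game with row-player payoffs A and column-player payoffs B:
    no unilateral deviation raises the deviator's utility change strictly
    above the change it induces in the opponent's utility."""
    for i2 in range(len(A)):                  # row-player deviations
        if A[i2][j] - A[i][j] > B[i2][j] - B[i][j]:
            return False
    for j2 in range(len(A[0])):               # column-player deviations
        if B[i][j2] - B[i][j] > A[i][j2] - A[i][j]:
            return False
    return True

# Prisoner's dilemma: mutual defection (1, 1) is envy-proof under this
# reading, while mutual cooperation (0, 0) is not.
A = [[3, 0], [5, 1]]          # row player's payoffs
B = [[3, 5], [0, 1]]          # column player's payoffs
assert is_envy_proof(A, B, 1, 1)
assert not is_envy_proof(A, B, 0, 0)
```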
This extended abstract describes an effort to build a formal description of a triangulation algorithm, starting from a naive description in which triangles, edges, and triangulations are simply given as sets and the most complex notions are those of boundary and separating edges.
When performing proofs about this algorithm, questions of symmetry appear and this exposition attempts to give an account of how these symmetries can be handled.
All this work relies on formal developments made with Coq and the Mathematical Components library.
A mobile-phone-based potable water quality assessment device was developed to analyze and study water pollution levels in the Indus river.
The Indus river is the habitat of the endangered Indus river dolphin, and water pollution is one of the major threats to the survival of this species.
We tested the device's performance at six locations along the Lahore canal; the pH of the canal water deviates from the normal range for irrigation water.
In the future, we will study the correlation between water pollution levels and the habitat usage of the Indus river dolphin using the water quality assessment device and a hydrophone-array-based passive acoustic monitoring (PAM) system.
As a powerful representation paradigm for networked and multi-typed data, the heterogeneous information network (HIN) is ubiquitous.
Meanwhile, defining proper relevance measures has always been a fundamental problem and of great pragmatic importance for network mining tasks.
Inspired by our probabilistic interpretation of existing path-based relevance measures, we propose to study HIN relevance from a probabilistic perspective.
From real-world data, we also identify and propose to model cross-meta-path synergy, a characteristic that is important for defining path-based HIN relevance but has not been modeled by existing methods.
A generative model is established to derive a novel path-based relevance measure, which is data-driven and tailored for each HIN.
We develop an inference algorithm to find the maximum a posteriori (MAP) estimate of the model parameters, which entails non-trivial tricks.
Experiments on two real-world datasets demonstrate the effectiveness of the proposed model and relevance measure.
Visual observation of cumulus-oocyte complexes provides only limited information about their functional competence, whereas molecular evaluation methods are cumbersome or costly.
Image analysis of mammalian oocytes can provide an attractive alternative to address this challenge.
However, it is complex, given the huge number of oocytes under inspection and the subjective nature of the features inspected for identification.
Supervised machine learning methods such as random forests, trained with annotations from expert biologists, can standardize the analysis task and reduce inter-subject variability.
We present a semi-automatic framework for predicting the class an oocyte belongs to, based on multi-object parametric segmentation on the acquired microscopic image followed by a feature based classification using random forests.
We present an open-source accessory for the NAO robot that enables testing computationally demanding algorithms on an external platform while preserving the robot's autonomy and mobility.
The platform takes the form of a backpack, which can be 3D printed and replicated, and holds an ODROID XU4 board to process algorithms externally with ROS compatibility.
We also provide a software bridge between the B-Human framework and ROS to access the robot's sensors in near real time.
We tested the platform in several robotics applications such as data logging, visual SLAM, and robot vision with deep learning techniques.
The CAD model, hardware specifications and software are available online for the benefit of the community: https://github.com/uchile-robotics/nao-backpack
The Industry 4.0 paradigm emphasizes the crucial benefits that collaborative robots, i.e., robots able to work alongside and together with humans, could bring to the whole production process.
In this context, an enabling technology not yet achieved is the design of flexible robots able to deal at all levels with humans' intrinsic variability, which is not only necessary for a comfortable working experience for the person but also a precious capability for efficiently dealing with unexpected events.
In this paper, a sensing, representation, planning and control architecture for flexible human-robot cooperation, referred to as FlexHRC, is proposed.
FlexHRC relies on wearable sensors for human action recognition, AND/OR graphs for the representation of and reasoning upon cooperation models, and a Task Priority framework to decouple action planning from robot motion planning and control.
XML data warehouses form an interesting basis for decision-support applications that exploit heterogeneous data from multiple sources.
However, XML-native database systems currently offer limited performance, and ways to optimize them need to be investigated.
In this paper, we propose a new index that is specifically adapted to the multidimensional architecture of XML warehouses and eliminates join operations, while preserving the information contained in the original warehouse.
A theoretical study and experimental results demonstrate the efficiency of our index, even when queries are complex.
Perceptual judgment of image similarity by humans relies on rich internal representations ranging from low-level features to high-level concepts, scene properties and even cultural associations.
However, existing methods and datasets attempting to explain perceived similarity, even those geared toward this goal, use stimuli that arguably do not cover the full breadth of factors affecting human similarity judgments.
We introduce a new dataset dubbed Totally-Looks-Like (TLL) after a popular entertainment website, which contains images paired by humans as being visually similar.
The dataset contains 6016 image-pairs from the wild, shedding light upon a rich and diverse set of criteria employed by human beings.
We conduct experiments to try to reproduce the pairings via features extracted from state-of-the-art deep convolutional neural networks, as well as additional human experiments to verify the consistency of the collected data.
Though we create conditions that artificially make the matching task increasingly easy, we show that machine-extracted representations perform very poorly in terms of reproducing the matching selected by humans.
We discuss and analyze these results, suggesting future directions for improvement of learned image representations.
In this paper, we present a methodology for customized communication architecture synthesis that matches the communication requirements of the target application.
This is an important problem, particularly for network-based implementations of complex applications.
Our approach is based on using frequently encountered generic communication primitives as an alphabet capable of characterizing any given communication pattern.
The proposed algorithm searches through the entire design space for a solution that minimizes the system total energy consumption, while satisfying the other design constraints.
Compared to the standard mesh architecture, the customized architecture generated by the newly proposed approach shows about 36% throughput increase and 51% reduction in the energy required to encrypt 128 bits of data with a standard encryption algorithm.
In this paper, we present algorithms and construction methods for self-dual codes over finite fields using orthogonal matrices.
Randomization in the orthogonal group, and code extension are the main tools.
Some optimal, almost MDS, and MDS self-dual codes over both small and large prime fields are constructed.
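As a small illustration of the underlying algebra (not the paper's algorithms), a generator matrix G = [I | A] over GF(p) generates a self-dual code of length 2k when A·Aᵀ ≡ -I (mod p); scaling an orthogonal matrix M by a constant c with c² ≡ -1 (mod p), which exists when p ≡ 1 (mod 4), yields such an A:

```python
def is_self_dual_generator(G, p):
    """Check G·Gᵀ ≡ 0 (mod p) for a k x 2k generator matrix, the
    self-orthogonality condition that, at full rank, gives a self-dual code."""
    k, n = len(G), len(G[0])
    if n != 2 * k:
        return False
    for r in range(k):
        for s in range(k):
            if sum(G[r][t] * G[s][t] for t in range(n)) % p != 0:
                return False
    return True

# Build G = [I | cM] with M orthogonal and c^2 = -1 (mod p); here p = 5,
# c = 2 (since 4 = -1 mod 5) and M = I_2, so A.A^T = c^2 I = -I (mod 5).
p, c, k = 5, 2, 2
G = [[1 if t == r else (c if t == r + k else 0) for t in range(2 * k)]
     for r in range(k)]
assert is_self_dual_generator(G, p)
```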
This study investigates how adequate coordination among the different cognitive processes of a humanoid robot can be developed through end-to-end learning of direct perception of the visuomotor stream.
We propose a deep dynamic neural network model built on a dynamic vision network, a motor generation network, and a higher-level network.
The proposed model was designed to process and to integrate direct perception of dynamic visuomotor patterns in a hierarchical model characterized by different spatial and temporal constraints imposed on each level.
We conducted synthetic robotic experiments in which a robot learned to read a human's intention by observing gestures and then to generate the corresponding goal-directed actions.
Results verify that the proposed model is able to learn the tutored skills and to generalize them to novel situations.
The model showed synergic coordination of perception, action and decision making, and it integrated and coordinated a set of cognitive skills including visual perception, intention reading, attention switching, working memory, action preparation and execution in a seamless manner.
Analysis reveals that coherent internal representations emerged at each level of the hierarchy.
Higher-level representations reflecting actional intention developed through continuous integration of the lower-level visuo-proprioceptive stream.
Consider two horizontal lines in the plane.
A pair consisting of a point on the top line and an interval on the bottom line defines a triangle between the two lines.
The intersection graph of such triangles is called a simple-triangle graph.
This paper shows a vertex ordering characterization of simple-triangle graphs as follows: a graph is a simple-triangle graph if and only if there is a linear ordering of the vertices that contains both an alternating orientation of the graph and a transitive orientation of the complement of the graph.
We present data-driven techniques to augment Bag of Words (BoW) models, which allow for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori.
Our approach specifically addresses the limitations of standard BoW approaches, which fail to represent the underlying temporal and causal information that is inherent in activity streams.
In addition, we also propose the use of randomly sampled regular expressions to discover and encode patterns in activities.
We demonstrate the effectiveness of our approach in experimental evaluations where we successfully recognize activities and detect anomalies in four complex datasets.
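The random-sampling idea can be sketched as follows, with activity streams encoded as symbol strings. The generator and the support threshold below are illustrative assumptions; the paper's sampling scheme and scoring are more elaborate:

```python
import random
import re

def sample_regex(alphabet, rng, max_parts=4):
    """Sample a small regex over activity symbols, e.g. 'a.*b' or 'ab+'
    (an illustrative generator; the paper's sampling scheme may differ)."""
    parts = [rng.choice(alphabet) + rng.choice(["", "+", ".*"])
             for _ in range(rng.randint(2, max_parts))]
    return "".join(parts)

def discover_patterns(sequences, alphabet, n_samples=500, min_support=0.8, seed=0):
    """Keep sampled expressions matching at least min_support of the streams."""
    rng = random.Random(seed)
    kept = set()
    for _ in range(n_samples):
        pattern = sample_regex(alphabet, rng)
        hits = sum(1 for s in sequences if re.search(pattern, s))
        if hits >= min_support * len(sequences):
            kept.add(pattern)
    return kept

# Toy activity streams over symbols a/b/c (e.g. enter / walk / exit).
seqs = ["abc", "aabbc", "abbbc"]
patterns = discover_patterns(seqs, "abc")
assert patterns  # some shared patterns survive the support filter
```

Patterns that survive the support filter encode recurring temporal structure, and deviations from them can flag anomalous activity streams.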
WiFi signals are the main localization modality of existing personal indoor localization systems operating on mobile devices.
WiFi fingerprinting is also used for mobile robots, as WiFi signals are usually available indoors and can provide rough initial position estimate or can be used together with other positioning systems.
Currently, the best solutions rely on filtering, manual data analysis, and time-consuming parameter tuning to achieve reliable and accurate localization.
In this work, we propose to use deep neural networks to significantly lower the work-force burden of the localization system design, while still achieving satisfactory results.
Assuming the state-of-the-art hierarchical approach, we employ the DNN system for building/floor classification.
We show that stacked autoencoders can efficiently reduce the feature space, enabling robust and precise classification.
The proposed architecture is verified on the publicly available UJIIndoorLoc dataset and the results are compared with other solutions.
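A minimal sketch of the autoencoder idea follows, assuming toy synthetic fingerprints rather than the UJIIndoorLoc data and a single layer trained with plain gradient descent (a stacked autoencoder trains several such layers greedily, and the paper's architecture is deeper):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, hidden, epochs=500, lr=0.05):
    """One autoencoder layer: encode h = tanh(x W1 + b1), decode
    x_hat = h W2 + b2, trained on squared reconstruction error."""
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        G = 2.0 * (H @ W2 + b2 - X) / n       # d(loss)/d(x_hat)
        dH = (G @ W2.T) * (1.0 - H ** 2)      # back-propagate through tanh
        W2 -= lr * (H.T @ G); b2 -= lr * G.sum(0)
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(0)
    loss = float(((np.tanh(X @ W1 + b1) @ W2 + b2 - X) ** 2).mean())
    return (lambda Z: np.tanh(Z @ W1 + b1)), loss

# Toy "fingerprints": 40 noisy 8-D signal-strength vectors driven by 2 factors.
F = rng.normal(size=(40, 2))
X = F @ rng.normal(size=(2, 8)) + 0.01 * rng.normal(size=(40, 8))
X = (X - X.mean(0)) / X.std(0)                # standardize for tanh units
encode, loss = train_autoencoder(X, hidden=2)
assert encode(X).shape == (40, 2)             # reduced feature space
assert loss < 1.0                             # beats a constant predictor
```

The encoder output then serves as the compact feature vector fed to the building/floor classifier.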
Varying weather conditions, including rainfall and snowfall, are generally regarded as a challenge for computer vision algorithms.
One proposed solution to the challenges induced by rain and snowfall is to artificially remove the rain from images or video using rain removal algorithms.
It is the promise of these algorithms that the rain-removed image frames will improve the performance of subsequent segmentation and tracking algorithms.
However, rain removal algorithms are typically evaluated on their ability to remove synthetic rain on a small subset of images.
Currently, their behavior is unknown on real-world videos when integrated with a typical computer vision pipeline.
In this paper, we review the existing rain removal algorithms and propose a new dataset that consists of 22 traffic surveillance sequences under a broad variety of weather conditions that all include either rain or snowfall.
We propose a new evaluation protocol that evaluates the rain removal algorithms on their ability to improve the performance of subsequent segmentation, instance segmentation, and feature tracking algorithms under rain and snow.
If successful, the de-rained frames of a rain removal algorithm should improve segmentation performance and increase the number of accurately tracked features.
The results show that a recent single-frame-based rain removal algorithm increases segmentation performance by 19.7% on our proposed dataset, but it decreases feature tracking performance and shows mixed results with recent instance segmentation methods.
However, the best video-based rain removal algorithm improves the feature tracking accuracy by 7.72%.
Coherent network error correction is the error-control problem in network coding with the knowledge of the network codes at the source and sink nodes.
With respect to a given set of local encoding kernels defining a linear network code, we obtain refined versions of the Hamming bound, the Singleton bound and the Gilbert-Varshamov bound for coherent network error correction.
Similar to its classical counterpart, this refined Singleton bound is tight for linear network codes.
The tightness of this refined bound is shown by two construction algorithms of linear network codes achieving this bound.
These two algorithms illustrate different design methods: one makes use of existing network coding algorithms for error-free transmission and the other makes use of classical error-correcting codes.
The implication of the tightness of the refined Singleton bound is that the sink nodes with higher maximum flow values can have higher error correction capabilities.
Probabilistic-driven classification techniques extend the role of traditional approaches that output labels (usually integer numbers) only.
Such techniques are more fruitful when dealing with problems where one is interested not only in recognition/identification but also in monitoring the behavior of consumers and/or machines, for instance.
Therefore, by means of probability estimates, one can take decisions to work better in a number of scenarios.
In this paper, we propose a probabilistic Optimum Path Forest (OPF) classifier to handle binary classification problems, and we show that it can be more accurate than the naive OPF on a number of datasets.
Beyond the gain in accuracy, probabilistic OPF turns out to be another useful tool for the scientific community.
Nowadays, deep learning has been widely used.
In natural language processing, its high degree of flexibility has enabled the analysis of complex semantics.
Deceptive opinion detection is an important application area for deep learning models, and related mechanisms have attracted attention and research.
Online opinions are quite short and varied in type and content.
In order to effectively identify deceptive opinions, we need to comprehensively study the characteristics of deceptive opinions, and explore novel characteristics besides the textual semantics and emotional polarity that have been widely used in text analysis.
The detection mechanism based on deep learning has better self-adaptability and can effectively identify all kinds of deceptive opinions.
In this paper, we optimize the convolutional neural network model by embedding word order characteristics in its convolution and pooling layers, which makes the convolutional neural network more suitable for various text classification tasks and for deceptive opinion detection.
TensorFlow-based experiments demonstrate that the detection mechanism proposed in this paper achieves more accurate deceptive opinion detection results.
A simile is a figure of speech that compares two things using connecting words, where the comparison is not intended to be taken literally.
Similes are often used in everyday communication, but they are also a part of linguistic cultural heritage.
In this paper we present a methodology for semi-automated collection of similes from the World Wide Web using text mining and machine learning techniques.
We expanded an existing corpus by collecting 442 similes from the internet and adding them to the existing corpus collected by Vuk Stefanovic Karadzic that contained 333 similes.
We also introduce crowdsourcing to the collection of figures of speech, which helped us build a corpus containing 787 unique similes.
Microscopic histology image analysis is a cornerstone in early detection of breast cancer.
However, these images are very large, and manual analysis is error-prone and very time-consuming.
Thus, automating this process is in high demand.
We propose a hierarchical system of convolutional neural networks (CNNs) that automatically classifies patches of these images into four pathologies: normal, benign, in situ carcinoma, and invasive carcinoma.
We evaluated our system on the BACH challenge dataset of image-wise classification and a small dataset that we used to extend it.
Using a 75%/25% train/test split, we achieved an accuracy of 0.99 on the test split of the BACH dataset and 0.96 on that of the extension.
On the BACH challenge test set, we reached an accuracy of 0.81, which ranks us 8th out of 51 teams.
We present a constrained motion control framework for a redundant surgical system designed for minimally invasive treatment of pelvic osteolysis.
The framework comprises a kinematics model of a six Degrees-of-Freedom (DoF) robotic arm integrated with a one DoF continuum manipulator as well as a novel convex optimization redundancy resolution controller.
To solve the redundancy resolution problem, formulated as a constrained l2-regularized quadratic minimization, we study and evaluate the potential use of an optimally tuned alternating direction method of multipliers (ADMM) algorithm.
To this end, we prove global convergence of the algorithm at linear rate and propose expressions for the involved parameters resulting in a fast convergence.
Simulations on the robotic system verified our analytical derivations and showed the capability and robustness of the ADMM algorithm in constrained motion control of our redundant surgical system.
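The splitting behind ADMM for this problem class can be sketched on a simplified stand-in: an l2-regularized least-squares objective with box constraints instead of the paper's full kinematic constraint set. The function below is our illustrative construction, not the paper's controller:

```python
import numpy as np

def admm_box_qp(A, b, lam, lo, hi, rho=1.0, iters=200):
    """ADMM sketch for min 0.5||Ax-b||^2 + 0.5*lam*||x||^2  s.t. lo <= x <= hi.
    Split x = z with z box-constrained: the x-update is a linear solve,
    the z-update a Euclidean projection (clip), u the scaled dual variable."""
    n = A.shape[1]
    M = A.T @ A + (lam + rho) * np.eye(n)     # constant; could be factored once
    Atb = A.T @ b
    x = z = u = np.zeros(n)
    for _ in range(iters):
        x = np.linalg.solve(M, Atb + rho * (z - u))
        z = np.clip(x + u, lo, hi)            # projection onto the box
        u = u + x - z
    return z

# Unconstrained minimizer is b = [2, -3]; the box clips it to [1, -1].
A, b = np.eye(2), np.array([2.0, -3.0])
x = admm_box_qp(A, b, lam=0.0, lo=-1.0, hi=1.0)
assert np.allclose(x, [1.0, -1.0], atol=1e-4)
```

Because the strongly convex quadratic gives linear convergence, the penalty rho can be tuned for speed, mirroring the optimally tuned parameters derived in the paper.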
An examination of object recognition challenge leaderboards (ILSVRC, PASCAL-VOC) reveals that the top-performing classifiers typically exhibit small differences amongst themselves in terms of error rate/mAP.
To better differentiate the top performers, additional criteria are required.
Moreover, the (test) images, on which the performance scores are based, predominantly contain fully visible objects.
Therefore, `harder' test images, mimicking the challenging conditions (e.g. occlusion) in which humans routinely recognize objects, need to be utilized for benchmarking.
To address the concerns mentioned above, we make two contributions.
First, we systematically vary the level of local object-part content, global detail and spatial context in images from PASCAL VOC 2010 to create a new benchmarking dataset dubbed PPSS-12.
Second, we propose an object-part based benchmarking procedure which quantifies classifiers' robustness to a range of visibility and contextual settings.
The benchmarking procedure relies on a semantic similarity measure that naturally addresses potential semantic granularity differences between the category labels in training and test datasets, thus eliminating manual mapping.
We use our procedure on the PPSS-12 dataset to benchmark top-performing classifiers trained on the ILSVRC-2012 dataset.
Our results show that the proposed benchmarking procedure enables additional differentiation among state-of-the-art object classifiers in terms of their ability to handle missing content and insufficient object detail.
Given this capability for additional differentiation, our approach can potentially supplement existing benchmarking procedures used in object recognition challenge leaderboards.
Loop closure detection, the task of identifying locations revisited by a robot in a sequence of odometry and perceptual observations, is typically formulated as a combination of two subtasks: (1) bag-of-words image retrieval and (2) post-verification using RANSAC geometric verification.
The main contribution of this study is a novel post-verification framework that achieves a good precision-recall trade-off in loop closure detection.
This study is motivated by the fact that not all loop closure hypotheses are equally plausible (e.g., owing to mutual consistency between loop closure constraints) and that if we have evidence that one hypothesis is more plausible than the others, then it should be verified more frequently.
We demonstrate that the problem of loop closure detection can be viewed as an instance of a multi-model hypothesize-and-verify framework and build guided sampling strategies on the framework where loop closures proposed using image retrieval are verified in a planned order (rather than in a conventional uniform order) to operate in a constant time.
Experimental results using a stereo SLAM system confirm that the proposed strategy, the use of loop closure constraints and robot trajectory hypotheses as a guide, achieves promising results despite the fact that there exists a significant number of false positive constraints and hypotheses.
Wireless sensor networks (WSNs) have become an integral part of our lives.
These networks can be used for monitoring data in various domains due to their flexibility and functionality.
Query processing and optimization in WSNs is very challenging because of their energy and memory constraints.
In this paper, we first review the approaches that have had a significant impact on the development of query processing techniques for WSNs.
Finally, we illustrate the approaches used in popular query processing engines, along with future research challenges in query optimization.
This paper describes the HASYv2 dataset.
HASY is a publicly available, free-of-charge dataset of single symbols, similar to MNIST.
It contains 168233 instances of 369 classes.
HASY contains two challenges: A classification challenge with 10 pre-defined folds for 10-fold cross-validation and a verification challenge.
In this paper we present a novel distributed coverage control framework for a network of mobile agents in charge of covering a finite set of points of interest (PoIs), such as people in danger, geographically dispersed equipment, or environmental landmarks.
The proposed algorithm is inspired by C-Means, an unsupervised learning algorithm originally proposed for non-exclusive clustering and for identification of cluster centroids from a set of observations.
To cope with the agents' limited sensing range and avoid infeasible coverage solutions, traditional C-Means needs to be enhanced with proximity constraints, ensuring that each agent takes into account only neighboring PoIs.
The proposed coverage control framework provides useful information concerning the ranking or importance of the different PoIs to the agents, which can be exploited in further application-dependent data fusion processes, patrolling, or disaster relief applications.
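The base algorithm can be sketched in a few lines. The proximity-constraint handling below is a crude simplification of the paper's scheme (memberships beyond the sensing range are simply zeroed), and the scenario is a toy one:

```python
import numpy as np

def c_means(points, agents_init, m=2.0, sensing_range=np.inf, iters=60):
    """C-Means-style coverage sketch: soft memberships weight each PoI toward
    nearby agents; a crude proximity constraint zeroes memberships of PoIs
    beyond an agent's sensing range (a simplification of the paper's scheme)."""
    agents = np.array(agents_init, dtype=float)
    for _ in range(iters):
        # distance of every PoI (rows) to every agent (columns)
        d = np.linalg.norm(points[:, None, :] - agents[None, :, :], axis=2)
        d = np.maximum(d, 1e-9)
        U = d ** (-2.0 / (m - 1.0))          # standard C-Means memberships
        U[d > sensing_range] = 0.0           # proximity constraint
        U /= np.maximum(U.sum(axis=1, keepdims=True), 1e-12)
        W = U ** m                           # fuzzified weights
        agents = (W.T @ points) / np.maximum(W.sum(axis=0)[:, None], 1e-12)
    return agents, U

rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(10, 0.5, (20, 2))])
agents, U = c_means(pts, agents_init=pts[[0, 20]])   # one seed per group
assert np.linalg.norm(agents[0] - agents[1]) > 5     # agents cover both groups
```

The membership matrix U is exactly the per-PoI ranking information the abstract mentions: row j gives how strongly each agent is responsible for PoI j.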
Conversational agents are gaining popularity with the increasing ubiquity of smart devices.
However, training agents in a data-driven manner is challenging due to a lack of suitable corpora.
This paper presents a novel method for gathering topical, unstructured conversational data in an efficient way: self-dialogues through crowd-sourcing.
Alongside this paper, we include a corpus of 3.6 million words across 23 topics.
We argue the utility of the corpus by comparing self-dialogues with standard two-party conversations as well as data from other corpora.
We propose to take a novel approach to robot system design where each building block of a larger system is represented as a differentiable program, i.e. a deep neural network.
This representation allows for integrating algorithmic planning and deep learning in a principled manner, and thus combine the benefits of model-free and model-based methods.
We apply the proposed approach to a challenging partially observable robot navigation task.
The robot must navigate to a goal in a previously unseen 3-D environment without knowing its initial location, and instead relying on a 2-D floor map and visual observations from an onboard camera.
We introduce the Navigation Networks (NavNets) that encode state estimation, planning and acting in a single, end-to-end trainable recurrent neural network.
In preliminary simulation experiments we successfully trained navigation networks to solve the challenging partially observable navigation task.
Developing methods of automated inference that are able to provide users with compelling human-readable justifications for why the answer to a question is correct is critical for domains such as science and medicine, where user trust and detecting costly errors are limiting factors to adoption.
One of the central barriers to training question answering models on explainable inference tasks is the lack of gold explanations to serve as training data.
In this paper we present a corpus of explanations for standardized science exams, a recent challenge task for question answering.
We manually construct a corpus of detailed explanations for nearly all publicly available standardized elementary science questions (approximately 1,680 3rd through 5th grade questions) and represent these as "explanation graphs" -- sets of lexically overlapping sentences that describe how to arrive at the correct answer to a question through a combination of domain and world knowledge.
We also provide an explanation-centered tablestore, a collection of semi-structured tables that contain the knowledge to construct these elementary science explanations.
Together, these two knowledge resources map out a substantial portion of the knowledge required for answering and explaining elementary science exams, and provide both structured and free-text training data for the explainable inference task.
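The "lexically overlapping sentences" structure can be illustrated with a minimal sketch, assuming a toy stopword list and simple whitespace tokenization (the corpus itself uses curated tablestore sentences):

```python
def explanation_graph(sentences,
                      stopwords=frozenset({"the", "a", "of", "to", "is", "in"})):
    """Link sentences that share at least one non-stopword token, mirroring
    the lexical-overlap edges of an explanation graph."""
    toks = [set(s.lower().split()) - stopwords for s in sentences]
    edges = [(i, j) for i in range(len(sentences))
             for j in range(i + 1, len(sentences))
             if toks[i] & toks[j]]
    return edges

expl = ["a magnet attracts iron",
        "the nail is made of iron",
        "friction produces heat"]
assert explanation_graph(expl) == [(0, 1)]  # linked through the word "iron"
```

Chaining such overlap edges from the question to the answer is what lets an explanation graph trace a human-readable line of reasoning.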
Symmetric positive definite (SPD) matrices are useful for capturing second-order statistics of visual data.
To compare two SPD matrices, several measures are available, such as the affine-invariant Riemannian metric, Jeffreys divergence, and the Jensen-Bregman logdet divergence; however, their behaviors may be application-dependent, raising the need for manual selection to achieve the best possible performance.
Further, as a result of their overwhelming complexity for large-scale problems, computing pairwise similarities by clever embedding of SPD matrices is often preferred to direct use of the aforementioned measures.
In this paper, we propose a discriminative metric learning framework, Information Divergence and Dictionary Learning (IDDL), that not only learns application specific measures on SPD matrices automatically, but also embeds them as vectors using a learned dictionary.
To learn the similarity measures (which could potentially be distinct for every dictionary atom), we use the recently introduced alpha-beta-logdet divergence, which is known to unify the measures listed above.
We propose a novel IDDL objective, that learns the parameters of the divergence and the dictionary atoms jointly in a discriminative setup and is solved efficiently using Riemannian optimization.
We showcase extensive experiments on eight computer vision datasets, demonstrating state-of-the-art performances.
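Two of the measures mentioned above can be computed directly, which makes the "which measure fits this application" problem concrete. A minimal sketch (standard formulas, not the learned alpha-beta-logdet parameterization of the paper):

```python
import numpy as np

def airm(A, B):
    """Affine-invariant Riemannian metric: ||log(A^{-1/2} B A^{-1/2})||_F."""
    w, V = np.linalg.eigh(A)
    Ais = V @ np.diag(w ** -0.5) @ V.T            # A^{-1/2}
    mu = np.linalg.eigvalsh(Ais @ B @ Ais)        # eigenvalues of A^{-1}B
    return float(np.sqrt((np.log(mu) ** 2).sum()))

def jbld(A, B):
    """Jensen-Bregman logdet divergence:
    logdet((A+B)/2) - (logdet(A) + logdet(B))/2."""
    ld = lambda M: np.linalg.slogdet(M)[1]
    return float(ld((A + B) / 2) - 0.5 * (ld(A) + ld(B)))

A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 4.0]])
assert abs(airm(A, A)) < 1e-9 and abs(jbld(A, A)) < 1e-9
assert airm(A, B) > 0 and jbld(A, B) > 0
```

The alpha-beta-logdet divergence interpolates between such measures, which is what allows IDDL to learn a suitable measure per dictionary atom instead of fixing one by hand.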
We propose an improvement of an oceanographic three dimensional variational assimilation scheme (3D-VAR), named OceanVar, by introducing a recursive filter (RF) with the third order of accuracy (3rd-RF), instead of a RF with first order of accuracy (1st-RF), to approximate horizontal Gaussian covariances.
An advantage of the proposed scheme is that the CPU time can be substantially reduced, with benefits for large-scale applications.
Experiments estimating the impact of 3rd-RF are performed by assimilating oceanographic data in two realistic oceanographic applications.
The results evince benefits in terms of assimilation process computational time, accuracy of the Gaussian correlation modeling, and show that the 3rd-RF is a suitable tool for operational data assimilation.
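As an illustrative 1-D analogue (a sketch, not OceanVar code), a first-order recursive filter applies a forward and a backward sweep; the combined impulse response is a symmetric two-sided exponential, and iterating the sweeps tends toward a Gaussian shape, which the third-order variant approximates more accurately per pass:

```python
import numpy as np

def rf1_pass(x, a):
    # Forward-backward sweep of a first-order recursive filter; the
    # combined impulse response is a symmetric two-sided exponential
    # with unit DC gain (the array is assumed long enough that boundary
    # truncation is negligible).
    y = np.empty_like(x, dtype=float)
    acc = 0.0
    for i in range(len(x)):                # forward sweep
        acc = (1.0 - a) * x[i] + a * acc
        y[i] = acc
    acc = 0.0
    for i in range(len(x) - 1, -1, -1):    # backward sweep
        acc = (1.0 - a) * y[i] + a * acc
        y[i] = acc
    return y
```

The relation between the coefficient a, the number of iterations, and the target Gaussian length scale is calibrated separately and is not reproduced here.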
Over the years, several meta-heuristic algorithms have been proposed and are now emerging as common methods for constrained optimization problems.
Among them, genetic algorithms (GAs) stand out as popular evolutionary algorithms (EAs) in engineering optimization.
Most engineering design problems are difficult to resolve with conventional optimization algorithms because they are highly nonlinear and contain constraints.
In order to handle these constraints, the most common technique is to apply penalty functions.
The major drawback is that they require tuning of parameters, which can be very challenging.
In this paper, we present a constraint-handling technique for GAs that relies solely on the violation factor, called the VCH (Violation Constraint-Handling) method.
Several benchmark problems from the literature are examined.
The VCH technique was able to provide a consistent performance and match results from other GA-based techniques.
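The essence of a violation-factor-based comparison can be sketched as a pairwise selection rule that never mixes objective and violation (a simplified sketch in the spirit of VCH; the paper's exact rule may differ). Note that no penalty weights appear, so there is nothing to tune:

```python
def better(ind_a, ind_b):
    # Each individual is (objective, violation); violation == 0 means
    # feasible. Feasible beats infeasible; among feasible, lower
    # objective wins; among infeasible, lower violation wins.
    (fa, va), (fb, vb) = ind_a, ind_b
    if va == 0 and vb == 0:
        return ind_a if fa <= fb else ind_b
    if va == 0 or vb == 0:
        return ind_a if va == 0 else ind_b
    return ind_a if va <= vb else ind_b
```

Such a rule drops directly into tournament selection, replacing a penalized fitness comparison.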
With virtual reality, digital painting on 2D canvases is now being extended to 3D spaces.
Tilt Brush and Oculus Quill are widely accepted among artists as tools that pave the way to a new form of art - 3D immersive painting.
Current 3D painting systems are only a start, emitting textured triangular geometries.
In this paper, we advance this new art of 3D painting to 3D volumetric painting that enables an artist to draw a huge scene with full control of spatial color fields.
Inspired by the fact that 2D paintings often use vast space to paint background and small but detailed space for foreground, we claim that supporting a large canvas in varying detail is essential for 3D painting.
In order to help artists focus and audiences navigate the large canvas space, we provide small artist-defined areas, called rooms, that serve as beacons for artist-suggested scales, spaces, and locations from which the painting is intended to be appreciated.
Artists and audiences can easily transport themselves between different rooms.
Technically, our canvas is represented as an array of deep octrees of depth 24 or higher, built on the CPU for volume painting and on the GPU for volume rendering using accurate ray casting.
On the CPU side, we design an efficient iterative algorithm that refines or coarsens the octree in response to volumetric painting strokes at highly interactive rates and updates the corresponding GPU textures.
We then use GPU-based ray casting algorithms to render the volumetric painting result.
We explore precision issues stemming from ray-casting the octree of high depth, and provide a new analysis and verification.
From our experimental results as well as the positive feedback from the participating artists, we strongly believe that our new 3D volume painting system can open up a new possibility for VR-driven digital art medium to professional artists as well as to novice users.
The concept of the augmented coaching ecosystem for non-obtrusive adaptive personalized elderly care is proposed on the basis of the integration of new and available ICT approaches.
They include the multimodal user interface (MMUI), augmented reality (AR), machine learning (ML), Internet of Things (IoT), and machine-to-machine (M2M) interactions.
The ecosystem is based on the services of the Cloud-Fog-Dew computing paradigm, providing a full symbiosis that integrates the whole range from low-level sensors up to high-level services through the synergistic use of the applied technologies.
Within this ecosystem, all of them are encapsulated in the following network layers: the Dew, Fog, and Cloud computing layers.
Instead of the "spaghetti connections", "mosaic of buttons", "puzzles of output data", etc., the proposed ecosystem provides the strict division in the following dataflow channels: consumer interaction channel, machine interaction channel, and caregiver interaction channel.
This concept allows the physical, cognitive, and mental load on elderly care stakeholders to be decreased by reducing the secondary human-to-human (H2H), human-to-machine (H2M), and machine-to-human (M2H) interactions in favor of M2M interactions and a distributed Dew computing services environment.
This makes the non-obtrusive augmented reality ecosystem applicable to effective personalized elderly care, preserving the physical, cognitive, mental, and social well-being of the elderly.
Human and artificial organizations may be described as networks of interacting parts.
Those parts exchange data and control information and, as a result of these interactions, organizations produce emergent behaviors and purposes -- traits that characterize "the whole" as "greater than the sum of its parts".
In this chapter it is argued that, rather than a static and immutable property, emergence should be interpreted as the result of dynamic interactions between forces of opposite sign: centripetal (positive) forces strengthening emergence by consolidating the whole and centrifugal (negative) forces that weaken the social persona and as such are detrimental to emergence.
The result of this interaction is referred to in this chapter as the "quality of emergence".
This problem is discussed in the context of a particular class of organizations: conventional hierarchies.
We highlight how traditional designs produce behaviors that may severely impact the quality of emergence.
Finally we discuss a particular class of organizations that do not suffer from the limitations typical of strict hierarchies and result in greater quality of emergence.
In some cases, however, these enhancements are counterweighted by a reduced degree of controllability and verifiability.
Self Organizing Migrating Algorithm (SOMA) is a meta-heuristic algorithm based on the self-organizing behavior of individuals in a simulated social environment.
SOMA performs iterative computations on a population of potential solutions in the given search space to obtain an optimal solution.
In this paper, an Opportunistic Self Organizing Migrating Algorithm (OSOMA) has been proposed that introduces a novel strategy to generate perturbations effectively.
This strategy allows the individual to span more possible solutions and thus produce better solutions.
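For reference, a minimal sketch of a classic SOMA migration step, the baseline that an opportunistic perturbation strategy would modify (the PRT vector is regenerated at every jump here; some variants draw it once per migration):

```python
import numpy as np

def soma_migrate(x, leader, f, rng, prt=0.3, path_len=3.0, step=0.31):
    # One migration: the individual jumps toward (and beyond) the
    # leader in increments of `step`, perturbing only the dimensions
    # selected by the PRT vector; the best visited position is kept.
    best, best_f = x.copy(), f(x)
    t = step
    while t <= path_len:
        mask = (rng.random(x.size) < prt).astype(float)  # PRT vector
        cand = x + (leader - x) * t * mask
        fc = f(cand)
        if fc < best_f:
            best, best_f = cand.copy(), fc
        t += step
    return best, best_f
```

Because the best visited position is retained, a migration never worsens the individual's objective value.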
A comprehensive analysis of OSOMA on multi-dimensional unconstrained benchmark test functions is performed.
OSOMA is then applied to solve real-time Dynamic Traveling Salesman Problem (DTSP).
The problem of real-time DTSP has been stipulated and simulated using real-time data from Google Maps with a varying cost-metric between any two cities.
Although DTSP is a very common and intuitive model in the real world, its presence in literature is still very limited.
OSOMA performs exceptionally well on the problems mentioned above.
To substantiate this claim, the performance of OSOMA is compared with SOMA, Differential Evolution and Particle Swarm Optimization.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice.
Such a distribution mismatch will lead to a significant performance drop.
In this work, we aim to improve the cross-domain robustness of object detection.
We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc.
We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy.
The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner.
The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model.
We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc.
The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
In this paper, we propose reinforcement learning algorithms with a generalized reward function.
In our proposed method, we use the Q-learning and SARSA algorithms with a generalized reward function to train the reinforcement learning agent.
We evaluated the performance of our proposed algorithms on two real-time strategy games called BattleCity and S3.
There are two main advantages of having such an approach as compared to other works in RTS.
(1) We can dispense with a simulator, which is often game specific and usually hard-coded in RTS games; (2) our system can learn from interaction with any opponent and quickly adapt its strategy to the opponent, without needing the human traces used in previous works.
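For reference, the two tabular update rules; a generalized reward simply changes the value of r fed into them (the paper's specific reward function is not reproduced here):

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Off-policy: the target bootstraps on the greedy next action.
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    # On-policy: the target bootstraps on the action actually taken.
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])
```

The only structural difference is the bootstrap term, which is why the same reward shaping can be plugged into both algorithms unchanged.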
Keywords: Reinforcement learning, Machine learning, Real-time strategy, Artificial intelligence.
To overcome the tradeoff of the conventional normalized least mean square (NLMS) algorithm between fast convergence rate and low steady-state misalignment, this paper proposes a variable step size (VSS) NLMS algorithm by devising a new strategy to update the step size.
In this strategy, the input signal power and the cross-correlation between the input signal and the error signal are used to estimate the true tracking error power, reducing the effect of the system noise on the algorithm performance.
Moreover, the steady-state performance of the algorithm is analyzed for a white Gaussian input signal and verified by simulations.
Finally, simulation results in the context of the system identification and acoustic echo cancellation (AEC) have demonstrated that the proposed algorithm has lower steady-state misalignment than other VSS algorithms.
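A minimal sketch of the fixed-step NLMS baseline; the proposed VSS rule would replace the constant mu with a time-varying estimate built from the input power and the input-error cross-correlation (that estimator is not reproduced here):

```python
import numpy as np

def nlms_step(w, x, d, mu, eps=1e-8):
    # One NLMS iteration: the update is normalized by the input power,
    # so mu is dimensionless (0 < mu < 2 for stability).
    e = d - w @ x
    w = w + mu * e * x / (x @ x + eps)
    return w, e
```

In a noiseless system-identification setting, repeated application drives the weights to the true filter, which is the tradeoff regime (convergence speed vs. steady-state misalignment) the VSS strategy targets.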
Despite the significant progress that has been made on estimating optical flow recently, most estimation methods, including classical and deep learning approaches, still have difficulty with multi-scale estimation, real-time computation, and/or occlusion reasoning.
In this paper, we introduce dilated convolution and occlusion reasoning into unsupervised optical flow estimation to address these issues.
The dilated convolution allows our network to avoid upsampling via deconvolution and the resulting gridding artifacts.
Dilated convolution also results in a smaller memory footprint, which speeds up inference.
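A 1-D sketch of the idea (the network itself uses learned 2-D dilated convolutions): spacing the kernel taps `dilation` samples apart enlarges the receptive field without deconvolution or extra parameters:

```python
import numpy as np

def dilated_conv1d(x, k, dilation):
    # 'valid' cross-correlation with holes: tap j reads x[i + j*dilation],
    # so a length-3 kernel with dilation 2 spans 5 input samples.
    span = (len(k) - 1) * dilation + 1
    out = np.zeros(len(x) - span + 1)
    for j, kj in enumerate(k):
        out += kj * x[j * dilation : j * dilation + len(out)]
    return out
```

With dilation 1 this reduces to an ordinary valid cross-correlation; larger dilation trades nothing in parameter count for a wider context window.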
The occlusion reasoning prevents our network from learning incorrect deformations due to occluded image regions during training.
Our proposed method outperforms state-of-the-art unsupervised approaches on the KITTI benchmark.
We also demonstrate its generalization capability by applying it to action recognition in video.
Long Short-Term Memory networks trained with gradient descent and back-propagation have received great success in various applications.
However, point estimation of the weights of the networks is prone to over-fitting problems and lacks important uncertainty information associated with the estimation.
Moreover, exact Bayesian neural network methods are intractable and inapplicable to real-world applications.
In this study, we propose an approximate estimation of the uncertainty of the weights using an Ensemble Kalman Filter, which is easily scalable to a large number of weights.
Furthermore, we optimize the covariance of the noise distribution in the ensemble update step using maximum likelihood estimation.
To assess the proposed algorithm, we apply it to outlier detection in five real-world events retrieved from the Twitter platform.
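A generic stochastic EnKF update treating the weight vector as the state (a sketch under simplifying assumptions: scalar observation-noise variance R, without the covariance optimization proposed in the paper):

```python
import numpy as np

def enkf_update(W, h, y, R, rng):
    # W: (N, d) ensemble of weight vectors; h maps a weight vector to
    # an m-dimensional prediction; y: observed m-vector.
    N, m = W.shape[0], len(y)
    HW = np.array([h(w) for w in W])                 # (N, m) predictions
    A = W - W.mean(axis=0)                           # weight anomalies
    B = HW - HW.mean(axis=0)                         # prediction anomalies
    C_wh = A.T @ B / (N - 1)                         # cross-covariance (d, m)
    C_hh = B.T @ B / (N - 1) + R * np.eye(m)         # innovation covariance
    K = C_wh @ np.linalg.inv(C_hh)                   # Kalman gain (d, m)
    y_pert = y + rng.normal(0.0, np.sqrt(R), (N, m))  # perturbed observations
    return W + (y_pert - HW) @ K.T
```

The spread of the updated ensemble serves as the uncertainty estimate that a point-estimated network lacks.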
The increasing deployment of sensor networks, ranging from home networks to industrial automation, leads to a similarly growing demand for storing and processing the collected sensor data.
To satisfy this demand, the most promising approach to date is the utilization of the dynamically scalable, on-demand resources made available via the cloud computing paradigm.
However, prevalent security and privacy concerns are a huge obstacle for the outsourcing of sensor data to the cloud.
Hence, sensor data needs to be secured properly before it can be outsourced to the cloud.
When securing the outsourcing of sensor data to the cloud, one important challenge lies in the representation of sensor data and the choice of security measures applied to it.
In this paper, we present the SensorCloud protocol, which enables the representation of sensor data and actuator commands using JSON as well as the encoding of the object security mechanisms applied to a given sensor data item.
Notably, we solely utilize mechanisms that have been or currently are in the process of being standardized at the IETF to aid the wide applicability of our approach.
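To give a flavor of such a representation (field names here are purely hypothetical, not the actual SensorCloud schema), a sensor data item carrying object-security metadata might be encoded as:

```python
import json

# Illustrative only: every field name below is hypothetical.
item = {
    "sensor_id": "temp-01",
    "timestamp": "2018-06-01T12:00:00Z",
    "value": {
        "protected": True,          # payload is secured at object level
        "alg": "AES-128-CBC",       # security mechanism applied
        "ciphertext": "q1w2e3...",  # placeholder for the encrypted payload
    },
}
encoded = json.dumps(item)
decoded = json.loads(encoded)
```

The point of such an encoding is that the cloud can store and route the item while the security metadata travels with the (opaque) payload.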
In this paper, we consider an uplink heterogeneous cloud radio access network (H-CRAN), where a macro base station (BS) coexists with many remote radio heads (RRHs).
For cost-savings, only the BS is connected to the baseband unit (BBU) pool via fiber links.
The RRHs, however, are associated with the BBU pool through wireless fronthaul links, which share the spectrum resource with radio access networks.
Due to the limited capacity of fronthaul, the compress-and-forward scheme is employed, such as point-to-point compression or Wyner-Ziv coding.
Different decoding strategies are also considered.
This work aims to maximize the uplink ergodic sum-rate (SR) by jointly optimizing quantization noise matrix and bandwidth allocation between radio access networks and fronthaul links, which is a mixed time-scale issue.
To reduce computational complexity and communication overhead, we introduce an approximation problem of the joint optimization problem based on large-dimensional random matrix theory, which is a slow time-scale issue because it only depends on statistical channel information.
Finally, a procedure based on Dinkelbach's algorithm is proposed to find the optimal solution to the approximate problem.
In summary, this work provides an economic solution to the challenge of constrained fronthaul capacity, and also provides a framework with less computational complexity to study how bandwidth allocation and fronthaul compression can affect the SR maximization problem.
Text classification is an important and classical problem in natural language processing.
There have been a number of studies that applied convolutional neural networks (convolution on regular grid, e.g., sequence) to classification.
However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on non-grid, e.g., arbitrary graph) for the task.
In this work, we propose to use graph convolutional networks for text classification.
We build a single text graph for a corpus based on word co-occurrence and document word relations, then learn a Text Graph Convolutional Network (Text GCN) for the corpus.
Our Text GCN is initialized with one-hot representations for words and documents; it then jointly learns the embeddings for both words and documents, supervised by the known class labels of documents.
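For reference, one propagation layer of a standard graph convolutional network as applied to such a text graph (a numpy sketch of the Kipf-Welling propagation rule; the actual model is trained end to end):

```python
import numpy as np

def gcn_layer(A, X, W):
    # One GCN propagation step: ReLU(D^{-1/2} (A + I) D^{-1/2} X W),
    # where A is the (word/document) graph adjacency matrix.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)
```

Each node's new feature is a degree-normalized average over itself and its neighbors, which is how label information propagates from documents to words and back.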
Our experimental results on multiple benchmark datasets demonstrate that a vanilla Text GCN without any external word embeddings or knowledge outperforms state-of-the-art methods for text classification.
On the other hand, Text GCN also learns predictive word and document embeddings.
In addition, experimental results show that the improvement of Text GCN over state-of-the-art comparison methods becomes more prominent as we lower the percentage of training data, suggesting the robustness of Text GCN to limited training data in text classification.
Owing to recorded incidents of information-technology-inclined organisations failing to respond effectively to threat incidents, this project outlines the benefits of conducting a comprehensive risk assessment to aid proficiency in responding to potential threats.
The ultimate goal is primarily to identify, quantify and control the key threats that are detrimental to achieving business objectives.
This project carries out a detailed risk assessment for a case study organisation.
It includes a comprehensive literature review analysing several professional views on pressing issues in Information security.
In the risk register, five prominent assets were identified with respect to their owners.
This work then applies a qualitative analysis methodology to determine the magnitude of the potential threats and vulnerabilities.
Collating these parameters enabled the valuation of individual risk per asset, per threat and vulnerability.
Evaluating a risk appetite aided in prioritising and determining acceptable risks.
From the analysis, it was deduced that human beings pose the greatest information security risk through intentional or unintentional human error.
In conclusion, effective control techniques based on defence in depth were devised to mitigate the impact of the risks identified in the risk register.
Sequence-to-sequence attention-based models on subword units allow simple open-vocabulary end-to-end speech recognition.
In this work, we show that such models can achieve competitive results on the Switchboard 300h and LibriSpeech 1000h tasks.
In particular, we report the state-of-the-art word error rates (WER) of 3.54% on the dev-clean and 3.82% on the test-clean evaluation subsets of LibriSpeech.
We introduce a new pretraining scheme by starting with a high time reduction factor and lowering it during training, which is crucial both for convergence and final performance.
In some experiments, we also use an auxiliary CTC loss function to help the convergence.
In addition, we train long short-term memory (LSTM) language models on subword units.
By shallow fusion, we report up to 27% relative improvements in WER over the attention baseline without a language model.
Deep learning tasks are often complicated and require a variety of components working together efficiently to perform well.
Due to the often large scale of these tasks, there is a necessity to iterate quickly in order to attempt a variety of methods and to find and fix bugs.
While participating in IARPA's Functional Map of the World challenge, we identified challenges along the entire deep learning pipeline and found various solutions to these challenges.
In this paper, we present the performance, engineering, and deep learning considerations with processing and modeling data, as well as underlying infrastructure considerations that support large-scale deep learning tasks.
We also discuss insights and observations with regard to satellite imagery and deep learning for image classification.
Concepts are the foundation of human deep learning, understanding, and knowledge integration and transfer.
We propose concept-oriented deep learning (CODL) which extends (machine) deep learning with concept representations and conceptual understanding capability.
CODL addresses some of the major limitations of deep learning: interpretability, transferability, contextual adaptation, and requirement for lots of labeled training data.
We discuss the major aspects of CODL including concept graph, concept representations, concept exemplars, and concept representation learning systems supporting incremental and continual learning.
The identification of reduced-order models from high-dimensional data is a challenging task, and even more so if the identified system should not only be suitable for a certain data set, but generally approximate the input-output behavior of the data source.
In this work, we consider the input-output dynamic mode decomposition method for system identification.
We compare excitation approaches for the data-driven identification process and describe an optimization-based stabilization strategy for the identified systems.
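For orientation, the core of exact (unforced) DMD, which the input-output variant extends with input snapshots; a minimal sketch:

```python
import numpy as np

def dmd_eigs(X, Y, r):
    # Exact DMD: fit a rank-r linear operator A with Y ≈ A X (columns
    # are snapshot pairs) and return the eigenvalues of its reduced
    # representation A_tilde = U* Y V S^{-1}.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)
    return np.linalg.eigvals(A_tilde)
```

When the snapshots are generated by a genuine linear system, the reduced operator is similar to the restriction of that system, so its eigenvalues are recovered exactly.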
The syntactic nature and compositionality of stochastic process algebras make models easy for human beings to understand, but not convenient for machines, or even for people, to directly subject to mathematical analysis and stochastic simulation.
This paper presents a numerical representation schema for the stochastic process algebra PEPA, which can provide a platform to directly and conveniently employ a variety of computational approaches to both qualitatively and quantitatively analyse the models.
Moreover, these approaches developed on the basis of the schema are demonstrated and discussed.
In particular, algorithms for automatically deriving the schema from a general PEPA model and simulating the model based on the derived schema to derive performance measures are presented.
It is important to detect anomalous inputs when deploying machine learning systems.
The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples.
At the same time, diverse image and text data are available in enormous quantities.
We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE).
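For a K-class softmax classifier, the OE objective adds a cross-entropy-to-uniform term on the auxiliary outliers; a numpy sketch (the weighting lam is an assumption here, not necessarily the paper's value):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def oe_loss(logits_in, labels_in, logits_out, lam=0.5):
    # Standard cross-entropy on in-distribution samples ...
    p_in = softmax(logits_in)
    ce_in = -np.mean(np.log(p_in[np.arange(len(labels_in)), labels_in]))
    # ... plus cross-entropy to the uniform distribution on outliers,
    # pushing predictions toward maximal uncertainty off-distribution.
    log_p_out = np.log(softmax(logits_out))
    ce_out = -np.mean(log_p_out.mean(axis=-1))
    return ce_in + lam * ce_out
```

The outlier term is minimized (at log K) exactly when the model is maximally uncertain on the auxiliary data, which is what lets the detector flag unseen anomalies by their confidence.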
This enables anomaly detectors to generalize and detect unseen anomalies.
In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance.
We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue.
We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
The cellular technology is mostly an urban technology that has been unable to serve rural areas well.
This is because traditional cellular models are not economical for areas with low user density and lower revenues.
In 5G cellular networks, the coverage dilemma is likely to remain the same, thus widening the rural-urban digital divide further.
It is about time to identify the root cause that has hindered rural technology growth and to analyse the possible options in the 5G architecture to address this issue.
We advocate that it can only be accomplished in two phases by sequentially addressing economic viability followed by performance progression.
We discuss how various works in the literature focus on the latter stage of this two-phase problem and are hence not feasible to implement in the first place.
We propose the concept of TV band white space (TVWS) dovetailed with 5G infrastructure for rural coverage and show that it can yield cost-effectiveness from a service provider perspective.
Color-depth cameras (RGB-D cameras) have become the primary sensors in most robotics systems, from service robotics to industrial robotics applications.
Typical consumer-grade RGB-D cameras are provided with a coarse intrinsic and extrinsic calibration that generally does not meet the accuracy requirements of many robotics applications (e.g., highly accurate 3D environment reconstruction and mapping, or high-precision object recognition and localization).
In this paper, we propose a human-friendly, reliable, and accurate calibration framework that enables easy estimation of both the intrinsic and extrinsic parameters of a general color-depth sensor couple.
Our approach is based on a novel two-component error model.
This model unifies the error sources of RGB-D pairs based on different technologies, such as structured-light 3D cameras and time-of-flight cameras.
Our method provides some important advantages compared to other state-of-the-art systems: it is general (i.e., well suited for different types of sensors), based on an easy and stable calibration protocol, provides a greater calibration accuracy, and has been implemented within the ROS robotics framework.
We report detailed experimental validations and performance comparisons to support our statements.
The potential for agents, whether embodied or software, to learn by observing other agents performing procedures involving objects and actions is rich.
Current research on automatic procedure learning heavily relies on action labels or video subtitles, even during the evaluation phase, which makes them infeasible in real-world scenarios.
This leads to our question: can the human-consensus structure of a procedure be learned from a large set of long, unconstrained videos (e.g., instructional videos from YouTube) with only visual evidence?
To answer this question, we introduce the problem of procedure segmentation--to segment a video procedure into category-independent procedure segments.
Given that no large-scale dataset is available for this problem, we collect a large-scale procedure segmentation dataset with procedure segments temporally localized and described; we use cooking videos and name the dataset YouCook2.
We propose a segment-level recurrent network for generating procedure segments by modeling the dependencies across segments.
The generated segments can be used as pre-processing for other tasks, such as dense video captioning and event parsing.
We show in our experiments that the proposed model outperforms competitive baselines in procedure segmentation.
Efficient network management is one of the key challenges of the constantly growing and increasingly complex wide area networks (WAN).
The paradigm shift towards virtualized (NFV) and software defined networks (SDN) in the next generation of mobile networks (5G), as well as the latest scientific insights in the field of Artificial Intelligence (AI) enable the transition from manually managed networks nowadays to fully autonomic and dynamic self-organized networks (SON).
This helps to meet the KPIs while reducing operational costs (OPEX).
In this paper, an AI driven concept is presented for the malfunction detection in NFV applications with the help of semi-supervised learning.
For this purpose, a profile of the application under test is created.
This profile then is used as a reference to detect abnormal behaviour.
For example, if there is a bug in the updated version of the app, it is now possible to react autonomously and roll back the NFV app to a previous version in order to avoid network outages.
Recent surveys have shown that an increasing portion of the US public believes the two major US parties do not adequately represent US public opinion and that additional parties are needed.
However, there are high barriers for third parties in political elections.
In this paper, we aim to address two questions: "How well do the two major US parties represent the public's ideology?" and "Does a more-than-two-party system better represent the ideology of the public?".
To address these questions, we utilize the American National Election Studies Time series dataset.
We perform unsupervised clustering with Gaussian Mixture Model method on this dataset.
When clustered into two clusters, we find a large centrist cluster and a small right-wing cluster.
The Democratic Party's position (estimated using the mean position of the individuals self-identified with the parties) is similar to that of the centrist cluster, and the Republican Party's position is between the two clusters.
We investigate if more than two parties represent the population better by comparing the Akaike Information Criteria for clustering results of the various number of clusters.
We find that additional clusters give a better representation of the data, even after penalizing for the additional parameters.
This suggests that a multiparty system represents the ideology of the public better.
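The clustering-plus-AIC model-selection step can be sketched on synthetic stand-in data (not the ANES dataset) with scikit-learn:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic 2-D stand-in with two well-separated groups
X = np.vstack([rng.normal(-2.0, 1.0, (300, 2)),
               rng.normal(2.0, 1.0, (300, 2))])

# AIC rewards fit but penalizes extra parameters; lower is better.
aic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).aic(X)
       for k in range(1, 5)}
best_k = min(aic, key=aic.get)
```

On data that genuinely contains multiple groups, the AIC drop from one to two components outweighs the parameter penalty, mirroring the comparison across cluster counts performed in the paper.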
While an increasing interest in deep models for single-image depth estimation methods can be observed, established schemes for their evaluation are still limited.
We propose a set of novel quality criteria, allowing for a more detailed analysis by focusing on specific characteristics of depth maps.
In particular, we address the preservation of edges and planar regions, depth consistency, and absolute distance accuracy.
In order to employ these metrics to evaluate and compare state-of-the-art single-image depth estimation approaches, we provide a new high-quality RGB-D dataset.
We used a DSLR camera together with a laser scanner to acquire high-resolution images and highly accurate depth maps.
Experimental results show the validity of our proposed evaluation protocol.
General-Purpose Graphics Processing Units (GPGPUs) are widely used for achieving high performance or high throughput in parallel programming.
This capability of GPGPUs is well known and mostly used for scientific computing, which requires more processing power than normal personal computers offer.
Therefore, most programmers, researchers, and industry practitioners use this concept for their work.
However, achieving high performance or high throughput on GPGPUs is not an easy task compared with conventional CPU programming.
In this research, CPU cache memory optimization techniques were adapted to the GPGPU's cache memory to identify performance improvement techniques rarely covered by GPGPU best practices.
The cache optimization techniques of blocking, loop fusion, array merging and array transpose were tested on GPGPUs for finding suitability of these techniques.
Finally, we identified that some of the CPU cache optimization techniques suit the cache memory system of the GPGPU and yield performance improvements, while others have the opposite effect on GPGPUs compared with CPUs.
This paper introduces a software system that includes widely used Swarm Intelligence algorithms and approaches, to be used for scientific research studies in the associated subject area.
The programmatic infrastructure of the system allows working on a fast, easy-to-use, interactive platform to perform Swarm Intelligence based studies in a more effective, efficient and accurate way.
In this sense, the system provides all of the necessary controls for the algorithms and ensures an interactive platform on which computer users can perform studies on a wide spectrum of solution approaches for simple as well as more advanced problems.
Papers on Agile Software Development methods are often focused on their applicability in commercial projects or organizations.
We are not aware of any current studies addressing the application of these methods in research projects.
The objective of this work is to describe the perception of researchers on the application of agile software development practices and principles for research projects.
A study was conducted by constructing and applying a questionnaire to Brazilian researchers of different affiliations, formation and research areas in order to obtain information about their knowledge and openness to follow agile software development principles and practices.
We study the computational complexity of an important property of simple, regular and weighted games, which is decisiveness.
We show that this concept can naturally be represented in the context of hypergraph theory, and that decisiveness can be decided for simple games in quasi-polynomial time, and for regular and weighted games in polynomial time.
The strongness condition poses the main difficulties, while properness reduces the complexity of the problem, especially if it is amplified by regularity.
On the other hand, regularity also allows the problem instances to be specified much more economically, implying a reconsideration of the corresponding complexity measure that, as we prove, has important structural as well as algorithmic consequences.
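For small weighted games, decisiveness can be checked directly from the definition (an exponential brute force for illustration; the point of the paper is that polynomial-time algorithms exist for weighted and regular games):

```python
from itertools import product

def is_decisive(quota, weights):
    # A simple game is decisive iff for every coalition S exactly one
    # of S and its complement is winning, i.e., the game is both
    # proper and strong.
    total = sum(weights)
    for mask in product((0, 1), repeat=len(weights)):
        s = sum(w for w, b in zip(weights, mask) if b)
        if (s >= quota) == (total - s >= quota):
            return False
    return True
```

For instance, simple majority with an odd number of equal voters is decisive, while an even number of voters admits ties that violate strongness.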
We consider the following problem for a fixed graph H: given a graph G and two H-colorings of G, i.e. homomorphisms from G to H, can one be transformed (reconfigured) into the other by changing one color at a time, maintaining an H-coloring throughout?
This is the same as finding a path in the Hom(G,H) complex.
For H=K_k this is the problem of finding paths between k-colorings, which was shown to be in P for k<=3 and PSPACE-complete otherwise by Cereceda et al. (2011).
We generalize the positive side of this dichotomy by providing an algorithm that solves the problem in polynomial time for any H with no C_4 subgraph.
This gives a large class of constraints for which finding solutions to the Constraint Satisfaction Problem is NP-complete, but finding paths in the solution space is in P.
The algorithm uses a characterization of possible reconfiguration sequences (paths in Hom(G,H)), whose main part is a purely topological condition, described in algebraic terms of the fundamental groupoid of H seen as a topological space.
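As an illustration of the single-recoloring step underlying this reconfiguration setting, the following sketch (hypothetical helper names; graphs given as symmetric adjacency dicts) checks whether recoloring one vertex preserves an H-coloring:

```python
def is_homomorphism(G, H, col):
    """Check that col maps every edge of G to an edge of H."""
    return all(col[v] in H[col[u]] for u in G for v in G[u])

def valid_step(G, H, col, vertex, new_color):
    """Can we recolor `vertex` to `new_color` and stay an H-coloring?"""
    trial = dict(col)
    trial[vertex] = new_color
    return is_homomorphism(G, H, trial)
```

A reconfiguration sequence is then a chain of such valid single-vertex steps between the two given H-colorings.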
We illustrate the potential of massive MIMO for communication with unmanned aerial vehicles (UAVs).
We consider a scenario where multiple single-antenna UAVs simultaneously communicate with a ground station (GS) equipped with a large number of antennas.
Specifically, we discuss the achievable uplink (UAV to GS) capacity performance in the case of line-of-sight (LoS) conditions.
We develop a realistic geometric model which incorporates an arbitrary orientation of the GS and UAV antenna elements to characterize the polarization mismatch loss which occurs due to the movement and orientation of the UAVs.
A closed-form expression for a lower bound on the ergodic rate for a maximum-ratio combining receiver with estimated channel state information is derived.
The optimal antenna spacing that maximizes the ergodic rate achieved by a UAV is also determined for uniform linear and rectangular arrays.
It is shown that when the UAVs are spherically uniformly distributed around the GS, the ergodic rate per UAV is maximized for an antenna spacing equal to an integer multiple of one-half wavelength.
A large number of papers have introduced novel machine learning and feature extraction methods for the automatic classification of Alzheimer's disease (AD).
However, they are difficult to reproduce because key components of the validation are often not readily available.
These components include selected participants and input data, image preprocessing and cross-validation procedures.
The performance of the different approaches is also difficult to compare objectively.
In particular, it is often difficult to assess which part of the method provides a real improvement, if any.
We propose a framework for reproducible and objective classification experiments in AD using three publicly available datasets (ADNI, AIBL and OASIS).
The framework comprises: i) automatic conversion of the three datasets into BIDS format, ii) a modular set of preprocessing pipelines, feature extraction and classification methods, together with an evaluation framework, that provide a baseline for benchmarking the different components.
We demonstrate the use of the framework for a large-scale evaluation on 1960 participants using T1 MRI and FDG PET data.
In this evaluation, we assess the influence of different modalities, preprocessing, feature types, classifiers, training set sizes and datasets.
Performances were in line with the state-of-the-art.
FDG PET outperformed T1 MRI for all classification tasks.
No difference in performance was found for the use of different atlases, image smoothing, partial volume correction of FDG PET images, or feature type.
Linear SVM and L2-logistic regression resulted in similar performance and both outperformed random forests.
The classification performance increased along with the number of subjects used for training.
Classifiers trained on ADNI generalized well to AIBL and OASIS.
All the code of the framework and the experiments is publicly available at: https://gitlab.icm-institute.org/aramislab/AD-ML.
Matrix games like Prisoner's Dilemma have guided research on social dilemmas for decades.
However, they necessarily treat the choice to cooperate or defect as an atomic action.
In real-world social dilemmas these choices are temporally extended.
Cooperativeness is a property that applies to policies, not elementary actions.
We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions.
We analyze the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games we introduce here: (1) a fruit Gathering game and (2) a Wolfpack hunting game.
We characterize how learned behavior in each domain changes as a function of environmental factors including resource abundance.
Our experiments show how conflict can emerge from competition over shared resources and shed light on how the sequential nature of real world social dilemmas affects cooperation.
The document serves as a reference for researchers trying to capture a large portion of a mass event on video for several hours, while using a very limited budget.
Cryptocurrencies and their foundation technology, the Blockchain, are reshaping finance and economics, allowing a decentralized approach enabling trusted applications with no trusted counterpart.
More recently, the Blockchain and the programs running on it, called Smart Contracts, are also finding more and more applications in all fields requiring trust and sound certifications.
Some people have come to the point of saying that the "Blockchain revolution" can be compared to that of the Internet and the Web in their early days.
As a result, all the software development revolving around the Blockchain technology is growing at a staggering rate.
The feeling of many software engineers about such huge interest in Blockchain technologies is that of unruled and hurried software development, a sort of competition on a first-come-first-served basis which assures neither software quality, nor that the basic concepts of software engineering are taken into account.
This paper tries to cope with this issue, proposing a software development process to gather requirements, and to analyze, design, develop, test and deploy Blockchain applications.
The process is based on several Agile practices, such as User Stories and iterative and incremental development based on them.
However, it also makes use of more formal notations, such as UML diagrams describing the design of the system, with additions to represent specific concepts found in Blockchain development.
The method is described in good detail, and an example is given to show how it works.
In this paper we analyse the benefits of incorporating interval-valued fuzzy sets into the Bousi-Prolog system.
A syntax, declarative semantics and implementation for this extension are presented and formalised.
We show, by using potential applications, that fuzzy logic programming frameworks enhanced with them can correctly work together with lexical resources and ontologies in order to improve their capabilities for knowledge representation and reasoning.
In most computer vision applications, convolutional neural networks (CNNs) operate on dense image data generated by ordinary cameras.
Designing CNNs for sparse and irregularly spaced input data is still an open problem with numerous applications in autonomous driving, robotics, and surveillance.
To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task.
We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers.
Furthermore, we propose an objective function that simultaneously minimizes the data error while maximizing the output confidence.
Comprehensive experiments are performed on the KITTI depth benchmark and the results clearly demonstrate that the proposed approach achieves superior performance while requiring three times fewer parameters than the state-of-the-art methods.
Moreover, our approach produces a continuous pixel-wise confidence map enabling information fusion, state inference, and decision support.
A variety of representation learning approaches have been investigated for reinforcement learning; much less attention, however, has been given to investigating the utility of sparse coding.
Outside of reinforcement learning, sparse coding representations have been widely used, with non-convex objectives that result in discriminative representations.
In this work, we develop a supervised sparse coding objective for policy evaluation.
Despite the non-convexity of this objective, we prove that all local minima are global minima, making the approach amenable to simple optimization strategies.
We empirically show that it is key to use a supervised objective, rather than the more straightforward unsupervised sparse coding approach.
We compare the learned representations to a canonical fixed sparse representation, called tile-coding, demonstrating that the sparse coding representation outperforms a wide variety of tile-coding representations.
CSMA (Carrier Sense Multiple Access) algorithms based on Gibbs sampling can achieve throughput optimality if certain parameters called the fugacities are appropriately chosen.
However, the problem of computing these fugacities is NP-hard.
In this work, we derive estimates of the fugacities by using a framework called the regional free energy approximations.
In particular, we derive explicit expressions for approximate fugacities corresponding to any feasible service rate vector.
We further prove that our approximate fugacities are exact for the class of chordal graphs.
A distinguishing feature of our work is that the regional approximations that we propose are tailored to conflict graphs with small cycles, which is a typical characteristic of wireless networks.
Numerical results indicate that the fugacities obtained by the proposed method are quite accurate and significantly outperform the existing Bethe approximation based techniques.
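To make the product-form distribution behind Gibbs-sampling CSMA concrete, the following minimal sketch (hypothetical names; an exact enumeration for tiny graphs, not the paper's regional approximation) computes per-node service rates from given fugacities over the independent sets of a conflict graph:

```python
from itertools import combinations

def service_rates(nodes, edges, fugacity):
    """Exact per-node service rates under the product-form distribution
    pi(I) proportional to prod_{v in I} fugacity[v], where I ranges over
    independent sets of the conflict graph (nodes, edges)."""
    conflict = set(map(frozenset, edges))
    ind_sets = []
    for r in range(len(nodes) + 1):
        for subset in combinations(nodes, r):
            if all(frozenset(p) not in conflict
                   for p in combinations(subset, 2)):
                ind_sets.append(subset)
    weight = {}
    for I in ind_sets:
        w = 1.0
        for v in I:
            w *= fugacity[v]
        weight[I] = w
    Z = sum(weight.values())  # partition function
    return {v: sum(w for I, w in weight.items() if v in I) / Z
            for v in nodes}
```

Choosing the fugacities so that these service rates match a target rate vector is exactly the NP-hard problem the abstract refers to; the sketch only evaluates the forward direction.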
Despite the considerable interest in new dependent type theories, simple type theory (which dates from 1940) is sufficient to formalise serious topics in mathematics.
This point is seen by examining formal proofs of a theorem about stereographic projections.
A formalisation using the HOL Light proof assistant is contrasted with one using Isabelle/HOL.
Harrison's technique for formalising Euclidean spaces is contrasted with an approach using Isabelle/HOL's axiomatic type classes.
However, every formal system can be outgrown, and mathematics should be formalised with a view that it will eventually migrate to a new formalism.
Recent development of contraction theory based analysis of singularly perturbed system has opened the door for inspecting differential behavior of multi time-scale systems.
In this paper a contraction theory based framework is proposed for stabilization of singularly perturbed systems.
The primary objective is to design a feedback controller to achieve bounded tracking error for both standard and non-standard singularly perturbed systems.
This framework provides relaxation over traditional quadratic Lyapunov based method as there is no need to satisfy interconnection conditions during controller design algorithm.
Moreover, the stability bound does not depend on the smallness of the singular perturbation parameter.
Combined with high gain scaling, the proposed technique is shown to assure contraction of approximate feedback linearizable systems.
These findings extend the class of nonlinear systems which can be made contracting.
Code optimization and high level synthesis can be posed as constraint satisfaction and optimization problems, such as graph coloring used in register allocation.
Graph coloring is also used to model more traditional CSPs relevant to AI, such as planning, time-tabling and scheduling.
Provably optimal solutions may be desirable for commercial and defense applications.
Additionally, for applications such as register allocation and code optimization, naturally-occurring instances of graph coloring are often small and can be solved optimally.
A recent wave of improvements in algorithms for Boolean satisfiability (SAT) and 0-1 Integer Linear Programming (ILP) suggests generic problem-reduction methods, rather than problem-specific heuristics, because (1) heuristics may be upset by new constraints, (2) heuristics tend to ignore structure, and (3) many relevant problems are provably inapproximable.
Problem reductions often lead to highly symmetric SAT instances, and symmetries are known to slow down SAT solvers.
In this work, we compare several avenues for symmetry breaking, in particular when certain kinds of symmetry are present in all generated instances.
Our focus on reducing CSPs to SAT allows us to leverage recent dramatic improvement in SAT solvers and automatically benefit from future progress.
We can use a variety of black-box SAT solvers without modifying their source code because our symmetry-breaking techniques are static, i.e., we detect symmetries and add symmetry breaking predicates (SBPs) during pre-processing.
An important result of our work is that among the types of instance-independent SBPs we studied and their combinations, the simplest and least complete constructions are the most effective.
Our experiments also clearly indicate that instance-independent symmetries should mostly be processed together with instance-specific symmetries rather than at the specification level, contrary to what has been suggested in the literature.
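The problem-reduction approach described above can be illustrated with a minimal sketch (hypothetical variable encoding; no symmetry-breaking predicates) that emits DIMACS-style CNF clauses encoding k-coloring of a graph:

```python
def coloring_to_cnf(n_vertices, edges, k):
    """Encode k-colorability as CNF clauses over DIMACS-style positive
    integer variables x(v, c) = v*k + c + 1, meaning vertex v has color c."""
    var = lambda v, c: v * k + c + 1
    clauses = []
    for v in range(n_vertices):
        # each vertex gets at least one color
        clauses.append([var(v, c) for c in range(k)])
        # ... and at most one color
        for c1 in range(k):
            for c2 in range(c1 + 1, k):
                clauses.append([-var(v, c1), -var(v, c2)])
    # adjacent vertices must not share a color
    for (u, v) in edges:
        for c in range(k):
            clauses.append([-var(u, c), -var(v, c)])
    return clauses
```

Such encodings are highly symmetric (any permutation of the k colors maps solutions to solutions), which is precisely why the symmetry-breaking predicates studied in the paper matter.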
Mobility Management (MM) techniques have conventionally been centralized in nature, wherein a single network entity has been responsible for handling the mobility related tasks of the mobile nodes attached to the network.
However, an exponential growth in network traffic and the number of users has ushered in the concept of providing Mobility Management as a Service (MMaaS) to the wireless nodes attached to the 5G networks.
Allowing for on-demand mobility management solutions will provide the network with both the flexibility it needs to accommodate the many different use cases to be served by future networks, and the scalability needed alongside that flexibility.
Hence, in this paper, a detailed study of MMaaS is provided, highlighting its benefits and challenges for 5G networks.
Additionally, the granularity of service, a consequence of MMaaS that is deeply intertwined with the scalability and flexibility requirements of future wireless networks, is also discussed in detail.
With the technological advancements of aerial imagery and accurate 3D reconstruction of urban environments, more and more attention has been paid to the automated analysis of urban areas.
In our work, we examine two important aspects that allow live analysis of building structures in city models given oblique aerial imagery, namely automatic building extraction with convolutional neural networks (CNNs) and selective real-time depth estimation from aerial imagery.
We use transfer learning to train the Faster R-CNN method for real-time deep object detection, by combining a large ground-based dataset for urban scene understanding with a smaller number of images from an aerial dataset.
We achieve an average precision (AP) of about 80% for the task of building extraction on a selected evaluation dataset.
Our evaluation focuses on both dataset-specific learning and transfer learning.
Furthermore, we present an algorithm that allows for multi-view depth estimation from aerial imagery in real-time.
We adopt the semi-global matching (SGM) optimization strategy to preserve sharp edges at object boundaries.
In combination with the Faster R-CNN, it allows a selective reconstruction of buildings, identified with regions of interest (RoIs), from oblique aerial imagery.
Barring swarm robotics, a substantial share of current machine-human and machine-machine learning and interaction mechanisms are being developed and fed by results of agent-based computer simulations, game-theoretic models, or robotic experiments based on a dyadic communication pattern.
Yet, in real life, humans no less frequently communicate in groups, and gain knowledge and take decisions basing on information cumulatively gleaned from more than one single source.
These properties should be taken into consideration in the design of autonomous artificial cognitive systems intended to interact with, and learn from, more than one contact or 'neighbour'.
To this end, significant practical import can be gleaned from research applying strict science methodology to human and social phenomena, e.g. to discovery of realistic creativity potential spans, or the 'exposure thresholds' after which new information could be accepted by a cognitive agent.
We will present the results of a project analysing the social propagation of neologisms in a microblogging service.
From local, low-level interactions and information flows between agents inventing and imitating discrete lexemes we aim to describe the processes of the emergence of more global systemic order and dynamics, using the latest methods of complexity science.
Whether in order to mimic them, or to 'enhance' them, parameters gleaned from complexity science approaches to humans' social and humanistic behaviour should subsequently be incorporated as points of reference in the field of robotics and human-machine interaction.
Adams' extension of parsing expression grammars enables specifying indentation sensitivity using two non-standard grammar constructs: indentation by a binary relation, and alignment.
This paper proposes a step-by-step transformation of well-formed Adams' grammars for elimination of the alignment construct from the grammar.
The idea that alignment could be avoided was suggested by Adams but no process for achieving this aim has been described before.
The design of touchless user interfaces is gaining popularity in various contexts.
Using such interfaces, users can interact with electronic devices even when their hands are dirty or non-conductive.
Users with partial physical disabilities can also interact with electronic devices using such systems.
Research in this direction has got major boost because of the emergence of low-cost sensors such as Leap Motion, Kinect or RealSense devices.
In this paper, we propose a Leap Motion controller-based methodology to facilitate rendering of 2D and 3D shapes on display devices.
The proposed method tracks finger movements while users perform natural gestures within the field of view of the sensor.
In the next phase, trajectories are analyzed to extract extended Npen++ features in 3D.
These features represent finger movements during the gestures, and they are fed to a unidirectional left-to-right Hidden Markov Model (HMM) for training.
A one-to-one mapping between gestures and shapes is proposed.
Finally, shapes corresponding to these gestures are rendered over the display using MuPad interface.
We have created a dataset of 5400 samples recorded by 10 volunteers.
Our dataset contains 18 geometric and 18 non-geometric shapes such as "circle", "rectangle", "flower", "cone", "sphere" etc.
The proposed methodology achieves an accuracy of 92.87% when evaluated using 5-fold cross validation method.
Our experiments reveal that the extended 3D features perform better than existing 3D features in the context of shape representation and classification.
The method can be used for developing useful HCI applications for smart display devices.
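A left-to-right HMM recognizer of the kind described scores each gesture class by the likelihood of the observed feature sequence, typically via the forward algorithm. The following is a minimal pure-Python sketch (hypothetical function name; discrete observation symbols rather than the paper's Npen++ features):

```python
def forward_likelihood(A, B, pi, obs):
    """Forward algorithm: P(obs) for a discrete HMM with transition
    matrix A[s][t], emission matrix B[s][symbol] and initial
    distribution pi. A left-to-right HMM simply uses an upper-
    triangular A."""
    n = len(pi)
    # initialize with the first observation
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    # propagate through the remaining observations
    for o in obs[1:]:
        alpha = [sum(alpha[s] * A[s][t] for s in range(n)) * B[t][o]
                 for t in range(n)]
    return sum(alpha)
```

At recognition time, one such model per shape class is evaluated and the class with the highest likelihood is reported.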
Given a finite set in a metric space, topological data analysis generalizes hierarchical clustering using a 1-parameter family of homology groups to quantify connectivity in all dimensions.
The connectivity is compactly described by the persistence diagram.
One limitation of the current framework is the reliance on metric distances, whereas in many practical applications objects are compared by non-metric dissimilarity measures.
Examples are the Kullback-Leibler divergence, which is commonly used for comparing text and images, and the Itakura-Saito divergence, popular for speech and sound.
These are two members of the broad family of dissimilarities called Bregman divergences.
We show that the framework of topological data analysis can be extended to general Bregman divergences, widening the scope of possible applications.
In particular, we prove that appropriately generalized Cech and Delaunay (alpha) complexes capture the correct homotopy type, namely that of the corresponding union of Bregman balls.
Consequently, their filtrations give the correct persistence diagram, namely the one generated by the uniformly growing Bregman balls.
Moreover, we show that unlike the metric setting, the filtration of Vietoris-Rips complexes may fail to approximate the persistence diagram.
We propose algorithms to compute the thus generalized Cech, Vietoris-Rips and Delaunay complexes and experimentally test their efficiency.
Lastly, we explain their surprisingly good performance by making a connection with discrete Morse theory.
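For concreteness, the two Bregman divergences named above can be computed as follows (a minimal sketch over positive vectors; note the asymmetry that distinguishes these divergences from metric distances):

```python
import math

def kl(p, q):
    """Generalized Kullback-Leibler divergence, the Bregman divergence
    generated by negative entropy; reduces to standard KL when both
    arguments sum to 1."""
    return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))

def itakura_saito(p, q):
    """Itakura-Saito divergence, the Bregman divergence generated by
    the convex function -log(x)."""
    return sum(pi / qi - math.log(pi / qi) - 1 for pi, qi in zip(p, q))
```

Both vanish exactly when p = q, but neither is symmetric, which is why the metric-based persistence framework needs the generalization described in the abstract.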
Low-dose computed tomography (CT) has attracted major attention in the medical imaging field, since CT-associated x-ray radiation carries health risks for patients.
The reduction of CT radiation dose, however, compromises the signal-to-noise ratio, and may compromise the image quality and the diagnostic performance.
Recently, deep-learning-based algorithms have achieved promising results in low-dose CT denoising, especially convolutional neural network (CNN) and generative adversarial network (GAN).
This article introduces a Contracting Path-based Convolutional Encoder-decoder (CPCE) network in 2D and 3D configurations within the GAN framework for low-dose CT denoising.
A novel feature of our approach is that an initial 3D CPCE denoising model can be directly obtained by extending a trained 2D CNN and then fine-tuned to incorporate 3D spatial information from adjacent slices.
Based on the transfer learning from 2D to 3D, the 3D network converges faster and achieves a better denoising performance than that trained from scratch.
By comparing the CPCE with recently published methods based on the simulated Mayo dataset and the real MGH dataset, we demonstrate that the 3D CPCE denoising model has a better performance, suppressing image noise and preserving subtle structures.
A simulation model based on parallel systems is established, aiming to explore the relation between the number of submissions and the overall quality of academic journals within a similar discipline under peer review.
The model can effectively simulate the submission, review and acceptance behaviors of academic journals, in a distributed manner.
According to the simulation experiments, the overall standard of academic journals may deteriorate due to excessive submissions.
We study probabilistic complexity classes and questions of derandomisation from a logical point of view.
For each logic L we introduce a new logic BPL, bounded error probabilistic L, which is defined from L in a similar way as the complexity class BPP, bounded error probabilistic polynomial time, is defined from PTIME.
Our main focus lies on questions of derandomisation, and we prove that there is a query which is definable in BPFO, the probabilistic version of first-order logic, but not in Cinf, finite variable infinitary logic with counting.
This implies that many of the standard logics of finite model theory, like transitive closure logic and fixed-point logic, both with and without counting, cannot be derandomised.
Similarly, we present a query on ordered structures which is definable in BPFO but not in monadic second-order logic, and a query on additive structures which is definable in BPFO but not in FO.
The latter of these queries shows that certain uniform variants of AC0 (bounded-depth polynomial sized circuits) cannot be derandomised.
These results are in contrast to the general belief that most standard complexity classes can be derandomised.
Finally, we note that BPIFP+C, the probabilistic version of fixed-point logic with counting, captures the complexity class BPP, even on unordered structures.
It is known from dynamic systems theory that the shorter the transient response, i.e., the faster a system reaches steady state after the introduction of a change, the smaller the output variability.
In lean manufacturing, the principle of reducing set-up times has the same purpose: reduce the transient time and improve production flow.
Analogously, the analysis of the transient response of project-driven systems may provide crucial information about how fast these systems react to a change and how that change affects their production output.
Although some studies have investigated flow variability in projects, few have looked at variability from the perspective that the transient state represents the changeovers on project-driven production systems and how the transient state affects the process' flow variability.
The purpose of this study is to investigate the effect of changes in project-driven production systems from a conceptual point of view, and furthermore to measure and correlate the transient response of five cases with their flow variability.
Results showed a proportional relationship between the percentile transient time and flow variability of a process.
That means the quicker the production system reacts to change, the less the distress in the production output and, consequently, the lower the levels of flow variability.
As practical implications, lean practices focusing on reducing set-up times (transient time) can have their effects measured on project-driven production flow.
The Time-Invariant Incremental Knapsack problem (IIK) is a generalization of Maximum Knapsack to a discrete multi-period setting.
At each time, the capacity increases and items can be added to, but not removed from, the knapsack.
The goal is to maximize the sum of profits over all times.
IIK models various applications including specific financial markets and governmental decision processes.
IIK is strongly NP-hard and there has been work on giving approximation algorithms for some special cases.
In this paper, we settle the complexity of IIK by designing a PTAS based on rounding a disjunctive formulation, and provide several extensions of the technique.
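To make the IIK objective concrete, the following brute-force sketch (hypothetical names; exponential time, for tiny instances only, and in no way the paper's PTAS) assigns each item an insertion time, so an item inserted at time t contributes its profit at every time from t to the horizon:

```python
from itertools import product

def iik_brute_force(sizes, profits, capacities):
    """Exhaustively solve a tiny Time-Invariant Incremental Knapsack
    instance. capacities is the non-decreasing capacity per period;
    an item inserted at (0-indexed) time t earns profit * (T - t)."""
    T = len(capacities)
    best = 0
    for times in product([None] + list(range(T)), repeat=len(sizes)):
        # feasibility: items present at each step must fit its capacity
        if any(sum(s for s, t in zip(sizes, times)
                   if t is not None and t <= step) > capacities[step]
               for step in range(T)):
            continue
        value = sum(p * (T - t)
                    for p, t in zip(profits, times) if t is not None)
        best = max(best, value)
    return best
```

For example, with capacities [1, 2] and two unit-size items of profit 2 and 3, it is optimal to insert the profit-3 item first and the profit-2 item in the second period.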
Gaussian processes (GPs) are versatile tools that have been successfully employed to solve nonlinear estimation problems in machine learning, but that are rarely used in signal processing.
In this tutorial, we present GPs for regression as a natural nonlinear extension to optimal Wiener filtering.
After establishing their basic formulation, we discuss several important aspects and extensions, including recursive and adaptive algorithms for dealing with non-stationarity, low-complexity solutions, non-Gaussian noise models and classification scenarios.
Furthermore, we provide a selection of relevant applications to wireless digital communications.
Current grammar-based NeuroEvolution approaches have several shortcomings.
On the one hand, they do not allow the generation of Artificial Neural Networks (ANNs) composed of more than one hidden layer.
On the other hand, there is no way to evolve networks with more than one output neuron.
To properly evolve ANNs with more than one hidden layer and multiple output nodes, one needs to know the number of neurons available in the previous layers.
In this paper we introduce Dynamic Structured Grammatical Evolution (DSGE): a new genotypic representation that overcomes the aforementioned limitations.
By enabling the creation of dynamic rules that specify the connection possibilities of each neuron, the methodology enables the evolution of multi-layered ANNs with more than one output neuron.
Results in different classification problems show that DSGE evolves effective single and multi-layered ANNs, with a varying number of output neurons.
Due to their rapid growth and deployment, Internet of things (IoT) devices have become a central aspect of our daily lives.
However, they tend to have many vulnerabilities which can be exploited by an attacker.
Unsupervised techniques, such as anomaly detection, can help us secure the IoT devices.
However, an anomaly detection model must be trained for a long time in order to capture all benign behaviors.
This approach is vulnerable to adversarial attacks since all observations are assumed to be benign while training the anomaly detection model.
In this paper, we propose CIoTA, a lightweight framework that utilizes the blockchain concept to perform distributed and collaborative anomaly detection for devices with limited resources.
CIoTA uses blockchain to incrementally update a trusted anomaly detection model via self-attestation and consensus among IoT devices.
We evaluate CIoTA on our own distributed IoT simulation platform, which consists of 48 Raspberry Pis, to demonstrate CIoTA's ability to enhance the security of each device and the security of the network as a whole.
Mobile streaming video data accounts for a large and increasing percentage of wireless network traffic.
The available bandwidths of modern wireless networks are often unstable, leading to difficulties in delivering smooth, high-quality video.
Streaming service providers such as Netflix and YouTube attempt to adapt their systems to adjust in response to these bandwidth limitations by changing the video bitrate or, failing that, allowing playback interruptions (rebuffering).
Being able to predict end users' quality of experience (QoE) resulting from these adjustments could lead to perceptually-driven network resource allocation strategies that would deliver streaming content of higher quality to clients, while being cost effective for providers.
Existing objective QoE models only consider the effects on user QoE of video quality changes or playback interruptions.
For streaming applications, adaptive network strategies may involve a combination of dynamic bitrate allocation along with playback interruptions when the available bandwidth reaches a very low value.
Towards effectively predicting user QoE, we propose Video Assessment of TemporaL Artifacts and Stalls (Video ATLAS): a machine learning framework where we combine a number of QoE-related features, including objective quality features, rebuffering-aware features and memory-driven features to make QoE predictions.
We evaluated our learning-based QoE prediction model on the recently designed LIVE-Netflix Video QoE Database which consists of practical playout patterns, where the videos are afflicted by both quality changes and rebuffering events, and found that it provides improved performance over state-of-the-art video quality metrics while generalizing well on different datasets.
The proposed algorithm is made publicly available at http://live.ece.utexas.edu/research/Quality/VideoATLAS release_v2.rar.
In this paper, a new context for Chinese Remainder Theorem (CRT) based analysis of combinatorial sequence generators is presented.
CRT is exploited to establish fixed patterns in LFSR sequences and the underlying cyclic structures of finite fields.
A new methodology for directly computing DFT spectral points in larger finite fields from the known DFT spectral points of smaller constituent fields is also introduced.
A novel approach to CRT-based structural analysis of LFSR-based combinatorial sequences is given in both the time and frequency domains.
The proposed approach is demonstrated on some examples of combiner generators and is scalable to general configuration of combiner generators.
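The CRT recombination step underlying this kind of analysis can be sketched as follows (the standard textbook construction, not the paper's specific method; `pow(x, -1, m)` for modular inverses requires Python 3.8+):

```python
def crt(residues, moduli):
    """Chinese Remainder Theorem: recover the unique x modulo
    prod(moduli) such that x = r_i (mod m_i), for pairwise coprime
    moduli m_i."""
    M = 1
    for m in moduli:
        M *= m
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                       # product of the other moduli
        x += r * Mi * pow(Mi, -1, m)      # modular inverse of Mi mod m
    return x % M
```

In the LFSR setting, the pairwise coprime moduli correspond to the periods of the constituent sequences, and recombination recovers the phase of the combined sequence.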
A person is commonly described by attributes like height, build, cloth color, cloth type, and gender.
Such attributes are known as soft biometrics.
They bridge the semantic gap between human description and person retrieval in surveillance video.
The paper proposes a deep learning-based linear filtering approach for person retrieval using height, cloth color, and gender.
The proposed approach uses Mask R-CNN for pixel-wise person segmentation.
It removes background clutter and provides precise boundary around the person.
Color and gender models are fine-tuned using AlexNet and the algorithm is tested on SoftBioSearch dataset.
It achieves good accuracy for person retrieval using the semantic query in challenging conditions.
Massive open online courses pose a massive challenge for grading answerscripts with high accuracy.
Peer grading is often viewed as a scalable solution to this challenge, which largely depends on the altruism of the peer graders.
Some approaches in the literature treat peer grading as a 'best-effort service' of the graders, and statistically correct their inaccuracies before awarding the final scores, but ignore graders' strategic behavior.
Few other approaches incentivize non-manipulative actions of the peer graders but do not make use of certain additional information that is potentially available in a peer grading setting, e.g., the true grade can eventually be observed at an additional cost.
This cost can be thought of as an additional effort from the teaching staff if they had to finally take a look at the corrected papers post peer grading.
In this paper, we use such additional information and introduce a mechanism, TRUPEQA, that (a) uses a constant number of instructor-graded answerscripts to quantitatively measure the accuracies of the peer graders and corrects the scores accordingly, (b) ensures truthful revelation of their observed grades, (c) penalizes manipulation, but not inaccuracy, and (d) reduces the total cost of arriving at the true grades, i.e., the additional person-hours of the teaching staff.
We show that this mechanism outperforms several standard peer grading techniques used in practice, even at times when the graders are non-manipulative.
In high mobility applications of millimeter wave (mmWave) communications, e.g., vehicle-to-everything communication and next-generation cellular communication, frequent link configuration can be a source of significant overhead.
We use the sub-6 GHz channel covariance as an out-of-band side information for mmWave link configuration.
Assuming: (i) a fully digital architecture at sub-6 GHz; and (ii) a hybrid analog-digital architecture at mmWave, we propose an out-of-band covariance translation approach and an out-of-band aided compressed covariance estimation approach.
For covariance translation, we estimate the parameters of sub-6 GHz covariance and use them in theoretical expressions of covariance matrices to predict the mmWave covariance.
For out-of-band aided covariance estimation, we use weighted sparse signal recovery to incorporate out-of-band information in compressed covariance estimation.
The out-of-band covariance translation eliminates the in-band training completely, whereas out-of-band aided covariance estimation relies on in-band as well as out-of-band training.
We also analyze the loss in the signal-to-noise ratio due to an imperfect estimate of the covariance.
The simulation results show that the proposed covariance estimation strategies can reduce the training overhead compared to the in-band only covariance estimation.
The entropy region is constructed from vectors of random variables by collecting Shannon entropies of all subvectors.
Its shape is studied here by means of polymatroidal constructions, notably by convolution.
The closure of the region is decomposed into the direct sum of tight and modular parts, reducing the study to the tight part.
The relative interior of the reduction belongs to the entropy region.
The behavior of the decomposition under self-adhesivity is clarified.
Results are specialized to and completed for the region of four random variables.
This and computer experiments help to visualize approximations of a symmetrized part of the entropy region.
The four-atom conjecture on the minimization of the Ingleton score is refuted.
English to Indian language machine translation poses the challenge of structural and morphological divergence.
This paper describes English to Indian language statistical machine translation using pre-ordering and suffix separation.
The pre-ordering uses rules to transfer the structure of the source sentences prior to training and translation.
This syntactic restructuring helps statistical machine translation tackle the structural divergence and hence achieve better translation quality.
The suffix separation is used to tackle the morphological divergence between English and highly agglutinative Indian languages.
We demonstrate that the use of pre-ordering and suffix separation helps in improving the quality of English to Indian Language machine translation.
The authors propose a parametric model called the arena model for prediction in paired competitions, i.e. paired comparisons with eliminations and bifurcations.
The arena model has a number of appealing advantages.
First, it predicts the results of competitions without rating many individuals.
Second, it takes full advantage of the structure of competitions.
Third, the model provides an easy method to quantify the uncertainty in competitions.
Fourth, some of our methods can be directly generalized for comparisons among three or more individuals.
Furthermore, the authors identify an invariant Bayes estimator with regard to the prior distribution and prove the consistency of the estimations of uncertainty.
Currently, the arena model is not effective in tracking the change of strengths of individuals, but its basic framework provides a solid foundation for future study of such cases.
Synchronizing sequences were proposed in the late 1960s to solve testing problems on systems modeled by finite state machines.
Such sequences lead a system, seen as a black box, from an unknown current state to a known final one.
This paper presents a first investigation of the computation of synchronizing sequences for systems modeled by bounded synchronized Petri nets.
In the first part of the paper, existing techniques for automata are adapted to this new setting.
Later on, new approaches, that exploit the net structure to efficiently compute synchronizing sequences without an exhaustive enumeration of the state space, are presented.
The Variational Autoencoder (VAE) is a powerful architecture capable of representation learning and generative modeling.
When it comes to learning interpretable (disentangled) representations, VAE and its variants show unparalleled performance.
However, the reasons for this are unclear, since a very particular alignment of the latent embedding is needed but the design of the VAE does not encourage it in any explicit way.
We address this matter and offer the following explanation: the diagonal approximation in the encoder together with the inherent stochasticity force local orthogonality of the decoder.
The local behavior of promoting both reconstruction and orthogonality matches closely how the PCA embedding is chosen.
Alongside providing an intuitive understanding, we justify the statement with full theoretical analysis as well as with experiments.
This paper investigates delay-distortion-power trade-offs in the transmission of quasi-stationary sources over block fading channels by studying encoder and decoder buffering techniques to smooth out the source and channel variations.
Four source and channel coding schemes that consider buffer and power constraints are presented to minimize the reconstructed source distortion.
The first one is a high performance scheme, which benefits from optimized source and channel rate adaptation.
In the second scheme, the channel coding rate is fixed and optimized along with transmission power with respect to channel and source variations; hence this scheme enjoys simplicity of implementation.
The last two schemes have fixed transmission power with an optimized adaptive or fixed channel coding rate.
For all the proposed schemes, closed-form solutions for the mean distortion, optimized rate, and power are provided, and in the high-SNR regime, the mean distortion exponent and the asymptotic mean power gains are derived.
The proposed schemes with buffering exploit the diversity due to source and channel variations.
Specifically, when the buffer size is limited, the fixed-channel-rate adaptive-power scheme outperforms the adaptive-rate fixed-power scheme.
Furthermore, analytical and numerical results demonstrate that with limited buffer size, the system performance in terms of reconstructed signal SNR saturates as transmission power is increased, suggesting that appropriate buffer size selection is important to achieve a desired reconstruction quality.
In this paper, we consider a MU-MISO system where users have highly accurate Channel State Information (CSI), while the Base Station (BS) has partial CSI consisting of an imperfect channel estimate and statistical knowledge of the CSI error.
With the objective of maximizing the Average Sum Rate (ASR) subject to a power constraint, a special transmission scheme is considered where the BS transmits a common symbol in a multicast fashion, in addition to the conventional private symbols.
This scheme is termed Joint Multicasting and Broadcasting (JMB).
The ASR problem is transformed into an augmented Average Weighted Sum Mean Square Error (AWSMSE) problem which is solved using Alternating Optimization (AO).
The enhanced rate performance accompanied with the incorporation of the multicast part is demonstrated through simulations.
Most saliency estimation methods aim to explicitly model low-level conspicuity cues such as edges or blobs and may additionally incorporate top-down cues using face or text detection.
Data-driven methods for training saliency models using eye-fixation data are increasingly popular, particularly with the introduction of large-scale datasets and deep architectures.
However, current methods in this latter paradigm use loss functions designed for classification or regression tasks whereas saliency estimation is evaluated on topographical maps.
In this work, we introduce a new saliency map model which formulates a map as a generalized Bernoulli distribution.
We then train a deep architecture to predict such maps using novel loss functions which pair the softmax activation function with measures designed to compute distances between probability distributions.
We show in extensive experiments the effectiveness of such loss functions over standard ones on four public benchmark datasets, and demonstrate improved performance over state-of-the-art saliency methods.
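The pairing of a softmax activation with a probability-distance measure can be sketched as a KL-divergence loss over a flattened map. This is a minimal illustrative reimplementation under the generalized-Bernoulli view, not the authors' code; the toy maps below are assumptions.

```python
import numpy as np

def softmax(z):
    """Normalize a map of logits into a categorical distribution over pixels."""
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_saliency_loss(logits, fixation_map):
    """KL(target || prediction): treats the saliency map as one generalized
    Bernoulli (categorical) distribution over pixel locations."""
    p = softmax(np.asarray(logits, float).ravel())
    q = np.asarray(fixation_map, float).ravel()
    q = q / q.sum()
    mask = q > 0                     # 0 * log(0) contributes nothing
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

# a prediction matching the target distribution has (near-)zero loss,
# while a uniform prediction is penalized
target = np.array([[0.1, 0.2], [0.3, 0.4]])
perfect = kl_saliency_loss(np.log(target), target)
uniform = kl_saliency_loss(np.zeros((2, 2)), target)
```

Because the whole map is normalized jointly, the loss compares spatial distributions of attention rather than per-pixel regression targets.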
The IPv4 addresses exhaustion demands a protocol transition from IPv4 to IPv6.
The original transition technique, dual stack, is not yet widely deployed, which has demanded the creation of new transition techniques to extend the transition period.
This work makes an experimental comparison of techniques that use dual stack with a limited IPv4 address.
This limited address might be a RFC 1918 address with a NAT at the Internet Service Provider (ISP) gateway, also known as Carrier Grade NAT (CGN), or an Address Plus Port (A+P) shared IPv4 address.
The chosen techniques also consider an IPv6-only ISP network.
The transport of IPv4 packets through IPv6-only networks may use IPv4 packets encapsulated in IPv6 packets, or a double translation: one IPv4-to-IPv6 translation to enter the IPv6-only network and one IPv6-to-IPv4 translation to return to the IPv4 network.
The chosen techniques were DS-Lite, 464XLAT, MAP-E and MAP-T.
The first part of the test is to check some of the most common usages of the Internet by a home user and the impacts of the transition techniques on the user experience.
The second part is a measured comparison considering bandwidth, jitter and latency introduced by the techniques and processor usage on the network equipment.
We present a method for finding correspondence between 3D models.
From an initial set of feature correspondences, our method uses a fast voting scheme to separate the inliers from the outliers.
The novelty of our method lies in the use of a combination of local and global constraints to determine if a vote should be cast.
On a local scale, we use simple, low-level geometric invariants.
On a global scale, we apply covariant constraints for finding compatible correspondences.
We guide the sampling for collecting voters by downward dependencies on previous voting stages.
All of this together results in an accurate matching procedure.
We evaluate our algorithm by controlled and comparative testing on different datasets, giving superior performance compared to state-of-the-art methods.
In a final experiment, we apply our method for 3D object detection, showing potential use of our method within higher-level vision.
With the emergence of Non-Volatile Memories (NVMs) and their shortcomings such as limited endurance and high power consumption in write requests, several studies have suggested hybrid memory architecture employing both Dynamic Random Access Memory (DRAM) and NVM in a memory system.
By conducting comprehensive experiments, we have observed that such studies fail to consider important aspects of hybrid memories, including the effects of: a) data migrations on performance, b) data migrations on power, and c) the granularity of data migration.
This paper presents an efficient data migration scheme at the Operating System level in a hybrid DRAM-NVM memory architecture.
In the proposed scheme, two Least Recently Used (LRU) queues, one for the DRAM section and one for the NVM section, are used for the sake of data migration.
With careful characterization of the workloads obtained from the PARSEC benchmark suite, the proposed scheme prevents unnecessary migrations and only allows migrations that benefit the system in terms of power and performance.
The experimental results show that the proposed scheme can reduce power consumption by up to 79% compared to DRAM-only memory and by up to 48% compared to state-of-the-art techniques.
Corrective Transmission Switching can be used by the grid operator to relieve line overloading and voltage violations, improve system reliability, and reduce system losses.
Power grid optimization by means of line switching is typically formulated as a mixed integer programming problem (MIP).
Such problems are known to be computationally intractable, and accordingly, a number of heuristic approaches to grid topology reconfiguration have been proposed in the power systems literature.
By means of some low-order examples (3-bus systems), it is shown that within a reasonably large class of greedy heuristics, none can be found that performs better than the others across all grid topologies.
Despite this cautionary tale, statistical evidence based on a large number of simulations using IEEE 118-bus systems indicates that among three heuristics, a globally greedy heuristic is the most computationally intensive, but has the best chance of reducing generation costs while enforcing N-1 connectivity.
It is argued that, among all iterative methods, the locally optimal switches at each stage have a better chance in not only approximating a global optimal solution but also greatly limiting the number of lines that are switched.
One of the main difficulties in echo cancellation is the fact that the learning rate needs to vary according to conditions such as double-talk and echo path change.
In this paper we propose a new method of varying the learning rate of a frequency-domain echo canceller.
This method is based on the derivation of the optimal learning rate of the NLMS algorithm in the presence of noise.
The method is evaluated in conjunction with the multidelay block frequency domain (MDF) adaptive filter.
We demonstrate that it performs better than current double-talk detection techniques and is simple to implement.
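The paper's derivation concerns the optimal learning rate of a frequency-domain (MDF) canceller; as background, the underlying normalized-step NLMS update can be sketched in the time domain. The white far-end signal and synthetic echo path below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def nlms_echo_canceller(x, d, taps, mu=0.5, eps=1e-8):
    """Time-domain NLMS: adapt filter w toward the echo path and
    return the residual error signal e[n] = d[n] - w.u."""
    w = np.zeros(taps)
    e = np.zeros(len(x))
    for n in range(taps - 1, len(x)):
        u = x[n - taps + 1:n + 1][::-1]        # x[n], x[n-1], ..., x[n-taps+1]
        e[n] = d[n] - w @ u                    # residual echo
        w += (mu / (eps + u @ u)) * e[n] * u   # normalized (variable) step
    return e, w

rng = np.random.default_rng(0)
x = rng.standard_normal(4000)                                # far-end signal
h = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))   # synthetic echo path
d = np.convolve(x, h)[:len(x)]                               # microphone: echo only
e, w = nlms_echo_canceller(x, d, taps=32)
```

In this noiseless single-talk sketch the residual decays toward zero; the double-talk problem the paper addresses arises precisely because near-end speech in `d` would otherwise drive this update off course.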
Text Clustering is a text mining technique which divides the given set of text documents into significant clusters.
It is used for organizing a huge number of text documents into a well-organized form.
In the majority of clustering algorithms, the number of clusters must be specified a priori, which is a drawback of these algorithms.
The aim of this paper is to show experimentally how to determine the number of clusters based on cluster quality.
Since partitional clustering algorithms are well-suited for clustering large document datasets, we have confined our analysis to a partitional clustering algorithm.
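One common quality score for this purpose is the silhouette coefficient; the sketch below selects the number of clusters maximizing it with a minimal k-means. It is a generic illustration of quality-driven selection of k, not the paper's exact procedure, and the 2-D toy "documents" are assumptions.

```python
import numpy as np

def farthest_first_init(X, k):
    """Deterministic seeding: start at X[0], then repeatedly take the
    point farthest from the centers chosen so far."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)

def kmeans(X, k, n_iter=50):
    """Minimal Lloyd iteration (partitional clustering)."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels

def silhouette(X, labels):
    """Mean silhouette coefficient: (b - a) / max(a, b) per point."""
    D = np.linalg.norm(X[:, None] - X[None], axis=2)
    n, s = len(X), []
    for i in range(n):
        same = (labels == labels[i]) & (np.arange(n) != i)
        a = D[i, same].mean() if same.any() else 0.0
        b = min(D[i, labels == c].mean()
                for c in np.unique(labels) if c != labels[i])
        s.append((b - a) / max(a, b))
    return float(np.mean(s))

# three well-separated "document" clusters in a 2-D feature space
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, (30, 2)) for c in [(0, 0), (5, 0), (0, 5)]])
scores = {k: silhouette(X, kmeans(X, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

On this toy data the quality score peaks at the true number of clusters, which is the behavior the experimental study relies on.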
We consider the problem of reconstructing a 3-D scene from a moving camera with high frame rate using the affine projection model.
This problem is traditionally known as Affine Structure from Motion (Affine SfM), and can be solved using an elegant low-rank factorization formulation.
In this paper, we assume that an accelerometer and gyro are rigidly mounted with the camera, so that synchronized linear acceleration and angular velocity measurements are available together with the image measurements.
We extend the standard Affine SfM algorithm to integrate these measurements through the use of image derivatives.
Past research has shown the benefits of food journaling in promoting mindful eating and healthier food choices.
However, the links between journaling and healthy eating have not been thoroughly examined.
Beyond caloric restriction, do journalers consistently and sufficiently consume healthful diets?
How different are their eating habits compared to those of average consumers who tend to be less conscious about health?
In this study, we analyze the healthy eating behaviors of active food journalers using data from MyFitnessPal.
Surprisingly, our findings show that food journalers do not eat as healthily as they should despite their proclivity toward healthy eating, and their food choices resemble those of the general populace.
Furthermore, we find that the journaling duration is only a marginal determinant of healthy eating outcomes and sociodemographic factors, such as gender and regions of residence, are much more predictive of healthy food choices.
We construct a family of perfect polyphase sequences that has the Frank sequences, Chu sequences, and Milewski sequences as special cases.
This is not the most general construction of this type, but it has a particularly simple form.
We also include some remarks about the acyclic autocorrelations of our sequences.
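As an illustration of the "perfect" property, a Chu (Zadoff-Chu) sequence of odd length has zero periodic autocorrelation at every nonzero lag. This is the textbook special case, not the paper's more general family.

```python
import numpy as np

def chu_sequence(N, r=1):
    """Chu sequence of odd length N with root r coprime to N."""
    n = np.arange(N)
    return np.exp(1j * np.pi * r * n * (n + 1) / N)

def periodic_autocorrelation(s):
    """R(k) = sum_n conj(s[n]) * s[(n - k) mod N] for each lag k."""
    return np.array([np.vdot(s, np.roll(s, k)) for k in range(len(s))])

s = chu_sequence(7)
R = periodic_autocorrelation(s)   # |R(0)| = 7, R(k) = 0 for k != 0
```

The quadratic phase makes each shifted product a pure complex exponential in n, so every nonzero-lag sum vanishes exactly.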
We present a machine learning framework that leverages a mixture of metadata, network, and temporal features to detect extremist users, and predict content adopters and interaction reciprocity in social media.
We exploit a unique dataset containing millions of tweets generated by more than 25 thousand users who have been manually identified, reported, and suspended by Twitter due to their involvement with extremist campaigns.
We also leverage millions of tweets generated by a random sample of 25 thousand regular users who were exposed to, or consumed, extremist content.
We carry out three forecasting tasks, (i) to detect extremist users, (ii) to estimate whether regular users will adopt extremist content, and finally (iii) to predict whether users will reciprocate contacts initiated by extremists.
All forecasting tasks are set up in two scenarios: a post hoc (time independent) prediction task on aggregated data, and a simulated real-time prediction task.
The performance of our framework is extremely promising, yielding in the different forecasting scenarios up to 93% AUC for extremist user detection, up to 80% AUC for content adoption prediction, and finally up to 72% AUC for interaction reciprocity forecasting.
We conclude by providing a thorough feature analysis that helps determine which are the emerging signals that provide predictive power in different scenarios.
Wireless network applications, such as searching, routing, self-stabilization, and query processing, can be modeled as random walks on graphs.
Stateless opportunistic routing is a robust distributed routing technique based on the random walk approach, where nodes transfer packets to one of their direct neighbors uniformly at random until the packets reach their destinations.
Its simplicity of execution, fault tolerance, low overhead, and robustness to topology changes make it well suited to wireless sensor network scenarios.
The main difficulty with stateless opportunistic routing, however, lies in estimating and studying the effect of network parameters on packet latency.
In this work, we derive analytical expressions for the mean latency, or average packet travel time, for r-nearest-neighbor cycle and r-nearest-neighbor torus networks.
Further, we derive a generalized expression for the mean latency of m-dimensional r-nearest-neighbor torus networks and study the effect of the number of nodes, nearest neighbors, and network dimension on the average packet travel time.
The basic indicators of a researcher's productivity and impact are still the number of publications and their citation counts.
These metrics are clear, straightforward, and easy to obtain.
When a ranking of scholars is needed, for instance in grant, award, or promotion procedures, their use is the fastest and cheapest way of prioritizing some scientists over others.
However, due to their nature, there is a danger of oversimplifying scientific achievements.
Therefore, many other indicators have been proposed including the usage of the PageRank algorithm known for the ranking of webpages and its modifications suited to citation networks.
Nevertheless, this recursive method is computationally expensive and even if it has the advantage of favouring prestige over popularity, its application should be well justified, particularly when compared to the standard citation counts.
In this study, we analyze three large datasets of computer science papers in the categories of artificial intelligence, software engineering, and theory and methods and apply 12 different ranking methods to the citation networks of authors.
We compare the resulting rankings with self-compiled lists of outstanding researchers selected as frequent editorial board members of prestigious journals in the field and conclude that there is no evidence of PageRank-based methods outperforming simple citation counts.
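For reference, the PageRank recursion the study compares against citation counting can be sketched with a few lines of power iteration. The toy author citation graph and damping factor 0.85 are illustrative choices, not data from the study.

```python
import numpy as np

def pagerank(adj, d=0.85, tol=1e-10):
    """Power iteration on a row-normalized adjacency matrix;
    adj[i, j] = 1 means author i cites author j."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    M = np.where(out > 0, adj / np.maximum(out, 1), 1.0 / n)  # dangling -> uniform
    r = np.full(n, 1.0 / n)
    while True:
        r_new = (1 - d) / n + d * (M.T @ r)
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new

# toy network: authors 0 and 1 cite each other; 2, 3, 4 all cite 0
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [1, 0, 0, 0, 0]], float)
A = np.vstack([A, [1, 0, 0, 0, 0]])
pr = pagerank(A)
citation_counts = A.sum(axis=0)
```

On this graph author 1 outranks authors 2-4 despite an identical citation count, because the single citation comes from a prestigious source; this "prestige over popularity" effect is exactly what the recursive method is meant to capture.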
System Gramian matrices are a well-known encoding for properties of input-output systems such as controllability, observability or minimality.
These so-called system Gramians were developed in linear system theory for applications such as model order reduction of control systems.
Empirical Gramians are an extension of the system Gramians to parametric and nonlinear systems, as well as a data-driven method of computation.
The empirical Gramian framework - emgr - implements the empirical Gramians in a uniform and configurable manner, with applications such as Gramian-based (nonlinear) model reduction, decentralized control, sensitivity analysis, parameter identification and combined state and parameter reduction.
Deep Reinforcement Learning (DRL) has achieved impressive success in many applications.
A key component of many DRL models is a neural network representing a Q function, to estimate the expected cumulative reward following a state-action pair.
The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted.
To our knowledge, this work develops the first mimic learning framework for Q functions in DRL.
We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions.
An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment.
Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods.
The transparent tree structure of an LMUT facilitates understanding the network's learned knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs.
The problem of robustly reconstructing a large number from its erroneous remainders with respect to several moduli, namely the robust remaindering problem, may occur in many applications including phase unwrapping, frequency detection from several undersampled waveforms, wireless sensor networks, etc.
Assuming that the dynamic range of the large number is the maximal possible one, i.e., the least common multiple (lcm) of all the moduli, a method called robust Chinese remainder theorem (CRT) for solving the robust remaindering problem has been recently proposed.
In this paper, by relaxing the assumption that the dynamic range is fixed to be the lcm of all the moduli, a trade-off between the dynamic range and the robustness bound for two-modular systems is studied.
It basically says that a decrease in the dynamic range may lead to an increase of the robustness bound.
We first obtain a general condition on the remainder errors and derive the exact dynamic range with a closed-form formula for the robustness to hold.
We then propose simple closed-form reconstruction algorithms.
Furthermore, the newly obtained two-modular results are applied to the robust reconstruction for multi-modular systems and generalized to real numbers.
Finally, some simulations are carried out to verify our proposed theoretical results.
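For context, the classical (error-free) CRT reconstruction that these results build on can be sketched directly; the robust variant additionally tolerates small remainder errors by trading off dynamic range, which this minimal sketch does not attempt.

```python
from functools import reduce

def crt(remainders, moduli):
    """Classical CRT: recover N modulo the product of pairwise coprime
    moduli (here the product equals the lcm, i.e., the dynamic range)."""
    M = reduce(lambda a, b: a * b, moduli)
    x = 0
    for r, m in zip(remainders, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # pow(., -1, m): modular inverse (Python 3.8+)
    return x % M

N = 2357
moduli = [5, 7, 11, 13]                # dynamic range = 5005
recovered = crt([N % m for m in moduli], moduli)
```

Each term is congruent to the corresponding remainder modulo its own modulus and to zero modulo all the others, so the sum reconstructs N exactly whenever N is below the dynamic range.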
In this paper we present a method for automatically planning robust optimal paths for a group of robots that satisfy a common high level mission specification.
Each robot's motion in the environment is modeled as a weighted transition system, and the mission is given as a Linear Temporal Logic (LTL) formula over a set of propositions satisfied by the regions of the environment.
In addition, an optimizing proposition must repeatedly be satisfied.
The goal is to minimize the maximum time between satisfying instances of the optimizing proposition while ensuring that the LTL formula is satisfied even with uncertainty in the robots' traveling times.
We characterize a class of LTL formulas that are robust to robot timing errors, for which we generate optimal paths if no timing errors are present, and we present bounds on the deviation from the optimal values in the presence of errors.
We implement and experimentally evaluate our method considering a persistent monitoring task in a road network environment.
The problem of finding dominators in a directed graph has many important applications, notably in global optimization of computer code.
Although linear and near-linear-time algorithms exist, they use sophisticated data structures.
We develop an algorithm for finding dominators that uses only a "static tree" disjoint set data structure in addition to simple lists and maps.
The algorithm runs in near-linear or linear time, depending on the implementation of the disjoint set data structure.
We give several versions of the algorithm, including one that computes loop nesting information (needed in many kinds of global code optimization) and that can be made self-certifying, so that the correctness of the computed dominators is very easy to verify.
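As a reference point, the textbook iterative data-flow formulation of dominators (quadratic in the worst case, unlike the near-linear algorithm above) is a few lines; the diamond-shaped control-flow graph below is a hypothetical example.

```python
def dominators(succ, root):
    """Iterative data-flow computation of dominator sets.
    Assumes every node is reachable from root."""
    nodes = list(succ)
    pred = {v: [] for v in nodes}
    for u in nodes:
        for v in succ[u]:
            pred[v].append(u)
    dom = {v: set(nodes) for v in nodes}   # start from the full set
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in nodes:
            if v == root:
                continue
            # v is dominated by itself plus whatever dominates all predecessors
            new = set.intersection(*(dom[p] for p in pred[v])) | {v}
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom

# hypothetical diamond CFG: r branches to a and b, which rejoin at c
cfg = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
dom = dominators(cfg, "r")
```

On the diamond, neither branch node dominates the join node c; only the root does, which is the fact global code optimizers exploit when hoisting code.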
In this letter, we address the symbol synchronization issue in molecular communication via diffusion (MCvD).
Symbol synchronization among chemical sensors and nanomachines is one of the critical challenges to manage complex tasks in the nanonetworks with molecular communication (MC).
Since in diffusion-based MC most of the molecules arrive at the receptor close to the start of the symbol duration, a wrong estimate of the start of the symbol interval leads to a high symbol detection error.
By utilizing two types of molecules with different diffusion coefficients we propose a synchronization technique for MCvD.
Moreover, we evaluate the symbol-error-rate performance under the proposed symbol synchronization scheme for equal and non-equal symbol duration MCvD systems.
In this lecture note, we describe high dynamic range (HDR) imaging systems; such systems are able to represent a much larger range of luminances and, typically, also a larger range of colors than conventional standard dynamic range (SDR) imaging systems.
The larger luminance range greatly improves the overall quality of visual content, making it appear much more realistic and appealing to observers.
HDR is one of the key technologies of the future imaging pipeline, which will change the way the digital visual content is represented and manipulated today.
Bedside caregivers assess infants' pain at constant intervals by observing specific behavioral and physiological signs of pain.
This standard has two main limitations.
The first limitation is the intermittent assessment of pain, which might lead to missing pain when the infants are left unattended.
Second, it is inconsistent, since it depends on the observer's subjective judgment and thus differs between observers.
The intermittent and inconsistent assessment can induce poor treatment and, therefore, cause serious long-term consequences.
To mitigate these limitations, the current standard can be augmented by an automated system that monitors infants continuously and provides quantitative and consistent assessment of pain.
Several automated methods have been introduced to assess infants' pain automatically based on analysis of behavioral or physiological pain indicators.
This paper comprehensively reviews the automated approaches (i.e., approaches to feature extraction) for analyzing infants' pain and the current efforts in automatic pain recognition.
In addition, it reviews the databases available to the research community and discusses the current limitations of the automated pain assessment.
A/B testing is a standard approach for evaluating the effect of online experiments; the goal is to estimate the `average treatment effect' of a new feature or condition by exposing a sample of the overall population to it.
A drawback with A/B testing is that it is poorly suited for experiments involving social interference, when the treatment of individuals spills over to neighboring individuals along an underlying social network.
In this work, we propose a novel methodology using graph clustering to analyze average treatment effects under social interference.
To begin, we characterize graph-theoretic conditions under which individuals can be considered to be `network exposed' to an experiment.
We then show how graph cluster randomization admits an efficient exact algorithm to compute the probabilities for each vertex being network exposed under several of these exposure conditions.
Using these probabilities as inverse weights, a Horvitz-Thompson estimator can then provide an effect estimate that is unbiased, provided that the exposure model has been properly specified.
Given an estimator that is unbiased, we focus on minimizing the variance.
First, we develop simple sufficient conditions for the variance of the estimator to be asymptotically small in n, the size of the graph.
However, for general randomization schemes, this variance can be lower bounded by an exponential function of the degrees of a graph.
In contrast, we show that if a graph satisfies a restricted-growth condition on the growth rate of neighborhoods, then there exists a natural clustering algorithm, based on vertex neighborhoods, for which the variance of the estimator can be upper bounded by a linear function of the degrees.
Thus we show that proper cluster randomization can lead to exponentially lower estimator variance when experimentally measuring average treatment effects under interference.
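The exposure-probability and Horvitz-Thompson machinery can be sketched for the simplest exposure condition (a vertex and its full neighborhood treated) under independent Bernoulli cluster randomization. The two-edge graph and p = 1/2 are illustrative assumptions; the check enumerates all cluster assignments to exhibit unbiasedness.

```python
import itertools

def exposure_prob(i, neighbors, cluster_of, p=0.5):
    """P(i and all of i's neighbors treated) under independent Bernoulli(p)
    cluster randomization: p ** (number of distinct clusters touched)."""
    touched = {cluster_of[i]} | {cluster_of[j] for j in neighbors[i]}
    return p ** len(touched)

def ht_estimate(y, exposed, pi):
    """Horvitz-Thompson: inverse-probability-weighted mean of exposed responses."""
    n = len(y)
    return sum(y[i] / pi[i] for i in range(n) if exposed[i]) / n

# two disjoint edges, each contained in its own cluster
neighbors = {0: [1], 1: [0], 2: [3], 3: [2]}
cluster_of = {0: 0, 1: 0, 2: 1, 3: 1}
y = [1.0, 2.0, 3.0, 4.0]          # potential outcomes under full exposure
pi = [exposure_prob(i, neighbors, cluster_of) for i in range(4)]

# average the estimator over all equally likely cluster assignments
avg = 0.0
for assign in itertools.product([0, 1], repeat=2):
    exposed = [all(assign[cluster_of[j]] for j in [i] + neighbors[i])
               for i in range(4)]
    avg += ht_estimate(y, exposed, pi) / 4
```

Because each vertex's neighborhood falls inside a single cluster, every exposure probability is 1/2 rather than 1/4, and the assignment-averaged estimate equals the true mean of y, illustrating why clustering that matches neighborhoods keeps the inverse weights (and hence the variance) small.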
Deep CNNs have been pushing the frontier of visual recognition over past years.
Besides recognition accuracy, strong demands in understanding deep CNNs in the research community motivate developments of tools to dissect pre-trained models to visualize how they make predictions.
Recent works further push the interpretability in the network learning stage to learn more meaningful representations.
In this work, focusing on a specific area of visual recognition, we report our efforts towards interpretable face recognition.
We propose a spatial activation diversity loss to learn more structured face representations.
By leveraging the structure, we further design a feature activation diversity loss to push the interpretable representations to be discriminative and robust to occlusions.
We demonstrate on three face recognition benchmarks that our proposed method is able to improve face recognition accuracy with easily interpretable face representations.
In this paper, we address the question of information preservation in ill-posed, non-linear inverse problems, assuming that the measured data is close to a low-dimensional model set.
We provide necessary and sufficient conditions for the existence of a so-called instance optimal decoder, i.e., a decoder that is robust to noise and modelling error.
Inspired by existing results in compressive sensing, our analysis is based on a (Lower) Restricted Isometry Property (LRIP), formulated in a non-linear fashion.
We also provide sufficient conditions for non-uniform recovery with random measurement operators, with a new formulation of the LRIP.
We finish by describing typical strategies to prove the LRIP in both linear and non-linear cases, and illustrate our results by studying the invertibility of a one-layer neural net with random weights.
Business process models describe the way of working in an organization.
Typically, business process models distinguish between the normal flow of work and exceptions to that normal flow.
However, they often present an idealized view.
This means that unexpected exceptions - exceptions that are not modelled in the business process model - can also occur in practice.
This has an effect on the efficiency of the organization, because information systems are not developed to handle unexpected exceptions.
This paper studies the relation between the occurrence of exceptions and operational performance.
It does this by analyzing the execution logs of business processes from five organizations, classifying execution paths as normal or exceptional.
Subsequently, it analyzes the differences between normal and exceptional paths.
The results show that exceptions are related to worse operational performance in terms of a longer throughput time and that unexpected exceptions relate to a stronger increase in throughput time than expected exceptions.
In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.
We also present integrated sequence-to-sequence baselines for machine translation, automatic speech recognition, spoken language translation, and multimodal summarization.
By making available data and code for several multimodal natural language tasks, we hope to stimulate more research on these and similar challenges, to obtain a deeper understanding of multimodality in language processing.
We replicate a variation of the image captioning architecture by Vinyals et al. (2015), then introduce dropout during inference mode to simulate the effects of neurodegenerative diseases like Alzheimer's disease (AD) and Wernicke's aphasia (WA).
We evaluate the effects of dropout on language production by measuring the KL-divergence of word frequency distributions and other linguistic metrics as dropout is added.
We find that the generated sentences most closely approximate the word frequency distribution of the training corpus when using a moderate dropout of 0.4 during inference.
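The word-frequency comparison above can be sketched with a smoothed unigram KL divergence. The epsilon smoothing here is one simple choice among many, not necessarily the one used in the paper:

```python
import math
from collections import Counter

def word_freq_kl(reference_tokens, generated_tokens, eps=1e-9):
    """KL(reference || generated) between unigram frequency distributions.

    A small epsilon keeps words missing from one corpus from producing
    an infinite divergence.
    """
    ref, gen = Counter(reference_tokens), Counter(generated_tokens)
    vocab = set(ref) | set(gen)
    n_ref, n_gen = sum(ref.values()), sum(gen.values())
    kl = 0.0
    for w in vocab:
        p = ref[w] / n_ref + eps
        q = gen[w] / n_gen + eps
        kl += p * math.log(p / q)
    return kl
```

Computing this between the training corpus and sentences generated at each dropout level traces how far the model's output distribution drifts as "damage" increases.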
If a robot is supposed to roam an environment and interact with objects, it is often necessary to know all possible objects in advance, so that a database with models of all objects can be generated for visual identification.
However, this constraint cannot always be fulfilled.
For this reason, model-based object recognition cannot be used to guide the robot's interactions.
Therefore, this paper proposes a system that analyzes features of encountered objects and then uses these features to compare unknown objects to already known ones.
From the resulting similarity, appropriate actions can be derived.
Moreover, the system enables the robot to learn object categories by grouping similar objects or by splitting existing categories.
To represent this knowledge, a hybrid form is used, consisting of both symbolic and subsymbolic representations.
This paper introduces a novel clustering approach based on group consensus of dynamic linear high-order multi-agent systems.
The graph topology is associated with a selected multi-agent system, with each agent corresponding to one vertex.
In order to reveal the cluster structure, the agents belonging to the same cluster are expected to aggregate together.
As theoretical foundation, a necessary and sufficient condition is given to check the group consensus.
Two numerical examples illustrate the approach.
We reveal a complete set of constraints that need to be imposed on a set of 3-by-3 matrices to ensure that the matrices represent genuine homographies associated with multiple planes between two views.
We also show how to exploit the constraints to obtain more accurate estimates of homography matrices between two views.
Our study resolves a long-standing research question and provides a fresh perspective and a more in-depth understanding of the multiple homography estimation task.
As the number of charging Plug-in Electric Vehicles (PEVs) increases, the limited power capacity of the distribution feeders and the sensitivity of the mid-way distribution transformers to excessive load make it crucial to control the amount of power drawn through each distribution feeder, to avoid system overloads that may lead to breakdowns.
In this paper we develop, analyze and evaluate charging algorithms for PEVs with feeder overload constraints in the distribution grid.
The algorithms we propose jointly minimize the variance of the aggregate load and prevent overloading of the distribution feeders.
In this paper we present an online wide-area oscillation damping control (WAC) design for uncertain models of power systems using ideas from reinforcement learning.
We assume that the exact small-signal model of the power system at the onset of a contingency is not known to the operator and use the nominal model and online measurements of the generator states and control inputs to rapidly converge to a state-feedback controller that minimizes a given quadratic energy cost.
However, unlike conventional linear quadratic regulators (LQR), we intend our controller to be sparse, so its implementation reduces the communication costs.
We, therefore, employ the gradient support pursuit (GraSP) optimization algorithm to impose sparsity constraints on the control gain matrix during learning.
The sparse controller is thereafter implemented using distributed communication.
Using the IEEE 39-bus power system model with 1149 unknown parameters, it is demonstrated that the proposed learning method provides reliable LQR performance while the controller matched to the nominal model becomes unstable for severely uncertain systems.
Currently, the area of VANETs lacks well-designed algorithms to handle the dynamic changes and frequent disruptions caused by the high mobility of vehicles.
There are many techniques to disseminate messages across moving vehicles, but they all depend heavily on conditions involving flow, density and speed.
The two techniques that are commonly used are AODV (Ad Hoc on Demand Distance Vector) and DSRC (Dedicated Short Range Communication).
This work presents a detailed analysis of AODV.
This study is focused on the use of AODV in Intelligent Transportation System.
The limitations of the AODV routing protocol have been identified and demonstrated.
Removing these limitations, even partially, would improve the performance of vehicular networks, make driving safer and easier for ordinary users, and reduce implementation complications, enabling an efficient system implementation.
We describe the LoopInvGen tool for generating loop invariants that can provably guarantee correctness of a program with respect to a given specification.
LoopInvGen is an efficient implementation of the inference technique originally proposed in our earlier work on PIE (https://doi.org/10.1145/2908080.2908099).
In contrast to existing techniques, LoopInvGen is not restricted to a fixed set of features -- atomic predicates that are composed together to build complex loop invariants.
Instead, we start with no initial features, and use program synthesis techniques to grow the set on demand.
This not only enables a less onerous and more expressive approach, but also appears to be significantly faster than the existing tools over the SyGuS-COMP 2017 benchmarks from the INV track.
We show how the spellings of known words can help us deal with unknown words in open-vocabulary NLP tasks.
The method we propose can be used to extend any closed-vocabulary generative model, but in this paper we specifically consider the case of neural language modeling.
Our Bayesian generative story combines a standard RNN language model (generating the word tokens in each sentence) with an RNN-based spelling model (generating the letters in each word type).
These two RNNs respectively capture sentence structure and word structure, and are kept separate as in linguistics.
By invoking the second RNN to generate spellings for novel words in context, we obtain an open-vocabulary language model.
For known words, embeddings are naturally inferred by combining evidence from type spelling and token context.
Compared to baselines (including a novel strong baseline), we beat previous work and establish state-of-the-art results on multiple datasets.
The analysis of large collections of image data is still a challenging problem due to the difficulty of capturing the true concepts in visual data.
The similarity between images could be computed using different and possibly multimodal features such as color or edge information or even text labels.
This motivates the design of image analysis solutions that are able to effectively integrate the multi-view information provided by different feature sets.
We therefore propose a new image retrieval solution that is able to sort images through a random walk on a multi-layer graph, where each layer corresponds to a different type of information about the image data.
We study in depth the design of the image graph and propose in particular an effective method to select the edge weights for the multi-layer graph, such that the image ranking scores are optimised.
We then provide extensive experiments in different real-world photo collections, which confirm the high performance of our new image retrieval algorithm that generally surpasses state-of-the-art solutions due to a more meaningful image similarity computation.
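A minimal sketch of ranking by a random walk with restart on a multi-layer graph, assuming the per-layer combination weights are already chosen (the paper's contribution is precisely how to select them):

```python
import numpy as np

def multilayer_ranking(layers, weights, restart=0.15, iters=100):
    """Rank nodes by a random walk over a convex combination of layer
    affinity matrices.

    layers  : list of (n, n) non-negative affinity matrices, one per layer
              (e.g. color similarity, edge similarity, text-label similarity)
    weights : combination weight for each layer
    """
    n = layers[0].shape[0]
    combined = sum(w * a for w, a in zip(weights, layers))
    # Row-normalise to obtain a stochastic transition matrix.
    row_sums = combined.sum(axis=1, keepdims=True)
    P = combined / np.where(row_sums > 0, row_sums, 1.0)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        scores = restart / n + (1 - restart) * scores @ P
    return scores
```

Sorting images by the resulting scores gives the ranking; in the paper the layer weights are optimised rather than fixed as here.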
New technological developments have made it possible to interact with computer systems and applications anywhere and anytime.
It is vital that these applications are able to adapt to the user, as a person, and to their current situation, whatever that is.
Thus, the premises for an evolution towards a learning society and a knowledge economy are present.
Hence, there is a stringent demand for new learner-centred frameworks that allow active participation of learners in knowledge creation within communities, organizations, territories and society, at large.
This paper presents the multi-agent architecture of our context-aware system and the learning scenarios within ubiquitous learning environments that the system provides support for.
This architecture is the outcome of our endeavour to develop ePH, a system for sharing public interest information and knowledge, which is accessible through always-on, context-aware services.
This paper presents an effective color normalization method for thin blood film images of peripheral blood specimens.
Thin blood film images can easily be separated to foreground (cell) and background (plasma) parts.
The color of the plasma region is used to estimate and reduce the differences arising from different illumination conditions.
A second stage normalization based on the database-gray world algorithm transforms the color of the foreground objects to match a reference color character.
The quantitative experiments demonstrate the effectiveness of the method and its advantages against two other general purpose color correction methods: simple gray world and Retinex.
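For reference, the simple gray world baseline mentioned above can be sketched as follows. This is a generic implementation of the baseline, not the paper's database-gray-world variant:

```python
import numpy as np

def gray_world(img):
    """Simple gray world color correction: scale each channel so its mean
    equals the global mean intensity.

    img : float array of shape (H, W, 3), values in [0, 1]
    """
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gray = channel_means.mean()
    # Per-channel gains; guard against empty (all-zero) channels.
    gains = gray / np.where(channel_means > 0, channel_means, 1.0)
    return np.clip(img * gains, 0.0, 1.0)
```

The paper's approach differs in that it first separates cells from plasma and anchors the correction to a reference color character rather than a neutral gray.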
How can a delivery robot navigate reliably to a destination in a new office building, with minimal prior information?
To tackle this challenge, this paper introduces a two-level hierarchical approach, which integrates model-free deep learning and model-based path planning.
At the low level, a neural-network motion controller, called the intention-net, is trained end-to-end to provide robust local navigation.
The intention-net maps images from a single monocular camera and "intentions" directly to robot controls.
At the high level, a path planner uses a crude map, e.g., a 2-D floor plan, to compute a path from the robot's current location to the goal.
The planned path provides intentions to the intention-net.
Preliminary experiments suggest that the learned motion controller is robust against perceptual uncertainty and by integrating with a path planner, it generalizes effectively to new environments and goals.
Fuzzing and symbolic execution are popular techniques for finding vulnerabilities and generating test-cases for programs.
Fuzzing, a blackbox method that mutates seed input values, is generally incapable of generating diverse inputs that exercise all paths in the program.
Due to the path-explosion problem and dependence on SMT solvers, symbolic execution may also not achieve high path coverage.
A hybrid technique involving fuzzing and symbolic execution may achieve better function coverage than fuzzing or symbolic execution alone.
In this paper, we present Munch, an open source framework implementing two hybrid techniques based on fuzzing and symbolic execution.
We empirically show using nine large open-source programs that overall, Munch achieves higher (in-depth) function coverage than symbolic execution or fuzzing alone.
Using metrics based on total analyses time and number of queries issued to the SMT solver, we also show that Munch is more efficient at achieving better function coverage.
This paper presents a generalized energy storage system model for voltage and angle stability analysis.
The proposed solution allows modeling most common energy storage technologies through a given set of linear differential algebraic equations (DAEs).
In particular, the paper considers, but is not limited to, compressed air, superconducting magnetic, electrochemical capacitor and battery energy storage devices.
While able to cope with a variety of different technologies, the proposed generalized model proves to be accurate for angle and voltage stability analysis, as it includes a balanced, fundamental-frequency model of the voltage source converter (VSC) and the dynamics of the dc link.
Regulators with inclusion of hard limits are also taken into account.
The transient behavior of the generalized model is compared with detailed fundamental-frequency balanced models as well as commonly-used simplified models of energy storage devices.
A comprehensive case study based on the WSCC 9-bus test system is presented and discussed.
The basic objective of data visualization is to provide an efficient graphical display for summarizing and reasoning about quantitative information.
During the last decades, political science has accumulated a large corpus of various kinds of data, such as comprehensive factbooks and atlases, characterizing all or most existing states by multiple and objectively assessed numerical indicators within a certain time span.
As a consequence, there exists a continuous trend for political science to gradually become a more quantitative scientific field and to use quantitative information in the analysis and reasoning.
It is believed that any objective analysis in political science must be multidimensional and combine various sources of quantitative information; however, human capabilities for perceiving large masses of numerical information are limited.
Hence, methods and approaches for visualizing quantitative and qualitative data (and especially multivariate data) are an extremely important topic.
Data visualization approaches can be classified into several groups, starting from creating informative charts and diagrams (statistical graphics and infographics) and ending with advanced statistical methods for visualizing multidimensional tables containing both quantitative and qualitative information.
In this article we provide a short review of existing data visualization methods with applications in political and social science.
Chagas disease is a neglected disease, and information about its geographical spread is very scarce.
We analyze here mobility and calling patterns in order to identify potential risk zones for the disease, by using public health information and mobile phone records.
Geolocalized call records are rich in social and mobility information, which can be used to infer whether an individual has lived in an endemic area.
We present two case studies in Latin American countries.
Our objective is to generate risk maps which can be used by public health campaign managers to prioritize detection campaigns and target specific areas.
Finally, we analyze the value of mobile phone data to infer long-term migrations, which play a crucial role in the geographical spread of Chagas disease.
We revisit the Blind Deconvolution problem with a focus on understanding its robustness and convergence properties.
Provable robustness to noise and other perturbations is receiving recent interest in vision, from obtaining immunity to adversarial attacks to assessing and describing failure modes of algorithms in mission critical applications.
Further, many blind deconvolution methods based on deep architectures internally make use of or optimize the basic formulation, so a clearer understanding of how this sub-module behaves, when it can be solved, and what noise injection it can tolerate is a first order requirement.
We derive new insights into the theoretical underpinnings of blind deconvolution.
The algorithm that emerges has nice convergence guarantees and is provably robust in a sense we formalize in the paper.
Interestingly, these technical results play out very well in practice, where on standard datasets our algorithm yields results competitive with or superior to the state of the art.
Keywords: blind deconvolution, robust continuous optimization
Delayed feedback control is an easily realizable control method which generates the control force by comparing the current and delayed versions of the system states.
In this paper, a new form of the delayed feedback structure is introduced.
Based on the proposed delayed feedback method, a new robust tracking system is designed.
This tracking system improves on conventional state feedback with integral action and is also able to reject higher-order disturbances compared to the conventional method.
In addition, the proposed tracking system also tracks ramp-shaped reference inputs, which is not possible with conventional state feedback.
Because it is easy to implement, the proposed delayed feedback tracking system can be used effectively in practical applications.
Moreover, since the proposed method adds delays to the closed loop system dynamics, the ordinary differential equation of the system changes to a delay differential equation with an infinite number of characteristic roots.
Thus, conventional pole placement procedures cannot be used to design the delayed feedback controller parameters and place the unstable roots in the left half plane.
In this paper, the simulated annealing algorithm is used to determine the proposed control system parameters and move the unstable roots of the delay differential equation to the left half plane.
Finally, the efficiency of the proposed reference input tracker is demonstrated on a case study.
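A generic simulated annealing loop of the kind used to tune the controller parameters might look as follows. The cost function here is a stand-in: in the paper it would measure how far the unstable characteristic roots of the delay differential equation are from the left half plane.

```python
import math
import random

def simulated_annealing(cost, x0, step=0.5, t0=1.0, cooling=0.95, iters=500, seed=1):
    """Generic simulated annealing minimizer (illustrative sketch).

    cost : function mapping a parameter vector to a scalar to minimize
    x0   : initial parameter vector
    """
    random.seed(seed)
    x, fx = list(x0), cost(x0)
    best, fbest = list(x), fx
    t = t0
    for _ in range(iters):
        # Random perturbation of the current parameters.
        cand = [xi + random.uniform(-step, step) for xi in x]
        fc = cost(cand)
        # Accept improvements always; accept worse moves with a
        # temperature-dependent probability to escape local minima.
        if fc < fx or random.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
        t *= cooling
    return best, fbest
```

The infinite root spectrum of the delay differential equation is why a derivative-free global search like this is used instead of pole placement.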
This paper considers the problem of single-server single-message private information retrieval with coded side information (PIR-CSI).
In this problem, there is a server storing a database, and a user who knows a linear combination of a subset of messages in the database as side information.
The number of messages contributing to the side information is known to the server, but the indices and the coefficients of these messages are unknown to the server.
The user wishes to download a message from the server privately, i.e., without revealing which message it is requesting, while minimizing the download cost.
In this work, we consider two different settings for the PIR-CSI problem, depending on whether or not the demanded message is one of the messages contributing to the side information.
For each setting, we prove an upper bound on the maximum download rate as a function of the size of the database and the size of the side information, and propose a protocol that achieves the rate upper-bound.
Centrality is one of the most studied concepts in social network analysis.
There is a huge literature regarding centrality measures, as ways to identify the most relevant users in a social network.
The challenge is to find measures that can be computed efficiently and that can classify users according to relevance criteria as close as possible to reality.
We address this problem in the context of the Twitter network, an online social networking service with millions of users and an impressive flow of messages that are published and spread daily by interactions between users.
Twitter has different types of users, but the greatest utility lies in finding the most influential ones.
The purpose of this article is to collect and classify the different Twitter influence measures that exist so far in literature.
These measures are very diverse.
Some are based on simple metrics provided by the Twitter API, while others are based on complex mathematical models.
Several measures are based on the PageRank algorithm, traditionally used to rank the websites on the Internet.
Some others consider the timeline of publication, others the content of the messages, some are focused on specific topics, and others try to make predictions.
We consider all these aspects, and some additional ones.
Furthermore, we include measures of activity and popularity, the traditional mechanisms to correlate measures, and some important aspects of computational complexity for this particular context.
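As one concrete example of the PageRank-based family of measures mentioned above, a minimal PageRank over a hypothetical "who follows whom" graph (edge u -> v means u follows v, so v gains rank from u):

```python
def pagerank(followers, d=0.85, iters=50):
    """Plain iterative PageRank on a directed graph given as
    {user: [users they follow]} (illustrative sketch)."""
    nodes = set(followers)
    for outs in followers.values():
        nodes.update(outs)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1 - d) / n for v in nodes}
        for u, outs in followers.items():
            if outs:
                share = d * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
        # Dangling nodes (no outgoing edges) spread their rank uniformly.
        dangling = sum(rank[u] for u in nodes if not followers.get(u))
        for v in nodes:
            new[v] += d * dangling / n
        rank = new
    return rank
```

Many of the surveyed influence measures are variations on this scheme, e.g. weighting edges by retweets or mentions instead of follow relations.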
Full duplex (FD) communications has the potential to double the capacity of a half duplex (HD) system at the link level.
However, in a cellular network, FD operation is not a straightforward extension of half duplex operations.
The increased interference due to a large number of simultaneous transmissions in FD operation and realtime traffic conditions limits the capacity improvement.
Realizing the potential of FD requires careful coordination of resource allocation among the cells as well as within the cell.
In this paper, we propose a distributed resource allocation, i.e., joint user selection and power allocation for a FD multi-cell system, assuming FD base stations (BSs) and HD user equipment (UEs).
Due to the complexity of finding the globally optimum solution, a sub-optimal solution for UE selection, and a novel geometric programming based solution for power allocation, are proposed.
The proposed distributed approach converges quickly and performs almost as well as a centralized solution, but with much lower signaling overhead.
It provides a hybrid scheduling policy which allows FD operations whenever it is advantageous, but otherwise defaults to HD operation.
We focus on small cell systems because they are more suitable for FD operation, given practical self-interference cancellation limits. With practical self-interference cancellation, it is shown that the proposed hybrid FD system achieves nearly two times the throughput of the HD system for an indoor multi-cell scenario, and about a 65% improvement for an outdoor multi-cell scenario.
A family of reconfigurable parallel robots can change motion modes by passing through constraint singularities, locking and releasing some passive joints of the robot.
This paper is about the kinematics, the workspace and singularity analysis of a 3-PRPiR parallel robot involving lockable Pi and R (revolute) joints.
Here, a Pi joint may act as a 1-DOF planar parallelogram if its lockable P (prismatic) joint is locked, or as a 2-DOF RR serial chain if its lockable P joint is released.
The operation modes of the robot range from a 3T operation mode to three 2T1R operation modes with two different directions of the rotation axis of the moving platform.
The inverse and forward kinematics of the robot in each operation mode are dealt with in detail.
The workspace analysis of the robot allows us to determine the regions of the workspace that the robot can reach in each operation mode.
A prototype built at Heriot-Watt University is used to illustrate the results of this work.
Aspect-level sentiment classification aims to identify the sentiment expressed towards some aspects given context sentences.
In this paper, we introduce an attention-over-attention (AOA) neural network for aspect level sentiment classification.
Our approach models aspects and sentences in a joint way and explicitly captures the interaction between aspects and context sentences.
With the AOA module, our model jointly learns the representations for aspects and sentences, and automatically focuses on the important parts in sentences.
Our experiments on laptop and restaurant datasets demonstrate that our approach outperforms previous LSTM-based architectures.
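A rough numpy sketch of an attention-over-attention computation, with plain word vectors standing in for the hidden states of the actual model (the paper's exact architecture may differ in detail):

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aoa_attention(sent, aspect):
    """Attention-over-attention sketch.

    sent   : (n, d) sentence word representations
    aspect : (m, d) aspect word representations
    returns a (d,) sentence representation weighted by the final attention.
    """
    interaction = sent @ aspect.T          # (n, m) pairwise word scores
    alpha = softmax(interaction, axis=0)   # column-wise: per aspect word
    beta = softmax(interaction, axis=1)    # row-wise: per sentence word
    beta_avg = beta.mean(axis=0)           # (m,) averaged aspect attention
    gamma = alpha @ beta_avg               # (n,) attention over attention
    return sent.T @ gamma                  # weighted sentence representation
```

The point of the second attention level is that the final sentence weights `gamma` are themselves weighted by how much each aspect word matters, rather than treating all aspect words equally.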
Fine-grained entity type classification (FETC) is the task of classifying an entity mention to a broad set of types.
The distant supervision paradigm is extensively used to generate training data for this task.
However, the generated training data assigns the same set of labels to every mention of an entity, without considering its local context.
Existing FETC systems have two major drawbacks: they assume the training data to be noise-free, and they use hand-crafted features.
Our work overcomes both drawbacks.
We propose a neural network model that jointly learns entity mentions and their context representation, eliminating the use of hand-crafted features.
Our model treats the training data as noisy and uses a non-parametric variant of the hinge loss function.
Experiments show that the proposed model outperforms previous state-of-the-art methods on two publicly available datasets, namely FIGER (GOLD) and BBN with an average relative improvement of 2.69% in micro-F1 score.
Knowledge learnt by our model on one dataset can be transferred to other datasets, using either the same model or other FETC systems.
This transfer of knowledge further improves the performance of the respective models.
We motivate a method for transparently identifying ineffectual computations in unmodified Deep Learning models and without affecting accuracy.
Specifically, we show that if we decompose multiplications down to the bit level the amount of work performed during inference for image classification models can be consistently reduced by two orders of magnitude.
In the best case studied of a sparse variant of AlexNet, this approach can ideally reduce computation work by more than 500x.
We present Laconic, a hardware accelerator that implements this approach to improve execution time and energy efficiency for inference with Deep Learning Networks.
Laconic judiciously gives up some of the work reduction potential to yield a low-cost, simple, and energy efficient design that outperforms other state-of-the-art accelerators.
For example, a Laconic configuration that uses a weight memory interface with just 128 wires outperforms a conventional accelerator with a 2K-wire weight memory interface by 2.3x on average while being 2.13x more energy efficient on average.
A Laconic configuration that uses a 1K-wire weight memory interface outperforms the 2K-wire conventional accelerator by 15.4x and is 1.95x more energy efficient.
Laconic does not require but rewards advances in model design such as a reduction in precision, the use of alternate numeric representations that reduce the number of bits that are "1", or an increase in weight or activation sparsity.
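The bit-level accounting behind this work reduction can be illustrated as follows. This is a simplified model (counting single-bit partial products for fixed-point operands), not the exact Laconic cost model:

```python
def popcount(x):
    """Number of '1' bits in a non-negative integer."""
    return bin(x).count("1")

def effectual_terms(weights, activations, bits=8):
    """Compare the single-bit partial products actually needed for a set
    of weight/activation multiplications against the work a conventional
    bit-parallel multiplier performs.

    Returns (needed, conventional): only bit pairs where both operand
    bits are 1 contribute to the product, so sparse or low-precision
    operands need far fewer terms.
    """
    needed = sum(popcount(w) * popcount(a) for w, a in zip(weights, activations))
    conventional = bits * bits * len(weights)
    return needed, conventional
```

Under this accounting, zero weights or activations contribute no terms at all, which is why sparsity and reduced precision directly reward the approach.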
In this letter, we present a unified Bayesian inference framework for generalized linear models (GLM) which iteratively reduces the GLM problem to a sequence of standard linear model (SLM) problems.
This framework provides new perspectives on some established GLM algorithms derived from SLM ones and also suggests novel extensions for some other SLM algorithms.
Specific instances elucidated under such framework are the GLM versions of approximate message passing (AMP), vector AMP (VAMP), and sparse Bayesian learning (SBL).
It is proved that the resultant GLM version of AMP is equivalent to the well-known generalized approximate message passing (GAMP).
Numerical results for 1-bit quantized compressed sensing (CS) demonstrate the effectiveness of this unified framework.
Recently, the end-to-end approach that learns hierarchical representations from raw data using deep convolutional neural networks has been successfully explored in the image, text and speech domains.
This approach has also been applied to musical signals but has not been fully explored yet.
To this end, we propose sample-level deep convolutional neural networks which learn representations from very small grains of waveforms (e.g., 2 or 3 samples), beyond typical frame-level input representations.
Our experiments show how deep architectures with sample-level filters improve accuracy in music auto-tagging, providing results comparable to previous state-of-the-art performance on the MagnaTagATune dataset and the Million Song Dataset.
In addition, we visualize the filters learned in each layer of a sample-level DCNN to identify hierarchically learned features, and show that they are sensitive to log-scaled frequency along the layer depth, similar to the mel-frequency spectrogram widely used in music classification systems.
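As a toy illustration of the sample-level idea (not the authors' architecture), a single layer of tiny strided filters applied directly to a raw waveform:

```python
import numpy as np

def conv1d(x, filters, stride):
    """Valid 1-D convolution of a waveform with a bank of small filters,
    followed by a ReLU.

    x       : (T,) raw waveform samples
    filters : (num_filters, k) filter bank with tiny k (e.g. 2 or 3)
    returns : (num_windows, num_filters) activations
    """
    k = filters.shape[1]
    n_out = (len(x) - k) // stride + 1
    windows = np.stack([x[i * stride : i * stride + k] for i in range(n_out)])
    return np.maximum(windows @ filters.T, 0.0)
```

Stacking many such layers (e.g. length-3 filters with stride 3) progressively shrinks the waveform toward a single feature vector, which is the sample-level analogue of frame-level front ends.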
This article examines the structure and spatial patterns of violent political organizations in the Sahel-Sahara, a region characterized by growing political instability over the last 20 years.
Drawing on a public collection of disaggregated data, the article uses network science to represent alliances and conflicts of 179 organizations that were involved in violent events between 1997 and 2014.
To this end, we combine two spectral embedding techniques that have previously been considered separately: one for directed graphs (relationships are asymmetric), and one for signed graphs (relationships are positive or negative).
Our results show that groups that are net attackers are indistinguishable at the level of their individual behavior, but clearly separate into pro- and anti-political violence based on the groups to which they are close.
The second part of the article maps a series of 389 events related to nine Trans-Saharan Islamist groups between 2004 and 2014.
Spatial analysis suggests that cross-border movement has intensified following the establishment of military bases by AQIM in Mali but reveals no evidence of a border sanctuary.
Owing to the transnational nature of conflict, the article shows that national management strategies and foreign military interventions have profoundly affected the movement of Islamist groups.
One viable solution for continuous reduction in energy-per-operation is to rethink functionality to cope with uncertainty by adopting computational approaches that are inherently robust to uncertainty.
It requires a novel look at data representations, associated operations, and circuits, and at materials and substrates that enable them.
3D integrated nanotechnologies combined with novel brain-inspired computational paradigms that support fast learning and fault tolerance could lead the way.
Recognizing the very size of the brain's circuits, hyperdimensional (HD) computing can model neural activity patterns with points in a HD space, that is, with hypervectors as large randomly generated patterns.
At its very core, HD computing is about manipulating and comparing these patterns inside memory.
Emerging nanotechnologies such as carbon nanotube field effect transistors (CNFETs) and resistive RAM (RRAM), and their monolithic 3D integration offer opportunities for hardware implementations of HD computing through tight integration of logic and memory, energy-efficient computation, and unique device characteristics.
We experimentally demonstrate and characterize an end-to-end HD computing nanosystem built using monolithic 3D integration of CNFETs and RRAM.
With our nanosystem, we experimentally demonstrate classification of 21 languages with measured accuracy of up to 98% on >20,000 sentences (6.4 million characters), training using one text sample (~100,000 characters) per language, and resilient operation (98% accuracy) despite 78% hardware errors in HD representation (outputs stuck at 0 or 1).
By exploiting the unique properties of the underlying nanotechnologies, we show that HD computing, when implemented with monolithic 3D integration, can be up to 420X more energy-efficient while using 25X less area compared to traditional silicon CMOS implementations.
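The core HD-computing operations (binding, bundling, and similarity search in memory) can be sketched in a few lines of numpy; this toy language profile from letter trigrams is an assumption-laden illustration of the approach, not the nanosystem's implementation.

```python
import numpy as np

D = 10000                                          # hypervector dimensionality
rng = np.random.default_rng(1)
item = {c: rng.choice([-1, 1], D) for c in "abcdefghijklmnopqrstuvwxyz "}

def rho(v, k):
    return np.roll(v, k)                           # permutation encodes letter position

def text_hv(text):
    """Bundle bound trigram hypervectors of a text into one bipolar profile vector."""
    acc = np.zeros(D)
    for a, b, c in zip(text, text[1:], text[2:]):
        acc += rho(item[a], 2) * rho(item[b], 1) * item[c]  # bind one trigram
    return np.sign(acc)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

en = text_hv("the quick brown fox jumps over the lazy dog " * 20)
de_like = text_hv("der schnelle braune fuchs springt ueber den faulen hund " * 20)
query = text_hv("a lazy dog and a quick fox")
print(cosine(query, en) > cosine(query, de_like))  # query is closer to the English profile
```

Classification reduces to comparing a query hypervector against stored class profiles, which is why tightly integrating logic and memory (as in the CNFET/RRAM nanosystem) suits this paradigm.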
Prevalent models based on artificial neural network (ANN) for sentence classification often classify sentences in isolation without considering the context in which sentences appear.
This hampers traditional sentence classification approaches when applied to sequential sentence classification, where structured prediction is needed for better overall classification performance.
In this work, we present a hierarchical sequential labeling network to make use of the contextual information within surrounding sentences to help classify the current sentence.
Our model outperforms the state-of-the-art results by 2%-3% on two benchmarking datasets for sequential sentence classification in medical scientific abstracts.
In this paper, we present TeachNet, a novel neural network architecture for intuitive and markerless vision-based teleoperation of dexterous robotic hands.
Robot joint angles that produce visually similar robot hand poses are generated directly from depth images of the human hand in an end-to-end fashion.
The special structure of TeachNet, combined with a consistency loss function, handles the differences in appearance and anatomy between human and robotic hands.
A synchronized human-robot training set is generated from an existing dataset of labeled depth images of the human hand and simulated depth images of a robotic hand.
The final training set includes 400K pairwise depth images and joint angles of a Shadow C6 robotic hand.
The network evaluation results verify the superiority of TeachNet, especially regarding the high-precision condition.
Imitation experiments and grasp tasks teleoperated by novice users demonstrate that TeachNet is more reliable and faster than the state-of-the-art vision-based teleoperation method.
Deep Neural Networks (DNNs) have demonstrated exceptional performance on most recognition tasks such as image classification and segmentation.
However, they have also been shown to be vulnerable to adversarial examples.
This phenomenon has recently attracted a lot of attention but it has not been extensively studied on multiple, large-scale datasets and structured prediction tasks such as semantic segmentation which often require more specialised networks with additional components such as CRFs, dilated convolutions, skip-connections and multiscale processing.
In this paper, we present what to our knowledge is the first rigorous evaluation of adversarial attacks on modern semantic segmentation models, using two large-scale datasets.
We analyse the effect of different network architectures, model capacity and multiscale processing, and show that many observations made on the task of classification do not always transfer to this more complex task.
Furthermore, we show how mean-field inference in deep structured models, multiscale processing (and more generally, input transformations) naturally implement recently proposed adversarial defenses.
Our observations will aid future efforts in understanding and defending against adversarial examples.
Moreover, in the shorter term, we show how to effectively benchmark robustness and show which segmentation models should currently be preferred in safety-critical applications due to their inherent robustness.
A key element in defending computer networks is to recognize the types of cyber attacks based on the observed malicious activities.
Obfuscation onto what could have been observed of an attack sequence may lead to mis-interpretation of its effect and intent, leading to ineffective defense or recovery deployments.
This work develops probabilistic graphical models to generalize a few obfuscation techniques and to enable analysis of the Expected Classification Accuracy (ECA) resulting from these different obfuscations on various attack models.
Determining the ECA is an NP-hard problem due to the combinatorial number of possibilities.
This paper presents several polynomial-time algorithms to find the theoretically bounded approximation of ECA under different attack obfuscation models.
Comprehensive simulations show the impact on ECA of alteration, insertion, and removal of attack action sequences, with increasing observation length, level of obfuscation, and model complexity.
Clinical Decision Support Systems (CDSS) form an important area of research.
In spite of its importance, it is difficult for researchers to evaluate the domain primarily because of a considerable spread of relevant literature in interdisciplinary domains.
Previous surveys of CDSS have examined the domain from the perspective of individual disciplines.
However, to the best of our knowledge, no visual scientometric survey of CDSS has previously been conducted which provides a broader spectrum of the domain with a horizon covering multiple disciplines.
While traditional systematic literature surveys focus on analyzing literature using arbitrary results, visual surveys allow for the analysis of domains by using complex network-based analytical models.
In this paper, we present a detailed visual survey of CDSS literature using important papers selected from highly cited sources in the Thomson Reuters Web of Science.
We analyze the entire set of relevant literature indexed in the Web of Science database.
Our key results include the discovery of the articles which have served as key turning points in literature.
Additionally, we have identified highly cited authors and the key country of origin of top publications.
We also present the Universities with the strongest citation bursts.
Finally, our network analysis has also identified the key journals and subject categories both in terms of centrality and frequency.
It is our belief that this paper will thus serve an important role for researchers as well as clinical practitioners interested in identifying key literature and resources in the domain of clinical decision support.
As open-ended human-chatbot interaction becomes commonplace, sensitive content detection gains importance.
In this work, we propose a two-stage semi-supervised approach to bootstrap large-scale data for automatic sensitive language detection from publicly available web resources.
We explore various data selection methods including 1) using a blacklist to rank online discussion forums by the level of their sensitiveness followed by randomly sampling utterances and 2) training a weakly supervised model in conjunction with the blacklist for scoring sentences from online discussion forums to curate a dataset.
Our data collection strategy is flexible and allows the models to detect implicit sensitive content for which manual annotations may be difficult.
We train models using publicly available annotated datasets as well as using the proposed large-scale semi-supervised datasets.
We evaluate the performance of all the models on Twitter and Toxic Wikipedia comments testsets as well as on a manually annotated spoken language dataset collected during a large scale chatbot competition.
Results show that a model trained on this collected data outperforms the baseline models by a large margin on both in-domain and out-of-domain testsets, achieving an F1 score of 95.5% on an out-of-domain testset compared to a score of 75% for models trained on public datasets.
We also showcase that large scale two stage semi-supervision generalizes well across multiple classes of sensitivities such as hate speech, racism, sexual and pornographic content, etc. without even providing explicit labels for these classes, leading to an average recall of 95.5% versus the models trained using annotated public datasets which achieve an average recall of 73.2% across seven sensitive classes on out-of-domain testsets.
Autonomous planetary vehicles, also known as rovers, are small autonomous vehicles equipped with a variety of sensors used to perform exploration and experiments on a planet's surface.
Rovers work in a partially unknown environment, with narrow energy/time/movement constraints and, typically, small computational resources that limit the complexity of on-line planning and scheduling, thus they represent a great challenge in the field of autonomous vehicles.
Indeed, formal models for such vehicles usually involve hybrid systems with nonlinear dynamics, which are difficult to handle by most of the current planning algorithms and tools.
Therefore, when offline planning of the vehicle activities is required, for example for rovers that operate without a continuous Earth supervision, such planning is often performed on simplified models that are not completely realistic.
In this paper we show how the UPMurphi model checking based planning tool can be used to generate resource-optimal plans to control the engine of an autonomous planetary vehicle, working directly on its hybrid model and taking into account several safety constraints, thus achieving very accurate results.
Drawing inspiration from the theory of linear "decomposable systems", we provide a method, based on linear matrix inequalities (LMIs), which makes it possible to prove the convergence (or consensus) of a set of interacting agents with polynomial dynamic.
We also show that the use of a generalised version of the famous Kalman-Yakubovic-Popov lemma allows the development of an LMI test whose size does not depend on the number of agents.
The method is validated experimentally on two academic examples.
We consider high dimensional dynamic multi-product pricing with an evolving but low-dimensional linear demand model.
Assuming the temporal variation in cross-elasticities exhibits low-rank structure based on fixed (latent) features of the products, we show that the revenue maximization problem reduces to an online bandit convex optimization with side information given by the observed demands.
We design dynamic pricing algorithms whose revenue approaches that of the best fixed price vector in hindsight, at a rate that only depends on the intrinsic rank of the demand model and not the number of products.
Our approach applies a bandit convex optimization algorithm in a projected low-dimensional space spanned by the latent product features, while simultaneously learning this span via online singular value decomposition of a carefully-crafted matrix containing the observed demands.
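The span-learning step can be sketched as follows; the paper uses an online SVD, but a batch SVD over observed demands (with synthetic data and names of our choosing) illustrates how the latent product-feature subspace is recovered and how prices are projected into it.

```python
import numpy as np

rng = np.random.default_rng(2)
n_products, rank, T = 50, 3, 200
# Unknown latent product features spanning a rank-3 subspace.
U = np.linalg.qr(rng.standard_normal((n_products, rank)))[0]
# Observed demands: low-rank signal plus small noise, one column per period.
demands = U @ rng.standard_normal((rank, T)) + 0.01 * rng.standard_normal((n_products, T))

# Learn the latent span from the observed demand matrix via SVD.
left = np.linalg.svd(demands, full_matrices=False)[0]
span = left[:, :rank]                         # estimated feature subspace
price = rng.standard_normal(n_products)
low_dim_price = span.T @ price                # bandit optimization happens in rank dims

# The estimated span aligns with the true one (smallest principal-angle cosine near 1).
alignment = np.linalg.svd(span.T @ U, compute_uv=False).min()
print(alignment > 0.95)  # -> True
```

Because the bandit convex optimization runs in the projected `rank`-dimensional space, regret depends on the intrinsic rank rather than the number of products, as the abstract states.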
Recent studies observe that app foreground is the most striking component influencing access control decisions on mobile platforms, as users tend to deny permission requests lacking visible evidence.
However, none of the existing permission models provides a systematic approach that can automatically answer the question: Is the resource access indicated by app foreground?
In this work, we present the design, implementation, and evaluation of COSMOS, a context-aware mediation system that bridges the semantic gap between foreground interaction and background access, in order to protect system integrity and user privacy.
Specifically, COSMOS learns from a large set of apps with similar functionalities and user interfaces to construct generic models that detect the outliers at runtime.
It can be further customized to satisfy specific user privacy preference by continuously evolving with user decisions.
Experiments show that COSMOS achieves both high precision and high recall in detecting malicious requests.
We also demonstrate the effectiveness of COSMOS in capturing specific user preferences using the decisions collected from 24 users and illustrate that COSMOS can be easily deployed on smartphones as a real-time guard with a very low performance overhead.
The number of academic publications being published has grown rapidly in recent years.
Accessing and searching massive collections of academic publications distributed over several locations requires a large amount of computing resources to maintain system performance.
Therefore, many grid-based search techniques have been proposed to provide flexible methods for searching extensive distributed data.
This paper proposes a search technique capable of searching extensive publication collections by utilizing grid computing technology.
The search technique is implemented as interconnected grid services to offer a mechanism to access different data locations.
The experimental results show that the grid-based search technique improves search performance.
A tensor network is a diagram that specifies a way to "multiply" a collection of tensors together to produce another tensor (or matrix).
Many existing algorithms for tensor problems (such as tensor decomposition and tensor PCA), although they are not presented this way, can be viewed as spectral methods on matrices built from simple tensor networks.
In this work we leverage the full power of this abstraction to design new algorithms for certain continuous tensor decomposition problems.
An important and challenging family of tensor problems comes from orbit recovery, a class of inference problems involving group actions (inspired by applications such as cryo-electron microscopy).
Orbit recovery problems over finite groups can often be solved via standard tensor methods.
However, for infinite groups, no general algorithms are known.
We give a new spectral algorithm based on tensor networks for one such problem: continuous multi-reference alignment over the infinite group SO(2).
Our algorithm extends to the more general heterogeneous case.
An untested assumption behind the crowdsourced descriptions of the images in the Flickr30K dataset (Young et al., 2014) is that they "focus only on the information that can be obtained from the image alone" (Hodosh et al., 2013, p. 859).
This paper presents some evidence against this assumption, and provides a list of biases and unwarranted inferences that can be found in the Flickr30K dataset.
Finally, it considers methods to find examples of these, and discusses how we should deal with stereotype-driven descriptions in future applications.
Spatial information is often expressed using qualitative terms such as natural language expressions instead of coordinates; reasoning over such terms has several practical applications, such as bus routes planning.
Representing and reasoning on trajectories is a specific case of qualitative spatial reasoning that focuses on moving objects and their paths.
In this work, we propose two versions of a trajectory calculus based on the allowed properties over trajectories, where trajectories are defined as a sequence of non-overlapping regions of a partitioned map.
More specifically, if a given trajectory is allowed to start and finish at the same region, 6 base relations are defined (TC-6).
If a given trajectory should have different start and finish regions but cycles are allowed within, 10 base relations are defined (TC-10).
Both versions of the calculus are implemented as ASP programs; we propose several different encodings, including a generalised program capable of encoding any qualitative calculus in ASP.
All proposed encodings are experimentally evaluated using a real-world dataset.
Experiment results show that the best performing implementation can scale up to an input of 250 trajectories for TC-6 and 150 trajectories for TC-10 for the problem of discovering a consistent configuration, a significant improvement compared to previous ASP implementations for similar qualitative spatial and temporal calculi.
This manuscript is under consideration for acceptance in TPLP.
Information cascades, effectively facilitated by most social network platforms, are recognized as a major factor in almost every social success and disaster in these networks.
Can cascades be predicted?
While many believe that they are inherently unpredictable, recent work has shown that some key properties of information cascades, such as size, growth, and shape, can be predicted by a machine learning algorithm that combines many features.
These predictors all depend on a bag of hand-crafted features to represent the cascade network and the global network structure.
Such features, always carefully and sometimes mysteriously designed, are not easy to extend or to generalize to a different platform or domain.
Inspired by the recent successes of deep learning in multiple data mining tasks, we investigate whether an end-to-end deep learning approach could effectively predict the future size of cascades.
Such a method automatically learns the representation of individual cascade graphs in the context of the global network structure, without hand-crafted features and heuristics.
We find that node embeddings fall short of predictive power, and it is critical to learn the representation of a cascade graph as a whole.
We present algorithms that learn the representation of cascade graphs in an end-to-end manner, which significantly improve the performance of cascade prediction over strong baselines that include feature based methods, node embedding methods, and graph kernel methods.
Our results also provide interesting implications for cascade prediction in general.
In online learning the performance of an algorithm is typically compared to the performance of a fixed function from some class, with a quantity called regret.
Forster proposed a last-step min-max algorithm which was somewhat simpler than the algorithm of Vovk, yet with the same regret.
In fact, the algorithm he analyzed assumed that the choices of the adversary are bounded, artificially yielding only the two extreme cases.
We fix this problem by weighting the examples in such a way that the min-max problem is well defined, and provide an analysis with logarithmic regret that may have a better multiplicative factor than both the bounds of Forster and Vovk.
We also derive a new bound that may be sub-logarithmic, like a recent bound of Orabona et al., but may have a better multiplicative factor.
Finally, we analyze the algorithm in a weak type of non-stationary setting, and show a bound that is sub-linear if the non-stationarity is sub-linear as well.
Classical approaches for estimating optical flow have achieved rapid progress in the last decade.
However, most of them are too slow to be applied in real-time video analysis.
Due to the great success of deep learning, recent work has focused on using CNNs to solve such dense prediction problems.
In this paper, we investigate a new deep architecture, Densely Connected Convolutional Networks (DenseNet), to learn optical flow.
This specific architecture is ideal for the problem at hand as it provides shortcut connections throughout the network, which leads to implicit deep supervision.
We extend current DenseNet to a fully convolutional network to learn motion estimation in an unsupervised manner.
Evaluation results on three standard benchmarks demonstrate that DenseNet is a better fit than other widely adopted CNN architectures for optical flow estimation.
Live fish recognition is one of the most crucial elements of fisheries survey applications, where vast amounts of data are rapidly acquired.
Unlike general scenarios, underwater image recognition faces challenges posed by poor image quality, uncontrolled objects and environments, as well as difficulty in acquiring representative samples.
Also, most existing feature extraction techniques cannot be fully automated because they involve human supervision.
Toward this end, we propose an underwater fish recognition framework that consists of a fully unsupervised feature learning technique and an error-resilient classifier.
Object parts are initialized based on saliency and relaxation labeling to match object parts correctly.
A non-rigid part model is then learned based on fitness, separation and discrimination criteria.
For the classifier, an unsupervised clustering approach generates a binary class hierarchy, where each node is a classifier.
To exploit information from ambiguous images, the notion of partial classification is introduced to assign coarse labels by optimizing the "benefit" of indecision made by the classifier.
Experiments show that the proposed framework achieves high accuracy on both public and self-collected underwater fish images with high uncertainty and class imbalance.
We present an approach to learn a dense pixel-wise labeling from image-level tags.
Each image-level tag imposes constraints on the output labeling of a Convolutional Neural Network (CNN) classifier.
We propose Constrained CNN (CCNN), a method which uses a novel loss function to optimize for any set of linear constraints on the output space (i.e. predicted label distribution) of a CNN.
Our loss formulation is easy to optimize and can be incorporated directly into standard stochastic gradient descent optimization.
The key idea is to phrase the training objective as a biconvex optimization for linear models, which we then relax to nonlinear deep networks.
Extensive experiments demonstrate the generality of our new learning framework.
The constrained loss yields state-of-the-art results on weakly supervised semantic image segmentation.
We further demonstrate that adding slightly more supervision can greatly improve the performance of the learning algorithm.
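The constraint idea behind CCNN can be illustrated for a single linear lower-bound constraint; this is a simplified sketch of a KL projection of the predicted label distribution onto the constraint set (our own function names, not the paper's full biconvex formulation).

```python
import numpy as np

def project_lower_bound(P, c, a, iters=60):
    """KL-project per-pixel distributions P (n_pixels, n_classes) so that the
    expected count of class c is at least a. The projection has the exponential
    form Q_i proportional to P_i * exp(lam * e_c); lam >= 0 is found by bisection."""
    def tilt(lam):
        boost = np.ones(P.shape[1])
        boost[c] = np.exp(lam)
        Q = P * boost
        Q /= Q.sum(axis=1, keepdims=True)
        return Q, Q[:, c].sum()

    Q, got = tilt(0.0)
    if got >= a:
        return Q                                # constraint already satisfied
    lo, hi = 0.0, 50.0                          # expected count is monotone in lam
    for _ in range(iters):
        mid = (lo + hi) / 2
        _, got = tilt(mid)
        lo, hi = (mid, hi) if got < a else (lo, mid)
    return tilt(hi)[0]

rng = np.random.default_rng(3)
P = rng.dirichlet(np.ones(4), size=100)         # CNN output: 100 pixels, 4 classes
Q = project_lower_bound(P, c=0, a=40.0)         # tag demands >= 40 expected pixels of class 0
print(Q[:, 0].sum() >= 40.0 - 1e-6)             # -> True
```

Training then pulls the network output toward the projected distribution `Q`, so image-level tags shape the dense labeling without pixel annotations.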
News organizations are increasingly using social media to reach out to their audiences, aiming to raise their attention to and engagement with news.
Given the continuous decrease in subscription rates and audience trust in news media, it is imperative for news organizations to understand factors contributing to their relationships with the audience.
Using Twitter data of 315 U.S. newspaper organizations and their audiences, this study uses multiple regression analysis to examine the influence of key news organization characteristics on audience engagement with news: (1) trustworthiness computed by the Trust Scores in Social Media (TSM) algorithm; (2) quantity of tweets; and (3) skillfulness of Twitter use.
The results show a significant influence of a news organization's trustworthiness and level of Twitter activity on its audience's news engagement.
Methods to measure trustworthiness of news organizations and audience news engagement, as well as scalable algorithms to compute them from large-scale datasets, are also proposed.
A recent method employs 3D voxels to represent 3D shapes, but this limits the approach to low resolutions due to the computational cost caused by the cubic complexity of 3D voxels.
Hence the method suffers from a lack of detailed geometry.
To resolve this issue, we propose Y^2Seq2Seq, a view-based model, to learn cross-modal representations by joint reconstruction and prediction of view and word sequences.
Specifically, the network architecture of Y^2Seq2Seq bridges the semantic meaning embedded in the two modalities by two coupled `Y' like sequence-to-sequence (Seq2Seq) structures.
In addition, our novel hierarchical constraints further increase the discriminability of the cross-modal representations by employing more detailed discriminative information.
Experimental results on cross-modal retrieval and 3D shape captioning show that Y^2Seq2Seq outperforms the state-of-the-art methods.
We investigate the multi-step prediction of the drivable space, represented by Occupancy Grid Maps (OGMs), for autonomous vehicles.
Our motivation is that accurate multi-step prediction of the drivable space can efficiently improve path planning and navigation resulting in safe, comfortable and optimum paths in autonomous driving.
We train a variety of Recurrent Neural Network (RNN) based architectures on the OGM sequences from the KITTI dataset.
The results demonstrate significant improvement of the prediction accuracy using our proposed difference learning method, incorporating motion related features, over the state of the art.
We remove the egomotion from the OGM sequences by transforming them into a common frame.
Although in the transformed sequences the KITTI dataset is heavily biased toward static objects, by learning the difference between subsequent OGMs, our proposed method provides accurate prediction over both the static and moving objects.
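The difference-learning setup can be sketched in numpy: instead of predicting the next occupancy grid directly, the model predicts the change between aligned frames and adds it back. Here the RNN is replaced by a trivial mean over past differences purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
# 10 occupancy grids already transformed into a common frame (egomotion removed).
ogms = (rng.random((10, 64, 64)) > 0.5).astype(float)

# Difference learning: work with frame-to-frame changes, not raw grids.
diffs = np.diff(ogms, axis=0)                  # (9, 64, 64), values in {-1, 0, 1}
predicted_diff = diffs.mean(axis=0)            # placeholder for the RNN's predicted change
prediction = np.clip(ogms[-1] + predicted_diff, 0.0, 1.0)  # next-grid estimate
print(prediction.shape)  # -> (64, 64)
```

Because static cells yield zero difference, the change signal concentrates model capacity on moving objects, which is the intuition behind the reported accuracy gains on both static and dynamic content.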
In the past few years, consumer review sites have become the main target of deceptive opinion spam, where fictitious opinions or reviews are deliberately written to sound authentic.
Most of the existing work on detecting deceptive reviews focuses on building supervised classifiers based on syntactic and lexical patterns of an opinion.
With the successful use of neural networks in various classification applications, in this paper we propose FakeGAN, a system that for the first time augments and adopts Generative Adversarial Networks (GANs) for a text classification task, in particular detecting deceptive reviews.
Unlike standard GAN models which have a single Generator and Discriminator model, FakeGAN uses two discriminator models and one generative model.
The generator is modeled as a stochastic policy agent in reinforcement learning (RL), and the discriminators use Monte Carlo search algorithm to estimate and pass the intermediate action-value as the RL reward to the generator.
Providing the generator model with two discriminator models avoids the mode collapse issue by learning from both distributions of truthful and deceptive reviews.
Indeed, our experiments show that using two discriminators provides FakeGAN high stability, which is a known issue for GAN architectures.
While FakeGAN is built upon a semi-supervised classifier, which is typically less accurate, our evaluation results on a dataset of TripAdvisor hotel reviews show the same accuracy as state-of-the-art approaches that apply supervised machine learning.
These results indicate that GANs can be effective for text classification tasks.
Specifically, FakeGAN is effective at detecting deceptive reviews.
Science is a social process with far-reaching impact on our modern society.
In the recent years, for the first time we are able to scientifically study the science itself.
This is enabled by massive amounts of data on scientific publications that is increasingly becoming available.
The data is contained in several databases such as Web of Science or PubMed, maintained by various public and private entities.
Unfortunately, these databases are not always consistent, which considerably hinders this study.
Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases.
We found that identifying a single "best" database is far from easy.
Nevertheless, our results indicate appreciable differences in mutual consistency of different databases, which we interpret as recipes for future bibliometric studies.
We present a novel approach to estimate the delay observed between the occurrence and reporting of rape crimes.
We explore spatial, temporal and social effects in sparse aggregated (area-level) and high-dimensional disaggregated (event-level) data for New York and Los Angeles.
Focusing on inference, we apply Gradient Boosting and Random Forests to assess predictor importance, as well as Gaussian Processes to model spatial disparities in reporting times.
Our results highlight differences and similarities between the two cities.
We identify at-risk populations and communities which may be targeted with focused policies and interventions to support rape victims, apprehend perpetrators, and prevent future crimes.
Bilinear pooling has been recently proposed as a feature encoding layer, which can be used after the convolutional layers of a deep network, to improve performance in multiple vision tasks.
Different from conventional global average pooling or fully connected layer, bilinear pooling gathers 2nd order information in a translation invariant fashion.
However, a serious drawback of this family of pooling layers is their dimensionality explosion.
Approximate pooling methods with compact properties have been explored towards resolving this weakness.
Additionally, recent results have shown that significant performance gains can be achieved by adding 1st order information and applying matrix normalization to regularize unstable higher order information.
However, combining compact pooling with matrix normalization and other order information has not been explored until now.
In this paper, we unify bilinear pooling and the global Gaussian embedding layers through the empirical moment matrix.
In addition, we propose a novel sub-matrix square-root layer, which can be used to normalize the output of the convolution layer directly and mitigate the dimensionality problem with off-the-shelf compact pooling methods.
Our experiments on three widely used fine-grained classification datasets illustrate that our proposed architecture, MoNet, can achieve similar or better performance than the state-of-the-art G2DeNet.
Furthermore, when combined with a compact pooling technique, MoNet obtains comparable performance using encoded features with 96% fewer dimensions.
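The two building blocks named above, the empirical moment matrix and a matrix square-root normalization, can be sketched in numpy (a simplified illustration with our own function names; MoNet's sub-matrix square-root layer operates on a factor of this matrix rather than the full matrix).

```python
import numpy as np

def moment_matrix(X):
    """Empirical moment matrix of features X (n, d): the mean of [1, x][1, x]^T,
    shape (d+1, d+1), unifying 1st-order (mean) and 2nd-order (bilinear) statistics."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    return A.T @ A / X.shape[0]

def matrix_sqrt(M, eps=1e-12):
    """Symmetric PSD square root via eigendecomposition; normalizes unstable
    higher-order information before pooling."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, eps, None))) @ V.T

rng = np.random.default_rng(5)
X = rng.standard_normal((196, 8))        # e.g. 14x14 conv locations, 8 channels
M = moment_matrix(X)
S = matrix_sqrt(M)
print(np.allclose(S @ S, M, atol=1e-8))  # -> True
```

Compact pooling methods can then approximate the (flattened, normalized) matrix with far fewer dimensions, which is where the 96% reduction comes from.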
We introduce Eigen Evolution Pooling, an efficient method to aggregate a sequence of feature vectors.
Eigen evolution pooling is designed to produce compact feature representations for a sequence of feature vectors, while maximally preserving as much information about the sequence as possible, especially the temporal evolution of the features over time.
Eigen evolution pooling is a general pooling method that can be applied to any sequence of feature vectors, from low-level RGB values to high-level Convolutional Neural Network (CNN) feature vectors.
We show that eigen evolution pooling is more effective than average, max, and rank pooling for encoding the dynamics of human actions in video.
We demonstrate the power of eigen evolution pooling on UCF101 and Hollywood2 datasets, two human action recognition benchmarks, and achieve state-of-the-art performance.
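One plausible minimal reading of eigen evolution pooling is an SVD of the frame sequence: each pooled vector is a weighted combination of frames, with weights given by a dominant temporal mode. This sketch (our own naming; the paper's exact construction may differ in details) keeps the top-k modes.

```python
import numpy as np

def eigen_evolution_pool(F, k=2):
    """Pool a sequence of feature vectors F (T, d) by projecting the frames onto the
    top-k left singular vectors of F: each output row is a weighted temporal average
    of frames, capturing how the features evolve over time."""
    U, _, _ = np.linalg.svd(F, full_matrices=False)
    return (U[:, :k].T @ F).reshape(-1)      # (k*d,) compact representation

rng = np.random.default_rng(6)
T, d = 30, 16
frames = np.cumsum(rng.standard_normal((T, d)), axis=0)  # slowly evolving CNN features
pooled = eigen_evolution_pool(frames, k=2)
print(pooled.shape)  # -> (32,)
```

Unlike average or max pooling, the higher modes retain the ordering-dependent dynamics that matter for action recognition.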
A significant roadblock in multilingual neural language modeling is the lack of labeled non-English data.
One potential method for overcoming this issue is learning cross-lingual text representations that can be used to transfer the performance from training on English tasks to non-English tasks, despite little to no task-specific non-English data.
In this paper, we explore a natural setup for learning cross-lingual sentence representations: the dual-encoder.
We provide a comprehensive evaluation of our cross-lingual representations on a number of monolingual, cross-lingual, and zero-shot/few-shot learning tasks, and also give an analysis of different learned cross-lingual embedding spaces.
Recently, the so-called cell-free (CF) Massive MIMO architecture has been introduced, wherein a very large number of distributed access points (APs) simultaneously and jointly serve a much smaller number of mobile stations (MSs).
The paper extends the CF approach to the case in which both the APs and the MSs are equipped with multiple antennas, proposing a beamforming scheme that, relying on the channel hardening effect, does not require channel estimation at the MSs.
We contrast the CF massive MIMO approach with a user-centric (UC) approach wherein each MS is served only by a limited number of APs.
Since distant APs experience poor SINR, they are of little help in serving far users; consequently, the UC approach, while requiring less backhaul overhead than the CF approach, is shown here to achieve better performance, in terms of achievable rate per user, for the vast majority of the MSs in the network.
Furthermore, we propose two power allocation strategies for the uplink and downlink, one aimed at maximizing the overall data rate and the other at maximizing system fairness.
Sarcasm is a sophisticated speech act which commonly manifests on social communities such as Twitter and Reddit.
The prevalence of sarcasm on the social web is highly disruptive to opinion mining systems, owing not only to its tendency to flip polarity but also to its use of figurative language.
Sarcasm commonly manifests with a contrastive theme either between positive-negative sentiments or between literal-figurative scenarios.
In this paper, we revisit the notion of modeling contrast in order to reason with sarcasm.
More specifically, we propose an attention-based neural model that looks in-between instead of across, enabling it to explicitly model contrast and incongruity.
We conduct extensive experiments on six benchmark datasets from Twitter, Reddit and the Internet Argument Corpus.
Our proposed model not only achieves state-of-the-art performance on all datasets but also enjoys improved interpretability.
The fundamental role of hypernymy in NLP has motivated the development of many methods for the automatic identification of this relation, most of which rely on word distribution.
We investigate an extensive number of such unsupervised measures, using several distributional semantic models that differ by context type and feature weighting.
We analyze the performance of the different methods based on their linguistic motivation.
Comparison to the state-of-the-art supervised methods shows that while supervised methods generally outperform the unsupervised ones, the former are sensitive to the distribution of training instances, hurting their reliability.
Being based on general linguistic hypotheses and independent of training data, unsupervised measures are more robust and therefore remain a useful tool for hypernymy detection.
With the increasing demand for image-based applications, the efficient and reliable evaluation of image quality has increased in importance.
Measuring the image quality is of fundamental importance for numerous image processing applications, where the goal of image quality assessment (IQA) methods is to automatically evaluate the quality of images in agreement with human quality judgments.
Numerous IQA methods have been proposed over the past years to fulfill this goal.
In this paper, a survey of quality assessment methods for conventional image signals, as well as newly emerged ones, including high dynamic range (HDR) and 3-D images, is presented.
A comprehensive explanation of the subjective and objective IQA and their classification is provided.
Six widely used subjective quality datasets and several performance measures are reviewed.
Emphasis is given to the full-reference image quality assessment (FR-IQA) methods, and 9 often-used quality measures (including mean squared error (MSE), structural similarity index (SSIM), multi-scale structural similarity index (MS-SSIM), visual information fidelity (VIF), most apparent distortion (MAD), feature similarity measure (FSIM), feature similarity measure for color images (FSIMC), dynamic range independent measure (DRIM), and tone-mapped images quality index (TMQI)) are carefully described, and their performance and computation time on four subjective quality datasets are evaluated.
Furthermore, a brief introduction to 3-D IQA is provided and the issues related to this area of research are reviewed.
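Two of the simplest measures listed above, MSE and the closely related PSNR, can be stated in a few lines; the images here are synthetic toy arrays rather than dataset samples:

```python
import numpy as np

def mse(ref, img):
    """Mean squared error between a reference and a distorted image."""
    return np.mean((ref.astype(float) - img.astype(float)) ** 2)

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    e = mse(ref, img)
    return np.inf if e == 0 else 10.0 * np.log10(peak ** 2 / e)

ref = np.full((8, 8), 100.0)
noisy = ref + 10.0                  # constant offset of 10 -> MSE of 100
print(mse(ref, noisy))              # 100.0
print(round(psnr(ref, noisy), 2))   # 28.13 dB for an 8-bit peak of 255
```

MSE's weak correlation with perceived quality is precisely what motivates the structural and fidelity-based measures (SSIM, VIF, FSIM, and the rest) surveyed in the paper.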
Traffic Matrix estimation has always attracted attention from researchers for better network management and future planning.
With the advent of high traffic loads due to Cloud Computing platforms and Software Defined Networking based tunable routing and traffic management algorithms on the Internet, it is more necessary than ever to be able to predict current and future traffic volumes on the network.
For large networks, such an origin-destination traffic prediction problem takes the form of a large, under-constrained and under-determined system of equations with a dynamic measurement matrix.
In this work, we present our Compressed Sensing with Dynamic Model Estimation (CS-DME) architecture suitable for modern software defined networks.
Our main contributions are: (1) we formulate an approach in which the measurement matrix in the compressed sensing scheme can be accurately and dynamically estimated through a reformulation of the problem based on traffic demands.
(2) By inspecting the eigen-spectrum on two real-world datasets, we show that a problem formulation using a dynamic measurement matrix based on instantaneous traffic demands can replace the stationary binary routing matrix, making it better suited to modern Software Defined Networks whose routing constantly evolves.
(3) We also show that dynamically linking this compressed measurement matrix with the measured parameters leads to acceptable estimation of Origin-Destination (OD) traffic flows, only marginally worse than state-of-the-art schemes relying on fixed measurement matrices.
(4) Furthermore, using this compressed reformulated problem, a new strategy for selecting vantage points for the most efficient traffic matrix estimation is presented, through a secondary compression technique based on a subset of link measurements.
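A minimal stand-in for the recovery step is shown below: a sparse OD flow vector is estimated from a small number of link measurements by iterative soft thresholding (ISTA). The random measurement matrix is a placeholder for the dynamically estimated one described in the paper, and all sizes are toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
n_links, n_flows = 40, 100                     # under-determined: 40 equations, 100 unknowns
A = rng.normal(size=(n_links, n_flows)) / np.sqrt(n_links)
x_true = np.zeros(n_flows)
x_true[rng.choice(n_flows, 5, replace=False)] = rng.uniform(1.0, 3.0, 5)
y = A @ x_true                                 # link-load measurements

# ISTA for min ||Ax - y||^2 + lam * ||x||_1: gradient step + soft threshold.
lam = 0.01
step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(n_flows)
for _ in range(3000):
    g = x - step * (A.T @ (A @ x - y))
    x = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)

print(np.max(np.abs(x - x_true)) < 0.2)        # the sparse flows are recovered
```

Sparsity of the OD flow vector is what makes 40 equations enough for 100 unknowns; without it the system above would have infinitely many solutions.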
The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language.
While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic languages, such as rich morphology and virtually free word order.
The participants were asked to group contexts of a given word in accordance with its senses that were not provided beforehand.
For instance, given the word "bank" and a set of contexts for this word, e.g. "bank is a financial institution that accepts deposits" and "river bank is a slope beside a body of water", a participant was asked to cluster such contexts into a number of clusters, not known in advance, corresponding in this case to the "company" and the "area" senses of the word "bank".
For the purpose of this evaluation campaign, we developed three new evaluation datasets based on sense inventories that have different sense granularity.
The contexts in these datasets were sampled from texts of Wikipedia, the academic corpus of Russian, and an explanatory dictionary of Russian.
Overall, 18 teams participated in the competition submitting 383 models.
Multiple teams managed to substantially outperform competitive state-of-the-art baselines from the previous years based on sense embeddings.
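The clustering task can be illustrated with a toy version of the "bank" example: represent each context as a bag-of-words vector (dropping the target word) and run 2-means. The sentences and the initialization are invented for the demonstration and are far simpler than the actual shared-task data:

```python
import numpy as np

contexts = [
    "bank is a financial institution that accepts deposits",
    "a financial bank accepts deposits and loans",
    "river bank is a slope beside a body of water",
    "the grassy bank beside the river water",
]

# Bag-of-words vectors over the joint vocabulary; the target word itself is
# dropped, since it appears in every context and carries no signal.
vocab = sorted({w for c in contexts for w in c.split()} - {"bank"})
X = np.array([[c.split().count(w) for w in vocab] for c in contexts], float)

# Plain 2-means, initialized on one context from each intended sense.
centers = X[[0, 2]].copy()
for _ in range(10):
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([X[labels == k].mean(axis=0) for k in (0, 1)])

print(labels)  # the two "financial" contexts and the two "river" contexts separate
```

Real systems replace the count vectors with sense or context embeddings, but the induction step — grouping contexts without a predefined sense inventory — has this same shape.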
Among patch-based image denoising methods, smooth ordering of local patches (patch ordering) has been shown to give state-of-the-art results.
For image denoising, the patch ordering method forms two large TSPs (Traveling Salesman Problems) comprised of nodes in N-dimensional space.
The approximate solutions of the two large TSPs are then used in a filtering process to form the reconstructed image.
Use of large TSPs makes patch ordering a computationally intensive method.
A modified patch ordering method for image denoising is proposed.
In the proposed method, several smaller-sized TSPs are formed and the filtering process is varied to work with the solutions of these smaller TSPs.
In terms of PSNR, the denoising results of the proposed method differed from those of the original method by only 0.016 dB to 0.032 dB on average.
In the original method, solving the TSPs was observed to consume 85% of the execution time.
In the proposed method, the time for solving the TSPs is reduced to half of that required by the original method.
As a result, the proposed method can denoise images in 40% less time.
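The core idea, ordering patches into an approximately smooth path and filtering along it, can be sketched with a greedy nearest-neighbour tour standing in for a proper TSP solver; the signal, patch size, and filter are toy choices:

```python
import numpy as np

def greedy_order(patches):
    """Approximate TSP tour over patches (rows) via greedy nearest neighbour."""
    unvisited, order = set(range(1, len(patches))), [0]
    while unvisited:
        last = patches[order[-1]]
        nxt = min(unvisited, key=lambda j: np.sum((patches[j] - last) ** 2))
        unvisited.remove(nxt)
        order.append(nxt)
    return order

rng = np.random.default_rng(0)
levels = rng.uniform(0, 1, 10)
clean = np.repeat(levels, 10)                             # per-patch true intensity
patches = clean[:, None] + rng.normal(0, 0.05, (100, 8))  # noisy 8-pixel "patches"

order = greedy_order(patches)
ordered = patches[order]
# Filter along the smooth ordering: average each patch with its tour neighbours.
denoised = (ordered[:-2] + ordered[1:-1] + ordered[2:]) / 3.0

truth = clean[order][1:-1, None]
noise_in = np.mean((ordered[1:-1] - truth) ** 2)
noise_out = np.mean((denoised - truth) ** 2)
print(noise_out < noise_in)   # True: smoothing along the tour reduces noise
```

Because the tour places similar patches next to each other, a simple 1-D filter along the tour averages patches with similar underlying content, which is exactly why ordering quality (and hence TSP solving time) dominates the method's cost.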
During the last decades, classical models in language theory have been extended by control mechanisms defined by monoids.
We study which monoids cause the extensions of context-free grammars, finite automata, or finite state transducers to exceed the capacity of the original model.
Furthermore, we investigate when, in the extended automata model, the nondeterministic variant differs from the deterministic one in capacity.
We show that all these conditions are in fact equivalent and present an algebraic characterization.
In particular, the open question of whether every language generated by a valence grammar over a finite monoid is context-free is provided with a positive answer.
High performance grid computing is a key enabler of large scale collaborative computational science.
With the promise of exascale computing, high performance grid systems are expected to incur electricity bills that grow super-linearly over time.
In order to achieve cost effectiveness in these systems, it is essential for the scheduling algorithms to exploit electricity price variations, both in space and time, that are prevalent in the dynamic electricity price markets.
In this paper, we present a metascheduling algorithm to optimize the placement of jobs in a compute grid which consumes electricity from the day-ahead wholesale market.
We formulate the scheduling problem as a Minimum Cost Maximum Flow problem and leverage queue waiting time and electricity price predictions to accurately estimate the cost of job execution at a system.
Using trace based simulation with real and synthetic workload traces, and real electricity price data sets, we demonstrate our approach on two currently operational grids, XSEDE and NorduGrid.
Our experimental setup collectively constitutes more than 433K processors spread across 58 compute systems in 17 geographically distributed locations.
Experiments show that our approach simultaneously optimizes the total electricity cost and the average response time of the grid, without being unfair to users of the local batch systems.
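A drastically simplified version of the flow formulation is sketched below: jobs are unit flows from a source, systems have slot capacities toward a sink, and edge costs encode predicted electricity price times runtime (queue-wait penalties and time-of-day price variation are omitted). The solver is a generic successive-shortest-paths routine, not the paper's scheduler, and all job and price numbers are invented:

```python
from collections import defaultdict

class MinCostMaxFlow:
    """Successive shortest paths with Bellman-Ford, for small graphs."""

    def __init__(self):
        self.g = defaultdict(list)       # node -> indices into the edge list
        self.e = []                      # edge: [to, residual capacity, cost]

    def add(self, u, v, cap, cost):      # forward edge paired with reverse at ei ^ 1
        self.g[u].append(len(self.e)); self.e.append([v, cap, cost])
        self.g[v].append(len(self.e)); self.e.append([u, 0, -cost])

    def solve(self, s, t):
        flow = cost = 0
        while True:
            dist, prev, changed = {s: 0}, {}, True
            while changed:               # Bellman-Ford (handles negative residuals)
                changed = False
                for u in list(self.g):
                    if u not in dist:
                        continue
                    for ei in self.g[u]:
                        v, cap, c = self.e[ei]
                        if cap > 0 and dist[u] + c < dist.get(v, float("inf")):
                            dist[v], prev[v], changed = dist[u] + c, ei, True
            if t not in dist:
                return flow, cost
            v = t                        # push one unit (each job is unit flow)
            while v != s:
                ei = prev[v]
                self.e[ei][1] -= 1; self.e[ei ^ 1][1] += 1
                v = self.e[ei ^ 1][0]
            flow, cost = flow + 1, cost + dist[t]

jobs = {"j1": 10, "j2": 4}               # predicted runtimes in hours
price = {"sysA": 0.03, "sysB": 0.05}     # predicted day-ahead $/CPU-hour
slots = {"sysA": 1, "sysB": 2}           # free scheduling slots per system

mcmf = MinCostMaxFlow()
for j, hours in jobs.items():
    mcmf.add("s", j, 1, 0)
    for sysname, p in price.items():
        mcmf.add(j, sysname, 1, int(round(p * hours * 100)))  # cost in cents
for sysname, cap in slots.items():
    mcmf.add(sysname, "t", cap, 0)

flow, cost = mcmf.solve("s", "t")
print(flow, cost)                        # 2 50: j1 -> sysA, j2 -> sysB
```

The long job lands on the cheap system even though the short job would also prefer it, which is the kind of globally optimal placement a per-job greedy rule misses.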
The next generation of cellular networks will exploit mmWave frequencies to dramatically increase the network capacity.
The communication at such high frequencies, however, requires directionality to compensate for the increased propagation loss.
Users and base stations need to align their beams during both initial access and data transmissions, to ensure the maximum gain is reached.
The accuracy of the beam selection, and the delay in updating the beam pair or performing initial access, impact the end-to-end performance and the quality of service.
In this paper we will present the beam management procedures that 3GPP has included in the NR specifications, focusing on the different operations that can be performed in Standalone (SA) and in Non-Standalone (NSA) deployments.
We will also provide a performance comparison among different schemes, along with design insights on the most important parameters related to beam management frameworks.
Many real-world applications require the estimation of human body joints for higher-level tasks such as human behaviour understanding.
In recent years, depth sensors have become a popular approach to obtain three-dimensional information.
The depth maps generated by these sensors provide information that can be employed to disambiguate the poses observed in two-dimensional images.
This work addresses the problem of 3D human pose estimation from depth maps employing a Deep Learning approach.
We propose a model, named Deep Depth Pose (DDP), which receives a depth map containing a person and a set of predefined 3D prototype poses and returns the 3D position of the body joints of the person.
In particular, DDP is defined as a ConvNet that computes the specific weights needed to linearly combine the prototypes for the given input.
We have thoroughly evaluated DDP on the challenging 'ITOP' and 'UBC3V' datasets, which respectively depict realistic and synthetic samples, defining a new state-of-the-art on them.
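The final combination step is simple enough to state directly: given the prototype poses and the mixing weights that the ConvNet would output, the estimated pose is their weighted sum. The prototypes and weights below are random stand-ins, not values from the paper:

```python
import numpy as np

# K prototype poses, each with J joints in 3-D. In DDP the ConvNet outputs the
# K mixing weights; here a fixed weight vector stands in for that output.
K, J = 4, 15
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(K, J, 3))

def combine(weights, prototypes):
    """Estimated pose = weighted linear combination of the prototype poses."""
    return np.tensordot(weights, prototypes, axes=1)   # (J, 3) joint positions

w = np.array([0.7, 0.3, 0.0, 0.0])                     # mock network output
pose = combine(w, prototypes)
print(pose.shape)                                      # (15, 3)
print(np.allclose(pose, 0.7 * prototypes[0] + 0.3 * prototypes[1]))   # True
```

Predicting K mixing weights instead of 3J joint coordinates constrains the output to the span of plausible prototype poses, which regularizes the regression.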
Universal language representation is the holy grail in machine translation (MT).
Thanks to the new neural MT approach, there now seem to be good prospects of reaching this goal.
In this paper, we propose a new architecture based on combining variational autoencoders with encoder-decoders and introducing an interlingual loss as an additional training objective.
By adding and forcing this interlingual loss, we are able to train multiple encoders and decoders for each language, sharing a common universal representation.
Since the final objective of this universal representation is producing close results for similar input sentences (in any language), we propose to evaluate it by encoding the same sentence in two different languages, decoding both latent representations into the same language and comparing both outputs.
Preliminary results on the WMT 2017 Turkish/English task show that the proposed architecture is capable of learning a universal language representation and simultaneously training both translation directions with state-of-the-art results.
Globalization has become a basic and pervasive human trend.
To globalize information, people publish documents on the Internet.
As a result, the volume of information on the Internet has become huge.
To handle this huge volume of information, Web users rely on search engines.
The Webpage indexing mechanism of a search engine plays a major role in quickly retrieving Web search results from the huge volume of Web resources.
Web researchers have introduced various types of Webpage indexing mechanisms to retrieve Webpages from a Webpage repository.
In this paper, we illustrate a new approach to the design and development of Webpage indexing.
The proposed Webpage indexing mechanism is applied to domain-specific Webpages, where the Webpage domain is identified based on an Ontology.
In our approach, we first prioritize the Ontology terms that appear in the Webpage content and then apply our own indexing mechanism to index that Webpage.
The main advantage of storing an index is to optimize the speed and performance of finding relevant documents in the domain-specific search engine storage area for a user-given search query.
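A toy version of ontology-weighted indexing might look like the following, where each posting is scored by the ontology priority of the term times its frequency in the page. The ontology, weights, and pages are all invented for illustration, not taken from the paper:

```python
from collections import defaultdict

# Hypothetical ontology for a "computer science" domain: each term carries a
# priority weight, used both to confirm a page's domain and to score postings.
ontology = {"algorithm": 3.0, "compiler": 2.5, "network": 2.0, "data": 1.0}

pages = {
    "p1": "a sorting algorithm and its data structures",
    "p2": "compiler design for a modern network stack",
    "p3": "gardening tips for spring",          # off-domain page, never indexed
}

index = defaultdict(list)                       # term -> [(score, page id), ...]
for pid, text in pages.items():
    words = text.split()
    for term, weight in ontology.items():
        score = weight * words.count(term)
        if score > 0:
            index[term].append((score, pid))
for term in index:                              # highest-priority postings first
    index[term].sort(reverse=True)

print(index["algorithm"])                       # [(3.0, 'p1')]
```

At query time the engine only scans the postings of the query's ontology terms, already sorted by priority, which is the speed benefit the abstract claims for storing the index.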
This paper considers the transmission of confidential messages over noisy wireless ad hoc networks, where both background noise and interference from concurrent transmitters affect the received signals.
For the random networks where the legitimate nodes and the eavesdroppers are distributed as Poisson point processes, we study the secrecy transmission capacity (STC), as well as the connection outage probability and secrecy outage probability, based on the physical layer security.
We first consider the basic fixed transmission distance model, and establish a theoretical model of the STC.
We then extend the above results to a more realistic random distance transmission model, namely nearest receiver transmission.
Finally, extensive simulation and numerical results are provided to validate the efficiency of our theoretical results and illustrate how the STC is affected by noise, connection and secrecy outage probabilities, transmitter and eavesdropper densities, and other system parameters.
Remarkably, our results reveal that a proper amount of noise can actually improve the secrecy transmission capacity.
In this paper, we develop a distributed intermittent communication and task planning framework for mobile robot teams.
The goal of the robots is to accomplish complex tasks, captured by local Linear Temporal Logic formulas, and share the collected information with all other robots and possibly also with a user.
Specifically, we consider situations where the robot communication capabilities are not sufficient to form reliable and connected networks while the robots move to accomplish their tasks.
In this case, intermittent communication protocols are necessary that allow the robots to temporarily disconnect from the network in order to accomplish their tasks free of communication constraints.
We assume that the robots can only communicate with each other when they meet at common locations in space.
Our distributed control framework jointly determines local plans that allow all robots to fulfill their assigned temporal tasks, sequences of communication events that guarantee information exchange infinitely often, and optimal communication locations that minimize a desired distance metric.
Simulation results verify the efficacy of the proposed controllers.
Urban scholars have studied street networks in various ways, but there are data availability and consistency limitations to the current urban planning/street network analysis literature.
To address these challenges, this article presents OSMnx, a new tool to make the collection of data and creation and analysis of street networks simple, consistent, automatable and sound from the perspectives of graph theory, transportation, and urban design.
OSMnx contributes five significant capabilities for researchers and practitioners: first, the automated downloading of political boundaries and building footprints; second, the tailored and automated downloading and constructing of street network data from OpenStreetMap; third, the algorithmic correction of network topology; fourth, the ability to save street networks to disk as shapefiles, GraphML, or SVG files; and fifth, the ability to analyze street networks, including calculating routes, projecting and visualizing networks, and calculating metric and topological measures.
These measures include those common in urban design and transportation studies, as well as advanced measures of the structure and topology of the network.
Finally, this article presents a simple case study using OSMnx to construct and analyze street networks in Portland, Oregon.
In many important machine learning applications, the training distribution used to learn a probabilistic classifier differs from the testing distribution on which the classifier will be used to make predictions.
Traditional methods correct the distribution shift by reweighting the training data with the ratio of the density between test and training data.
In many applications training takes place without prior knowledge of the testing distribution on which the algorithm will be applied in the future.
Recently, methods have been proposed to address the shift by learning causal structure, but they rely on the diversity of multiple training datasets to achieve good performance and suffer from complexity limitations in high dimensions.
In this paper, we propose a novel Deep Global Balancing Regression (DGBR) algorithm to jointly optimize a deep auto-encoder model for feature selection and a global balancing model for stable prediction across unknown environments.
The global balancing model constructs balancing weights that facilitate estimating the partial effects of features (holding all other features fixed), a problem that is challenging in high dimensions, and thus helps to identify stable, causal relationships between features and outcomes.
The deep auto-encoder model is designed to reduce the dimensionality of the feature space, thus making global balancing easier.
We show, both theoretically and with empirical experiments, that our algorithm can make stable predictions across unknown environments.
Our experiments on both synthetic and real world datasets demonstrate that our DGBR algorithm outperforms the state-of-the-art methods for stable prediction across unknown environments.
Full duplex (FD) communication, which increases spectral efficiency through simultaneous transmission and reception on the same frequency band, is a promising technology for meeting the demands of next generation wireless networks.
In this paper, we consider the application of such FD communication to self-backhauled small cells.
We consider a FD capable small cell base station (BS) being wirelessly backhauled by a FD capable macro-cell BS.
FD communication enables simultaneous backhaul and access transmissions at small cell BSs, which reduces the need to orthogonalize allocated spectrum between access and backhaul.
However, in such simultaneous operations, all the links experience higher interference, which significantly suppresses the gains of FD operations.
We propose an interference-aware scheduling method to maximize the FD gain across multiple UEs in both uplink and downlink directions, while maintaining a level of fairness between all UEs.
It jointly schedules the appropriate links and traffic based on the back-pressure algorithm, and allocates appropriate transmission powers to the scheduled links using Geometric Programming.
Our simulation results show that the proposed scheduler nearly doubles the throughput of small cells compared to traditional half-duplex self-backhauling.
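The back-pressure selection rule referenced above can be sketched in isolation: among candidate links, activate the one maximizing queue differential times link rate. The topology and all numbers are illustrative, and the power allocation via Geometric Programming is not reproduced:

```python
# One back-pressure step: among mutually interfering links, activate the one
# with the largest (queue differential x link rate) weight.
queues = {"macro": 12, "small": 5, "ue1": 0, "ue2": 2}   # backlogged packets
links = [                                                # (src, dst, rate)
    ("macro", "small", 4.0),   # wireless backhaul
    ("small", "ue1", 2.0),     # access downlink
    ("small", "ue2", 2.0),
]

def backpressure_pick(queues, links):
    """Return the link maximizing (Q_src - Q_dst) * rate."""
    return max(links, key=lambda l: (queues[l[0]] - queues[l[1]]) * l[2])

best = backpressure_pick(queues, links)
print(best)   # the backhaul link wins: (12 - 5) * 4 beats both access links
```

Weighting by queue differentials is what keeps backhaul and access transmissions balanced over time: a starved access queue eventually builds enough differential to win a slot, which provides the fairness mentioned in the abstract.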
We will demonstrate a conversational product-recommendation agent.
This system shows how we combine research in personalized recommendation systems with research in dialogue systems to build a virtual sales agent.
Based on new deep learning technologies we developed, the virtual agent is capable of learning how to interact with users, how to answer user questions, which question to ask next, and what to recommend when chatting with a human user.
Normally, a decent conversational agent for a particular domain requires tens of thousands of hand-labeled conversational examples or hand-written rules.
This is a major barrier when launching a conversational agent for a new domain.
We will explore and demonstrate the effectiveness of the learning solution even when there are no hand-written rules or hand-labeled training data.
Comtraces (combined traces) are extensions of Mazurkiewicz traces that can model the "not later than" relationship.
In this paper, we first introduce the novel notion of generalized comtraces, extensions of comtraces that can additionally model the "non-simultaneously" relationship.
Then we study some basic algebraic properties and canonical representations of comtraces and generalized comtraces.
Finally we analyze the relationship between generalized comtraces and generalized stratified order structures.
The major technical contribution of this paper is a proof showing that generalized comtraces can be represented by generalized stratified order structures.
The inversion of linear systems is a fundamental step in many inverse problems.
Computational challenges exist when trying to invert large linear systems, where limited computing resources mean that only part of the system can be kept in computer memory at any one time.
We are here motivated by tomographic inversion problems that often lead to linear inverse problems.
In state-of-the-art x-ray systems, even a standard scan can produce 4 million individual measurements, and the reconstruction of x-ray attenuation profiles typically requires the estimation of a million attenuation coefficients.
To deal with the large data sets encountered in real applications and to utilise modern graphics processing unit (GPU) based computing architectures, combinations of iterative reconstruction algorithms and parallel computing schemes are increasingly applied.
Although both row and column action methods have been proposed to utilise parallel computing architectures, individual computations in current methods need to know either the entire set of observations or the entire set of estimated x-ray absorptions, which can be prohibitive in many realistic big data applications.
We present a fully parallelizable computed tomography (CT) image reconstruction algorithm that works with arbitrary partial subsets of the data and the reconstructed volume.
We further develop a non-homogeneously randomised selection criterion which guarantees that sub-matrices of the system matrix are selected more frequently if they are dense, thus maximising information flow through the algorithm.
A grouped version of the algorithm is also proposed to further improve convergence speed and performance.
Algorithm performance is verified experimentally.
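A minimal version of the partial-data idea is a randomized block Kaczmarz iteration in which row blocks are sampled with probability proportional to their squared norm (a crude stand-in for the density-based criterion); each step only needs one block of the matrix and the corresponding measurements, never the full system:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 50
A = rng.normal(size=(m, n)) * (rng.random((m, n)) < 0.3)   # sparse system matrix
x_true = rng.normal(size=n)
y = A @ x_true                                             # consistent measurements

# Ten row blocks of 20 equations each; sampling probability proportional to a
# block's squared norm, so denser blocks are visited more often.
blocks = [np.arange(i, i + 20) for i in range(0, m, 20)]
p = np.array([np.linalg.norm(A[b]) ** 2 for b in blocks])
p /= p.sum()

x = np.zeros(n)
for _ in range(3000):
    b = blocks[rng.choice(len(blocks), p=p)]
    residual = y[b] - A[b] @ x
    # Minimum-norm correction: project x onto the block's solution set.
    x += np.linalg.lstsq(A[b], residual, rcond=None)[0]

print(np.linalg.norm(x - x_true) < 1e-4)   # True on this consistent toy system
```

In CT the blocks would be sub-matrices over both projections and voxels, but the pattern is the same: each update touches only the data currently in memory, which is what makes the scheme fully parallelizable.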
We describe a novel approach to interpret a polar code as a low-density parity-check (LDPC)-like code with an underlying sparse decoding graph.
This sparse graph is based on the encoding factor graph of polar codes and is suitable for conventional belief propagation (BP) decoding.
We discuss several pruning techniques based on the check node decoder (CND) and variable node decoder (VND) update equations, significantly reducing the size (i.e., decoding complexity) of the parity-check matrix.
As a result, iterative polar decoding can then be conducted on a sparse graph, akin to the traditional well-established LDPC decoding, e.g., using a fully parallel sum-product algorithm (SPA).
This facilitates the systematic analysis and design of polar codes using the well-established tools known from analyzing LDPC codes.
We show that the proposed iterative polar decoder has a negligible performance loss for short-to-intermediate codelengths compared to Arikan's original BP decoder.
Finally, the proposed decoder is shown to benefit from both reduced complexity and reduced memory requirements and, thus, is more suitable for hardware implementations.
The present work investigates whether different quantification mechanisms (set comparison, vague quantification, and proportional estimation) can be jointly learned from visual scenes by a multi-task computational model.
The motivation is that, in humans, these processes underlie the same cognitive, non-symbolic ability, which allows an automatic estimation and comparison of set magnitudes.
We show that when information about lower-complexity tasks is available, the higher-level proportional task becomes more accurate than when performed in isolation.
Moreover, the multi-task model is able to generalize to unseen combinations of target/non-target objects.
Consistently with behavioral evidence showing the interference of absolute number in the proportional task, the multi-task model no longer works when asked to provide the number of target objects in the scene.
Both coverage and connectivity are important in wireless sensor networks (WSNs).
Coverage means how well an area of interest is being monitored by the deployed network.
It depends on the sensing model used to design the network model.
Connectivity ensures the establishment of a wireless link between two nodes.
A link model studies the connectivity between two nodes.
The establishment of a wireless link between two nodes is a probabilistic phenomenon.
The connectivity between two nodes plays an important role in the determination of network connectivity.
In this paper, we investigate the impact of sensing model of nodes on the network coverage.
Also, we investigate the dependency of the connectivity and coverage on the shadow fading parameters.
It has been observed that the shadowing effect reduces network coverage while it enhances connectivity in a multi-hop wireless network.
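The connectivity side of this observation can be made concrete with a standard log-distance path-loss model plus log-normal shadowing: beyond the deterministic range, the link probability is nonzero and grows with the shadowing spread. All parameter values below are illustrative, not the paper's:

```python
import math

def link_prob(d, sigma_db, pt_dbm=0.0, pl0_db=40.0, alpha=3.0, pth_dbm=-100.0):
    """Probability that a link of length d is up under log-normal shadowing.

    Mean received power follows a log-distance path-loss law; shadowing adds a
    zero-mean Gaussian (in dB) with standard deviation sigma_db.
    """
    pr_mean = pt_dbm - pl0_db - 10.0 * alpha * math.log10(d)   # mean rx power, dBm
    z = (pth_dbm - pr_mean) / sigma_db
    return 0.5 * math.erfc(z / math.sqrt(2.0))                 # Gaussian Q-function

# With these parameters the deterministic range is 100 m; at 120 m the link
# survives only thanks to shadowing, and a larger spread raises its probability.
for sigma in (2.0, 8.0):
    print(round(link_prob(120.0, sigma), 3))
```

The same spread that occasionally boosts long links also occasionally kills short ones, which is why shadowing can shrink coverage while enlarging multi-hop connectivity.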
The end-to-end throughput of multi-hop communication in wireless ad hoc networks is affected by the conflict between forwarding nodes.
It has been shown that sending more packets than maximum achievable end-to-end throughput not only fails to increase throughput, but also decreases throughput owing to high contention and collision.
Accordingly, it is of crucial importance for a source node to know the maximum end-to-end throughput.
The end-to-end throughput depends on multiple factors, such as physical layer limitations, MAC protocol properties, routing policy, and node distribution.
There have been many studies on analytical modeling of end-to-end throughput, but none of them has taken routing policy, node distribution, and the MAC layer into account altogether.
In this paper, the end-to-end throughput with a perfect MAC layer is obtained based on routing policy and node distribution in one- and two-dimensional networks.
Then, imperfections of the IEEE 802.11 protocol are added to the model to obtain precise values.
Exhaustive simulations are also performed to validate the proposed models using the NS2 simulator.
Results show that if the next-hop distribution for a particular routing policy is known, our methodology can obtain the maximum end-to-end throughput precisely.
Skin cancer is a major public health problem, with over 5 million newly diagnosed cases in the United States each year.
Melanoma is the deadliest form of skin cancer, responsible for over 9,000 deaths each year.
In this paper, we propose an ensemble of deep convolutional neural networks to classify dermoscopy images into three classes.
To achieve the highest classification accuracy, we fuse the outputs of the softmax layers of four different neural architectures.
For aggregation, we consider the individual accuracies of the networks weighted by the confidence values provided by their final softmax layers.
This fusion-based approach outperformed all the individual neural networks regarding classification accuracy.
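The fusion rule can be written down directly: weight each network's softmax output by its standalone accuracy times its confidence (its maximum softmax probability) and average the weighted distributions. The probabilities and accuracies below are mock values, not results from the paper:

```python
import numpy as np

# Softmax outputs of four hypothetical networks for one dermoscopy image
# (three classes), plus each network's standalone validation accuracy.
probs = np.array([
    [0.60, 0.30, 0.10],
    [0.50, 0.40, 0.10],
    [0.20, 0.70, 0.10],
    [0.55, 0.35, 0.10],
])
accuracies = np.array([0.85, 0.80, 0.70, 0.82])

# Each network's weight is its accuracy times its confidence (max softmax
# probability); the fused prediction is the weighted average distribution.
confidence = probs.max(axis=1)
w = accuracies * confidence
fused = (w[:, None] * probs).sum(axis=0) / w.sum()

print(fused.argmax())                   # 0: the accurate, confident nets prevail
print(round(float(fused.sum()), 6))     # 1.0, still a valid distribution
```

The third network's dissenting vote is down-weighted both for its lower accuracy and, on other inputs, for any lack of confidence, which is how the ensemble outperforms each member.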
Android is the predominant mobile operating system for the past few years.
The prevalence of devices that can be powered by Android has attracted not only application developers but also malware developers with criminal intent, who design and spread malicious applications that can affect the normal operation of Android phones and tablets, steal personal information and credential data, or, even worse, lock the phone and ask for ransom.
Researchers persistently devise countermeasures strategies to fight back malware.
One of these strategies applied in the past five years is the use of deep learning methods in Android malware detection.
This necessitates a review of the accomplished work in order to see where efforts have been concentrated, identify unresolved problems, and motivate future research directions.
In this work, an extensive survey of static, dynamic, and hybrid analysis approaches that utilize deep learning methods is presented, with an elaborated discussion of their key concepts, contributions, and limitations.
Minimizing job scheduling time is a fundamental issue in data center networks that has been extensively studied in recent years.
The incoming jobs require different numbers of CPU and memory units and span different numbers of time slots.
The traditional solution is to design efficient heuristic algorithms with performance guarantee under certain assumptions.
In this paper, we improve a recently proposed job scheduling algorithm using deep reinforcement learning and extend it to multiple server clusters.
Our study reveals that the deep reinforcement learning method has the potential to outperform traditional resource allocation algorithms in a variety of complicated environments.
We consider a computational model which is known as set automata.
The set automata are one-way finite automata with an additional storage---the set.
There are two kinds of set automata---the deterministic and the nondeterministic ones.
We denote them as DSA and NSA respectively.
The model was introduced by M. Kutrib, A. Malcher, M. Wendlandt in 2014.
It was shown that DSA-languages look similar to DCFL due to their closure properties and NSA-languages look similar to CFL due to their undecidability properties.
In this paper we show that this similarity is natural: we prove that languages recognizable by NSA form a rational cone, as CFL does.
The main topic of this paper is computational complexity: we prove that (i) languages recognizable by DSA belong to P, and there are P-complete languages among them; (ii) languages recognizable by NSA are in NP, and there are NP-complete languages among them; (iii) the word membership problem is P-complete for DSA without epsilon-loops and PSPACE-complete for general DSA; (iv) the emptiness problem is in PSPACE for NSA and, moreover, PSPACE-complete for DSA.
Rapport, the close and harmonious relationship in which interaction partners are "in sync" with each other, was shown to result in smoother social interactions, improved collaboration, and improved interpersonal outcomes.
In this work, we are the first to investigate automatic prediction of low rapport during natural interactions within small groups.
This task is challenging given that rapport only manifests in subtle non-verbal signals that are, in addition, subject to influences of group dynamics as well as inter-personal idiosyncrasies.
We record videos of unscripted discussions of three to four people using a multi-view camera system and microphones.
We analyse a rich set of non-verbal signals for rapport detection, namely facial expressions, hand motion, gaze, speaker turns, and speech prosody.
Using facial features, we can detect low rapport with an average precision of 0.7 (chance level at 0.25), while incorporating prior knowledge of participants' personalities can even achieve early prediction without a drop in performance.
We further provide a detailed analysis of different feature sets and the amount of information contained in different temporal segments of the interactions.
Heterogeneous information networks (HINs) are ubiquitous in real-world applications.
Due to the heterogeneity in HINs, the typed edges may not fully align with each other.
In order to capture the semantic subtlety, we propose the concept of aspects with each aspect being a unit representing one underlying semantic facet.
Meanwhile, network embedding has emerged as a powerful method for learning network representation, where the learned embedding can be used as features in various downstream applications.
Therefore, we are motivated to propose a novel embedding learning framework---AspEm---to preserve the semantic information in HINs based on multiple aspects.
Instead of preserving information of the network in one semantic space, AspEm encapsulates information regarding each aspect individually.
In order to select aspects for embedding purposes, we further devise a solution for AspEm based on dataset-wide statistics.
To corroborate the efficacy of AspEm, we conducted experiments on two real-world datasets with two types of applications---classification and link prediction.
Experiment results demonstrate that AspEm can outperform baseline network embedding learning methods by considering multiple aspects, where the aspects can be selected from the given HIN in an unsupervised manner.
Non-motorized transport is becoming increasingly important in urban development of cities in China.
How to evaluate the non-motorized transport popularity of urban roads is an interesting question to study.
The great amount of tracking data generated by smart mobile devices gives us an opportunity to solve this problem.
This study aims to provide a data-driven method for evaluating the popularity (walkability and bikeability) of an urban non-motorized transport system.
This paper defines a p-index to evaluate the popularity of road segments based on cycling, running, and walking GPS track data from outdoor activity logging applications.
According to the p-index definition, this paper evaluates the non-motorized transport popularity of urban area in Wuhan city within different temporal periods.
Systems for automatic extraction of semantic information about events from large textual resources are now available: these tools are capable of generating RDF datasets about events extracted from text, and this knowledge can be used to reason over the recognized events.
On the other hand, text-based tasks for event recognition, such as event coreference (i.e., recognizing whether two textual descriptions refer to the same event), do not take into account ontological information about the extracted events in their process.
In this paper, we propose a method to derive event coreference on text extracted event data using semantic based rule reasoning.
We demonstrate our method considering a limited (yet representative) set of event types: we introduce a formal analysis of their ontological properties and, on the basis of this, we define a set of coreference criteria.
We then implement these criteria as RDF-based reasoning rules to be applied on text extracted event data.
We evaluate the effectiveness of our approach over a standard coreference benchmark dataset.
Whole genome prediction of complex phenotypic traits using high-density genotyping arrays has attracted a great deal of attention, as it is relevant to the fields of plant and animal breeding and genetic epidemiology.
As the number of genotypes is generally much bigger than the number of samples, predictive models suffer from the curse-of-dimensionality.
The curse-of-dimensionality problem not only affects the computational efficiency of a particular genomic selection method, but can also lead to poor performance, mainly due to correlation among markers.
In this work we propose the first transductive feature selection method based on the MRMR (Max-Relevance and Min-Redundancy) criterion, which we call MINT.
We applied MINT on genetic trait prediction problems and showed that in general MINT is a better feature selection method than the state-of-the-art inductive method mRMR.
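For illustration, a minimal greedy mRMR selector over discrete features can be sketched as follows. This is the classical inductive criterion, not the transductive MINT variant, and the toy data are hypothetical:

```python
import math
from collections import Counter

def mutual_info(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def mrmr(features, target, k):
    """Greedy mRMR: pick features maximizing relevance to the target minus
    mean redundancy with the features already selected."""
    selected, remaining = [], list(range(len(features)))
    while len(selected) < k and remaining:
        def score(i):
            rel = mutual_info(features[i], target)
            red = (sum(mutual_info(features[i], features[j]) for j in selected)
                   / len(selected)) if selected else 0.0
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

y  = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 1, 1, 1, 1, 1]   # highly relevant to y
f1 = [0, 0, 0, 1, 1, 1, 1, 1]   # duplicate of f0: relevant but fully redundant
f2 = [0, 0, 1, 0, 1, 1, 0, 1]   # weakly relevant, not redundant with f0
print(mrmr([f0, f1, f2], y, 2))  # → [0, 2]
```

The redundancy penalty is what steers the second pick to the weakly relevant but novel feature instead of the duplicate.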
Trends like digital transformation even intensify the already overwhelming mass of information knowledge workers face in their daily life.
To counter this, we have been investigating knowledge work and information management support measures inspired by human forgetting.
In this paper, we give an overview of solutions we have found during the last five years as well as challenges that still need to be tackled.
Additionally, we share experiences gained with the prototype of a first forgetful information system used 24/7 in our daily work for the last three years.
We also address the untapped potential of more explicated user context as well as features inspired by Memory Inhibition, which is our current focus of research.
Memorability is considered an important characteristic of visual content, and for advertisement and educational purposes it is the most important one.
Despite numerous studies on understanding and predicting image memorability, there are almost no achievements in memorability modification.
In this work, we study two possible approaches to image modification that may influence memorability.
The visual features that directly influence memorability remain unknown, hence it is impossible to control memorability manually.
As a solution, we let GAN learn it deeply using labeled data, and then use it for conditional generation of new images.
By analogy with algorithms which edit facial attributes, we consider memorability as yet another attribute and operate with it in the same way.
The obtained data are also interesting for analysis, simply because there are no real-world examples of successfully changing image memorability while preserving its other attributes.
We believe this may give many new answers to the question "what makes an image memorable?"
Apart from that, we also study the influence on memorability of conventional photo-editing tools (Photoshop, Instagram, etc.) used daily by a wide audience.
In this case, we start from real practical methods and study them using statistics and recent advances in memorability prediction.
Photographers, designers, and advertisers will benefit from the results of this study directly.
Sentence embedding is an important research topic in natural language processing.
It is essential to generate a good embedding vector that fully reflects the semantic meaning of a sentence in order to achieve an enhanced performance for various natural language processing tasks, such as machine translation and document classification.
Thus far, various sentence embedding models have been proposed, and their feasibility has been demonstrated through good performances on tasks following embedding, such as sentiment analysis and sentence classification.
However, because the performances of sentence classification and sentiment analysis can be enhanced by using a simple sentence representation method, it is not sufficient to claim that these models fully reflect the meanings of sentences based on good performances for such tasks.
In this paper, inspired by human language recognition, we propose the following concept of semantic coherence, which should be satisfied for a good sentence embedding method: similar sentences should be located close to each other in the embedding space.
Then, we propose the Paraphrase-Thought (P-thought) model to pursue semantic coherence as much as possible.
Experimental results on two paraphrase identification datasets (MS COCO and STS benchmark) show that the P-thought models outperform the benchmarked sentence embedding methods.
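The semantic-coherence criterion itself is easy to state operationally: paraphrases should have higher embedding similarity than unrelated sentence pairs. A toy check with hypothetical 3-dimensional embeddings (not the P-thought model) might look like:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def coherence_check(pairs, others, embed):
    """Semantic-coherence test: every paraphrase pair should be closer
    (higher cosine similarity) than any pair of unrelated sentences."""
    para = min(cosine(embed[a], embed[b]) for a, b in pairs)
    other = max(cosine(embed[a], embed[b]) for a, b in others)
    return para > other

# Hypothetical embeddings of two paraphrases and one unrelated sentence.
embed = {
    "a cat sits on the mat":  [0.9, 0.1, 0.0],
    "a cat is on the mat":    [0.8, 0.2, 0.1],
    "stocks fell on tuesday": [0.0, 0.2, 0.9],
}
print(coherence_check(
    pairs=[("a cat sits on the mat", "a cat is on the mat")],
    others=[("a cat sits on the mat", "stocks fell on tuesday")],
    embed=embed))  # → True
```

A coherent embedding space passes this check for all paraphrase pairs, which is exactly the property the paraphrase-identification evaluations probe.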
Technology of autonomous vehicles (AVs) is getting mature and many AVs will appear on the roads in the near future.
AVs become connected with the support of various vehicular communication technologies, and they possess a high degree of control to respond to instantaneous situations cooperatively with high efficiency and flexibility.
In this paper, we propose a new public transportation system based on AVs.
It manages a fleet of AVs to accommodate transportation requests, offering point-to-point services with ride sharing.
We focus on the two major problems of the system: scheduling and admission control.
The former is to configure the most economical schedules and routes for the AVs to satisfy the admissible requests while the latter is to determine the set of admissible requests among all requests to produce maximum profit.
The scheduling problem is formulated as a mixed-integer linear program and the admission control problem is cast as a bilevel optimization, which embeds the scheduling problem as the major constraint.
By utilizing the analytical properties of the problem, we develop an effective genetic-algorithm-based method to tackle the admission control problem.
We validate the performance of the algorithm with real-world transportation service data.
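As a rough illustration of the outer admission-control loop, the sketch below evolves 0/1 admission vectors with a simple genetic algorithm. The inner scheduling MILP is replaced here by a stand-in capacity-feasibility check, and the request data, fleet capacity, and function names are hypothetical:

```python
import random

def ga_admission(requests, capacity, pop=30, gens=60, seed=1):
    """Evolve a 0/1 admission vector maximizing profit under a fleet capacity.

    Each request is (revenue, load); the stand-in inner 'scheduler' simply
    checks that the admitted load fits the capacity (a real system would
    solve the scheduling MILP here).
    """
    rng = random.Random(seed)
    n = len(requests)

    def fitness(bits):
        rev = sum(r for (r, l), b in zip(requests, bits) if b)
        load = sum(l for (r, l), b in zip(requests, bits) if b)
        return rev if load <= capacity else -1  # infeasible sets rejected

    popn = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=fitness, reverse=True)
        elite = popn[: pop // 2]               # keep the best half
        children = []
        while len(elite) + len(children) < pop:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]          # one-point crossover
            child[rng.randrange(n)] ^= 1       # point mutation
            children.append(child)
        popn = elite + children
    best = max(popn, key=fitness)
    return best, fitness(best)

# Knapsack-like instance: the optimum admits requests 0, 1 and 3.
reqs = [(10, 4), (9, 4), (8, 5), (6, 2)]
bits, profit = ga_admission(reqs, capacity=10)
print(profit)  # → 25
```

The bilevel structure shows through: the GA only proposes admission sets, and each fitness evaluation defers to the embedded scheduling subproblem.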
Conversational agents have become ubiquitous, ranging from goal-oriented systems for helping with reservations to chit-chat models found in modern virtual assistants.
In this survey paper, we explore this fascinating field.
We look at some of the pioneering work that defined the field and gradually move to the current state-of-the-art models.
We look at statistical, neural, generative adversarial network based and reinforcement learning based approaches and how they evolved.
Along the way, we discuss various challenges that the field faces: lack of context in utterances, the absence of a good quantitative metric for comparing models, and lack of trust in agents because they do not have a consistent persona.
We structure this paper in a way that answers these pertinent questions and discusses competing approaches to solve them.
The use of interpolants in model checking is becoming an enabling technology to allow fast and robust verification of hardware and software.
The application of encodings based on the theory of arrays, however, is limited by the impossibility of deriving quantifier-free interpolants in general.
In this paper, we show that it is possible to obtain quantifier-free interpolants for a Skolemized version of the extensional theory of arrays.
We prove this in two ways: (1) non-constructively, by using the model theoretic notion of amalgamation, which is known to be equivalent to admit quantifier-free interpolation for universal theories; and (2) constructively, by designing an interpolating procedure, based on solving equations between array updates.
(Interestingly, rewriting techniques are used in the key steps of the solver and its proof of correctness.)
To the best of our knowledge, this is the first successful attempt at computing quantifier-free interpolants for a variant of the theory of arrays with extensionality.
A Software Engineering project depends significantly on team performance, as does any activity that involves human interaction.
In recent years, the traditional perspective on software development has been changing, and agile methods have received considerable attention.
Among other attributes, agilists claim that fostering creativity is one of the keys to responding to the common problems and challenges of software development today.
The development of new software products requires the generation of novel and useful ideas.
Agile itself is a conceptual framework introduced in the Agile Manifesto in 2001.
This paper is written in support of agile practices in terms of significance of teamwork for the success of software projects.
A survey is used as the research method to determine the significance of teamwork.
A co-evolutionary algorithm (CA) based chess player is presented.
Implementation details of the algorithm, namely the encoding, population, and variation operators, are described.
The alpha-beta or minimax-like behaviour of the player is achieved through two competitive or cooperative populations.
Special attention is given to the fitness function evaluation (the heart of the solution).
Test results of algorithm-versus-algorithm and algorithm-versus-human play are provided.
In this paper we propose a method for applications-oriented input design for linear systems under time-domain constraints on the amplitude of input and output signals.
The method guarantees a desired control performance for the estimated model in minimum time, by imposing some lower bound on the information matrix.
The problem is formulated as a time domain optimization problem, which is non-convex.
This is addressed through an alternating method, where we separate the problem into two steps, and at each step we optimize the cost function with respect to one of the two variables.
We alternate between these two steps until convergence.
A time-recursive input design algorithm is performed, which enables us to use the algorithm in conjunction with control.
Therefore, a receding horizon framework is used to solve each optimization problem.
Finally, we illustrate the method with two numerical examples which show the effectiveness of the proposed approach in generating an optimal input signal.
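The alternating scheme can be illustrated generically: fix one variable, minimize over the other, and repeat until convergence. The objective and solver below are toy stand-ins (a biconvex function and a 1-D ternary search), not the paper's time-domain optimization:

```python
def argmin_1d(f, lo=-10.0, hi=10.0, iters=100):
    """Ternary search for the minimizer of a unimodal 1-D function."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def alternate(f, x, y, rounds=50):
    """Alternate between minimizing over x (y fixed) and over y (x fixed)."""
    for _ in range(rounds):
        x = argmin_1d(lambda v: f(v, y))
        y = argmin_1d(lambda v: f(x, v))
    return x, y

# Biconvex but jointly non-convex objective: convex in each variable alone.
f = lambda x, y: (x * y - 4) ** 2 + (x - y) ** 2
x, y = alternate(f, x=1.0, y=3.0)
print(round(x, 3), round(y, 3))  # → 2.0 2.0
```

Each subproblem is convex, so the iterates converge to a stationary point of the joint problem (here, the global minimizer x = y = 2) even though the joint objective is non-convex.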
The popularity of digital currencies, especially cryptocurrencies, has been continuously growing since the appearance of Bitcoin.
Bitcoin's security lies in a proof-of-work scheme, which requires high computational resources at the miners.
Despite advances in mobile technology, existing cryptocurrencies cannot be maintained by mobile devices due to their low processing capabilities.
Mobile devices can only accommodate mobile applications (wallets) that allow users to exchange credits of cryptocurrencies.
In this work, we propose LocalCoin, an alternative cryptocurrency that requires minimal computational resources, produces low data traffic and works with off-the-shelf mobile devices.
LocalCoin replaces the computational hardness that is at the root of Bitcoin's security with the social hardness of ensuring that all witnesses to a transaction are colluders.
LocalCoin features (i) a lightweight proof-of-work scheme and (ii) a distributed blockchain.
We analyze LocalCoin for double spending under passive and active attacks and prove that, under the assumption of a sufficient number of users and properly selected tuning parameters, the probability of double spending is close to zero.
Extensive simulations on real mobility traces, realistic urban settings, and random geometric graphs show that the probability of success of one transaction converges to 1 and the probability of the success of a double spending attempt converges to 0.
XML data warehouses form an interesting basis for decision-support applications that exploit heterogeneous data from multiple sources.
However, XML-native database systems currently suffer from limited performance in terms of manageable data volume and response time for complex analytical queries.
Fragmenting and distributing XML data warehouses (e.g., on data grids) allows both of these issues to be addressed.
In this paper, we work on XML warehouse fragmentation.
In relational data warehouses, several studies recommend the use of derived horizontal fragmentation.
Hence, we propose to adapt it to the XML context.
We particularly focus on the initial horizontal fragmentation of dimensions' XML documents and exploit two alternative algorithms.
We experimentally validate our proposal and compare these alternatives with respect to a unified XML warehouse model we advocate for.
In this paper, we investigate the performance gains of adapting pilot spacing and power for Carrier Aggregation (CA)-OFDM systems in nonstationary wireless channels.
In current multi-band CA-OFDM wireless networks, all component carriers use the same pilot density, which is designed for poor channel environments.
This leads to unnecessary pilot overhead in good channel conditions and performance degradation in the worst channel conditions.
We propose adaptation of pilot spacing and power using a codebook-based approach, where the transmitter and receiver exchange information about the fading characteristics of the channel over a short period of time, which are stored as entries in a channel profile codebook.
We present a heuristic algorithm that maximizes the achievable rate by finding the optimal pilot spacing and power, from a set of candidate pilot configurations.
We also analyze the computational complexity of our proposed algorithm and the feedback overhead.
We describe methods to minimize the computation and feedback requirements for our algorithm in multi-band CA scenarios and present simulation results in typical terrestrial and air-to-ground/air-to-air nonstationary channels.
Our results show that significant performance gains can be achieved when adopting adaptive pilot spacing and power allocation in nonstationary channels.
We also discuss important practical considerations and provide guidelines to implement adaptive pilot spacing in CA-OFDM systems.
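The codebook search itself can be sketched with a toy rate model capturing the overhead/estimation trade-off; the rate expression, codebook entries, and parameters below are illustrative assumptions, not taken from the paper:

```python
import math

def achievable_rate(spacing, pilot_power, snr, coherence, n_sc=120):
    """Toy rate model: data fraction times log-rate with an estimation penalty.

    Wider pilot spacing lowers overhead but degrades channel estimation once
    pilots become sparse relative to the channel coherence (in subcarriers).
    """
    overhead = math.ceil(n_sc / spacing) / n_sc
    est_quality = 1 / (1 + (spacing / coherence) ** 2 / (pilot_power * snr))
    eff_snr = snr * (1 - pilot_power) * est_quality
    return (1 - overhead) * math.log2(1 + eff_snr)

def best_pilot_config(codebook, snr, coherence):
    """Exhaustive codebook search for the rate-maximizing (spacing, power)."""
    return max(codebook, key=lambda c: achievable_rate(c[0], c[1], snr, coherence))

codebook = [(4, 0.1), (8, 0.1), (12, 0.1), (4, 0.2), (8, 0.2), (12, 0.2)]
print(best_pilot_config(codebook, snr=10.0, coherence=20.0))  # slowly varying channel
print(best_pilot_config(codebook, snr=10.0, coherence=4.0))   # fast-varying channel
```

Even in this toy model, the fast-varying channel pulls the optimum toward dense, higher-power pilots, while a slowly varying channel tolerates sparser pilots; a per-component-carrier search of this kind is what replaces the fixed worst-case pilot density.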
The constrained Cramer-Rao bound (CCRB) is a lower bound on the mean-squared-error (MSE) of estimators that satisfy some unbiasedness conditions.
Although the CCRB unbiasedness conditions are satisfied asymptotically by the constrained maximum likelihood (CML) estimator, in the non-asymptotic region these conditions are usually too strict and the commonly-used estimators, such as the CML estimator, do not satisfy them.
Therefore, the CCRB may not be a lower bound on the MSE matrix of such estimators.
In this paper, we propose a new definition for unbiasedness under constraints, denoted by C-unbiasedness, which is based on using Lehmann-unbiasedness with a weighted MSE (WMSE) risk and taking into account the parametric constraints.
In addition to C-unbiasedness, a Cramer-Rao-type bound on the WMSE of C-unbiased estimators, denoted as Lehmann-unbiased CCRB (LU-CCRB), is derived.
This bound is a scalar bound that depends on the chosen weighted combination of estimation errors.
It is shown that C-unbiasedness is less restrictive than the CCRB unbiasedness conditions.
Thus, the set of estimators that satisfy the CCRB unbiasedness conditions is a subset of the set of C-unbiased estimators and the proposed LU-CCRB may be an informative lower bound in cases where the corresponding CCRB is not.
In the simulations, we examine linear and nonlinear estimation problems under nonlinear constraints in which the CML estimator is shown to be C-unbiased and the LU-CCRB is an informative lower bound on the WMSE, while the corresponding CCRB on the WMSE is not a lower bound and is not informative in the non-asymptotic region.
Data retrieval systems such as online search engines and online social networks must comply with the privacy policies of personal and selectively shared data items, regulatory policies regarding data retention and censorship, and the provider's own policies regarding data use.
Enforcing these policies is difficult and error-prone.
Systematic techniques to enforce policies are either limited to type-based policies that apply uniformly to all data of the same type, or incur significant runtime overhead.
This paper presents Shai, the first system that systematically enforces data-specific policies with near-zero overhead in the common case.
Shai's key idea is to push as many policy checks as possible to an offline, ahead-of-time analysis phase, often relying on predicted values of runtime parameters such as the state of access control lists or connected users' attributes.
Runtime interception is used sparingly, only to verify these predictions and to make any remaining policy checks.
Our prototype implementation relies on efficient, modern OS primitives for sandboxing and isolation.
We present the design of Shai and quantify its overheads on an experimental data indexing and search pipeline based on the popular search engine Apache Lucene.
One of the possible ways of obtaining continuous-space sentence representations is by training neural machine translation (NMT) systems.
The recent attention mechanism, however, removes the single point in the neural network from which the source sentence representation can be extracted.
We propose several variations of the attentive NMT architecture bringing this meeting point back.
Empirical evaluation suggests that the better the translation quality, the worse the learned sentence representations serve in a wide range of classification and similarity tasks.
Modern learning algorithms use gradient descent updates to train inferential models that best explain data.
Scaling these approaches to massive data sizes requires proper distributed gradient descent schemes where distributed worker nodes compute partial gradients based on their partial and local data sets, and send the results to a master node where all the computations are aggregated into a full gradient and the learning model is updated.
However, a major performance bottleneck that arises is that some of the worker nodes may run slow.
These nodes, a.k.a. stragglers, can significantly slow down computation, as the slowest node may dictate the overall computational time.
We propose a distributed computing scheme, called Batched Coupon's Collector (BCC) to alleviate the effect of stragglers in gradient methods.
We prove that our BCC scheme is robust to a near optimal number of random stragglers.
We also empirically demonstrate that our proposed BCC scheme reduces the run-time by up to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation strategies.
We also generalize the proposed BCC scheme to minimize the completion time when implementing gradient descent-based algorithms over heterogeneous worker nodes.
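The coupon-collector mechanism underlying a batched scheme is easy to simulate: with the data split into n batches and each responding worker returning a random batch, the master only needs to wait until every batch has been received at least once, and can ignore the remaining (straggling) workers. A minimal simulation with hypothetical parameters:

```python
import random

def workers_needed(n_batches, rng):
    """Draw random batches until every batch is received at least once
    (the coupon-collector process at the heart of batched schemes)."""
    seen, draws = set(), 0
    while len(seen) < n_batches:
        seen.add(rng.randrange(n_batches))
        draws += 1
    return draws

def simulate_bcc(n_batches=10, trials=2000, seed=0):
    """Average number of worker responses needed to cover all batches."""
    rng = random.Random(seed)
    return sum(workers_needed(n_batches, rng) for _ in range(trials)) / trials

# Classical result: about n * H_n responses suffice on average.
avg = simulate_bcc()
expected = 10 * sum(1 / k for k in range(1, 11))   # n * H_n ≈ 29.29
print(round(avg, 1), round(expected, 1))
```

The gap between n (the minimum possible) and n·H_n is the redundancy price paid for tolerating a near optimal number of random stragglers.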
It is useful to automatically transform an image from its original form to some synthetic form (style, partial contents, etc.) while keeping the original structure or semantics.
We define this requirement as the "image-to-image translation" problem, and propose a general approach to achieve it based on deep convolutional and conditional generative adversarial networks (GANs), which have achieved phenomenal success in learning to map noise inputs to images since 2014.
In this work, we develop a two-step (unsupervised) learning method to translate images between different domains using unlabeled images without specifying any correspondence between them, thereby avoiding the cost of acquiring labeled data.
Compared with prior works, we demonstrate the generality of our model, by which a variety of translations can be conducted by a single type of model.
Such capability is desirable in applications like bidirectional translation.
Facial expressions play a significant role in human communication and behavior.
Psychologists have long studied the relationship between facial expressions and emotions.
Paul Ekman et al., devised the Facial Action Coding System (FACS) to taxonomize human facial expressions and model their behavior.
The ability to recognize facial expressions automatically enables novel applications in fields like human-computer interaction, social gaming, and psychological research.
There has been tremendously active research in this field, with several recent papers utilizing convolutional neural networks (CNNs) for feature extraction and inference.
In this paper, we employ CNN understanding methods to study the relation between the features these computational networks use and the FACS Action Units (AUs).
We verify our findings on the Extended Cohn-Kanade (CK+), NovaEmotions and FER2013 datasets.
We apply these models to various tasks and tests using transfer learning, including cross-dataset validation and cross-task performance.
Finally, we exploit the nature of the FER based CNN models for the detection of micro-expressions and achieve state-of-the-art accuracy using a simple long-short-term-memory (LSTM) recurrent neural network (RNN).
The Cloud-Radio Access Network (C-RAN) cellular architecture relies on the transfer of complex baseband signals to and from a central unit (CU) over digital fronthaul links to enable the virtualization of the baseband processing functionalities of distributed radio units (RUs).
The standard design of digital fronthauling is based on either scalar quantization or on more sophisticated point-to-point compression techniques operating on baseband signals.
Motivated by network-information theoretic results, techniques for fronthaul quantization and compression that improve over point-to-point solutions by allowing for joint processing across multiple fronthaul links at the CU have been recently proposed for both the uplink and the downlink.
For the downlink, a form of joint compression, known in network information theory as multivariate compression, was shown to be advantageous under a non-constructive asymptotic information-theoretic framework.
In this paper, instead, the design of a practical symbol-by-symbol fronthaul quantization algorithm that implements the idea of multivariate compression is investigated for the C-RAN downlink.
As compared to current standards, the proposed multivariate quantization (MQ) only requires changes in the CU processing while no modification is needed at the RUs.
The algorithm is extended to enable the joint optimization of downlink precoding and quantization, reduced-complexity MQ via successive block quantization, and variable-length compression.
Numerical results, which include performance evaluations over standard cellular models, demonstrate the advantages of MQ and the merits of a joint optimization with precoding.
Municipal solid waste management (MSWM) is a challenging issue of urban development in developing countries.
Since each country has a different socio-economic and environmental background, a particular disposal method might not be accepted as the optimal choice everywhere.
The selection of a suitable disposal method in MSWM under vague and imprecise information can be considered a multi-criteria decision-making (MCDM) problem.
In the present paper, TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) methodology is extended based on credibility theory for evaluating the performances of MSW disposal methods under some criteria fixed by experts.
The proposed model helps decision makers to choose a preferable alternative for their municipal area.
A sensitivity analysis using the proposed model confirms this conclusion.
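For reference, classical (crisp) TOPSIS, which the credibility-theory extension builds on, can be sketched in a few lines; the disposal methods, criteria, and weights below are hypothetical:

```python
import math

def topsis(matrix, weights, benefit):
    """Rank alternatives with classical TOPSIS.

    matrix[i][j]: score of alternative i on criterion j; benefit[j] is True
    if larger is better. Returns closeness scores in [0, 1], higher = better.
    """
    m, n = len(matrix), len(matrix[0])
    # Vector-normalize each column, then apply criterion weights.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(m))) for j in range(n)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n)] for i in range(m)]
    ideal = [max(col) if benefit[j] else min(col)
             for j, col in enumerate(zip(*v))]
    anti = [min(col) if benefit[j] else max(col)
            for j, col in enumerate(zip(*v))]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal solution
        d_neg = math.dist(row, anti)    # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical disposal methods scored on cost (lower is better), capacity,
# and environmental safety (higher is better).
methods = [[30, 7, 8],   # composting
           [20, 9, 4],   # landfill
           [50, 8, 9]]   # incineration
scores = topsis(methods, weights=[0.4, 0.3, 0.3], benefit=[False, True, True])
print(max(range(3), key=lambda i: scores[i]))  # index of the preferred method
```

The credibility-theory extension in the paper replaces the crisp scores with fuzzy assessments but keeps the same distance-to-ideal ranking logic.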
In contrast to the prevalent assumption of rich multipath in information theoretic analysis of wireless channels, physical channels exhibit sparse multipath, especially at large bandwidths.
We propose a model for sparse multipath fading channels and present results on the impact of sparsity on non-coherent capacity and reliability in the wideband regime.
A key implication of sparsity is that the statistically independent degrees of freedom in the channel, that represent the delay-Doppler diversity afforded by multipath, scale at a sub-linear rate with the signal space dimension (time-bandwidth product).
Our analysis is based on a training-based communication scheme that uses short-time Fourier (STF) signaling waveforms.
Sparsity in delay-Doppler manifests itself as time-frequency coherence in the STF domain.
From a capacity perspective, sparse channels are asymptotically coherent: the gap between coherent and non-coherent extremes vanishes in the limit of large signal space dimension without the need for peaky signaling.
From a reliability viewpoint, there is a fundamental tradeoff between channel diversity and learnability that can be optimized to maximize the error exponent at any rate by appropriately choosing the signaling duration as a function of bandwidth.
The ability to activate and manage effective collaborations is becoming an increasingly important criterion in policies on academic career advancement.
The rise of such policies leads to the development of indicators that permit measurement of the propensity to collaborate for academics of different ranks, and to the examination of the role of several variables in collaboration, first among these being the researchers' disciplines.
In this work we apply an innovative bibliometric approach based on individual propensity for collaboration to measure the differences in propensity across academic ranks, by discipline and for choice of collaboration forms - intramural, extramural domestic and international.
The analysis is based on the scientific production of Italian academics for the period 2006 to 2010, totaling over 200,000 publications indexed in Web of Science.
It shows that assistant professors register a propensity for intramural collaboration that is clearly greater than for professors of higher ranks.
Conversely, the higher ranks register a greater propensity to collaborate at the international level, though not quite so clearly.
In this work, we analyze the performance of the uplink (UL) of a massive MIMO network considering an asymptotically large number of antennas at base stations (BSs).
We model the locations of BSs as a homogeneous Poisson point process (PPP) and assume that their service regions are limited to their respective Poisson-Voronoi cells (PVCs).
Further, for each PVC, based on a threshold radius, we model the cell center (CC) region as the Johnson-Mehl (JM) cell of its BS, while the rest of the PVC is deemed the cell edge (CE) region.
The CC and CE users are located uniformly at random independently of each other in the JM cell and CE region, respectively.
In addition, we consider a fractional pilot reuse (FPR) scheme where two different sets of pilot sequences are used for CC and CE users with the objective of reducing the interference due to pilot contamination for CE users.
Based on the above system model, we derive analytical expressions for the UL signal-to-interference-and-noise ratio (SINR) coverage probability and average spectral efficiency (SE) for randomly selected CC and CE users.
In addition, we present an approximate expression for the average cell SE.
One of the key intermediate results in our analysis is the approximate but accurate characterization of the distributions of the CC and CE areas of a typical cell.
Another key intermediate step is the accurate characterization of the pair correlation functions of the point processes formed by the interfering CC and CE users that subsequently enables the coverage probability analysis.
From our system analysis, we present a partitioning rule for the number of pilot sequences to be used for CC and CE users, as a function of the threshold radius, that improves the average CE user SE while achieving a CC user SE similar to that of unity pilot reuse.
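The spatial setup described above can be illustrated with a small simulation sketch. All numerical values here (density, window size, threshold radius) are arbitrary illustrative choices, not the paper's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample BS locations as one realization of a homogeneous PPP on a square window.
lam_bs, side = 1e-5, 2000.0                 # BS density (per m^2), window side (m)
n_bs = rng.poisson(lam_bs * side**2)
bs = rng.uniform(0.0, side, size=(n_bs, 2))

# Drop users uniformly; each is served by its nearest BS, i.e., the BS of the
# Poisson-Voronoi cell containing it.
users = rng.uniform(0.0, side, size=(500, 2))
dists = np.linalg.norm(users[:, None, :] - bs[None, :, :], axis=2)
serving = dists.argmin(axis=1)              # index of each user's serving BS
dist_to_bs = dists.min(axis=1)

# Split each cell into a cell-center (CC) disk of threshold radius r0 and a
# cell-edge (CE) remainder, mirroring the CC/CE classification above.
r0 = 150.0
is_cc = dist_to_bs < r0
cc_fraction = is_cc.mean()
```

Sweeping `r0` in such a simulation shows how the threshold radius trades off the sizes of the CC and CE user populations, which is what drives the pilot-partitioning rule.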
An Application Specific Instruction set Processor (ASIP) is an important component in designing embedded systems.
One of the problems in designing an instruction set for such processors is determining the number of registers needed in the processor to optimize computational time and cost.
The performance of a processor may fall short due to register spilling, which is caused by the lack of available registers in a processor.
From a design perspective, avoiding register spilling by choosing an appropriate number of registers for an ASIP yields processors with high performance and low power consumption.
However, it has not yet been clearly established how the required number of registers varies across application domains.
In this paper, we evaluated whether different application domains have any significant effect on register spilling, and therefore on processor performance, so that ASIPs for different application domains can be built with different numbers of registers rather than a fixed register set.
Such utilization of registers will result in processors with high performance, low cost and low power consumption.
OpenStreetMap offers a valuable source of worldwide geospatial data useful to urban researchers.
This study uses the OSMnx software to automatically download and analyze 27,000 US street networks from OpenStreetMap at metropolitan, municipal, and neighborhood scales - namely, every US city and town, census urbanized area, and Zillow-defined neighborhood.
It presents empirical findings on US urban form and street network characteristics, emphasizing measures relevant to graph theory, transportation, urban design, and morphology such as structure, connectedness, density, centrality, and resilience.
In the past, street network data acquisition and processing have been challenging and ad hoc.
This study illustrates the use of OSMnx and OpenStreetMap to consistently conduct street network analysis with extremely large sample sizes, with clearly defined network definitions and extents for reproducibility, and using nonplanar, directed graphs.
These street networks and measures data have been shared in a public repository for other researchers to use.
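The graph-theoretic measures mentioned above (density, connectedness, centrality) can be sketched on a toy directed street graph; this is a minimal illustration of the kind of analysis involved, not OSMnx's actual implementation, and the node names and edges are invented:

```python
# Toy directed street graph: intersections as nodes, one-way street segments
# as directed edges.
edges = {("a", "b"), ("b", "c"), ("c", "a"), ("b", "d"), ("d", "b")}
nodes = {n for e in edges for n in e}

# A density-style measure: average number of outgoing street segments per node.
out_deg = {n: 0 for n in nodes}
for (u, _v) in edges:
    out_deg[u] += 1
avg_out_degree = sum(out_deg.values()) / len(nodes)

# Connectedness: is every node reachable from every other node (strong
# connectivity), as required for a drivable directed network?
def reachable(src):
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

strongly_connected = all(reachable(n) == nodes for n in nodes)
```

Working on nonplanar, directed graphs, as the study emphasizes, matters precisely because measures like these differ between the directed network and its undirected planar simplification.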
We design and implement the first private and anonymous decentralized crowdsourcing system ZebraLancer.
It realizes fair exchange (i.e., security against malicious workers and dishonest requesters) without using any third-party arbiter.
More importantly, it overcomes two fundamental challenges of decentralization: data leakage and identity breach.
First, our outsource-then-prove methodology resolves the critical tension between blockchain transparency and data confidentiality without sacrificing the fairness of exchange.
ZebraLancer ensures that: a requester will not pay more than what the data deserve, according to a policy announced when her task is published through the blockchain; each worker who submits data to the blockchain indeed gets a payment based on the policy; and these properties are realized not only without a central arbiter, but also without leaking the data to the blockchain network.
Furthermore, the blockchain transparency might allow one to infer private information of workers/requesters through their participation history.
ZebraLancer solves the problem by allowing anonymous participations without surrendering user accountability.
Specifically, workers cannot misuse anonymity to submit multiple times to reap rewards, and an anonymous requester cannot maliciously submit colluded answers to herself to repudiate payments.
The idea behind this is a subtle form of linkability: if one authenticates twice within a task, everybody can tell; otherwise, the user stays anonymous.
To realize such delicate linkability, we put forth a novel cryptographic notion, the common-prefix-linkable anonymous authentication.
Finally, we implement our protocol for a common image annotation task and deploy it in a test net of Ethereum.
The experiment results show the applicability of our protocol and highlight the subtleties of tailoring the protocol to be compatible with existing real-world open blockchains.
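The per-task linkability idea can be illustrated with a toy hash-based sketch; this is an assumption-laden stand-in for intuition only, not the paper's actual common-prefix-linkable anonymous authentication construction (which must also hide the seed and prove well-formedness cryptographically):

```python
import hashlib

# Toy illustration: a user derives a per-task tag from the task id (the
# "common prefix") and a private seed. Two authentications in the SAME task
# yield the same tag (hence are linkable and detectable by everybody), while
# tags across different tasks reveal no link without the seed.
def auth_tag(task_id: str, secret_seed: bytes) -> str:
    return hashlib.sha256(task_id.encode() + secret_seed).hexdigest()

seed = b"worker-private-seed"          # hypothetical worker secret
t1 = auth_tag("task-42", seed)
t2 = auth_tag("task-42", seed)         # second submission to the same task
t3 = auth_tag("task-99", seed)         # submission to a different task

double_submission_detected = (t1 == t2)
cross_task_linkable = (t1 == t3)
```

In the real protocol the tag would be accompanied by a zero-knowledge proof that it was formed correctly, so anonymity holds even against the requester.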
In this paper, we design an analytically and experimentally better online energy and job scheduling algorithm with the objective of maximizing net profit for a service provider in green data centers.
We first study the previously known algorithms and conclude that these online algorithms have provably poor performance in their worst-case scenarios.
To guarantee an online algorithm's performance in hindsight, we design a randomized algorithm to schedule energy and jobs in the data centers and prove the algorithm's expected competitive ratio in various settings.
Our algorithm is theoretically sound and outperforms the previously known algorithms in many settings, using both real traces and simulated data.
An optimal offline algorithm is also implemented as an empirical benchmark.
A critical challenge in scene change detection is that noisy changes generated by varying illumination, shadows, and camera viewpoint make the variations of a scene difficult to define and measure, since the noisy changes and the semantic ones are entangled.
Following the intuitive idea of detecting changes by directly comparing dissimilarities between a pair of features, we propose a novel fully convolutional siamese metric network (CosimNet) to measure changes by customizing implicit metrics.
To learn more discriminative metrics, we utilize contrastive loss to reduce the distance between the unchanged feature pairs and to enlarge the distance between the changed feature pairs.
Specifically, to address the issue of large viewpoint differences, we propose Thresholded Contrastive Loss (TCL) with a more tolerant strategy to punish noisy changes.
We demonstrate the effectiveness of the proposed approach with experiments on three challenging datasets: CDnet, PCD2015, and VL-CMU-CD.
Our approach is robust to many challenging conditions, such as illumination changes and large viewpoint differences caused by camera motion and zooming.
In addition, we incorporate the distance metric into the segmentation framework and validate the effectiveness through visualization of change maps and feature distribution.
The source code is available at https://github.com/gmayday1997/ChangeDet.
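The two pairwise losses described above can be sketched as follows. The functional forms are plausible reconstructions from the description (margin and threshold values are arbitrary); the paper's exact formulation of TCL may differ:

```python
import numpy as np

# d: feature distance of a pair; y = 0 for an unchanged pair, 1 for a changed pair.
def contrastive_loss(d, y, margin=2.0):
    d, y = np.asarray(d, dtype=float), np.asarray(y, dtype=float)
    unchanged = (1 - y) * d**2                        # pull unchanged pairs together
    changed = y * np.maximum(margin - d, 0.0)**2      # push changed pairs apart
    return (unchanged + changed).mean()

def thresholded_contrastive_loss(d, y, margin=2.0, tau=0.3):
    # Tolerance threshold tau: unchanged pairs are only penalized for the part
    # of their distance exceeding tau, absorbing noisy (e.g., viewpoint) changes.
    d, y = np.asarray(d, dtype=float), np.asarray(y, dtype=float)
    unchanged = (1 - y) * np.maximum(d - tau, 0.0)**2
    changed = y * np.maximum(margin - d, 0.0)**2
    return (unchanged + changed).mean()

d = [0.2, 0.5, 2.5]     # two unchanged pairs with small noise, one changed pair
y = [0, 0, 1]
base = contrastive_loss(d, y)
tcl = thresholded_contrastive_loss(d, y)
```

On this toy batch the thresholded variant assigns a smaller penalty to the slightly perturbed unchanged pairs, which is exactly the "more tolerant" behavior motivating TCL.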
In this paper, we proposed a novel Probabilistic Attribute Tree-CNN (PAT-CNN) to explicitly deal with the large intra-class variations caused by identity-related attributes, e.g., age, race, and gender.
Specifically, a novel PAT module with an associated PAT loss was proposed to learn features in a hierarchical tree structure organized according to attributes, where the final features are less affected by the attributes.
Then, expression-related features are extracted from leaf nodes.
Samples are probabilistically assigned to tree nodes at different levels such that expression-related features can be learned from all samples weighted by probabilities.
We further proposed a semi-supervised strategy to learn the PAT-CNN from limited attribute-annotated samples to make the best use of available data.
Experimental results on five facial expression datasets have demonstrated that the proposed PAT-CNN outperforms the baseline models by explicitly modeling attributes.
More impressively, the PAT-CNN using a single model achieves the best performance for faces in the wild on the SFEW dataset, compared with the state-of-the-art methods using an ensemble of hundreds of CNNs.
Most traditional work on intrinsic image decomposition relies on deriving priors about scene characteristics.
On the other hand, recent research uses deep learning models as in-and-out black boxes and does not consider the well-established, traditional image formation process as the basis of the intrinsic learning process.
As a consequence, although current deep learning approaches show superior performance on quantitative benchmarks, traditional approaches are still dominant in achieving high qualitative results.
In this paper, the aim is to exploit the best of the two worlds.
A method is proposed that (1) is empowered by deep learning capabilities, (2) considers a physics-based reflection model to steer the learning process, and (3) exploits the traditional approach to obtain intrinsic images by exploiting reflectance and shading gradient information.
The proposed model is fast to compute and allows for the integration of all intrinsic components.
To train the new model, an object-centered large-scale dataset with intrinsic ground-truth images is created.
The evaluation results demonstrate that the new model outperforms existing methods.
Visual inspection shows that the image formation loss function augments color reproduction and the use of gradient information produces sharper edges.
Datasets, models and higher resolution images are available at https://ivi.fnwi.uva.nl/cv/retinet.
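The physics-based reflection model steering the learning can be sketched with the standard Lambertian decomposition, where an image is the per-pixel product of reflectance and shading; the loss below is a generic reconstruction penalty in that spirit, not the authors' exact loss:

```python
import numpy as np

rng = np.random.default_rng(1)

# Lambertian image formation: image I = reflectance R (albedo) * shading S.
R = rng.uniform(0.2, 0.9, size=(4, 4, 3))   # per-pixel RGB reflectance
S = rng.uniform(0.1, 1.0, size=(4, 4, 1))   # grayscale shading, broadcast over RGB
I = R * S                                    # rendered image

# An "image formation loss" penalizes predictions whose product fails to
# reconstruct the input image (pred_R, pred_S stand in for network outputs).
def image_formation_loss(img, pred_R, pred_S):
    return float(np.mean((img - pred_R * pred_S) ** 2))

perfect = image_formation_loss(I, R, S)      # ground truth reconstructs exactly
noisy = image_formation_loss(I, R + 0.1, S)  # a biased reflectance is penalized
```

Because the loss couples the two predicted components through their product, it lets gradient information about reflectance and shading constrain each other, which is the stated motivation for integrating all intrinsic components.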
Skin lesion segmentation is one of the first steps towards automatic Computer-Aided Diagnosis of skin cancer.
Vast variety in the appearance of the skin lesion makes this task very challenging.
The contribution of this paper is to apply a powerful foreground extraction technique called GrabCut to automatic skin lesion segmentation, with minimal human interaction, in HSV color space.
Preprocessing was performed to remove the outer black border.
Jaccard Index was measured to evaluate the performance of the segmentation method.
On average, a Jaccard Index of 0.71 was achieved on 1000 images from the ISIC 2017 Challenge Training Dataset.
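The Jaccard Index used as the evaluation metric is the intersection-over-union of the predicted and ground-truth lesion masks; a minimal implementation (the mask values are invented for illustration):

```python
import numpy as np

# Jaccard index |A ∩ B| / |A ∪ B| over binary segmentation masks.
def jaccard_index(pred, truth):
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, truth).sum() / union)

pred  = np.array([[1, 1, 0], [0, 1, 0]])   # predicted lesion mask
truth = np.array([[1, 0, 0], [0, 1, 1]])   # ground-truth mask
score = jaccard_index(pred, truth)          # intersection 2, union 4
```

A score of 1.0 means the masks coincide exactly; the paper's reported 0.71 is this quantity averaged over the 1000 test images.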
This paper presents new alternatives to the well-known Bloom filter data structure.
The Bloom filter, a compact data structure supporting set insertion and membership queries, has found wide application in databases, storage systems, and networks.
Because the Bloom filter performs frequent random reads and writes, it is used almost exclusively in RAM, limiting the size of the sets it can represent.
This paper first describes the quotient filter, which supports the basic operations of the Bloom filter, achieving roughly comparable performance in terms of space and time, but with better data locality.
Operations on the quotient filter require only a small number of contiguous accesses.
The quotient filter has other advantages over the Bloom filter: it supports deletions, it can be dynamically resized, and two quotient filters can be efficiently merged.
The paper then gives two data structures, the buffered quotient filter and the cascade filter, which exploit the quotient filter advantages and thus serve as SSD-optimized alternatives to the Bloom filter.
The cascade filter has better asymptotic I/O performance than the buffered quotient filter, but the buffered quotient filter outperforms the cascade filter on small to medium data sets.
Both data structures significantly outperform recently-proposed SSD-optimized Bloom filter variants, such as the elevator Bloom filter, buffered Bloom filter, and forest-structured Bloom filter.
In experiments, the cascade filter and buffered quotient filter performed insertions 8.6-11 times faster than the fastest Bloom filter variant and performed lookups 0.94-2.56 times faster.
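The core quotienting idea can be sketched in a simplified form. This toy version stores each quotient's remainders in a per-bucket set; the real quotient filter instead packs remainders into contiguous slots with a few metadata bits per slot, which is what yields its data locality:

```python
import hashlib

Q_BITS, R_BITS = 16, 16   # fingerprint = 32 bits: quotient | remainder

def fingerprint(item: str) -> int:
    # 32-bit fingerprint derived from a hash of the item.
    return int.from_bytes(hashlib.sha256(item.encode()).digest()[:4], "big")

class ToyQuotientFilter:
    def __init__(self):
        self.buckets = [set() for _ in range(1 << Q_BITS)]

    def insert(self, item):
        f = fingerprint(item)
        self.buckets[f >> R_BITS].add(f & ((1 << R_BITS) - 1))

    def delete(self, item):  # supported, unlike a Bloom filter
        f = fingerprint(item)
        self.buckets[f >> R_BITS].discard(f & ((1 << R_BITS) - 1))

    def may_contain(self, item):
        f = fingerprint(item)
        return (f & ((1 << R_BITS) - 1)) in self.buckets[f >> R_BITS]

qf = ToyQuotientFilter()
qf.insert("alpha")
qf.insert("beta")
qf.delete("beta")
```

As with the real structure, lookups can return false positives when two items share a full fingerprint, but never false negatives; deletion works because the exact remainder, not just a bit, is stored.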
Autonomous sorting is a crucial task in industrial robotics which can be very challenging depending on the expected amount of automation.
Usually, to decide where to sort an object, the system needs to solve either an instance retrieval (known object) or a supervised classification (predefined set of classes) problem.
In this paper, we introduce a new decision making module, where the robotic system chooses how to sort the objects in an unsupervised way.
We call this problem Unsupervised Robotic Sorting (URS) and propose an implementation on an industrial robotic system, using deep CNN feature extraction and standard clustering algorithms.
We carry out extensive experiments on various standard datasets to demonstrate the efficiency of the proposed image clustering pipeline.
To evaluate the robustness of our URS implementation, we also introduce a complex real world dataset containing images of objects under various background and lighting conditions.
This dataset is used to fine tune the design choices (CNN and clustering algorithm) for URS.
Finally, we propose a method combining our pipeline with ensemble clustering to use multiple images of each object.
This redundancy of information about the objects is shown to improve the clustering results.
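The core of the URS pipeline, deep feature extraction followed by standard clustering, can be sketched as below. The "CNN features" here are synthetic placeholders (in the paper they come from a pretrained deep CNN), and the clustering is a minimal k-means with a simple deterministic initialization:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for CNN features of images of two object types: two well-separated
# Gaussian clusters in an 8-dimensional feature space.
feats = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 8)),   # images of one object type
    rng.normal(5.0, 0.3, size=(20, 8)),   # images of another object type
])

def kmeans(X, centers, iters=10):
    centers = centers.copy()
    for _ in range(iters):
        # Assign each feature vector to its nearest center.
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        # Recompute each center as the mean of its assigned points.
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Seed one center in each group (data points 0 and -1) for a deterministic demo.
labels = kmeans(feats, feats[[0, -1]])
```

The ensemble-clustering extension mentioned above would run such a clustering over multiple views of each object and combine the resulting partitions.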
Data exchange is the problem of transforming data that is structured under a source schema into data structured under another schema, called the target schema, so that both the source and target data satisfy the relationship between the schemas.
Even though the formal framework of data exchange for relational database systems is well established, it does not immediately carry over to the setting of temporal data, which necessitates reasoning over unbounded periods of time.
In this work, we study data exchange for temporal data.
We first motivate the need for two views of temporal data: the concrete view, which depicts how temporal data is compactly represented and on which the implementations are based, and the abstract view, which defines the semantics of temporal data as a sequence of snapshots.
We first extend the chase procedure for the abstract view to have a conceptual basis for the data exchange for temporal databases.
Considering non-temporal source-to-target tuple generating dependencies and equality generating dependencies, the chase algorithm can be applied on each snapshot independently.
Then we define a chase procedure (called c-chase) on concrete instances and show the result of c-chase on a concrete instance is semantically aligned with the result of chase on the corresponding abstract instance.
In order to interpret intervals as constants while checking if a dependency or a query is satisfied by a concrete database, we will normalize the instance with respect to the dependency or the query.
To obtain the semantic alignment, the nulls in the concrete view are annotated with temporal information.
Furthermore, we show that the result of the concrete chase provides a foundation for query answering.
We define naive evaluation on the result of the c-chase and show it produces certain answers.
Machine learning-based malware detection dominates current security defense approaches for Android apps.
However, due to the evolution of Android platforms and malware, such existing techniques are widely limited by their need for constant retraining, which is costly, and by their reliance on new malware samples that may not be available in a timely manner.
As a result, new and emerging malware slips through, as seen from the continued surging of malware in the wild.
Thus, a more practical detector needs not only to be accurate but, more critically, to be able to sustain its capabilities over time without frequent retraining.
In this paper, we study how Android apps evolve as a population over time, in terms of their behaviors related to accesses to sensitive information and operations.
We first perform a longitudinal characterization of 6K benign and malicious apps developed across seven years, with focus on these sensitive accesses in app executions.
Our study reveals, during the long evolution, a consistent, clear differentiation between malware and benign apps regarding such accesses, measured by relative statistics of relevant method calls.
Following these findings, we developed DroidSpan, a novel classification system based on a new behavioral profile for Android apps.
Through an extensive evaluation, we showed that DroidSpan can not only effectively detect malware but sustain high detection accuracy (93% F1 measure) for four years (with 81% F1 for five years).
Through a dedicated study, we also showed its resiliency to sophisticated evasion schemes.
By comparing to a state-of-the-art malware detector, we demonstrated the largely superior sustainability of our approach at reasonable costs.
Deep generative architectures provide a way to model not only images, but also complex, 3-dimensional objects, such as point clouds.
In this work, we present a novel method to obtain meaningful representations of 3D shapes that can be used for clustering and reconstruction.
Contrary to existing methods for 3D point cloud generation, which train separate decoupled models for representation learning and generation, our approach is the first end-to-end solution that simultaneously learns a latent space of representations and generates 3D shapes from it.
To achieve this goal, we extend a deep Adversarial Autoencoder model (AAE) to accept 3D input and create 3D output.
Thanks to our end-to-end training regime, the resulting method, called 3D Adversarial Autoencoder (3dAAE), obtains either a binary or a continuous latent space that covers a much wider portion of the training data distribution, thereby allowing smooth interpolation between shapes.
Finally, our extensive quantitative evaluation shows that 3dAAE provides state-of-the-art results on a set of benchmark tasks.
In this paper, we derive a temporal arbitrage policy for storage via reinforcement learning.
Real-time price arbitrage is an important source of revenue for storage units, but designing good strategies has proven to be difficult because of the highly uncertain nature of prices.
Instead of current model predictive or dynamic programming approaches, we use reinforcement learning to design an optimal arbitrage policy.
This policy is learned through repeated charge and discharge actions performed by the storage unit, which update a value matrix.
We design a reward function that not only reflects the instant profit of charge/discharge decisions but also incorporates history information.
Simulation results demonstrate that our designed reward function leads to significant performance improvement compared with existing algorithms.
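The value-matrix idea can be sketched with plain tabular Q-learning on a toy two-price, two-state-of-charge arbitrage problem. The state space, price process, and reward below are illustrative assumptions, not the paper's setup (which also incorporates history into the reward):

```python
import numpy as np

rng = np.random.default_rng(0)

PRICES = [10.0, 30.0]          # low / high electricity price levels
ACTIONS = [-1, 0, 1]           # discharge, idle, charge
Q = np.zeros((len(PRICES), 2, len(ACTIONS)))   # value matrix: (price, SoC, action)
alpha, gamma, eps = 0.1, 0.95, 0.1

def step(p_idx, soc, a):
    # Charging buys one unit at the current price; discharging sells one unit.
    if a == 1 and soc == 0:
        return -PRICES[p_idx], 1
    if a == -1 and soc == 1:
        return PRICES[p_idx], 0
    return 0.0, soc            # idle or infeasible actions change nothing

p, soc = 0, 0
for _ in range(20000):
    a_idx = rng.integers(3) if rng.random() < eps else int(Q[p, soc].argmax())
    reward, soc2 = step(p, soc, ACTIONS[a_idx])
    p2 = int(rng.integers(len(PRICES)))        # prices fluctuate randomly
    Q[p, soc, a_idx] += alpha * (reward + gamma * Q[p2, soc2].max()
                                 - Q[p, soc, a_idx])
    p, soc = p2, soc2

charge_at_low = int(Q[0, 0].argmax())      # policy when price low, storage empty
discharge_at_high = int(Q[1, 1].argmax())  # policy when price high, storage full
```

After training, the greedy policy read off the value matrix buys at the low price and sells at the high price, which is the temporal arbitrage behavior the paper's richer reward design aims to learn under realistic price uncertainty.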
We introduce and analyze different strategies for the parallel-in-time integration method PFASST to recover from hard faults and subsequent data loss.
Since PFASST stores solutions at multiple time steps on different processors, information from adjacent steps can be used to recover after a processor has failed.
PFASST's multi-level hierarchy allows using the coarse level to correct the reconstructed solution, which can help to minimize overhead.
A theoretical model is devised linking overhead to the number of additional PFASST iterations required for convergence after a fault.
The potential efficiency of different strategies is assessed in terms of required additional iterations for examples of diffusive and advective type.
Large scale decentralized systems, such as P2P, sensor or IoT device networks are becoming increasingly common, and require robust protocols to address the challenges posed by the distribution of data and the large number of peers belonging to the network.
In this paper, we deal with the problem of mining frequent items in unstructured P2P networks.
This problem, of practical importance, has many useful applications.
We design P2PSS, a fully decentralized, gossip-based protocol for frequent items discovery, leveraging the Space-Saving algorithm.
We formally prove the protocol's correctness and a theoretical error bound.
Extensive experimental results clearly show that P2PSS provides very good accuracy and scalability, also in the presence of highly dynamic P2P networks with churning.
To the best of our knowledge, this is the first gossip-based distributed algorithm providing strong theoretical guarantees both for the approximate frequent items problem in unstructured P2P networks and for the frequency estimation of discovered frequent items.
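The Space-Saving algorithm that P2PSS builds on is simple enough to sketch directly; this is the standard sequential algorithm only, not the gossip-based distributed protocol:

```python
# Space-Saving: maintain at most k counters. On a miss with a full table, the
# minimum counter is reassigned to the new item and incremented, so every
# count overestimates the true frequency by at most the evicted minimum.
def space_saving(stream, k):
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters

stream = ["a"] * 9 + ["b"] * 6 + ["c", "d", "a"] + ["b"] * 2
summary = space_saving(stream, k=2)
```

With only k counters the summary still retains the truly frequent items ("a" and "b" here), which is the property the distributed protocol preserves under gossip aggregation.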
Mechanical learning is a form of computing that is based on a set of simple and fixed rules and can learn from incoming data.
A learning machine is a system that realizes mechanical learning.
Importantly, we emphasize that it is based on a set of simple and fixed rules, in contrast to what is commonly called machine learning: sophisticated software based on very complicated mathematical theory that often needs human intervention for fine-tuning and manual adjustments.
Here, we discuss some basic facts and principles of such systems, and try to lay down a framework for further study.
We propose two directions for approaching mechanical learning, analogous to the Church-Turing pair: one tries to realize a learning machine, while the other tries to describe mechanical learning well.
In an increasing number of domains it has been demonstrated that deep learning models can be trained using relatively large batch sizes without sacrificing data efficiency.
However, the limits of this massive data parallelism seem to differ from domain to domain, ranging from batches of tens of thousands in ImageNet to batches of millions in RL agents that play the game Dota 2.
To our knowledge there is limited conceptual understanding of why these limits to batch size differ or how we might choose the correct batch size in a new domain.
In this paper, we demonstrate that a simple and easy-to-measure statistic called the gradient noise scale predicts the largest useful batch size across many domains and applications, including a number of supervised learning datasets (MNIST, SVHN, CIFAR-10, ImageNet, Billion Word), reinforcement learning domains (Atari and Dota), and even generative model training (autoencoders on SVHN).
We find that the noise scale increases as the loss decreases over a training run and depends on the model size primarily through improved model performance.
Our empirically-motivated theory also describes the tradeoff between compute-efficiency and time-efficiency, and provides a rough model of the benefits of adaptive batch-size training.
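The statistic can be sketched on synthetic gradients. The two-batch-size estimators below are a reconstruction from the relation E[|G_B|^2] = |G|^2 + tr(Σ)/B and should be treated as an assumption about the method's details, not the authors' code; the simulated gradient model is also invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: per-example gradients = true gradient + isotropic noise.
dim, B_small, B_big = 50, 32, 1024
true_grad = rng.normal(size=dim)
sigma = 2.0                      # per-example gradient noise std (illustrative)

def batch_grad(B):
    return true_grad + rng.normal(scale=sigma, size=(B, dim)).mean(axis=0)

# Average squared norms over repeats to tame estimator variance in this demo.
gs2 = np.mean([np.sum(batch_grad(B_small) ** 2) for _ in range(200)])
gb2 = np.mean([np.sum(batch_grad(B_big) ** 2) for _ in range(200)])

# Unbias using E[|G_B|^2] = |G|^2 + tr(Sigma)/B at the two batch sizes.
G2 = (B_big * gb2 - B_small * gs2) / (B_big - B_small)    # estimate of |G|^2
trSigma = (gs2 - gb2) / (1 / B_small - 1 / B_big)         # estimate of tr(Sigma)
noise_scale = trSigma / G2                                # B_simple estimate

true_noise_scale = dim * sigma**2 / np.sum(true_grad**2)
```

The estimated noise scale tracks the ground-truth ratio tr(Σ)/|G|^2, and, per the abstract, this single number is what predicts the largest useful batch size for a workload.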
Human face analysis is an important task in computer vision.
According to cognitive-psychological studies, facial dynamics could provide crucial cues for face analysis.
The motion of a facial local region in facial expression is related to the motion of other facial local regions.
In this paper, a novel deep learning approach, named facial dynamics interpreter network, has been proposed to interpret the important relations between local dynamics for estimating facial traits from expression sequence.
The facial dynamics interpreter network is designed to be able to encode a relational importance, which is used for interpreting the relation between facial local dynamics and estimating facial traits.
By comparative experiments, the effectiveness of the proposed method has been verified.
The important relations between facial local dynamics are investigated by the proposed facial dynamics interpreter network in gender classification and age estimation.
Moreover, experimental results show that the proposed method outperforms the state-of-the-art methods in gender classification and age estimation.
Most existing approaches to training object detectors rely on fully supervised learning, which requires the tedious manual annotation of object locations in a training set.
Recently there has been an increasing interest in developing weakly supervised approaches to detector training, where the object location is not manually annotated but automatically determined based on binary (weak) labels indicating whether a training image contains the object.
This is a challenging problem because each image can contain many candidate object locations that only partially overlap the object of interest.
Existing approaches focus on how to best utilise the binary labels for object location annotation.
In this paper we propose to solve this problem from a very different perspective by casting it as a transfer learning problem.
Specifically, we formulate a novel transfer learning approach based on learning to rank, which effectively transfers a model for automatic annotation of object location from an auxiliary dataset to a target dataset with completely unrelated object categories.
We show that our approach outperforms existing state-of-the-art weakly supervised approaches to annotating objects in the challenging VOC dataset.
In this paper we study the facility leasing problem with penalties.
We present a primal-dual algorithm which is a 3-approximation, based on the algorithm by Nagarajan and Williamson for the facility leasing problem and on the algorithm by Charikar et al. for the facility location problem with penalties.
In this paper, we present a GPU implementation of a two-dimensional shallow water model.
Water simulations are useful for modeling floods, river/reservoir behavior, and dam break scenarios.
Our GPU implementation shows vast performance improvements over the original Fortran implementation.
By taking advantage of the GPU, researchers and engineers will be able to study water systems more efficiently and in greater detail.
This paper presents a practical approach towards implementing pathfinding algorithms on real-world and low-cost non-commercial hardware platforms.
While using robotics simulation platforms as a test-bed for our algorithms, we easily overlook real-world exogenous problems that are caused by external factors.
Such problems involve robot wheel slips, asynchronous motors, abnormal sensory data or unstable power sources.
These real-world dynamics can be very troublesome even when executing simple algorithms like a Wavefront planner or A-star search.
This paper addresses the design of techniques that are robust as well as reusable across hardware platforms, covering problems like controlling asynchronous drives, odometry offset issues, and handling abnormal sensory feedback.
The algorithm implementation medium and hardware design tools have been kept general in order to present our work as a serving platform for future researchers and robotics enthusiasts working in the field of path planning robotics.
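For reference, the A-star search mentioned above reduces to a few lines on an occupancy grid; this minimal planner (4-connected moves, Manhattan heuristic, toy grid) is the idealized algorithm whose on-robot execution the paper hardens against real-world effects:

```python
import heapq

# A* on a grid: cells with value 1 are obstacles; moves are 4-connected.
def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible heuristic
    frontier = [(h(start), 0, start, [start])]               # (f, g, cell, path)
    seen = set()
    while frontier:
        _f, g, cur, path = heapq.heappop(frontier)
        if cur == goal:
            return path
        if cur in seen:
            continue
        seen.add(cur)
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                if (nr, nc) not in seen:
                    heapq.heappush(frontier, (g + 1 + h((nr, nc)), g + 1,
                                              (nr, nc), path + [(nr, nc)]))
    return None  # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))
```

On real hardware the same plan must survive wheel slip and odometry drift, which is why the paper focuses on execution-layer robustness rather than the planner itself.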
In this paper, we study the problem of controlling a two-dimensional robotic swarm with the purpose of achieving high level and complex spatio-temporal patterns.
We use a rich spatio-temporal logic that is capable of describing a wide range of time varying and complex spatial configurations, and develop a method to encode such formal specifications as a set of mixed integer linear constraints, which are incorporated into a mixed integer linear programming problem.
We plan trajectories for each individual robot such that the whole swarm satisfies the spatio-temporal requirements, while optimizing total robot movement and/or a metric that shows how strongly the swarm trajectory resembles given spatio-temporal behaviors.
An illustrative case study is included.
We present the first formal algebraic specification of a hypertext reference model.
It is based on the well-known Dexter Hypertext Reference Model and includes modifications with respect to the development of hypertext since the WWW came up.
Our hypertext model was developed as a product model with the aim of automatically supporting the design process, and it is extended to a model of hypertext systems in order to describe the state transitions in this process.
While the specification should be easy to read for non-experts in algebraic specification, it guarantees a unique understanding and enables a close connection to logic-based development and verification.
We characterize the statistical bootstrap for the estimation of information-theoretic quantities from data, with particular reference to its use in the study of large-scale social phenomena.
Our methods allow one to preserve, approximately, the underlying axiomatic relationships of information theory---in particular, consistency under arbitrary coarse-graining---that motivate use of these quantities in the first place, while providing reliability comparable to the state of the art for Bayesian estimators.
We show how information-theoretic quantities allow for rigorous empirical study of the decision-making capacities of rational agents and the time-asymmetric flows of information in distributed systems.
We provide illustrative examples by reference to ongoing collaborative work on the semantic structure of the British Criminal Court system and the conflict dynamics of the contemporary Afghanistan insurgency.
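The bootstrap characterization described above can be sketched for the simplest information-theoretic quantity, plug-in entropy: resample the data with replacement and recompute the estimator to obtain an uncertainty interval. The data and distribution here are invented for illustration, and the paper's methods additionally correct for consistency under coarse-graining:

```python
import numpy as np

rng = np.random.default_rng(0)

# Plug-in (maximum-likelihood) entropy of an empirical distribution, in bits.
def plugin_entropy(samples):
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Illustrative categorical data with true entropy ~1.85 bits.
data = rng.choice(["a", "b", "c", "d"], size=2000, p=[0.4, 0.3, 0.2, 0.1])
point = plugin_entropy(data)

# Nonparametric bootstrap: resample with replacement, recompute, take percentiles.
boot = [plugin_entropy(rng.choice(data, size=len(data), replace=True))
        for _ in range(300)]
lo, hi = np.percentile(boot, [2.5, 97.5])
```

The resulting interval quantifies sampling uncertainty in the entropy estimate, which is the kind of reliability statement the paper extends to mutual information and time-asymmetric information flows.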
Prices of NAND flash memories are falling drastically due to market growth and fabrication process mastering, while research efforts on the technological side, in terms of endurance and density, remain very active.
NAND flash memories are becoming the most important storage media in mobile computing and tend to be less confined to this area.
The major constraint of this technology is the limited number of possible erase operations per block, which tends to quickly wear the memory out.
To cope with this issue, state-of-the-art solutions implement wear leveling policies to level the wear out of the memory and so increase its lifetime.
These policies are integrated into the Flash Translation Layer (FTL) and contribute greatly to decreasing write performance.
In this paper, we propose to reduce the flash memory wear-out problem and improve its performance by absorbing the erase operations through a dual-cache system that replaces the FTL wear leveling and garbage collection services.
We justify this idea by proposing a first performance evaluation of an exclusively cache based system for embedded flash memories.
Unlike wear leveling schemes, the proposed cache solution reduces the total number of erase operations performed on the media by absorbing them in the cache, for workloads exhibiting a minimal global sequential rate.
Social media has played an important role in shaping political discourse over the last decade.
At the same time, it is often perceived to have increased political polarization, thanks to the scale of discussions and their public nature.
In this paper, we try to answer the question of whether political polarization in the US on Twitter has increased over the last eight years.
We analyze a large longitudinal Twitter dataset of 679,000 users and look at signs of polarization in their (i) network - how people follow political and media accounts, (ii) tweeting behavior - whether they retweet content from both sides, and (iii) content - how partisan the hashtags they use are.
Our analysis shows that online polarization has indeed increased over the past eight years and that, depending on the measure, the relative change is 10%-20%.
Our study is one of very few with such a long-term perspective, encompassing two US presidential elections and two mid-term elections, providing a rare longitudinal analysis.
The popularity of Tor as an anonymity system has made it a popular target for a variety of attacks.
We focus on traffic correlation attacks, which are no longer solely in the realm of academic research with recent revelations about the NSA and GCHQ actively working to implement them in practice.
Our first contribution is an empirical study that allows us to gain a high fidelity snapshot of the threat of traffic correlation attacks in the wild.
We find that up to 40% of all circuits created by Tor are vulnerable to attacks by traffic correlation from Autonomous System (AS)-level adversaries, 42% from colluding AS-level adversaries, and 85% from state-level adversaries.
In addition, we find that in some regions (notably, China and Iran) there exist many cases where over 95% of all possible circuits are vulnerable to correlation attacks, emphasizing the need for AS-aware relay-selection.
To mitigate the threat of such attacks, we build Astoria, an AS-aware Tor client.
Astoria leverages recent developments in network measurement to perform path-prediction and intelligent relay selection.
Astoria reduces the number of vulnerable circuits to 2% against AS-level adversaries, under 5% against colluding AS-level adversaries, and 25% against state-level adversaries.
In addition, Astoria load balances across the Tor network so as to not overload any set of relays.
Algorithms for many hypergraph problems, including partitioning, utilize multilevel frameworks to achieve a good trade-off between the performance and the quality of results.
In this paper we introduce two novel aggregative coarsening schemes and incorporate them within the state-of-the-art hypergraph partitioner Zoltan.
Our coarsening schemes are inspired by the algebraic multigrid and stable matching approaches.
We demonstrate the effectiveness of the developed schemes as a part of multilevel hypergraph partitioning framework on a wide range of problems.
In many real-world applications, data are often collected in the form of streams, and thus the underlying distribution usually changes over time, a phenomenon referred to as concept drift in the literature.
We propose a novel and effective approach that handles concept drift via model reuse, leveraging previous knowledge by reusing earlier models.
Each model is associated with a weight representing its reusability towards current data, and the weight is adaptively adjusted according to the model performance.
We provide generalization and regret analysis.
Experimental results also validate the superiority of our approach on both synthetic and real-world datasets.
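The weight-adaptation idea in this abstract can be illustrated with a small sketch. The pool class, the squared-error loss, and the exponential update rule below are illustrative assumptions, not the paper's actual algorithm:

```python
import numpy as np

class ModelReusePool:
    """A minimal sketch of model reuse for concept drift: keep a pool of
    previous models, each with a reusability weight that is adaptively
    adjusted according to its performance on current data."""

    def __init__(self, models, eta=0.5):
        self.models = models                 # callables: x -> prediction
        self.weights = np.ones(len(models))  # reusability weights
        self.eta = eta                       # update rate (assumed)

    def predict(self, x):
        # Weighted combination of the reused models on the current instance.
        w = self.weights / self.weights.sum()
        return sum(wi * m(x) for wi, m in zip(w, self.models))

    def update(self, x, y):
        # Multiplicatively down-weight models that erred on (x, y).
        for i, m in enumerate(self.models):
            loss = (m(x) - y) ** 2
            self.weights[i] *= np.exp(-self.eta * loss)

# Two stale models; the second matches the drifted concept y = 2x.
pool = ModelReusePool([lambda x: x, lambda x: 2 * x])
for x in [1.0, 2.0, 3.0]:
    pool.update(x, 2 * x)
print(pool.weights[1] > pool.weights[0])  # True: the better model gains weight
```

Under this multiplicative rule, a model's weight decays exponentially in its accumulated loss, so reusability tracks recent fit to the current distribution.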
Associative memories store content in such a way that the content can be later retrieved by presenting the memory with a small portion of the content, rather than presenting the memory with an address as in more traditional memories.
Associative memories are used as building blocks for algorithms within database engines, anomaly detection systems, compression algorithms, and face recognition systems.
A classical example of an associative memory is the Hopfield neural network.
Recently, Gripon and Berrou have introduced an alternative construction which builds on ideas from the theory of error correcting codes and which greatly outperforms the Hopfield network in capacity, diversity, and efficiency.
In this paper we implement a variation of the Gripon-Berrou associative memory on a general purpose graphical processing unit (GPU).
The work of Gripon and Berrou proposes two retrieval rules, sum-of-sum and sum-of-max.
The sum-of-sum rule uses only matrix-vector multiplication and is easily implemented on the GPU.
The sum-of-max rule is much less straightforward to implement because it involves non-linear operations.
However, the sum-of-max rule gives significantly better retrieval error rates.
We propose a hybrid rule tailored for implementation on a GPU which achieves an 880-fold speedup without sacrificing any accuracy.
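The GPU-friendliness of the sum-of-sum rule can be seen in a small sketch. The cluster layout, sizes, and winner-take-all tie-handling below are illustrative assumptions about a Gripon-Berrou-style network, not the paper's implementation:

```python
import numpy as np

# A minimal sketch of sum-of-sum retrieval: neurons are grouped into
# clusters, stored messages are cliques in the 0/1 adjacency matrix W, and
# one retrieval step is a matrix-vector product followed by a per-cluster
# winner-take-all over the resulting scores.
def sum_of_sum_step(W, v, n_clusters, cluster_size):
    scores = W @ v                      # the GPU-friendly step: one GEMV
    out = np.zeros_like(v)
    for c in range(n_clusters):
        block = scores[c * cluster_size:(c + 1) * cluster_size]
        out[c * cluster_size:(c + 1) * cluster_size] = (block == block.max())
    return out

# One stored message: a clique between neuron 0 (cluster 0) and neuron 2
# (cluster 1). The stored pattern is a fixed point of the retrieval step.
W = np.zeros((4, 4))
W[0, 2] = W[2, 0] = 1.0
v = np.array([1.0, 0.0, 1.0, 0.0])
print(sum_of_sum_step(W, v, n_clusters=2, cluster_size=2))
```

Because the dominant cost is the matrix-vector product, this rule maps directly onto GPU GEMV kernels; the sum-of-max rule replaces the plain sum with a per-cluster max, which is the non-linear step that complicates a GPU implementation.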
We consider the problem of locating a black hole in synchronous anonymous networks using finite state agents.
A black hole is a harmful node in the network that destroys any agent visiting that node without leaving any trace.
The objective is to locate the black hole without destroying too many agents.
This is difficult to achieve when the agents are initially scattered in the network and are unaware of the location of each other.
Previous studies for black hole search used more powerful models where the agents had non-constant memory, were labelled with distinct identifiers and could either write messages on the nodes of the network or mark the edges of the network.
In contrast, we solve the problem using a small team of finite-state agents each carrying a constant number of identical tokens that could be placed on the nodes of the network.
Thus, all resources used in our algorithms are independent of the network size.
We restrict our attention to oriented torus networks and first show that no finite team of finite state agents can solve the problem in such networks, when the tokens are not movable.
In case the agents are equipped with movable tokens, we determine lower bounds on the number of agents and tokens required for solving the problem in torus networks of arbitrary size.
Further, we present a deterministic solution to the black hole search problem for oriented torus networks, using the minimum number of agents and tokens.
The ability to sparsely represent a certain class of signals has many applications in data analysis, image processing, and other research fields.
Among sparse representations, the cosparse analysis model has recently gained increasing interest.
Many signals exhibit a multidimensional structure, e.g. images or three-dimensional MRI scans.
Most data analysis and learning algorithms use vectorized signals and thereby do not account for this underlying structure.
The drawback of not taking the inherent structure into account is a dramatic increase in computational cost.
We propose an algorithm for learning a cosparse Analysis Operator that adheres to the preexisting structure of the data, and thus allows for a very efficient implementation.
This is achieved by enforcing a separable structure on the learned operator.
Our learning algorithm is able to deal with multidimensional data of arbitrary order.
We evaluate our method on volumetric data at the example of three-dimensional MRI scans.
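The efficiency gained from a separable operator can be shown with the standard Kronecker identity. The operator sizes below are illustrative, and this is a sketch of the general principle rather than the paper's learning algorithm:

```python
import numpy as np

# A minimal sketch of why a separable analysis operator is cheap: applying
# the Kronecker product (O1 kron O2) to a vectorized 2-D signal X equals the
# two small matrix products O2 @ X @ O1.T, via the identity
# (A kron B) vec(X) = vec(B X A^T) with column-major vectorization.
rng = np.random.default_rng(0)
O1 = rng.standard_normal((6, 4))
O2 = rng.standard_normal((5, 3))
X = rng.standard_normal((3, 4))                  # 2-D signal

full = np.kron(O1, O2) @ X.flatten(order="F")    # naive: builds a 30x12 matrix
separable = (O2 @ X @ O1.T).flatten(order="F")   # structured: two small GEMMs
print(np.allclose(full, separable))  # True
```

For higher-order data the same idea generalizes to one mode product per dimension, which is what makes the structured operator tractable on volumetric data such as 3-D MRI scans.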
The combination of aerial survey capabilities of Unmanned Aerial Vehicles with targeted intervention abilities of agricultural Unmanned Ground Vehicles can significantly improve the effectiveness of robotic systems applied to precision agriculture.
In this context, building and updating a common map of the field is an essential but challenging task.
The maps built using robots of different types show differences in size, resolution and scale, the associated geolocation data may be inaccurate and biased, while the repetitiveness of both visual appearance and geometric structures found within agricultural contexts render classical map merging techniques ineffective.
In this paper we propose AgriColMap, a novel map registration pipeline that leverages a grid-based multi-modal environment representation which includes a vegetation index map and a Digital Surface Model.
We cast the data association problem between maps built from UAVs and UGVs as a multi-modal, large displacement dense optical flow estimation.
The dominant, coherent flows, selected using a voting scheme, are used as point-to-point correspondences to infer a preliminary non-rigid alignment between the maps.
A final refinement is then performed, by exploiting only meaningful parts of the registered maps.
We evaluate our system using real world data for 3 fields with different crop species.
The results show that our method outperforms several state of the art map registration and matching techniques by a large margin, and has a higher tolerance to large initial misalignments.
We release an implementation of the proposed approach along with the acquired datasets with this paper.
The use of future contextual information is typically shown to be helpful for acoustic modeling.
Recently, we proposed an RNN model called minimal gated recurrent unit with input projection (mGRUIP), in which a context module, namely temporal convolution, is specifically designed to model the future context.
This model, mGRUIP with context module (mGRUIP-Ctx), has been shown to utilize the future context effectively while maintaining quite low model latency and computation cost.
In this paper, we continue to improve mGRUIP-Ctx with two revisions: applying BN methods and enlarging model context.
Experimental results on two Mandarin ASR tasks (8400 hours and 60K hours) show that the revised mGRUIP-Ctx outperforms LSTM by a large margin (11% to 38%).
It even performs slightly better than a superior BLSTM on the 8400h task, with 33M less parameters and just 290ms model latency.
We present a new parallel algorithm for solving triangular systems with multiple right hand sides (TRSM).
TRSM is used extensively in numerical linear algebra computations, both to solve triangular linear systems of equations as well as to compute factorizations with triangular matrices, such as Cholesky, LU, and QR.
Our algorithm achieves better theoretical scalability than known alternatives, while maintaining numerical stability, via selective use of triangular matrix inversion.
We leverage the fact that triangular inversion and matrix multiplication are more parallelizable than the standard TRSM algorithm.
By only inverting triangular blocks along the diagonal of the initial matrix, we generalize the usual way of TRSM computation and the full matrix inversion approach.
This flexibility leads to an efficient algorithm for any ratio of the number of right hand sides to the triangular matrix dimension.
We provide a detailed communication cost analysis for our algorithm as well as for the recursive triangular matrix inversion.
This cost analysis makes it possible to determine optimal block sizes and processor grids a priori.
Relative to the best known algorithms for TRSM, our approach can require asymptotically fewer messages, while performing optimal amounts of computation and communication in terms of words sent.
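The core trick of inverting only diagonal blocks can be sketched serially. The block size and the use of explicit `inv` below are illustrative; the paper's parallel algorithm, processor grids, and stability analysis are not reproduced here:

```python
import numpy as np

# A minimal serial sketch of the idea: invert only the diagonal blocks of a
# lower-triangular L, so each block step of solving L @ X = B becomes a
# matrix multiply, the more parallelizable primitive, rather than a
# triangular solve.
def block_trsm(L, B, b):
    n = L.shape[0]
    X = np.zeros_like(B)
    for i in range(0, n, b):
        s = slice(i, i + b)
        rhs = B[s] - L[s, :i] @ X[:i]         # subtract known contributions
        X[s] = np.linalg.inv(L[s, s]) @ rhs   # apply the inverted diag block
    return X

rng = np.random.default_rng(1)
L = np.tril(rng.standard_normal((8, 8))) + 8 * np.eye(8)  # well-conditioned
B = rng.standard_normal((8, 3))
X = block_trsm(L, B, b=2)
print(np.allclose(L @ X, B))  # True
```

Choosing b = 1 recovers the usual substitution-style TRSM, while b = n is full triangular inversion; intermediate block sizes interpolate between the two, which is the flexibility the abstract refers to.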
Detecting fake users (also called Sybils) in online social networks is a basic security research problem.
State-of-the-art approaches rely on a large amount of manually labeled users as a training set.
These approaches suffer from three key limitations: 1) it is time-consuming and costly to manually label a large training set, 2) they cannot detect new Sybils in a timely fashion, and 3) they are vulnerable to Sybil attacks that leverage information of the training set.
In this work, we propose SybilBlind, a structure-based Sybil detection framework that does not rely on a manually labeled training set.
SybilBlind works under the same threat model as state-of-the-art structure-based methods.
We demonstrate the effectiveness of SybilBlind using 1) a social network with synthetic Sybils and 2) two Twitter datasets with real Sybils.
For instance, SybilBlind achieves an AUC of 0.98 on a Twitter dataset.
Planning in partially observable Markov decision processes (POMDPs) remains a challenging topic in the artificial intelligence community, in spite of recent impressive progress in approximation techniques.
Previous research has indicated that online planning approaches are promising in handling large-scale POMDP domains efficiently as they make decisions "on demand" instead of proactively for the entire state space.
We present a Factored Hybrid Heuristic Online Planning (FHHOP) algorithm for large POMDPs.
FHHOP gets its power by combining a novel hybrid heuristic search strategy with a recently developed factored state representation.
On several benchmark problems, FHHOP substantially outperformed state-of-the-art online heuristic search approaches in terms of both scalability and quality.
In languages like C, buffer overflows are widespread.
A common mitigation technique is to use tools that detect them during execution and abort the program to prevent the leakage of data or the diversion of control flow.
However, for server applications, it would be desirable to prevent such errors while maintaining availability of the system.
To this end, we present an approach to handle buffer overflows without aborting the program.
This approach involves implementing a continuation logic in library functions based on an introspection function that allows querying the size of a buffer.
We demonstrate that introspection can be implemented in popular bug-finding and bug-mitigation tools such as LLVM's AddressSanitizer, SoftBound, and Intel-MPX-based bounds checking.
We evaluated our approach in a case study of real-world bugs and show that for tools that explicitly track bounds data, introspection results in a low performance overhead.
We describe an XML file format for storing data from computations in algebra and geometry.
We also present a formal specification based on a RELAX-NG schema.
Machine learning and data mining algorithms are becoming increasingly important in analyzing large volume, multi-relational and multi-modal datasets, which are often conveniently represented as multiway arrays or tensors.
It is therefore timely and valuable for the multidisciplinary research community to review tensor decompositions and tensor networks as emerging tools for large-scale data analysis and data mining.
We provide the mathematical and graphical representations and interpretation of tensor networks, with the main focus on the Tucker and Tensor Train (TT) decompositions and their extensions or generalizations.
Keywords: Tensor networks, Function-related tensors, CP decomposition, Tucker models, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, multiway component analysis, multilinear blind source separation, tensor completion, linear/multilinear dimensionality reduction, large-scale optimization problems, symmetric eigenvalue decomposition (EVD), PCA/SVD, huge systems of linear equations, pseudo-inverse of very large matrices, Lasso and Canonical Correlation Analysis (CCA) (This is Part 1)
The number of methods available for classification of multi-label data has increased rapidly over recent years, yet relatively few links have been made with the related task of classification of sequential data.
If label indices are considered as time indices, the problems can often be seen as equivalent.
In this paper we detect and elaborate on connections between multi-label methods and Markovian models, and study the suitability of multi-label methods for prediction in sequential data.
From this study we draw upon the most suitable techniques from the area and develop two novel competitive approaches which can be applied to either kind of data.
We carry out an empirical evaluation investigating performance on real-world sequential-prediction tasks: electricity demand, and route prediction.
As well as showing that several popular multi-label algorithms are in fact easily applicable to sequencing tasks, our novel approaches, which benefit from a unified view of these areas, prove very competitive against established methods.
Deploying deep neural networks on mobile devices is a challenging task.
Current model compression methods such as matrix decomposition effectively reduce the deployed model size, but still cannot satisfy real-time processing requirement.
This paper first discovers that the major obstacle is the excessive execution time of non-tensor layers such as pooling and normalization without tensor-like trainable parameters.
This motivates us to design a novel acceleration framework: DeepRebirth through "slimming" existing consecutive and parallel non-tensor and tensor layers.
The layer slimming is executed at different substructures: (a) streamline slimming by merging the consecutive non-tensor and tensor layer vertically; (b) branch slimming by merging non-tensor and tensor branches horizontally.
The proposed optimization operations significantly accelerate the model execution and also greatly reduce the run-time memory cost, since the slimmed model architecture contains fewer hidden layers.
To maximally avoid accuracy loss, the parameters in new generated layers are learned with layer-wise fine-tuning based on both theoretical analysis and empirical verification.
As observed in the experiment, DeepRebirth achieves more than 3x speed-up and 2.5x run-time memory saving on GoogLeNet with only 0.4% drop of top-5 accuracy on ImageNet.
Furthermore, by combining with other model compression techniques, DeepRebirth offers an average of 65ms inference time on the CPU of Samsung Galaxy S6 with 86.5% top-5 accuracy, 14% faster than SqueezeNet which only has a top-5 accuracy of 80.5%.
Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic.
In this paper, we explore the existence of similar information on vector representations of images.
For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20K images obtained from ImageNet.
We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics.
We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g., 118 dog types).
More surprisingly, we find that the space unsupervisedly separates complex classes without prior knowledge (e.g. living things).
Afterwards, we consider vector arithmetics.
Although we are unable to obtain meaningful results in this regard, we discuss the various problems we encountered and how we intend to solve them.
Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used.
This paper presents the release of EmojiNet, the largest machine-readable emoji sense inventory that links Unicode emoji representations to their English meanings extracted from the Web.
EmojiNet is a dataset consisting of: (i) 12,904 sense labels over 2,389 emoji, which were extracted from the web and linked to machine-readable sense definitions seen in BabelNet, (ii) context words associated with each emoji sense, which are inferred through word embedding models trained over a Google News corpus and a Twitter message corpus for each emoji sense definition, and (iii) specifications of the most likely platform-based emoji sense for a selected set of emoji, recognizing discrepancies in the presentation of emoji on different platforms.
The dataset is hosted as an open service with a REST API and is available at http://emojinet.knoesis.org/.
The development of this dataset, evaluation of its quality, and its applications including emoji sense disambiguation and emoji sense similarity are discussed.
In this letter, we propose a control framework for human-in-the-loop systems, in which many human decision makers are involved in the feedback loop composed of a plant and a controller.
The novelty of the framework is that the decision makers are weakly controlled; in other words, they receive a set of admissible control actions from the controller and choose one of them in accordance with their private preferences.
For example, the decision makers can decide their actions to minimize their own costs or by simply relying on their experience and intuition.
A class of controllers which output set-valued signals is proposed, and it is shown that the overall control system is stable independently of the decisions made by the humans.
Finally, a learning algorithm is applied to the controller that updates the controller parameters to reduce the achievable minimal costs for the decision makers.
Effective use of the algorithm is demonstrated in a numerical experiment.
A Hamilton cycle is a cycle containing every vertex of a graph.
A graph is called Hamiltonian if it contains a Hamilton cycle.
The Hamilton cycle problem is to find a necessary and sufficient condition for a graph to be Hamiltonian.
In this paper, we introduce new definitions of certain subgraphs, determine the Hamiltonicity of edges according to the existence of these subgraphs in a graph, and then obtain a new property of Hamiltonian graphs: a necessary and sufficient condition characterized by the connectivity of the subgraph induced from the cycle structure of a given graph.
Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks.
We propose a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same.
Our approach rests on two key ideas: "successor features", a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement", a generalization of dynamic programming's policy improvement operation that considers a set of policies rather than a single one.
Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks.
The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place.
We derive two theorems that set our approach in firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm.
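The interplay between successor features and generalized policy improvement can be sketched compactly. The tensor layout, names, and toy numbers below are illustrative assumptions, not the paper's notation:

```python
import numpy as np

# A minimal sketch of generalized policy improvement (GPI) with successor
# features: psi[i, s, a] holds the successor features of stored policy i at
# state s and action a; a new task is specified by reward weights w, so the
# task's Q-values are just psi @ w (the dynamics/reward decoupling).
def gpi_action(psi, w, s):
    q = psi[:, s, :, :] @ w                 # (n_policies, n_actions)
    return int(np.argmax(q.max(axis=0)))    # act greedily over all policies

# Two stored policies over one state, two actions, two reward features:
# policy 0's action 0 accrues feature 0; policy 1's action 1 accrues feature 1.
psi = np.array([[[[1., 0.], [0., 0.]]],
                [[[0., 0.], [0., 1.]]]])
print(gpi_action(psi, np.array([0., 1.]), s=0))  # 1: task rewards feature 1
print(gpi_action(psi, np.array([1., 0.]), s=0))  # 0: task rewards feature 0
```

Because the successor features are fixed once learned, transferring to a new task only requires estimating the new w, and GPI guarantees the composed policy is no worse than any single stored policy on that task.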
Domain adaptation aims to learn models on a supervised source domain that perform well on an unsupervised target.
Prior work has examined domain adaptation in the context of stationary domain shifts, i.e. static data sets.
However, with large-scale or dynamic data sources, data from a defined domain is not usually available all at once.
For instance, in a streaming data scenario, dataset statistics effectively become a function of time.
We introduce a framework for adaptation over non-stationary distribution shifts applicable to large-scale and streaming data scenarios.
The model is adapted sequentially over incoming unsupervised streaming data batches.
This enables improvements over several batches without the need for any additionally annotated data.
To demonstrate the effectiveness of our proposed framework, we modify associative domain adaptation to work well on source and target data batches with unequal class distributions.
We apply our method to several adaptation benchmark datasets for classification and show improved classifier accuracy not only for the currently adapted batch, but also when applied on future stream batches.
Furthermore, we show the applicability of our associative learning modifications to semantic segmentation, where we achieve competitive results.
We present a novel efficient object detection and localization framework based on the probabilistic bisection algorithm.
A Convolutional Neural Network (CNN) is trained and used as a noisy oracle that provides answers to input query images.
The responses along with error probability estimates obtained from the CNN are used to update beliefs on the object location along each dimension.
We show that querying along each dimension achieves the same lower bound on localization error as the joint query design.
Finally, we compare our approach to the traditional sliding window technique on a real world face localization task and show speed improvements by at least an order of magnitude while maintaining accurate localization.
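The probabilistic bisection update that underlies this framework can be sketched in one dimension. The grid size, trust parameter p, step budget, and noiseless oracle below are illustrative assumptions:

```python
import numpy as np

# A minimal 1-D sketch of probabilistic bisection: maintain a posterior over
# candidate target locations, query the oracle at the posterior median, and
# apply a Bayes update that trusts each answer only with probability p.
def probabilistic_bisection(oracle, n=1000, p=0.8, steps=60):
    grid = np.linspace(0.0, 1.0, n)
    post = np.full(n, 1.0 / n)
    for _ in range(steps):
        m = grid[np.searchsorted(post.cumsum(), 0.5)]  # posterior median
        right = oracle(m)         # answer to "is the target right of m?"
        like = np.where(grid > m,
                        p if right else 1.0 - p,
                        1.0 - p if right else p)
        post = post * like
        post /= post.sum()
    return grid[np.argmax(post)]

# With a noiseless oracle for a target at 0.3 (for illustration), the
# posterior concentrates near the target even under the p-weighted update.
est = probabilistic_bisection(lambda m: 0.3 > m)
print(abs(est - 0.3) < 0.05)
```

In the paper's setting, the CNN plays the role of the noisy oracle and its estimated error probability supplies p; the abstract's per-dimension query design corresponds to running one such posterior along each image axis.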
This paper considers the problem of removing costly features from a Bayesian network classifier.
We want the classifier to be robust to these changes, and maintain its classification behavior.
To this end, we propose a closeness metric between Bayesian classifiers, called the expected classification agreement (ECA).
Our corresponding trimming algorithm finds an optimal subset of features and a new classification threshold that maximize the expected agreement, subject to a budgetary constraint.
It utilizes new theoretical insights to perform branch-and-bound search in the space of feature sets, while computing bounds on the ECA.
Our experiments investigate both the runtime cost of trimming and its effect on the robustness and accuracy of the final classifier.
In the process of knowledge discovery and representation in large datasets using formal concept analysis, complexity plays a major role in identifying all the formal concepts and constructing the concept lattice (the digraph of the concepts).
For identifying the formal concepts and constructing the digraph from the identified concepts in very large datasets, various distributed algorithms are available in the literature.
However, the existing distributed algorithms are not well suited to concept generation because it is an iterative process.
The existing algorithms are implemented using distributed frameworks like MapReduce and OpenMP, which are not appropriate for iterative applications.
Hence, in this paper we propose efficient distributed algorithms for both formal concept generation and concept lattice digraph construction in large formal contexts using Apache Spark.
Various performance metrics are considered for the evaluation of the proposed work; the results show that the proposed algorithms are more efficient for concept generation and lattice digraph construction than the existing algorithms.
We present an algorithm to generate synthetic datasets of tunable difficulty on classification of Morse code symbols for supervised machine learning problems, in particular, neural networks.
The datasets are spatially one-dimensional and have a small number of input features, leading to high density of input information content.
This makes them particularly challenging when implementing network complexity reduction methods.
We explore how network performance is affected by deliberately adding various forms of noise and expanding the feature set and dataset size.
Finally, we establish several metrics to indicate the difficulty of a dataset, and evaluate their merits.
The algorithm and datasets are open-source.
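The style of generator described can be sketched minimally. The symbol table, unit lengths, padding length, and additive-noise knob below are illustrative assumptions about the encoding, not the released algorithm:

```python
import numpy as np

# A minimal sketch of a tunable-difficulty Morse dataset generator: each
# symbol becomes a short 1-D on/off vector (dot = 1 unit, dash = 3 units,
# 1-unit gaps), zero-padded to a fixed length, with additive Gaussian noise
# whose scale controls the classification difficulty.
MORSE = {"E": ".", "T": "-", "A": ".-", "N": "-."}

def encode(symbol, length=12, noise=0.0, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    units = []
    for mark in MORSE[symbol]:
        units += [1.0] * (1 if mark == "." else 3) + [0.0]  # mark + gap
    x = np.zeros(length)
    x[:len(units)] = units
    return x + noise * rng.standard_normal(length)

print(encode("A", noise=0.0))   # dot, gap, dash, then zero padding
```

Sweeping the noise scale (and expanding the symbol table or dataset size) yields a family of datasets of graded difficulty, matching the knobs the abstract describes.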
We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network.
By detecting objects as paired keypoints, we eliminate the need for designing a set of anchor boxes commonly used in prior single-stage detectors.
In addition to our novel formulation, we introduce corner pooling, a new type of pooling layer that helps the network better localize corners.
Experiments show that CornerNet achieves a 42.1% AP on MS COCO, outperforming all existing one-stage detectors.
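The corner pooling operation can be sketched for the top-left case. The toy feature maps below are illustrative, and using the same map for both directions is an assumption for brevity (CornerNet uses two separate feature maps per corner type):

```python
import numpy as np

# A minimal sketch of top-left corner pooling: at each position, take the
# max of one feature map over everything to its right, plus the max of
# another feature map over everything below it, so corner locations can be
# inferred from edge evidence inside the box.
def top_left_corner_pool(f_h, f_v):
    right_max = np.maximum.accumulate(f_h[:, ::-1], axis=1)[:, ::-1]
    below_max = np.maximum.accumulate(f_v[::-1, :], axis=0)[::-1, :]
    return right_max + below_max

f = np.array([[0., 1., 2.],
              [3., 0., 1.]])
print(top_left_corner_pool(f, f))
# [[5. 3. 4.]
#  [6. 1. 2.]]
```

Bottom-right corner pooling is the mirror image, accumulating maxima leftward and upward instead.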
Existing deep-learning based dense stereo matching methods often rely on ground-truth disparity maps as the training signals, which are however not always available in many situations.
In this paper, we design a simple convolutional neural network architecture that is able to learn to compute dense disparity maps directly from the stereo inputs.
Training is performed in an end-to-end fashion without the need of ground-truth disparity maps.
The idea is to use image warping error (instead of disparity-map residuals) as the loss function to drive the learning process, aiming to find a depth-map that minimizes the warping error.
While this is a simple concept well-known in stereo matching, to make it work in a deep-learning framework, many non-trivial challenges must be overcome, and in this work we provide effective solutions.
Our network is self-adaptive to different unseen imageries as well as to different camera settings.
Experiments on the KITTI and Middlebury stereo benchmark datasets show that our method outperforms many state-of-the-art stereo matching methods by a margin, while at the same time being significantly faster.
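The warping-error loss can be sketched for a single scanline. The linear-interpolation sampler and toy scanlines below are illustrative; the actual network operates on full images with a differentiable 2-D warp:

```python
import numpy as np

# A minimal 1-D sketch of the warping-error idea: reconstruct the left
# scanline by sampling the right scanline at x - d(x), then compare with
# the observed left scanline. The loss needs no ground-truth disparity.
def warping_error(left, right, disp):
    x = np.arange(left.size, dtype=float)
    warped = np.interp(x - disp, x, right)   # (sub-pixel) resampling
    return np.mean((left - warped) ** 2)     # photometric loss

right = np.array([0., 1., 2., 3., 4.])
left = np.array([0., 0., 1., 2., 3.])        # scene shifted by 1 pixel
print(warping_error(left, right, np.full(5, 1.0)))  # 0.0: correct disparity
print(warping_error(left, right, np.zeros(5)) > 0)  # True: wrong disparity
```

Minimizing this photometric error over the predicted disparity map is what lets training proceed end-to-end without labeled disparities, which is the self-supervision the abstract describes.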
Representation learning is an essential problem in a wide range of applications and it is important for performing downstream tasks successfully.
In this paper, we propose a new model that learns coupled representations of domains, intents, and slots by taking advantage of their hierarchical dependency in a Spoken Language Understanding system.
Our proposed model learns the vector representation of intents based on the slots tied to these intents by aggregating the representations of the slots.
Similarly, the vector representation of a domain is learned by aggregating the representations of the intents tied to a specific domain.
To the best of our knowledge, it is the first approach to jointly learning the representations of domains, intents, and slots using their hierarchical relationships.
The experimental results demonstrate the effectiveness of the representations learned by our model, as evidenced by improved performance on the contextual cross-domain reranking task.
In this paper, optimal filter design for generalized frequency-division multiplexing (GFDM) is considered under two design criteria: rate maximization and out-of-band (OOB) emission minimization.
First, the problem of GFDM filter optimization for rate maximization is formulated by expressing the transmission rate of GFDM as a function of GFDM filter coefficients.
It is shown that Dirichlet filters are rate-optimal in additive white Gaussian noise (AWGN) channels with no carrier frequency offset (CFO) under linear zero-forcing (ZF) or minimum mean-square error (MMSE) receivers, but in general channels perturbed by CFO a properly designed nontrivial GFDM filter can yield better performance than Dirichlet filters by adjusting the subcarrier waveform to cope with the channel-induced CFO.
Next, the problem of GFDM filter design for OOB emission minimization is formulated by expressing the power spectral density (PSD) of the GFDM transmit signal as a function of GFDM filter coefficients, and it is shown that the OOB emission can be reduced significantly by designing the GFDM filter properly.
Finally, joint design of GFDM filter and window for the two design criteria is considered.
To enhance the privacy protections of databases, where increasing amounts of detailed personal data are stored and processed, multiple mechanisms have been developed, such as audit logging and alert triggers, which notify administrators about suspicious activities.
However, these mechanisms share two main limitations: 1) the volume of such alerts is often substantially greater than the investigation capabilities of resource-constrained organizations, and 2) strategic attackers may disguise their actions or carefully choose which records they touch, rendering statistical detection models ineffective.
To address these limitations, we introduce a novel approach to database auditing that explicitly accounts for adversarial behavior by 1) prioritizing the order in which types of alerts are investigated and 2) providing an upper bound on how much resource to allocate for each type.
We model the interaction between a database auditor and potential attackers as a Stackelberg game in which the auditor chooses an auditing policy and attackers choose which records to target.
A corresponding approach combining linear programming, column generation, and heuristic search is proposed to derive an auditing policy.
To test the policy-searching performance, we adopt a publicly available credit card application dataset, on which we show that our methods produce high-quality mixed strategies as database audit policies and that our general approach significantly outperforms non-game-theoretic baselines.
We discuss the scheduling of a set of networked control systems implemented over a shared communication network.
Each control loop is described by a linear-time-invariant (LTI) system with an event-triggered implementation.
We assume the network can be used by at most one control loop at any time instant and after each controller update, a pre-defined channel occupancy time elapses before the network is available.
In our framework we offer the scheduler two options to avoid conflicts: using the event-triggering mechanism, where the scheduler can choose the triggering coefficient; or forcing controller updates at an earlier pre-defined time.
Our objective is avoiding communication conflict while guaranteeing stability of all control loops.
We formulate the original scheduling problem as a control synthesis problem over a network of timed game automata (NTGA) with a safety objective.
The NTGA is obtained by taking the parallel composition of the timed game automata (TGA) associated with the network and with all control loops.
The construction of TGA associated with control loops leverages recent results on the abstraction of timing models of event-triggered LTI systems.
In our problem, the safety objective is to ensure that update requests from a control loop never occur while the network is in use by another task.
We showcase the results with several examples.
In this paper, we consider the task of learning control policies for text-based games.
In these games, all interactions in the virtual world are through text and the underlying state is not observed.
The resulting language barrier makes such environments challenging for automatic game players.
We employ a deep reinforcement learning framework to jointly learn state representations and action policies using game rewards as feedback.
This framework enables us to map text descriptions into vector representations that capture the semantics of the game states.
We evaluate our approach on two game worlds, comparing against baselines using bag-of-words and bag-of-bigrams for state representations.
Our algorithm outperforms the baselines on both worlds, demonstrating the importance of learning expressive representations.
Traditional face editing methods often require a number of sophisticated and task-specific algorithms to be applied one after the other, a process that is tedious, fragile, and computationally intensive.
In this paper, we propose an end-to-end generative adversarial network that infers a face-specific disentangled representation of intrinsic face properties, including shape (i.e. normals), albedo, and lighting, and an alpha matte.
We show that this network can be trained on "in-the-wild" images by incorporating an in-network physically-based image formation module and appropriate loss functions.
Our disentangling latent representation allows for semantically relevant edits, where one aspect of facial appearance can be manipulated while keeping orthogonal properties fixed, and we demonstrate its use for a number of facial editing applications.
Stock market forecasting is very important in the planning of business activities.
Stock price prediction has attracted many researchers in multiple disciplines including computer science, statistics, economics, finance, and operations research.
Recent studies have shown that the vast amount of online information in the public domain, such as Wikipedia usage patterns, news stories from the mainstream media, and social media discussions, can have an observable effect on investors' opinions towards financial markets.
The reliability of the computational models on stock market prediction is important as it is very sensitive to the economy and can directly lead to financial loss.
In this paper, we retrieved, extracted, and analyzed the effects of news sentiments on the stock market.
Our main contributions include the development of a sentiment analysis dictionary for the financial sector, the development of a dictionary-based sentiment analysis model, and the evaluation of the model for gauging the effects of news sentiments on stocks for the pharmaceutical market.
Using only news sentiments, we achieved a directional accuracy of 70.59% in predicting the trends in short-term stock price movement.
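The abstract does not give the model's details, so the following is only a generic sketch of a dictionary-based sentiment scorer of the kind described; the dictionary entries and threshold are hypothetical placeholders, not the paper's financial-sector lexicon.

```python
# Hypothetical miniature lexicon; the paper builds a full financial-sector one.
POS = {"gain", "growth", "approval", "beat", "upgrade"}
NEG = {"loss", "recall", "lawsuit", "miss", "downgrade"}

def sentiment_score(text):
    """Normalized dictionary score in [-1, 1]: (pos - neg) / (pos + neg)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POS for w in words)
    neg = sum(w in NEG for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def predict_direction(text, threshold=0.0):
    """Map a headline's sentiment score to a short-term trend prediction."""
    return "up" if sentiment_score(text) > threshold else "down"
```

A directional prediction would then be compared against the realized short-term price movement to compute accuracy.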
Deep convolutional neural networks have recently achieved state-of-the-art performance on a number of image recognition benchmarks, including the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC-2012).
The winning model on the localization sub-task was a network that predicts a single bounding box and a confidence score for each object category in the image.
Such a model captures the whole-image context around the objects but cannot handle multiple instances of the same object in the image without naively replicating the number of outputs for each instance.
In this work, we propose a saliency-inspired neural network model for detection, which predicts a set of class-agnostic bounding boxes along with a single score for each box, corresponding to its likelihood of containing any object of interest.
The model naturally handles a variable number of instances for each class and allows for cross-class generalization at the highest levels of the network.
We are able to obtain competitive recognition performance on VOC2007 and ILSVRC2012, while using only the top few predicted locations in each image and a small number of neural network evaluations.
Motivated by a growing market that involves buying and selling data over the web, we study pricing schemes that assign value to queries issued over a database.
Previous work studied pricing mechanisms that compute the price of a query by extending a data seller's explicit prices on certain queries, or investigated the properties that a pricing function should exhibit without detailing a generic construction.
In this work, we present a formal framework for pricing queries over data that allows the construction of general families of pricing functions, with the main goal of avoiding arbitrage.
We consider two types of pricing schemes: instance-independent schemes, where the price depends only on the structure of the query, and answer-dependent schemes, where the price also depends on the query output.
Our main result is a complete characterization of the structure of pricing functions in both settings, by relating it to properties of a function over a lattice.
We use our characterization, together with information-theoretic methods, to construct a variety of arbitrage-free pricing functions.
Finally, we discuss various tradeoffs in the design space and present techniques for efficient computation of the proposed pricing functions.
While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves.
To that end, we introduce dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks.
We subsequently show how the technique can be used to shed new light on the usage of word embeddings in NLP systems.
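As a rough illustration of the idea (not the paper's exact parameterization), dynamic meta-embeddings project each pretrained embedding set into a common space and combine the projections with learned, per-token softmax gates; the sketch below uses random stand-ins for the pretrained sets and learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamic_meta_embed(embed_sets, projections, attn_w):
    """Combine S embedding sets: project each to a common dim, score each
    projection per token, and mix with softmax gates over the S sets."""
    projected = [e @ P for e, P in zip(embed_sets, projections)]  # each (T, d)
    scores = np.stack([p @ attn_w for p in projected])            # (S, T)
    g = np.exp(scores - scores.max(axis=0, keepdims=True))
    gates = g / g.sum(axis=0, keepdims=True)                      # sum to 1
    combined = sum(gates[i][:, None] * projected[i]
                   for i in range(len(projected)))
    return combined, gates

T = 5  # tokens in the sentence
set_a = rng.standard_normal((T, 300))  # e.g. a 300-dim pretrained set
set_b = rng.standard_normal((T, 100))  # e.g. a 100-dim pretrained set
projs = [rng.standard_normal((300, 64)) * 0.05,
         rng.standard_normal((100, 64)) * 0.05]
attn_w = rng.standard_normal(64)
combined, gates = dynamic_meta_embed([set_a, set_b], projs, attn_w)
```

In training, the projections and attention weights would be learned jointly with the downstream task, which is what lets the network "choose" its embeddings.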
Event recognition systems rely on properly engineered knowledge bases of event definitions to infer occurrences of events in time.
The manual development of such knowledge is a tedious and error-prone task, thus event-based applications may benefit from automated knowledge construction techniques, such as Inductive Logic Programming (ILP), which combines machine learning with the declarative and formal semantics of First-Order Logic.
However, learning temporal logical formalisms, which are typically utilized by logic-based event recognition systems, is a challenging task that most ILP systems cannot fully undertake.
In addition, event-based data is usually massive and collected at different times and under various circumstances.
Ideally, systems that learn from temporal data should be able to operate in an incremental mode, that is, revise prior constructed knowledge in the face of new evidence.
Most ILP systems are batch learners, in the sense that in order to account for new evidence they have no alternative but to forget past knowledge and learn from scratch.
Given the inherent complexity of ILP and the volume of real-life temporal data, this results in algorithms that scale poorly.
In this work we present an incremental method for learning and revising event-based knowledge, in the form of Event Calculus programs.
The proposed algorithm relies on abductive-inductive learning and comprises a scalable clause refinement methodology, based on a compressive summarization of clause coverage in a stream of examples.
We present an empirical evaluation of our approach on real and synthetic data from activity recognition and city transport applications.
The purported "black box" nature of neural networks is a barrier to adoption in applications where interpretability is essential.
Here we present DeepLIFT (Deep Learning Important FeaTures), a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input.
DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference.
By optionally giving separate consideration to positive and negative contributions, DeepLIFT can also reveal dependencies which are missed by other approaches.
Scores can be computed efficiently in a single backward pass.
We apply DeepLIFT to models trained on MNIST and simulated genomic data, and show significant advantages over gradient-based methods.
A detailed video tutorial on the method is at http://goo.gl/qKb7pL and code is at http://goo.gl/RM8jvH.
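For a single linear layer, DeepLIFT's difference-from-reference rule reduces to a simple closed form, which the sketch below illustrates (for deeper networks the contributions are propagated through the layers by the method's chain rules, which this sketch does not cover):

```python
def linear_contributions(w, b, x, x_ref):
    """DeepLIFT contributions for a linear model f(x) = w . x + b:
    feature i contributes w_i * (x_i - x_ref_i), and the contributions
    sum exactly to f(x) - f(x_ref) (the 'summation-to-delta' property)."""
    contribs = [wi * (xi - ri) for wi, xi, ri in zip(w, x, x_ref)]
    f = lambda v: sum(wi * vi for wi, vi in zip(w, v)) + b
    # summation-to-delta holds exactly in the linear case
    assert abs(sum(contribs) - (f(x) - f(x_ref))) < 1e-9
    return contribs

scores = linear_contributions(w=[2.0, -1.0, 0.5], b=1.0,
                              x=[1.0, 2.0, 3.0], x_ref=[0.0, 0.0, 0.0])
```

The choice of reference input `x_ref` is problem-specific (e.g. an all-zeros or background input) and directly shapes the resulting scores.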
A small part of the Torah is arranged into a two dimensional array.
The characters are then permuted using a simple recursive deterministic algorithm.
The various permutations are then passed through three stochastic filters and one deterministic filter to identify the permutations which most closely approximate readable Biblical Hebrew.
Of the 15 billion sequences available at the second level of recursion, 800 pass the a priori thresholds set for each filter.
The resulting "Biblical Hebrew" text is available for inspection and the generation of further material continues.
It is well known that modal satisfiability is PSPACE-complete (Ladner 1977).
However, the complexity may decrease if we restrict the set of propositional operators used.
Note that there exist infinitely many propositional operators, since a propositional operator is simply a Boolean function.
We completely classify the complexity of modal satisfiability for every finite set of propositional operators, i.e., in contrast to previous work, we classify an infinite number of problems.
We show that, depending on the set of propositional operators, modal satisfiability is PSPACE-complete, coNP-complete, or in P. We obtain this trichotomy not only for modal formulas, but also for their more succinct representation using modal circuits.
We consider both the uni-modal and the multi-modal case, and study the dual problem of validity as well.
Generating images from word descriptions is a challenging task.
Generative adversarial networks (GANs) have been shown to generate realistic images of real-life objects.
In this paper, we propose a new neural network architecture of LSTM Conditional Generative Adversarial Networks to generate images of real-life objects.
Our proposed model is trained on the Oxford-102 Flowers and Caltech-UCSD Birds-200-2011 datasets.
We demonstrate that our proposed model produces better results, surpassing other state-of-the-art approaches.
Evaluating agent performance when outcomes are stochastic and agents use randomized strategies can be challenging when there is limited data available.
The variance of sampled outcomes may make the simple approach of Monte Carlo sampling inadequate.
This is the case for agents playing heads-up no-limit Texas hold'em poker, where man-machine competitions have involved multiple days of consistent play and still not resulted in statistically significant conclusions even when the winner's margin is substantial.
In this paper, we introduce AIVAT, a low variance, provably unbiased value assessment tool that uses an arbitrary heuristic estimate of state value, as well as the explicit strategy of a subset of the agents.
Unlike existing techniques which reduce the variance from chance events, or only consider game ending actions, AIVAT reduces the variance both from choices by nature and by players with a known strategy.
The resulting estimator in no-limit poker can reduce the number of hands needed to draw statistical conclusions by more than a factor of 10.
The demand for stream processing is increasing at an unprecedented rate.
Big data is no longer limited to processing of big volumes of data.
In most real-world scenarios, only processing stream data as it arrives can meet the business needs.
It is required for trading, fraud detection, system monitoring, product maintenance and of course social media data such as Twitter and YouTube videos.
In such cases, a "too late architecture" that focuses on batch processing cannot realize the use cases.
In this article, we present an end-to-end Big data platform called AlertMix for processing multi-source streaming data.
We explain its architecture and how various Big data technologies are utilized.
We present the performance of our platform on real live streaming data which is currently handled by the platform.
This paper presents a minimalist neural regression network as an aggregate of independent identical regression blocks that are trained simultaneously.
Moreover, it introduces a new multiplicative parameter, shared by all the neural units of a given layer, to maintain the quality of its gradients.
Furthermore, it increases its estimation accuracy via learning a weight factor whose quantity captures the redundancy between the estimated and actual values at each training iteration.
We choose the estimation of the direct weld parameters of different welding techniques to show a significant improvement in the calculation of these parameters by our model compared with state-of-the-art techniques in the literature.
Furthermore, we demonstrate the ability of our model to retain its performance when presented with combined data of different welding techniques.
This is a nontrivial result in attaining a scalable model whose estimation quality is independent of the adopted welding technique.
The input to a neural sequence-to-sequence model is often determined by an up-stream system, e.g. a word segmenter, part of speech tagger, or speech recognizer.
These up-stream models are potentially error-prone.
Representing inputs through word lattices allows making this uncertainty explicit by capturing alternative sequences and their posterior probabilities in a compact form.
In this work, we extend the TreeLSTM (Tai et al., 2015) into a LatticeLSTM that is able to consume word lattices, and can be used as encoder in an attentional encoder-decoder model.
We integrate lattice posterior scores into this architecture by extending the TreeLSTM's child-sum and forget gates and introducing a bias term into the attention mechanism.
We experiment with speech translation lattices and report consistent improvements over baselines that translate either the 1-best hypothesis or the lattice without posterior scores.
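To make the lattice-score integration concrete, here is a toy child-sum step in which arc posterior scores weight both the child-sum and the per-child forget contributions; this is one plausible reading of the extension sketched in the abstract, with made-up dimensions and weights, not the paper's exact equations (which also add a bias term to the attention mechanism, omitted here).

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lattice_childsum_step(x, children_h, children_c, post, W, U, b):
    """One LatticeLSTM-style step: posterior scores `post` of the incoming
    lattice arcs weight the child-sum and each child's forget path."""
    h_sum = sum(p * h for p, h in zip(post, children_h))
    i = sigmoid(W["i"] @ x + U["i"] @ h_sum + b["i"])   # input gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_sum + b["o"])   # output gate
    u = np.tanh(W["u"] @ x + U["u"] @ h_sum + b["u"])   # candidate cell
    c = i * u
    for p, h_k, c_k in zip(post, children_h, children_c):
        f_k = sigmoid(W["f"] @ x + U["f"] @ h_k + b["f"])  # per-child forget
        c = c + p * f_k * c_k
    h = o * np.tanh(c)
    return h, c

dx, dh = 3, 4  # toy input / hidden sizes
W = {k: rng.standard_normal((dh, dx)) for k in "iofu"}
U = {k: rng.standard_normal((dh, dh)) for k in "iofu"}
b = {k: np.zeros(dh) for k in "iofu"}
x = rng.standard_normal(dx)
hs = [rng.standard_normal(dh), rng.standard_normal(dh)]
cs = [rng.standard_normal(dh), rng.standard_normal(dh)]
h, c = lattice_childsum_step(x, hs, cs, [0.7, 0.3], W, U, b)
```

When every node has exactly one predecessor with posterior 1.0, the step collapses to a standard LSTM update, which is the degenerate 1-best case.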
We study probabilistic models of natural images and extend the autoregressive family of PixelCNN architectures by incorporating auxiliary variables.
Subsequently, we describe two new generative image models that exploit different image transformations as auxiliary variables: a quantized grayscale view of the image or a multi-resolution image pyramid.
The proposed models tackle two known shortcomings of existing PixelCNN models: 1) their tendency to focus on low-level image details, while largely ignoring high-level image information, such as object shapes, and 2) their computationally costly procedure for image sampling.
We experimentally demonstrate benefits of the proposed models, in particular showing that they produce much more realistic-looking image samples than previous state-of-the-art probabilistic models.
This paper presents and evaluates a novel approach to integrate a non-invasive Brain-Computer Interface (BCI) with the Robot Operating System (ROS) to mentally drive a telepresence robot.
Controlling a mobile device by using human brain signals might improve the quality of life of people suffering from severe physical disabilities or elderly people who cannot move anymore.
Thus, the BCI user is able to actively interact with relatives and friends located in different rooms thanks to a video streaming connection to the robot.
To facilitate the control of the robot via BCI, we explore new ROS-based algorithms for navigation and obstacle avoidance, making the system safer and more reliable.
In this regard, the robot can exploit two maps of the environment, one for localization and one for navigation, and both can also be used by the BCI user to monitor the position of the robot while it is moving.
As demonstrated by the experimental results, the user's cognitive workload is reduced, decreasing the number of commands necessary to complete the task and helping him/her maintain attention for longer periods of time.
The Unified Modeling Language (UML) community has started to define so-called profiles in order to better suit the needs of specific domains or settings.
Product lines represent a special breed of systems: they are extensible, semi-finished pieces of software.
Completing the semi-finished software leads to various software pieces, typically specific applications, which share the same core.
Though product lines have been developed for a wide range of domains, they apply common construction principles.
The intention of the UML-F profile (for framework architectures) is the definition of a UML subset, enriched with a few UML-compliant extensions, which allows the annotation of such artifacts.
This paper presents aspects of the profile with a focus on patterns and exemplifies the profile's usage.
With explosion of data size and limited storage space at a single location, data are often distributed at different locations.
We thus face the challenge of performing large-scale machine learning from these distributed data through communication networks.
In this paper, we study how the network communication constraints will impact the convergence speed of distributed machine learning optimization algorithms.
In particular, we give the convergence rate analysis of the distributed dual coordinate ascent in a general tree structured network.
Furthermore, by considering network communication delays, we optimize the network-constrained dual coordinate ascent algorithm to maximize its convergence speed.
Our results show that under different network communication delays, to achieve maximum convergence speed, one needs to adopt delay-dependent numbers of local and global iterations for distributed dual coordinate ascent.
A major challenge in obtaining large-scale evaluations, e.g., product or service reviews on online platforms, labeling images, grading in online courses, etc., is that of eliciting honest responses from agents in the absence of verifiability.
We propose a new reward mechanism with strong incentive properties applicable in a wide variety of such settings.
This mechanism has a simple and intuitive output agreement structure: an agent gets a reward only if her response for an evaluation matches that of her peer.
But instead of the reward being the same across different answers, it is inversely proportional to a popularity index of each answer.
This index is a second order population statistic that captures how frequently two agents performing the same evaluation agree on the particular answer.
Rare agreements thus earn a higher reward than agreements that are relatively more common.
In the regime where there are a large number of evaluation tasks, we show that truthful behavior is a strict Bayes-Nash equilibrium of the game induced by the mechanism.
Further, we show that the truthful equilibrium is approximately optimal in terms of expected payoffs to the agents across all symmetric equilibria, where the approximation error vanishes in the number of evaluation tasks.
Moreover, under a mild condition on strategy space, we show that any symmetric equilibrium that gives a higher expected payoff than the truthful equilibrium must be close to being fully informative if the number of evaluations is large.
These last two results are driven by a new notion of an agreement measure that is shown to be monotonic in information loss.
This notion and its properties are of independent interest.
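The mechanism's core computation can be sketched directly from the description above; the data and helper names below are illustrative, not from the paper:

```python
from collections import Counter
from itertools import combinations

def popularity_index(reports):
    """Second-order population statistic: for each answer a, the empirical
    frequency with which two agents performing the same evaluation task
    both report a (agreements counted over all within-task agent pairs)."""
    pair_counts, total_pairs = Counter(), 0
    for task_reports in reports:            # one list of answers per task
        for r1, r2 in combinations(task_reports, 2):
            total_pairs += 1
            if r1 == r2:
                pair_counts[r1] += 1
    return {a: c / total_pairs for a, c in pair_counts.items()}

def reward(my_answer, peer_answer, pop):
    """Output agreement with inverse-popularity scaling: zero on mismatch,
    otherwise inversely proportional to the answer's popularity index."""
    if my_answer != peer_answer:
        return 0.0
    return 1.0 / pop[my_answer]

# two hypothetical evaluation tasks, three agents each
pop = popularity_index([["good", "good", "bad"], ["bad", "bad", "bad"]])
```

Here "good" agreements are rarer than "bad" agreements, so a matched "good" answer earns a strictly larger reward, which is exactly the incentive that makes blind herding on the common answer unprofitable.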
We present an anytime algorithm that generates a collision-free configuration-space path that closely follows a desired path in task space, according to the discrete Frechet distance.
By leveraging tools from computational geometry, we approximate the search space using a cross-product graph.
We use a variant of Dijkstra's graph-search algorithm to efficiently search for and iteratively improve the solution.
We compare multiple proposed densification strategies and empirically show that our algorithm outperforms a set of state-of-the-art planners on a range of manipulation problems.
Finally, we offer a proof sketch of the asymptotic optimality of our algorithm.
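Since the algorithm scores candidate paths by the discrete Frechet distance, the standard dynamic program for that metric (Eiter and Mannila) is worth showing; this is the textbook computation, not the paper's planner itself:

```python
from math import hypot

def discrete_frechet(P, Q):
    """Discrete Frechet distance between 2D polylines P and Q via the
    classic O(|P||Q|) dynamic program: ca[i][j] is the best achievable
    max leash length when walking prefixes P[:i+1] and Q[:j+1]."""
    n, m = len(P), len(Q)
    d = lambda i, j: hypot(P[i][0] - Q[j][0], P[i][1] - Q[j][1])
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                ca[i][j] = d(0, 0)
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d(0, j))
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d(i, 0))
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1],
                                   ca[i][j - 1]), d(i, j))
    return ca[n - 1][m - 1]

# two parallel segments one unit apart: the leash never needs to exceed 1
dist = discrete_frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)])
```

In the planner, one curve would be the desired task-space path and the other the task-space trace of a candidate configuration-space path, with the cross-product graph encoding the joint traversal.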
The problem of finding the maximum number of vertex-disjoint uni-color paths in an edge-colored graph (called MaxCDP) has been recently introduced in literature, motivated by applications in social network analysis.
In this paper we investigate how the complexity of the problem depends on graph parameters (namely the number of vertices to remove to make the graph a collection of disjoint paths and the size of the vertex cover of the graph), which makes sense since graphs in social networks are not random and have structure.
The problem was known to be hard to approximate in polynomial time and not fixed-parameter tractable (FPT) for the natural parameter.
Here, we show that it is still hard to approximate, even in FPT-time.
Finally, we introduce a new variant of the problem, called MaxCDDP, whose goal is to find the maximum number of vertex-disjoint and color-disjoint uni-color paths.
We extend some of the results of MaxCDP to this new variant, and we prove that unlike MaxCDP, MaxCDDP is already hard on graphs at distance two from disjoint paths.
In this paper, we study the problem of question answering when reasoning over multiple facts is required.
We propose Query-Reduction Network (QRN), a variant of Recurrent Neural Network (RNN) that effectively handles both short-term (local) and long-term (global) sequential dependencies to reason over multiple facts.
QRN considers the context sentences as a sequence of state-changing triggers, and reduces the original query to a more informed query as it observes each trigger (context sentence) through time.
Our experiments show that QRN produces state-of-the-art results on bAbI QA and dialog tasks, and on a real goal-oriented dialog dataset.
In addition, the QRN formulation allows parallelization along the RNN's time axis, saving an order of magnitude in time complexity for training and inference.
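The query-reduction recurrence can be sketched as follows; this is our minimal reading of the mechanism described above, with hypothetical weight shapes and a per-dimension update gate, and the paper's exact parameterization may differ:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def qrn_reduce(x_seq, q0, Wz, Wr):
    """Reduce query q0 through a sequence of context-sentence vectors:
    each sentence x acts as a trigger that may rewrite part of the query."""
    q = q0
    for x in x_seq:
        xq = np.concatenate([x, q])
        z = sigmoid(Wz @ xq)            # update gate: how much to rewrite
        q_cand = np.tanh(Wr @ xq)       # candidate reduced (more informed) query
        q = z * q_cand + (1 - z) * q    # gated interpolation with old query
    return q

d = 4  # toy query/sentence dimension
Wz = rng.standard_normal((d, 2 * d)) * 0.5
Wr = rng.standard_normal((d, 2 * d)) * 0.5
q0 = rng.standard_normal(d)
q_final = qrn_reduce([rng.standard_normal(d) for _ in range(3)], q0, Wz, Wr)
```

Because the candidate and gate at step t depend only on x_t and q_{t-1} through elementwise operations, the recurrence admits the parallel-scan-style speedups the abstract mentions.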
Regularization for matrix factorization (MF) and approximation problems has been carried out in many different ways.
Due to its popularity in deep learning, dropout has also been applied to this class of problems.
Despite its solid empirical performance, the theoretical properties of dropout as a regularizer remain quite elusive for this class of problems.
In this paper, we present a theoretical analysis of dropout for MF, where Bernoulli random variables are used to drop columns of the factors.
We demonstrate the equivalence between dropout and a fully deterministic model for MF in which the factors are regularized by the sum of the product of squared Euclidean norms of the columns.
Additionally, we inspect the case of a variable sized factorization and we prove that dropout achieves the global minimum of a convex approximation problem with (squared) nuclear norm regularization.
As a result, we conclude that dropout can be used as a low-rank regularizer with data dependent singular-value thresholding.
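The stated equivalence for fixed-size MF is easy to check numerically: with columns dropped i.i.d. Bernoulli(theta) and rescaled by 1/theta, the expected dropout objective equals the plain squared error plus (1-theta)/theta times the sum over columns of the product of squared norms. The Monte Carlo check below is our own verification sketch, not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, r, theta, S = 4, 3, 2, 0.5, 100_000
X = rng.standard_normal((d, n))
U = rng.standard_normal((d, r))
V = rng.standard_normal((n, r))

# Monte Carlo estimate of E || X - (1/theta) * U diag(b) V^T ||_F^2
# with b_j ~ Bernoulli(theta) dropping matched columns of both factors.
B = (rng.random((S, r)) < theta).astype(float)         # (S, r) dropout masks
recon = np.einsum('dr,sr,nr->sdn', U, B / theta, V)    # masked reconstructions
mc = ((X[None] - recon) ** 2).sum(axis=(1, 2)).mean()

# Deterministic form: squared error + column-product regularizer.
reg = ((U ** 2).sum(0) * (V ** 2).sum(0)).sum()
closed = ((X - U @ V.T) ** 2).sum() + (1 - theta) / theta * reg
```

The regularizer vanishes as theta approaches 1 (no dropout) and blows up as theta approaches 0, matching the intuition that heavier dropout regularizes more strongly.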
This paper deals with uncertain parabolic fluid flow problem where the uncertainty occurs due to the initial conditions and parameters involved in the system.
Uncertain values are considered as fuzzy and these are handled through a recently developed method.
Here the concepts of fuzzy numbers are combined with the Finite Difference Method (FDM), and a Fuzzy Finite Difference Method (FFDM) is proposed.
The proposed FFDM has been used to solve the fluid flow problem bounded by two parallel plates.
Finally, the sensitivity of the fuzzy parameters has also been analysed.
Context: Over the last decade, software researchers and engineers have developed a vast body of methodologies and technologies in requirements engineering for self-adaptive systems.
Although existing studies have explored various aspects of this field, no systematic study has been performed on summarizing modeling methods and corresponding requirements activities.
Objective: This study summarizes the state-of-the-art research trends, details the modeling methods and corresponding requirements activities, identifies relevant quality attributes and application domains and assesses the quality of each study.
Method: We perform a systematic literature review underpinned by a rigorously established and reviewed protocol.
To ensure the quality of the study, we choose 21 highly regarded publication venues and 8 popular digital libraries.
In addition, we apply text mining to derive search strings and use the Kappa coefficient to mitigate disagreements among researchers.
Results: We selected 109 papers during the period of 2003-2013 and presented the research distributions over various kinds of factors.
We extracted 29 modeling methods which are classified into 8 categories and identified 14 requirements activities which are classified into 4 requirements timelines.
We captured 8 software quality attributes of concern, based on the ISO 9126 standard, and 12 application domains.
Conclusion: The frequency of application of modeling methods varies greatly.
Enterprise models were more widely used while behavior models were more rigorously evaluated.
Requirements-driven runtime adaptation was the most frequently studied requirements activity.
Activities at runtime were conveyed with more details.
Finally, we draw other conclusions by discussing how well modeling dimensions were considered in these modeling methods and how well assurance dimensions were conveyed in requirements activities.
Bayesian Neural Networks (BNNs) have been proposed to address the problem of model uncertainty in training and inference.
By introducing weights associated with conditioned probability distributions, BNNs are capable of resolving the overfitting issue commonly seen in conventional neural networks and allow for small-data training, through the variational inference process.
Frequent usage of Gaussian random variables in this process requires a properly optimized Gaussian Random Number Generator (GRNG).
The high hardware cost of conventional GRNG makes the hardware implementation of BNNs challenging.
In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs.
We explore the design space for the massive number of Gaussian variable sampling tasks in BNNs.
Specifically, we introduce two high performance Gaussian (pseudo) random number generators: the RAM-based Linear Feedback Gaussian Random Number Generator (RLF-GRNG), which is inspired by the properties of binomial distribution and linear feedback logics; and the Bayesian Neural Network-oriented Wallace Gaussian Random Number Generator.
To achieve high scalability and efficient memory access, we propose a deep pipelined accelerator architecture with fast execution and good hardware utilization.
Experimental results demonstrate that the proposed VIBNN implementations on an FPGA can achieve a throughput of 321,543.4 Images/s and energy efficiency up to 52,694.8 Images/J while maintaining accuracy similar to that of their software counterpart.
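The binomial property behind the RLF-GRNG can be illustrated in a few lines: a sum of n fair bits is Binomial(n, 0.5), which by the central limit theorem is approximately N(n/2, n/4), so normalizing yields an approximate standard normal. The software sketch below is ours; in hardware the bits would come from linear-feedback shift registers rather than a software RNG:

```python
import random
import statistics

def clt_gaussian(rng, n_bits=64):
    """Approximate N(0, 1) sample: sum n_bits fair bits (Binomial(n, 0.5),
    mean n/2, variance n/4) and standardize. Larger n_bits gives a finer,
    more Gaussian-looking output grid."""
    s = sum(rng.getrandbits(1) for _ in range(n_bits))
    return (s - n_bits / 2) / (n_bits / 4) ** 0.5

rng = random.Random(42)
samples = [clt_gaussian(rng) for _ in range(20_000)]
mu = statistics.fmean(samples)
sd = statistics.pstdev(samples)
```

The hardware appeal is that bit sums and shifts are cheap compared with the transcendental functions a Box-Muller generator would require.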
We introduce Tempered Geodesic Markov Chain Monte Carlo (TG-MCMC) algorithm for initializing pose graph optimization problems, arising in various scenarios such as SFM (structure from motion) or SLAM (simultaneous localization and mapping).
TG-MCMC is the first of its kind, as it unites asymptotically global non-convex optimization on the spherical manifold of quaternions with posterior sampling, in order to provide both reliable initial poses and uncertainty estimates that are informative about the quality of individual solutions.
We devise rigorous theoretical convergence guarantees for our method and extensively evaluate it on synthetic and real benchmark datasets.
Besides its elegance in formulation and theory, we show that our method is robust to missing data and noise, and that the estimated uncertainties capture intuitive properties of the data.
In this paper, we present multi-threaded algorithms for graph coloring suitable to the shared memory programming model.
We modify an existing algorithm widely used in the literature and prove the correctness of the modified algorithm.
We also propose a new approach to solve the problem of coloring using locks.
Using datasets from real world graphs, we evaluate the performance of the algorithms on the Intel platform.
We compare the performance of the sequential approach versus our proposed approach and analyze the speedup obtained against the existing algorithm from the literature.
The results show that the speedup obtained is significant.
We also provide a direction for future work towards improving the performance further in terms of different metrics.
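For reference, the per-vertex rule that parallel coloring algorithms typically speculate on is the sequential first-fit greedy choice below; this is the textbook baseline, not the multi-threaded or lock-based variants developed in the paper:

```python
def greedy_color(adj):
    """First-fit greedy coloring: each vertex takes the smallest color
    not used by its already-colored neighbors. Parallel variants color
    vertices concurrently with this rule and then resolve conflicts."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# a triangle (needs 3 colors) plus a pendant vertex
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
colors = greedy_color(graph)
```

The multi-threaded setting adds the difficulty that two threads may simultaneously pick the same color for adjacent vertices, which is exactly the conflict that locking or a conflict-resolution pass must handle.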
Policy gradient methods have enjoyed great success in deep reinforcement learning but suffer from high variance of gradient estimates.
The high variance problem is particularly exacerbated in problems with long horizons or high-dimensional action spaces.
To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP.
We demonstrate and quantify the benefit of the action-dependent baseline through both theoretical analysis as well as numerical results, including an analysis of the suboptimality of the optimal state-dependent baseline.
The result is a computationally efficient policy gradient algorithm, which scales to high-dimensional control problems, as demonstrated by a synthetic 2000-dimensional target matching task.
Our experimental results indicate that action-dependent baselines allow for faster learning on standard reinforcement learning benchmarks and high-dimensional hand manipulation and synthetic tasks.
Finally, we show that the general idea of including additional information in baselines for improved variance reduction can be extended to partially observed and multi-agent tasks.
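As background for the variance-reduction claim, the effect of subtracting even a simple constant baseline from the reward in a REINFORCE-style estimator can be demonstrated on a toy bandit; this illustrates the general principle only, since the paper's contribution is an action-dependent baseline that further exploits the policy's structure:

```python
import random
import statistics

random.seed(0)

def estimate_variance(baseline, p=0.3, n=50_000):
    """Empirical variance of the score-function gradient estimate
    g = d/dtheta log pi(a) * (r(a) - baseline) for a Bernoulli policy
    pi(1) = p on a two-armed bandit with deterministic rewards."""
    rewards = {0: 1.0, 1: 2.0}
    samples = []
    for _ in range(n):
        a = 1 if random.random() < p else 0
        grad_logpi = a - p          # d/dtheta log pi(a) with p = sigmoid(theta)
        samples.append(grad_logpi * (rewards[a] - baseline))
    return statistics.pvariance(samples)

var_no_baseline = estimate_variance(0.0)
var_baseline = estimate_variance(1.3)   # 1.3 = expected reward under pi
```

Any baseline independent of the action leaves the estimator unbiased while shrinking its variance, and action-dependent baselines of the kind studied here can shrink it further.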
In timeline-based planning, domains are described as sets of independent, but interacting, components, whose behaviour over time (the set of timelines) is governed by a set of temporal constraints.
A distinguishing feature of timeline-based planning systems is the ability to integrate planning with execution by synthesising control strategies for flexible plans.
However, flexible plans can only represent temporal uncertainty, while more complex forms of nondeterminism are needed to deal with a wider range of realistic problems.
In this paper, we propose a novel game-theoretic approach to timeline-based planning problems, generalising the state of the art while uniformly handling temporal uncertainty and nondeterminism.
We define a general concept of timeline-based game and we show that the notion of winning strategy for these games is strictly more general than that of control strategy for dynamically controllable flexible plans.
Moreover, we show that the problem of establishing the existence of such winning strategies is decidable using a doubly exponential amount of space.
An LDPC coded modulation scheme with probabilistic shaping, optimized interleavers and noniterative demapping is proposed.
Full-field simulations show an increase in transmission distance by 8% compared to uniformly distributed input.
Data driven segmentation is the powerhouse behind the success of online advertising.
Various underlying challenges for successful segmentation have been studied by the academic community, with one notable exception: consumers' incentives have typically been ignored.
This lacuna is troubling as consumers have much control over the data being collected.
Missing or manipulated data could lead to inferior segmentation.
The current work proposes a model of prior-free segmentation, inspired by models of facility location, and to the best of our knowledge provides the first segmentation mechanism that addresses incentive compatibility, efficient market segmentation and privacy in the absence of a common prior.
At the heart of the Bitcoin is a blockchain protocol, a protocol for achieving consensus on a public ledger that records bitcoin transactions.
To the extent that a blockchain protocol is used for applications such as contract signing and making certain transactions (such as house sales) public, we need to understand what guarantees the protocol gives us in terms of agents' knowledge.
Here, we provide a complete characterization of agents' knowledge when running a blockchain protocol, using a variant of common knowledge that takes into account the facts that agents can enter and leave the system, that it is not known which agents are actually following the protocol (some agents may want to deviate if they can gain by doing so), and that the guarantees provided by blockchain protocols are probabilistic.
We then consider some scenarios involving contracts and show that this level of knowledge suffices for some scenarios, but not others.
We present a novel hierarchical graphical model based context-aware hybrid brain-machine interface (hBMI) using probabilistic fusion of electroencephalographic (EEG) and electromyographic (EMG) activities.
Based on experimental data collected during stationary executions and subsequent imageries of five different hand gestures with both limbs, we demonstrate feasibility of the proposed hBMI system through within session and online across sessions classification analyses.
Furthermore, we investigate the context-aware extent of the model by a simulated probabilistic approach and highlight potential implications of our work in the field of neurophysiologically-driven robotic hand prosthetics.
It has been shown that most machine learning algorithms are susceptible to adversarial perturbations.
Slightly perturbing an image in a carefully chosen direction in the image space may cause a trained neural network model to misclassify it.
Recently, it was shown that physical adversarial examples exist: printing perturbed images then taking pictures of them would still result in misclassification.
This raises security and safety concerns.
However, these experiments ignore a crucial property of physical objects: the camera can view objects from different distances and at different angles.
In this paper, we show experiments that suggest that current constructions of physical adversarial examples do not disrupt object detection from a moving platform.
Instead, a trained neural network classifies most of the pictures taken from different distances and angles of a perturbed image correctly.
We believe this is because the adversarial property of the perturbation is sensitive to the scale at which the perturbed picture is viewed, so (for example) an autonomous car will misclassify a stop sign only from a small range of distances.
Our work raises an important question: can one construct examples that are adversarial for many or most viewing conditions?
If so, the construction should offer very significant insights into the internal representation of patterns by deep networks.
If not, there is a good prospect that adversarial examples can be reduced to a curiosity with little practical impact.
Verification activities are necessary to ensure that the requirements are specified in a correct way.
However, until now requirements verification research has focused on traditional up-front requirements.
Agile or just-in-time requirements are by definition incomplete, not specific and might be ambiguous when initially specified, indicating a different notion of 'correctness'.
We analyze how verification of agile requirements quality should be performed, based on literature of traditional and agile requirements.
This leads to an agile quality framework, instantiated for the specific requirement types of feature requests in open source projects and user stories in agile projects.
We have performed an initial qualitative validation of our framework for feature requests with eight practitioners from the Dutch agile community, receiving overall positive feedback.
The world is connected through the Internet.
With the abundance of Internet users connected to the Web and the growing popularity of cloud computing research, the need for Artificial Intelligence (AI) is increasing.
In this research, Genetic Algorithm (GA) as AI optimization method through natural selection and genetic evolution is utilized.
There are many applications of GA such as web mining, load balancing, routing, and scheduling or web service selection.
Hence, it is a challenging task to discover whether the implementation language, mainly server-side and web-based language technology, affects the performance of GA. The Travelling Salesperson Problem (TSP), an NP-hard (nondeterministic polynomial-time hard) problem, is used as the problem domain to be solved by GA.
While many scientists prefer Python for GA implementation, other popular high-level interpreted programming languages, namely PHP (PHP Hypertext Preprocessor) and Ruby, were benchmarked.
Lines of code, file sizes, and performance based on GA implementation and runtime were found to vary among these programming languages.
Based on the result, the use of Ruby in GA implementation is recommended.
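To make the benchmarked workload concrete, a minimal GA-for-TSP sketch in Python follows. The tour representation, operators (tournament selection, order crossover, swap mutation), and parameters here are illustrative choices, not the specific implementation benchmarked above:

```python
import random

def tour_length(tour, dist):
    """Total length of a closed tour over the distance matrix."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def order_crossover(p1, p2):
    """OX crossover: copy a random slice from p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = p1[a:b]
    fill = [c for c in p2 if c not in child]
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

def ga_tsp(dist, pop_size=40, generations=200, mutation_rate=0.2, seed=0):
    """Evolve permutations of cities; fitness is (negative) tour length."""
    random.seed(seed)
    n = len(dist)
    pop = [random.sample(range(n), n) for _ in range(pop_size)]
    best = min(pop, key=lambda t: tour_length(t, dist))
    for _ in range(generations):
        new_pop = [best]                       # elitism: keep the best tour
        while len(new_pop) < pop_size:
            p1, p2 = (min(random.sample(pop, 3),
                          key=lambda t: tour_length(t, dist))
                      for _ in range(2))       # tournament selection
            child = order_crossover(p1, p2)
            if random.random() < mutation_rate:  # swap mutation
                i, j = random.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            new_pop.append(child)
        pop = new_pop
        best = min(pop, key=lambda t: tour_length(t, dist))
    return best, tour_length(best, dist)
```

The same loop structure ports directly to PHP or Ruby, which is what makes TSP a convenient cross-language GA benchmark.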
As the Cook-Levin theorem showed, every NP problem can be reduced to SAT in polynomial time.
In this paper I show a simpler and more efficient method to reduce some factorization problems to the satisfiability of a Boolean formula.
Motivated by recent advances in machine learning using deep reinforcement learning, this paper proposes a modified architecture that produces more robust agents and speeds up the training process.
Our architecture is based on Asynchronous Advantage Actor-Critic (A3C) algorithm where the total input dimensionality is halved by dividing the input into two independent streams.
We use ViZDoom, 3D world software that is based on the classical first person shooter video game, Doom, as a test case.
The experiments show that, in comparison to single-input agents, the proposed architecture achieves the same playing performance while exhibiting more robust behavior and reducing the number of training parameters by almost 30%.
Sybil detection in social networks is a basic security research problem.
Structure-based methods have been shown to be promising at detecting Sybils.
Existing structure-based methods can be classified into Random Walk (RW)-based methods and Loopy Belief Propagation (LBP)-based methods.
RW-based methods cannot leverage labeled Sybils and labeled benign users simultaneously, which limits their detection accuracy, and/or they are not robust to noisy labels.
LBP-based methods are not scalable and cannot guarantee convergence.
In this work, we propose SybilSCAR, a novel structure-based method to detect Sybils in social networks.
SybilSCAR is Scalable, Convergent, Accurate, and Robust to label noise.
We first propose a framework to unify RW-based and LBP-based methods.
Under our framework, these methods can be viewed as iteratively applying a (different) local rule to every user, which propagates label information among a social graph.
Second, we design a new local rule, which SybilSCAR iteratively applies to every user to detect Sybils.
We compare SybilSCAR with state-of-the-art RW-based and LBP-based methods theoretically and empirically.
Theoretically, we show that, with proper parameter settings, SybilSCAR has a tighter asymptotic bound on the number of Sybils that are falsely accepted into a social network than existing structure-based methods.
Empirically, we perform evaluation using both social networks with synthesized Sybils and a large-scale Twitter dataset (41.7M nodes and 1.2B edges) with real Sybils.
Our results show that 1) SybilSCAR is substantially more accurate and more robust to label noise than state-of-the-art RW-based methods; 2) SybilSCAR is more accurate and one order of magnitude more scalable than state-of-the-art LBP-based methods.
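The unifying view above, iteratively applying a local rule that propagates label information through the social graph, can be sketched as follows. This is a schematic local rule for illustration only, not SybilSCAR's actual rule; the weight and prior values are made up:

```python
def propagate_labels(adj, priors, weight=0.1, iters=20):
    """Iteratively apply a local rule to every node: a node's Sybil score
    is its prior nudged by its neighbors' deviations from the neutral 0.5.
    Labeled Sybils get priors near 1, labeled benign users near 0."""
    p = dict(priors)
    for _ in range(iters):
        new_p = {}
        for u, neighbors in adj.items():
            score = priors[u] + weight * sum(p[v] - 0.5 for v in neighbors)
            new_p[u] = min(max(score, 0.0), 1.0)  # clamp to [0, 1]
        p = new_p
    return p
```

On a graph with a benign clique and a Sybil clique joined by a single attack edge, scores drift below 0.5 on the benign side and above 0.5 on the Sybil side, which is the qualitative behavior such local rules rely on.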
We introduce a space-filling curve for triangular and tetrahedral red-refinement that can be computed using bitwise interleaving operations similar to the well-known Z-order or Morton curve for cubical meshes.
To store sufficient information for random access, we define a low-memory encoding using 10 bytes per triangle and 14 bytes per tetrahedron.
We present algorithms that compute the parent, children, and face-neighbors of a mesh element in constant time, as well as the next and previous element in the space-filling curve and whether a given element is on the boundary of the root simplex or not.
Our presentation concludes with a scalability demonstration that creates and adapts selected meshes on a large distributed-memory system.
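For reference, the bitwise-interleaving idea behind the well-known Z-order (Morton) curve for cubical meshes, which the simplex curve above is modeled on, looks like this in a minimal 2D sketch (function names and the 16-bit width are illustrative):

```python
def interleave_bits(x, y, bits=16):
    """Morton (Z-order) index: interleave the bits of x and y,
    x in the even positions and y in the odd positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def deinterleave_bits(code, bits=16):
    """Inverse: recover (x, y) from a Morton index."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y
```

Sorting cell coordinates by this index yields the familiar recursive Z pattern; the paper's contribution is an analogous interleaving-based index for red-refined triangles and tetrahedra.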
The well-known Smith-Waterman (SW) algorithm is the most commonly used method for local sequence alignments.
However, SW is very computationally demanding for large protein databases.
There exist several implementations that take advantage of computing parallelization on many-cores, FPGAs, or GPUs in order to increase the alignment throughput.
In this paper, we have explored SW acceleration on Intel KNL processor.
The novelty of this architecture requires revisiting previous programming and optimization techniques for many-core architectures.
To the best of the authors' knowledge, this is the first assessment of the KNL architecture for the SW algorithm.
Our evaluation, using the renowned Environmental NR database as benchmark, has shown that multi-threading and SIMD exploitation reports competitive performance (351 GCUPS) in comparison with other implementations.
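For context, the scoring recurrence at the core of the Smith-Waterman algorithm can be sketched as follows. The scoring parameters are illustrative defaults, and a real high-throughput implementation adds the SIMD and multi-threading exploitation the paper evaluates:

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score via dynamic programming.
    H[i][j] holds the best score of a local alignment ending exactly
    at a[i-1] and b[j-1]; the 0 option lets alignments restart anywhere."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

The O(len(a) * len(b)) cell updates are independent along anti-diagonals, which is exactly the structure that SIMD lanes and many-core threads exploit.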
This paper claims that a new field of empirical software engineering research and practice is emerging: data mining using/used-by optimizers for empirical studies, or DUO.
For example, data miners can generate the models that are explored by optimizers. Also, optimizers can advise how to best adjust the control parameters of a data miner.
This combined approach acts like an agent leaning over the shoulder of an analyst that advises "ask this question next" or "ignore that problem, it is not relevant to your goals".
Further, those agents can help us build "better" predictive models, where "better" can be either greater predictive accuracy, or faster modeling time (which, in turn, enables the exploration of a wider range of options).
We also caution that the era of papers that just use data miners is coming to an end.
Results obtained from an unoptimized data miner can be quickly refuted, just by applying an optimizer to produce a different (and better performing) model.
Our conclusion, hence, is that for software analytics it is possible, useful and necessary to combine data mining and optimization using DUO.
In programming by example, users "write" programs by generating a small number of input-output examples and asking the computer to synthesize consistent programs.
We consider a challenging problem in this domain: learning regular expressions (regexes) from positive and negative example strings.
This problem is challenging, as (1) user-generated examples may not be informative enough to sufficiently constrain the hypothesis space, and (2) even if user-generated examples are in principle informative, there is still a massive search space to examine.
We frame regex induction as the problem of inferring a probabilistic regular grammar and propose an efficient inference approach that uses a novel stochastic process recognition model.
This model incrementally "grows" a grammar using positive examples as a scaffold.
We show that this approach is competitive with human ability to learn regexes from examples.
Robotic grasping detection is one of the most important fields in robotics, in which great progress has been made in recent years with the help of convolutional neural networks (CNNs).
However, including multiple objects in one scene can invalidate the existing CNN-based grasping detection algorithms, because manipulation relationships among objects are not considered, which are required to guide the robot to grasp things in the right order.
This paper presents a new CNN architecture called Visual Manipulation Relationship Network (VMRN) to help robot detect targets and predict the manipulation relationships in real time.
To implement end-to-end training and meet the real-time requirements of robot tasks, we propose the Object Pairing Pooling Layer (OP2L) to help predict all manipulation relationships in one forward pass.
Moreover, in order to train VMRN, we collect a dataset named Visual Manipulation Relationship Dataset (VMRD) consisting of 5185 images with more than 17000 object instances and the manipulation relationships between all possible pairs of objects in every image, which is labeled by the manipulation relationship tree.
The experimental results show that the new network architecture can detect objects and predict manipulation relationships simultaneously and meet the real-time requirements in robot tasks.
We characterize the finite sets S of words such that the iterated shuffle of S is co-finite, and we give some bounds on the length of a longest word not in the iterated shuffle of S.
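To make the operation concrete, the shuffle of two words and a length-bounded fragment of the iterated shuffle of a set can be computed naively as follows (a brute-force sketch for small inputs, not the paper's characterization):

```python
def shuffle(u, v):
    """All interleavings (shuffles) of words u and v, preserving
    the relative letter order within each word."""
    if not u:
        return {v}
    if not v:
        return {u}
    return ({u[0] + w for w in shuffle(u[1:], v)} |
            {v[0] + w for w in shuffle(u, v[1:])})

def shuffle_closure(words, max_len):
    """Words of length <= max_len in the iterated shuffle of the set:
    repeatedly shuffle closure members with the generators until stable."""
    closure = {""}
    changed = True
    while changed:
        changed = False
        for w in list(closure):
            for s in words:
                for x in shuffle(w, s):
                    if len(x) <= max_len and x not in closure:
                        closure.add(x)
                        changed = True
    return closure
```

For instance, the iterated shuffle of {"ab"} restricted to length 4 contains only the empty word, "ab", and the two shuffles of "ab" with itself.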
This paper presents an analytical taxonomy that can suitably describe, rather than simply classify, techniques for data presentation.
Unlike previous works, we do not consider particular aspects of visualization techniques, but their mechanisms and foundational vision perception.
Instead of just adjusting visualization research to a classification system, our aim is to better understand its process.
For doing so, we depart from elementary concepts to reach a model that can describe how visualization techniques work and how they convey meaning.
We introduce a novel Deep Network architecture that implements the full feature point handling pipeline, that is, detection, orientation estimation, and feature description.
While previous works have successfully tackled each one of these problems individually, we show how to learn to do all three in a unified manner while preserving end-to-end differentiability.
We then demonstrate that our Deep pipeline outperforms state-of-the-art methods on a number of benchmark datasets, without the need of retraining.
The problem of hand shape classification is challenging since a hand is characterized by a large number of degrees of freedom.
Numerous shape descriptors have been proposed and applied over the years to estimate and classify hand poses in reasonable time.
In this paper we discuss our parallel framework for real-time hand shape classification applicable in real-time applications.
We show how the number of gallery images influences the classification accuracy and execution time of the parallel algorithm.
We present the speedup and efficiency analyses that prove the efficacy of the parallel implementation.
Notably, different methods can be used at each step of our parallel framework.
Here, we combine the shape contexts with the appearance-based techniques to enhance the robustness of the algorithm and to increase the classification score.
An extensive experimental study proves the superiority of the proposed approach over existing state-of-the-art methods.
This paper describes a method for the automatic inference of structural transfer rules to be used in a shallow-transfer machine translation (MT) system from small parallel corpora.
The structural transfer rules are based on alignment templates, like those used in statistical MT.
Alignment templates are extracted from sentence-aligned parallel corpora and extended with a set of restrictions which are derived from the bilingual dictionary of the MT system and control their application as transfer rules.
The experiments conducted using three different language pairs in the free/open-source MT platform Apertium show that translation quality is improved as compared to word-for-word translation (when no transfer rules are used), and that the resulting translation quality is close to that obtained using hand-coded transfer rules.
The method we present is entirely unsupervised and benefits from information in the rest of the modules of the MT system in which the inferred rules are applied.
In this paper we study the complexity of the problems: given a loop, described by linear constraints over a finite set of variables, is there a linear or lexicographical-linear ranking function for this loop?
While existence of such functions implies termination, these problems are not equivalent to termination.
When the variables range over the rationals (or reals), it is known that both problems are PTIME decidable.
However, when they range over the integers, whether for single-path or multipath loops, the complexity has not yet been determined.
We show that both problems are coNP-complete.
However, we point out some important special cases that have PTIME complexity.
We also present complete algorithms for synthesizing linear and lexicographical-linear ranking functions, both for the general case and the special PTIME cases.
Moreover, in the rational setting, our algorithm for synthesizing lexicographical-linear ranking functions extends existing ones, because our class of ranking functions is more general, yet it has polynomial time complexity.
An experiment to study the entropy method for an anomaly detection system has been performed.
The study has been conducted using real data generated from the distributed sensor networks at the Intel Berkeley Research Laboratory.
The experimental results were compared with the elliptical method and analyzed on two-dimensional data sets acquired from temperature and humidity sensors across 52 microcontrollers.
Using binary classification to determine the upper and lower boundaries for each series of sensors, it has been shown that the entropy method is able to detect more out-of-range sensor nodes than the elliptical method.
It can be argued that the better result is mainly due to a limitation of the elliptical approach, which requires a certain correlation between two sensor series, while in the entropy approach each sensor series is treated independently.
This is very important in the present case, where the two sensor series are not correlated with each other.
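One simple way to realize a per-series, entropy-style detector of the kind described is to flag readings whose self-information under the empirical bin distribution is high. This is an illustrative sketch only, not the paper's exact method; the bin count and surprisal threshold are made-up parameters:

```python
import math
from collections import Counter

def surprisal_anomalies(series, n_bins=10, threshold_bits=4.0):
    """Flag readings whose self-information -log2(p) of their histogram
    bin exceeds a threshold. Each sensor series is treated independently,
    so no cross-series correlation is required."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0          # avoid zero width on constant series
    bins = [min(int((x - lo) / width), n_bins - 1) for x in series]
    counts = Counter(bins)
    n = len(series)
    return [i for i, b in enumerate(bins)
            if -math.log2(counts[b] / n) > threshold_bits]
```

A reading that falls in a rarely occupied bin carries many bits of surprise and is reported, regardless of what any other sensor series is doing.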
The Internet of Things (IoT) comprises wireless sensors and actuators connected via access points to the Internet.
Often, the sensing devices are remotely deployed with limited battery power and equipped with energy harvesting equipment such as solar panels.
These devices transmit real-time data to the base stations, where it is used for detection and other applications.
Under sufficient power availability, wireless transmissions from sensors can be scheduled at regular time intervals to maintain real-time detection and information retrieval by the base station.
However, once the battery is significantly depleted, the device enters a power-saving mode and is required to be more selective in transmitting information to the base station (BS).
Transmitting a particular piece of sensed data will result in power consumption while discarding it might result in loss of utility at the BS.
The goal is to design an optimal dynamic policy which enables the device to decide whether to transmit or to discard a piece of sensing data particularly under the power saving mode.
This will enable the sensor to prolong its operation while causing minimum loss of utility of the application.
We develop a mathematical model to capture the utility of the IoT sensor transmissions and use tools from dynamic programming to derive an optimal real-time transmission policy that is based on the statistics of information arrival, the likelihood of harvested energy, and designed lifetime of the sensors.
Numerical results show that if the statistics of future data valuation can be accurately predicted, there is a significant increase in the utility obtained at the BS as well as the battery lifetime.
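A toy version of such a transmit-or-discard dynamic program is sketched below. The discrete battery states, item-value distribution, and unit-energy harvesting model are illustrative assumptions, not the paper's model:

```python
def cont(V, t, b, harvest_prob, battery_max):
    """Expected continuation value after a possible unit of harvested energy."""
    up = min(b + 1, battery_max)
    return harvest_prob * V[t + 1][up] + (1 - harvest_prob) * V[t + 1][b]

def optimal_policy(horizon, battery_max, values, value_probs, harvest_prob):
    """Finite-horizon DP. V[t][b] is the expected utility from time t onward
    with battery level b; policy[t][b] is the set of item values worth
    transmitting in that state (transmitting costs one energy unit)."""
    V = [[0.0] * (battery_max + 1) for _ in range(horizon + 1)]
    policy = [[set() for _ in range(battery_max + 1)] for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for b in range(battery_max + 1):
            exp_val = 0.0
            for val, p in zip(values, value_probs):
                discard = cont(V, t, b, harvest_prob, battery_max)
                transmit = (val + cont(V, t, b - 1, harvest_prob, battery_max)
                            if b >= 1 else float("-inf"))
                if transmit > discard:
                    policy[t][b].add(val)
                exp_val += p * max(discard, transmit)
            V[t][b] = exp_val
    return V, policy
```

The resulting policy is a state-dependent threshold: an item is transmitted only when its value exceeds the marginal value of saving the energy unit for future arrivals, which falls as the remaining lifetime shrinks.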
Contemporary social media networks can be viewed as a break from the early two-step flow model, in which influential individuals acted as intermediaries between the media and the public for information diffusion.
Today's social media platforms enable users to both generate and consume online contents.
Users continuously engage and disengage in discussions with varying degrees of interaction leading to formation of distinct online communities.
Such communities are often formed at high-level either based on metadata, such as hashtags on Twitter, or popular content triggered by few influential users.
These online communities often do not reflect true connectivity and lack the cohesiveness of traditional communities.
In this study, we investigate real-time formation of temporal communities on Twitter.
We aim at defining both high- and low-level connections and revealing the magnitude of clustering cohesion on a temporal basis.
Inspired by a real-life event center sitting arrangement scenario, the proposed method aims to cluster users into distinct and cohesive online temporal communities.
Membership to a community relies on intrinsic tweet properties to define similarity as the basis for interaction networks.
The proposed method can be useful for local event monitoring and clique-based marketing among other applications.
Social Live Stream Services (SLSS) exploit a new level of social interaction.
One of the main challenges in these services is how to detect and prevent deviant behaviors that violate community guidelines.
In this work, we focus on adult content production and consumption in two widely used SLSS, namely Live.me and Loops Live, which have millions of users producing massive amounts of video content on a daily basis.
We use a pre-trained deep learning model to identify broadcasters of adult content.
Our results indicate that moderation systems in place are highly ineffective in suspending the accounts of such users.
We create two large datasets by crawling the social graphs of these platforms, which we analyze to identify characterizing traits of adult content producers and consumers, and discover interesting patterns of relationships among them, evident in both networks.
In this paper we investigate the problem of optimal MDS-encoded cache placement at the wireless edge to minimize the backhaul rate in heterogeneous networks.
We derive the backhaul rate performance of any caching scheme based on file splitting and MDS encoding and we formulate the optimal caching scheme as a convex optimization problem.
We then thoroughly investigate the performance of this optimal scheme for an important heterogeneous network scenario.
We compare it to several other caching strategies and we analyze the influence of the system parameters, such as the popularity and size of the library files and the capabilities of the small-cell base stations, on the overall performance of our optimal caching strategy.
Our results show that the careful placement of MDS-encoded content in caches at the wireless edge leads to a significant decrease of the load of the network backhaul and hence to a considerable performance enhancement of the network.
Triplet networks are widely used models that are characterized by good performance in classification and retrieval tasks.
In this work we propose to train a triplet network by putting it as the discriminator in Generative Adversarial Nets (GANs).
We make use of the discriminator's strong representation-learning capability to increase the predictive quality of the model.
We evaluated our approach on Cifar10 and MNIST datasets and observed significant improvement on the classification performance using the simple k-nn method.
This paper addresses the problem of manipulating images using natural language description.
Our task aims to semantically modify visual attributes of an object in an image according to the text describing the new visual appearance.
Although existing methods synthesize images having new attributes, they do not fully preserve text-irrelevant contents of the original image.
In this paper, we propose the text-adaptive generative adversarial network (TAGAN) to generate semantically manipulated images while preserving text-irrelevant contents.
The key to our method is the text-adaptive discriminator that creates word-level local discriminators according to input text to classify fine-grained attributes independently.
With this discriminator, the generator learns to generate images where only regions that correspond to the given text are modified.
Experimental results show that our method outperforms existing methods on CUB and Oxford-102 datasets, and our results were mostly preferred on a user study.
Extensive analysis shows that our method is able to effectively disentangle visual attributes and produce pleasing outputs.
Reeb graphs are structural descriptors that capture shape properties of a topological space from the perspective of a chosen function.
In this work we define a combinatorial metric for Reeb graphs of orientable surfaces in terms of the cost necessary to transform one graph into another by edit operations.
The main contributions of this paper are the stability property and the optimality of this edit distance.
More precisely, the stability result states that changes in the functions, measured by the maximum norm, imply not greater changes in the corresponding Reeb graphs, measured by the edit distance.
The optimality result states that our edit distance discriminates Reeb graphs better than any other metric for Reeb graphs of surfaces satisfying the stability property.
Many sequence learning tasks require the localization of certain events in sequences.
Because it can be expensive to obtain strong labeling that specifies the starting and ending times of the events, modern systems are often trained with weak labeling without explicit timing information.
Multiple instance learning (MIL) is a popular framework for learning from weak labeling.
In a common scenario of MIL, it is necessary to choose a pooling function to aggregate the predictions for the individual steps of the sequences.
In this paper, we compare the "max" and "noisy-or" pooling functions on a speech recognition task and a sound event detection task.
We find that max pooling is able to localize phonemes and sound events, while noisy-or pooling fails.
We provide a theoretical explanation of the different behavior of the two pooling functions on sequence learning tasks.
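The two pooling functions being compared have simple closed forms; a minimal sketch:

```python
import math

def max_pooling(frame_probs):
    """Bag-level probability = maximum over per-frame probabilities."""
    return max(frame_probs)

def noisy_or_pooling(frame_probs):
    """Bag-level probability = 1 - prod(1 - p_i): the event is present
    if at least one frame contains it, assuming frame independence."""
    prod = 1.0
    for p in frame_probs:
        prod *= (1.0 - p)
    return 1.0 - prod
```

Note how noisy-or saturates when many frames carry small probabilities, for example 50 frames at 0.1 push the bag probability above 0.99 while max pooling stays at 0.1, which hints at why noisy-or can report an event without any single frame localizing it.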
Boolean automata networks (BANs) are a well established model for biological regulation systems such as neural networks or genetic networks.
Studies on the dynamics of BANs, whether it is synchronous or asynchronous, have mainly focused on monotonic networks, where fundamental questions on the links relating their static and dynamical properties have been raised and addressed.
This paper explores analogous questions on asynchronous non-monotonic networks, xor-BANs, that are BANs where all the local transition functions are xor-functions.
Using algorithmic tools, we give a general characterisation of the asynchronous transition graphs for most of the cactus xor-BANs and strongly connected xor-BANs.
As an illustration of the results, we provide a complete description of the asynchronous dynamics of two particular classes of xor-BAN, namely xor-Flowers and xor-Cycle Chains.
This work also leads to new bisimulation equivalences specific to xor-BANs.
Focusing only on the semantic instances that are salient in a scene benefits robot navigation and self-driving cars more than looking at all objects in the whole scene.
This paper pushes the envelope on salient regions in a video to decompose them into semantically meaningful components, namely, semantic salient instances.
We provide the baseline for the new task of video semantic salient instance segmentation (VSSIS), that is, Semantic Instance - Salient Object (SISO) framework.
The SISO framework is simple yet efficient, leveraging the advantages of two different segmentation tasks, i.e., semantic instance segmentation and salient object segmentation, and eventually fusing them for the final result.
In SISO, we introduce a sequential fusion by looking at overlapping pixels between semantic instances and salient regions to have non-overlapping instances one by one.
We also introduce a recurrent instance propagation to refine the shapes and semantic meanings of instances, and an identity tracking to maintain both the identity and the semantic meaning of instances over the entire video.
Experimental results demonstrated the effectiveness of our SISO baseline, which can handle occlusions in videos.
In addition, to tackle the task of VSSIS, we augment the DAVIS-2017 benchmark dataset by assigning semantic ground-truth for salient instance labels, obtaining SEmantic Salient Instance Video (SESIV) dataset.
Our SESIV dataset consists of 84 high-quality video sequences with per-frame pixel-wise ground-truth labels.
The optimal degree-of-freedom (DoF) region of the non-coherent multiple-access channels is still unknown in general.
In this paper, we make some progress by deriving the entire optimal DoF region in the case of the two-user single-input multiple-output (SIMO) generic block fading channels.
The achievability is based on a simple training-based scheme.
The novelty of our result lies in the converse using a genie-aided bound and the duality upper bound.
As a by-product, our result generalizes previous proofs for the single-user Rayleigh block fading channels.
Collections of biological specimens are fundamental to scientific understanding and characterization of natural diversity.
This paper presents a system for liberating useful information from physical collections by bringing specimens into the digital domain so they can be more readily shared, analyzed, annotated and compared.
It focuses on insects and is strongly motivated by the desire to accelerate and augment current practices in insect taxonomy which predominantly use text, 2D diagrams and images to describe and characterize species.
While these traditional kinds of descriptions are informative and useful, they cannot cover insect specimens "from all angles" and precious specimens are still exchanged between researchers and collections for this reason.
Furthermore, insects can be complex in structure and pose many challenges to computer vision systems.
We present a new prototype for a practical, cost-effective system of off-the-shelf components to acquire natural-colour 3D models of insects from around 3mm to 30mm in length.
Colour images are captured from different angles and focal depths using a digital single lens reflex (DSLR) camera rig and two-axis turntable.
These 2D images are processed into 3D reconstructions using software based on a visual hull algorithm.
The resulting models are compact (around 10 megabytes), afford excellent optical resolution, and can be readily embedded into documents and web pages, as well as viewed on mobile devices.
The system is portable, safe, relatively affordable, and complements the sort of volumetric data that can be acquired by computed tomography.
This system provides a new way to augment the description and documentation of insect species holotypes, reducing the need to handle or ship specimens.
It opens up new opportunities to collect data for research, education, art, entertainment, biodiversity assessment and biosecurity control.
Echocardiography is essential to modern cardiology.
However, reliance on human interpretation limits high-throughput analysis, preventing echocardiography from reaching its full clinical and research potential for precision medicine.
Deep learning is a cutting-edge machine-learning technique that has been useful in analyzing medical images but has not yet been widely applied to echocardiography, partly due to the complexity of echocardiograms' multi-view, multi-modality format.
The essential first step toward comprehensive computer assisted echocardiographic interpretation is determining whether computers can learn to recognize standard views.
To this end, we anonymized 834,267 transthoracic echocardiogram (TTE) images from 267 patients (20 to 96 years, 51 percent female, 26 percent obese) seen between 2000 and 2017 and labeled them according to standard views.
Images covered a range of real-world clinical variation.
We built a multilayer convolutional neural network and used supervised learning to simultaneously classify 15 standard views.
Eighty percent of the data was randomly chosen for training and 20 percent reserved for validation and testing on never-before-seen echocardiograms.
Using multiple images from each clip, the model classified among 12 video views with 97.8 percent overall test accuracy without overfitting.
Even on single low resolution images, test accuracy among 15 views was 91.7 percent versus 70.2 to 83.5 percent for board-certified echocardiographers.
Confusion matrices, occlusion experiments, and saliency mapping showed that the model finds recognizable similarities among related views and classifies using clinically relevant image features.
In conclusion, deep neural networks can classify essential echocardiographic views simultaneously and with high accuracy.
Our results provide a foundation for more complex deep learning assisted echocardiographic interpretation.
This paper deals with area-based subpixel image registration under rotation-isometric scaling-translation transformation hypothesis.
Our approach is based on a parametrical modeling of geometrically transformed textural image fragments and maximum likelihood estimation of transformation vector between them.
Due to the parametrical approach based on the fractional Brownian motion modeling of the local fragments' texture, the proposed estimator MLfBm (ML stands for "Maximum Likelihood" and fBm for "fractional Brownian motion") has the ability to better adapt to real image texture content compared to other methods relying on universal similarity measures like mutual information or normalized correlation.
The main benefits are observed when assumptions underlying the fBm model are fully satisfied, e.g. for isotropic normally distributed textures with stationary increments.
Experiments on both simulated and real images and for high and weak correlation between registered images show that the MLfBm estimator offers significant improvement compared to other state-of-the-art methods.
It reduces translation vector, rotation angle and scaling factor estimation errors by a factor of about 1.75 to 2, and it decreases the probability of a false match by up to a factor of 5.
Besides, an accurate confidence interval for MLfBm estimates can be obtained from the Cramer-Rao lower bound on rotation-scaling-translation parameters estimation error.
This bound depends on texture roughness, noise level in reference and template images, correlation between these images and geometrical transformation parameters.
We present a Few-Shot Relation Classification Dataset (FewRel), consisting of 70,000 sentences on 100 relations derived from Wikipedia and annotated by crowdworkers.
The relation of each sentence is first recognized by distant supervision methods, and then filtered by crowdworkers.
We adapt the most recent state-of-the-art few-shot learning methods for relation classification and conduct a thorough evaluation of these methods.
Empirical results show that even the most competitive few-shot learning models struggle on this task, especially as compared with humans.
We also show that a range of different reasoning skills are needed to solve our task.
These results indicate that few-shot relation classification remains an open problem and still requires further research.
Our detailed analysis points to multiple directions for future research.
All details and resources about the dataset and baselines are released on http://zhuhao.me/fewrel.
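The N-way K-shot evaluation protocol implied by the dataset can be sketched as an episode sampler; the placeholder corpus, function name and split sizes below are assumptions of this illustration, not FewRel's released tooling.

```python
import random

def sample_episode(data, n_way=5, k_shot=1, q_queries=1, rng=None):
    """One N-way K-shot episode: choose N relations, then K support and Q
    query sentences per relation (disjoint within each relation).
    `data` maps relation name -> list of annotated sentences."""
    rng = rng or random.Random()
    relations = rng.sample(sorted(data), n_way)
    support, query = [], []
    for label, rel in enumerate(relations):
        picked = rng.sample(data[rel], k_shot + q_queries)
        support += [(s, label) for s in picked[:k_shot]]
        query += [(s, label) for s in picked[k_shot:]]
    return support, query

# Placeholder corpus: 6 relations with 3 sentences each.
corpus = {f"rel{i}": [f"rel{i}_sent{j}" for j in range(3)] for i in range(6)}
support, query = sample_episode(corpus, rng=random.Random(0))
```

A few-shot model is then scored on how well the support set alone lets it label the query set, episode after episode.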
Steering a car through traffic is a complex task that is difficult to cast into algorithms.
Therefore, researchers turn to training artificial neural networks on front-facing camera data streams along with the associated steering angles.
Nevertheless, most existing solutions consider only the visual camera frames as input, thus ignoring the temporal relationship between frames.
In this work, we propose a Convolutional Long Short-Term Memory Recurrent Neural Network (C-LSTM), that is end-to-end trainable, to learn both visual and dynamic temporal dependencies of driving.
Additionally, we propose posing the steering-angle regression problem as a classification problem while imposing a spatial relationship between the output-layer neurons.
This method is based on learning a sinusoidal function that encodes steering angles.
To train and validate our proposed methods, we used the publicly available Comma.ai dataset.
Our solution improved steering root mean square error by 35% over recent methods, and led to a more stable steering by 87%.
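One plausible reading of the sinusoidal steering-angle encoding is a smooth bump over discretized angle bins, so that spatially neighbouring output neurons receive correlated targets; the bin count, bump width and decoding rule below are assumptions of this sketch, not the paper's exact parameters.

```python
import numpy as np

def encode_angle(theta, n_bins=51, lo=-np.pi / 2, hi=np.pi / 2, width=0.2):
    """Target vector for one steering angle: each output neuron owns one
    discrete angle, and the target is a smooth sinusoidal bump centred on
    theta, so neighbouring neurons get correlated targets."""
    centres = np.linspace(lo, hi, n_bins)
    d = np.clip((theta - centres) / width, -np.pi / 2, np.pi / 2)
    return np.cos(d) ** 2          # peak ~1 at the true angle, 0 far away

def decode_angle(y, n_bins=51, lo=-np.pi / 2, hi=np.pi / 2):
    """Recover the angle as the activation-weighted mean of bin centres."""
    centres = np.linspace(lo, hi, n_bins)
    return float((y * centres).sum() / y.sum())

y = encode_angle(0.3)
theta_hat = decode_angle(y)
```

Because nearby bins share mass, a misclassification by one bin costs little in decoded angle, which is the intuition behind treating steering regression as spatially structured classification.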
In this work, we introduce a compositional framework for the construction of finite abstractions (a.k.a. symbolic models) of interconnected discrete-time control systems.
The compositional scheme is based on the joint dissipativity-type properties of discrete-time control subsystems and their finite abstractions.
In the first part of the paper, we use a notion of a so-called storage function, as a relation between each subsystem and its finite abstraction, to compositionally construct a notion of a so-called simulation function, as a relation between the interconnection of the finite abstractions and that of the concrete control systems.
The derived simulation function is used to quantify the error between the output behavior of the overall interconnected concrete system and that of its finite abstraction.
In the second part of the paper, we propose a technique to construct finite abstractions together with their corresponding storage functions for a class of discrete-time control systems under some incremental passivity property.
We show that if a discrete-time control system is so-called incrementally passivable, then one can construct its finite abstraction by a suitable quantization of the input and state sets together with the corresponding storage function.
Finally, the proposed results are illustrated by constructing a finite abstraction of a network of linear discrete-time control systems and its corresponding simulation function in a compositional way.
The compositional conditions in this example do not impose any restriction on the gains or the number of the subsystems which, in particular, elucidates the effectiveness of dissipativity-type compositional reasoning for networks of systems.
Online social media such as Twitter, Facebook, wikis and LinkedIn have made a great impact on the way we consume information in our day-to-day lives.
It has therefore become increasingly important to surface appropriate content from social media and avoid information overload.
In case of Twitter, popular information can be tracked using hashtags.
Studying the characteristics of tweets containing hashtags becomes important for a number of tasks, such as breaking news detection, personalized message recommendation, friends recommendation, and sentiment analysis among others.
In this paper, we have analyzed Twitter data based on trending hashtags, which are widely used nowadays.
We have used event-based hashtags to learn users' thoughts on those events and to decide whether the rest of the users might find them interesting.
We have used topic modeling, which reveals the hidden thematic structure of the documents (tweets in this case) in addition to sentiment analysis in exploring and summarizing the content of the documents.
We propose a technique to determine the interestingness of an event-based Twitter hashtag and its associated sentiment.
The proposed technique helps Twitter followers find relevant and interesting hashtags.
With the advancement of technology in the last few decades, leading to the widespread availability of miniaturized sensors and internet-connected things (IoT), security of electronic devices has become a top priority.
Side-channel attack (SCA) is one of the prominent methods to break the security of an encryption system by exploiting the information leaked from the physical devices.
Correlational power attack (CPA) is an efficient power side-channel attack technique, which analyses the correlation between the estimated and measured supply current traces to extract the secret key.
The existing countermeasures to the power attacks are mainly based on reducing the SNR of the leaked data, or introducing large overhead using techniques like power balancing.
This paper presents an attenuated signature AES (AS-AES), which resists SCA with minimal noise current overhead.
AS-AES uses a shunt low-drop-out (LDO) regulator to suppress the AES current signature by 400x in the supply current traces.
The shunt LDO has been fabricated and validated in 130 nm CMOS technology.
System-level implementation of the AS-AES along with noise injection, shows that the system remains secure even after 50K encryptions, with 10x reduction in power overhead compared to that of noise addition alone.
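The CPA procedure that AS-AES defends against can be sketched on simulated traces: for every key guess, correlate a Hamming-weight leakage hypothesis with the measurements, and take the guess with the strongest correlation. The random-permutation S-box and the noise model below are stand-ins for the real AES S-box and the measured supply current.

```python
import numpy as np

HW = np.array([bin(v).count("1") for v in range(256)])    # Hamming weights
SBOX = np.random.default_rng(0).permutation(256)  # toy stand-in for AES S-box

# Simulate leaky traces: supply current ~ HW(SBOX[pt ^ key]) plus noise.
rng = np.random.default_rng(1)
true_key, n_traces = 0x3C, 500
pts = rng.integers(0, 256, n_traces)
traces = HW[SBOX[pts ^ true_key]] + rng.normal(0.0, 1.0, n_traces)

def cpa_recover_key(pts, traces):
    """For every key guess, correlate the hypothesised Hamming-weight
    leakage with the measured traces; the guess with the largest absolute
    Pearson correlation is the recovered key byte."""
    corrs = np.empty(256)
    for guess in range(256):
        hyp = HW[SBOX[pts ^ guess]]
        corrs[guess] = np.corrcoef(hyp, traces)[0, 1]
    return int(np.abs(corrs).argmax())

recovered = cpa_recover_key(pts, traces)
```

Attenuating the current signature (as the shunt LDO does) shrinks the numerator of this correlation, which is why far more encryptions are needed before the true key stands out.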
Learning features from massive unlabelled data is a vast prevalent topic for high-level tasks in many machine learning applications.
The recent great improvements on benchmark datasets, achieved by increasingly complex unsupervised learning methods and deep learning models with many parameters, usually require many tedious tricks and much expertise to tune.
However, filters learned by these complex architectures are visually quite similar to standard hand-crafted features.
In this paper, unsupervised learning methods, such as PCA or auto-encoder, are employed as the building block to learn filter banks at each layer.
The lower layer responses are transferred to the last layer (trans-layer) to form a more complete representation retaining more information.
In addition, some beneficial methods such as local contrast normalization and whitening are added to the proposed deep trans-layer networks to further boost performance.
The trans-layer representations are followed by block histograms with binary encoder schema to learn translation and rotation invariant representations, which are utilized to do high-level tasks such as recognition and classification.
Compared to traditional deep learning methods, the implemented feature learning method has far fewer parameters and is validated in several typical experiments, such as digit recognition on MNIST and MNIST variations, object recognition on the Caltech 101 dataset and face verification on the LFW dataset.
The deep trans-layer unsupervised learning achieves 99.45% accuracy on the MNIST dataset; 67.11% accuracy with 15 samples per class and 75.98% accuracy with 30 samples per class on the Caltech 101 dataset; and 87.10% on the LFW dataset.
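The PCA building block used to learn a filter bank at each layer can be sketched as follows; the patch size, filter count and random test images are illustrative choices for this sketch.

```python
import numpy as np

def learn_pca_filters(images, patch=7, n_filters=8, seed=0):
    """Learn a convolutional filter bank as the top principal components
    of mean-removed image patches -- the unsupervised building block used
    by PCA-based trans-layer architectures."""
    rng = np.random.default_rng(seed)
    patches = []
    for img in images:
        for _ in range(50):                     # sample random patches
            i = rng.integers(0, img.shape[0] - patch)
            j = rng.integers(0, img.shape[1] - patch)
            p = img[i:i + patch, j:j + patch].ravel()
            patches.append(p - p.mean())        # remove the patch mean
    X = np.array(patches)
    # Principal directions of the patch covariance become the filters.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_filters].reshape(n_filters, patch, patch)

imgs = [np.random.default_rng(i).random((28, 28)) for i in range(10)]
filters = learn_pca_filters(imgs)
```

Stacking such layers and feeding lower-layer responses forward is what gives the trans-layer representation its richness without gradient-based training.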
We generalize a result by Carlen and Cordero-Erausquin on the equivalence between the Brascamp-Lieb inequality and the subadditivity of relative entropy by allowing for random transformations (a broadcast channel).
This leads to a unified perspective on several functional inequalities that have been gaining popularity in the context of proving impossibility results.
We demonstrate that the information theoretic dual of the Brascamp-Lieb inequality is a convenient setting for proving properties such as data processing, tensorization, convexity and Gaussian optimality.
Consequences of the latter include an extension of the Brascamp-Lieb inequality allowing for Gaussian random transformations, the determination of the multivariate Wyner common information for Gaussian sources, and a multivariate version of Nelson's hypercontractivity theorem.
Finally we present an information theoretic characterization of a reverse Brascamp-Lieb inequality involving a random transformation (a multiple access channel).
Gradient descent training techniques are remarkably successful in training analog-valued artificial neural networks (ANNs).
Such training techniques, however, do not transfer easily to spiking networks due to the spike generation hard non-linearity and the discrete nature of spike communication.
We show that in a feedforward spiking network that uses a temporal coding scheme where information is encoded in spike times instead of spike rates, the network input-output relation is differentiable almost everywhere.
Moreover, this relation is piece-wise linear after a transformation of variables.
Methods for training ANNs thus carry over directly to the training of such spiking networks, as we show by training on the permutation-invariant MNIST task.
In contrast to rate-based spiking networks that are often used to approximate the behavior of ANNs, the networks we present spike much more sparsely and their behavior cannot be directly approximated by conventional ANNs.
Our results highlight a new approach for controlling the behavior of spiking networks with realistic temporal dynamics, opening up the potential for using these networks to process spike patterns with complex temporal information.
In recent years, it has turned out that multidimensional recurrent neural networks (MDRNN) perform very well for offline handwriting recognition tasks like the OpenHaRT 2013 evaluation DIR.
With suitable writing preprocessing and dictionary lookup, our ARGUS software completed this task with an error rate of 26.27% in its primary setup.
Deep neural networks (DNNs) have been proven to have many redundancies.
Hence, many efforts have been made to compress DNNs.
However, the existing model compression methods treat all input samples equally, ignoring the fact that different input samples vary in how difficult they are to classify correctly.
To address this problem, DNNs with an adaptive dropping mechanism are explored in this work.
To inform the DNNs of how difficult the input samples are to classify, a guideline that carries information about the input samples is introduced to improve performance.
Based on the developed guideline and adaptive dropping mechanism, an innovative soft-guided adaptively-dropped (SGAD) neural network is proposed in this paper.
Compared with a 32-layer residual neural network, the presented SGAD can reduce the FLOPs by 77% with less than 1% drop in accuracy on CIFAR-10.
The ability to consolidate information of different types is at the core of intelligence, and has tremendous practical value in allowing learning for one task to benefit from generalizations learned for others.
In this paper we tackle the challenging task of improving semantic parsing performance, taking UCCA parsing as a test case, and AMR, SDP and Universal Dependencies (UD) parsing as auxiliary tasks.
We experiment on three languages, using a uniform transition-based system and learning architecture for all parsing tasks.
Despite notable conceptual, formal and domain differences, we show that multitask learning significantly improves UCCA parsing in both in-domain and out-of-domain settings.
This paper studies the performance of a feedback control loop closed via an error-free digital communication channel with transmission delay.
The system comprises a discrete-time noisy linear time-invariant (LTI) plant whose single measurement output is mapped into its single control input by a causal, but otherwise arbitrary, coding and control scheme.
We consider a single-input multiple-output (SIMO) channel between the encoder-controller and the decoder-controller which is lossless and imposes random time delay.
We derive a lower bound on the minimum average feedback data rate that guarantees achieving a certain level of average quadratic performance over all possible realizations of the random delay.
For the special case of a constant channel delay, we obtain an upper bound by proposing linear source-coding schemes that attain desired performance levels with rates that are at most 1.254 bits per sample greater than the lower bound.
We give a numerical example demonstrating that bounds and operational rates are increasing functions of the constant delay.
In other words, to achieve a specific performance level, greater channel delay necessitates spending higher data rate.
We propose a new clustering method based on optimal transportation.
We solve optimal transportation with variational principles, and investigate the use of power diagrams as transportation plans for aggregating arbitrary domains into a fixed number of clusters.
We iteratively drive centroids through target domains while maintaining the minimum clustering energy by adjusting the power diagrams.
Thus, we simultaneously pursue clustering and the Wasserstein distances between the centroids and the target domains, resulting in a measure-preserving mapping.
We demonstrate the use of our method in domain adaptation, remeshing, and representation learning on synthetic and real data.
A Petri net is structurally cyclic if every configuration is reachable from itself in one or more steps.
We show that structural cyclicity is decidable in deterministic polynomial time.
For this, we adapt Kosaraju's approach for the general reachability problem for Petri nets.
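For intuition, Kosaraju's name is best known for his two-pass strongly-connected-components algorithm, sketched below on an ordinary directed graph (the paper adapts his reachability approach for Petri nets, which this sketch does not reproduce).

```python
from collections import defaultdict

def kosaraju_scc(n, edges):
    """Kosaraju's two passes: order vertices by DFS finish time on the
    graph, then collect components by DFS on the reverse graph."""
    g, rg = defaultdict(list), defaultdict(list)
    for u, v in edges:
        g[u].append(v)
        rg[v].append(u)

    order, seen = [], [False] * n
    def dfs1(u):                        # iterative DFS, records finish order
        stack = [(u, iter(g[u]))]
        seen[u] = True
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if not seen[nxt]:
                    seen[nxt] = True
                    stack.append((nxt, iter(g[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    for u in range(n):
        if not seen[u]:
            dfs1(u)

    comp, sccs = [-1] * n, []
    for u in reversed(order):           # second pass on the reverse graph
        if comp[u] == -1:
            stack, cur = [u], []
            comp[u] = len(sccs)
            while stack:
                x = stack.pop()
                cur.append(x)
                for y in rg[x]:
                    if comp[y] == -1:
                        comp[y] = len(sccs)
                        stack.append(y)
            sccs.append(cur)
    return sccs

# A 3-cycle plus a tail vertex: two components.
sccs = kosaraju_scc(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
```

Both passes are linear in the graph size, which is consistent with the polynomial-time flavour of the cyclicity result above.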
This paper presents text normalization, an integral part of any text-to-speech synthesis system.
Text normalization is a set of methods for writing non-standard words, such as numbers, dates, times, abbreviations, acronyms and the most common symbols, in their full expanded form.
A complete taxonomy for the classification of non-standard words in the Croatian language is proposed, together with rule-based normalization methods combined with a lookup dictionary.
The achieved token rate for normalization of Croatian texts is 95%, with 80% of the expanded words in the correct morphological form.
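A rule-plus-dictionary normalizer of the kind described can be sketched as below; the examples are English stand-ins, since the paper's rules and lookup dictionary target Croatian.

```python
import re

# A lookup dictionary handles fixed abbreviations; rules handle open
# classes such as numbers. (English placeholders for illustration.)
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "etc.": "et cetera"}

UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"]

def number_to_words(n):
    """Expand 0-99 into words with simple rules."""
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    return TENS[tens] + ("-" + UNITS[unit] if unit else "")

def normalize(text):
    """Expand non-standard words token by token: dictionary lookup first,
    then a numeric rule; everything else passes through unchanged."""
    out = []
    for tok in text.split():
        low = tok.lower()
        if low in ABBREVIATIONS:
            out.append(ABBREVIATIONS[low])
        elif re.fullmatch(r"\d+", tok) and int(tok) < 100:
            out.append(number_to_words(int(tok)))
        else:
            out.append(tok)
    return " ".join(out)

print(normalize("Dr. Smith lives at 42 Baker St."))
# -> doctor Smith lives at forty-two Baker street
```

For a morphologically rich language like Croatian, the expanded form must additionally be inflected to the correct case and gender, which is where the reported 80% morphological accuracy comes in.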
Machine understanding of questions is tightly related to the recognition of articulation in the context of the computational capabilities of an underlying processing algorithm.
In this paper a mathematical model to capture and distinguish the latent structure in the articulation of questions is presented.
We propose an objective-driven approach to represent this latent structure and show that such an approach is beneficial when examples of complementary objectives are not available.
We show that the latent structure can be represented as a system that maximizes a cost function related to the underlying objective.
Further, we show that the optimization formulation can be approximated by building a memory of patterns, represented as a trained neural auto-encoder.
Experimental evaluation using many clusters of questions, each related to an objective, shows 80% recognition accuracy and negligible false positives across these clusters of questions.
We then extend the same memory to a related task where the goal is to iteratively refine a dataset of questions based on the latent articulation.
We also demonstrate a refinement scheme, called K-fingerprints, that achieves nearly 100% recognition with negligible false positives across the different clusters of questions.
The paper exploits weak Manhattan constraints to parse the structure of indoor environments from RGB-D video sequences in an online setting.
We extend the previous approach for single view parsing of indoor scenes to video sequences and formulate the problem of recovering the floor plan of the environment as an optimal labeling problem solved using dynamic programming.
The temporal continuity is enforced in a recursive setting, where labeling from previous frames is used as a prior term in the objective function.
In addition to recovery of piecewise planar weak Manhattan structure of the extended environment, the orthogonality constraints are also exploited by visual odometry and pose graph optimization.
This yields reliable estimates in the presence of large motions and absence of distinctive features to track.
We evaluate our method on several challenging indoor sequences, demonstrating accurate SLAM and dense mapping of low-texture environments.
On the existing TUM benchmark, we achieve results competitive with alternative approaches, which fail in our environments.
We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency.
We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING but significantly improves its proposal localization quality.
At a high level, we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially.
We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction.
We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets.
Empirically, our BING++ runs at half the speed of BING on a CPU, but significantly improves the localization quality, by 18.5% and 16.7% on the VOC2007 and Microsoft COCO datasets, respectively.
Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster.
Identifying the occurrence of congestion in a Mobile Ad-hoc Network (MANET) is a major task.
The inbuilt congestion control techniques of existing Transmission Control Protocol (TCP) designed for wired networks do not handle the unique properties of shared wireless multi-hop link.
There are several approaches proposed for detecting and overcoming the congestion in the mobile ad-hoc network.
In this paper we present a Modified AD-hoc Transmission Control Protocol (M-ADTCP) method where the receiver detects the probable current network status and transmits this information to the sender as feedback.
The sender behavior is altered appropriately.
The proposed technique is also compatible with standard TCP.
We present an algorithm that incorporates a tabu search procedure into the framework of path relinking to tackle the job shop scheduling problem (JSP).
This tabu search/path relinking (TS/PR) algorithm comprises several distinguishing features, such as a specific relinking procedure and a reference solution determination method.
To test the performance of TS/PR, we apply it to tackle almost all of the benchmark JSP instances available in the literature.
The test results show that TS/PR obtains competitive results compared with state-of-the-art algorithms for JSP in the literature, demonstrating its efficacy in terms of both solution quality and computational efficiency.
In particular, TS/PR is able to improve the upper bounds for 49 out of the 205 tested instances and it solves a challenging instance that has remained unsolved for over 20 years.
This paper introduces a new computing model based on the cooperation among Turing machines called orchestrated machines.
Like universal Turing machines, orchestrated machines are also designed to simulate Turing machines but they can also modify the original operation of the included Turing machines to create a new layer of some kind of collective behavior.
Using this new model, we can define some interesting notions related to the cooperation ability of Turing machines, such as an intelligence quotient or an emotional intelligence quotient for Turing machines.
Social Networking Sites (SNSs) are powerful marketing and communication tools.
There are hundreds of SNSs that have entered and exited the market over time.
The coexistence of multiple SNSs is a rarely observed phenomenon.
Most coexisting SNSs either serve different purposes for their users or have cultural differences among them.
The introduction of a new SNS with a better set of features can lead to the demise of an existing SNS, as observed in the transition from Orkut to Facebook.
The paper proposes a model for analyzing the transition of users from one SNS to another, when a new SNS is introduced in the system.
The game theoretic model proposed considers two major factors in determining the success of a new SNS.
The first is the amount of time that an old SNS has had to stabilize.
We study whether the time that an SNS like Facebook received to monopolize its reach had a distinguishable effect.
The second factor is the set of features showcased by the new SNS.
The results of the model are also experimentally verified with data collected by means of a survey.
Many real-world applications are characterized by a number of conflicting performance measures.
As optimizing in a multi-objective setting leads to a set of non-dominated solutions, a preference function is required for selecting the solution with the appropriate trade-off between the objectives.
The question is: how good do estimations of these objectives have to be in order for the solution maximizing the preference function to remain unchanged?
In this paper, we introduce the concept of preference radius to characterize the robustness of the preference function and provide guidelines for controlling the quality of estimations in the multi-objective setting.
More specifically, we provide a general formulation of multi-objective optimization under the bandits setting.
We show how the preference radius relates to the optimal gap and we use this concept to provide a theoretical analysis of the Thompson sampling algorithm from multivariate normal priors.
We finally present experiments to support the theoretical results and highlight the fact that one cannot simply scalarize multi-objective problems into single-objective problems.
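The per-round decision of Thompson sampling with multivariate normal beliefs and a preference function can be sketched as follows; posterior updating is omitted, and the arms, covariances and linear preference are illustrative assumptions of this sketch.

```python
import numpy as np

def thompson_step(means, covs, preference, rng):
    """One Thompson-sampling decision in the multi-objective setting: draw
    a reward vector per arm from its multivariate normal belief, scalarise
    with the preference function, and pull the preferred arm."""
    scores = [preference(rng.multivariate_normal(m, c))
              for m, c in zip(means, covs)]
    return int(np.argmax(scores))

# Two arms, two objectives, and a linear 70/30 preference (all illustrative).
rng = np.random.default_rng(0)
means = [np.array([0.9, 0.1]), np.array([0.2, 0.8])]
covs = [0.01 * np.eye(2)] * 2
preference = lambda r: 0.7 * r[0] + 0.3 * r[1]
pulls = [thompson_step(means, covs, preference, rng) for _ in range(200)]
```

Note that the preference is applied to sampled reward vectors, not to scalarized arms: this ordering is exactly why multi-objective bandits cannot simply be collapsed into single-objective ones, as the paper argues.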
Energy consumption is a major limitation of low power and mobile devices.
Efficient transmission protocols are required to minimize the energy consumption of mobile devices for ubiquitous connectivity in the next generation of wireless networks.
Opportunistic schemes select a single relay using the criterion of the best channel and achieve near-optimal diversity performance in a cooperative wireless system.
In this paper, we study the energy efficiency of the opportunistic schemes for device-to-device communication.
In the opportunistic approach, the energy consumed by devices is minimized by selecting a single neighboring device as a relay, using the criterion of minimum consumed energy in each transmission in the uplink of a wireless network.
We derive analytical bounds and scaling laws on the expected energy consumption when the devices experience log-normal shadowing with respect to a base station considering both the transmission as well as circuit energy consumptions.
We show that the protocol improves the energy efficiency of the network compared to direct transmission, even if only a few devices are considered for relaying.
We also demonstrate the effectiveness of the protocol by means of simulations in realistic scenarios of the wireless network.
It has been shown that an extension of the basic binary polar transformation also polarizes over finite fields.
With it, the direct encoding of q-ary sources and channels can be implemented with simple and efficient algorithms.
However, direct polar decoding of q-ary sources and channels is more involved.
In this paper we obtain a recursive equation for the likelihood ratio, expressed as an LR vector.
With it, successive cancellation (SC) decoding can be applied in a straightforward way.
The complexity is quadratic in the order of the field, but the use of the LR vector introduces factors that soften that complexity.
We also show that operations can be parallelized in the decoder.
The Bhattacharyya parameters are expressed as a function of the LR vectors, as in the binary case, simplifying the construction of the codes.
We have applied direct polar coding to several sources and channels and we have compared it with other multilevel strategies.
The direct q-ary polar coding is closer to the theoretical limit than other techniques when the alphabet size is large.
Our results suggest that direct q-ary polar coding could be used in real scenarios.
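For intuition, the binary special case of the SC likelihood-ratio recursion looks as follows; in the q-ary setting described above, these scalar LRs become LR vectors and each combining step costs on the order of q² operations.

```python
def f_minus(l1, l2):
    """Check-node combination: LR of the first bit u1 from two channel
    LRs (LR = P(y|0) / P(y|1))."""
    return (l1 * l2 + 1.0) / (l1 + l2)

def f_plus(l1, l2, u1):
    """Variable-node combination: LR of the second bit u2 once u1 has
    already been decided."""
    return l2 * (l1 if u1 == 0 else 1.0 / l1)

# Two observations, both mildly favouring bit value 0 (LR > 1):
l_u1 = f_minus(3.0, 2.0)       # combined, weaker evidence for u1
l_u2 = f_plus(3.0, 2.0, 0)     # combined, stronger evidence for u2 given u1=0
```

The check-node output is less reliable than either input while the variable-node output is more reliable, which is precisely the polarization effect the recursion amplifies over many levels.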
We introduce a corpus of 7,032 sentences rated by human annotators for formality, informativeness, and implicature on a 1-7 scale.
The corpus was annotated using Amazon Mechanical Turk.
Reliability in the obtained judgments was examined by comparing mean ratings across two MTurk experiments, and correlation with pilot annotations (on sentence formality) conducted in a more controlled setting.
Despite the subjectivity and inherent difficulty of the annotation task, correlations between mean ratings were quite encouraging, especially on formality and informativeness.
We further explored correlation between the three linguistic variables, genre-wise variation of ratings and correlations within genres, compatibility with automatic stylistic scoring, and sentential make-up of a document in terms of style.
To date, our corpus is the largest sentence-level annotated corpus released for formality, informativeness, and implicature.
Truck Factor (TF) is a metric proposed by the agile community as a tool to identify concentration of knowledge in software development environments.
It states the minimal number of developers that have to be hit by a truck (or quit) before a project is incapacitated.
In other words, TF helps to measure how prepared a project is to deal with developer turnover.
Despite its clear relevance, few studies explore this metric.
Altogether there is no consensus about how to calculate it, and no supporting evidence backing estimates for systems in the wild.
To mitigate both issues, we propose a novel (and automated) approach for estimating TF-values, which we execute against a corpus of 133 popular projects on GitHub.
We later survey developers as a means to assess the reliability of our results.
Among others, we find that the majority of our target systems (65%) have TF <= 2.
Surveying developers from 67 target systems provides confidence towards our estimates; in 84% of the valid answers we collect, developers agree or partially agree that the TF's authors are the main authors of their systems; in 53% we receive a positive or partially positive answer regarding our estimated truck factors.
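A greedy estimation scheme of the kind commonly used for truck factors can be sketched as follows; the 50% coverage threshold and the toy repository are assumptions of this illustration, not the paper's exact procedure.

```python
from collections import Counter

def truck_factor(authorship, threshold=0.5):
    """Greedy truck-factor estimate: repeatedly remove the developer who
    is a main author of the most files until more than `threshold` of the
    files have no main author left; the number of removals is the TF.
    `authorship` maps file -> set of its main authors."""
    files = {f: set(a) for f, a in authorship.items()}
    n_files, tf = len(files), 0
    while sum(1 for a in files.values() if not a) <= threshold * n_files:
        load = Counter(dev for a in files.values() for dev in a)
        top = load.most_common(1)[0][0]          # busiest remaining author
        for a in files.values():
            a.discard(top)                       # the "truck" hits them
        tf += 1
    return tf

# Hypothetical repository: alice is the main author of three of five files.
repo = {
    "core.py": {"alice"}, "api.py": {"alice"}, "cli.py": {"alice", "bob"},
    "db.py": {"bob"}, "docs.md": {"carol"},
}
tf_value = truck_factor(repo)
```

Here losing alice and then bob orphans four of the five files, so the estimate is 2, matching the intuition that the project hinges on two people.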
Different neural networks trained on the same dataset often learn similar input-output mappings with very different weights.
Is there some correspondence between these neural network solutions?
For linear networks, it has been shown that different instances of the same network architecture encode the same representational similarity matrix, and their neural activity patterns are connected by orthogonal transformations.
However, it is unclear if this holds for non-linear networks.
Using a shared response model, we show that different neural networks encode the same input examples as different orthogonal transformations of an underlying shared representation.
We test this claim using both standard convolutional neural networks and residual networks on CIFAR10 and CIFAR100.
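The claimed orthogonal correspondence can be checked with an orthogonal Procrustes alignment; the synthetic "activations" below are illustrative, standing in for real network activity patterns.

```python
import numpy as np

def procrustes_align(A, B):
    """Orthogonal Procrustes: the orthogonal matrix R minimising
    ||A R - B||_F, obtained from the SVD of A^T B."""
    u, _, vt = np.linalg.svd(A.T @ B)
    return u @ vt

# Two synthetic "networks" whose activations differ by an orthogonal map.
rng = np.random.default_rng(0)
shared = rng.normal(size=(100, 16))              # shared representation
Q = np.linalg.qr(rng.normal(size=(16, 16)))[0]   # random orthogonal matrix
net_a, net_b = shared, shared @ Q
R = procrustes_align(net_a, net_b)               # should recover Q
```

When two real networks truly encode orthogonal transformations of a shared representation, the residual after alignment stays small; a large residual falsifies the claim for that pair of networks.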
Depth estimation from a single image in the wild remains a challenging problem.
One main obstacle is the lack of high-quality training data for images in the wild.
In this paper we propose a method to automatically generate such data through Structure-from-Motion (SfM) on Internet videos.
The core of this method is a Quality Assessment Network that identifies high-quality reconstructions obtained from SfM.
Using this method, we collect single-view depth training data from a large number of YouTube videos and construct a new dataset called YouTube3D.
Experiments show that YouTube3D is useful in training depth estimation networks and advances the state of the art of single-view depth estimation in the wild.
We present probabilistic neural programs, a framework for program induction that permits flexible specification of both a computational model and inference algorithm while simultaneously enabling the use of deep neural networks.
Probabilistic neural programs combine a computation graph for specifying a neural network with an operator for weighted nondeterministic choice.
Thus, a program describes both a collection of decisions as well as the neural network architecture used to make each one.
We evaluate our approach on a challenging diagram question answering task where probabilistic neural programs correctly execute nearly twice as many programs as a baseline model.
In this paper, we tackle the problem of constructing a differentially private synopsis for two-dimensional datasets such as geospatial datasets.
The current state-of-the-art methods work by performing recursive binary partitioning of the data domains, and constructing a hierarchy of partitions.
We show that the key challenge in partition-based synopsis methods lies in choosing the right partition granularity to balance the noise error and the non-uniformity error.
We study the uniform-grid approach, which applies an equi-width grid of a certain size over the data domain and then issues independent count queries on the grid cells.
This method has received no attention in the literature, probably due to the fact that no good method for choosing a grid size was known.
Based on an analysis of the two kinds of errors, we propose a method for choosing the grid size.
Experimental results validate our method, and show that this approach performs as well as, and often better than, the state-of-the-art methods.
We further introduce a novel adaptive-grid method.
The adaptive grid method lays a coarse-grained grid over the dataset, and then further partitions each cell according to its noisy count.
Both levels of partitions are then used in answering queries over the dataset.
This method addresses the need for finer-granularity partitioning over dense regions and, at the same time, coarse partitioning over sparse regions.
Through extensive experiments on real-world datasets, we show that this approach consistently and significantly outperforms the uniform-grid method and other state-of-the-art methods.
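The uniform-grid baseline can be sketched in a few lines. This is a minimal illustration under standard Laplace-mechanism assumptions; the grid size `m` is left as a parameter precisely because choosing it well, balancing noise error against non-uniformity error, is the paper's contribution.

```python
import numpy as np

def dp_uniform_grid(points, domain, m, epsilon, rng=None):
    """Uniform-grid synopsis: overlay an m x m equi-width grid on the
    2D `domain` = (xmin, xmax, ymin, ymax) and release each cell count
    with Laplace noise of scale 1/epsilon (each point falls in exactly
    one cell, so the count query has sensitivity 1)."""
    if rng is None:
        rng = np.random.default_rng()
    xmin, xmax, ymin, ymax = domain
    xs = np.clip(((points[:, 0] - xmin) / (xmax - xmin) * m).astype(int), 0, m - 1)
    ys = np.clip(((points[:, 1] - ymin) / (ymax - ymin) * m).astype(int), 0, m - 1)
    counts = np.zeros((m, m))
    np.add.at(counts, (xs, ys), 1)
    return counts + rng.laplace(scale=1.0 / epsilon, size=(m, m))
```

A larger `m` increases total Laplace noise error while reducing the non-uniformity error inside cells; the paper's analysis drives the trade-off between these two error sources.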
Strategic suppression of grades, as well as early offers and contracts, are well-known phenomena in the matching process where graduating students apply to jobs or further education.
In this paper, we consider a game theoretic model of these phenomena introduced by Ostrovsky and Schwarz, and study the loss in social welfare resulting from strategic behavior of the schools, employers, and students.
We model grading of students as a game where schools suppress grades in order to improve their students' placements.
We also consider the quality loss due to unraveling of the matching market, the strategic behavior of students and employers in offering early contracts with the goal to improve the quality.
Our goal is to evaluate if strategic grading or unraveling of the market (or a combination of the two) can cause significant welfare loss compared to the optimal assignment of students to jobs.
To measure welfare of the assignment, we assume that welfare resulting from a job -- student pair is a separable and monotone function of student ability and the quality of the jobs.
Assuming uniform student quality distribution, we show that the quality loss from the above strategic manipulation is bounded by at most a factor of 2, and give improved bounds for some special cases of welfare functions.
Medical errors are leading causes of death in the US and as such, prevention of these errors is paramount to promoting health care.
Patient Safety Event reports are narratives describing potential adverse events to the patients and are important in identifying and preventing medical errors.
We present a neural network architecture for identifying the type of safety events which is the first step in understanding these narratives.
Our proposed model is based on a soft neural attention model to improve the effectiveness of encoding long sequences.
Empirical results on two large-scale real-world datasets of patient safety reports demonstrate the effectiveness of our method with significant improvements over existing methods.
The rapidly growing size of RDF graphs in recent years necessitates distributed storage and parallel processing strategies.
To obtain efficient query processing using computer clusters, a wide variety of approaches has been proposed.
Related to the approach presented in the current paper are systems built on top of Hadoop HDFS, for example using Apache Accumulo or using Apache Spark.
We present a new RDF store called PRoST (Partitioned RDF on Spark Tables) based on Apache Spark.
PRoST introduces an innovative strategy that combines the Vertical Partitioning approach with the Property Table, two preexisting models for storing RDF datasets.
We demonstrate that our proposal outperforms state-of-the-art systems w.r.t. the runtime for a wide range of query types and without any extensive precomputing phase.
Tracking moving objects from a video sequence requires segmentation of these objects from the background image.
However, getting the actual background image automatically without object detection and using only the video is difficult.
In this paper, we describe a novel algorithm that generates background from real world images without foreground detection.
The algorithm assumes that the background image is shown in the majority of the video.
Given this simple assumption, the method described in this paper is able to accurately generate, with high probability, the background image from a video using only a small number of binary operations.
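The authors' algorithm relies on fast binary operations; as a hedged illustration of the same majority assumption, a per-pixel temporal median already recovers the background when it is visible in most frames at each pixel (this is a simplification, not the paper's method).

```python
import numpy as np

def majority_background(frames):
    """Estimate the background as the per-pixel temporal median of the
    video: if the background dominates the frame stack at each pixel,
    the median is robust to passing foreground objects."""
    stack = np.stack(frames)              # shape (T, H, W), grayscale
    return np.median(stack, axis=0).astype(stack.dtype)
```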
Edge bundling is an important concept heavily used for graph visualization purposes.
To enable the comparison with other established near-planarity models in graph drawing, we formulate a new edge-bundling model which is inspired by the recently introduced fan-planar graphs.
In particular, we restrict the bundling to the end segments of the edges.
Similarly to 1-planarity, we call our model 1-fan-bundle-planarity, as we allow at most one crossing per bundle.
For the two variants where we allow either one or, more naturally, both end segments of each edge to be part of bundles, we present edge density results and consider various recognition questions, not only for general graphs, but also for the outer and 2-layer variants.
We conclude with a series of challenging questions.
The notion of a Persistent Phylogeny generalizes the well-known Perfect phylogeny model that has been thoroughly investigated and is used to explain a wide range of evolutionary phenomena.
More precisely, while the Perfect Phylogeny model allows each character to be acquired once in the entire evolutionary history while character losses are not allowed, the Persistent Phylogeny model allows each character to be both acquired and lost exactly once in the evolutionary history.
The Persistent Phylogeny Problem (PPP) is the problem of reconstructing a Persistent phylogeny tree, if it exists, from a binary matrix where the rows represent the species (or the individuals) studied and the columns represent the characters that each species can have.
While the Perfect Phylogeny has a linear-time algorithm, the computational complexity of PPP was posed as an open question, albeit in an equivalent formulation, 20 years ago.
We settle the question by providing a polynomial time algorithm for the Persistent Phylogeny problem.
The scale of Android applications in the market is growing rapidly.
To efficiently detect malicious behavior in these applications, an array of static analysis tools has been proposed.
However, static analysis tools suffer from code hiding techniques like packing, dynamic loading, self-modifying code, and reflection.
In this paper, we thus present DexLego, a novel system that performs a reassembleable bytecode extraction for aiding static analysis tools to reveal the malicious behavior of Android applications.
DexLego leverages just-in-time collection to extract data and bytecode from an application at runtime, and reassembles them to a new Dalvik Executable (DEX) file offline.
The experiments on DroidBench and real-world applications show that DexLego correctly reconstructs the behavior of an application in the reassembled DEX file, and significantly improves analysis result of the existing static analysis systems.
We present a novel stereo vision algorithm that is capable of obstacle detection on a mobile-CPU processor at 120 frames per second.
Our system performs a subset of standard block-matching stereo processing, searching only for obstacles at a single depth.
By using an onboard IMU and state-estimator, we can recover the position of obstacles at all other depths, building and updating a full depth-map at framerate.
Here, we describe both the algorithm and our implementation on a high-speed, small UAV, flying at over 20 MPH (9 m/s) close to obstacles.
The system requires no external sensing or computation and is, to the best of our knowledge, the first high-framerate stereo detection system running onboard a small UAV.
Internet of Things is changing the world.
The manufacturing industry has already identified that the IoT brings great opportunities to retain its leading position in economy and society.
However, the adoption of this new technology changes the development process of the manufacturing system and raises many challenges.
In this paper the modern manufacturing system is considered as a composition of cyber-physical, cyber, and human components, and IoT is used as a glue for their integration with respect to their cyber interfaces.
The key idea is a UML profile for the IoT, with the alternative of applying the approach at the source-code-level specification of the component in case a UML design specification is not available.
The proposed approach, namely UML4IoT, fully automates the generation process of the IoT-compliant layer that is required for the cyber-physical component to be integrated in the modern IoT manufacturing environment.
A prototype implementation of the myLiqueur laboratory system has been developed to demonstrate the applicability and effectiveness of the UML4IoT approach.
This paper proposes a novel approach for uncertainty quantification in dense Conditional Random Fields (CRFs).
The presented approach, called Perturb-and-MPM, enables efficient, approximate sampling from dense multi-label CRFs via random perturbations.
An analytic error analysis was performed which identified the main cause of approximation error as well as showed that the error is bounded.
Spatial uncertainty maps can be derived from the Perturb-and-MPM model, which can be used to visualize uncertainty in image segmentation results.
The method is validated on synthetic and clinical Magnetic Resonance Imaging data.
The effectiveness of the approach is demonstrated on the challenging problem of segmenting the tumor core in glioblastoma.
We found that areas of high uncertainty correspond well to wrongly segmented image regions.
Furthermore, we demonstrate the potential use of uncertainty maps to refine imaging biomarkers in the case of extent of resection and residual tumor volume in brain tumor patients.
Graphics Processing Units allow for running massively parallel applications offloading the CPU from computationally intensive resources, however GPUs have a limited amount of memory.
In this paper a trie compression algorithm for massively parallel pattern matching is presented, requiring 85% less space than the original highly efficient Parallel Failure-less Aho-Corasick approach while sustaining over 22 Gbps throughput.
The algorithm presented takes advantage of compressed row storage matrices as well as shared and texture memory on the GPU.
The h-index can be a useful metric for evaluating a person's output of Internet media.
Here we advocate and demonstrate adaption of the h-index and the g-index to the top video content creators on YouTube.
The h-index for Internet video media is based on videos and their view counts.
The index h is defined as the number of videos with >= h*10^5 views.
The index g is defined as the number of videos with >= g*10^5 views on average.
When compared to a video creator's total view count, the h-index and g-index better capture both productivity and impact in a single metric.
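The two definitions above translate directly into code; a short sketch (function names are ours):

```python
def video_h_index(views, unit=10**5):
    """h = largest h such that h videos each have >= h*unit views."""
    v = sorted(views, reverse=True)
    h = 0
    while h < len(v) and v[h] >= (h + 1) * unit:
        h += 1
    return h

def video_g_index(views, unit=10**5):
    """g = largest g such that the top g videos have >= g*unit views
    on average (equivalently, their total is >= g^2 * unit)."""
    v = sorted(views, reverse=True)
    total, g = 0, 0
    while g < len(v):
        total += v[g]
        if total >= (g + 1) ** 2 * unit:
            g += 1
        else:
            break
    return g
```

For view counts [500000, 300000, 200000, 50000], the h-index is 2 (two videos with at least 200,000 views) while the g-index is 3, since as with the citation g-index a few highly viewed videos can raise g above h.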
Surprisingly promising results have been achieved by deep learning (DL) systems in recent years.
Many of these achievements have been reached in academic settings, or by large technology companies with highly skilled research groups and advanced supporting infrastructure.
For companies without large research groups or advanced infrastructure, building high-quality production-ready systems with DL components has proven challenging.
There is a clear lack of well-functioning tools and best practices for building DL systems.
It is the goal of this research to identify what the main challenges are, by applying an interpretive research approach in close collaboration with companies of varying size and type.
A set of seven projects has been selected to describe the potential of this new technology and to identify the associated main challenges.
A set of 12 main challenges has been identified and categorized into the three areas of development, production, and organizational challenges.
Furthermore, a mapping between the challenges and the projects is defined, together with selected motivating descriptions of how and why the challenges apply to specific projects.
Compared to other areas such as software engineering or database technologies, it is clear that DL is still rather immature and in need of further work to facilitate development of high-quality systems.
The challenges identified in this paper can be used to guide future research by the software engineering and DL communities.
Together, we could enable a large number of companies to start taking advantage of the high potential of the DL technology.
The sustainability of any Data Warehouse System (DWS) is closely correlated with user satisfaction.
Therefore, analysts, designers and developers have focused more on achieving its full functionality, without considering other kinds of requirements such as dependability aspects.
Moreover, the latter are often considered as properties of the system that must be checked and corrected once the project is completed.
The practice of "fix it later" can cause the obsolescence of the entire Data Warehouse System.
Therefore, a methodology is required that ensures the integration of dependability aspects from the early stages of a DWS project.
In this paper, we first define the concepts related to dependability of DWS.
Then we present our approach, inspired by the MDA (Model Driven Architecture) approach, to model dependability aspects, namely availability, reliability, maintainability and security, taking into account their interaction.
Cloud computing has changed the way of computing, with utility services offered through public networks.
Selecting multiple providers for various computational requirements improves performance and minimizes the cost of cloud services compared to choosing a single cloud provider.
Federated cloud improves scalability, cost minimization, performance maximization, collaboration with other providers, multi-site deployment for fault tolerance and recovery, reliability and less energy consumption.
Both providers and consumers could benefit from federated cloud where providers serve the consumers by satisfying Service Level Agreement, minimizing overall management and infrastructure cost; consumers get best services with less deployment cost and high availability.
Efficient provisioning of resources to consumers in federated cloud is a challenging task.
In this paper, the benefits of utilizing services from federated cloud, architecture with various coupling levels, different optimized resource provisioning methods and challenges associated with it are discussed and a comparative study is carried out over these aspects.
To cluster sequences given only their read-set representations, one may try to reconstruct each one from the corresponding read set, and then employ conventional (dis)similarity measures such as the edit distance on the assembled sequences.
This approach is however problematic and we propose instead to estimate the similarities directly from the read sets.
Our approach is based on an adaptation of the Monge-Elkan similarity known from the field of databases.
It avoids the NP-hard problem of sequence assembly.
For low coverage data it results in a better approximation of the true sequence similarities and consequently in better clustering, in comparison to the first-assemble-then-cluster approach.
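A minimal sketch of the Monge-Elkan idea on read sets follows. The inner similarity (normalized edit similarity) and the exact averaging are illustrative assumptions; the paper's adaptation may differ in its details.

```python
def edit_sim(a, b):
    """Normalized edit similarity: 1 - levenshtein(a, b) / max length."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1,
                         prev[j - 1] + (a[i - 1] != b[j - 1]))
        prev = cur
    return 1.0 - prev[n] / max(m, n, 1)

def monge_elkan(reads_a, reads_b):
    """Monge-Elkan similarity of two read sets: each read in A is
    matched to its most similar read in B, and the maxima are averaged.
    This avoids assembling either sequence."""
    return sum(max(edit_sim(a, b) for b in reads_b)
               for a in reads_a) / len(reads_a)
```

Note that Monge-Elkan is asymmetric; a common fix is to average both directions before clustering.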
While off-policy temporal difference (TD) methods have widely been used in reinforcement learning due to their efficiency and simple implementation, their Bayesian counterparts have not been utilized as frequently.
One reason is that the non-linear max operation in the Bellman optimality equation makes it difficult to define conjugate distributions over the value functions.
In this paper, we introduce a novel Bayesian approach to off-policy TD methods, called ADFQ, which updates beliefs on state-action values, Q, through an online Bayesian inference method known as Assumed Density Filtering.
In order to formulate a closed-form update, we approximately estimate analytic parameters of the posterior of the Q-beliefs.
Uncertainty measures in the beliefs not only are used in exploration but also provide a natural regularization for learning.
We show that ADFQ converges to Q-learning as the uncertainty measures of the Q-beliefs decrease.
ADFQ mitigates common drawbacks of other Bayesian RL algorithms such as computational complexity.
We also extend ADFQ with a neural network.
Our empirical results demonstrate that the proposed ADFQ algorithm outperforms comparable algorithms on various domains including continuous state domains and games from the Arcade Learning Environment.
Comparison between multidimensional persistent Betti numbers is often based on the multidimensional matching distance.
While this metric is rather simple to define and compute by considering a suitable family of filtering functions associated with lines having a positive slope, it has two main drawbacks.
First, it forgets the natural link between the homological properties of filtrations associated with lines that are close to each other.
As a consequence, part of the interesting homological information is lost.
Second, its intrinsically discontinuous definition makes it difficult to study its properties.
In this paper we introduce a new matching distance for 2D persistent Betti numbers, called coherent matching distance and based on matchings that change coherently with the filtrations we take into account.
Its definition is not trivial, as it must face the presence of monodromy in multidimensional persistence, i.e. the fact that different paths in the space parameterizing the above filtrations can induce different matchings between the associated persistence diagrams.
In our paper we prove that the coherent 2D matching distance is well-defined and stable.
In order to disseminate the exponentially growing body of knowledge being produced in the form of scientific publications, it would be best to design mechanisms that connect it with an already existing rich repository of concepts -- Wikipedia.
Not only does it make scientific reading simple and easy (by connecting the involved concepts used in the scientific articles to their Wikipedia explanations) but also improves the overall quality of the article.
In this paper, we present a novel metapath based method, WikiM, to efficiently wikify scientific abstracts -- a topic that has been rarely investigated in the literature.
One of the prime motivations for this work comes from the observation that wikified abstracts of scientific documents help a reader decide better, in comparison to plain abstracts, whether (s)he would be interested to read the full article.
We perform mention extraction mostly through traditional tf-idf measures coupled with a set of smart filters.
The entity linking heavily leverages the rich citation and author publication networks.
Our observation is that various metapaths defined over these networks can significantly enhance the overall performance of the system.
For mention extraction and entity linking, we outperform most of the competing state-of-the-art techniques by a large margin arriving at precision values of 72.42% and 73.8% respectively over a dataset from the ACL Anthology Network.
In order to establish the robustness of our scheme, we wikify three other datasets and get precision values of 63.41%-94.03% and 67.67%-73.29% respectively for the mention extraction and the entity linking phase.
In future traffic scenarios, vehicles and other traffic participants will be interconnected and equipped with various types of sensors, allowing for cooperation based on data or information exchange.
This article presents an approach to cooperative tracking of cyclists using smart devices and infrastructure-based sensors.
A smart device is carried by the cyclists and an intersection is equipped with a wide angle stereo camera system.
Two tracking models are presented and compared.
The first model is based on the stereo camera system detections only, whereas the second model cooperatively combines the camera based detections with velocity and yaw rate data provided by the smart device.
Our aim is to overcome limitations of tracking approaches based on single data sources.
We show in numerical evaluations on scenes where cyclists are starting or turning right that the cooperation leads to an improvement in both the ability to keep track of a cyclist and the accuracy of the track, particularly in the presence of occlusions in the visual system.
We, therefore, contribute to the safety of vulnerable road users in future traffic.
Lifted Relational Neural Networks (LRNNs) describe relational domains using weighted first-order rules which act as templates for constructing feed-forward neural networks.
While previous work has shown that using LRNNs can lead to state-of-the-art results in various ILP tasks, these results depended on hand-crafted rules.
In this paper, we extend the framework of LRNNs with structure learning, thus enabling a fully automated learning process.
Similarly to many ILP methods, our structure learning algorithm proceeds in an iterative fashion by top-down searching through the hypothesis space of all possible Horn clauses, considering the predicates that occur in the training examples as well as invented soft concepts entailed by the best weighted rules found so far.
In the experiments, we demonstrate the ability to automatically induce useful hierarchical soft concepts leading to deep LRNNs with a competitive predictive power.
The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children.
LDs affect about 10 percent of all children enrolled in schools.
The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time.
Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining.
Different rules extracted from the decision tree are used for prediction of learning disabilities.
Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD affected child.
In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters.
By applying these classification techniques, LD in any child can be identified.
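As a hedged illustration of the clustering side of this pipeline (J48 is Weka's C4.5 implementation and a full decision-tree learner is too long for a sketch), the K-means step amounts to Lloyd's algorithm:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means (Lloyd's algorithm): assign each observation to the
    nearest centroid, recompute centroids, and repeat until convergence.
    Clusters group observations sharing similar attribute patterns."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

On attribute vectors describing symptoms, each resulting cluster would group children with similar sign/symptom profiles, which is the role clustering plays in the study.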
Smartphone apps provide a vitally important opportunity for monitoring human mobility, human experience of ubiquitous information aids, and human activity in our increasingly well-instrumented spaces.
As wireless data capabilities move steadily up in performance, from 2G and 3G to 4G (today's LTE) and 5G, it has become more important to measure human activity in this connected world from the phones themselves.
The newer protocols serve larger areas than ever before and a wider range of data, not just voice calls, so only the phone can accurately measure its location.
Access to application activity permits not only monitoring the performance and spatial coverage with which users are served; as a crowd-sourced, unbiased background source of input on all these subjects, it also becomes a uniquely valuable resource for social science and government as well as telecom providers.
Membership inference attacks seek to infer membership of individual training instances of a model to which an adversary has black-box access through a machine learning-as-a-service API.
In providing an in-depth characterization of membership privacy risks against machine learning models, this paper presents a comprehensive study towards demystifying membership inference attacks from two complementary perspectives.
First, we provide a generalized formulation of the development of a black-box membership inference attack model.
Second, we characterize the importance of model choice on model vulnerability through a systematic evaluation of a variety of machine learning models and model combinations using multiple datasets.
Through formal analysis and empirical evidence from extensive experimentation, we characterize under what conditions a model may be vulnerable to such black-box membership inference attacks.
We show that membership inference vulnerability is data-driven and corresponding attack models are largely transferable.
Different model types display different vulnerabilities to membership inference, and so do different datasets.
Our empirical results additionally show that (1) using the type of target model under attack within the attack model may not increase attack effectiveness and (2) collaborative learning exposes vulnerabilities to membership inference risks when the adversary is a participant.
We also discuss countermeasure and mitigation strategies.
Dropout is a simple yet effective algorithm for regularizing neural networks by randomly dropping out units through Bernoulli multiplicative noise, and for some restricted problem classes, such as linear or logistic regression, several theoretical studies have demonstrated the equivalence between dropout and a fully deterministic optimization problem with data-dependent Tikhonov regularization.
This work presents a theoretical analysis of dropout for matrix factorization, where Bernoulli random variables are used to drop a factor, thereby attempting to control the size of the factorization.
While recent work has demonstrated the empirical effectiveness of dropout for matrix factorization, a theoretical understanding of the regularization properties of dropout in this context remains elusive.
This work demonstrates the equivalence between dropout and a fully deterministic model for matrix factorization in which the factors are regularized by the sum of the product of the norms of the columns.
While the resulting regularizer is closely related to a variational form of the nuclear norm, suggesting that dropout may limit the size of the factorization, we show that it is possible to trivially lower the objective value by doubling the size of the factorization.
We show that this problem is caused by the use of a fixed dropout rate, which motivates the use of a rate that increases with the size of the factorization.
Synthetic experiments validate our theoretical findings.
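The claimed equivalence can be checked numerically on a toy factorization. The specific regularizer below, (1-θ)/θ Σᵢ ‖uᵢ‖²‖vᵢ‖² for retain probability θ, is one form consistent with the "sum of products of column norms" description and should be treated as an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r, theta = 5, 4, 3, 0.7          # theta = retain probability
X = rng.standard_normal((m, n))
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))

# Deterministic equivalent: squared reconstruction error plus the
# dropout-induced regularizer (1 - theta)/theta * sum_i ||u_i||^2 ||v_i||^2.
det = np.sum((X - U @ V.T) ** 2) + (1 - theta) / theta * np.sum(
    (U ** 2).sum(axis=0) * (V ** 2).sum(axis=0))

# Monte-Carlo estimate of E_z ||X - (1/theta) U diag(z) V^T||_F^2
# with independent z_i ~ Bernoulli(theta).
T = 50000
z = (rng.random((T, r)) < theta) / theta        # scaled dropout masks
recon = np.einsum('mr,tr,nr->tmn', U, z, V)     # one reconstruction per mask
mc = ((X - recon) ** 2).sum(axis=(1, 2)).mean()
print(abs(mc - det) / det)                       # small relative error
```

Because the regularizer scales with the number of factors r only through the column norms, doubling r while halving the factors can lower the objective, which is the degeneracy the fixed-rate analysis in the paper identifies.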
Energy efficiency is a key requirement for the Internet of Things, as many sensors are expected to be completely stand-alone and able to run for years without battery replacement.
Data compression aims at saving some energy by reducing the volume of data sent over the network, but also affects the quality of the received information.
In this work, we formulate an optimization problem to jointly design the source coding and transmission strategies for time-varying channels and sources, with the twofold goal of extending the network lifetime and granting low distortion levels.
We propose a scalable offline optimal policy that allocates both energy and transmission parameters (i.e., times and powers) in a network with a dynamic Time Division Multiple Access (TDMA)-based access scheme.
After providing a brief historical overview on the synergies between artificial intelligence research, in the areas of evolutionary computations and machine learning, and the optimal design of interplanetary trajectories, we propose and study the use of deep artificial neural networks to represent, on-board, the optimal guidance profile of an interplanetary mission.
The results, limited to the chosen test case of an Earth-Mars orbital transfer, extend the findings made previously for landing scenarios and quadcopter dynamics, opening a new research area in interplanetary trajectory planning.
The University of Cambridge submission to the WMT18 news translation task focuses on the combination of diverse models of translation.
We compare recurrent, convolutional, and self-attention-based neural models on German-English, English-German, and Chinese-English.
Our final system combines all neural models together with a phrase-based SMT system in an MBR-based scheme.
We report small but consistent gains on top of strong Transformer ensembles.
An important goal common to domain adaptation and causal inference is to make accurate predictions when the distributions for the source (or training) domain(s) and target (or test) domain(s) differ.
In many cases, these different distributions can be modeled as different contexts of a single underlying system, in which each distribution corresponds to a different perturbation of the system, or in causal terms, an intervention.
We focus on a class of such causal domain adaptation problems, where data for one or more source domains are given, and the task is to predict the distribution of a certain target variable from measurements of other variables in one or more target domains.
We propose an approach for solving these problems that exploits causal inference and does not rely on prior knowledge of the causal graph, the type of interventions or the intervention targets.
We demonstrate our approach by evaluating a possible implementation on simulated and real world data.
The selection of the best classification algorithm for a given dataset is a very widespread problem, occurring each time one has to choose a classifier to solve a real-world problem.
It is also a complex task with many important methodological decisions to make.
Among those, one of the most crucial is the choice of an appropriate measure in order to properly assess the classification performance and rank the algorithms.
In this article, we focus on this specific task.
We present the most popular measures and compare their behavior through discrimination plots.
We then discuss their properties from a more theoretical perspective.
It turns out that several of them are equivalent for classifier comparison purposes.
Furthermore, they can also lead to interpretation problems.
Among the numerous measures proposed over the years, it appears that the classical overall success rate and marginal rates are the most suitable for the classifier comparison task.
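The two recommended measures are straightforward to compute from a confusion matrix; a minimal sketch (rows are assumed to hold the true classes, columns the predicted ones):

```python
import numpy as np

def overall_success_rate(cm):
    """Accuracy: correctly classified instances (the diagonal)
    over the total count."""
    return np.trace(cm) / cm.sum()

def marginal_rates(cm):
    """Per-class marginal rates: recall from the row sums (true-class
    marginals) and precision from the column sums (predicted-class
    marginals)."""
    recall = np.diag(cm) / cm.sum(axis=1)
    precision = np.diag(cm) / cm.sum(axis=0)
    return recall, precision
```

For the 2x2 matrix [[8, 2], [1, 9]], the overall success rate is 0.85 while the marginal rates expose the per-class asymmetry that accuracy alone hides.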
In this work we are interested in the problem of energy management in Mobile Ad-hoc Networks (MANETs).
Solving and optimizing this problem helps users operate their devices so as to minimize battery power consumption.
In this framework, we propose a model of the MANET in the form of a Constraint Optimization Problem, called COMANET.
Then, with the objective of minimizing battery power consumption, we present an approach, called MANED, based on an adaptation of the A* algorithm to the MANET problem.
Finally, we present experimental results showing the utility of this approach.
Recent deep learning based denoisers often outperform state-of-the-art conventional denoisers such as BM3D.
They are typically trained to minimize the mean squared error (MSE) between the output of a deep neural network and the ground truth image.
In deep learning based denoisers, it is important to use high quality noiseless ground truth for high performance, but it is often challenging or even infeasible to obtain such a clean image in application areas such as hyperspectral remote sensing and medical imaging.
We propose a Stein's Unbiased Risk Estimator (SURE) based method for training deep neural network denoisers using only noisy images.
We demonstrated that our SURE-based method without ground truth was able to train deep neural network denoisers to yield performance close to that of deep learning denoisers trained with ground truth, and to outperform the state-of-the-art BM3D.
Further improvements were achieved by including noisy test images for training denoiser networks using our proposed SURE based method.
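The core of SURE-based training without ground truth can be illustrated with a minimal Monte-Carlo SURE sketch: the divergence term is probed with a single random sign vector. All names are illustrative, and the paper's actual networks and training loop are not shown; a linear shrinkage "denoiser" is used here only because its divergence is known exactly.

```python
import random

def mc_sure_loss(denoise, y, sigma, eps=1e-4, rng=random):
    """Monte-Carlo SURE estimate of the MSE of `denoise` at noisy input y:
    SURE = ||y - f(y)||^2 / N  -  sigma^2  +  (2 sigma^2 / N) * div f(y),
    with the divergence approximated via one random probe b in {-1,+1}^N.
    """
    n = len(y)
    fy = denoise(y)
    b = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    fyb = denoise([yi + eps * bi for yi, bi in zip(y, b)])
    div = sum(bi * (f2 - f1) for bi, f1, f2 in zip(b, fy, fyb)) / eps
    fidelity = sum((yi - fi) ** 2 for yi, fi in zip(y, fy)) / n
    return fidelity - sigma ** 2 + (2.0 * sigma ** 2 / n) * div

# For a linear shrinkage "denoiser" f(y) = a*y the divergence is exactly a*N,
# so the Monte-Carlo estimate does not depend on the random probe.
shrink = lambda y, a=0.8: [a * yi for yi in y]
loss = mc_sure_loss(shrink, y=[1.0, -2.0, 0.5, 3.0], sigma=0.1)
```

In actual training, `denoise` would be the network and this loss would be minimized in place of the MSE against a clean target.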
As the interest in the representation of context dependent knowledge in the Semantic Web has been recognized, a number of logic based solutions have been proposed in this regard.
In our recent works, in response to this need, we presented the description logic-based Contextualized Knowledge Repository (CKR) framework.
CKR is not only a theoretical framework, but it has been effectively implemented over state-of-the-art tools for the management of Semantic Web data: inference inside and across contexts has been realized in the form of forward SPARQL-based rules over different RDF named graphs.
In this paper we present the first evaluation results for such CKR implementation.
In particular, in a first experiment we study its scalability with respect to different reasoning regimes.
In a second experiment we analyze the effects of knowledge propagation on the computation of inferences.
Maximum-likelihood estimation (MLE) is widely used in sequence to sequence tasks for model training.
It uniformly treats the generation/prediction of each target token as multi-class classification, and yields non-smooth prediction probabilities: in a target sequence, some tokens are predicted with small probabilities while others are predicted with large probabilities.
Our empirical study finds that this non-smoothness of the probabilities results in low quality of the generated sequences.
In this paper, we propose a sentence-wise regularization method which aims to output smooth prediction probabilities for all the tokens in the target sequence.
Our proposed method automatically adjusts the weights and gradients of each token in a sentence to ensure that the predictions in a sequence are uniformly good.
Experiments on three neural machine translation tasks and one text summarization task show that our method outperforms the conventional MLE loss on all these tasks and achieves promising BLEU scores on the WMT14 English-German and WMT17 Chinese-English translation tasks.
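As an illustration of the idea (not the authors' exact regularizer), one simple sentence-wise smoothness penalty adds the variance of the per-token log-probabilities to the MLE loss, so that a sequence in which one target token is nearly missed costs more than a uniformly confident one:

```python
import math

def smooth_mle_loss(token_probs, lam=0.1):
    """Token-averaged MLE loss plus a variance penalty on per-token
    log-probabilities.  The penalty vanishes when all target tokens
    receive the same probability (a perfectly "smooth" prediction).
    """
    logps = [math.log(p) for p in token_probs]
    n = len(logps)
    nll = -sum(logps) / n                         # standard MLE term
    mean = sum(logps) / n
    var = sum((lp - mean) ** 2 for lp in logps) / n
    return nll + lam * var

# A spiky sequence (one token nearly missed) incurs a larger loss than a
# uniformly confident one, even though its other tokens score higher.
smooth = smooth_mle_loss([0.6, 0.6, 0.6, 0.6])
spiky = smooth_mle_loss([0.9, 0.9, 0.9, 0.12])
```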
In this paper the problem of driving the state of a network of identical agents, modeled by boundary-controlled heat equations, towards a common steady-state profile is addressed.
Decentralized consensus protocols are proposed to address two distinct problems.
The first problem is that of steering the states of all agents towards the same constant steady-state profile, which corresponds to the spatial average of the agents' initial conditions.
A linear local interaction rule addressing this requirement is given.
The second problem deals with the case where the controlled boundaries of the agents' dynamics are corrupted by additive persistent disturbances.
To achieve synchronization between agents, while completely rejecting the effect of the boundary disturbances, a nonlinear sliding-mode based consensus protocol is proposed.
The performance of the proposed local interaction rules is analyzed using a Lyapunov-based approach.
Simulation results are presented to support the effectiveness of the proposed algorithms.
Over the last decade, the rise of the mobile internet and the usage of mobile devices has enabled ubiquitous traffic information.
With the increased adoption of specific smartphone applications, the number of users of routing applications has become large enough to disrupt traffic flow patterns in a significant manner.
Similarly, but at a slightly slower pace, novel services for freight transportation and city logistics improve the efficiency of goods transportation and change the use of road infrastructure.
The present article provides a general four-layer framework for modeling these new trends.
The main motivation behind the development is to provide a unifying formal system description that can at the same time encompass system physics (flow and motion of vehicles) as well as coordination strategies under various information and cooperation structures.
To showcase the framework, we apply it to the specific challenge of modeling and analyzing the integration of routing applications in today's transportation systems.
In this framework, at the lowest layer (flow dynamics) we distinguish app users from non-app users.
A distributed parameter model based on a non-local partial differential equation is introduced and analyzed.
The second layer incorporates connected services (e.g., routing) and other applications used to optimize the local performance of the system.
As inputs to those applications, we propose a third layer introducing the incentive design and global objectives, which are typically varying over the day depending on road and weather conditions, external events etc.
The high-level planning is handled on the fourth layer taking social long-term objectives into account.
We present a light formalism for proofs that encodes their inferential structure, along with a system that transforms these representations into flow-chart diagrams.
Such diagrams should improve the comprehensibility of proofs.
We discuss language syntax, diagram semantics, and our goal of building a repository of diagrammatic representations of proofs from canonical mathematical literature.
The repository will be available online in the form of a wiki at proofflow.org, where the flow chart drawing software will be deployable through the wiki editor.
We also consider the possibility of a semantic tagging of the assertions in a proof, to permit data mining.
We investigate a variant of variational autoencoders where there is a superstructure of discrete latent variables on top of the latent features.
In general, our superstructure is a tree structure of multiple super latent variables and it is automatically learned from data.
When there is only one latent variable in the superstructure, our model reduces to one that assumes the latent features to be generated from a Gaussian mixture model.
We call our model the latent tree variational autoencoder (LTVAE).
Whereas previous deep learning methods for clustering produce only one partition of data, LTVAE produces multiple partitions of data, each being given by one super latent variable.
This is desirable because high dimensional data usually have many different natural facets and can be meaningfully partitioned in multiple ways.
The two significant tasks of a focused Web crawler are finding relevant topic-specific documents on the Web and analytically prioritizing them for later effective and reliable download.
For the first task, we propose a sophisticated custom algorithm to fetch and analyze the most effective HTML structural elements of the page as well as the topical boundary and anchor text of each unvisited link, based on which the topical focus of an unvisited page can be predicted and elicited with a high accuracy.
Thus, our novel method uniquely combines both link-based and content-based approaches.
For the second task, we propose a scoring function of the relevant URLs through the use of T-Graph (Treasure Graph) to assist in prioritizing the unvisited links that will later be put into the fetching queue.
Our Web search system is called the Treasure-Crawler.
This research paper presents the architectural design of the Treasure-Crawler system, which satisfies the principal requirements of a focused Web crawler, and asserts the correctness of the system structure, including all its modules, through illustrations and test results.
The survey data sets are important sources of data and their successful exploitation is of key importance for informed policy-decision making.
We present how a survey analysis approach initially developed for customer satisfaction research in marketing can be adapted for the introduction of clinical pharmacy services into hospital.
We use two analytical approaches to extract relevant managerial consequences.
With the OrdEval algorithm, we first evaluate the importance of competences for the users of clinical pharmacy and extract their nature according to the users' expectations.
Next, we build a model for predicting a successful introduction of clinical pharmacy to the clinical departments.
We identify the wards with the highest probability of successful cooperation with a clinical pharmacist.
We obtain useful managerially relevant information from a relatively small sample of highly relevant respondents.
We show how the OrdEval algorithm exploits the information hidden in the ordering of class and attribute values and their inherent correlation.
Its output can be effectively visualized and complemented with confidence intervals.
The recently developed variational autoencoders (VAEs) have proved to be an effective confluence of the rich representational power of neural networks with Bayesian methods.
However, most work on VAEs uses a rather simple prior over the latent variables, such as the standard normal distribution, thereby restricting their application to relatively simple phenomena.
In this work, we propose hierarchical nonparametric variational autoencoders, which combine tree-structured Bayesian nonparametric priors with VAEs to enable infinite flexibility of the latent representation space.
Both the neural parameters and Bayesian priors are learned jointly using tailored variational inference.
The resulting model induces a hierarchical structure of latent semantic concepts underlying the data corpus, and infers accurate representations of data instances.
We apply our model in video representation learning.
Our method is able to discover highly interpretable activity hierarchies, and obtain improved clustering accuracy and generalization capacity based on the learned rich representations.
Finding minimum distortion of adversarial examples and thus certifying robustness in neural network classifiers for given data points is known to be a challenging problem.
Nevertheless, recently it has been shown to be possible to give a non-trivial certified lower bound of minimum adversarial distortion, and some recent progress has been made towards this direction by exploiting the piece-wise linear nature of ReLU activations.
However, a generic robustness certification for general activation functions still remains largely unexplored.
To address this issue, in this paper we introduce CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points.
The novelty in our algorithm consists of bounding a given activation function with linear and quadratic functions, hence allowing it to tackle general activation functions including but not limited to four popular choices: ReLU, tanh, sigmoid and arctan.
In addition, we facilitate the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation.
Experimental results show that CROWN on ReLU networks can notably improve the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while having comparable computational efficiency.
Furthermore, CROWN also demonstrates its effectiveness and flexibility on networks with general activation functions, including tanh, sigmoid and arctan.
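The bounding idea can be illustrated for the sigmoid on an interval in its concave regime; CROWN's general algorithm covers all regimes and propagates the bounds layer by layer, which this sketch does not attempt:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def linear_bounds_sigmoid(l, u):
    """Linear bounds a*x + b for sigmoid on [l, u] in the concave regime
    l >= 0 (a simplified special case of CROWN-style bounding).
    Lower bound: the chord through (l, s(l)) and (u, s(u)).
    Upper bound: the tangent at the midpoint (valid since s is concave here).
    """
    assert 0 <= l < u
    sl, su = sigmoid(l), sigmoid(u)
    a_lo = (su - sl) / (u - l)          # chord slope
    b_lo = sl - a_lo * l
    m = 0.5 * (l + u)
    sm = sigmoid(m)
    a_up = sm * (1.0 - sm)              # sigmoid'(m)
    b_up = sm - a_up * m
    return (a_lo, b_lo), (a_up, b_up)

lo, up = linear_bounds_sigmoid(0.5, 2.0)
```

With such linear surrogates in place of each activation, the network output can be bounded by linear functions of the input, which yields the certified distortion bound.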
We propose to solve any problem on discrete variables by a technique of statistical estimation using deterministic convex analysis.
In this framework, the variables are represented by their probability and the distinction between the complexity classes vanishes.
The method is illustrated by solving the 3-SAT problem in polynomial time.
Information theory allows us to investigate information processing in neural systems in terms of information transfer, storage and modification.
Especially the measure of information transfer, transfer entropy, has seen a dramatic surge of interest in neuroscience.
Estimating transfer entropy from two processes requires the observation of multiple realizations of these processes to estimate associated probability density functions.
To obtain these observations, available estimators assume stationarity of processes to allow pooling of observations over time.
This assumption however, is a major obstacle to the application of these estimators in neuroscience as observed processes are often non-stationary.
As a solution, Gomez-Herrero and colleagues theoretically showed that the stationarity assumption may be avoided by estimating transfer entropy from an ensemble of realizations.
Such an ensemble is often readily available in neuroscience experiments in the form of experimental trials.
Thus, in this work we combine the ensemble method with a recently proposed transfer entropy estimator to make transfer entropy estimation applicable to non-stationary time series.
We present an efficient implementation of the approach that deals with the increased computational demand of the ensemble method's practical application.
In particular, we use a massively parallel implementation for a graphics processing unit to handle the computationally most heavy aspects of the ensemble method.
We test the performance and robustness of our implementation on data from simulated stochastic processes and demonstrate the method's applicability to magnetoencephalographic data.
While we mainly evaluate the proposed method for neuroscientific data, we expect it to be applicable in a variety of fields that are concerned with the analysis of information transfer in complex biological, social, and artificial systems.
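The ensemble idea can be sketched for discrete states: probabilities at a fixed time point are estimated by counting across trials rather than across time, so no stationarity is assumed. This is a minimal illustration only; practical estimators use delay embeddings and continuous (e.g. nearest-neighbor) estimators as in the work above.

```python
from collections import Counter
from math import log2

def ensemble_transfer_entropy(x_trials, y_trials, t):
    """Discrete-state transfer entropy X -> Y at time t, with all
    probabilities estimated by pooling over trials (the ensemble):
    TE = sum p(y_{t+1}, y_t, x_t) log2 [ p(y_{t+1}|y_t,x_t) / p(y_{t+1}|y_t) ].
    """
    triples, pairs, cond, marg = Counter(), Counter(), Counter(), Counter()
    n = len(x_trials)
    for x, y in zip(x_trials, y_trials):
        triples[(y[t + 1], y[t], x[t])] += 1
        pairs[(y[t], x[t])] += 1
        cond[(y[t + 1], y[t])] += 1
        marg[y[t]] += 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_cond_full = c / pairs[(y0, x0)]         # p(y_{t+1} | y_t, x_t)
        p_cond_self = cond[(y1, y0)] / marg[y0]   # p(y_{t+1} | y_t)
        te += (c / n) * log2(p_cond_full / p_cond_self)
    return te

# Y copies X with one step of lag, so X fully determines Y's next state.
xs = [[0, 0], [0, 0], [1, 1], [1, 1]]
ys = [[0, 0], [0, 0], [0, 1], [0, 1]]
te = ensemble_transfer_entropy(xs, ys, t=0)   # 1 bit
```

In a neuroscience setting, each trial of an experiment supplies one realization, and the estimate can be repeated for every time point of interest.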
The historical research line on the algebraic properties of structured CF languages initiated by McNaughton's Parenthesis Languages has recently attracted much renewed interest with the Balanced Languages, the Visibly Pushdown Automata languages (VPDA), the Synchronized Languages, and the Height-deterministic ones.
Such families preserve to a varying degree the basic algebraic properties of Regular languages: boolean closure, closure under reversal, under concatenation, and Kleene star.
We prove that the VPDA family is strictly contained within the Floyd Grammars (FG) family historically known as operator precedence.
Languages over the same precedence matrix are known to be closed under boolean operations, and are recognized by a machine whose pop or push operations on the stack are purely determined by terminal letters.
We characterize VPDA's as the subclass of FG having a peculiarly structured set of precedence relations, and balanced grammars as a further restricted case.
The non-counting invariance property of FG has a direct implication for VPDA too.
Graph representations have increasingly grown in popularity during the last years.
Existing representation learning approaches explicitly encode network structure.
Despite their good performance in downstream processes (e.g., node classification, link prediction), there is still room for improvement in different aspects, like efficacy, visualization, and interpretability.
In this paper, we propose, t-PINE, a method that addresses these limitations.
Contrary to baseline methods, which generally learn explicit graph representations using only an adjacency matrix, t-PINE exploits a multi-view information graph: the adjacency matrix forms the first view, and a nearest-neighbor adjacency computed over the node features forms the second view. It then learns explicit and implicit node representations using the Canonical Polyadic (a.k.a. CP) decomposition.
We argue that the implicit and explicit mappings from a higher-dimensional to a lower-dimensional vector space are the key to learning more useful, highly predictable, and gracefully interpretable representations.
Having good interpretable representations provides a good guidance to understand how each view contributes to the representation learning process.
In addition, it helps us to exclude unrelated dimensions.
Extensive experiments show that t-PINE drastically outperforms baseline methods by up to 158.6% with respect to Micro-F1, in several multi-label classification problems, while it has high visualization and interpretability utility.
Motivated by value function estimation in reinforcement learning, we study statistical linear inverse problems, i.e., problems where the coefficients of a linear system to be solved are observed in noise.
We consider penalized estimators, where performance is evaluated using a matrix-weighted two-norm of the defect of the estimator measured with respect to the true, unknown coefficients.
Two objective functions are considered, depending on whether the error of the defect measured with respect to the noisy coefficients is squared or unsquared.
We propose simple, yet novel and theoretically well-founded data-dependent choices for the regularization parameters for both cases that avoid data-splitting.
A distinguishing feature of our analysis is that we derive deterministic error bounds in terms of the error of the coefficients, thus allowing the complete separation of the analysis of the stochastic properties of these errors.
We show that our results lead to new insights and bounds for linear value function estimation in reinforcement learning.
We propose a novel semi-direct approach for monocular simultaneous localization and mapping (SLAM) that combines the complementary strengths of direct and feature-based methods.
The proposed pipeline loosely couples direct odometry and feature-based SLAM to perform three levels of parallel optimizations: (1) photometric bundle adjustment (BA) that jointly optimizes the local structure and motion, (2) geometric BA that refines keyframe poses and associated feature map points, and (3) pose graph optimization to achieve global map consistency in the presence of loop closures.
This is achieved in real-time by limiting the feature-based operations to marginalized keyframes from the direct odometry module.
Exhaustive evaluation on two benchmark datasets demonstrates that our system outperforms the state-of-the-art monocular odometry and SLAM systems in terms of overall accuracy and robustness.
Contexts play an important role in the saliency detection task.
However, given a context region, not all contextual information is helpful for the final task.
In this paper, we propose a novel pixel-wise contextual attention network, i.e., the PiCANet, to learn to selectively attend to informative context locations for each pixel.
Specifically, for each pixel, it can generate an attention map in which each attention weight corresponds to the contextual relevance at each context location.
An attended contextual feature can then be constructed by selectively aggregating the contextual information.
We formulate the proposed PiCANet in both global and local forms to attend to global and local contexts, respectively.
Both models are fully differentiable and can be embedded into CNNs for joint training.
We also incorporate the proposed models with the U-Net architecture to detect salient objects.
Extensive experiments show that the proposed PiCANets can consistently improve saliency detection performance.
The global and local PiCANets facilitate learning global contrast and homogeneousness, respectively.
As a result, our saliency model can detect salient objects more accurately and uniformly, thus performing favorably against the state-of-the-art methods.
We consider algorithms for "smoothed online convex optimization" problems, a variant of the class of online convex optimization problems that is strongly related to metrical task systems.
Prior literature on these problems has focused on two performance metrics: regret and the competitive ratio.
There exist known algorithms with sublinear regret and known algorithms with constant competitive ratios; however, no known algorithm achieves both simultaneously.
We show that this is due to a fundamental incompatibility between these two metrics - no algorithm (deterministic or randomized) can achieve sublinear regret and a constant competitive ratio, even in the case when the objective functions are linear.
However, we also exhibit an algorithm that, for the important special case of one-dimensional decision spaces, provides sublinear regret while maintaining a competitive ratio that grows arbitrarily slowly.
This article examines the transformation of the theory and practice of marketing under e-commerce and the network economy.
The author considers Internet marketing as an independent form of marketing communication in a virtual environment.
The main thesis of the article is that the virtual environment drives the transformation of marketing, changing the methods, priorities and structure not only of marketing practice but also of marketing theory.
The handwriting of an individual may vary substantially with factors such as mood, time, space, writing speed, writing medium and tool, writing topic, etc.
It becomes challenging to perform automated writer verification/identification on a particular set of handwritten patterns (e.g., speedy handwriting) of a person, especially when the system is trained using a different set of writing patterns (e.g., normal speed) of that same person.
However, it would be interesting to experimentally analyze if there exists any implicit characteristic of individuality which is insensitive to high intra-variable handwriting.
In this paper, we study some handcrafted features and auto-derived features extracted from intra-variable writing.
Here, we work on writer identification/verification from offline Bengali handwriting of high intra-variability.
To this end, we use various models mainly based on handcrafted features with SVM (Support Vector Machine) and features auto-derived by the convolutional network.
For experimentation, we have generated two handwritten databases from two different sets of 100 writers and enlarged the dataset by a data-augmentation technique.
We have obtained some interesting results.
Network Functions Virtualization (NFV) and Network Coding (NC) have attracted much attention in recent years as key concepts for providing 5G networks with flexibility and differentiated reliability, respectively.
In this paper, we present the integration of NC architectural design and NFV.
To do so, we first describe a virtualization process built on our proposed NC architectural design, intended to offer reliability functionality to a network.
The process consists of identifying the required functional entities of NC and analyzing when the functionality should be activated for complexity/energy efficiency.
The relevance of our proposed NC function virtualization lies in its applicability to any underlying physical network, whether satellite or hybrid, thus enabling softwarization and rapid, innovative deployment.
Finally, we validate our framework on a case study of geo-control of network reliability, based on the device's geographical location-based signal/network information.
We present NAVREN-RL, an approach to NAVigate an unmanned aerial vehicle in an indoor Real ENvironment via end-to-end Reinforcement Learning (RL).
A suitable reward function is designed keeping in mind the cost and weight constraints of a micro drone with a minimum number of sensing modalities.
The collection of a small amount of expert data and knowledge-based data aggregation are integrated into the RL process to aid convergence.
Experimentation is carried out on a Parrot AR drone in different indoor arenas and the results are compared with other baseline technologies.
We demonstrate how the drone successfully avoids obstacles and navigates across different arenas.
Widespread use of memory unsafe programming languages (e.g., C and C++) leaves many systems vulnerable to memory corruption attacks.
A variety of defenses have been proposed to mitigate attacks that exploit memory errors to hijack the control flow of the code at run-time, e.g., (fine-grained) randomization or Control Flow Integrity.
However, recent work on data-oriented programming (DOP) demonstrated highly expressive (Turing-complete) attacks, even in the presence of these state-of-the-art defenses.
Although multiple real-world DOP attacks have been demonstrated, no efficient defenses are yet available.
We propose run-time scope enforcement (RSE), a novel approach designed to efficiently mitigate all currently known DOP attacks by enforcing compile-time memory safety constraints (e.g., variable visibility rules) at run-time.
We present HardScope, a proof-of-concept implementation of hardware-assisted RSE for the new RISC-V open instruction set architecture.
We discuss our systematic empirical evaluation of HardScope which demonstrates that it can mitigate all currently known DOP attacks, and has a real-world performance overhead of 3.2% in embedded benchmarks.
In "Reliable Communication in the Absence of a Common Clock" (Yeung et al., 2009), the authors introduce general run-length sets, which form a class of constrained systems that permit run-lengths from a countably infinite set.
For a particular definition of probabilistic capacity, they show that probabilistic capacity is equal to combinatorial capacity.
In the present work, it is shown that the same result also holds for Shannon's original definition of probabilistic capacity.
The derivation presented here is based on generating functions of constrained systems as developed in "On the Capacity of Constrained Systems" (Boecherer et al., 2010) and provides a unified information-theoretic treatment of general run-length sets.
Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train.
However, normalizing along the batch dimension introduces problems --- BN's error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation.
This limits BN's usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption.
In this paper, we present Group Normalization (GN) as a simple alternative to BN.
GN divides the channels into groups and computes within each group the mean and variance for normalization.
GN's computation is independent of batch sizes, and its accuracy is stable in a wide range of batch sizes.
On ResNet-50 trained on ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants.
Moreover, GN can be naturally transferred from pre-training to fine-tuning.
GN can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks.
GN can be easily implemented by a few lines of code in modern libraries.
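As the abstract notes, GN takes only a few lines; here is a dependency-free sketch for a single sample, with channels split into groups and mean/variance computed per group (the learned per-channel affine parameters gamma/beta are omitted):

```python
def group_norm(x, num_groups, eps=1e-5):
    """Group Normalization for one sample x given as a list of C channels,
    each a list of spatial values.  Channels are split into `num_groups`
    groups; mean and variance are computed within each group, so the
    computation is independent of any batch dimension.
    """
    c = len(x)
    assert c % num_groups == 0
    gsize = c // num_groups
    out = []
    for g in range(num_groups):
        chans = x[g * gsize:(g + 1) * gsize]
        flat = [v for ch in chans for v in ch]
        mean = sum(flat) / len(flat)
        var = sum((v - mean) ** 2 for v in flat) / len(flat)
        scale = (var + eps) ** -0.5
        for ch in chans:
            out.append([(v - mean) * scale for v in ch])
    return out

# 4 channels, 2 spatial positions, 2 groups of 2 channels each.
x = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0], [10.0, 10.0]]
y = group_norm(x, num_groups=2)
```

In a framework implementation the same effect is obtained by reshaping an (N, C, H, W) tensor to (N, G, C//G, H, W) and normalizing over the last three axes.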
Collaborative Filtering (CF) is one of the most commonly used recommendation methods.
CF consists in predicting whether, or how much, a user will like (or dislike) an item by leveraging the knowledge of the user's preferences as well as that of other users.
In practice, users interact and express their opinion on only a small subset of items, which makes the corresponding user-item rating matrix very sparse.
Such data sparsity yields two main problems for recommender systems: (1) the lack of data to effectively model users' preferences, and (2) the lack of data to effectively model item characteristics.
However, there are often many other data sources that are available to a recommender system provider, which can describe user interests and item characteristics (e.g., users' social network, tags associated to items, etc.).
These valuable data sources may supply useful information to enhance a recommendation system in modeling users' preferences and item characteristics more accurately and thus, hopefully, to make recommenders more precise.
For various reasons, these data sources may be managed by clusters of different data centers, thus requiring the development of distributed solutions.
In this paper, we propose a new distributed collaborative filtering algorithm, which exploits and combines multiple and diverse data sources to improve recommendation quality.
Our experimental evaluation using real datasets shows the effectiveness of our algorithm compared to state-of-the-art recommendation algorithms.
Traditionally, the problem was that researchers did not have access to enough spatial data to answer pressing research questions or build compelling visualizations.
Today, however, the problem is often that we have too much data.
Spatially redundant or approximately redundant points may refer to a single feature (plus noise) rather than many distinct spatial features.
We use a machine learning approach with density-based clustering to compress such spatial data into a set of representative features.
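A minimal, dependency-free sketch of the compression step: a naive DBSCAN-style clustering merges points within `eps` of one another and replaces each cluster by its centroid, keeping isolated points as-is. Parameter values and function names are illustrative, not the paper's.

```python
def compress_points(points, eps=1.0, min_pts=2):
    """Compress spatially redundant points into representative features
    using a naive density-based (DBSCAN-like) clustering.  Each cluster
    becomes one centroid; noise points are kept unchanged."""
    n = len(points)

    def near(i, j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps ** 2

    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if near(i, j)]
        if len(neighbors) < min_pts:
            labels[i] = -1                      # noise (may be absorbed later)
            continue
        stack = list(neighbors)
        while stack:                            # grow the cluster from core points
            j = stack.pop()
            if labels[j] in (None, -1):
                labels[j] = cluster
                more = [k for k in range(n) if near(j, k)]
                if len(more) >= min_pts:
                    stack.extend(k for k in more if labels[k] in (None, -1))
        cluster += 1
    reps = []
    for c in range(cluster):
        members = [points[i] for i in range(n) if labels[i] == c]
        dim = len(members[0])
        reps.append(tuple(sum(p[d] for p in members) / len(members) for d in range(dim)))
    reps.extend(tuple(points[i]) for i in range(n) if labels[i] == -1)
    return reps

pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (20.0, 20.0)]
reps = compress_points(pts, eps=0.5)
```

Here six raw points compress to three representative features: two cluster centroids and one isolated point.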
We present a novel framework for finding complex activities matching user-described queries in cluttered surveillance videos.
The wide diversity of queries coupled with unavailability of annotated activity data limits our ability to train activity models.
To bridge the semantic gap, we propose to let users describe an activity as a semantic graph, with object attributes and inter-object relationships associated with nodes and edges, respectively.
We learn node/edge-level visual predictors during training and, at test-time, propose to retrieve activity by identifying likely locations that match the semantic graph.
We formulate a novel CRF based probabilistic activity localization objective that accounts for mis-detections, mis-classifications and track-losses, and outputs a likelihood score for a candidate grounded location of the query in the video.
We seek groundings that maximize overall precision and recall.
To handle the combinatorial search over all high-probability groundings, we propose a highest precision subgraph matching algorithm.
Our method outperforms existing retrieval methods on benchmarked datasets.
The electricity market is threatened by supply scarcity, which may lead to very sharp price spikes in the spot market.
On the other hand, demand-side's activities could effectively mitigate the supply scarcity and absorb most of these shocks and therefore smooth out the price volatility.
In this paper, the positive effects of employing demand response programs on the spot market price are investigated.
A demand-price elasticity based model is used to simulate the customer reaction function in the presence of a real time pricing.
The demand achieved by the DR program is then used to adjust the spot market price through a price regression model.
SAS software is used to run the multiple linear regression model and MATLAB is used to simulate the demand response model.
The approach is applied on one week data in summer 2014 of Connecticut in New England ISO.
It could be concluded from the results of this study that applying DR program smooths out most of the price spikes in the electricity spot market and considerably reduces the customers' electricity cost.
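A toy sketch of the pipeline described above, with made-up elasticity and regression coefficients (the study fits the actual regression in SAS and simulates the DR model in MATLAB):

```python
def dr_demand(base_demand, base_price, rtp_price, elasticity=-0.2):
    """Customer reaction under real-time pricing via a constant demand-price
    elasticity: relative demand change = elasticity * relative price change.
    The elasticity value here is an assumed illustration."""
    return base_demand * (1.0 + elasticity * (rtp_price - base_price) / base_price)

def spot_price(demand, intercept=5.0, slope=0.8):
    """Toy linear price-regression model: price = intercept + slope * demand.
    Both coefficients are placeholders, not fitted values."""
    return intercept + slope * demand
```

A price spike lowers demand (negative elasticity), which in turn pulls the regressed spot price back down.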
Object detection is considered as one of the most challenging problems in computer vision, since it requires correct prediction of both classes and locations of objects in images.
In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD) where no visual training data is available for some of the target object classes.
We present a novel approach to tackle this ZSD problem, where a convex combination of embeddings is used in conjunction with a detection framework.
For evaluation of ZSD methods, we propose a simple dataset constructed from Fashion-MNIST images and also a custom zero-shot split for the Pascal VOC detection challenge.
The experimental results suggest that our method yields promising results for ZSD.
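One plausible reading of the convex-combination idea, sketched with toy embeddings (this is an illustration under assumed inputs, not the paper's exact formulation):

```python
import numpy as np

def zero_shot_label(seen_probs, seen_emb, unseen_emb):
    """Map a detection to an unseen class: form the convex combination of
    seen-class word embeddings weighted by the detector's class probabilities,
    then pick the unseen class whose embedding is most cosine-similar."""
    combo = np.asarray(seen_probs) @ seen_emb            # convex combination
    sims = unseen_emb @ combo / (
        np.linalg.norm(unseen_emb, axis=1) * np.linalg.norm(combo))
    return int(np.argmax(sims))
```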
The rising trend of coauthored academic works obscures the credit assignment that is the basis for decisions of funding and career advancements.
In this paper, a simple model based on the assumption of an unvarying "author ability" is introduced.
With this assumption, the weight of author contributions to a body of coauthored work can be statistically estimated.
The method is tested on a set of more than five hundred authors in a coauthor network from the CiteSeerX database.
The ranking obtained agrees fairly well with that given by total fractional citation counts for an author, but noticeable differences exist.
To participate in the Outback Medical Express UAV Challenge 2016, a vehicle was designed and tested that can hover precisely, take-off and land vertically, fly fast forward efficiently and use computer vision to locate a person and a suitable landing location.
A rotor blade was designed that can deliver sufficient thrust in hover, while still being efficient in fast forward flight.
Energy measurements and wind-tunnel tests were performed.
A rotor-head and corresponding control algorithms were developed to allow transitioning flight with the non-conventional rotor dynamics.
Dedicated electronics were designed that meet vehicle needs and regulations to allow safe flight beyond visual line of sight.
Vision based search and guidance algorithms were developed and tested.
Flight tests and a competition participation illustrate the applicability of the DelftaCopter concept.
We make available to the community a new dataset to support action-recognition research.
This dataset is different from prior datasets in several key ways.
It is significantly larger.
It contains streaming video with long segments containing multiple action occurrences that often overlap in space and/or time.
All actions were filmed in the same collection of backgrounds so that background gives little clue as to action class.
We had five humans replicate the annotation of temporal extent of action occurrences labeled with their class and measured a surprisingly low level of intercoder agreement.
A baseline experiment shows that recent state-of-the-art methods perform poorly on this dataset.
This suggests that this will be a challenging dataset to foster advances in action-recognition research.
This manuscript serves to describe the novel content and characteristics of the LCA dataset, present the design decisions made when filming the dataset, and document the novel methods employed to annotate the dataset.
Future networks are expected to connect an enormous number of nodes wirelessly using wide-band transmission.
This brings great challenges.
To avoid collecting a large amount of data from the massive number of nodes, computation over multi-access channel (CoMAC) is proposed to compute a desired function over the air utilizing the signal-superposition property of MAC.
Due to frequency selective fading, wide-band CoMAC is more challenging and has never been studied before.
In this work, we propose the use of orthogonal frequency division multiplexing (OFDM) in wide-band CoMAC to transmit functions in a similar way to bit sequences, through the division, allocation and reconstruction of functions.
An achievable rate without any adaptive resource allocation is derived.
To prevent the computation rate from vanishing as the number of nodes increases, a novel sub-function allocation over sub-carriers is derived.
Furthermore, we formulate an optimization problem considering power allocation.
A sponge-squeezing algorithm adapted from the classical water-filling algorithm is proposed to solve the optimal power allocation problem.
The improved computation rate of the proposed framework and the corresponding allocation has been verified through both theoretical analysis and simulation.
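For background, the classical water-filling allocation that sponge-squeezing adapts; this bisection version is a generic sketch, not the paper's algorithm:

```python
import numpy as np

def water_filling(gains, total_power, iters=60):
    """Classical water-filling: p_i = max(0, mu - 1/g_i), with the water
    level mu found by bisection so that the powers sum to total_power."""
    inv = 1.0 / np.asarray(gains, dtype=float)  # per-channel "floor" height
    lo, hi = 0.0, inv.max() + total_power       # the water level lies in [lo, hi]
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - inv, 0.0).sum() > total_power:
            hi = mu   # too much water poured, lower the level
        else:
            lo = mu
    return np.maximum(0.5 * (lo + hi) - inv, 0.0)
```

Strong channels receive more power; channels whose floor lies above the water level receive none.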
The problem of recovering a signal from its phaseless Fourier transform measurements, called Fourier phase retrieval, arises in many applications in engineering and science.
Fourier phase retrieval poses fundamental theoretical and algorithmic challenges.
In general, there is no unique mapping between a one-dimensional signal and its Fourier magnitude and therefore the problem is ill-posed.
Additionally, while almost all multidimensional signals are uniquely mapped to their Fourier magnitude, the performance of existing algorithms is generally not well-understood.
In this chapter we survey methods to guarantee uniqueness in Fourier phase retrieval.
We then present different algorithmic approaches to retrieve the signal in practice.
We conclude by outlining some of the main open questions in this field.
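One of the standard algorithmic approaches surveyed in such chapters is Fienup's error-reduction method (alternating projections); a minimal 1-D sketch with a known-support constraint:

```python
import numpy as np

def error_reduction(mag, support, n_iter=200, seed=0):
    """Fienup-style error reduction: alternately impose the measured Fourier
    magnitudes and a known support / realness constraint in signal space."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(len(mag)) * support
    errs = []
    for _ in range(n_iter):
        X = np.fft.fft(x)
        errs.append(np.linalg.norm(np.abs(X) - mag))   # Fourier-domain residual
        X = mag * np.exp(1j * np.angle(X))             # keep phase, fix magnitude
        x = np.real(np.fft.ifft(X)) * support          # project back onto support
    return x, errs
```

The Fourier-domain residual is non-increasing over iterations, although the method can stagnate in local minima, one of the algorithmic challenges noted above.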
We perform a Systematic Literature Review to discover how Humanoid robots are being applied in Socially Assistive Robotics experiments.
Our search returned 24 papers, from which 16 were included for closer analysis.
To do this analysis we used a conceptual framework inspired by Behavior-based Robotics.
The results of this study can be used for designing software frameworks targeting Humanoid Socially Assistive Robotics, especially in the context of Software Product Line Engineering projects.
One of the key research interests in the area of Constraint Satisfaction Problem (CSP) is to identify tractable classes of constraints and develop efficient solutions for them.
In this paper, we introduce generalized staircase (GS) constraints, which are an important generalization of one such tractable class found in the literature, namely, staircase constraints.
GS constraints are of two kinds, down staircase (DS) and up staircase (US).
We first examine several properties of GS constraints, and then show that arc consistency is sufficient to determine a solution to a CSP over DS constraints.
Further, we propose an optimal O(cd) time and space algorithm to compute arc consistency for GS constraints where c is the number of constraints and d is the size of the largest domain.
Next, observing that arc consistency is not necessary for solving a DSCSP, we propose a more efficient algorithm for solving it.
With regard to US constraints, arc consistency is not known to be sufficient to determine a solution, and therefore, methods such as path consistency or variable elimination are required.
Since arc consistency acts as a subroutine for these existing methods, replacing it by our optimal O(cd) arc consistency algorithm produces a more efficient method for solving a USCSP.
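For context, a generic AC-3 style arc consistency routine (the standard textbook algorithm, not the optimal O(cd) algorithm proposed above, which exploits the staircase structure):

```python
from collections import deque

def ac3(domains, constraints):
    """Generic AC-3: prune domain values with no support; returns False if
    some domain wipes out. constraints[(x, y)] is a predicate on (vx, vy)."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        unsupported = {vx for vx in domains[x]
                       if not any(pred(vx, vy) for vy in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False  # domain wipe-out: no solution
            # re-examine arcs pointing at the revised variable
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True
```

For example, enforcing x < y over domains {1, 2, 3} prunes x = 3 and y = 1.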
Current State-of-the-Art High Throughput Satellite systems provide wide-area connectivity through multi-beam architectures.
Due to the tremendous system throughput requirements that next generation Satellite Communications (SatCom) are expected to achieve, traditional 4-colour frequency reuse schemes are no longer sufficient, and more aggressive solutions such as full frequency reuse are being considered for multi-beam SatCom.
These approaches require advanced interference management techniques to cope with the significantly increased inter-beam interference both at the transmitter, e.g., precoding, and at the receiver, e.g., Multi User Detection (MUD).
With respect to the former, several peculiar challenges arise when designed for SatCom systems.
In particular, multiple users are multiplexed in the same transmission radio frame, which requires multiple channel matrices to be considered when computing the precoding coefficients.
In previous works, the main focus has been on the users' clustering and precoding design.
However, even though these approaches achieve significant throughput gains, no analysis has been performed on the impact of the system scheduling algorithm on multicast precoding, which is typically assumed to be random.
In this paper, we focus on this aspect by showing that, although the overall system performance is improved, a random scheduler does not properly tackle specific scenarios in which the precoding algorithm can poorly perform.
Based on these considerations, we design a Geographical Scheduling Algorithm (GSA) aimed at improving the precoding performance in these critical scenarios and, consequently, the performance at system level as well.
Through extensive numerical simulations, we show that the proposed GSA provides a significant performance improvement with respect to the legacy random scheduling.
A team of robots sharing a common goal can benefit from coordination of the activities of team members, helping the team to reach the goal more reliably or quickly.
We address the problem of coordinating the actions of a team of robots with periodic communication capability executing an information gathering task.
We cast the problem as a multi-agent optimal decision-making problem with an information theoretic objective function.
We show that appropriate techniques for solving decentralized partially observable Markov decision processes (Dec-POMDPs) are applicable in such information gathering problems.
We quantify the usefulness of coordinated information gathering through simulation studies, and demonstrate the feasibility of the method in a real-world target tracking domain.
Style transfer is an important problem in natural language processing (NLP).
However, progress in language style transfer lags behind that in other domains, such as computer vision, mainly because of the lack of parallel data and principled evaluation metrics.
In this paper, we propose to learn style transfer with non-parallel data.
We explore two models to achieve this goal, and the key idea behind the proposed models is to learn separate content representations and style representations using adversarial networks.
We also propose novel evaluation metrics which measure two aspects of style transfer: transfer strength and content preservation.
We assess our models and the evaluation metrics on two tasks: paper-news title transfer, and positive-negative review transfer.
Results show that the proposed content preservation metric is highly correlated with human judgments, and that the proposed models are able to generate sentences with higher style transfer strength and a similar content preservation score compared to an auto-encoder.
Modern cyber security operations collect an enormous amount of logging and alerting data.
While analysts have the ability to query and compute simple statistics and plots from their data, current analytical tools are too simple to admit deep understanding.
To detect advanced and novel attacks, analysts turn to manual investigations.
While commonplace, current investigations are time-consuming, intuition-based, and proving insufficient.
Our hypothesis is that arming the analyst with easy-to-use data science tools will increase their work efficiency, provide them with the ability to resolve hypotheses with scientific inquiry of their data, and support their decisions with evidence over intuition.
To this end, we present our work to build IDEAS (Interactive Data Exploration and Analysis System).
We present three real-world use-cases that drive the system design from the algorithmic capabilities to the user interface.
Finally, a modular and scalable software architecture is discussed along with plans for our pilot deployment with a security operation command.
In the framework of convolutional neural networks that lie at the heart of deep learning, downsampling is often performed with a max-pooling operation that only retains the element with maximum activation, while completely discarding the information contained in other elements in a pooling region.
To address this issue, a novel pooling scheme, Ordinal Pooling Network (OPN), is introduced in this work.
OPN rearranges all the elements of a pooling region in a sequence and assigns different weights to these elements based upon their orders in the sequence, where the weights are learned via the gradient-based optimisation.
The results of our small-scale experiments on image classification task demonstrate that this scheme leads to a consistent improvement in the accuracy over max-pooling operation.
This improvement is expected to increase in deeper networks, where several layers of pooling become necessary.
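The pooling scheme itself can be sketched in a few lines (here with fixed example weights; in OPN the weights are learned via gradient-based optimisation):

```python
import numpy as np

def ordinal_pool(region, weights):
    """Ordinal pooling of one pooling region: sort the activations in
    descending order, then take a weighted sum over the ordered values.
    weights = [1, 0, ..., 0] recovers max-pooling; uniform weights recover
    average pooling."""
    ordered = np.sort(np.ravel(region))[::-1]
    return float(np.dot(ordered, weights))
```

Max-pooling and average pooling are thus the two extreme points of the family that OPN interpolates between.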
Language decoding studies have identified word representations which can be used to predict brain activity in response to novel words and sentences (Anderson et al., 2016; Pereira et al., 2018).
The unspoken assumption of these studies is that, during processing, linguistic information is transformed into some shared semantic space, and those semantic representations are then used for a variety of linguistic and non-linguistic tasks.
We claim that current studies vastly underdetermine the content of these representations, the algorithms which the brain deploys to produce and consume them, and the computational tasks which they are designed to solve.
We illustrate this indeterminacy with an extension of the sentence-decoding experiment of Pereira et al. (2018), showing how standard evaluations fail to distinguish between language processing models which deploy different mechanisms and which are optimized to solve very different tasks.
We conclude by suggesting changes to the brain decoding paradigm which can support stronger claims of neural representation.
Vehicle color information is one of the important elements in ITS (Intelligent Traffic System).
In this paper, we present a vehicle color recognition method using convolutional neural network (CNN).
Naturally, CNN is designed to learn classification based on shape information, but we demonstrate that CNN can also learn classification based on color distribution.
In our method, we convert the input image to two different color spaces, HSV and CIE Lab, and feed it to a CNN architecture.
The training process follows the procedure introduced by Krizhevsky, in which the learning rate is decreased by a factor of 10 after a number of iterations.
To test our method, we use the publicly available vehicle color recognition dataset provided by Chen.
In the results, our model outperforms the original system provided by Chen, with 2% higher overall accuracy.
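A sketch of the color-space conversion step using only the standard library's `colorsys` (the paper presumably uses an image library; CIE Lab, which has no stdlib converter, is omitted here):

```python
import colorsys
import numpy as np

def to_hsv(image_rgb):
    """Convert an RGB image (H x W x 3, floats in [0, 1]) to HSV pixel-wise.
    The CNN input can then stack HSV (and, e.g., CIE Lab) planes instead of RGB."""
    h, w, _ = image_rgb.shape
    out = np.empty_like(image_rgb)
    for i in range(h):
        for j in range(w):
            out[i, j] = colorsys.rgb_to_hsv(*image_rgb[i, j])
    return out
```

A pure-red pixel maps to hue 0 with full saturation and value, so color distribution becomes directly visible to the network in dedicated channels.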
Founsure is an open-source software library, distributed under LGPLv3 license and implements a multi-dimensional graph-based erasure coding entirely based on fast exclusive OR (XOR) logic.
Its implementation utilizes compiler optimizations and multi-threading to generate the right assembly code for the given multi-core CPU architectures with vector processing capabilities.
Founsure (version 1.0) supports a variety of features that shall find interesting applications in modern data storage as well as communication and computer network systems, which are becoming increasingly demanding in terms of network bandwidth, computational resources and average consumed power.
In particular, Founsure library provides a three dimensional design space that consists of computation complexity, coding overhead and data/node repair bandwidth to meet different requirements of modern distributed data storage and processing systems in which the data needs to be protected against device, hardware and node failures.
Unique features of Founsure include encoding, decoding, repairs/rebuilds and updates while the data and computation can be distributed across the network nodes.
The interaction between an artificial agent and its environment is bi-directional.
The agent extracts relevant information from the environment, and affects the environment by its actions in return to accumulate high expected reward.
Standard reinforcement learning (RL) deals with the expected reward maximization.
However, there are always information-theoretic limitations that restrict the expected reward, which are not properly considered by the standard RL.
In this work we consider RL objectives with information-theoretic limitations.
For the first time, we derive a Bellman-type recursive equation for the causal information between the environment and the agent, which combines naturally with the Bellman recursion for the value function.
The unified equation serves to explore the typical behavior of artificial agents over an infinite time horizon.
The force-directed approach is one of the most widely used methods in graph drawing research.
There are two main problems with the traditional force-directed algorithms.
First, there is no mature theory to ensure the convergence of iteration sequence used in the algorithm and further, it is hard to estimate the rate of convergence even if the convergence is satisfied.
Second, the running time cost increases intolerably when drawing large-scale graphs, and therefore the advantages of the force-directed approach are limited in practice.
This paper is focused on these problems and presents a sufficient condition for ensuring the convergence of iterations.
We then develop a practical heuristic algorithm for speeding up the iteration in force-directed approach using a successive over-relaxation (SOR) strategy.
The results of computational tests on several benchmark graph datasets used widely in graph drawing research show that our algorithm can dramatically improve the performance of the force-directed approach, decreasing both the number of iterations and the running time, and is on average 1.5 times faster than the traditional approach.
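For background on the SOR strategy, here is the classical successive over-relaxation sweep for a linear system (a generic sketch; the paper applies the same relaxation idea to the layout iteration rather than to a fixed linear system):

```python
import numpy as np

def sor(A, b, omega=1.5, iters=100):
    """Successive over-relaxation for Ax = b: a Gauss-Seidel sweep whose
    per-component update is scaled by a relaxation factor omega in (0, 2)."""
    x = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        for i in range(len(b)):
            sigma = A[i] @ x - A[i, i] * x[i]          # off-diagonal contribution
            x[i] += omega * ((b[i] - sigma) / A[i, i] - x[i])
    return x
```

With omega = 1 this is plain Gauss-Seidel; a well-chosen omega > 1 accelerates convergence, which is the effect exploited to reduce the iteration count.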
The graph convolutional networks (GCN) recently proposed by Kipf and Welling are an effective graph model for semi-supervised learning.
This model, however, was originally designed to be learned with the presence of both training and test data.
Moreover, the recursive neighborhood expansion across layers poses time and memory challenges for training with large, dense graphs.
To relax the requirement of simultaneous availability of test data, we interpret graph convolutions as integral transforms of embedding functions under probability measures.
Such an interpretation allows for the use of Monte Carlo approaches to consistently estimate the integrals, which in turn leads to a batched training scheme as we propose in this work---FastGCN.
Enhanced with importance sampling, FastGCN not only is efficient for training but also generalizes well for inference.
We show a comprehensive set of experiments to demonstrate its effectiveness compared with GCN and related models.
In particular, training is orders of magnitude more efficient while predictions remain comparably accurate.
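A FastGCN-style Monte Carlo estimate of one graph-convolution layer, sketched here with importance weights proportional to squared column norms (an assumption for illustration, not the paper's exact sampler):

```python
import numpy as np

def sampled_gconv(A_hat, H, W, n_samples, rng):
    """One graph-convolution layer A_hat @ H @ W, with the sum over neighbors
    replaced by a Monte Carlo estimate over sampled nodes, using importance
    weights q proportional to squared column norms of A_hat."""
    q = np.linalg.norm(A_hat, axis=0) ** 2
    q = q / q.sum()
    idx = rng.choice(len(q), size=n_samples, p=q)
    # rescale each sampled column by 1 / (n_samples * q_u) for unbiasedness
    est = (A_hat[:, idx] / (n_samples * q[idx])) @ H[idx]
    return est @ W
```

Because each sampled term is divided by its sampling probability, the estimator is unbiased, and its accuracy improves with the sample count while the cost stays independent of the graph size.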
An open concept of rough evolution and an axiomatic approach to granules were also developed recently by the present author.
Subsequently, she used these concepts in the formal framework of rough Y-systems (RYS) to develop granular correspondences.
These have since been used for a new approach to comparing rough algebraic semantics across different semantic domains by way of correspondences that preserve rough evolution and try to avoid contamination.
In this research paper, new methods are proposed and a semantics for handling possibly contaminated operations and structured bigness is developed.
These would also be of natural interest for the relative consistency of one collection of knowledge relative to another.
This paper proposes several nonlinear control strategies for trajectory tracking of a quadcopter system based on the property of differential flatness.
Its originality is twofold.
Firstly, it provides a flat output for the quadcopter dynamics capable of creating full flat parametrization of the states and inputs.
Moreover, B-splines characterizations of the flat output and their properties allow for optimal trajectory generation subject to way-point constraints.
Secondly, several control strategies based on computed torque control and feedback linearization are presented and compared.
The advantages of flatness within each control strategy are analyzed and detailed through extensive simulation results.
Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of copyable code snippets.
Using those snippets raises maintenance and legal issues.
SO's license (CC BY-SA 3.0) requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license.
While there is a heated debate on SO's license model for code snippets and the required attribution, little is known about the extent to which snippets are copied from SO without proper attribution.
We present results of a large-scale empirical study analyzing the usage and attribution of non-trivial Java code snippets from SO answers in public GitHub (GH) projects.
We followed three different approaches to triangulate an estimate for the ratio of unattributed usages and conducted two online surveys with software developers to complement our results.
For the different sets of projects that we analyzed, the ratio of projects containing files with a reference to SO varied between 3.3% and 11.9%.
We found that at most 1.8% of all analyzed repositories containing code from SO used the code in a way compatible with CC BY-SA 3.0.
Moreover, we estimate that at most a quarter of the copied code snippets from SO are attributed as required.
Of the surveyed developers, almost one half admitted copying code from SO without attribution and about two thirds were not aware of the license of SO code snippets and its implications.
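A hypothetical heuristic of the kind such a study might use to flag attribution, scanning file contents for Stack Overflow links (the study's actual methodology triangulated several approaches; this regex is an illustrative assumption):

```python
import re

# matches links to SO questions/answers, with or without scheme and "www."
SO_LINK = re.compile(
    r"(?:https?://)?(?:www\.)?stackoverflow\.com/(?:questions|q|a|answers)/\d+")

def references_stack_overflow(source: str) -> bool:
    """Heuristic attribution check: does a file contain a link to a
    Stack Overflow question or answer?"""
    return bool(SO_LINK.search(source))
```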
The human visual system relies on both binocular stereo cues and monocular focusness cues to gain effective 3D perception.
In computer vision, the two problems are traditionally solved in separate tracks.
In this paper, we present a unified learning-based technique that simultaneously uses both types of cues for depth inference.
Specifically, we use a pair of focal stacks as input to emulate human perception.
We first construct a comprehensive focal stack training dataset synthesized by depth-guided light field rendering.
We then construct three individual networks: a FocusNet to extract depth from a single focal stack, an EDoFNet to obtain the extended depth of field (EDoF) image from the focal stack, and a StereoNet to conduct stereo matching.
We then integrate them into a unified solution to obtain high quality depth maps.
Comprehensive experiments show that our approach outperforms the state-of-the-art in both accuracy and speed and effectively emulates human vision systems.
A fall is an abnormal activity that occurs rarely; however, failing to identify falls can have serious health and safety implications for an individual.
Due to the rarity of occurrence of falls, there may be insufficient or no training data available for them.
Therefore, standard supervised machine learning methods may not be directly applied to handle this problem.
In this paper, we present a taxonomy for the study of fall detection from the perspective of availability of fall data.
The proposed taxonomy is independent of the type of sensors used and specific feature extraction/selection methods.
The taxonomy identifies different categories of classification methods for the study of fall detection based on the availability of their data during training the classifiers.
Then, we present a comprehensive literature review within those categories and identify the approach of treating a fall as an abnormal activity to be a plausible research direction.
We conclude our paper by discussing several open research problems in the field and pointers for future research.
We present an active detection model for localizing objects in scenes.
The model is class-specific and allows an agent to focus attention on candidate regions for identifying the correct location of a target object.
This agent learns to deform a bounding box using simple transformation actions, with the goal of determining the most specific location of target objects following top-down reasoning.
The proposed localization agent is trained using deep reinforcement learning, and evaluated on the Pascal VOC 2007 dataset.
We show that agents guided by the proposed model are able to localize a single instance of an object after analyzing only between 11 and 25 regions in an image, and obtain the best detection results among systems that do not use object proposals for object localization.
With the recent advancements in Image Processing Techniques and development of new robust computer vision algorithms, new areas of research within Medical Diagnosis and Biomedical Engineering are picking up pace.
This paper provides a comprehensive, in-depth case study of image processing, feature extraction and analysis of apical periodontitis diagnostic cases in IOPA (Intra Oral Peri-Apical) radiographs, a common case in the oral diagnostic pipeline.
It details an analytical approach towards improving the diagnostic procedure, with faster results at higher accuracy, aiming to eliminate True Negative and False Positive cases.
A common model for question answering (QA) is that a good answer is one that is closely related to the question, where relatedness is often determined using general-purpose lexical models such as word embeddings.
We argue that a better approach is to look for answers that are related to the question in a relevant way, according to the information need of the question, which may be determined through task-specific embeddings.
With causality as a use case, we implement this insight in three steps.
First, we generate causal embeddings cost-effectively by bootstrapping cause-effect pairs extracted from free text using a small set of seed patterns.
Second, we train dedicated embeddings over this data, by using task-specific contexts, i.e., the context of a cause is its effect.
Finally, we extend a state-of-the-art reranking approach for QA to incorporate these causal embeddings.
We evaluate the causal embedding models both directly on a causal implication task, and indirectly, in a downstream causal QA task using data from Yahoo! Answers.
We show that explicitly modeling causality improves performance in both tasks.
In the QA task our best model achieves 37.3% P@1, significantly outperforming a strong baseline by 7.7% (relative).
SimRank is a similarity measure between vertices in a graph, which has become a fundamental technique in graph analytics.
Recently, many algorithms have been proposed for efficient evaluation of SimRank similarities.
However, the existing SimRank computation algorithms either overlook uncertainty in graph structures or are based on an unreasonable assumption (Du et al.).
In this paper, we study SimRank similarities on uncertain graphs based on the possible world model of uncertain graphs.
Following the random-walk-based formulation of SimRank on deterministic graphs and the possible worlds model of uncertain graphs, we define random walks on uncertain graphs for the first time and show that our definition of random walks satisfies the Markov property.
We formulate the SimRank measure based on random walks on uncertain graphs.
We discover a critical difference between random walks on uncertain graphs and random walks on deterministic graphs, which makes all existing SimRank computation algorithms on deterministic graphs inapplicable to uncertain graphs.
To efficiently compute SimRank similarities, we propose three algorithms, namely the baseline algorithm with high accuracy, the sampling algorithm with high efficiency, and the two-phase algorithm with comparable efficiency as the sampling algorithm and about an order of magnitude smaller relative error than the sampling algorithm.
The extensive experiments and case studies verify the effectiveness of our SimRank measure and the efficiency of our SimRank computation algorithms.
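For intuition, a sampling-style SimRank estimator on a *deterministic* graph (coupled reverse random walks, Fogaras-Racz style); as noted above, such estimators do not carry over directly to uncertain graphs:

```python
import random

def simrank_mc(in_nbrs, a, b, c=0.6, walks=2000, max_len=10, seed=1):
    """Monte Carlo SimRank: run coupled reverse random walks from a and b;
    a pair first meeting after t steps contributes c**t to the estimate."""
    if a == b:
        return 1.0
    rng = random.Random(seed)
    total = 0.0
    for _ in range(walks):
        u, v = a, b
        for t in range(1, max_len + 1):
            if not in_nbrs[u] or not in_nbrs[v]:
                break  # a walk got stuck: no in-neighbors to follow
            u, v = rng.choice(in_nbrs[u]), rng.choice(in_nbrs[v])
            if u == v:
                total += c ** t
                break
    return total / walks
```

Two nodes sharing a single common in-neighbor meet after one step with certainty, so their estimated similarity is exactly the decay factor c.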
Many real-world problems are composed of several interacting components.
In order to facilitate research on such interactions, the Traveling Thief Problem (TTP) was created in 2013 as the combination of two well-understood combinatorial optimization problems.
With this article, we contribute in four ways.
First, we create a comprehensive dataset that comprises the performance data of 21 TTP algorithms on the full original set of 9720 TTP instances.
Second, we define 55 characteristics for all TTP instances that can be used to select the best algorithm on a per-instance basis.
Third, we use these algorithms and features to construct the first algorithm portfolios for TTP, clearly outperforming the single best algorithm.
Finally, we study which algorithms contribute most to this portfolio.
The widespread usage of surveillance cameras in smart cities has resulted in a gigantic volume of video data whose indexing, retrieval and management is a challenging issue.
Video summarization tends to detect important visual data from the surveillance stream and can help in efficient indexing and retrieval of required data from huge surveillance datasets.
In this research article, we propose an efficient convolutional neural network based summarization method for surveillance videos of resource-constrained devices.
Shot segmentation is considered as a backbone of video summarization methods and it affects the overall quality of the generated summary.
Thus, we propose an effective shot segmentation method using deep features.
Furthermore, our framework maintains the interestingness of the generated summary using image memorability and entropy.
Within each shot, the frame with highest memorability and entropy score is considered as a keyframe.
The proposed method is evaluated on two benchmark video datasets and the results are encouraging compared to state-of-the-art video summarization methods.
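The entropy part of the keyframe criterion can be sketched directly (the image-memorability score, which the paper combines with entropy, is omitted here):

```python
import numpy as np

def frame_entropy(gray):
    """Shannon entropy (bits) of an 8-bit grayscale frame's intensity histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def keyframe(shot_frames):
    """Pick the frame of a shot maximizing entropy (a simplified stand-in
    for the paper's combined memorability-plus-entropy score)."""
    return max(shot_frames, key=frame_entropy)
```

A flat frame has zero entropy, so within a shot the most visually varied frame is selected as the keyframe.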
Learning to infer Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning due to the model uncertainty inherent in the problem.
In this paper, we propose a novel Bayesian model-agnostic meta-learning method.
The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework.
During fast adaptation, the method is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation.
In addition, a robust Bayesian meta-update mechanism with a new meta-loss prevents overfitting during meta-update.
Remaining an efficient gradient-based meta-learner, the method is also model-agnostic and simple to implement.
Experiment results show the accuracy and robustness of the proposed method in various tasks: sinusoidal regression, image classification, active learning, and reinforcement learning.
This paper describes several results of Wimmics, a research lab whose name stands for: web-instrumented man-machine interactions, communities, and semantics.
The approaches introduced here rely on graph-oriented knowledge representation, reasoning and operationalization to model and support actors, actions and interactions in web-based epistemic communities.
The research results are applied to support and foster interactions in online communities and to manage their resources.
In this paper, the approximate sequence for the entropy of some binary hidden Markov models is shown to have two bounding sequences: a lower bound sequence and an upper bound sequence.
The error of the approximate sequence is bounded by a geometric sequence with a common ratio less than 1, which decreases quickly to zero.
This helps in understanding the convergence of the entropy rate of generic hidden Markov models, and it provides a theoretical basis for estimating the entropy rate of some hidden Markov models to any accuracy.
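The geometric error bound described above can be sketched as follows (symbols are illustrative, not taken from the paper):

```latex
\left| H - H_n \right| \le C \, \rho^{\,n}, \qquad 0 < \rho < 1,
```

where $H_n$ denotes the $n$-th term of the approximate sequence, $H$ the entropy rate, and $C$ a constant; since $\rho^n$ decays geometrically to zero, truncating at a large enough $n$ yields the entropy rate to any prescribed accuracy.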
The complexity of the healthcare ecosystem and the trans-disciplinary convergence that is essential for its function make it difficult to address healthcare as one domain.
Data curation and analysis of the information may boost our health related knowledge.
Increasing connectivity and improving infrastructure may help, among other things, to uncover facts and observations which may influence the future of global health.
Constrained model predictive control (MPC) is a widely used control strategy, which employs moving horizon-based on-line optimisation to compute the optimum path of the manipulated variables.
Nonlinear MPC can utilize detailed models but is computationally expensive; linear MPC, on the other hand, may not be adequate.
Piecewise affine (PWA) models can describe the underlying nonlinear dynamics more accurately, therefore they can provide a viable trade-off through their use in multi-model linear MPC configurations, which avoid integer programming.
However, such schemes may introduce uncertainty affecting the closed loop stability.
In this work, we propose an input to output stability analysis for closed loop systems, consisting of PWA models, where an observer and multi-model linear MPC are applied together, under unstructured uncertainty.
Integral quadratic constraints (IQCs) are employed to assess the robustness of MPC under uncertainty.
We create a model pool by performing linearisation at selected transient points.
All the possible uncertainties and nonlinearities (including the controller) can be introduced in the framework, assuming that they admit the appropriate IQCs, whilst the dissipation inequality can provide necessary conditions incorporating IQCs.
We demonstrate the existence of static multipliers, which can reduce the conservatism of the stability analysis significantly.
The proposed methodology is demonstrated through two engineering case studies.
Theoretical analysis of the error landscape of deep neural networks has garnered significant interest in recent years.
In this work, we theoretically study the importance of noise in the trajectories of gradient descent towards optimal solutions in multi-layer neural networks.
We show that adding noise (in different ways) to a neural network while training increases the rank of the product of weight matrices of a multi-layer linear neural network.
We thus study how adding noise can assist in reaching a global optimum when the product matrix is full-rank (under certain conditions).
We establish theoretical connections between the noise injected into the neural network (whether into the gradient, the architecture, or the input/output) and the rank of the product of weight matrices.
We corroborate our theoretical findings with empirical results.
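The rank phenomenon described above can be illustrated numerically; the following is a minimal sketch (with an assumed two-layer linear network, matrix sizes and noise scale invented for the example), not the paper's actual construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two rank-deficient layer weight matrices: their product has rank <= 2.
W1 = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 8))
W2 = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 8))
print(np.linalg.matrix_rank(W2 @ W1))    # 2

# Perturbing the weights with small Gaussian noise makes the product
# full rank (with probability 1 for generic noise).
W1n = W1 + 1e-2 * rng.standard_normal((8, 8))
W2n = W2 + 1e-2 * rng.standard_normal((8, 8))
print(np.linalg.matrix_rank(W2n @ W1n))  # 8
```

The same effect holds whether the noise enters through the gradient, the architecture, or the inputs, which is the connection the paper formalizes.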
Privacy is a major good for users of personalized services such as recommender systems.
When applied to the field of health informatics, privacy concerns of users may be amplified, but the possible utility of such services is also high.
Despite availability of technologies such as k-anonymity, differential privacy, privacy-aware recommendation, and personalized privacy trade-offs, little research has been conducted on the users' willingness to share health data for usage in such systems.
In two conjoint-decision studies (sample size n=521), we investigate the importance and utility of privacy-preserving techniques related to the sharing of personal health data under k-anonymity and differential privacy.
Users were asked to pick a preferred sharing scenario depending on the recipient of the data, the benefit of sharing data, the type of data, and the parameterized privacy.
Users disagreed with sharing data for commercial purposes regarding mental illnesses and with high de-anonymization risks, but showed little concern when data was used for scientific purposes and related to physical illnesses.
Suggestions for health recommender system development are derived from the findings.
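For readers unfamiliar with the techniques named above, a k-anonymity check over quasi-identifiers can be sketched in a few lines; the record fields and k value here are purely illustrative:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True iff every combination of quasi-identifier values is shared by
    at least k records, so no individual is uniquely re-identifiable."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

records = [
    {"age_band": "30-39", "zip3": "100", "diagnosis": "flu"},
    {"age_band": "30-39", "zip3": "100", "diagnosis": "asthma"},
    {"age_band": "40-49", "zip3": "101", "diagnosis": "flu"},
]
print(is_k_anonymous(records, ["age_band", "zip3"], 2))  # False: one group of size 1
```

Anonymizing real health data additionally requires generalizing or suppressing values until the check passes.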
Visual object recognition plays an essential role in human daily life.
This ability is so efficient that we can recognize a face or an object seemingly without effort, though they may vary in position, scale, pose, and illumination.
In the field of computer vision, a large number of studies have been carried out to build a human-like object recognition system.
Recently, deep neural networks have shown impressive progress in object classification performance, and have been reported to surpass humans.
Yet there is still a lack of thorough and fair comparisons between humans and artificial recognition systems.
While some studies consider artificially degraded images, human recognition performance on datasets widely used for deep neural networks has not been fully evaluated.
The present paper carries out an extensive experiment to evaluate human classification accuracy on CIFAR10, a well-known dataset of natural images.
This then allows for a fair comparison with the state-of-the-art deep neural networks.
Our CIFAR10-based evaluations show very efficient object recognition by recent CNNs but, at the same time, show that they are still far from human-level generalization capability.
Moreover, a detailed investigation using multiple levels of difficulty reveals that easy images for humans may not be easy for deep neural networks.
Such images form a subset of CIFAR10 that can be employed to evaluate and improve future neural networks.
In this paper, we propose a mechanism for packet marking called Probabilistic Congestion Notification (PCN).
This scheme makes use of the 1-bit Explicit Congestion Notification (ECN) field in the Internet Protocol (IP) header.
It allows the source to estimate the exact level of congestion at each intermediate queue.
Knowing this, the source can take avoiding action, either by adapting its sending rate or by using alternate routes.
The estimation mechanism makes use of time series analysis both to improve the quality of the congestion estimation and to predict, ahead of time, the congestion level which subsequent packets will encounter.
The proposed protocol is tested in ns-2 simulator using a background of real Internet traffic traces.
Results show that the methods can successfully calculate the congestion at any queue along the path with low error levels.
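The estimation step can be sketched as a smoothing filter over the stream of 1-bit marks; this is an illustrative stand-in for the paper's time-series analysis, with the smoothing constant and traffic parameters chosen arbitrarily:

```python
import random

def ewma_congestion(marks, alpha=0.01):
    """Estimate the marking probability from a stream of 1-bit ECN marks
    with an exponentially weighted moving average; the final value doubles
    as a one-step-ahead forecast under simple exponential smoothing."""
    estimate = 0.0
    for bit in marks:
        estimate = alpha * bit + (1 - alpha) * estimate
    return estimate

# A queue marking roughly 30% of packets.
random.seed(1)
marks = [1 if random.random() < 0.3 else 0 for _ in range(2000)]
estimate = ewma_congestion(marks)
print(round(estimate, 2))  # close to the true marking rate of 0.3
```

A richer time-series model (e.g. autoregressive) would also predict the congestion level that subsequent packets will encounter, as the paper does.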
Digital platforms enable the observation of learning behaviors through fine-grained log traces, offering more detailed clues for analysis.
In addition to previous descriptive and predictive log analysis, this study aims to simultaneously model learner activities, event time spans, and interaction levels using the proposed Hidden Behavior Traits Model (HBTM).
We evaluated model performance, explored its capability to cluster learners on a public dataset, and attempted to interpret the machine-recognized latent behavior patterns.
Quantitative and qualitative results demonstrated the promising value of HBTM.
Results of this study can contribute to the literature of online learner modeling and learning service planning.
We present a framework and its implementation relying on Natural Language Processing methods, which aims at the identification of exercise item candidates from corpora.
The hybrid system combining heuristics and machine learning methods includes a number of relevant selection criteria.
We focus on two fundamental aspects: linguistic complexity and the dependence of the extracted sentences on their original context.
Previous work on exercise generation addressed these two criteria only to a limited extent, and a refined overall candidate sentence selection framework appears also to be lacking.
In addition to a detailed description of the system, we present the results of an empirical evaluation conducted with language teachers and learners which indicate the usefulness of the system for educational purposes.
We have integrated our system into a freely available online learning platform.
Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science.
Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets.
Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has.
The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality of traditional scientific products.
Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation.
The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing.
Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle.
We consider the problem of maximizing the harvested power in Multiple Input Multiple Output (MIMO) Simultaneous Wireless Information and Power Transfer (SWIPT) systems with power splitting reception.
Different from recently proposed designs, our optimization problem formulation targets the jointly optimal transmit precoding and receive uniform power splitting (UPS) ratio maximizing the harvested power, while ensuring that the quality-of-service requirement of the MIMO link is satisfied.
We assume practical Radio-Frequency (RF) energy harvesting (EH) receive operation, which results in a non-convex optimization problem for the design parameters; we first reformulate it as an equivalent generalized convex problem that we then solve optimally.
We also derive the globally optimal transmit precoding design for ideal reception.
Furthermore, we present analytical bounds for the key variables of both considered problems along with tight high signal-to-noise ratio approximations for their optimal solutions.
Two algorithms for the efficient computation of the globally optimal designs are outlined.
The first requires solving a small number of non-linear equations, while the second is based on a two-dimensional search having linear complexity.
Computer simulation results are presented validating the proposed analysis, providing key insights on various system parameters, and investigating the achievable EH gains over benchmark schemes.
We consider the framework of aggregative games, in which the cost function of each agent depends on his own strategy and on the average population strategy.
As a first contribution, we investigate the relations between the concepts of Nash and Wardrop equilibrium.
By exploiting a characterization of the two equilibria as solutions of variational inequalities, we bound their distance with a decreasing function of the population size.
As a second contribution, we propose two decentralized algorithms that converge to such equilibria and are capable of coping with constraints coupling the strategies of different agents.
Finally, we study the applications of charging of electric vehicles and of route choice on a road network.
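A toy aggregative game (not the paper's algorithms) illustrates the setting: each agent's cost depends on its own strategy and the population average, and best-response iteration converges to the equilibrium aggregate. The cost parameters are invented for the example.

```python
# Each agent i minimizes 0.5*(x_i - a_i)**2 + b*x_i*sigma, where sigma is
# the average population strategy and agents treat sigma as fixed
# (Wardrop-style). Best-response iteration converges for |b| < 1.
a = [1.0, 2.0, 3.0, 4.0]   # agents' ideal strategies (invented)
b = 0.5                    # coupling with the aggregate (invented)
sigma = 0.0
for _ in range(100):
    x = [ai - b * sigma for ai in a]   # each agent's best response
    sigma = sum(x) / len(x)
print(round(sigma, 4))  # converges to mean(a)/(1+b) = 1.6667
```

In a Nash equilibrium each agent also accounts for its own influence on the average; the paper bounds the distance between the two equilibria as the population grows.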
Recent years have seen an increasing need for location awareness in mobile applications.
This paper presents a room-level indoor localization approach based on the measured room's echoes in response to a two-millisecond single-tone inaudible chirp emitted by a smartphone's loudspeaker.
Different from other acoustics-based room recognition systems that record full-spectrum audio for up to ten seconds, our approach records audio in a narrow inaudible band for 0.1 seconds only to preserve the user's privacy.
However, the short-time and narrowband audio signal carries limited information about the room's characteristics, presenting challenges to accurate room recognition.
This paper applies deep learning to effectively capture the subtle fingerprints in the rooms' acoustic responses.
Our extensive experiments show that a two-layer convolutional neural network fed with the spectrogram of the inaudible echoes achieves the best performance, compared with alternative designs using other raw data formats and deep models.
Based on this result, we design a RoomRecognize cloud service and its mobile client library that enable mobile application developers to readily implement room recognition functionality without resorting to any existing infrastructure or add-on hardware.
Extensive evaluation shows that RoomRecognize achieves 99.7%, 97.7%, 99%, and 89% accuracy in differentiating 22 and 50 residential/office rooms, 19 spots in a quiet museum, and 15 spots in a crowded museum, respectively.
Compared with the state-of-the-art approaches based on support vector machine, RoomRecognize significantly improves the Pareto frontier of recognition accuracy versus robustness against interfering sounds (e.g., ambient music).
In-memory computing is a promising approach to addressing the processor-memory data transfer bottleneck in computing systems.
We propose Spin-Transfer Torque Compute-in-Memory (STT-CiM), a design for in-memory computing with Spin-Transfer Torque Magnetic RAM (STT-MRAM).
The unique properties of spintronic memory allow multiple wordlines within an array to be simultaneously enabled, opening up the possibility of directly sensing functions of the values stored in multiple rows using a single access.
We propose modifications to STT-MRAM peripheral circuits that leverage this principle to perform logic, arithmetic, and complex vector operations.
We address the challenge of reliable in-memory computing under process variations by extending ECC schemes to detect and correct errors that occur during CiM operations.
We also address the question of how STT-CiM should be integrated within a general-purpose computing system.
To this end, we propose architectural enhancements to processor instruction sets and on-chip buses that enable STT-CiM to be utilized as a scratchpad memory.
Finally, we present data mapping techniques to increase the effectiveness of STT-CiM.
We evaluate STT-CiM using a device-to-architecture modeling framework, and integrate cycle-accurate models of STT-CiM with a commercial processor and on-chip bus (Nios II and Avalon from Intel).
Our system-level evaluation shows that STT-CiM provides system-level performance improvements of 3.93x on average (up to 10.4x), and concurrently reduces memory system energy by 3.83x on average (up to 12.4x).
Nowadays, editors tend to separate different subtopics of a long Wikipedia article into multiple sub-articles.
This separation seeks to improve human readability.
However, it also has a deleterious effect on many Wikipedia-based tasks that rely on the article-as-concept assumption, which requires each entity (or concept) to be described solely by one article.
This underlying assumption significantly simplifies knowledge representation and extraction, and it is vital to many existing technologies such as automated knowledge base construction, cross-lingual knowledge alignment, semantic search and data lineage of Wikipedia entities.
In this paper we provide an approach to match the scattered sub-articles back to their corresponding main-articles, with the intent of facilitating automated Wikipedia curation and processing.
The proposed model adopts a hierarchical learning structure that combines multiple variants of neural document pair encoders with a comprehensive set of explicit features.
A large crowdsourced dataset is created to support the evaluation and feature extraction for the task.
Based on the large dataset, the proposed model achieves promising results of cross-validation and significantly outperforms previous approaches.
Large-scale serving on the entire English Wikipedia also proves the practicability and scalability of the proposed model by effectively extracting a vast collection of newly paired main and sub-articles.
Context: Existing knowledge in agile software development suggests that individual competency (e.g. skills) is a critical success factor for agile projects.
While assuming that technical skills are important for every kind of software development project, many researchers suggest that non-technical individual skills are especially important in agile software development.
Objective: In this paper, we investigate whether non-technical individual skills can predict the use of agile practices.
Method: Through creating a set of multiple linear regression models using a total of 113 participants from agile teams in six software development organizations from The Netherlands and Brazil, we analyzed the predictive power of non-technical individual skills in relation to agile practices.
Results: The results show that there is surprisingly low power in using non-technical individual skills to predict (i.e. explain variance in) the mature use of agile practices in software development.
Conclusions: Therefore, we conclude that looking at non-technical individual skills is not the optimal level of analysis when trying to understand, and explain, the mature use of agile practices in the software development context.
We argue that it is more important to focus on non-technical skills as a team-level capacity, instead of ensuring that all individuals possess such skills, when seeking to understand the use of agile practices.
Scholarly document creation continues to face various obstacles.
Scholarly text production requires more complex word processors than other forms of text because of the complex structures of citations, formulas, and figures.
The need for peer review, often single-blind or double-blind, creates needs for document management that other texts do not require.
Additionally, the need for collaborative editing, security and strict document access rules means that many existing word processors are imperfect solutions for academics.
Nevertheless, most papers continue to be written using Microsoft Word (Sadeghi et al. 2017).
We here analyze some of the problems with existing academic solutions and then argue why we believe that running an open-source academic writing solution such as Fidus Writer on a Network Attached Storage (NAS) server could be a viable alternative.
As the amount of textual data has been rapidly increasing over the past decade, efficient similarity search methods have become a crucial component of large-scale information retrieval systems.
A popular strategy is to represent original data samples by compact binary codes through hashing.
A spectrum of machine learning methods has been utilized, but these methods often lack the expressiveness and flexibility in modeling needed to learn effective representations.
The recent advances of deep learning in a wide range of applications have demonstrated its capability to learn robust and powerful feature representations for complex data.
Especially, deep generative models naturally combine the expressiveness of probabilistic generative models with the high capacity of deep neural networks, which is very suitable for text modeling.
However, little work has leveraged the recent progress in deep learning for text hashing.
In this paper, we propose a series of novel deep document generative models for text hashing.
The first proposed model is unsupervised while the second one is supervised by utilizing document labels/tags for hashing.
The third model further considers document-specific factors that affect the generation of words.
The probabilistic generative formulation of the proposed models provides a principled framework for model extension, uncertainty estimation, simulation, and interpretability.
Based on variational inference and reparameterization, the proposed models can be interpreted as encoder-decoder deep neural networks and thus they are capable of learning complex nonlinear distributed representations of the original documents.
We conduct a comprehensive set of experiments on four public testbeds.
The experimental results have demonstrated the effectiveness of the proposed supervised learning models for text hashing.
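The underlying idea of compact binary codes for similarity search can be illustrated with the classic random-hyperplane (SimHash-style) baseline; this is not the paper's generative model, only a minimal sketch showing how similar vectors map to nearby codes:

```python
import numpy as np

def binary_codes(X, n_bits, seed=0):
    """Hash dense document vectors to n_bits-bit binary codes by taking
    the signs of random projections (SimHash-style baseline)."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    return (X @ planes > 0).astype(np.uint8)

def hamming(a, b):
    return int((a != b).sum())

docs = np.array([[1.0, 0.0, 0.2],
                 [0.9, 0.1, 0.3],     # similar to the first document
                 [-1.0, 0.5, -0.7]])  # dissimilar document
codes = binary_codes(docs, n_bits=16)
print(hamming(codes[0], codes[1]), hamming(codes[0], codes[2]))
```

Similar documents receive a much smaller Hamming distance, which is what makes search over binary codes efficient; the paper's contribution is learning such codes with deep generative models rather than random projections.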
Software ecosystems can be viewed as socio-technical networks consisting of technical components (software packages) and social components (communities of developers) that maintain the technical components.
Ecosystems evolve over time through socio-technical changes that may greatly impact the ecosystem's sustainability.
Social changes like developer turnover may lead to technical degradation.
This motivates the need to identify those factors leading to developer abandonment, in order to automate the process of identifying developers with high abandonment risk.
This paper compares such factors for two software package ecosystems, RubyGems and npm.
We analyse the evolution of their packages hosted on GitHub, considering development activity in terms of commits, and social interaction with other developers in terms of comments associated to commits, issues or pull requests.
We analyse this socio-technical activity for more than 30k and 60k developers for RubyGems and npm respectively.
We use survival analysis to identify which factors coincide with a lower survival probability.
Our results reveal that developers with a higher probability to abandon an ecosystem: do not engage in discussions with other developers; do not have strong social and technical activity intensity; communicate or commit less frequently; and do not participate in both technical and social activities for long periods of time.
Such observations could be used to automate the identification of developers with a high probability of abandoning the ecosystem and, as such, reduce the risks associated to knowledge loss.
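The survival analysis mentioned above rests on estimators such as Kaplan-Meier; a minimal sketch with invented durations (months until abandonment) and censoring flags:

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate from (duration, event-observed)
    pairs; observed=False marks a right-censored developer who was still
    active at the end of the observation window."""
    survival, s = {}, 1.0
    for t in sorted({d for d, e in zip(durations, observed) if e}):
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if e and d == t)
        s *= 1 - events / at_risk
        survival[t] = s
    return survival

# Months of activity for six developers (invented); False = still active.
durations = [3, 5, 5, 8, 10, 12]
observed = [True, True, False, True, False, False]
surv = kaplan_meier(durations, observed)
print(surv)  # survival probability drops at each observed abandonment
```

Comparing such curves across groups (e.g. developers who do or do not engage in discussions) reveals which factors coincide with lower survival probability.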
Surrogate Text Representation (STR) is a profitable solution for efficient similarity search in metric spaces using conventional text search engines, such as Apache Lucene.
This technique is based on comparing the permutations of some reference objects in place of the original metric distance.
However, the Achilles heel of the STR approach is the need to reorder the result set of the search according to the metric distance.
This forces the use of a support database to store the original objects, which requires efficient random I/O on fast secondary memory (such as flash-based storage).
In this paper, we propose to extend the Surrogate Text Representation to specifically address a class of visual metric objects known as Vector of Locally Aggregated Descriptors (VLAD).
This approach is based on representing the individual sub-vectors forming the VLAD vector with the STR, providing a finer representation of the vector and enabling us to get rid of the reordering phase.
The experiments on a publicly available dataset show that the extended STR outperforms the baseline STR, achieving satisfactory performance close to that obtained with the original VLAD vectors.
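The permutation idea behind STR can be sketched as follows: a vector is encoded by the ranks of its nearest reference objects, with nearer references repeated more often so that a text engine's term-frequency scoring approximates permutation similarity. Token names and parameters here are illustrative:

```python
import math

def surrogate_text(vector, references, prefix_len=3):
    """Encode a vector as surrogate text: rank the reference objects by
    distance and emit one token per reference, repeated according to its
    rank (nearer references get more repetitions, hence higher weight)."""
    order = sorted(range(len(references)),
                   key=lambda i: math.dist(vector, references[i]))
    tokens = []
    for rank, ref_idx in enumerate(order[:prefix_len]):
        tokens += [f"ref{ref_idx}"] * (prefix_len - rank)
    return " ".join(tokens)

refs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(surrogate_text((0.9, 0.2), refs))  # "ref1 ref1 ref1 ref3 ref3 ref0"
```

In the paper's extension, each sub-vector of a VLAD descriptor is encoded this way, giving a finer-grained representation that removes the need for a reordering phase.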
Three different algorithms used for eye pupil location were described and tested.
Algorithm efficiency comparison was based on human faces images taken from the BioID database.
Moreover, all the eye localisation methods were implemented in a dedicated application supporting eye-movement-based computer control.
In this case, human face images were acquired by a webcam and processed in real time.
To strengthen the anonymity of Bitcoin, several centralized coin-mixing providers (mixers) such as BitcoinFog.com, BitLaundry.com, and Blockchain.info assist users in mixing Bitcoins through CoinJoin transactions with multiple inputs and multiple outputs to obscure the relationship between them.
However, these mixers know the output address of each user, such that they cannot provide true anonymity.
This paper proposes a centralized coin-mixing algorithm based on an elliptic curve blind signature scheme (denoted as Blind-Mixing) that obstructs mixers from linking an input address with an output address.
Comparisons among three blind signature based algorithms, Blind-Mixing, BlindCoin, and RSA Coin-Mixing, are conducted.
It is determined that BlindCoin may be deanonymized because of its use of a public log.
In RSA Coin-Mixing, a user's Bitcoins may be falsely claimed by another.
In addition, the blind signature scheme of Blind-Mixing executes 10.5 times faster than that of RSA Coin-Mixing.
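To make the blind-signature mechanism concrete, here is a textbook RSA blind signature in miniature; the paper's Blind-Mixing scheme is elliptic-curve based, and the tiny key and plain message here are for illustration only (real use requires full-size keys and padding):

```python
import math
import random

# Tiny textbook RSA keypair (insecure, for illustration only).
p, q = 61, 53
n = p * q                              # modulus 3233
e = 17                                 # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent

random.seed(0)
m = 42                                 # message, e.g. a hash of the output address
r = random.randrange(2, n)             # blinding factor, coprime to n
while math.gcd(r, n) != 1:
    r = random.randrange(2, n)

blinded = (m * pow(r, e, n)) % n       # user blinds the message
blind_sig = pow(blinded, d, n)         # mixer signs without seeing m
sig = (blind_sig * pow(r, -1, n)) % n  # user removes the blinding factor
print(pow(sig, e, n) == m)             # True: valid signature on m
```

Because the mixer only ever sees the blinded value, it cannot link the signed output address to the user who requested the signature.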
In textual information extraction and other sequence labeling tasks it is now common to use recurrent neural networks (such as LSTM) to form rich embedded representations of long-term input co-occurrence patterns.
Representation of output co-occurrence patterns is typically limited to a hand-designed graphical model, such as a linear-chain CRF representing short-term Markov dependencies among successive labels.
This paper presents a method that learns embedded representations of latent output structure in sequence data.
Our model takes the form of a finite-state machine with a large number of latent states per label (a latent variable CRF), where the state-transition matrix is factorized---effectively forming an embedded representation of state-transitions capable of enforcing long-term label dependencies, while supporting exact Viterbi inference over output labels.
We demonstrate accuracy improvements and interpretable latent structure in a synthetic but complex task based on CoNLL named entity recognition.
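The factorized transition matrix described above still admits exact Viterbi decoding, exactly as a dense one does; a minimal sketch with invented dimensions, where the low-rank factors stand in for learned state embeddings:

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Exact max-score decoding; log_trans[i, j] scores moving from state
    i to state j, log_emit[t, j] scores state j at step t."""
    T, S = log_emit.shape
    score = log_init + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans      # S x S candidate scores
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

rng = np.random.default_rng(0)
S, R, T = 6, 2, 5                    # states, factor rank, sequence length
U, V = rng.standard_normal((S, R)), rng.standard_normal((R, S))
log_trans = U @ V                    # factorized transition score matrix
path = viterbi(np.zeros(S), log_trans, rng.standard_normal((T, S)))
print(path)                          # a length-5 state sequence
```

The factorization constrains the many latent states per label to share a low-dimensional transition structure, which is what lets the model capture long-term label dependencies without a full dense transition table.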
In this work, we seek an approach to integrate safety into the learning process that relies on a partly known state-space model of the system and regards the unknown dynamics as an additive bounded disturbance.
We introduce a framework for safely learning a control strategy for a given system with an additive disturbance.
On the basis of the known part of the model, we construct a safe set in which the system can learn safely; within it, the algorithm can choose optimal actions for pursuing the target set as long as the safety-preserving condition is satisfied.
After some learning episodes, the disturbance can be updated based on real-world data.
To this end, Gaussian Process regression is conducted on the collected disturbance samples.
Given the non-stationary nature of real-world dynamics (for example, changes of friction or conductivity with temperature), we expect to obtain a more robust solution to the optimal control problem.
To evaluate the approach described above, we choose an inverted pendulum as a benchmark model.
The proposed algorithm manages to learn a policy that does not violate the pre-specified safety constraints.
Observed performance improved when an exploration setup was incorporated to ensure that an optimal policy is learned everywhere in the safe set.
Finally, we outline some promising directions for future research beyond the scope of this paper.
Widespread adoption of indoor positioning systems based on WiFi fingerprinting is at present hindered by the large efforts required for measurements collection during the offline phase.
Two approaches were recently proposed to address this issue: crowdsourcing, and RSS radiomap prediction based on either interpolation or propagation channel model fitting from a small set of measurements.
RSS prediction promises better positioning accuracy when compared to crowdsourcing but no systematic analysis of the impact of system parameters on positioning accuracy is available.
This paper fills this gap by introducing ViFi, an indoor positioning system that relies on RSS prediction based on Multi-Wall Multi-Floor (MWMF) propagation model to generate a discrete RSS radiomap (virtual fingerprints).
The ViFi system is subjected to an extensive experimental analysis in order to assess the role of all relevant system parameters.
Experimental results obtained in two different testbeds show that the introduction of virtual fingerprints allows the number of measurements to be reduced by a factor of 10, without significant loss in positioning accuracy.
The use of two testbeds also allows us to derive general guidelines for the design and implementation of a virtual fingerprinting system.
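The Multi-Wall family of propagation models at the core of ViFi predicts RSS from a log-distance term plus per-wall attenuations; a minimal single-floor sketch with illustrative parameter values:

```python
import math

def multi_wall_rss(p0, n, d, wall_losses):
    """Predicted RSS (dBm) under a multi-wall path-loss model:
    log-distance attenuation plus a fixed loss per traversed wall.
    p0 is the RSS at the 1 m reference distance, n the path-loss
    exponent; all parameter values here are illustrative."""
    return p0 - 10 * n * math.log10(d) - sum(wall_losses)

# A receiver 12 m from the AP, behind two walls of 5 dB and 7 dB.
print(round(multi_wall_rss(-40.0, 2.8, 12.0, [5.0, 7.0]), 1))  # -82.2
```

Fitting the model parameters from a small set of real measurements and evaluating it on a grid of candidate positions yields the discrete radiomap of virtual fingerprints.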
Various studies have attempted to assess the amount of free full text available on the web, and recent work has suggested that we are close to the 50% mark for freely available articles (Archambault et al. 2013; Bjork et al. 2010; Jamali and Nabavi 2015).
It is natural to wonder if this might reduce researchers' reliance on library subscriptions for access.
To do so, we need to determine not just which papers researchers cite that are free today, but also to estimate whether the papers they were citing were freely available at the time they were cited.
We attempt to do so for a sample of citations made by researchers in the Singapore Management University in the field of Economics.
Recent studies have shown that sketches and diagrams play an important role in the daily work of software developers.
If these visual artifacts are archived, they are often detached from the source code they document, because there is no adequate tool support to assist developers in capturing, archiving, and retrieving sketches related to certain source code artifacts.
This paper presents SketchLink, a tool that aims at increasing the value of sketches and diagrams created during software development by supporting developers in these tasks.
Our prototype implementation provides a web application that employs the camera of smartphones and tablets to capture analog sketches, but can also be used on desktop computers to upload, for instance, computer-generated diagrams.
We also implemented a plugin for a Java IDE that embeds the links in Javadoc comments and visualizes them in situ in the source code editor as graphical icons.
Autism spectrum condition (ASC) or autism spectrum disorder (ASD) is primarily identified with the help of behavioral indications encompassing social, sensory and motor characteristics.
Although categorized, recurring motor actions are measured during diagnosis, quantifiable measures that ascertain kinematic characteristics in the movement patterns of autistic persons have not been adequately studied, hindering advances in understanding the etiology of motor impairment.
Subject aspects such as behavioral characteristics that influence ASD need further exploration.
Presently, limited autism datasets associated with ASD screening are available, and a majority of them are genetic.
Hence, in this study, we used a dataset related to autism screening comprising ten behavioral and ten personal attributes that have been effective in distinguishing ASD cases from controls in behavioral science.
ASD diagnosis is time-consuming and uneconomical.
The burgeoning number of ASD cases worldwide mandates a fast and economical screening tool.
Our study aimed to implement an artificial neural network with the Levenberg-Marquardt algorithm to detect ASD and examine its predictive accuracy, and subsequently to develop a clinical decision support system for early ASD identification.
Information security is a critical issue in modern society and image watermarking can effectively prevent unauthorized information access.
Optical image watermarking techniques generally have advantages of parallel high-speed processing and multi-dimensional capabilities compared with digital approaches.
This paper provides a comprehensive review on the research works related to optical image hiding and watermarking techniques conducted in the past decade.
The past research works are focused on two major aspects: various optical systems for image hiding, and methods for embedding the optical system output into a host image.
A summary of the state-of-the-art works is made from these two perspectives.
Data labeling is a necessary but often slow process that impedes the development of interactive systems for modern data analysis.
Despite rising demand for manual data labeling, there is a surprising lack of work addressing its high and unpredictable latency.
In this paper, we introduce CLAMShell, a system that speeds up crowds in order to achieve consistently low-latency data labeling.
We offer a taxonomy of the sources of labeling latency and study several large crowd-sourced labeling deployments to understand their empirical latency profiles.
Driven by these insights, we comprehensively tackle each source of latency, both by developing novel techniques such as straggler mitigation and pool maintenance and by optimizing existing methods such as crowd retainer pools and active learning.
We evaluate CLAMShell in simulation and on live workers on Amazon's Mechanical Turk, demonstrating that our techniques can provide an order of magnitude speedup and variance reduction over existing crowdsourced labeling strategies.
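The straggler-mitigation idea can be illustrated with a minimal simulation; the worker pool, latency distributions, and function names below are hypothetical stand-ins, not CLAMShell's actual implementation:

```python
import random

def label_with_replication(task, workers, k, rng):
    """Straggler mitigation sketch: send the same labeling task to k workers
    and accept the first label to arrive, hiding slow workers."""
    chosen = rng.sample(workers, k)
    # Each worker is (name, latency_fn); simulate response arrival times.
    arrivals = [(latency_fn(rng), name) for name, latency_fn in chosen]
    finish_time, winner = min(arrivals)
    return winner, finish_time

rng = random.Random(0)
# Hypothetical pool: two fast workers and one straggler (latencies in seconds).
workers = [
    ("fast-1", lambda r: r.uniform(1, 3)),
    ("fast-2", lambda r: r.uniform(1, 3)),
    ("straggler", lambda r: r.uniform(20, 60)),
]
winner, t = label_with_replication("label: cat or dog?", workers, k=3, rng=rng)
```

Replicating each task bounds its latency by the fastest replica, at the cost of redundant labels.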
Over the years, software architecture has become an established discipline, both in academia and industry, and the interest in software architecture documentation has increased.
In this context, the improvement of methods, tools, and techniques around architecture documentation is of paramount importance.
We conducted a survey with 147 industrial participants (31 from Brazil), analyzing their current problems and future wishes.
We identified that Brazilian stakeholders need updated architecture documents with the right information.
Finally, the automation of some parts of the documentation will reduce the effort during the creation of the documents.
But first, it is necessary to change the culture of the stakeholders: they have to participate actively in the creation of architecture documents.
Videos represent the primary source of information for surveillance applications and are available in large amounts but in most cases contain little or no annotation for supervised learning.
This article reviews the state-of-the-art deep learning based methods for video anomaly detection and categorizes them based on the type of model and criteria of detection.
We also perform simple studies to understand the different approaches and provide the criteria of evaluation for spatio-temporal anomaly detection.
Understanding causal explanations - reasons given for happenings in one's life - has been found to be an important psychological factor linked to physical and mental health.
Causal explanations are often studied through manual identification of phrases over limited samples of personal writing.
Automatic identification of causal explanations in social media, while challenging in relying on contextual and sequential cues, offers a larger-scale alternative to expensive manual ratings and opens the door for new applications (e.g. studying prevailing beliefs about causes, such as climate change).
Here, we explore automating causal explanation analysis, building on discourse parsing, and presenting two novel subtasks: causality detection (determining whether a causal explanation exists at all) and causal explanation identification (identifying the specific phrase that is the explanation).
We achieve strong accuracies for both tasks but find different approaches best: an SVM for causality prediction (F1 = 0.791) and a hierarchy of Bidirectional LSTMs for causal explanation identification (F1 = 0.853).
Finally, we explore applications of our complete pipeline (F1 = 0.868), showing demographic differences in mentions of causal explanation and that the association between a word and sentiment can change when it is used within a causal explanation.
Traditional approaches for color propagation in videos rely on some form of matching between consecutive video frames.
Using appearance descriptors, colors are then propagated both spatially and temporally.
These methods, however, are computationally expensive and do not take advantage of semantic information of the scene.
In this work we propose a deep learning framework for color propagation that combines a local strategy, to propagate colors frame-by-frame ensuring temporal stability, and a global strategy, using semantics for color propagation within a longer range.
Our evaluation shows the superiority of our strategy over existing video and image color propagation methods as well as neural photo-realistic style transfer approaches.
This paper presents an intelligent traffic monitoring system using a wireless vision sensor network that captures and processes real-time video images to obtain the traffic flow rate and vehicle speeds along different urban roadways.
This system will display the traffic states of the roadways ahead, guiding drivers to select the right route and avoid potential traffic congestion.
On the other hand, it will also monitor the vehicle speeds and store the vehicle details, for those breaking the roadway speed limits, in its database.
The real-time traffic data is processed by a Personal Computer (PC) at the sub roadway station, and the traffic flow rate data is transmitted via 3G email to the Arduino at the main roadway station, where the data is extracted and the traffic flow rate displayed.
Emergency events involving fire are potentially harmful, demanding a fast and precise decision making.
The use of crowdsourced images and videos in crisis management systems can aid in these situations by providing more information than verbal or textual descriptions.
Due to the usual high volume of data, automatic solutions need to discard non-relevant content without losing relevant information.
There are several methods for fire detection on video using color-based models.
However, they are not adequate for still image processing, because they can suffer from high false-positive rates.
These methods also suffer from parameters with little physical meaning, which makes fine tuning a difficult task.
In this context, we propose a novel fire detection method for still images that uses classification based on color features combined with texture classification on superpixel regions.
Our method uses a reduced number of parameters if compared to previous works, easing the process of fine tuning the method.
Results show the effectiveness of our method in reducing false positives while keeping precision comparable with state-of-the-art methods.
Cooperative ITS is enabling vehicles to communicate with the infrastructure to provide improvements in traffic control.
A promising approach consists in anticipating the road profile and the upcoming dynamic events like traffic lights.
This topic has been addressed in the French public project Co-Drive through functions developed by Valeo named Green Light Optimal Speed Advisor (GLOSA).
The system advises the optimal speed to pass the next traffic light without stopping.
This paper presents results of its performance in different scenarios through simulations and real driving measurements.
A scaling study is performed in an urban area, with different penetration rates of vehicle and infrastructure equipment for vehicular communication.
Our simulation results indicate that GLOSA can reduce CO2 emissions, waiting time and travel time, both in experimental conditions and in real traffic conditions.
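A GLOSA-style speed advisory can be sketched as follows; the green-window interface and the speed bounds are illustrative assumptions, not the Valeo implementation:

```python
def glosa_advice(dist_m, green_start_s, green_end_s, v_min=5.0, v_max=13.9):
    """Sketch of a GLOSA advisor: pick a speed (m/s) within legal bounds so
    the vehicle arrives at the light during the next green window."""
    # Arriving exactly when the light turns green requires dist/green_start m/s.
    v_for_green_start = dist_m / green_start_s if green_start_s > 0 else float("inf")
    v_for_green_end = dist_m / green_end_s
    # Any speed in [v_for_green_end, v_for_green_start] hits the window.
    lo = max(v_for_green_end, v_min)
    hi = min(v_for_green_start, v_max)
    if lo > hi:
        return None  # cannot pass without stopping at allowed speeds
    return hi  # prefer the fastest feasible speed

# 200 m to the light; green from t=20 s to t=40 s.
v = glosa_advice(dist_m=200, green_start_s=20, green_end_s=40)
```

Here any speed between 5 and 10 m/s reaches the light while it is green, and the advisor returns the fastest feasible speed.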
We introduce segmental recurrent neural networks (SRNNs) which define, given an input sequence, a joint probability distribution over segmentations of the input and labelings of the segments.
Representations of the input segments (i.e., contiguous subsequences of the input) are computed by encoding their constituent tokens using bidirectional recurrent neural nets, and these "segment embeddings" are used to define compatibility scores with output labels.
These local compatibility scores are integrated using a global semi-Markov conditional random field.
Both fully supervised training -- in which segment boundaries and labels are observed -- as well as partially supervised training -- in which segment boundaries are latent -- are straightforward.
Experiments on handwriting recognition and joint Chinese word segmentation/POS tagging show that, compared to models that do not explicitly represent segments such as BIO tagging schemes and connectionist temporal classification (CTC), SRNNs obtain substantially higher accuracies.
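The decoding over joint segmentations and labelings can be sketched with a Viterbi-style semi-Markov dynamic program; the toy label set and scorer below are illustrative (SRNNs would score segments with bidirectional-RNN embeddings and normalize globally with a CRF):

```python
def best_segmentation(n, max_len, seg_score):
    """Semi-Markov DP sketch: seg_score(i, j, label) scores the segment
    covering tokens i..j-1 with `label`; we maximize the total score over
    all segmentations and labelings of the input."""
    labels = ["B", "M", "E"]  # hypothetical label set
    best = [float("-inf")] * (n + 1)
    best[0] = 0.0
    back = [None] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            for lab in labels:
                s = best[i] + seg_score(i, j, lab)
                if s > best[j]:
                    best[j], back[j] = s, (i, lab)
    # Recover the argmax segmentation by walking back-pointers.
    segs, j = [], n
    while j > 0:
        i, lab = back[j]
        segs.append((i, j, lab))
        j = i
    return best[n], segs[::-1]

# Toy scorer: score = segment length, plus a bonus for length-2 "M" segments.
score, segs = best_segmentation(
    4, 3, lambda i, j, lab: (j - i) + (1.0 if lab == "M" and j - i == 2 else 0.0))
```

The DP runs in O(n · max_len · |labels|) scorer calls, which is why SRNNs can afford explicit segment representations.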
Deep neural networks have shown incredible performance for inference tasks in a variety of domains.
Unfortunately, most current deep networks are enormous cloud-based structures that require significant storage space, which limits scaling of deep learning as a service (DLaaS) and use for on-device augmented intelligence.
This paper is concerned with finding universal lossless compressed representations of deep feedforward networks with synaptic weights drawn from discrete sets, and directly performing inference without full decompression.
The basic insight that allows less rate than naive approaches is the recognition that the bipartite graph layers of feedforward networks have a kind of permutation invariance to the labeling of nodes, in terms of inferential operation.
We provide efficient algorithms to dissipate this irrelevant uncertainty and then use arithmetic coding to nearly achieve the entropy bound in a universal manner.
We also provide experimental results of our approach on the MNIST dataset.
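The permutation-invariance observation can be made concrete: reordering hidden units, together with the matching columns of the next layer, leaves the network function unchanged, so a compressor may fix a canonical order instead of encoding it. The list-based layout below is a simplification of the paper's scheme:

```python
import math

def canonicalize_layer(w_in, w_out):
    """Sort hidden units (rows of w_in) into a canonical order and permute
    the matching columns of w_out; the realized function is unchanged."""
    order = sorted(range(len(w_in)), key=lambda i: w_in[i])
    w_in_c = [w_in[i] for i in order]
    w_out_c = [[row[i] for i in order] for row in w_out]
    return w_in_c, w_out_c

def permutation_savings_bits(n):
    # log2(n!) bits: the rate saved by not encoding the hidden-unit order.
    return math.lgamma(n + 1) / math.log(2)

# Two hidden units; sorting swaps them and the matching output columns.
w_in_c, w_out_c = canonicalize_layer([[2, 0], [1, 1]], [[3, 4]])
```

For a layer of n hidden units this removes up to log2(n!) bits of labeling redundancy, which grows as n log n.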
Recent studies show the increasing popularity of distributed cloud applications, which are composed of multiple microservices.
Besides their known benefits, microservice architectures also enable mixing and matching cloud applications and Network Function Virtualization (NFV) services (service chains), which are composed of Virtual Network Functions (VNFs).
Provisioning complex services containing VNFs and microservices in a combined NFV/cloud platform can enhance service quality and optimise cost.
Such a platform can be based on the multi-cloud concept.
However, current multi-cloud solutions do not support NFV requirements, making them inadequate to support complex services.
In this paper, we investigate these challenges and propose a solution for jointly managing and orchestrating microservices and virtual network functions.
In this paper, we address the issue of how to enhance the generalization performance of convolutional neural networks (CNN) in the early learning stage for image classification.
This is motivated by real-time applications that require the generalization performance of CNN to be satisfactory within limited training time.
In order to achieve this, a novel hierarchical transfer CNN framework is proposed.
It consists of a group of shallow CNNs and a cloud CNN, where the shallow CNNs are trained first and the first layers of the trained shallow CNNs are then used to initialize the first layer of the cloud CNN.
This method will boost the generalization performance of the cloud CNN significantly, especially during the early stage of training.
Experiments using CIFAR-10 and ImageNet datasets are performed to examine the proposed method.
Results demonstrate that during the early stage of learning the testing-accuracy improvement is 12% on average, and as much as 20%, for CIFAR-10, and 5% for ImageNet.
It is also shown that universal improvements of testing accuracy are obtained across different settings of dropout and number of shallow CNNs.
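The transfer step can be sketched as follows; the filter-bank layout and the simple concatenation rule are assumptions for illustration, and the paper's exact initialization mapping may differ:

```python
def init_cloud_first_layer(shallow_first_layers):
    """Transfer sketch: initialize the cloud CNN's first-layer filter bank by
    pooling the first-layer filters learned by the trained shallow CNNs
    (here, simple concatenation of the banks)."""
    cloud_filters = []
    for filters in shallow_first_layers:
        cloud_filters.extend(filters)
    return cloud_filters

# Hypothetical 2x2 filters from two trained shallow CNNs.
bank_a = [[[0.1, 0.2], [0.3, 0.4]], [[0.0, -0.1], [-0.2, 0.3]]]
bank_b = [[[0.5, 0.5], [0.5, 0.5]]]
cloud = init_cloud_first_layer([bank_a, bank_b])
```

Starting the cloud CNN from already-useful low-level filters is what speeds up its early-stage generalization.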
In this paper we study Z2Z4Z8-additive codes, which extend the recently introduced Z2Z4-additive codes.
We determine the standard forms of the generator and parity-check matrices of Z2Z4Z8-additive codes.
Moreover, we investigate Z2Z4Z8-cyclic codes giving their generator polynomials and spanning sets.
We also give some illustrative examples of both Z2Z4Z8-additive codes and Z2Z4Z8-cyclic codes.
We study two aspects of noisy computations during inference.
The first aspect is how to mitigate their side effects for naturally trained deep learning systems.
One of the motivations for looking into this problem is to reduce the high power cost of conventional computing of neural networks through the use of analog neuromorphic circuits.
Traditional GPU/CPU-centered deep learning architectures exhibit bottlenecks in power-restricted applications (e.g., embedded systems).
The use of specialized neuromorphic circuits, where analog signals passed through memory-cell arrays are sensed to accomplish matrix-vector multiplications, promises large power savings and speed gains but brings with it the problems of limited precision of computations and unavoidable analog noise.
We manage to improve inference accuracy from 21.1% to 99.5% for MNIST images, from 29.9% to 89.1% for CIFAR10, and from 15.5% to 89.6% for MNIST stroke sequences with the presence of strong noise (with signal-to-noise power ratio being 0 dB) by noise-injected training and a voting method.
This observation promises neural networks that are insensitive to inference noise, which reduces the quality requirements on neuromorphic circuits and is crucial for their practical usage.
The second aspect is how to utilize the noisy inference as a defensive architecture against black-box adversarial attacks.
During inference, by injecting proper noise to signals in the neural networks, the robustness of adversarially-trained neural networks against black-box attacks has been further enhanced by 0.5% and 1.13% for two adversarially trained models for MNIST and CIFAR10, respectively.
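The voting method can be sketched as a majority vote over repeated noisy forward passes; the toy noisy classifier below is a stand-in assumption for a network evaluated on noisy analog hardware:

```python
import random

def vote_predict(noisy_forward, x, k, rng):
    """Voting sketch: run k independent noisy inferences of the same input
    and return the majority class, suppressing analog computation noise."""
    votes = {}
    for _ in range(k):
        c = noisy_forward(x, rng)
        votes[c] = votes.get(c, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical noisy classifier: returns the true class 1 with probability 0.9.
def noisy_forward(x, rng):
    return 1 if rng.random() < 0.9 else 0

pred = vote_predict(noisy_forward, x=None, k=101, rng=random.Random(42))
```

Even a per-pass error rate of 10% is driven to a negligible majority-vote error with enough repetitions, at the cost of k forward passes.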
Network coding can significantly improve the transmission rate of communication networks with packet loss compared with routing.
However, using network coding usually incurs high computational and storage costs in the network devices and terminals.
For example, some network coding schemes require the computational and/or storage capacities of an intermediate network node to increase linearly with the number of packets for transmission, making such schemes difficult to be implemented in a router-like device that has only constant computational and storage capacities.
In this paper, we introduce BATched Sparse code (BATS code), which enables a digital fountain approach to resolve the above issue.
BATS code is a coding scheme that consists of an outer code and an inner code.
The outer code is a matrix generalization of a fountain code.
It works with the inner code that comprises random linear coding at the intermediate network nodes.
BATS codes preserve such desirable properties of fountain codes as ratelessness and low encoding/decoding complexity.
The computational and storage capacities of the intermediate network nodes required for applying BATS codes are independent of the number of packets for transmission.
Almost capacity-achieving BATS code schemes are devised for unicast networks, two-way relay networks, tree networks, a class of three-layer networks, and the butterfly network.
For general networks, under different optimization criteria, guaranteed decoding rates for the receiving nodes can be obtained.
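A toy version of the batched structure can be sketched over GF(2), with packets as bit strings (ints) and XOR as addition; real BATS codes use a fountain-code degree distribution and typically a larger field:

```python
import random

def make_batch(packets, batch_size, rng):
    """Outer-code sketch over GF(2): each coded packet in a batch is the XOR
    of a random subset of the source packets."""
    batch = []
    for _ in range(batch_size):
        coeffs = [rng.randint(0, 1) for _ in packets]
        coded = 0
        for c, p in zip(coeffs, packets):
            if c:
                coded ^= p
        batch.append((coeffs, coded))
    return batch

def recode(batch, out_size, rng):
    """Inner-code sketch: an intermediate node forwards random linear
    combinations of one batch only, so its buffer holds a single batch,
    independent of the total number of source packets."""
    out = []
    for _ in range(out_size):
        mix = [rng.randint(0, 1) for _ in batch]
        coeffs = [0] * len(batch[0][0])
        coded = 0
        for m, (cf, pkt) in zip(mix, batch):
            if m:
                coded ^= pkt
                coeffs = [a ^ b for a, b in zip(coeffs, cf)]
        out.append((coeffs, coded))
    return out

rng = random.Random(1)
batch = make_batch([0b1010, 0b0110, 0b1111], batch_size=4, rng=rng)
relayed = recode(batch, out_size=4, rng=rng)
```

Every forwarded packet remains a valid linear combination of the source packets, with its coefficient vector carried along for decoding.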
Case Law has a significant impact on the proceedings of legal cases.
Therefore, the information that can be obtained from previous court cases is valuable to lawyers and other legal officials when performing their duties.
This paper describes a methodology of applying discourse relations between sentences when processing text documents related to the legal domain.
In this study, we developed a mechanism to classify the relationships that can be observed among sentences in transcripts of United States court cases.
First, we defined relationship types that can be observed between sentences in court case transcripts.
Then we classified pairs of sentences according to the relationship type by combining a machine learning model and a rule-based approach.
The results obtained through our system were evaluated using human judges.
To the best of our knowledge, this is the first study where discourse relationships between sentences have been used to determine relationships among sentences in legal court case transcripts.
We analyze a large-scale mobile phone call dataset with the metadata of the mobile phone users, including age, gender, and billing locality, to uncover the nature of relationships between peers or individuals of similar ages.
We show that in addition to the age and gender of users, the information about the ranks of users to each other in their egocentric networks is crucial in characterizing intimate and casual relationships of peers.
The opposite-gender pairs in intimate relationships are found to show the highest levels of call frequency and daily regularity, consistent with small-scale studies on romantic partners.
This is followed by the same-gender pairs in intimate relationships, while the lowest call frequency and daily regularity are observed for the pairs in casual relationships.
We also find that older pairs tend to call less frequently and less regularly than younger pairs, while the average call durations exhibit a more complex dependence on age.
We expect that a more detailed analysis can help us better characterize the nature of peer relationships and distinguish various types of relations, such as siblings, friends, and romantic partners, more clearly.
Real-world applications could benefit from the ability to automatically generate a fine-grained ranking of photo aesthetics.
However, previous methods for image aesthetics analysis have primarily focused on the coarse, binary categorization of images into high- or low-aesthetic categories.
In this work, we propose to learn a deep convolutional neural network to rank photo aesthetics in which the relative ranking of photo aesthetics are directly modeled in the loss function.
Our model incorporates joint learning of meaningful photographic attributes and image content information which can help regularize the complicated photo aesthetics rating problem.
To train and analyze this model, we have assembled a new aesthetics and attributes database (AADB) which contains aesthetic scores and meaningful attributes assigned to each image by multiple human raters.
Anonymized rater identities are recorded across images allowing us to exploit intra-rater consistency using a novel sampling strategy when computing the ranking loss of training image pairs.
We show the proposed sampling strategy is very effective and robust in the face of the subjective judgement of image aesthetics by individuals with different aesthetic tastes.
Experiments demonstrate that our unified model can generate aesthetic rankings that are more consistent with human ratings.
To further validate our model, we show that by simply thresholding the estimated aesthetic scores, we are able to achieve state-of-the-art classification performance on the existing AVA dataset benchmark.
Smile and Learn is an Ed-Tech company that runs a smart library with more than 100 applications, games and interactive stories, aimed at children aged 2 to 10 and their families.
Given the complexity of navigating all the content, the library implements a recommender system.
The purpose of this paper is to evaluate two aspects of such system: the influence of the order of recommendations on user exploratory behavior, and the impact of the choice of the recommendation algorithm on engagement.
The assessment, based on data collected between 2018/10/15 and 2018/12/01, required the analysis of the number of clicks performed on the recommendations depending on their ordering, and an A/B/C test in which two recommender algorithms were compared with a random recommendation that served as a baseline.
The results suggest a direct connection between the order of the recommendation and the interest raised, and the superiority of recommendations based on popularity against other alternatives.
In recent years, persuasive interventions for inducing sustainable urban mobility behaviours have become a very active research field.
This review paper systematically analyses existing approaches and prototype systems and describes and classifies the persuasive strategies used for changing behaviour in the domain of transport.
It also studies the results and recommendations derived from pilot studies, and as a result of this analysis highlights the need for personalizing and tailoring persuasive technology to various user characteristics.
We also discuss the possible role of context-aware persuasive systems for increasing the number of sustainable choices.
Finally, recommendations for future scholarly investigations on persuasive systems are proposed.
We study pure-strategy Nash equilibria in multi-player concurrent deterministic games, for a variety of preference relations.
We provide a novel construction, called the suspect game, which transforms a multi-player concurrent game into a two-player turn-based game which turns Nash equilibria into winning strategies (for some objective that depends on the preference relations of the players in the original game).
We use that transformation to design algorithms for computing Nash equilibria in finite games, which in most cases have optimal worst-case complexity, for large classes of preference relations.
This includes the purely qualitative framework, where each player has a single omega-regular objective that she wants to satisfy, but also the larger class of semi-quantitative objectives, where each player has several omega-regular objectives equipped with a preorder (for instance, a player may want to satisfy all her objectives, or to maximise the number of objectives that she achieves).
The overwhelming amount and rate of information update in online social media is making it increasingly difficult for users to allocate their attention to their topics of interest, thus there is a strong need for prioritizing news feeds.
The attractiveness of a post to a user depends on many complex contextual and temporal features of the post.
For instance, the contents of the post, the responsiveness of a third user, and the age of the post may all have impact.
So far, these static and dynamic features have not been incorporated in a unified framework to tackle the post prioritization problem.
In this paper, we propose a novel approach for prioritizing posts based on a feature modulated multi-dimensional point process.
Our model is able to simultaneously capture textual and sentiment features, and temporal features such as self-excitation, mutual-excitation and bursty nature of social interaction.
As an evaluation, we also curated a real-world conversational benchmark dataset crawled from Facebook.
In our experiments, we demonstrate that our algorithm is able to achieve state-of-the-art performance in terms of analyzing, predicting, and prioritizing events.
In terms of interpretability of our method, we observe that features indicating individual user profile and linguistic characteristics of the events work best for prediction and prioritization of new events.
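The self- and mutual-excitation components can be sketched with a multi-dimensional Hawkes-style intensity; the exponential kernel and parameter names below are illustrative, and the feature modulation described above is omitted:

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Sketch of self/mutual excitation: the intensity of dimension d is a
    base rate mu[d] plus exponentially decaying kicks from past events in
    all dimensions; alpha[d][e] is the excitation of d by events of type e."""
    lam = list(mu)
    for t_i, e in events:
        if t_i < t:
            decay = math.exp(-beta * (t - t_i))
            for d in range(len(mu)):
                lam[d] += alpha[d][e] * decay
    return lam

# Two dimensions; one past event of type 0 at t=0, evaluated at t=1.
lam = hawkes_intensity(1.0, [(0.0, 0)], mu=[0.1, 0.1],
                       alpha=[[0.5, 0.0], [0.2, 0.0]], beta=1.0)
```

A post would then be prioritized by the intensity of its dimension, so recent bursty activity raises its rank while the effect decays over time.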
NoSQL data storage systems have become very popular due to their scalability and ease of use.
This paper examines the maturity of security measures for NoSQL databases, addressing their new query and access mechanisms.
For example, the emergence of new query formats makes the old SQL injection techniques irrelevant; but are NoSQL databases immune to injection in general?
The answer is NO.
Here we present a few techniques for attacking NoSQL databases such as injections and CSRF.
We analyze the source of these vulnerabilities and present methodologies to mitigate the attacks.
We show that this new vibrant technological area lacks the security measures and awareness which have developed over the years in traditional RDBMS SQL systems.
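The operator-injection class of attack, and a simple mitigation, can be sketched against a MongoDB-like query evaluator; the evaluator and names here are a self-contained toy, not a real driver:

```python
def matches(doc, query):
    """Tiny evaluator for a MongoDB-like query subset: equality and $gt."""
    for field, cond in query.items():
        if isinstance(cond, dict) and "$gt" in cond:
            if not doc.get(field, "") > cond["$gt"]:
                return False
        elif doc.get(field) != cond:
            return False
    return True

def find_user(collection, username, password):
    """Vulnerable pattern: request parameters are pasted into the query
    object, so a crafted {"$gt": ""} matches every password."""
    query = {"username": username, "password": password}
    return [doc for doc in collection if matches(doc, query)]

def sanitize(value):
    """Mitigation sketch: reject non-scalar parameters so operator objects
    never reach the query layer."""
    if not isinstance(value, (str, int, float, bool)):
        raise ValueError("scalar expected")
    return value

users = [{"username": "alice", "password": "s3cret"}]
# Attacker sends password={"$gt": ""} instead of a string:
leaked = find_user(users, "alice", {"$gt": ""})
```

Because `{"$gt": ""}` compares as greater-than against the empty string, the login check matches without knowing the password; type-checking the inputs closes this hole.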
Time series forecasting gets much attention due to its impact on many practical applications.
Higher-order neural networks with recurrent feedback are a powerful technique that has been used successfully for forecasting.
They maintain fast learning and the ability to learn the dynamics of the series over time.
Therefore, in this paper, we propose a novel model, called the Ridge Polynomial Neural Network with Error-Output Feedbacks (RPNN-EOFs), that combines higher-order terms with error-output feedbacks.
The well-known Mackey-Glass time series is used to test the forecasting capability of the RPNN-EOFs.
Simulation results showed that the proposed RPNN-EOFs provide a better fit for the Mackey-Glass time series, with a root mean square error of 0.00416.
This result is smaller than that of other models in the literature.
Therefore, we can conclude that the RPNN-EOFs can be applied successfully for time series forecasting.
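For reference, the Mackey-Glass benchmark series can be generated with a simple Euler discretization of the delay differential equation dx/dt = beta*x(t-tau)/(1 + x(t-tau)^n) - gamma*x(t); the step size and initial history below are common choices, not necessarily those used above, and the persistence baseline is only an illustrative comparison:

```python
import math

def mackey_glass(n_steps, tau=17, beta=0.2, gamma=0.1, power=10, dt=1.0, x0=1.2):
    """Euler-discretized Mackey-Glass series with the common tau=17 setting,
    starting from a constant history x(t)=x0 for t<=0."""
    x = [x0] * (tau + 1)
    for _ in range(n_steps):
        x_tau = x[-tau - 1]
        x.append(x[-1] + dt * (beta * x_tau / (1 + x_tau ** power) - gamma * x[-1]))
    return x[tau + 1:]

def rmse(pred, target):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

series = mackey_glass(500)
# A naive one-step "persistence" forecast as a baseline for comparison.
baseline = rmse(series[:-1], series[1:])
```

Any forecaster's RMSE on this series can be judged against such a trivial baseline.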
We report results from a preliminary study exploring the memorability of spatial scientific visualizations, the goal of which is to understand the visual features that contribute to memorability.
The evaluation metrics include three objective measures (entropy, feature congestion, the number of edges), four subjective ratings (clutter, the number of distinct colors, familiarity, and realism), and two sentiment ratings (interestingness and happiness).
We curate 1142 scientific visualization (SciVis) images from the original 2231 images in published IEEE SciVis papers from 2008 to 2017 and compute memorability scores of 228 SciVis images from data collected on Amazon Mechanical Turk (MTurk).
Results showed that the memorability of SciVis images is mostly correlated with clutter and the number of distinct colors.
We further investigate the differences between scientific visualization and infographics as a means to understand memorability differences by data attributes.
Let G be a graph embedded on a surface of genus g with b boundary cycles.
We describe algorithms to compute multiple types of non-trivial cycles in G, using different techniques depending on whether or not G is an undirected graph.
If G is undirected, then we give an algorithm to compute a shortest non-separating cycle in 2^O(g) n log log n time.
Similar algorithms are given to compute a shortest non-contractible or non-null-homologous cycle in 2^O(g+b) n log log n time.
Our algorithms for undirected G combine an algorithm of Kutz with known techniques for efficiently enumerating homotopy classes of curves that may be shortest non-trivial cycles.
Our main technical contributions in this work arise from assuming G is a directed graph with possibly asymmetric edge weights.
For this case, we give an algorithm to compute a shortest non-contractible cycle in G in O((g^3 + g b)n log n) time.
In order to achieve this time bound, we use a restriction of the infinite cyclic cover that may be useful in other contexts.
We also describe an algorithm to compute a shortest non-null-homologous cycle in G in O((g^2 + g b)n log n) time, extending a known algorithm of Erickson to compute a shortest non-separating cycle.
In both the undirected and directed cases, our algorithms improve the best time bounds known for many values of g and b.
Impervious surface area is a direct consequence of urbanization and plays an important role in urban planning and environmental management.
With the rapid technical development of remote sensing, monitoring urban impervious surfaces via high spatial resolution (HSR) images has attracted unprecedented attention recently.
Traditional multi-class models are inefficient for impervious surface extraction because they require exhaustively labeling all classes, needed or not, that occur in the image.
Therefore, we need to find a reliable one-class model to classify one specific land cover type without labeling other classes.
In this study, we investigate several one-class classifiers, such as Presence and Background Learning (PBL), Positive Unlabeled Learning (PUL), OCSVM, BSVM and MAXENT, to extract urban impervious surface area using high spatial resolution imagery of GF-1, China's new generation of high spatial remote sensing satellite, and evaluate the classification accuracy based on artificial interpretation results.
Compared to traditional multi-class classifiers (ANN and SVM), the experimental results indicate that PBL and PUL provide higher classification accuracy, which is similar to the accuracy provided by the ANN model.
Meanwhile, PBL and PUL outperform the OCSVM, BSVM, MAXENT and SVM models.
Hence, one-class classifiers need only a small set of specific samples to train models without losing predictive accuracy, and deserve more attention for urban impervious surface extraction and other single land cover types.
Relays in cellular systems are interference limited.
The highest end-to-end sum rates are achieved when the relays are jointly optimized with the transmit strategy.
Unfortunately, interference couples the links together making joint optimization challenging.
Further, the end-to-end multi-hop performance is sensitive to rate mismatch, when some links have a dominant first link while others have a dominant second link.
This paper proposes an algorithm for designing the linear transmit precoders at the transmitters and relays of the relay interference broadcast channel, a generic model for relay-based cellular systems, to maximize the end-to-end sum-rates.
First, the relays are designed to maximize the second-hop sum-rates.
Next, approximate end-to-end rates that depend on the time-sharing fraction and the second-hop rates are used to formulate a sum-utility maximization problem for designing the transmitters.
This problem is solved by iteratively minimizing the weighted sum of mean square errors.
Finally, the norms of the transmit precoders at the transmitters are adjusted to eliminate rate mismatch.
The proposed algorithm allows for distributed implementation and has fast convergence.
Numerical results show that the proposed algorithm outperforms a reasonable application of single-hop interference management strategies separately on two hops.
Current analysis of tumor proliferation, the most salient prognostic biomarker for invasive breast cancer, is limited to subjective mitosis counting by pathologists in localized regions of tissue images.
This study presents the first data-driven integrative approach to characterize the severity of tumor growth and spread on a categorical and molecular level, utilizing multiple biologically salient deep learning classifiers to develop a comprehensive prognostic model.
Our approach achieves pathologist-level performance on three-class categorical tumor severity prediction.
It additionally pioneers prediction of molecular expression data from a tissue image, obtaining a Spearman's rank correlation coefficient of 0.60 with ex vivo mean calculated RNA expression.
Furthermore, our framework is applied to identify over two hundred unprecedented biomarkers critical to the accurate assessment of tumor proliferation, validating our proposed integrative pipeline as the first to holistically and objectively analyze histopathological images.
Deep reinforcement learning has led to several recent breakthroughs, though the learned policies are often based on black-box neural networks.
This makes them difficult to interpret and to impose desired specification constraints during learning.
We present an iterative framework, MORL, for improving the learned policies using program synthesis.
Concretely, we propose to use synthesis techniques to obtain a symbolic representation of the learned policy, which can then be debugged manually or automatically using program repair.
After the repair step, we use behavior cloning to obtain the policy corresponding to the repaired program, which is then further improved using gradient descent.
This process continues until the learned policy satisfies desired constraints.
We instantiate MORL for the simple CartPole problem and show that the programmatic representation allows for high-level modifications that in turn lead to improved learning of the policies.
This paper presents a study of the Internet infrastructure in India from the point of view of censorship.
First, we show that the current state of affairs---where each ISP implements its own content filters (nominally as per a governmental blacklist)---results in dramatic differences in the censorship experienced by customers.
In practice, a well-informed Indian citizen can escape censorship through a judicious choice of service provider.
We then consider the question of whether India might potentially follow the Chinese model and institute a single, government-controlled filter.
This would not be difficult, as the Indian Internet is quite centralized already.
A few "key" ASes (approx 1% of Indian ASes) collectively intercept approx 95% of paths to the censored sites we sample in our study, and also to all publicly-visible DNS servers.
5,000 routers spanning these key ASes would suffice to carry out IP or DNS filtering for the entire country; approx 70% of these routers belong to only two private ISPs.
If the government is willing to employ more powerful measures, such as an IP Prefix Hijacking attack, any one of several key ASes can censor traffic for nearly all Indian users.
Finally, we demonstrate that such federated censorship by India would cause substantial collateral damage to non-Indian ASes whose traffic passes through Indian cyberspace (which do not legally come under Indian jurisdiction at all).
The main goal of group testing with inhibitors (GTI) is to efficiently identify a small number of defective items and inhibitor items in a large set of items.
A test on a subset of items is positive if the subset satisfies some specific properties.
Inhibitor items cancel the effects of defective items, which often makes the outcome of a test containing defective items negative.
Different GTI models can be formulated by considering how specific properties have different cancellation effects.
This work introduces generalized GTI (GGTI) in which a new type of items is added, i.e., hybrid items.
A hybrid item plays the roles of both a defective item and an inhibitor item.
Since the number of instances of GGTI is large (more than 7 million), we introduce a framework for classifying all types of items non-adaptively, i.e., all tests are designed in advance.
We then explain how GGTI can be used to classify neurons in neuroscience.
Finally, we show how to realize our proposed scheme in practice.
The key limiting factor in graphical model inference and learning is the complexity of the partition function.
We thus ask the question: what are general conditions under which the partition function is tractable?
The answer leads to a new kind of deep architecture, which we call sum-product networks (SPNs).
SPNs are directed acyclic graphs with variables as leaves, sums and products as internal nodes, and weighted edges.
We show that if an SPN is complete and consistent it represents the partition function and all marginals of some graphical model, and give semantics to its nodes.
Essentially all tractable graphical models can be cast as SPNs, but SPNs are also strictly more general.
We then propose learning algorithms for SPNs, based on backpropagation and EM.
Experiments show that inference and learning with SPNs can be both faster and more accurate than with standard deep networks.
For example, SPNs perform image completion better than state-of-the-art deep networks for this task.
SPNs also have intriguing potential connections to the architecture of the cortex.
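The evaluation semantics of SPNs can be made concrete with a toy example. The sketch below (ours, not the authors' code) builds a tiny SPN over two Boolean variables; evaluating it with all indicator leaves set to 1 yields the partition function, and clamping indicators to evidence yields marginals.

```python
def leaf(var, value):
    # indicator leaf for [X_var = value]; reads its setting from `ind`
    return lambda ind: ind[(var, value)]

def product(*children):
    def node(ind):
        result = 1.0
        for child in children:
            result *= child(ind)
        return result
    return node

def weighted_sum(weighted_children):
    def node(ind):
        return sum(w * child(ind) for w, child in weighted_children)
    return node

# A complete and consistent SPN over X1, X2:
# 0.6 * [X1=1][X2=1] + 0.4 * [X1=0][X2=0]
spn = weighted_sum([
    (0.6, product(leaf(1, 1), leaf(2, 1))),
    (0.4, product(leaf(1, 0), leaf(2, 0))),
])

# Partition function: evaluate with all indicators set to 1.
all_ones = {(v, b): 1.0 for v in (1, 2) for b in (0, 1)}
Z = spn(all_ones)                 # 0.6 + 0.4 = 1.0

# Marginal P(X1 = 1): clamp X1's indicators, leave X2 summed out.
evidence = dict(all_ones)
evidence[(1, 0)] = 0.0
p_x1 = spn(evidence) / Z          # 0.6
```

A single bottom-up pass answers both queries, which is exactly the tractability the abstract refers to.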
Disentangling factors of variation has become a very challenging problem in representation learning.
Existing algorithms suffer from many limitations, such as unpredictable disentangling factors, poor quality of generated images from encodings, lack of identity information, etc.
In this paper, we propose a supervised learning model called DNA-GAN which tries to disentangle different factors or attributes of images.
The latent representations of images are DNA-like, in which each individual piece (of the encoding) represents an independent factor of variation.
By annihilating the recessive piece and swapping a certain piece of one latent representation with that of the other one, we obtain two different representations which could be decoded into two kinds of images with the existence of the corresponding attribute being changed.
In order to obtain realistic images and also disentangled representations, we further introduce the discriminator for adversarial training.
Experiments on the Multi-PIE and CelebA datasets demonstrate that our proposed method is effective at disentangling factors and even overcomes certain limitations of existing methods.
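The piece-swapping operation can be illustrated in isolation. The snippet below is a toy sketch only (the paper's encoder, decoder, and annihilating operation are learned GAN components; the piece length `PIECE` and the flat integer codes here are hypothetical stand-ins):

```python
PIECE = 4  # hypothetical piece length per attribute

def split(code, piece=PIECE):
    # view a flat latent code as a list of attribute pieces
    return [code[i:i + piece] for i in range(0, len(code), piece)]

def swap_piece(code_a, code_b, attr, piece=PIECE):
    # exchange the piece encoding one attribute between two latent codes
    a, b = split(code_a, piece), split(code_b, piece)
    a[attr], b[attr] = b[attr], a[attr]
    return sum(a, []), sum(b, [])

za = [1] * 8          # two attributes, piece length 4
zb = [0] * 8
na, nb = swap_piece(za, zb, attr=1)
# na == [1, 1, 1, 1, 0, 0, 0, 0]; nb == [0, 0, 0, 0, 1, 1, 1, 1]
```

Decoding `na` and `nb` would then produce two images differing only in the swapped attribute.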
The paper describes a novel social network-based open educational resource for learning foreign languages in real time from native speakers, based on the predefined teaching materials.
This virtual learning platform, named i2istudy, eliminates misunderstanding by providing prepared and predefined scenarios, enabling the participants to understand each other and, as a consequence, to communicate freely.
The system allows communication through the real time video and audio feed.
In addition to establishing the communication, it tracks the student progress and allows rating the instructor, based on the learner's experience.
The system went live in April 2014, and had over six thousand active daily users, with over 40,000 total registered users.
Monetization is currently being added to the system, and time will tell how popular the system becomes.
A blocking quadruple (BQ) is a quadruple of vertices of a graph such that any two vertices of the quadruple either miss (have no neighbours on) some path connecting the remaining two vertices of the quadruple, or are connected by some path missed by the remaining two vertices.
This is akin to the notion of asteroidal triple used in the classical characterization of interval graphs by Lekkerkerker and Boland.
We show that a circular-arc graph cannot have a blocking quadruple.
We also observe that the absence of blocking quadruples is not in general sufficient to guarantee that a graph is a circular-arc graph.
Nonetheless, it can be shown to be sufficient for some special classes of graphs, such as those investigated by Bonomo et al.
In this note, we focus on chordal graphs, and study the relationship between the structure of chordal graphs and the presence/absence of blocking quadruples.
Our contribution is two-fold.
Firstly, we provide a forbidden induced subgraph characterization of chordal graphs without blocking quadruples.
In particular, we observe that all the forbidden subgraphs are variants of the subgraphs forbidden for interval graphs.
Secondly, we show that the absence of blocking quadruples is sufficient to guarantee that a chordal graph with no independent set of size five is a circular-arc graph.
In our proof we use a novel geometric approach, constructing a circular-arc representation by traversing around a carefully chosen clique tree.
The application of mobile computing is currently altering patterns of our behavior to a greater degree than perhaps any other invention.
In combination with the introduction of power efficient wireless communication technologies, such as Bluetooth Low Energy (BLE), designers are today increasingly empowered to shape the way we interact with our physical surroundings and thus build entirely new experiences.
However, our evaluations of BLE and its abilities to facilitate mobile location-based experiences in public environments revealed a number of potential problems.
Most notably, the position and orientation of the user in combination with various environmental factors, such as crowds of people traversing the space, were found to cause major fluctuations of the received BLE signal strength.
These issues are rendering a seamless functioning of any location-based application practically impossible.
Instead of achieving seamlessness by eliminating these technical issues, we thus choose to advocate the use of a seamful approach, i.e. to reveal and exploit these problems and turn them into a part of the actual experience.
In order to demonstrate the viability of this approach, we designed, implemented and evaluated the Ghost Detector - an educational location-based museum game for children.
By presenting a qualitative evaluation of this game and by motivating our design decisions, this paper provides insight into some of the challenges and possible solutions connected to the process of developing location-based BLE-enabled experiences for public cultural spaces.
Robots with flexible spines based on tensegrity structures have potential advantages over traditional designs with rigid torsos.
However, these robots can be difficult to control due to their high-dimensional nonlinear dynamics.
To overcome these issues, this work presents two controllers for tensegrity spine robots, using model-predictive control (MPC), and demonstrates the first closed-loop control of such structures.
The first of the two controllers is formulated using only state tracking with smoothing constraints.
The second controller, newly introduced in this work, tracks both state and input reference trajectories without smoothing.
The reference input trajectory is calculated using a rigid-body reformulation of the inverse kinematics of tensegrity structures, and introduces the first feasible solutions to the problem for certain tensegrity topologies.
This second controller significantly reduces the number of parameters involved in designing the control system, making the task much easier.
The controllers are simulated with 2D and 3D models of a particular tensegrity spine, designed for use as the backbone of a quadruped robot.
These simulations illustrate the trade-off between the higher performance of the smoothing controller and the lower tuning complexity of the more general input-tracking formulation.
Both controllers show noise insensitivity and low tracking error, and can be used for different control goals.
The reference input tracking controller is also simulated against an additional model of a similar robot, thereby demonstrating its generality.
At present, AES is one of the most widely used and most secure encryption systems. Naturally, a great deal of research is devoted to mounting a significant attack on AES. Many different forms of linear and differential cryptanalysis have been applied to AES. Of late, an active area of research has been algebraic cryptanalysis of AES, where, although fast progress is being made, there is still ample scope for research and improvement.
One of the major reasons is that algebraic cryptanalysis mainly depends on the input/output relations of the AES S-Box (a major component of AES). As is already known, key recovery for AES can be cast as an MQ (multivariate quadratic) problem, which is itself considered hard. Solving these equations depends on our ability to reduce them to linear forms, which are easily solvable with current computational power. The lower the degree of these equations, the easier they are to linearize, and hence the lower the attack complexity. The aim of this paper is to analyze the various relations involving a small number of monomials of the AES S-Box, and to answer the question of whether it is actually possible to have such monomial equations for the S-Box if we restrict the degree of the monomials. In other words, this paper aims to study such equations and determine whether they are applicable to AES.
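To make the object of study concrete, the S-Box whose I/O relations are analyzed can be generated directly from its FIPS-197 definition: multiplicative inversion in GF(2^8), followed by a fixed affine map. A minimal sketch:

```python
def gf_mul(a, b):
    # multiplication in GF(2^8) with AES modulus x^8 + x^4 + x^3 + x + 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a):
    # multiplicative inverse; by convention 0 maps to 0
    if a == 0:
        return 0
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def affine(b):
    # FIPS-197 affine transformation with constant 0x63
    out = 0
    for i in range(8):
        bit = 0
        for k in (0, 4, 5, 6, 7):
            bit ^= (b >> ((i + k) % 8)) & 1
        bit ^= (0x63 >> i) & 1
        out |= bit << i
    return out

sbox = [affine(gf_inv(x)) for x in range(256)]
# sbox[0x00] == 0x63 and sbox[0x01] == 0x7C (FIPS-197 values);
# the S-Box is a bijection on {0, ..., 255}
```

Any candidate low-degree monomial relation can then be checked exhaustively against all 256 input/output pairs of this table.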
We consider stochastic transition matrices from large social and information networks.
For these matrices, we describe and evaluate three fast methods to estimate one column of the matrix exponential.
The methods are designed to exploit the properties inherent in social networks, such as a power-law degree distribution.
Using only this property, we prove that one of our algorithms has a sublinear runtime.
We present further experimental evidence showing that all of them run quickly on social networks with billions of edges and accurately identify the largest elements of the column.
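The column-wise computation the fast methods accelerate can be sketched with a truncated Taylor series that touches the matrix only through matrix-vector products, so sparsity is exploited naturally (the paper's sublinear algorithm adds further approximations on top of this basic idea):

```python
import math

def expm_column(P, j, terms=30):
    # exp(P) e_j = sum_k P^k e_j / k!, built from matrix-vector products only
    n = len(P)
    v = [0.0] * n
    v[j] = 1.0                       # current term P^k e_j / k!
    col = v[:]                       # k = 0 term
    for k in range(1, terms + 1):
        v = [sum(P[i][m] * v[m] for m in range(n)) / k for i in range(n)]
        col = [c + t for c, t in zip(col, v)]
    return col

P = [[0.5, 0.5], [0.5, 0.5]]         # a tiny stochastic matrix
col = expm_column(P, 0)
# exact answer here is [1 + 0.5*(e-1), 0.5*(e-1)]; the column sums to e
```

For a sparse transition matrix, each iteration costs only the number of nonzeros touched, which is what makes degree-distribution-aware truncation pay off.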
The area of online machine learning in big data streams covers algorithms that are (1) distributed and (2) work from data streams with only a limited possibility to store past data.
The first requirement mostly concerns software architectures and efficient algorithms.
The second one also imposes nontrivial theoretical restrictions on the modeling methods: In the data stream model, older data is no longer available to revise earlier suboptimal modeling decisions as the fresh data arrives.
In this article, we provide an overview of distributed software architectures and libraries as well as machine learning models for online learning.
We highlight the most important ideas for classification, regression, recommendation, and unsupervised modeling from streaming data, and we show how they are implemented in various distributed data stream processing systems.
This article is a reference material and not a survey.
We do not attempt to be comprehensive in describing all existing methods and solutions; rather, we give pointers to the most important resources in the field.
All related sub-fields (online algorithms, online learning, and distributed data processing) are highly active in current research and development, with conceptually new research results and software components emerging at the time of writing.
In this article, we refer to several survey results, both for distributed data processing and for online machine learning.
Compared to past surveys, our article is different because we discuss recommender systems in extended detail.
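The data-stream restriction described above can be illustrated with a minimal online learner that updates on each example once and then discards it. This is a sketch of ours, not taken from the article:

```python
def make_stream(n):
    # synthetic noiseless stream of (x, y) pairs with y = 2x + 1
    for t in range(n):
        x = (t % 10) / 10.0
        yield x, 2.0 * x + 1.0

w, b, lr = 0.0, 0.0, 0.1
for x, y in make_stream(5000):
    err = (w * x + b) - y        # prediction error on the current example
    w -= lr * err * x            # SGD step; the example is then discarded
    b -= lr * err

# after a single pass over the stream, w is close to 2 and b close to 1
```

No past data is ever stored, so earlier suboptimal updates can only be corrected by fresh arrivals, which is exactly the theoretical restriction the article discusses.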
In this work, we propose a step towards a more accurate prediction of the environment light given a single picture of a known object.
To achieve this, we developed a deep learning method that is able to encode the latent space of indoor lighting using few parameters and that is trained on a database of environment maps.
This latent space is then used to generate predictions of the light that are both more realistic and accurate than previous methods.
To achieve this, our first contribution is a deep autoencoder which is capable of learning the feature space that compactly models lighting.
Our second contribution is a convolutional neural network that predicts the light from a single image of a known object.
To train these networks, our third contribution is a novel dataset that contains 21,000 HDR indoor environment maps.
The results indicate that the predictor can generate plausible lighting estimations even from diffuse objects.
With rapid growth in the amount of unstructured data produced by memory-intensive applications, large scale data analytics has recently attracted increasing interest.
Processing, managing and analyzing this huge amount of data poses several challenges in cloud and data center computing domain.
In particular, conventional frameworks for distributed data analytics are based on the assumption of homogeneity and non-stochastic distribution across the different data-processing nodes. The paper examines the fundamental factors limiting the scaling of big data computation. It is shown that as the number of series and parallel computing servers increases, the tail (mean and variance) of the job execution time increases.
We will first propose a model to predict the response time of highly distributed processing tasks and then propose a new practical computational algorithm to optimize the response time.
Time-varying delays adversely affect the performance of networked control systems (NCS) and in the worst case can destabilize the entire system.
Therefore, modelling network delays is important for designing NCS.
However, modelling time-varying delays is challenging because of their dependence on multiple parameters such as length, contention, connected devices, protocol employed, and channel loading.
Further, these multiple parameters are inherently random, and delays vary in a non-linear fashion with respect to time.
This makes estimating random delays challenging.
This investigation presents a methodology to model delays in NCS using experiments and a general regression neural network (GRNN), chosen for its ability to capture non-linear relationships.
A genetic algorithm is used to compute the optimal smoothing parameter, with the objective of minimizing the mean absolute percentage error (MAPE) of the estimates.
Our results illustrate that the resulting GRNN is able to predict the delays with less than 3% error.
The proposed delay model gives a framework to design compensation schemes for NCS subjected to time-varying delays.
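A GRNN prediction takes the Nadaraya-Watson form: a Gaussian-kernel weighted average of training outputs governed by a single smoothing parameter sigma. The sketch below fixes sigma by hand and uses illustrative toy numbers (the paper tunes sigma with a genetic algorithm on measured delay data):

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    # GRNN / Nadaraya-Watson estimate: kernel-weighted average of outputs
    weights = [math.exp(-((x - xi) ** 2) / (2 * sigma ** 2)) for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

# toy samples (channel load -> delay in ms); values are illustrative only
loads  = [0.1, 0.3, 0.5, 0.7, 0.9]
delays = [2.0, 2.5, 4.0, 7.0, 12.0]
est = grnn_predict(0.5, loads, delays, sigma=0.05)
# with a small sigma the estimate at a training point is close to its label
```

Smaller sigma makes the estimate interpolate the samples; larger sigma smooths across them, which is why a single well-chosen smoothing parameter determines estimation quality.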
The aggregate behaviors of users can collectively encode deep semantic information about the objects with which they interact.
In this paper, we demonstrate novel ways in which the synthesis of these data can illuminate the terrain of users' environment and support them in their decision making and wayfinding.
A novel application of Recurrent Neural Networks and skip-gram models, approaches popularized by their application to modeling language, is brought to bear on student university enrollment sequences to create vector representations of courses and to map out traversals across them.
We demonstrate how scrutability can be gained from these neural networks, and how the combination of these techniques can be seen as an evolution of content tagging and a means for a recommender to balance user preferences inferred from data with those explicitly specified.
From validation of the models to the development of a UI, we discuss additional requisite functionality informed by the results of a usability study leading to the ultimate deployment of the system at a university.
In this study, both Bayesian classifiers and mutual information classifiers are examined for binary classifications with or without a reject option.
The general decision rules in terms of distinctions on error types and reject types are derived for Bayesian classifiers.
A formal analysis is conducted to reveal the parameter redundancy of cost terms when abstaining classifications are enforced.
The redundancy implies an intrinsic problem of "non-consistency" for interpreting cost terms.
We demonstrate the weakness of Bayesian classifiers in class-imbalanced classifications when no data are available to inform the cost terms.
In contrast, mutual-information classifiers are able to provide an objective solution from the given data, which shows a reasonable balance among error types and reject types.
Numerical examples of using two types of classifiers are given for confirming the theoretical differences, including the extremely-class-imbalanced cases.
Finally, we briefly summarize the Bayesian classifiers and mutual-information classifiers in terms of their application advantages, respectively.
Effective monitoring and management of environment pollution is key to the development of modern metropolitan cities.
To sustain and cope with the exponential growth of highly industrialized cities, expert decision making is essential in this process.
A good governance system must be supported by an actively participating population.
In participatory sensing, individuals and groups actively engage in data collection and help the city governance make proper decisions.
In this paper, we propose a participatory sensing based three-tier framework to fight environment pollution in urban areas of Bangladesh.
The framework includes an android application named `My City, My Environment', a server for storage and computation and also a web server for the authority to monitor and maintain environmental issues through expert decision making.
We have already developed a prototype system, deployed it at a small scale, and demonstrated the effectiveness of this framework.
The transparency of Open Data helps citizens evaluate government performance.
In Indonesia, each government body or ministry has its own standard operating procedure for data treatment, resulting in incoherent information between agencies and a likelihood of missing valuable insights.
Therefore, our motivation is to show the advantage of the Open Data movement in supporting unified government decision making.
We use the dataset from data.go.id, which publishes official data from each government body.
The idea is that, using these official but limited data, we can find important patterns.
The case study is on Human Development Index value prediction and its clustered nature.
We explore the data patterns using two important data analytics methods: classification and clustering.
Data analytics is the collection of activities that reveal unknown data patterns.
Specifically, we use Artificial Neural Network classification and K-means clustering.
The classification objective is to categorize the Human Development Index level of cities or regions in Indonesia based on Gross Domestic Product, Number of Population in Poverty, Number of Internet Users, Number of Labors, and Number of Population indicator data.
We determine which of the four categories of Human Development stated by the UNDP standard each city belongs to.
The clustering objective is to find group characteristics relating the Human Development Index and Gross Domestic Product.
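The clustering step can be sketched with a minimal one-dimensional K-means (not the paper's full pipeline; the scores below are illustrative stand-ins for HDI-like values):

```python
def kmeans_1d(values, centers, iters=20):
    # Lloyd's algorithm in 1-D: assign to nearest center, then recenter
    clusters = [[] for _ in centers]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            i = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            clusters[i].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# illustrative scores forming two clearly separated groups
scores = [0.55, 0.58, 0.60, 0.80, 0.83, 0.85]
centers, clusters = kmeans_1d(scores, centers=[0.5, 0.9])
# centers converge to the two group means
```

The real analysis clusters regions in the joint HDI/GDP space, but the assign-then-recenter iteration is identical.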
We study networks of human decision-makers who independently decide how to protect themselves against Susceptible-Infected-Susceptible (SIS) epidemics.
Motivated by studies in behavioral economics showing that humans perceive probabilities in a nonlinear fashion, we examine the impacts of such misperceptions on the equilibrium protection strategies.
In our setting, nodes choose their curing rates to minimize the infection probability under the degree-based mean-field approximation of the SIS epidemic plus the cost of their selected curing rate.
We establish the existence of a degree-based equilibrium under both true and nonlinear perceptions of infection probabilities (under suitable assumptions).
When the per-unit cost of curing rate is sufficiently high, we show that true expectation minimizers choose the curing rate to be zero at the equilibrium, while curing rate is nonzero under nonlinear probability weighting.
We consider linear precoder design for a multiple-input multiple-output (MIMO) Gaussian wiretap channel, which comprises two legitimate nodes, i.e., Alice and Bob, operating in Full-Duplex (FD) mode and exchanging confidential messages in the presence of a passive eavesdropper.
Using the sum secrecy degrees of freedoms (sum S.D.o.F.) as reliability measure, we formulate an optimization problem with respect to the precoding matrices.
In order to solve this problem, we first propose a cooperative secrecy transmission scheme, and prove that its feasible set is sufficient to achieve the maximum sum S.D.o.F. Based on that feasible set, we then determine the maximum achievable sum S.D.o.F. in closed form and provide a method for constructing the precoding matrix pair that achieves it.
Results show that the FD-based network provides attractive secrecy transmission rate performance.
Differential privacy is a promising framework for addressing the privacy concerns in sharing sensitive datasets for others to analyze.
However, differential privacy is a highly technical area, and current deployments often require experts to write code, tune parameters, and optimize the trade-off between the privacy and accuracy of statistical releases.
For differential privacy to achieve its potential for wide impact, it is important to design usable systems that enable differential privacy to be used by ordinary data owners and analysts.
PSI is a tool that was designed for this purpose, allowing researchers to release useful differentially private statistical information about their datasets without being experts in computer science, statistics, or privacy.
We conducted a thorough usability study of PSI to test whether it accomplishes its goal of usability by non-experts.
The usability test illuminated which features of PSI are most user-friendly and prompted us to improve aspects of the tool that caused confusion.
The test also highlighted some general principles and lessons for designing usable systems for differential privacy, which we discuss in depth.
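A sense of what tools like PSI wrap behind a usable interface is given by the core primitive itself, the Laplace mechanism, sketched below (this is a generic illustration, not PSI's implementation; PSI additionally manages a privacy budget across many releases):

```python
import math
import random

def laplace_count(true_count, epsilon, sensitivity=1.0, rng=random):
    # Laplace mechanism: add noise with scale sensitivity / epsilon
    scale = sensitivity / epsilon
    u = rng.random() - 0.5                      # uniform on [-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    noise = -scale * sign * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

random.seed(0)
releases = [laplace_count(1000, epsilon=0.5) for _ in range(20000)]
avg = sum(releases) / len(releases)             # close to 1000: noise is mean-zero
```

The usability problem PSI addresses is precisely that non-experts should not have to pick `epsilon`, reason about sensitivity, or split a budget across queries by hand.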
Correctly identifying crosswalks is an essential task for the driving activity and mobility autonomy.
Many crosswalk classification, detection and localization systems have been proposed in the literature over the years.
These systems use different perspectives to tackle the crosswalk classification problem: satellite imagery, cockpit view (from the top of a car or behind the windshield), and pedestrian perspective.
Most of the works in the literature are designed and evaluated using small and local datasets, i.e. datasets that present low diversity.
Scaling to large datasets imposes a challenge for the annotation procedure.
Moreover, there is still need for cross-database experiments in the literature because it is usually hard to collect the data in the same place and conditions of the final application.
In this paper, we present a crosswalk classification system based on deep learning.
For that, crowdsourcing platforms, such as OpenStreetMap and Google Street View, are exploited to enable automatic training via automatic acquisition and annotation of a large-scale database.
Additionally, this work proposes a comparison study of models trained using fully-automatic data acquisition and annotation against models that were partially annotated.
Cross-database experiments were also included in the experimentation to show that the proposed methods are usable in real-world applications.
Our results show that the model trained on the fully-automatic database achieved high overall accuracy (94.12%), and that a statistically significant improvement (to 96.30%) can be achieved by manually annotating a specific part of the database.
Finally, the results of the cross-database experiments show that both models are robust to the many variations of image and scenarios, presenting a consistent behavior.
Recently, increasing attention has been directed to the study of speech emotion recognition, in which global acoustic features of an utterance are mostly used to eliminate content differences.
However, the expression of speech emotion is a dynamic process, which is reflected through dynamic durations, energies, and some other prosodic information when one speaks.
In this paper, a novel local dynamic pitch probability distribution feature, which is obtained by drawing the histogram, is proposed to improve the accuracy of speech emotion recognition.
Compared with most of the previous works using global features, the proposed method takes advantage of the local dynamic information conveyed by the emotional speech.
Several experiments on Berlin Database of Emotional Speech are conducted to verify the effectiveness of the proposed method.
The experimental results demonstrate that the local dynamic information obtained with the proposed method is more effective for speech emotion recognition than the traditional global features.
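A histogram-based pitch feature of the kind described can be sketched as follows (details such as bin count and normalization differ from the paper; the pitch values below are illustrative):

```python
def pitch_histogram(pitch_hz, bins=5):
    # probability histogram over the voiced frames of one utterance segment
    voiced = [p for p in pitch_hz if p > 0]          # drop unvoiced frames
    lo, hi = min(voiced), max(voiced)
    counts = [0] * bins
    for p in voiced:
        i = min(int((p - lo) / (hi - lo + 1e-9) * bins), bins - 1)
        counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]               # sums to 1

# toy frame-level pitch track in Hz (0 marks unvoiced frames)
feat = pitch_histogram([0, 120, 130, 0, 180, 210, 220, 0, 125])
# feat is a `bins`-dimensional probability distribution
```

Computed over short local windows rather than the whole utterance, such distributions capture the dynamic prosodic information the paper exploits.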
Combining deep model-free reinforcement learning with on-line planning is a promising approach to building on the successes of deep RL.
On-line planning with look-ahead trees has proven successful in environments where transition models are known a priori.
However, in complex environments where transition models need to be learned from data, the deficiencies of learned models have limited their utility for planning.
To address these challenges, we propose TreeQN, a differentiable, recursive, tree-structured model that serves as a drop-in replacement for any value function network in deep RL with discrete actions.
TreeQN dynamically constructs a tree by recursively applying a transition model in a learned abstract state space and then aggregating predicted rewards and state-values using a tree backup to estimate Q-values.
We also propose ATreeC, an actor-critic variant that augments TreeQN with a softmax layer to form a stochastic policy network.
Both approaches are trained end-to-end, such that the learned model is optimised for its actual use in the tree.
We show that TreeQN and ATreeC outperform n-step DQN and A2C on a box-pushing task, as well as n-step DQN and value prediction networks (Oh et al., 2017) on multiple Atari games.
Furthermore, we present ablation studies that demonstrate the effect of different auxiliary losses on learning transition models.
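The tree backup at the heart of this construction can be sketched conceptually: expand the transition model to a fixed depth, then back rewards and bootstrap values up the tree with a max over actions. In the sketch below, `step` and `value` are hypothetical toy stand-ins for TreeQN's learned transition/reward and value heads, which operate on learned abstract states:

```python
GAMMA = 0.99

def step(s, a):
    # toy deterministic dynamics on an integer "abstract state"
    return s + (1 if a == 1 else -1), float(a)       # (next state, reward)

def value(s):
    return 0.1 * s                                    # toy value head

def q_backup(s, a, depth):
    # depth-limited tree backup: bootstrap with value() at the leaves,
    # back up with a max over actions at internal nodes
    s2, r = step(s, a)
    if depth == 1:
        return r + GAMMA * value(s2)
    return r + GAMMA * max(q_backup(s2, a2, depth - 1) for a2 in (0, 1))

q = q_backup(0, 1, depth=3)
```

In TreeQN this entire recursion is differentiable, so gradients from the Q-learning loss shape the model toward being useful for planning rather than merely accurate.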
Mining financial text documents and understanding the sentiments of individual investors, institutions and markets is an important and challenging problem in the literature.
Current approaches to mine sentiments from financial texts largely rely on domain specific dictionaries.
However, dictionary based methods often fail to accurately predict the polarity of financial texts.
This paper aims to improve the state-of-the-art and introduces a novel sentiment analysis approach that employs the concept of financial and non-financial performance indicators.
It presents an association rule mining based hierarchical sentiment classifier model to predict the polarity of financial texts as positive, neutral or negative.
The performance of the proposed model is evaluated on a benchmark financial dataset.
The model is also compared against other state-of-the-art dictionary and machine learning based approaches and the results are found to be quite promising.
The novel use of performance indicators for financial sentiment analysis offers interesting and useful insights.
We consider the numerical modeling of the Farley-Buneman instability development in the earth's ionosphere plasma.
The ion behavior is governed by the kinetic Landau equation in the four-dimensional phase space, and since the finite difference discretization on a tensor product grid is used, this equation becomes the most computationally challenging part of the scheme.
To relax the complexity and memory consumption, an adaptive model reduction using the low-rank separation of variables, namely the Tensor Train format, is employed.
The approach was verified via a prototype MATLAB implementation.
Numerical experiments demonstrate the possibility of efficiently separating space and velocity variables, reducing solution storage by a factor on the order of tens.
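The storage saving behind separating variables can be illustrated at rank 1, the simplest tensor-train case. The toy sketch below (not the paper's solver) recovers the two factor vectors of a separable function from one row and one column of its full grid:

```python
import math

n = 40
xs = [i / n for i in range(n)]
vs = [j / n for j in range(n)]

# a separable function f(x, v) = exp(-x) * (1 + v) sampled on an n-by-n grid
full = [[math.exp(-x) * (1.0 + v) for v in vs] for x in xs]   # n*n values

# cross (skeleton) approximation: at rank 1, one column and one row suffice
col = [full[i][0] for i in range(n)]
row = [full[0][j] / full[0][0] for j in range(n)]
err = max(abs(full[i][j] - col[i] * row[j])
          for i in range(n) for j in range(n))
# err is at floating-point level; storage drops from n*n to 2n values
```

The actual kinetic solution is only approximately separable, so the tensor-train format uses adaptively chosen ranks rather than rank 1, but the mechanism of the memory reduction is the same.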
Optimizing a deep neural network is a fundamental task in computer vision, yet direct training methods often suffer from over-fitting.
Teacher-student optimization aims at providing complementary cues from a model trained previously, but these approaches are often considerably slow due to the pipeline of training a few generations in sequence, i.e., time complexity is increased by several times.
This paper presents snapshot distillation (SD), the first framework which enables teacher-student optimization in one generation.
The idea of SD is very simple: instead of borrowing supervision signals from previous generations, we extract such information from earlier epochs in the same generation, meanwhile make sure that the difference between teacher and student is sufficiently large so as to prevent under-fitting.
To achieve this goal, we implement SD in a cyclic learning rate policy, in which the last snapshot of each cycle is used as the teacher for all iterations in the next cycle, and the teacher signal is smoothed to provide richer information.
In standard image classification benchmarks such as CIFAR100 and ILSVRC2012, SD achieves consistent accuracy gain without heavy computational overheads.
We also verify that models pre-trained with SD transfer well to object detection and semantic segmentation on the PascalVOC dataset.
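The cyclic learning-rate policy and the smoothed (temperature-softened) teacher signal can be sketched as follows; this is a hedged illustration rather than the paper's training code, and the cycle length, peak learning rate, and temperature are assumed values:

```python
import math

def cyclic_cosine_lr(step, steps_per_cycle, lr_max=0.1, lr_min=0.0):
    """Cosine-annealed learning rate that restarts every cycle."""
    t = (step % steps_per_cycle) / steps_per_cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

def softmax_with_temperature(logits, T=2.0):
    """Smoothed teacher signal: a higher temperature T spreads the
    probability mass, providing richer supervision than hard labels."""
    m = max(l / T for l in logits)
    exps = [math.exp(l / T - m) for l in logits]   # stable softmax
    s = sum(exps)
    return [e / s for e in exps]

steps_per_cycle = 100
teacher_snapshot = None
for step in range(300):
    lr = cyclic_cosine_lr(step, steps_per_cycle)
    # ... one SGD step with `lr`; student also matches the smoothed
    #     predictions of `teacher_snapshot` from the previous cycle ...
    if (step + 1) % steps_per_cycle == 0:
        teacher_snapshot = f"snapshot_at_step_{step}"  # teacher for next cycle
```

The last snapshot of each cycle sits at a low learning rate (a local quasi-optimum), which is what makes it a useful teacher for the following cycle.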
MirBot is a collaborative application for smartphones that allows users to perform object recognition.
This app can be used to take a photograph of an object, select the region of interest and obtain the most likely class (dog, chair, etc.) by means of similarity search using features extracted from a convolutional neural network (CNN).
The answers provided by the system can be validated by the user so as to improve the results for future queries.
All the images are stored together with a series of metadata, thus enabling a multimodal incremental dataset labeled with synset identifiers from the WordNet ontology.
This dataset grows continuously thanks to the users' feedback, and is publicly available for research.
This work details the MirBot object recognition system, analyzes the statistics gathered after more than four years of usage, describes the image classification methodology, and performs an exhaustive evaluation using handcrafted features, convolutional neural codes and different transfer learning techniques.
After comparing various models and transformation methods, the results show that the CNN features keep the accuracy of MirBot stable over time, despite the increasing number of new classes.
The app is freely available at the Apple and Google Play stores.
The goal of few-shot learning is to learn a classifier that generalizes well even when trained with a limited number of training instances per class.
The recently introduced meta-learning approaches tackle this problem by learning a generic classifier across a large number of multiclass classification tasks and generalizing the model to a new task.
Yet, even with such meta-learning, the low-data problem in the novel classification task still remains.
In this paper, we propose Transductive Propagation Network (TPN), a novel meta-learning framework for transductive inference that classifies the entire test set at once to alleviate the low-data problem.
Specifically, we propose to learn to propagate labels from labeled instances to unlabeled test instances, by learning a graph construction module that exploits the manifold structure in the data.
TPN jointly learns both the parameters of feature embedding and the graph construction in an end-to-end manner.
We validate TPN on multiple benchmark datasets, on which it largely outperforms existing few-shot learning approaches and achieves the state-of-the-art results.
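Transductive label propagation of the kind TPN learns can be illustrated in closed form; this sketch uses a fixed Gaussian affinity graph rather than TPN's learned graph construction module, and `sigma`, `alpha`, and the toy clusters are illustrative assumptions:

```python
import numpy as np

def propagate_labels(X, labeled_idx, labels, sigma=1.0, alpha=0.5):
    """Closed-form label propagation F* = (I - alpha*S)^(-1) Y over a
    symmetrically normalized Gaussian affinity graph S."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                       # no self-loops
    dinv = 1.0 / np.sqrt(W.sum(1))
    S = W * dinv[:, None] * dinv[None, :]          # D^-1/2 W D^-1/2
    Y = np.zeros((len(X), int(max(labels)) + 1))
    Y[labeled_idx, labels] = 1.0                   # one-hot seed labels
    F = np.linalg.solve(np.eye(len(X)) - alpha * S, Y)
    return F.argmax(1)

# Two well-separated clusters with one labeled point each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)),      # cluster of class 0
               rng.normal(3.0, 0.1, (10, 2))])     # cluster of class 1
pred = propagate_labels(X, labeled_idx=[0, 10], labels=[0, 1])
```

Labels flow along the manifold structure, so all unlabeled points in each cluster inherit the label of that cluster's single seed.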
Automated writing evaluation (AWE) has been shown to be an effective mechanism for quickly providing feedback to students.
It has already seen wide adoption in enterprise-scale applications and is starting to be adopted in large-scale contexts.
Training an AWE model has historically required a single batch of several hundred writing examples and human scores for each of them.
This requirement limits large-scale adoption of AWE since human-scoring essays is costly.
Here we evaluate algorithms for ensuring that AWE models are consistently trained using the most informative essays.
Our results show how to minimize training set sizes while maximizing predictive performance, thereby reducing cost without unduly sacrificing accuracy.
We conclude with a discussion of how to integrate this approach into large-scale AWE systems.
It is necessary for a mobile robot to be able to efficiently plan a path from its starting, or current, location to a desired goal location.
This is a trivial task when the environment is static.
However, the operational environment of the robot is rarely static, and it often has many moving obstacles.
The robot may encounter one, or many, of these unknown and unpredictable moving obstacles.
The robot will need to decide how to proceed when one of these obstacles is obstructing its path.
A method of dynamic replanning using RRT* is presented.
The robot will modify its current plan when an unknown random moving obstacle obstructs the path.
Various experimental results show the effectiveness of the proposed method.
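The replanning loop can be sketched schematically as follows; this is an illustration of the monitor-and-replan logic only, with a simple hypothetical via-point `detour_plan` standing in for the paper's RRT*-based planner:

```python
import math

def blocked(path, obs, r=1.0):
    """True if any remaining waypoint lies inside the obstacle's footprint."""
    return any(math.hypot(x - obs[0], y - obs[1]) <= r for x, y in path)

def detour_plan(start, goal, obs, r=1.0):
    """Placeholder planner (stands in for RRT*): route through a via
    point offset sideways from the obstacle."""
    via = (obs[0] + 2 * r, obs[1] - 2 * r)
    return [start, via, goal]

def follow(path, goal, obstacle_at, planner):
    """Follow the plan waypoint by waypoint; replan from the current
    pose whenever the (moving) obstacle obstructs the remaining path."""
    pose, executed = path[0], [path[0]]
    remaining = path[1:]
    step = 0
    while remaining:
        obs = obstacle_at(step)
        if blocked(remaining, obs):
            remaining = planner(pose, goal, obs)[1:]   # keep current pose
        pose = remaining.pop(0)
        executed.append(pose)
        step += 1
    return executed

route = [(0, 0), (2, 2), (4, 4), (6, 6)]
traj = follow(route, goal=(6, 6),
              obstacle_at=lambda t: (4, 4), planner=detour_plan)
```

In the actual method the planner would reuse and rewire the existing RRT* tree rather than plan from scratch, which is what makes the replanning efficient.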
Cloud infrastructures enable the efficient parallel execution of data-intensive tasks such as entity resolution on large datasets.
We investigate challenges and possible solutions of using the MapReduce programming model for parallel entity resolution.
In particular, we propose and evaluate two MapReduce-based implementations for Sorted Neighborhood blocking that either use multiple MapReduce jobs or apply a tailored data replication.
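The sequential logic of Sorted Neighborhood blocking, which the paper parallelizes with MapReduce jobs (map emitting blocking keys, reduce sliding the window), can be sketched as follows; the records, blocking key, and window size are toy assumptions:

```python
def sorted_neighborhood_pairs(records, key, window=3):
    """Sorted Neighborhood blocking: sort records by a blocking key, then
    generate candidate pairs only within a sliding window of fixed size."""
    ordered = sorted(records, key=key)
    pairs = set()
    for i in range(len(ordered)):
        for j in range(i + 1, min(i + window, len(ordered))):
            a, b = ordered[i]["id"], ordered[j]["id"]
            pairs.add((min(a, b), max(a, b)))   # normalized candidate pair
    return pairs

people = [
    {"id": 1, "name": "smith, john"},
    {"id": 2, "name": "smyth, john"},
    {"id": 3, "name": "adams, ada"},
    {"id": 4, "name": "smith, jon"},
]
cand = sorted_neighborhood_pairs(people, key=lambda r: r["name"][:4], window=3)
```

Instead of all n(n-1)/2 comparisons, only O(n * window) candidate pairs are produced, which is precisely what makes the window boundaries the tricky part to handle across MapReduce partitions.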
Time-aware encoding of frame sequences in a video is a fundamental problem in video understanding.
While many attempted to model time in videos, an explicit study on quantifying video time is missing.
To fill this lacuna, we aim to evaluate video time explicitly.
We describe three properties of video time, namely a) temporal asymmetry, b) temporal continuity and c) temporal causality.
Based on each we formulate a task able to quantify the associated property.
This allows assessing the effectiveness of modern video encoders, like C3D and LSTM, in their ability to model time.
Our analysis provides insights about existing encoders while also leading us to propose a new video time encoder, which is better suited for the video time recognition tasks than C3D and LSTM.
We believe the proposed meta-analysis can provide a reasonable baseline to assess video time encoders on equal grounds on a set of temporal-aware tasks.
Event Related Potentials (ERPs) are very feeble alterations in the ongoing Electroencephalogram (EEG) and their detection is a challenging problem.
Based on unique time-based parameters derived from wavelet coefficients and on the asymmetry property of wavelets, we describe a novel algorithm to separate ERP components in single-trial EEG data.
Though illustrated as a specific application to N170 ERP detection, the algorithm is a generalized approach that can be easily adapted to isolate different kinds of ERP components.
The algorithm detected the N170 ERP component with a high level of accuracy.
We demonstrate that the asymmetry method is more accurate than the matching wavelet algorithm and the t-CWT method, by 48.67 and 8.03 percent respectively.
This paper provides an off-line demonstration of the algorithm and considers issues related to the extension of the algorithm to real-time applications.
In a multi-agent system, transitioning from a centralized to a distributed decision-making strategy can introduce vulnerability to adversarial manipulation.
We study the potential for adversarial manipulation in a class of graphical coordination games where the adversary can pose as a friendly agent in the game, thereby influencing the decision-making rules of a subset of agents.
The adversary's influence can cascade throughout the system, indirectly influencing other agents' behavior and significantly impacting the emergent collective behavior.
The main results in this paper focus on characterizing conditions under which the adversary's local influence can dramatically impact the emergent global behavior, e.g., destabilize efficient Nash equilibria.
Neural machine translation (NMT), a new approach to machine translation, has been shown to outperform conventional statistical machine translation (SMT) across a variety of language pairs.
Translation is an open-vocabulary problem, but most existing NMT systems operate with a fixed vocabulary, which causes the incapability of translating rare words.
This problem can be alleviated by using different translation granularities, such as character, subword and hybrid word-character.
Translation involving Chinese is one of the most difficult tasks in machine translation; however, to the best of our knowledge, no prior work has explored which translation granularity is most suitable for Chinese in NMT.
In this paper, we conduct an extensive comparison using Chinese-English NMT as a case study.
Furthermore, we discuss the advantages and disadvantages of various translation granularities in detail.
Our experiments show that the subword model performs best for Chinese-to-English translation with a relatively small vocabulary, while the hybrid word-character model is most suitable for English-to-Chinese translation.
Moreover, experiments with different granularities show that the Hybrid_BPE method achieves the best results on the Chinese-to-English translation task.
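Since the subword and Hybrid_BPE granularities build on byte-pair encoding, a minimal BPE merge learner may help fix ideas; this is a generic textbook sketch of BPE, not the paper's implementation, and the toy corpus and merge count are illustrative:

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn byte-pair-encoding merges: repeatedly merge the most
    frequent adjacent symbol pair in the corpus vocabulary."""
    vocab = Counter(tuple(w) + ("</w>",) for w in words)  # word-end marker
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for sym, freq in vocab.items():
            for a, b in zip(sym, sym[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for sym, freq in vocab.items():      # apply the merge everywhere
            out, i = [], 0
            while i < len(sym):
                if i + 1 < len(sym) and (sym[i], sym[i + 1]) == best:
                    out.append(sym[i] + sym[i + 1]); i += 2
                else:
                    out.append(sym[i]); i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = learn_bpe(["low", "lower", "lowest", "low"], num_merges=3)
```

The learned merges define the subword vocabulary: frequent words end up as single units while rare words decompose into smaller pieces, which is how open-vocabulary translation is made tractable.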
In order to achieve high efficiency of classification in intrusion detection, a compressed model is proposed in this paper which combines horizontal compression with vertical compression.
OneR is utilized as horizontal compression for attribute reduction, and affinity propagation is employed as vertical compression to select small representative exemplars from large training data.
To compress the large volume of training data scalably, a MapReduce-based parallelization approach is then implemented and evaluated for each step of the model compression process described above, on top of which common but efficient classification methods can be applied directly.
An experimental study on two publicly available intrusion detection datasets, KDD99 and CMDC2012, demonstrates that classification using the proposed compressed model can speed up the detection procedure by up to 184 times, importantly at the cost of only a minimal accuracy difference of less than 1% on average.
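The OneR step used for horizontal compression is simple enough to sketch; this toy version, with hypothetical connection records standing in for real intrusion-detection features, is for illustration only:

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """OneR: for each attribute, map each of its values to the majority
    class, then keep the single attribute whose rule errs least."""
    n_attrs = len(rows[0])
    best_attr, best_rule, best_err = None, None, float("inf")
    for a in range(n_attrs):
        by_value = defaultdict(Counter)
        for row, y in zip(rows, labels):
            by_value[row[a]][y] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        err = sum(y != rule[row[a]] for row, y in zip(rows, labels))
        if err < best_err:
            best_attr, best_rule, best_err = a, rule, err
    return best_attr, best_rule

# Toy connection records: (protocol, flag) -> normal/attack.
rows = [("tcp", "SF"), ("tcp", "S0"), ("udp", "SF"), ("tcp", "S0")]
labels = ["normal", "attack", "normal", "attack"]
attr, rule = one_r(rows, labels)
```

Here the flag attribute classifies the toy data perfectly, so OneR would discard the protocol attribute; this is the sense in which OneR performs attribute reduction.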
In this paper we present reclaimID, an architecture that allows users to reclaim their digital identities by securely sharing identity attributes without the need for a centralised service provider.
We propose a design where user attributes are stored in and shared over a name system under user-owned namespaces.
Attributes are encrypted using attribute-based encryption (ABE), allowing the user to selectively authorize and revoke access of requesting parties to subsets of their attributes.
We present an implementation based on the decentralised GNU Name System (GNS) in combination with ciphertext-policy ABE using type-1 pairings.
To show the practicality of our implementation, we carried out experimental evaluations of selected implementation aspects including attribute resolution performance.
Finally, we show that our design can be used as a standard OpenID Connect Identity Provider allowing our implementation to be integrated into standard-compliant services.
In modern OCaml, single-argument datatype declarations (variants with a single constructor, records with a single field) can sometimes be `unboxed'.
This means that their memory representation is the same as their single argument (omitting the variant or record constructor and an indirection), thus achieving better time and memory efficiency.
However, in the case of generalized/guarded algebraic datatypes (GADTs), unboxing is not always possible due to a subtle assumption about the runtime representation of OCaml values.
The current correctness check is incomplete, rejecting many valid definitions, in particular those involving mutually-recursive datatype declarations.
In this paper, we explain the notion of separability as a semantic foundation for the unboxing criterion, and propose a set of inference rules to check separability.
From these inference rules, we derive a new implementation of the unboxing check that properly supports mutually-recursive definitions.
Frequency agile radar (FAR) is known to have excellent electronic counter-countermeasures (ECCM) performance and the potential to realize spectrum sharing in dense electromagnetic environments.
Many compressed sensing (CS) based algorithms have been developed for joint range and Doppler estimation in FAR.
This paper considers theoretical analysis of FAR via CS algorithms.
In particular, we analyze the properties of the sensing matrix, which is a highly structured random matrix.
We then derive bounds on the number of recoverable targets.
Numerical simulations and field experiments validate the theoretical findings and demonstrate the effectiveness of CS approaches to FAR.
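A typical CS recovery routine of the kind analyzed, Orthogonal Matching Pursuit over a random sensing matrix, can be sketched as follows; the matrix dimensions and sparsity level are illustrative and not drawn from the paper's FAR setting:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily pick the column most
    correlated with the residual, then re-fit by least squares."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x

rng = np.random.default_rng(1)
m, n, k = 30, 60, 2
A = rng.normal(size=(m, n)) / np.sqrt(m)      # random sensing matrix
x_true = np.zeros(n)
x_true[[5, 17]] = [1.5, -2.0]                 # k sparse targets
y = A @ x_true                                # compressed measurements
x_hat = omp(A, y, k)
```

The number of recoverable targets is governed by properties of the sensing matrix (here i.i.d. Gaussian; in FAR it is the highly structured matrix the paper analyzes).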
Understanding the semantic relationships between terms is a fundamental task in natural language processing applications.
While structured resources that can express those relationships formally, such as ontologies, are still scarce, a large number of linguistic resources gathering dictionary definitions is becoming available. Understanding the semantic structure of natural language definitions, however, is fundamental to making them useful in semantic interpretation tasks.
Based on an analysis of a subset of WordNet's glosses, we propose a set of semantic roles that compose the semantic structure of a dictionary definition, and show how they are related to the definition's syntactic configuration, identifying patterns that can be used in the development of information extraction frameworks and semantic models.
We present GHTraffic, a dataset of significant size comprising HTTP transactions extracted from GitHub data and augmented with synthetic transaction data.
The dataset facilitates reproducible research on many aspects of service-oriented computing.
This paper discusses use cases for such a dataset and extracts a set of requirements from these use cases.
We then discuss the design of GHTraffic, and the methods and tools used to construct it.
We conclude our contribution with some selective metrics that characterise GHTraffic.
Hindustani classical music is entirely based on the Raga structures.
In Hindustani music, a Gharana or school refers to the adherence of a group of musicians to a particular musical style of performing a certain raga.
The objective of this work was to find out if any characteristic acoustic cues exist which discriminates a particular gharana from the other.
Another intriguing question is whether artists of the same gharana keep their singing style unchanged over generations, or whether evolution of music takes place like everything else in nature.
In this work, we chose to study the similarities and differences in singing style of some artists from at least four consecutive generations representing four different gharanas using robust non-linear methods.
For this, the alap parts of a particular raga sung by all the artists were analyzed with the help of the non-linear multifractal detrended fluctuation analysis (MFDFA) technique.
The spectral width obtained from the MFDFA method gives an estimate of the complexity of the signal.
The observations give a cue in the direction of scientific recognition of the guru-shisya parampara (teacher-student tradition), a hitherto much-heard philosophical term.
Moreover, the variation in complexity patterns among various gharanas hints at the characteristic features of each particular gharana, as well as at the effect of globalization on classical music over the past few decades.
Telematics data is becoming increasingly available due to the ubiquity of devices that collect data during drives, for different purposes, such as usage based insurance (UBI), fleet management, navigation of connected vehicles, etc.
Consequently, a variety of data-analytic applications have become feasible that extract valuable insights from the data.
In this paper, we address the especially challenging problem of discovering behavior-based driving patterns from only externally observable phenomena (e.g. vehicle's speed).
We present a trajectory segmentation approach capable of discovering driving patterns as separate segments, based on the behavior of drivers.
This segmentation approach includes a novel transformation of trajectories along with a dynamic programming approach for segmentation.
We apply the segmentation approach on a real-world, rich dataset of personal car trajectories provided by a major insurance company based in Columbus, Ohio.
Analysis and preliminary results show the applicability of the approach for finding significant driving patterns.
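The dynamic-programming flavor of such a segmentation can be illustrated on a 1-D speed signal; this minimal sketch, which minimizes within-segment squared error plus a per-segment penalty, is a generic formulation and not the paper's transformation or cost function:

```python
import numpy as np

def segment(signal, penalty):
    """Optimal 1-D segmentation by dynamic programming: minimize total
    within-segment squared error plus a fixed penalty per segment."""
    x = np.asarray(signal, float)
    n = len(x)
    cs = np.concatenate(([0.0], x.cumsum()))
    cs2 = np.concatenate(([0.0], (x ** 2).cumsum()))

    def sse(i, j):  # squared error of x[i:j] around its own mean
        s, s2, m = cs[j] - cs[i], cs2[j] - cs2[i], j - i
        return s2 - s * s / m

    best = np.full(n + 1, np.inf)
    best[0] = 0.0
    back = np.zeros(n + 1, int)
    for j in range(1, n + 1):
        for i in range(j):                   # last segment is x[i:j]
            c = best[i] + sse(i, j) + penalty
            if c < best[j]:
                best[j], back[j] = c, i
    cuts, j = [], n
    while j > 0:                             # backtrack segment boundaries
        cuts.append((back[j], j))
        j = back[j]
    return cuts[::-1]

speed = [10, 11, 10, 30, 31, 29, 30, 5, 6]   # three driving regimes
segs = segment(speed, penalty=5.0)
```

The penalty trades off segment count against fit, so homogeneous driving regimes (cruising, acceleration, stopping) emerge as separate segments.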
A remote-sensing system that can determine the position of hidden objects has applications in many critical real-life scenarios, such as search and rescue missions and safe autonomous driving.
Previous work has shown the ability to range and image objects hidden from the direct line of sight, employing advanced optical imaging technologies aimed at small objects at short range.
In this work we demonstrate a long-range tracking system based on single laser illumination and single-pixel single-photon detection.
This enables us to track one or more people hidden from view at a stand-off distance of over 50m.
These results pave the way towards next generation LiDAR systems that will reconstruct not only the direct-view scene but also the main elements hidden behind walls or corners.
Glaucoma is a disease in which the optic nerve is chronically damaged by elevation of the intra-ocular pressure, resulting in visual field defects.
Therefore, it is important to monitor and treat suspected patients before they are confirmed with glaucoma.
In this paper, we propose a 2-stage ranking-CNN that classifies fundus images as normal, suspicious, and glaucoma.
Furthermore, we propose a method of using the class activation map as a mask filter and combining it with the original fundus image as an intermediate input.
Our results improve the average accuracy by about 10% over the existing 3-class CNN and ranking-CNN, and especially improve the sensitivity for the suspicious class by more than 20% over the 3-class CNN.
In addition, the extracted ROI was also found to overlap with the diagnostic criteria of the physician.
The method we propose is expected to be efficiently applied to any medical data where there is a suspicious condition between normal and disease.
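The class-activation-map mask filter can be sketched as follows; the shapes, weights, and threshold here are assumed for illustration, and in the actual pipeline the CAM weights would come from the trained classifier:

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM: weighted sum of the last conv layer's feature maps using the
    classifier weights of the target class, normalized to [0, 1]."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # -> (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

def masked_input(image, cam, threshold=0.3):
    """Use the CAM as a mask filter over the original fundus image,
    keeping only regions the network attends to."""
    mask = np.where(cam >= threshold, cam, 0.0)
    return image * mask

rng = np.random.default_rng(0)
feats = rng.random((4, 8, 8))          # 4 feature maps from an 8x8 conv output
w = np.array([0.5, 1.0, -0.3, 0.2])    # classifier weights for one class
cam = class_activation_map(feats, w)
roi = masked_input(rng.random((8, 8)), cam)
```

The masked image (combined with the original as an intermediate input) focuses the second stage on the region of interest, which is why the extracted ROI can be compared against the physician's diagnostic criteria.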
Biometrics emerged as a robust solution for security systems.
However, given the dissemination of biometric applications, criminals are developing techniques to circumvent them by simulating physical or behavioral traits of legal users (spoofing attacks).
Although the face is a promising characteristic due to its universality, acceptability and the presence of cameras almost everywhere, face recognition systems are extremely vulnerable to such frauds since they can be easily fooled with common printed facial photographs.
State-of-the-art approaches, based on Convolutional Neural Networks (CNNs), present good results in face spoofing detection.
However, these methods do not consider the importance of learning deep local features from each facial region, even though it is known from face recognition that each facial region presents different visual aspects, which can also be exploited for face spoofing detection.
In this work we propose a novel CNN architecture trained in two steps for such task.
Initially, each part of the neural network learns features from a given facial region.
Afterwards, the whole model is fine-tuned on the whole facial images.
Results show that such pre-training step allows the CNN to learn different local spoofing cues, improving the performance and the convergence speed of the final model, outperforming the state-of-the-art approaches.
The Robinson-Goforth topology of swaps in adjoining payoffs elegantly arranges 2x2 ordinal games in accordance with important properties including symmetry, number of dominant strategies and Nash Equilibria, and alignment of interests.
Adding payoff families based on Nash Equilibria illustrates an additional aspect of this order and aids visualization of the topology.
Making ties through half-swaps not only creates simpler games within the topology, but, in reverse, breaking ties shows the evolution of preferences, yielding a natural ordering for the topology of 2x2 games with ties.
An ordinal game not only represents an equivalence class of games with real values, but also a discrete equivalent of the normalized version of those games.
The topology provides coordinates which could be used to identify related games in a semantic web ontology and facilitate comparative analysis of agent-based simulations and other research in game theory, as well as charting relationships and potential moves between games as a tool for institutional analysis and design.
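Checking the pure-strategy Nash equilibria that organize the payoff families is straightforward for 2x2 ordinal games; a minimal sketch, using the Prisoner's Dilemma with ordinal payoffs as an example:

```python
from itertools import product

def pure_nash_equilibria(row_payoffs, col_payoffs):
    """Pure-strategy Nash equilibria of a 2x2 game: outcomes where
    neither player gains by unilaterally switching strategies."""
    eq = []
    for r, c in product(range(2), range(2)):
        row_ok = row_payoffs[r][c] >= row_payoffs[1 - r][c]
        col_ok = col_payoffs[r][c] >= col_payoffs[r][1 - c]
        if row_ok and col_ok:
            eq.append((r, c))
    return eq

# Prisoner's Dilemma with ordinal payoffs 1..4 (higher is better);
# strategy 0 = cooperate, 1 = defect.
row = [[3, 1], [4, 2]]
col = [[3, 4], [1, 2]]
eq = pure_nash_equilibria(row, col)
```

Counting and locating such equilibria is one of the properties the Robinson-Goforth topology arranges, and the equilibrium-based payoff families extend that arrangement.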
Due to the ubiquity of batch data processing in cloud computing, the related problem of scheduling malleable batch tasks and its extensions have received significant attention recently.
In this paper, we consider a fundamental model where a set of n tasks is to be processed on C identical machines and each task is specified by a value, a workload, a deadline and a parallelism bound.
Within the parallelism bound, the number of machines assigned to a task can vary over time without affecting its workload.
For this model, we obtain two core results: a sufficient and necessary condition such that a set of tasks can be finished by their deadlines on C machines, and an algorithm to produce such a schedule.
These core results provide a conceptual tool and an optimal scheduling algorithm that enable proposing new algorithmic analysis and design and improving existing algorithms under various objectives.
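While the paper's exact sufficient-and-necessary condition is its core contribution, two standard necessary conditions are easy to sketch and convey the flavor of the feasibility question; the task tuples below are toy values:

```python
def necessary_feasibility(tasks, C):
    """Two standard necessary conditions for finishing all malleable
    tasks by their deadlines on C identical machines (NOT the paper's
    full sufficient-and-necessary characterization):
      1. each task alone fits: workload <= parallelism_bound * deadline;
      2. for every deadline d, the total workload of tasks due by d
         does not exceed the capacity C * d available up to d."""
    for w, d, p in tasks:
        if w > p * d:
            return False
    for d in sorted({d for _, d, _ in tasks}):
        demand = sum(w for w, dd, _ in tasks if dd <= d)
        if demand > C * d:
            return False
    return True

# Each task is (workload, deadline, parallelism bound).
tasks_ok = [(4.0, 2.0, 2), (6.0, 3.0, 3)]
tasks_bad = [(10.0, 2.0, 2), (6.0, 3.0, 3)]   # first task alone cannot finish
feasible = necessary_feasibility(tasks_ok, C=4)
infeasible = necessary_feasibility(tasks_bad, C=4)
```

Malleability is what makes the full characterization subtle: because the machine allocation of a task can vary over time within its parallelism bound, capacity can be shifted between tasks, and the paper's condition captures exactly when that shifting suffices.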
We introduce a novel model for spatially varying variational data fusion, driven by point-wise confidence values.
The proposed model allows for the joint estimation of the data and the confidence values based on the spatial coherence of the data.
We discuss the main properties of the introduced model as well as suitable algorithms for estimating the solution of the corresponding biconvex minimization problem and their convergence.
The performance of the proposed model is evaluated considering the problem of depth image fusion by using both synthetic and real data from publicly available datasets.
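The joint estimation of data and confidence values can be illustrated by a simple alternating scheme; this Gaussian-weight sketch is a stand-in for the paper's variational model (it omits the spatial coherence term), and `sigma` is an assumed residual scale:

```python
import numpy as np

def fuse(depth_maps, sigma=0.5, iters=20):
    """Alternating estimation of a fused map u and per-source confidence
    weights w: u is the confidence-weighted mean of the sources, and
    each confidence decays with the residual to the current fuse."""
    D = np.stack(depth_maps)                           # (K, H, W)
    w = np.ones_like(D)
    for _ in range(iters):
        u = (w * D).sum(0) / (w.sum(0) + 1e-12)        # data step
        w = np.exp(-((D - u) ** 2) / (2 * sigma ** 2)) # confidence step
    return u, w

clean = np.full((4, 4), 1.0)
noisy = clean + 0.01
outlier = np.full((4, 4), 5.0)          # a grossly corrupted source
u, w = fuse([clean, noisy, outlier])
```

The biconvex structure is visible here: each step is a simple minimization in one block of variables, and the corrupted source is automatically down-weighted to near-zero confidence.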
We present a convolutional network capable of inferring a 3D representation of a previously unseen object given a single image of this object.
Concretely, the network can predict an RGB image and a depth map of the object as seen from an arbitrary view.
Several of these depth maps fused together give a full point cloud of the object.
The point cloud can in turn be transformed into a surface mesh.
The network is trained on renderings of synthetic 3D models of cars and chairs.
It successfully deals with objects on cluttered background and generates reasonable predictions for real images of cars.
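Fusing predicted depth maps into a point cloud relies on standard pinhole back-projection, which can be sketched as follows; the intrinsics and the toy depth map are illustrative values:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into a 3-D point cloud via the pinhole
    model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]           # drop invalid (zero-depth) pixels

depth = np.full((4, 4), 2.0)
depth[0, 0] = 0.0                       # a hole in the predicted depth map
cloud = depth_to_point_cloud(depth, fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

Transforming each predicted view's cloud by the corresponding camera pose and concatenating the results yields the full point cloud that is then meshed.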
This paper addresses the problem of automated vehicle tracking and recognition from aerial image sequences.
Motivated by its successes in the existing literature, we focus on the use of linear appearance subspaces to describe multi-view object appearance and highlight the challenges involved in their application as part of a practical system.
A working solution which includes steps for data extraction and normalization is described.
In experiments on real-world data the proposed methodology achieved promising results, with a high correct recognition rate and few, meaningful errors (type II errors whereby genuinely similar targets are sometimes confused with one another).
Directions for future research and possible improvements of the proposed method are discussed.
Recent efforts in practical symbolic execution have successfully mitigated the path-explosion problem to some extent with search-based heuristics and compositional approaches.
Similarly, due to an increase in the performance of cheap multi-core commodity computers, fuzzing as a viable method of random mutation-based testing has also seen promise.
However, the possibility of combining symbolic execution and fuzzing, thereby providing an opportunity to mitigate drawbacks in each other, has not been sufficiently explored.
Fuzzing could, for example, expedite path-exploration in symbolic execution, and symbolic execution could make seed input generation in fuzzing more efficient.
In our view, there have been only a few hybrid solution proposals with symbolic execution and fuzzing at their centre.
By analyzing 77 relevant and systematically selected papers, we (1) present an overview of hybrid solution proposals of symbolic execution and fuzzing, (2) perform a gap analysis in research of hybrid techniques to improve both, plain symbolic execution and fuzzing, (3) propose new ideas for hybrid test-case generation techniques.
We introduce MilkQA, a question answering dataset from the dairy domain dedicated to the study of consumer questions.
The dataset contains 2,657 pairs of questions and answers, written in the Portuguese language and originally collected by the Brazilian Agricultural Research Corporation (Embrapa).
All questions were motivated by real situations and written by thousands of authors with very different backgrounds and levels of literacy, while answers were elaborated by specialists from Embrapa's customer service.
Our dataset was filtered and anonymized by three human annotators.
Consumer questions are a challenging kind of question that is usually employed as a form of seeking information.
Although several question answering datasets are available, most of such resources are not suitable for research on answer selection models for consumer questions.
We aim to fill this gap by making MilkQA publicly available.
We study the behavior of four answer selection models on MilkQA: two baseline models and two convolutional neural network architectures.
Our results show that MilkQA poses real challenges to computational models, particularly due to the linguistic characteristics of its questions and to their unusually long lengths.
Only one of the experimented models gives reasonable results, at the cost of high computational requirements.
Generating secure random numbers is vital to the security and privacy infrastructures we rely on today.
Having a computer system generate a secure random number is not a trivial problem due to the deterministic nature of computer systems.
Servers commonly deal with this problem through hardware-based random number generators, which can come in the form of expansion cards or dongles, or be integrated into the CPU itself.
With the explosion of network- and internet-connected devices, however, the problem of cryptography is no longer a server-centric problem; even small devices need a reliable source of randomness for cryptographic operations - for example, network devices and appliances like routers, switches and access points, as well as various Internet-of-Things (IoT) devices for security and remote management.
This paper proposes a software solution based on side-channel measurements as a source of high-quality entropy (nicknamed "SideRand"), that can theoretically be applied to most platforms (large servers, appliances, even maker boards like RaspberryPi or Arduino), and generates a seed for a regular CSPRNG to enable proper cryptographic operations for security and privacy.
This paper also proposes two criteria - openness and auditability - as essential requirements for confidence in any random generator for cryptographic use, and discusses how SideRand meets the two criteria (and how most hardware devices do not).
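The idea of conditioning side-channel timing jitter into a seed can be sketched as follows; this is a schematic illustration of the concept only, not the paper's SideRand construction, and the workload and sample count are arbitrary assumptions:

```python
import hashlib
import time

def siderand_style_seed(samples=2048):
    """Sketch of a SideRand-like idea: collect fine-grained timing
    jitter from repeated small computations, then condition the noisy
    raw measurements into a 256-bit seed with a cryptographic hash."""
    measurements = bytearray()
    acc = 0
    for _ in range(samples):
        t0 = time.perf_counter_ns()
        acc = (acc * 31 + 7) % 1000003            # small workload to time
        dt = time.perf_counter_ns() - t0          # jittery duration
        measurements += dt.to_bytes(8, "little")
    # Conditioning step: compress the biased raw jitter into a seed.
    return hashlib.sha256(bytes(measurements)).digest()

seed = siderand_style_seed()
```

The resulting seed would feed a regular CSPRNG; a production design would additionally need entropy estimation on the raw measurements, which this sketch omits.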
Compared to other behavioural biometrics, mouse dynamics is a less explored area.
General purpose data sets containing unrestricted mouse usage data are usually not available.
The Balabit data set, released in 2016 for a data science competition, can be considered the first adequate publicly available one, despite its small number of subjects.
This paper presents a performance evaluation study on this data set for impostor detection.
The existence of very short test sessions makes this data set challenging.
Raw data were segmented into mouse move, point-and-click, and drag-and-drop types of mouse actions, and then several features were extracted.
In contrast to keystroke dynamics, mouse data is not sensitive, therefore it is possible to collect negative mouse dynamics data and to use two-class classifiers for impostor detection.
Both action- and set of actions-based evaluations were performed.
Set of actions-based evaluation achieves 0.92 AUC on the test part of the data set.
However, the same type of evaluation conducted on the training part of the data set resulted in maximal AUC (1) using only 13 actions.
Drag and drop mouse actions proved to be the best actions for impostor detection.
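Feature extraction from a segmented mouse action can be sketched as follows; the specific features here (duration, path length, straightness, mean speed) are common choices in mouse dynamics and not necessarily the paper's exact feature set:

```python
import math

def mouse_move_features(events):
    """Simple features for one mouse-move action given (t, x, y) events:
    duration, traveled path length, straightness, and mean speed."""
    duration = events[-1][0] - events[0][0]
    path = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (_, x1, y1), (_, x2, y2) in zip(events, events[1:])
    )
    direct = math.hypot(events[-1][1] - events[0][1],
                        events[-1][2] - events[0][2])
    straightness = direct / path if path > 0 else 1.0
    mean_speed = path / duration if duration > 0 else 0.0
    return {"duration": duration, "path": path,
            "straightness": straightness, "mean_speed": mean_speed}

action = [(0.0, 0, 0), (0.1, 3, 4), (0.2, 6, 8)]   # a straight-line move
f = mouse_move_features(action)
```

Feature vectors of this kind, computed per action or aggregated over a set of actions, are what the two-class classifiers consume for impostor detection.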
Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn.
In this paper we describe how to use Relation Networks (RNs) as a simple plug-and-play module to solve problems that fundamentally hinge on relational reasoning.
We tested RN-augmented networks on three tasks: visual question answering using a challenging dataset called CLEVR, on which we achieve state-of-the-art, super-human performance; text-based question answering using the bAbI suite of tasks; and complex reasoning about dynamic physical systems.
Then, using a curated dataset called Sort-of-CLEVR we show that powerful convolutional networks do not have a general capacity to solve relational questions, but can gain this capacity when augmented with RNs.
Our work shows how a deep learning architecture equipped with an RN module can implicitly discover and learn to reason about entities and their relations.
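The RN module itself is compact enough to sketch; this NumPy version with small random untrained weights shows the pair-wise function g, the summation over object pairs, and the readout f, and is permutation invariant over objects by construction (dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(params, x):
    """Two-layer MLP with ReLU, used for both g (pairwise) and f (readout)."""
    W1, b1, W2, b2 = params
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

def init(n_in, n_hidden, n_out):
    return (rng.normal(0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0, 0.1, (n_hidden, n_out)), np.zeros(n_out))

def relation_network(objects, g_params, f_params):
    """RN: apply g to every ordered pair of objects, sum the results,
    then apply f to the aggregated relation vector."""
    pair_sum = sum(
        mlp(g_params, np.concatenate([oi, oj]))
        for oi in objects for oj in objects
    )
    return mlp(f_params, pair_sum)

objs = [rng.normal(size=4) for _ in range(3)]   # 3 objects, 4-dim each
g = init(8, 16, 16)                             # g sees concatenated pairs
f = init(16, 16, 2)                             # f reads the summed relations
out = relation_network(objs, g, f)
```

The summation over all pairs is what makes the module plug-and-play: its output does not depend on how the upstream network happens to order the objects.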
A crucial and time-sensitive task when any disaster occurs is to rescue victims and distribute resources to the right groups and locations.
This task is challenging in populated urban areas, due to the huge burst of help requests generated in a very short period.
To improve the efficiency of the emergency response in the immediate aftermath of a disaster, we propose a heuristic multi-agent reinforcement learning scheduling algorithm, named as ResQ, which can effectively schedule the rapid deployment of volunteers to rescue victims in dynamic settings.
The core concept is to quickly identify victims and volunteers from social network data and then schedule rescue parties with an adaptive learning algorithm.
This framework performs two key functions: 1) identify trapped victims and rescue volunteers, and 2) optimize the volunteers' rescue strategy in a complex time-sensitive environment.
The proposed ResQ algorithm can speed up the training process through a heuristic function which reduces the state-action space by prioritizing a particular set of actions over others.
Experimental results showed that the proposed heuristic multi-agent reinforcement learning based scheduling outperforms several state-of-the-art methods, in terms of both reward rate and response time.
The reliable fraction of information is an attractive score for quantifying (functional) dependencies in high-dimensional data.
In this paper, we systematically explore the algorithmic implications of using this measure for optimization.
We show that the problem is NP-hard, which justifies the usage of worst-case exponential-time as well as heuristic search methods.
We then substantially improve the practical performance for both optimization styles by deriving a novel admissible bounding function that has an unbounded potential for additional pruning over the previously proposed one.
Finally, we empirically investigate the approximation ratio of the greedy algorithm and show that it produces highly competitive results in a fraction of time needed for complete branch-and-bound style search.
N-continuous orthogonal frequency division multiplexing (NC-OFDM) was demonstrated to provide significant sidelobe suppression for baseband OFDM signals.
However, it will introduce severe interference to the transmit signals.
Hence in this letter, we specifically design a class of low-interference NC-OFDM schemes for alleviating the introduced interference.
Meanwhile, we also obtain an asymptotic spectrum analysis by a closed-form expression.
It is shown that the proposed scheme is capable of reducing the interference to a negligible level, thereby avoiding the high complexity of signal recovery at the receiver, while maintaining sidelobe suppression performance similar to that of traditional NC-OFDM.
Using (a,b)-trees as an example, we show how to perform a parallel split with logarithmic latency and parallel join, bulk updates, intersection, union (or merge), and (symmetric) set difference with logarithmic latency and with information theoretically optimal work.
We present both asymptotically optimal solutions and simplified versions that perform well in practice - they are several times faster than previous implementations.
In this article we derive a Pontryagin maximum principle (PMP) for discrete-time optimal control problems on matrix Lie groups.
The PMP provides first-order necessary conditions for optimality; these necessary conditions typically yield two-point boundary value problems, which can then be solved to extract optimal control trajectories.
Constrained optimal control problems for mechanical systems, in general, can only be solved numerically, and this motivates the need to derive discrete-time models that are accurate and preserve the non-flat manifold structures of the underlying continuous-time controlled systems.
The PMPs for discrete-time systems evolving on Euclidean spaces are not readily applicable to discrete-time models evolving on non-flat manifolds.
In this article we bridge this lacuna and establish a discrete-time PMP on matrix Lie groups.
Our discrete-time models are derived via discrete mechanics, a structure-preserving discretization scheme, leading to the preservation of the underlying manifold over time and thereby to greater numerical accuracy of our technique.
This PMP caters to a class of constrained optimal control problems that includes point-wise state and control action constraints, and encompasses a large class of control problems that arise in various fields of engineering and the applied sciences.
As the senior population rapidly increases, it is challenging yet crucial to provide effective long-term care for seniors who live at home or in senior care facilities.
Smart senior homes, which have gained widespread interest in the healthcare community, have been proposed to improve the well-being of seniors living independently.
In particular, non-intrusive, cost-effective sensors placed in these senior homes enable gait characterization, which can provide clinically relevant information including mobility level and early neurodegenerative disease risk.
In this paper, we present a method to perform gait analysis from a single camera placed within the home.
We show that we can accurately calculate various gait parameters, demonstrating the potential for our system to monitor the long-term gait of seniors and thus aid clinicians in understanding a patient's medical profile.
Several exact recovery criteria (ERC) ensuring that orthogonal matching pursuit (OMP) identifies the correct support of sparse signals have been developed in the last few years.
These ERC rely on the restricted isometry property (RIP), the associated restricted isometry constant (RIC) and sometimes the restricted orthogonality constant (ROC).
In this paper, three of the most recent ERC for OMP are examined.
The contribution is to show that these ERC remain valid for a generalization of OMP, called simultaneous orthogonal matching pursuit (SOMP), which can process several measurement vectors simultaneously and return a common support estimate for the underlying sparse vectors.
The sharpness of the bounds is also briefly discussed in light of previous works focusing on OMP.
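A minimal sketch of the SOMP selection rule referred to above: at each step, the atom whose correlations with the residuals, summed across all measurement vectors, are largest is added to the support. The identity dictionary and toy signals are illustrative assumptions, not from the paper.

```python
import numpy as np

def somp(A, Y, k):
    """Simultaneous OMP (sketch): select k atoms of dictionary A that
    jointly explain all measurement vectors (columns of Y)."""
    R = Y.astype(float).copy()
    support = []
    for _ in range(k):
        scores = np.sum(np.abs(A.T @ R), axis=1)  # aggregate correlations
        scores[support] = -np.inf                 # do not reselect atoms
        support.append(int(np.argmax(scores)))
        As = A[:, support]
        X, *_ = np.linalg.lstsq(As, Y, rcond=None)
        R = Y - As @ X                            # update common residual
    return sorted(support)

# Toy check: three signals sharing the support {1, 4} under an identity dictionary.
A = np.eye(6)
Y = np.zeros((6, 3)); Y[1, :] = 1.0; Y[4, :] = 2.0
print(somp(A, Y, 2))  # -> [1, 4]
```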
The words of a language reflect the structure of the human mind, allowing us to transmit thoughts between individuals.
However, language can represent only a subset of our rich and detailed cognitive architecture.
Here, we ask what kinds of common knowledge (semantic memory) are captured by word meanings (lexical semantics).
We examine a prominent computational model that represents words as vectors in a multidimensional space, such that proximity between word-vectors approximates semantic relatedness.
Because related words appear in similar contexts, such spaces - called "word embeddings" - can be learned from patterns of lexical co-occurrences in natural language.
Despite their popularity, a fundamental concern about word embeddings is that they appear to be semantically "rigid": inter-word proximity captures only overall similarity, yet human judgments about object similarities are highly context-dependent and involve multiple, distinct semantic features.
For example, dolphins and alligators appear similar in size, but differ in intelligence and aggressiveness.
Could such context-dependent relationships be recovered from word embeddings?
To address this issue, we introduce a powerful, domain-general solution: "semantic projection" of word-vectors onto lines that represent various object features, like size (the line extending from the word "small" to "big"), intelligence (from "dumb" to "smart"), or danger (from "safe" to "dangerous").
This method, which is intuitively analogous to placing objects "on a mental scale" between two extremes, recovers human judgments across a range of object categories and properties.
We thus show that word embeddings inherit a wealth of common knowledge from word co-occurrence statistics and can be flexibly manipulated to express context-dependent meanings.
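The semantic projection described above reduces to a one-line vector computation; the toy two-dimensional vectors below are illustrative assumptions, not real embeddings:

```python
import numpy as np

# Hypothetical toy word vectors; in practice these would come from a
# pretrained embedding model (e.g. word2vec or GloVe).
vecs = {
    "small": np.array([0.0, 0.0]),
    "big":   np.array([1.0, 0.0]),
    "mouse": np.array([0.1, 0.5]),
    "whale": np.array([0.9, 0.4]),
}

def semantic_projection(word, lo, hi, vecs):
    """Project vecs[word] onto the line from vecs[lo] to vecs[hi];
    returns a scalar position (0 at the lo endpoint, 1 at hi)."""
    axis = vecs[hi] - vecs[lo]
    return float(np.dot(vecs[word] - vecs[lo], axis) / np.dot(axis, axis))

print(semantic_projection("mouse", "small", "big", vecs))  # near 0 (small)
print(semantic_projection("whale", "small", "big", vecs))  # near 1 (big)
```

Lines for other features (intelligence, danger) are built the same way from their corresponding pole words.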
This document provides the results of the tests of acoustic parameter estimation algorithms on the Acoustic Characterization of Environments (ACE) Challenge Evaluation dataset which were subsequently submitted and written up into papers for the Proceedings of the ACE Challenge.
This document is supporting material for a forthcoming journal paper on the ACE Challenge which will provide further analysis of the results.
In this paper, we develop a system for the low-cost indoor localization and tracking problem using radio signal strength indicator, Inertial Measurement Unit (IMU), and magnetometer sensors.
We develop a novel and simplified probabilistic IMU motion model as the proposal distribution of the sequential Monte-Carlo technique to track the robot trajectory.
Our algorithm can globally localize and track a robot with a priori unknown location, given an informative prior map of the Bluetooth Low Energy (BLE) beacons.
Also, we formulate the problem as an optimization problem that serves as the Back-end of the algorithm mentioned above (Front-end).
Thus, by simultaneously solving for the robot trajectory and the map of BLE beacons, we recover a continuous and smooth trajectory of the robot, corrected locations of the BLE beacons, and the time-varying IMU bias.
The evaluations achieved using hardware show that through the proposed closed-loop system the localization performance can be improved; furthermore, the system becomes robust to the error in the map of beacons by feeding back the optimized map to the Front-end.
Current deep learning based text classification methods are limited in their ability to achieve fast learning and generalization when data is scarce.
We address this problem by integrating a meta-learning procedure that uses the knowledge learned across many tasks as an inductive bias towards better natural language understanding.
Based on the Model-Agnostic Meta-Learning framework (MAML), we introduce the Attentive Task-Agnostic Meta-Learning (ATAML) algorithm for text classification.
The essential difference between MAML and ATAML is in the separation of task-agnostic representation learning and task-specific attentive adaptation.
The proposed ATAML is designed to encourage task-agnostic representation learning by way of task-agnostic parameterization and facilitate task-specific adaptation via attention mechanisms.
We provide evidence to show that the attention mechanism in ATAML has a synergistic effect on learning performance.
In comparisons with models trained from random initialization, pretrained models, and meta-trained MAML, our proposed ATAML method generalizes better on single-label and multi-label classification tasks on the miniRCV1 and miniReuters-21578 datasets.
In recent years we have witnessed a rapid development of new algorithmic techniques for parameterized algorithms for graph separation problems.
We present experimental evaluation of two cornerstone theoretical results in this area: linear-time branching algorithms guided by half-integral relaxations and kernelization (preprocessing) routines based on representative sets in matroids.
A side contribution is a new set of benchmark instances of (unweighted, vertex-deletion) Multiway Cut.
Caches in Content-Centric Networks (CCN) are increasingly adopting flash memory based storage.
The current flash cache technology stores all files with the largest possible expiry date, i.e., files are written to the memory so that they are retained for as long as possible.
This, however, does not leverage the CCN data characteristics where content is typically short-lived and has a distinct popularity profile.
Writing files in a cache using the longest retention time damages the memory device thus reducing its lifetime.
However, writing using a small retention time can increase the content retrieval delay, since, at the time a file is requested, the file may already have been expired from the memory.
This motivates us to consider a joint optimization wherein we obtain optimal policies for jointly minimizing the content retrieval delay (which is a network-centric objective) and the flash damage (which is a device-centric objective).
Caching decisions now not only involve what to cache but also for how long to cache each file.
We design provably optimal policies and numerically compare them against prior policies.
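The joint delay/damage trade-off can be sketched with a toy single-file model; the exponential inter-arrival assumption and all parameters below are illustrative, not the paper's model:

```python
import numpy as np

def expected_cost(T, rate=1.0, delay_miss=10.0, damage_per_sec=0.5):
    """Per-request cost of retaining a file for T seconds: a cache miss
    (request arriving after expiry) pays delay_miss, while flash damage
    is assumed to grow linearly with the retention time."""
    p_miss = np.exp(-rate * T)        # exponential inter-arrival times
    return p_miss * delay_miss + damage_per_sec * T

# Sweep retention times; neither "as long as possible" nor "very short" wins.
Ts = np.linspace(0.01, 20.0, 2000)
T_star = Ts[np.argmin(expected_cost(Ts))]
print(T_star)  # close to ln(20) ~ 3.0 for these illustrative parameters
```

Even this crude model shows an interior optimum: the best retention time balances miss delay against device wear, which is the intuition behind jointly deciding what to cache and for how long.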
In this paper, we propose an interpretable LSTM recurrent neural network, i.e., multi-variable LSTM for time series with exogenous variables.
Currently, the widely used attention mechanisms in recurrent neural networks mostly focus on the temporal aspect of data and fall short of characterizing variable importance.
To this end, our multi-variable LSTM equipped with tensorized hidden states is developed to learn variable specific representations, which give rise to both temporal and variable level attention.
Preliminary experiments demonstrate comparable prediction performance of multi-variable LSTM w.r.t. encoder-decoder based baselines.
More interestingly, variable importance in real datasets characterized by the variable attention is highly in line with that determined by the statistical Granger causality test, which demonstrates the potential of multi-variable LSTM as a simple and uniform end-to-end framework for both forecasting and knowledge discovery.
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by social interaction and communication difficulties, along with narrow and repetitive interests.
As a spectrum disorder, ASD affects individuals with a large range of combinations of challenges along dimensions such as intelligence, social skills, or sensory processing.
Hence, any interactive technology for ASD ought to be customizable to fit the particular profile of each individual that uses it.
The goal of this paper is to characterize the support of customization in this area.
To do so, we performed a focused study that identifies the dimensions of ASD where customization has been considered in wearable and natural-surface technologies, two of the most promising technologies for ASD, and assesses the empirical evaluation that supports them.
Our study revealed that, despite its critical importance, customization has fundamentally not been addressed in this domain, which opens avenues for research at the intersection of human-computer interaction and software engineering.
Current reconfiguration techniques are based on starting the system in a consistent configuration, in which all participating entities are in their initial state.
Starting from that state, the system must preserve consistency as long as a predefined churn rate of processor joins and leaves is not violated and unbounded storage is available.
Many working systems cannot control this churn rate and do not have access to unbounded storage.
System designers that neglect the outcome of violating the above assumptions may doom the system to exhibit illegal behaviors.
We present the first automatically recovering reconfiguration scheme that recovers from transient faults, such as temporal violations of the above assumptions.
Our self-stabilizing solutions regain safety automatically by assuming temporal access to reliable failure detectors.
Once safety is re-established, the failure detector reliability is no longer needed.
Still, liveness is conditioned by the failure detector's unreliable signals.
We show that our self-stabilizing reconfiguration techniques can serve as the basis for the implementation of several dynamic services over message passing systems.
Examples include self-stabilizing reconfigurable virtual synchrony, which, in turn, can be used for implementing a self-stabilizing reconfigurable state-machine replication and self-stabilizing reconfigurable emulation of shared memory.
Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks.
Generative topic models infer topic-word distributions, taking little or no context into account.
Here, we extend a neural autoregressive topic model to exploit the full context information around words in a document in a language modeling fashion.
This results in an improved performance in terms of generalization, interpretability and applicability.
We apply our modeling approach to seven data sets from various domains and demonstrate that our approach consistently outperforms state-of-the-art generative topic models.
With the learned representations, we show an average gain of 9.6% (0.57 vs. 0.52) in precision at retrieval fraction 0.02 and of 7.2% (0.582 vs. 0.543) in F1 for text categorization.
We examine connections between combinatorial notions that arise in machine learning and topological notions in cubical/simplicial geometry.
These connections enable us to export results from geometry to machine learning.
Our first main result is based on a geometric construction by Tracy Hall (2004) of a partial shelling of the cross-polytope that cannot be extended.
We use it to derive a maximum class of VC dimension 3 that has no corners.
This refutes several previous works in machine learning from the past 11 years.
In particular, it implies that all previous constructions of optimal unlabeled sample compression schemes for maximum classes are erroneous.
On the positive side we present a new construction of an unlabeled sample compression scheme for maximum classes.
We leave as open whether our unlabeled sample compression scheme extends to ample (a.k.a. lopsided or extremal) classes, which represent a natural and far-reaching generalization of maximum classes.
Towards resolving this question, we provide a geometric characterization in terms of unique sink orientations of the 1-skeletons of associated cubical complexes.
We explore the role of interaction for the problem of reliable computation over two-way multicast networks.
Specifically we consider a four-node network in which two nodes wish to compute a modulo-sum of two independent Bernoulli sources generated from the other two, and a similar task is done in the other direction.
The main contribution of this work lies in the characterization of the computation capacity region for a deterministic model of the network via a novel transmission scheme.
One consequence of this result is that not only can we obtain an interaction gain over the one-way non-feedback computation capacities, but we can also sometimes achieve the perfect-feedback computation capacities simultaneously in both directions.
This result draws a parallel with the recent result developed in the context of two-way interference channels.
Metaheuristic particle swarm optimization (PSO) algorithm has emerged as one of the most promising optimization techniques in solving highly constrained non-linear and non-convex optimization problems in different areas of electrical engineering.
Economic operation of the power system is one of the most important areas of electrical engineering where PSO has been used efficiently in solving various issues of practical systems.
In this paper, a comprehensive survey of research works in solving various aspects of economic load dispatch (ELD) problems of power system engineering using different types of PSO algorithms is presented.
Five important areas of ELD problems have been identified, and the papers published in the general area of ELD using PSO have been classified into these five sections.
These five areas are (i) single objective economic load dispatch, (ii) dynamic economic load dispatch, (iii) economic load dispatch with non-conventional sources, (iv) multi-objective environmental/economic dispatch, and (v) economic load dispatch of microgrids.
At the end of each category, a table is provided which describes the main features of the papers in brief.
Promising directions for future work are given in the conclusion of the review.
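A minimal PSO loop of the kind surveyed might look as follows; the sphere objective is only a stand-in for a real dispatch cost model, and all parameters are illustrative assumptions:

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=200, lo=-5.0, hi=5.0,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization sketch: particles track their
    personal best and the swarm's global best, and velocities blend
    inertia with attraction toward both."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))   # positions
    v = np.zeros_like(x)                          # velocities
    pbest = x.copy()
    pbest_val = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_val)].copy()        # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, float(min(pbest_val))

# Sphere function as a stand-in for a (convexified) dispatch cost.
g, best = pso(lambda z: float(np.sum(z ** 2)), dim=3)
```

ELD variants add valve-point effects, ramp limits, and power-balance constraints on top of this skeleton, typically via penalty terms in `f` or repair operators on `x`.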
The use of modern technology in education is key to an increased drive for learning, shaping learners' critical and analytical competencies with respect to disciplinary knowledge.
Distance education (DE) is a system of learning driven by computers linked to the Internet.
The flexible nature of DE benefits students who are unable to attend full-time education due to age or to social or religious barriers.
However, in Nigeria, the University of Lagos Distance Learning Institute has shortfalls traced to a poor student support system, which affects service delivery to students.
This study examined the influence of information support on ICT use by distance learners.
The transformation of Machine Learning (ML) from a boutique science into a generally accepted technology has increased the importance of reproducibility and transferability of ML studies.
In the current work, we investigate how corpus characteristics of textual data sets correspond to text classification results.
We work with two data sets gathered from sub-forums of an online health-related forum.
Our empirical results are obtained for a multi-class sentiment analysis application.
A visible light communication broadcast channel is considered, in which a transmitter luminaire communicates with two legitimate receivers in the presence of an external eavesdropper.
A number of trusted cooperative half-duplex relay luminaires are deployed to aid with securing the transmitted data.
Transmitters are equipped with single light fixtures, containing multiple light emitting diodes, and receiving nodes are equipped with single photo-detectors, rendering the considered setting as a single-input single-output system.
Transmission is amplitude-constrained to maintain operation within the light emitting diodes' dynamic range.
Achievable secrecy rate regions are derived under such amplitude constraints for this multi-receiver wiretap channel, first for direct transmission without the relays, and then for multiple relaying schemes: cooperative jamming, decode-and-forward, and amplify-and-forward.
Superposition coding with uniform signaling is used at the transmitter and the relays.
Further, for each relaying scheme, secure beamforming vectors are carefully designed at the relay nodes in order to hurt the eavesdropper and/or benefit the legitimate receivers.
Superiority of the proposed relaying schemes, with secure beamforming, is shown over direct transmission.
It is also shown that the best relaying scheme depends on how far the eavesdropper is located from the transmitter and the relays, the number of relays, and their geometric layout.
The quality of high-level AI of non-player characters (NPCs) in commercial open-world games (OWGs) has been increasing during the past years.
However, due to constraints specific to the game industry, this increase has been slow and it has been driven by larger budgets rather than adoption of new complex AI techniques.
Most of the contemporary AI is still expressed as hard-coded scripts.
The complexity and manageability of the script codebase is one of the key limiting factors for further AI improvements.
In this paper we address this issue.
We present behavior objects - a general approach to development of NPC behaviors for large OWGs.
Behavior objects are inspired by object-oriented programming and extend the concept of smart objects.
Our approach promotes encapsulation of data and code for multiple related behaviors in one place, hiding internal details and embedding intelligence in the environment.
Behavior objects are a natural abstraction of five different techniques that we have implemented to manage AI complexity in an upcoming AAA OWG.
We report the details of the implementations in the context of behavior trees and the lessons learned during development.
Our work should serve as inspiration for AI architecture designers from both the academia and the industry.
The ozone level prediction is an important task of air quality agencies of modern cities.
In this paper, we design an ozone level alarm system (OLP) for the city of Isfahan and test it on real-world data from 1-1-2000 to 7-6-2011.
We propose a computer based system with three inputs and single output.
The inputs include three sensors of solar ultraviolet (UV), total solar radiation (TSR) and total ozone (O3).
The output of the system is the predicted O3 level of the next day and the alarm messages.
A developed artificial intelligence (AI) algorithm is applied to determine the output based on the input variables.
For this issue, AI models, including supervised brain emotional learning (BEL), adaptive neuro-fuzzy inference system (ANFIS) and artificial neural networks (ANNs), are compared in order to find the best model.
The simulation of the proposed system shows that it can be used successfully to predict ozone levels in major cities.
With the popularity of deep learning (DL), artificial intelligence (AI) has been applied in many areas of human life.
Neural network or artificial neural network (NN), the main technique behind DL, has been extensively studied to facilitate computer vision and natural language recognition.
However, the more we rely on information technology, the more vulnerable we are.
That is, malicious NNs could pose a huge threat in the coming AI era.
In this paper, for the first time in the literature, we propose a novel approach to design and insert powerful neural-level trojans or PoTrojan in pre-trained NN models.
Most of the time, PoTrojans remain inactive, not affecting the normal functions of their host NN models.
PoTrojans could only be triggered in very rare conditions.
Once activated, however, the PoTrojans could cause the host NN models to malfunction, either falsely predicting or classifying, which is a significant threat to human society of the AI era.
We explain the principles of PoTrojans and the ease of designing and inserting them in pre-trained deep learning models.
PoTrojans do not modify the existing architecture or parameters of the pre-trained models, and no re-training is required.
Hence, the proposed method is very efficient.
Interpreting black box classifiers, such as deep networks, allows an analyst to validate a classifier before it is deployed in a high-stakes setting.
A natural idea is to visualize the deep network's representations, so as to "see what the network sees".
In this paper, we demonstrate that standard dimension reduction methods in this setting can yield uninformative or even misleading visualizations.
Instead, we present DarkSight, which visually summarizes the predictions of a classifier in a way inspired by the notion of dark knowledge.
DarkSight embeds the data points into a low-dimensional space such that it is easy to compress the deep classifier into a simpler one, essentially combining model compression and dimension reduction.
We compare DarkSight against t-SNE both qualitatively and quantitatively, demonstrating that DarkSight visualizations are more informative.
Our method additionally yields a new confidence measure based on dark knowledge by quantifying how unusual a given vector of predictions is.
We describe an approach to understand the peculiar and counterintuitive generalization properties of deep neural networks.
The approach involves going beyond worst-case theoretical capacity control frameworks that have been popular in machine learning in recent years to revisit old ideas in the statistical mechanics of neural networks.
Within this approach, we present a prototypical Very Simple Deep Learning (VSDL) model, whose behavior is controlled by two control parameters, one describing an effective amount of data, or load, on the network (that decreases when noise is added to the input), and one with an effective temperature interpretation (that increases when algorithms are early stopped).
Using this model, we describe how a very simple application of ideas from the statistical mechanics theory of generalization provides a strong qualitative description of recently-observed empirical results regarding the inability of deep neural networks not to overfit training data, discontinuous learning and sharp transitions in the generalization properties of learning algorithms, etc.
Accurate noise modelling is important for training of deep learning reconstruction algorithms.
While noise models are well known for traditional imaging techniques, the noise distribution of a novel sensor may be difficult to determine a priori.
Therefore, we propose learning arbitrary noise distributions.
To do so, this paper proposes a fully connected neural network model to map samples from a uniform distribution to samples of any explicitly known probability density function.
During the training, the Jensen-Shannon divergence between the distribution of the model's output and the target distribution is minimized.
We experimentally demonstrate that our model converges towards the desired state.
It provides an alternative to existing sampling methods such as inversion sampling, rejection sampling, Gaussian mixture models and Markov-Chain-Monte-Carlo.
Our model has high sampling efficiency and is easily applied to any probability distribution, without the need of further analytical or numerical calculations.
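The Jensen-Shannon objective mentioned above is, for two discrete histograms, the symmetrized KL divergence to their mixture; a minimal sketch (the paper minimizes it between the model's output distribution and the target during training):

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions
    (histograms): symmetric, and bounded above by log 2."""
    p = np.asarray(p, float) + eps; p = p / p.sum()
    q = np.asarray(q, float) + eps; q = q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

print(js_divergence([1, 0], [1, 0]))  # -> 0.0 for identical distributions
```

Its symmetry and boundedness make it a better-behaved training target than plain KL when the two distributions have nearly disjoint support.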
With joint learning of sampling and recovery, the deep learning-based compressive sensing (DCS) has shown significant improvement in performance and running time reduction.
Its reconstructed image, however, loses high-frequency content, especially at low subrates.
This happens similarly in the multi-scale sampling scheme which also samples more low-frequency components.
In this paper, we propose a multi-scale DCS convolutional neural network (MS-DCSNet) in which we convert image signal using multiple scale-based wavelet transform, then capture it through convolution block by block across scales.
The initial reconstructed image is directly recovered from multi-scale measurements.
Multi-scale wavelet convolution is utilized to enhance the final reconstruction quality.
The network is able to learn both multi-scale sampling and multi-scale reconstruction, thus resulting in better reconstruction quality.
Sybil attacks are becoming increasingly widespread and pose a significant threat to online social systems; a single adversary can inject multiple colluding identities in the system to compromise security and privacy.
Recent works have leveraged social network-based trust relationships to defend against Sybil attacks.
However, existing defenses are based on oversimplified assumptions about network structure, which do not necessarily hold in real-world social networks.
Recognizing these limitations, we propose SybilFuse, a defense-in-depth framework for Sybil detection when the oversimplified assumptions are relaxed.
SybilFuse adopts a collective classification approach by first training local classifiers to compute local trust scores for nodes and edges, and then propagating the local scores through the global network structure via weighted random walk and loopy belief propagation mechanisms.
We evaluate our framework on both synthetic and real-world network topologies, including a large-scale, labeled Twitter network comprising 20M nodes and 265M edges, and demonstrate that SybilFuse outperforms state-of-the-art approaches significantly.
In particular, SybilFuse achieves 98% of Sybil coverage among top-ranked nodes.
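The weighted-random-walk propagation step can be sketched as a random walk with restart over local trust priors; the toy graph and all parameters are illustrative assumptions, not SybilFuse's exact formulation:

```python
import numpy as np

def propagate_trust(W, local_scores, alpha=0.85, iters=100):
    """Spread local trust scores over an edge-weighted graph via a
    random walk with restart: each step mixes walked-in trust with the
    node's own local prior."""
    W = np.asarray(W, float)
    P = W / W.sum(axis=1, keepdims=True)        # row-stochastic transitions
    s = np.asarray(local_scores, float)
    s = s / s.sum()                             # normalized local priors
    t = s.copy()
    for _ in range(iters):
        t = alpha * (P.T @ t) + (1 - alpha) * s
    return t

# Toy graph: node 0 is a trusted seed, node 1 is tightly attached to it,
# node 2 (a Sybil candidate) has only weak attachment edges.
W = np.array([[0.0, 1.0, 0.1],
              [1.0, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
t = propagate_trust(W, local_scores=[1.0, 0.1, 0.1])
print(t)  # nodes 0 and 1 end up with higher trust than node 2
```

Because Sybil regions attach to the honest region through few, low-weight edges, little trust flows across the cut, which is what the ranking exploits.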
Dedicated Short Range Communication (DSRC) was designed to provide reliable wireless communication for intelligent transportation system applications.
Sharing information among cars and between cars and the infrastructure, pedestrians, or "the cloud" has great potential to improve safety, mobility and fuel economy.
DSRC is being considered by the US Department of Transportation to be required for ground vehicles.
In the past, DSRC performance has been assessed thoroughly in the lab and in limited field testing, but not on a large fleet.
In this paper, we present the analysis of DSRC performance using data from the world's largest connected vehicle test program - the Safety Pilot Model Deployment led by the University of Michigan.
We first investigate their maximum and effective range, and then study the effect of environmental factors, such as trees/foliage, weather, buildings, vehicle travel direction, and road elevation.
The results can be used to guide future DSRC equipment placement and installation, and can be used to develop DSRC communication models for numerical simulations.
Interactive visualizations are crucial in ad hoc data exploration and analysis.
However, with the growing number of massive datasets, generating visualizations in interactive timescales is increasingly challenging.
One approach to improving the speed of a visualization tool is data reduction, which lowers the computational overhead at a potential cost in visualization accuracy.
Common data reduction techniques, such as uniform and stratified sampling, do not exploit the fact that the sampled tuples will be transformed into a visualization for human consumption.
We propose a visualization-aware sampling (VAS) that guarantees high quality visualizations with a small subset of the entire dataset.
We validate our method when applied to scatter and map plots for three common visualization goals: regression, density estimation, and clustering.
The key to our sampling method's success is in choosing tuples which minimize a visualization-inspired loss function.
Our user study confirms that optimizing this loss function correlates strongly with user success in using the resulting visualizations.
We also show the NP-hardness of our optimization problem and propose an efficient approximation algorithm.
Our experiments show that, compared to previous methods, (i) using the same sample size, VAS improves user's success by up to 35% in various visualization tasks, and (ii) VAS can achieve a required visualization quality up to 400 times faster.
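A loss-minimizing greedy sampler of the flavor described above can be sketched with farthest-point coverage as a stand-in objective; this is not the paper's exact VAS loss:

```python
import numpy as np

def greedy_coverage_sample(points, k):
    """Greedy farthest-point sampling: starting from the first tuple,
    repeatedly pick the point farthest from everything chosen so far,
    so the sample covers the plot area well."""
    points = np.asarray(points, float)
    chosen = [0]
    d = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        i = int(np.argmax(d))
        chosen.append(i)
        d = np.minimum(d, np.linalg.norm(points - points[i], axis=1))
    return chosen

# Two tight clusters: a 2-point sample picks one point from each cluster,
# unlike uniform sampling, which could miss a whole cluster.
pts = [[0, 0], [0, 0.1], [10, 10], [10, 10.1]]
print(greedy_coverage_sample(pts, 2))  # -> [0, 3]
```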
Existing corpora for intrinsic evaluation are not targeted towards tasks in informal domains such as Twitter or news comment forums.
We want to test whether a representation of informal words fulfills the promise of eliding explicit text normalization as a preprocessing step.
One possible evaluation metric for such domains is the proximity of spelling variants.
We propose how such a metric might be computed and how a spelling variant dataset can be collected using UrbanDictionary.
Segmentation of retinal vessels from retinal fundus images is the key step in the automatic retinal image analysis.
In this paper, we propose a new unsupervised automatic method to segment the retinal vessels from retinal fundus images.
Contrast enhancement and illumination correction are carried out through a series of image processing steps followed by adaptive histogram equalization and anisotropic diffusion filtering.
This image is then converted to a gray scale using weighted scaling.
The vessel edges are enhanced by boosting the detail curvelet coefficients.
Optic disk pixels are removed before applying fuzzy C-means classification to avoid misclassification.
Morphological operations and connected component analysis are applied to obtain the segmented retinal vessels.
The performance of the proposed method is evaluated on the DRIVE database to enable comparison with other state-of-the-art supervised and unsupervised methods.
The overall segmentation accuracy of the proposed method is 95.18%, which outperforms the other algorithms.
Affective computing has become a very important research area in human-machine interaction.
However, affects are subjective, subtle, and uncertain.
Consequently, it is difficult to obtain a large number of labeled training samples relative to the number of features that can be extracted.
Thus, dimensionality reduction is critical in affective computing.
This paper presents our preliminary study on dimensionality reduction for affect classification.
Five popular dimensionality reduction approaches are introduced and compared.
Experiments on the DEAP dataset showed that no approach can universally outperform others, and performing classification using the raw features directly may not always be a bad choice.
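One of the compared approaches, PCA, can be sketched via the SVD; this is a generic implementation for illustration, not the exact experimental setup used on DEAP:

```python
import numpy as np

def pca_reduce(X, d):
    """Project features onto the top-d principal components (PCA,
    shown here as the simplest of the compared reduction approaches)."""
    Xc = X - X.mean(axis=0)
    # rows of Vt are principal directions, ordered by variance explained
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T
```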
The problem of distributed dynamic state estimation in wireless sensor networks is studied.
Two important properties of local estimates, namely, the consistency and confidence, are emphasized.
On one hand, the consistency, which means that the approximated error covariance is lower bounded by the true unknown one, has to be guaranteed so that the estimate is not over-confident.
On the other hand, since the confidence indicates the accuracy of the estimate, the estimate should be as confident as possible.
We first analyze two different information fusion strategies used in the case of information sources with, respectively, uncorrelated errors and unknown but correlated errors.
Then a distributed hybrid information fusion algorithm is proposed, where each agent uses the information obtained not only by itself, but also from its neighbors through communication.
The proposed algorithm not only guarantees the consistency of the estimates, but also utilizes the available information sources in a more efficient manner and hence improves the confidence.
Besides, the proposed algorithm is fully distributed and guarantees convergence with the sufficient condition formulated.
Comparisons with existing algorithms are also presented.
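For the unknown-but-correlated case, a standard consistency-preserving fusion rule is covariance intersection; the sketch below shows that rule only, not the paper's full hybrid algorithm:

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, w=0.5):
    """Fuse two estimates whose error correlation is unknown.
    Covariance intersection keeps the fused covariance consistent
    (never over-confident) for any weight w in [0, 1]."""
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    P = np.linalg.inv(w * I1 + (1 - w) * I2)
    x = P @ (w * I1 @ x1 + (1 - w) * I2 @ x2)
    return x, P
```

In a distributed setting, each agent would apply such a rule to the estimates received from its neighbors.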
In this paper, we propose a novel deep neural network architecture, Sequence-to-Sequence Audio2Vec, for unsupervised learning of fixed-length vector representations of audio segments excised from a speech corpus, where the vectors contain semantic information pertaining to the segments, and are close to other vectors in the embedding space if their corresponding segments are semantically similar.
The design of the proposed model is based on the RNN Encoder-Decoder framework, and borrows the methodology of continuous skip-grams for training.
The learned vector representations are evaluated on 13 widely used word similarity benchmarks, achieving results competitive with those of GloVe.
The biggest advantage of the proposed model is its capability of extracting semantic information of audio segments taken directly from raw speech, without relying on any other modalities such as text or images, which are challenging and expensive to collect and annotate.
In three dimensional integrated circuits (3D-ICs), through silicon via (TSV) is a critical technique in providing vertical connections.
However, yield and reliability remain key obstacles to the industrial adoption of TSV-based 3D-IC technology.
Various fault-tolerance structures using spare TSVs to repair faulty functional TSVs have been proposed in literature for yield and reliability enhancement, but a valid structure cannot always be found due to the lack of effective generation methods for fault-tolerance structures.
In this paper, we focus on the problem of adaptive fault-tolerance structure generation.
Given the relations between functional TSVs and spare TSVs, we first calculate the maximum number of tolerant faults in each TSV group.
Then we propose an integer linear programming (ILP) based model to construct an adaptive fault-tolerance structure with minimal multiplexer delay overhead and hardware cost.
We further develop a speed-up technique through efficient min-cost-max-flow (MCMF) model.
All the proposed methodologies are embedded in a top-down TSV planning framework to form functional TSV groups and generate adaptive fault-tolerance structures.
Experimental results show that, compared with the state-of-the-art, the number of spare TSVs used for fault tolerance can be effectively reduced.
This article considers the application of genetic algorithms to finite state machine synthesis.
The resulting genetic finite state machine synthesis algorithm produces machines with fewer states in less time.
This makes it possible to use the hardware-oriented genetic finite state machine synthesis algorithm in autonomous systems on reconfigurable platforms.
Network functions (e.g., firewalls, load balancers, etc.) have been traditionally provided through proprietary hardware appliances.
Often, hardware appliances need to be hardwired back to back to form a service chain providing chained network functions.
Hardware appliances cannot be provisioned on demand since they are statically embedded in the network topology, making creation, insertion, modification, upgrade, and removal of service chains complex, and also slowing down service innovation.
Hence, network operators are starting to deploy Virtual Network Functions (VNFs), which are virtualized over commodity hardware.
VNFs can be deployed in Data Centers (DCs) or in Network Function Virtualization (NFV) capable network elements (nodes) such as routers and switches.
NFV capable nodes and DCs together form a Network enabled Cloud (NeC) that helps to facilitate the dynamic service chaining required to support evolving network traffic and its service demands.
In this study, we focus on the VNF service chain placement and traffic routing problem, and build a model for placing a VNF service chain while minimizing network resource consumption.
Our results indicate that a NeC having a DC and NFV capable nodes can significantly reduce network-resource consumption.
When scripts in untyped languages grow into large programs, maintaining them becomes difficult.
A lack of explicit type annotations in typical scripting languages forces programmers to (re)discover critical pieces of design information every time they wish to change a program.
This analysis step slows down the maintenance process and may even introduce mistakes due to the violation of undiscovered invariants.
This paper presents Typed Scheme, an explicitly typed extension of PLT Scheme, an untyped scripting language.
Its type system is based on the novel notion of occurrence typing, which we formalize and mechanically prove sound.
The implementation of Typed Scheme additionally borrows elements from a range of approaches, including recursive types, true unions and subtyping, plus polymorphism combined with a modicum of local inference.
The formulation of occurrence typing naturally leads to a simple and expressive version of predicates to describe refinement types.
A Typed Scheme program can use these refinement types to keep track of arbitrary classes of values via the type system.
Further, we show how the Typed Scheme type system, in conjunction with simple recursive types, is able to encode refinements of existing datatypes, thus expressing both proposed variations of refinement types.
Ensembling multiple predictions is a widely-used technique to improve the accuracy of various machine learning tasks.
One obvious drawback of the ensembling is its higher execution cost during inference.
In this paper, we first describe our insights on the relationship between the probability of a prediction and the effect of ensembling with current deep neural networks: ensembling does not correct mispredictions for inputs predicted with a high probability, even when there is a non-negligible number of mispredicted inputs.
This finding motivates us to develop a new technique called adaptive ensemble prediction, which achieves the benefits of ensembling with much smaller additional execution costs.
If the prediction for an input reaches a high enough probability on the basis of the confidence level, we stop ensembling for this input to avoid wasting computation power.
We evaluated the adaptive ensembling by using various datasets and showed that it reduces the computation cost significantly while achieving similar accuracy to the naive ensembling.
We also showed that our statistically rigorous confidence-level-based termination condition reduces the burden of the task-dependent parameter tuning compared to the naive termination based on the pre-defined threshold in addition to yielding a better accuracy with the same cost.
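The core idea can be sketched as follows, with a fixed probability threshold standing in for the paper's statistically rigorous confidence-level test:

```python
import numpy as np

def adaptive_ensemble(models, x, threshold=0.9):
    """Average predictions one model at a time and stop early once the
    running top-class probability clears `threshold` (an assumed form of
    the confidence test; the paper uses a statistical confidence level).
    Each model is a callable returning a probability vector."""
    total = None
    for i, m in enumerate(models, start=1):
        p = m(x)
        total = p if total is None else total + p
        avg = total / i
        if avg.max() >= threshold:   # confident enough: skip the rest
            return avg, i
    return avg, i
```

Confident inputs thus pay for a single forward pass, while uncertain inputs still receive the full ensemble.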
Radio pollution and power consumption problems have driven the innovative development of green heterogeneous networks (HetNets).
Time reversal (TR), a technique that has been validated from wide- to narrow-band transmissions, is evaluated as one of the most prominent linear precoders with a superior capability of harvesting signal energy.
In this paper, we consider a new HetNet model, in which a TR-employed femtocell is proposed to attain power-saving benefits, whereas the macrocell utilizes a beamforming algorithm based on the zero-forcing principle, over frequency selective channels.
In the considered HetNet, the practical case of limited signaling information exchanged via backhaul connections is also taken under advisement.
We therefore devise a distributed power loading strategy, in which macrocell users are given a higher priority than femtocell users.
Monte-Carlo simulation results show that TR is preferable to zero-forcing as a beamforming technique in femtocell environments due to its very high energy-saving gain, and the validity of the power loading strategy is verified over multipath channels.
Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications.
However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora.
We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms.
We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph structure.
Our embeddings capture entity similarity and relatedness better than prior work, both in existing biomedical datasets and a new Wikipedia-based dataset that we release to the community.
Results on analogy completion and entity sense disambiguation indicate that entities and words capture complementary information that can be effectively combined for downstream use.
The densification and expansion of wireless networks pose new challenges for interference management and energy consumption reduction.
This paper studies energy-efficient resource management in heterogeneous networks by jointly optimizing cell activation, user association and multicell multiuser channel assignment, according to the long-term average traffic and channel conditions.
The proposed framework is built on characterizing the interference coupling by pre-defined interference patterns, and performing resource allocation among these patterns.
In this way, the interference fluctuation caused by (de)activating cells is explicitly taken into account when calculating the user achievable rates.
A tailored algorithm is developed to solve the formulated problem in the dual domain by exploiting the problem structure, which gives a significant complexity saving.
Numerical results show a substantial improvement in energy saving achieved by the proposed scheme.
The user association derived from the proposed joint resource optimization is mapped to standard-compliant cell selection biasing.
This mapping reveals that the cell-specific biasing for energy saving is quite different from that for load balancing investigated in the literature.
In this paper we introduce a new, high-quality, dataset of images containing fruits.
We also present the results of numerical experiments for training a neural network to detect fruits.
We discuss why we chose fruits for this project by proposing a few applications that could use such a neural network.
Network testing plays an important role in the iterative process of developing new communication protocols and algorithms.
However, test environments have to keep up with the evolution of technology and require continuous update and redesign.
In this paper, we propose COINS, a framework that can be used by wireless technology developers to enable continuous integration (CI) practices in their testbed infrastructure.
As a proof-of-concept, we provide a reference architecture and implementation of COINS for controlled testing of multi-technology 5G Machine Type Communication (MTC) networks.
The implementation upgrades an existing wireless experimentation testbed with new software and hardware functionalities.
It blends web service technology and operating system virtualization technologies with emerging Internet of Things technologies enabling CI for wireless networks.
Moreover, we also extend an existing qualitative methodology for comparing similar frameworks and identify and discuss open challenges for wider use of CI practices in wireless technology development.
We propose the first deep learning solution to video frame inpainting, a challenging instance of the general video inpainting problem with applications in video editing, manipulation, and forensics.
Our task is less ambiguous than frame interpolation and video prediction because we have access to both the temporal context and a partial glimpse of the future, allowing us to better evaluate the quality of a model's predictions objectively.
We devise a pipeline composed of two modules: a bidirectional video prediction module, and a temporally-aware frame interpolation module.
The prediction module makes two intermediate predictions of the missing frames, one conditioned on the preceding frames and the other conditioned on the following frames, using a shared convolutional LSTM-based encoder-decoder.
The interpolation module blends the intermediate predictions to form the final result.
Specifically, it utilizes time information and hidden activations from the video prediction module to resolve disagreements between the predictions.
Our experiments demonstrate that our approach produces more accurate and qualitatively satisfying results than a state-of-the-art video prediction method and many strong frame inpainting baselines.
Interpretability and small labelled datasets are key issues in the practical application of deep learning, particularly in areas such as medicine.
In this paper, we present a semi-supervised technique that addresses both these issues by leveraging large unlabelled datasets to encode and decode images into a dense latent representation.
Using chest radiography as an example, we apply this encoder to other labelled datasets and apply simple models to the latent vectors to learn algorithms to identify heart failure.
For each prediction, we generate visual rationales by optimizing a latent representation to minimize the prediction of disease while constrained by a similarity measure in image space.
Decoding the resultant latent representation produces an image without apparent disease.
The difference between the original decoding and the altered image forms an interpretable visual rationale for the algorithm's prediction on that image.
We also apply our method to the MNIST dataset and compare the generated rationales to other techniques described in the literature.
Finding interesting association rules is an important and active research field in data mining.
The algorithms of the Apriori family are based on two rule extraction measures, support and confidence.
Although these two measures have the virtue of being algorithmically fast, they generate a prohibitive number of rules most of which are redundant and irrelevant.
It is therefore necessary to use further measures which filter uninteresting rules.
Many survey studies have therefore examined interestingness measures from several points of view.
Different reported studies have been carried out to identify "good" properties of rule extraction measures, and these properties have been assessed on 61 measures.
The purpose of this paper is twofold.
First to extend the number of the measures and properties to be studied, in addition to the formalization of the properties proposed in the literature.
Second, in the light of this formal study, to categorize the studied measures.
This paper thus identifies categories of measures in order to help users efficiently select an appropriate measure during the knowledge extraction process.
Evaluating the properties of the 61 measures enabled us to identify 7 classes of measures, obtained using two different clustering techniques.
In recent years, fuzz testing has proven itself to be one of the most effective techniques for finding correctness bugs and security vulnerabilities in practice.
One particular fuzz testing tool, American Fuzzy Lop or AFL, has become popular thanks to its ease-of-use and bug-finding power.
However, AFL remains limited in the depth of program coverage it achieves, in particular because it does not consider which parts of program inputs should not be mutated in order to maintain deep program coverage.
We propose an approach, FairFuzz, that helps alleviate this limitation in two key steps.
First, FairFuzz automatically prioritizes inputs exercising rare parts of the program under test.
Second, it automatically adjusts the mutation of inputs so that the mutated inputs are more likely to exercise these same rare parts of the program.
We conduct evaluation on real-world programs against state-of-the-art versions of AFL, thoroughly repeating experiments to get good measures of variability.
We find that on certain benchmarks FairFuzz shows significant coverage increases after 24 hours compared to state-of-the-art versions of AFL, while on others it achieves high program coverage at a significantly faster rate.
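The first step, prioritizing inputs that exercise rare branches, might be sketched as follows; the queue and hit-count representations are assumptions for illustration, not AFL's actual data structures:

```python
from collections import Counter

def prioritize_rare(queue, branch_hits):
    """Order fuzzing inputs so those exercising the rarest branch come
    first (a sketch of FairFuzz's first step).  `branch_hits` counts how
    many queue inputs hit each branch; each queue entry records the set
    of branches it covers."""
    def rarity(inp):
        # an input is as interesting as its least-hit branch
        return min(branch_hits[b] for b in inp["branches"])
    return sorted(queue, key=rarity)
```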
We propose a sign-based online learning (SOL) algorithm for a neuromorphic hardware framework called Trainable Analogue Block (TAB).
The TAB framework utilises the principles of neural population coding, implying that it encodes the input stimulus using a large pool of nonlinear neurons.
The SOL algorithm is a simple weight update rule that employs the sign of the hidden layer activation and the sign of the output error, which is the difference between the target output and the predicted output.
The SOL algorithm is easily implementable in hardware, and can be used in any artificial neural network framework that learns weights by minimising a convex cost function.
We show that the TAB framework can be trained for various regression tasks using the SOL algorithm.
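A minimal sketch of the SOL weight update for a linear readout, under the assumption of a single trainable output layer updated online:

```python
import numpy as np

def sol_update(W, h, y_pred, y_true, lr=0.01):
    """Sign-based online learning step: adjust the readout weights using
    only the sign of the hidden activation h and the sign of the output
    error, which makes the rule cheap to realize in hardware."""
    err = np.asarray(y_true) - np.asarray(y_pred)
    return W + lr * np.outer(np.sign(err), np.sign(h))
```

Repeated application drives the prediction toward the target until the error is within one step size of zero.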
We give optimal sorting algorithms in the evolving data framework, where an algorithm's input data is changing while the algorithm is executing.
In this framework, instead of producing a final output, an algorithm attempts to maintain an output close to the correct output for the current state of the data, repeatedly updating its best estimate of a correct output over time.
We show that a simple repeated insertion-sort algorithm can maintain an O(n) Kendall tau distance, with high probability, between a maintained list and an underlying total order of n items in an evolving data model where each comparison is followed by a swap between a random consecutive pair of items in the underlying total order.
This result is asymptotically optimal, since there is an Omega(n) lower bound for Kendall tau distance for this problem.
Our result closes the gap between this lower bound and the previous best algorithm for this problem, which maintains a Kendall tau distance of O(n log log n) with high probability.
It also confirms previous experimental results that suggested that insertion sort tends to perform better than quicksort in practice.
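The two ingredients of the analysis, an insertion-sort pass and the Kendall tau distance, can be sketched as:

```python
def insertion_sort_pass(lst):
    """One full insertion sort of the maintained list; in the evolving-
    data model this is rerun repeatedly while random adjacent swaps
    perturb the underlying total order between comparisons."""
    for i in range(1, len(lst)):
        j = i
        while j > 0 and lst[j - 1] > lst[j]:
            lst[j - 1], lst[j] = lst[j], lst[j - 1]
            j -= 1
    return lst

def kendall_tau(perm, truth):
    """Number of pairwise order disagreements between two rankings."""
    pos = {v: i for i, v in enumerate(truth)}
    a = [pos[v] for v in perm]
    return sum(1 for i in range(len(a))
                 for j in range(i + 1, len(a)) if a[i] > a[j])
```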
Computational devices that combine two or more different parts, with one controlling the operation of the other, derive their power from the interaction between the parts in addition to the capabilities of the parts themselves.
Non-classical computation has tended to consider only single computational models (neural, analog, quantum, chemical, biological), neglecting the contribution from the experimental controls.
In this position paper, we propose a framework suitable for analysing combined computational models, from abstract theory to practical programming tools.
We focus on the simplest example: one system controlled by another through a sequence of operations in which only one system is active at a time; the output from one system becomes the input to the other for the next step, and vice versa.
We outline the categorical machinery required for handling diverse computational systems in such combinations, with their interactions explicitly accounted for.
Drawing on prior work in refinement and retrenchment, we suggest an appropriate framework for developing programming tools from the categorical framework.
We place this work in the context of two contrasting concepts of "efficiency": theoretical comparisons to determine the relative computational power do not always reflect the practical comparison of real resources for a finite-sized computational task, especially when the inputs include (approximations of) real numbers.
Finally we outline the limitations of our simple model, and identify some of the extensions that will be required to treat more complex interacting computational systems.
Currently, most speech processing techniques use magnitude spectrograms as front-end and are therefore by default discarding part of the signal: the phase.
In order to overcome this limitation, we propose an end-to-end learning method for speech denoising based on Wavenet.
The proposed model adaptation retains Wavenet's powerful acoustic modeling capabilities, while significantly reducing its time-complexity by eliminating its autoregressive nature.
Specifically, the model makes use of non-causal, dilated convolutions and predicts target fields instead of a single target sample.
The discriminative adaptation of the model we propose learns in a supervised fashion by minimizing a regression loss.
These modifications make the model highly parallelizable during both training and inference.
Both computational and perceptual evaluations indicate that the proposed method is preferred to Wiener filtering, a common method based on processing the magnitude spectrogram.
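A non-causal dilated convolution, the building block described above, can be sketched in plain NumPy (single channel and 'same' padding assumed):

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Non-causal dilated 1D convolution with 'same' padding: each output
    sample sees inputs `dilation` steps to both sides, so stacking layers
    with growing dilation widens the receptive field without recursion.
    Assumes an odd kernel length, e.g. 3."""
    k = len(w)
    half = (k // 2) * dilation
    xp = np.pad(np.asarray(x, dtype=float), half)
    return np.array([sum(w[t] * xp[i + t * dilation] for t in range(k))
                     for i in range(len(x))])
```

Because every output position depends only on a fixed window of inputs, all positions can be computed in parallel, unlike an autoregressive model.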
We consider a network of event-based systems that use a shared wireless medium to communicate with their respective controllers.
These systems use a contention resolution mechanism to arbitrate access to the shared network.
We identify sufficient conditions for Lyapunov mean square stability of each control system in the network, and design event-based policies that guarantee it.
Our stability analysis is based on a Markov model that removes the network-induced correlation between the states of the control systems in the network.
Analyzing the stability of this Markov model remains a challenge, as the event-triggering policy renders the estimation error non-Gaussian.
Hence, we identify an auxiliary system that furnishes an upper bound for the variance of the system states.
Using the stability analysis, we design policies, such as the constant-probability policy, for adapting the event-triggering thresholds to the delay in accessing the network.
Realistic wireless networked control examples illustrate the applicability of the presented approach.
Artist recognition is the task of modeling an artist's musical style.
This problem is challenging because there is no clear standard.
We propose a hybrid method of the generative model i-vector and the discriminative model deep convolutional neural network.
We show that this hybrid approach achieves state-of-the-art performance because the two models complement each other.
In addition, we briefly explain the advantages and disadvantages of each approach.
Surrogate models provide a low computational cost alternative to evaluating expensive functions.
The construction of accurate surrogate models with large numbers of independent variables is currently prohibitive because it requires a large number of function evaluations.
Gradient-enhanced kriging has the potential to reduce the number of function evaluations for the desired accuracy when efficient gradient computation, such as an adjoint method, is available.
However, current gradient-enhanced kriging methods do not scale well with the number of sampling points due to the rapid growth in the size of the correlation matrix where new information is added for each sampling point in each direction of the design space.
They do not scale well with the number of independent variables either, due to the increase in the number of hyperparameters that need to be estimated.
To address this issue, we develop a new gradient-enhanced surrogate model approach that drastically reduces the number of hyperparameters through the use of the partial least squares method while maintaining accuracy.
In addition, this method is able to control the size of the correlation matrix by adding only relevant points defined through the information provided by the partial-least squares method.
To validate our method, we compare the global accuracy of the proposed method with conventional kriging surrogate models on two analytic functions with up to 100 dimensions, as well as engineering problems of varied complexity with up to 15 dimensions.
We show that the proposed method requires fewer sampling points than conventional methods to obtain the desired accuracy, or provides more accuracy for a fixed budget of sampling points.
In some cases, we get over 3 times more accurate models than a bench of surrogate models from the literature, and also over 3200 times faster than standard gradient-enhanced kriging models.
To deploy a spoken language understanding (SLU) model to a new language, language transferring is desired to avoid the trouble of acquiring and labeling a new big SLU corpus.
Translating the original SLU corpus into the target language is an attractive strategy.
However, SLU corpora consist of plenty of semantic labels (slots), which general-purpose translators cannot handle well, not to mention additional cultural differences.
This paper focuses on the language transferring task given a tiny in-domain parallel SLU corpus.
The in-domain parallel corpus can first be used to adapt the general translator.
But more importantly, we show how to use reinforcement learning (RL) to further finetune the adapted translator, where translated sentences with more proper slot tags receive higher rewards.
We evaluate our approach on Chinese to English language transferring for SLU systems.
The experimental results show that the generated English SLU corpus via adaptation and reinforcement learning gives us over 97% in the slot F1 score and over 84% accuracy in domain classification.
It demonstrates the effectiveness of the proposed language transferring method.
Compared with naive translation, our proposed method improves domain classification accuracy by a relative 22%, and the slot filling F1 score by a relative margin of more than 71%.
Vector Quantization (VQ) is a popular image compression technique with a simple decoding architecture and a high compression ratio.
Codebook design is the most essential part of Vector Quantization.
Linde-Buzo-Gray (LBG) is a traditional method for generating VQ codebooks, but it results in lower PSNR values.
A Codebook affects the quality of image compression, so the choice of an appropriate codebook is a must.
Several optimization techniques have been proposed for global codebook generation to enhance the quality of image compression.
In this paper, a novel algorithm called IDE-LBG is proposed which uses Improved Differential Evolution Algorithm coupled with LBG for generating optimum VQ Codebooks.
The proposed IDE works better than the traditional DE with modifications in the scaling factor and the boundary control mechanism.
The IDE generates better solutions by efficient exploration and exploitation of the search space.
Then the best optimal solution obtained by the IDE is provided as the initial Codebook for the LBG.
This approach produces an efficient Codebook with less computational time and the consequences include excellent PSNR values and superior quality reconstructed images.
It is observed that the proposed IDE-LBG finds better VQ codebooks than IPSO-LBG, BA-LBG and FA-LBG.
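The LBG refinement that follows the evolutionary initialization can be sketched as a k-means-style loop; the differential evolution stage itself is omitted here:

```python
import numpy as np

def lbg_refine(vectors, codebook, iters=10):
    """Plain LBG refinement: assign each training vector to its nearest
    codeword, then move each codeword to its cluster centroid.  In the
    described approach, `codebook` would be seeded with the best solution
    found by the improved differential evolution rather than chosen
    randomly."""
    codebook = codebook.astype(float).copy()
    for _ in range(iters):
        d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for c in range(len(codebook)):
            members = vectors[assign == c]
            if len(members):
                codebook[c] = members.mean(axis=0)
    return codebook
```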
In this paper we propose a signature scheme based on two intractable problems, namely the integer factorization problem and the discrete logarithm problem for elliptic curves.
It is suitable for applications requiring long-term security and provides a more efficient solution than the existing ones.
This paper studies the fundamental limits of content delivery in a cache-aided broadcast network for correlated content generated by a discrete memoryless source with arbitrary joint distribution.
Each receiver is equipped with a cache of equal capacity, and the requested files are delivered over a shared error-free broadcast link.
A class of achievable correlation-aware schemes based on a two-step source coding approach is proposed.
Library files are first compressed, and then cached and delivered using a combination of correlation-unaware multiple-request cache-aided coded multicast schemes.
The first step uses Gray-Wyner source coding to represent the library via private descriptions and descriptions that are common to more than one file.
The second step then becomes a multiple-request caching problem, where the demand structure is dictated by the configuration of the compressed library, and it is interesting in its own right.
The performance of the proposed two-step scheme is evaluated by comparing its achievable rate with a lower bound on the optimal peak and average rate-memory tradeoffs in a two-file multiple-receiver network, and in a three-file two-receiver network.
Specifically, in a network with two files and two receivers, the achievable rate matches the lower bound for a significant memory regime and it is within half of the conditional entropy of files for all other memory values.
In the three-file two-receiver network, the two-step strategy achieves the lower bound for large cache capacities, and it is within half of the joint entropy of two of the sources conditioned on the third one for all other cache sizes.
One of the biggest challenges in the research of generative adversarial networks (GANs) is assessing the quality of generated samples and detecting various levels of mode collapse.
In this work, we construct a novel measure of performance of a GAN by comparing geometrical properties of the underlying data manifold and the generated one, which provides both qualitative and quantitative means for evaluation.
Our algorithm can be applied to datasets of an arbitrary nature and is not limited to visual data.
We test the obtained metric on various real-life models and datasets and demonstrate that our method provides new insights into properties of GANs.
In this paper, we present an improved feedforward sequential memory networks (FSMN) architecture, namely Deep-FSMN (DFSMN), by introducing skip connections between memory blocks in adjacent layers.
These skip connections enable information flow across different layers and thus alleviate the gradient vanishing problem when building very deep structures.
As a result, DFSMN significantly benefits from these skip connections and deep structure.
We have compared the performance of DFSMN to BLSTM both with and without lower frame rate (LFR) on several large speech recognition tasks, including English and Mandarin.
Experimental results show that DFSMN consistently outperforms BLSTM with a dramatic gain, especially when trained with LFR using CD-Phone as modeling units.
On the 2000-hour Fisher (FSH) task, the proposed DFSMN achieves a word error rate of 9.4% using only the cross-entropy criterion and decoding with a 3-gram language model, a 1.5% absolute improvement over the BLSTM.
On a 20000-hour Mandarin recognition task, the LFR-trained DFSMN achieves more than 20% relative improvement over the LFR-trained BLSTM.
Moreover, we can easily design the lookahead filter order of the memory blocks in DFSMN to control the latency for real-time applications.
This paper presents two new approaches to decomposing and solving large Markov decision problems (MDPs), a partial decoupling method and a complete decoupling method.
In these approaches, a large, stochastic decision problem is divided into smaller pieces.
The first approach builds a cache of policies for each part of the problem independently, and then combines the pieces in a separate, light-weight step.
A second approach also divides the problem into smaller pieces, but information is communicated between the different problem pieces, allowing intelligent decisions to be made about which piece requires the most attention.
Both approaches can be used to find optimal policies or approximately optimal policies with provable bounds.
These algorithms also provide a framework for the efficient transfer of knowledge across problems that share similar structure.
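A minimal sketch of the partial-decoupling idea, under illustrative assumptions (a toy 6-state chain MDP with transitions clamped inside each piece; not the authors' formulation): each piece is solved independently with standard value iteration, and the cached per-piece policies are merged in a separate light-weight step. Note that clamping hides the chain's single reward from the left piece, which hints at why the complete-decoupling method's communication between pieces can matter.

```python
# Hedged sketch of partial decoupling on a toy chain MDP. The MDP,
# rewards, and split point are illustrative assumptions.

def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Standard value iteration; returns the value function and greedy policy."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(reward(s, a) + gamma * V[transition(s, a)]
                       for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions,
                     key=lambda a: reward(s, a) + gamma * V[transition(s, a)])
              for s in states}
    return V, policy

# A 6-state chain: moving "right" from state 5 earns reward 1.
def transition(s, a):
    return min(5, s + 1) if a == "right" else max(0, s - 1)

def reward(s, a):
    return 1.0 if (s == 5 and a == "right") else 0.0

def clamp(piece):
    """Confine transitions to one piece -- the decoupling approximation."""
    lo, hi = min(piece), max(piece)
    return lambda s, a: min(hi, max(lo, transition(s, a)))

left_piece, right_piece = [0, 1, 2], [3, 4, 5]
_, pol_left = value_iteration(left_piece, ["left", "right"],
                              clamp(left_piece), reward)
_, pol_right = value_iteration(right_piece, ["left", "right"],
                               clamp(right_piece), reward)

# Light-weight combine step: concatenate the cached per-piece policies.
policy = {**pol_left, **pol_right}
print(policy)
```

In the right piece the reward is reachable, so its cached policy correctly drives toward state 5; the left piece sees no reward at all under clamping, so its policy is arbitrary until information is exchanged between pieces.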
The collaborative development methods pioneered by the open source software community offer a way to create lessons that are open, accessible, and sustainable.
This paper presents ten simple rules for doing this drawn from our experience with several successful projects.
This article consists of a brief introduction to the Shannon information theory.
Two topics, entropy and channel capacity, are mainly covered.
All these concepts are developed in a totally combinatorial flavor.
Some issues usually not addressed in the literature are discussed here as well.
Protograph-based Raptor-like low-density parity-check codes (PBRL codes) are a recently proposed family of easily encodable and decodable rate-compatible LDPC (RC-LDPC) codes.
These codes have an excellent iterative decoding threshold and performance across all design rates.
PBRL codes designed thus far, for both long and short block-lengths, have been based on optimizing the iterative decoding threshold of the protograph of the RC code family at various design rates.
In this work, we propose a design method to obtain better quasi-cyclic (QC) RC-LDPC codes with PBRL structure for short block-lengths (of a few hundred bits).
We achieve this by maximizing an upper bound on the minimum distance of any QC-LDPC code that can be obtained from the protograph of a PBRL ensemble.
The obtained codes outperform the original PBRL codes at short block-lengths by significantly improving the error floor behavior at all design rates.
Furthermore, we identify a reduction in complexity of the design procedure, facilitated by the general structure of a PBRL ensemble.
We give a description of the weighted Reed-Muller codes over a prime field in a modular algebra.
A description of the homogeneous Reed-Muller codes in the same ambient space is presented for the binary case.
A decoding procedure using the Landrock-Manz method is developed.
Algorithms that use hardware transactional memory (HTM) must provide a software-only fallback path to guarantee progress.
The design of the fallback path can have a profound impact on performance.
If the fallback path is allowed to run concurrently with hardware transactions, then hardware transactions must be instrumented, adding significant overhead.
Otherwise, hardware transactions must wait for any processes on the fallback path, causing concurrency bottlenecks, or move to the fallback path.
We introduce an approach that combines the best of both worlds.
The key idea is to use three execution paths: an HTM fast path, an HTM middle path, and a software fallback path, such that the middle path can run concurrently with each of the other two.
The fast path and fallback path do not run concurrently, so the fast path incurs no instrumentation overhead.
Furthermore, fast path transactions can move to the middle path instead of waiting or moving to the software path.
We demonstrate our approach by producing an accelerated version of the tree update template of Brown et al., which can be used to implement fast lock-free data structures based on down-trees.
We used the accelerated template to implement two lock-free trees: a binary search tree (BST), and an (a,b)-tree (a generalization of a B-tree).
Experiments show that, with 72 concurrent processes, our accelerated (a,b)-tree performs between 4.0x and 4.2x as many operations per second as an implementation obtained using the original tree update template.
This paper proposes a new objective metric of exceptional motion in VR video contents for VR sickness assessment.
In a VR environment, VR sickness can be caused by several factors, including mismatched motion, field of view, motion parallax, and viewing angle.
Similar to motion sickness, VR sickness can induce a lot of physical symptoms such as general discomfort, headache, stomach awareness, nausea, vomiting, fatigue, and disorientation.
To address the viewing safety issues in virtual environment, it is of great importance to develop an objective VR sickness assessment method that predicts and analyses the degree of VR sickness induced by the VR content.
The proposed method takes into account motion information that is one of the most important factors in determining the overall degree of VR sickness.
In this paper, we detect the exceptional motion that is likely to induce VR sickness.
Spatio-temporal features of the exceptional motion in the VR video content are encoded using a convolutional autoencoder.
For objectively assessing the VR sickness, the level of exceptional motion in VR video content is measured by using the convolutional autoencoder as well.
The effectiveness of the proposed method has been successfully evaluated by a subjective assessment experiment using simulator sickness questionnaires (SSQ) in a VR environment.
Stochastic gradient descent (SGD) is widely believed to perform implicit regularization when used to train deep neural networks, but the precise manner in which this occurs has thus far been elusive.
We prove that SGD minimizes an average potential over the posterior distribution of weights along with an entropic regularization term.
This potential is however not the original loss function in general.
So SGD does perform variational inference, but for a different loss than the one used to compute the gradients.
Even more surprisingly, SGD does not even converge in the classical sense: we show that the most likely trajectories of SGD for deep networks do not behave like Brownian motion around critical points.
Instead, they resemble closed loops with deterministic components.
We prove that such "out-of-equilibrium" behavior is a consequence of highly non-isotropic gradient noise in SGD; the covariance matrix of mini-batch gradients for deep networks has a rank as small as 1% of its dimension.
We provide extensive empirical validation of these claims; the proofs are given in the appendix.
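The low-rank gradient-noise claim can be made concrete with a hedged toy example (not the paper's experiment): for a tiny least-squares model whose inputs all lie on a single line in R^3, the per-sample gradients are parallel, so the mini-batch gradient covariance has rank 1 even though the parameter dimension is 3, an extreme case of the highly non-isotropic noise the paper reports for deep networks. The data and weights below are illustrative assumptions.

```python
# Toy illustration: inputs on a 1-D subspace give a rank-1 gradient covariance.

def matrix_rank(rows, tol=1e-10):
    """Rank via Gaussian elimination (sufficient for tiny matrices)."""
    rows = [list(r) for r in rows]
    rank, ncols = 0, len(rows[0])
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows))
                      if abs(rows[i][col]) > tol), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and abs(rows[i][col]) > tol:
                f = rows[i][col] / rows[rank][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

u = (1.0, 2.0, 3.0)                         # all inputs are multiples of u
xs = [tuple(c * ui for ui in u) for c in (1, 2, 3, 4, 5)]
w = (0.1, -0.2, 0.3)                        # current weights; targets are 0

def grad(x):
    """Gradient of the per-sample loss (w . x)^2 with respect to w."""
    r = sum(wi * xi for wi, xi in zip(w, x))
    return [2 * r * xi for xi in x]

grads = [grad(x) for x in xs]
mean = [sum(g[d] for g in grads) / len(grads) for d in range(3)]
centered = [[g[d] - mean[d] for d in range(3)] for g in grads]

# The rank of the centered-gradient matrix equals the rank of the
# gradient-noise covariance: here 1, far below the dimension 3.
print(matrix_rank(centered))
```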
We introduce a method for automated temporal segmentation of human motion data into distinct actions and compositing motion primitives based on self-similar structures in the motion sequence.
We use neighbourhood graphs for the partitioning and the similarity information in the graph is further exploited to cluster the motion primitives into larger entities of semantic significance.
The method requires no assumptions about the motion sequences at hand and no user interaction is required for the segmentation or clustering.
In addition, we introduce a feature bundling preprocessing technique to make the segmentation more robust to noise, as well as a notion of motion symmetry for more refined primitive detection.
We test our method on several sensor modalities, including markered and markerless motion capture as well as on electromyograph and accelerometer recordings.
The results highlight our system's capabilities for both segmentation and for analysis of the finer structures of motion data, all in a completely unsupervised manner.
Most companies' new business practices are based on customer data.
These practices have raised privacy concerns because of the associated risks.
Privacy laws require companies to gain customer consent before using their information, which stands as the biggest roadblock to monetising this asset.
Privacy literature suggests that reducing privacy concerns and building trust may increase individuals' intention to authorise the use of personal information.
Fair information practices (FIPs) are potential means to achieve this goal.
However, there is a lack of empirical evidence on the mechanisms through which the FIPs affect privacy concerns and trust.
This research argues that FIPs provide individuals with control, which has been found to influence privacy concerns and trust levels.
We will use an experimental design methodology to conduct the study.
The results are expected to have both theoretical and managerial implications.
Italy adopted a performance-based system for funding universities that is centered on the results of a national research assessment exercise, realized by a governmental agency (ANVUR).
ANVUR evaluated papers by using 'a dual system of evaluation', that is by informed peer review or by bibliometrics.
In view of validating that system, ANVUR performed an experiment for estimating the agreement between informed review and bibliometrics.
Ancaiani et al. (2015) present the main results of the experiment.
Baccini and De Nicolao (2017) documented in a letter, among other critical issues, that the statistical analysis was not realized on a random sample of articles.
A reply to the letter has been published by Research Evaluation (Benedetto et al. 2017).
This note highlights that the reply contains (1) errors in data, (2) problems with the 'representativeness' of the sample, (3) unverifiable claims about the weights used for calculating kappas, (4) undisclosed averaging procedures, and (5) a statement about the 'same protocol in all areas' contradicted by official reports.
Last but not least: the data used by the authors continue to be undisclosed.
A general warning concludes: many recently published papers use data originating from Italian research assessment exercise.
These data are not accessible to the scientific community and consequently these papers are not reproducible.
They can hardly be considered as containing sound evidence, at least until the authors or ANVUR disclose the data necessary for replication.
Human computer conversation is regarded as one of the most difficult problems in artificial intelligence.
In this paper, we address one of its key sub-problems, referred to as short text conversation, in which given a message from human, the computer returns a reasonable response to the message.
We leverage the vast amount of short conversation data available on social media to study the issue.
We propose formalizing short text conversation as a search problem as a first step, and employing state-of-the-art information retrieval (IR) techniques to carry out the task.
We investigate the significance as well as the limitation of the IR approach.
Our experiments demonstrate that the retrieval-based model can make the system behave rather "intelligently", when combined with a huge repository of conversation data from social media.
The effect of transport-related pollution on human health is fast becoming recognised as a major issue in cities worldwide.
Cyclists, in particular, face great risks, as they typically are most exposed to tail-pipe emissions.
Three avenues are being explored worldwide in the fight against urban pollution: (i) outright bans on polluting vehicles and embracing zero tailpipe emission vehicles; (ii) measuring air-quality as a means to better informing citizens of zones of higher pollution; and (iii) developing smart mobility devices that seek to minimize the effect of polluting devices on citizens as they transport goods and individuals in our cities.
Following this latter direction, in this paper we present a new way to protect cyclists from the effect of urban pollution.
Namely, by exploiting the actuation possibilities afforded by pedelecs or e-bikes (electric bikes), we design a cyber-physical system that mitigates the effect of urban pollution by indirectly controlling the breathing rate of cyclists in polluted areas.
Results from a real device are presented to illustrate the efficacy of our system.
Driving is a social activity: drivers often indicate their intent to change lanes via motion cues.
We consider mixed-autonomy traffic where a Human-driven Vehicle (HV) and an Autonomous Vehicle (AV) drive together.
We propose a planning framework where the degree to which the AV considers the other agent's reward is controlled by a selfishness factor.
We test our approach on a simulated two-lane highway where the AV and HV merge into each other's lanes.
In a user study with 21 subjects and 6 different selfishness factors, we found that our planning approach was sound and that both agents had shorter merging times when a factor balancing the rewards of the two agents was chosen.
Our results on double lane merging suggest that it is a non-zero-sum game and encourage further investigation of collaborative decision-making algorithms for mixed-autonomy traffic.
This paper proposes a deep cerebellar model articulation controller (DCMAC) for adaptive noise cancellation (ANC).
We expand upon the conventional CMAC by stacking single-layer CMAC models into multiple layers to form a DCMAC model and derive a modified backpropagation training algorithm to learn the DCMAC parameters.
Compared with the conventional CMAC, the DCMAC can characterize nonlinear transformations more effectively because of its deep structure.
Experimental results confirm that the proposed DCMAC model outperforms the CMAC in terms of residual noise in an ANC task, showing that DCMAC provides enhanced modeling capability based on channel characteristics.
Late Gadolinium Enhanced Cardiac MRI (LGE-CMRI) for detecting atrial scars in atrial fibrillation (AF) patients has recently emerged as a promising technique to stratify patients, guide ablation therapy and predict treatment success.
Visualisation and quantification of scar tissues require a segmentation of both the left atrium (LA) and the high intensity scar regions from LGE-CMRI images.
These two segmentation tasks are challenging due to the cancelling of healthy tissue signal, low signal-to-noise ratio and often limited image quality in these patients.
Most approaches require manual supervision and/or a second bright-blood MRI acquisition for anatomical segmentation.
Segmenting both the LA anatomy and the scar tissues automatically from a single LGE-CMRI acquisition is therefore in high demand.
In this study, we propose a novel fully automated multiview two-task (MVTT) recursive attention model that works directly on LGE-CMRI images and combines sequential learning with dilated residual learning to segment the LA (including attached pulmonary veins) and delineate the atrial scars simultaneously via an innovative attention model.
Compared to other state-of-the-art methods, the proposed MVTT achieves compelling improvement, enabling the generation of a patient-specific anatomical and atrial scar assessment model.
Emotion estimation in music listening faces the challenge of capturing the emotion variation of listeners.
Recent years have witnessed attempts to exploit multimodality, fusing information from musical contents and physiological signals captured from listeners to improve the performance of emotion recognition.
In this paper, we present a study of fusion of signals of electroencephalogram (EEG), a tool to capture brainwaves at a high-temporal resolution, and musical features at decision level in recognizing the time-varying binary classes of arousal and valence.
Our empirical results showed that the fusion could outperform emotion recognition using the EEG modality alone, which suffered from inter-subject variability, suggesting the promise of multimodal fusion for improving the accuracy of music-emotion recognition.
When training neural networks, the use of Synthetic Gradients (SG) allows layers or modules to be trained without update locking - without waiting for a true error gradient to be backpropagated - resulting in Decoupled Neural Interfaces (DNIs).
This ability to update parts of a neural network asynchronously and with only local information was demonstrated to work empirically in Jaderberg et al. (2016).
However, there has been very little demonstration of what changes DNIs and SGs impose from a functional, representational, and learning dynamics point of view.
In this paper, we study DNIs through the use of synthetic gradients on feed-forward networks to better understand their behaviour and elucidate their effect on optimisation.
We show that the incorporation of SGs does not affect the representational strength of the learning system for a neural network, and prove the convergence of the learning system for linear and deep linear models.
On practical problems we investigate the mechanism by which synthetic gradient estimators approximate the true loss, and, surprisingly, how that leads to drastically different layer-wise representations.
Finally, we also expose the relationship of using synthetic gradients to other error approximation techniques and find a unifying language for discussion and comparison.
Cluster analysis plays an important role in decision making process for many knowledge-based systems.
A wide variety of approaches exist for clustering applications, including heuristic techniques, probabilistic models, and traditional hierarchical algorithms.
In this paper, a novel heuristic approach based on big bang-big crunch algorithm is proposed for clustering problems.
The proposed method not only takes advantage of its heuristic nature to alleviate the shortcomings of typical clustering algorithms such as k-means, but also benefits from a memory-based scheme compared to similar heuristic techniques.
Furthermore, the performance of the proposed algorithm is investigated based on several benchmark test functions as well as on the well-known datasets.
The experimental results show the significant superiority of the proposed method over the similar algorithms.
In aspect-based sentiment analysis, most existing methods either focus on aspect/opinion terms extraction or aspect terms categorization.
However, each task by itself only provides partial information to end users.
To generate more detailed and structured opinion analysis, we propose a finer-grained problem, which we call category-specific aspect and opinion terms extraction.
This problem involves the identification of aspect and opinion terms within each sentence, as well as the categorization of the identified terms.
To this end, we propose an end-to-end multi-task attention model, where each task corresponds to aspect/opinion terms extraction for a specific category.
Our model benefits from exploring the commonalities and relationships among different tasks to address the data sparsity issue.
We demonstrate its state-of-the-art performance on three benchmark datasets.
Detecting controversy in general web pages is a daunting task, but increasingly essential to efficiently moderate discussions and effectively filter problematic content.
Unfortunately, controversies occur across many topics and domains, with great changes over time.
This paper investigates neural classifiers as a more robust methodology for controversy detection in general web pages.
Current models have often cast controversy detection on general web pages as a Wikipedia-linking or exact lexical matching task.
The diverse and changing nature of controversies suggests that semantic approaches are better able to detect controversy.
We train neural networks that can capture semantic information from texts using weak signal data.
By leveraging the semantic properties of word embeddings we robustly improve on existing controversy detection methods.
To evaluate model stability over time and on unseen topics, we assess model performance under varying training conditions to test cross-temporal, cross-topic, and cross-domain performance, as well as annotator congruence.
In doing so, we demonstrate that weak-signal based neural approaches are closer to human estimates of controversy and are more robust to the inherent variability of controversies.
Recent trends in targeted cyber-attacks have increased the interest of research in the field of cyber security.
Such attacks have massive disruptive effects on organizations, enterprises and governments.
Cyber kill chain is a model to describe cyber-attacks so as to develop incident response and analysis capabilities.
Cyber kill chain in simple terms is an attack chain, the path that an intruder takes to penetrate information systems over time to execute an attack on the target.
This paper broadly categorizes the methodologies, techniques and tools involved in cyber-attacks.
This paper intends to help a cyber security researcher to realize the options available to an attacker at every stage of a cyber-attack.
Patent data represent a significant source of information on innovation and the evolution of technology through networks of citations, co-invention and co-assignment of new patents.
A major obstacle to extracting useful information from this data is the problem of name disambiguation: linking alternate spellings of individuals or institutions to a single identifier to uniquely determine the parties involved in the creation of a technology.
In this paper, we describe a new algorithm that uses high-resolution geolocation to disambiguate both inventors and assignees on more than 3.6 million patents found in the European Patent Office (EPO), under the Patent Cooperation Treaty (PCT), and in the US Patent and Trademark Office (USPTO).
We show that our algorithm has both high precision and recall in comparison to a manual disambiguation of EPO assignee names in Boston and Paris, and show it performs well for a benchmark of USPTO inventor names that can be linked to a high-resolution address (but poorly for inventors that never provided a high quality address).
The most significant benefit of this work is the high quality assignee disambiguation with worldwide coverage coupled with an inventor disambiguation that is competitive with other state of the art approaches.
To our knowledge this is the broadest and most accurate simultaneous disambiguation and cross-linking of the inventor and assignee names for a significant fraction of patents in these three major patent collections.
We propose a Fourier domain asymmetric cryptosystem for multimodal biometric security.
One modality of biometrics (such as face) is used as the plaintext, which is encrypted by another modality of biometrics (such as fingerprint).
A private key is synthesized from the encrypted biometric signature by complex spatial Fourier processing.
The encrypted biometric signature is further encrypted by other biometric modalities, and the corresponding private keys are synthesized.
The resulting biometric signature is privacy protected since the encryption keys are provided by the human, and hence those are private keys.
Moreover, the decryption keys are synthesized using those private encryption keys.
The encrypted signatures are decrypted using the synthesized private keys and inverse complex spatial Fourier processing.
Computer simulations demonstrate the feasibility of the technique proposed.
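A minimal 1-D sketch of the idea, under illustrative assumptions (toy 8-sample "face" and "fingerprint" vectors, and a phase-only key standing in for the paper's complex spatial Fourier processing): the plaintext biometric is transformed to the Fourier domain, multiplied by the unit-modulus phase of the key biometric's spectrum, and decrypted by dividing out that phase before inverse transforming.

```python
# Hedged sketch of Fourier-domain encryption with a phase key derived
# from a second biometric signal. Signals and key derivation are
# illustrative assumptions, not the paper's exact scheme.
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

def phase_key(key):
    """Unit-modulus phase of the key's spectrum (the 'private key')."""
    return [cmath.exp(1j * cmath.phase(c)) for c in dft(key)]

def encrypt(plain, key):
    return [X * P for X, P in zip(dft(plain), phase_key(key))]

def decrypt(cipher, key):
    return [x.real
            for x in idft([C / P for C, P in zip(cipher, phase_key(key))])]

face = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]          # stand-in "face"
fingerprint = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # stand-in key biometric
cipher = encrypt(face, fingerprint)
recovered = decrypt(cipher, fingerprint)
```

Because the phase factors have unit modulus, decryption with the correct key inverts encryption exactly (up to floating-point error), while the ciphertext spectrum looks scrambled without it.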
HTTP/2 supersedes HTTP/1.1 to tackle the performance challenges of the modern Web.
A highly anticipated feature is Server Push, enabling servers to send data without explicit client requests, thus potentially saving time.
Although guidelines on how to use Server Push emerged, measurements have shown that it can easily be used in a suboptimal way and hurt instead of improving performance.
We thus tackle the question of whether the current Web can make better use of Server Push.
First, we enable real-world websites to be replayed in a testbed to study the effects of different Server Push strategies.
Using this, we next revisit proposed guidelines to grasp their performance impact.
Finally, based on our results, we propose a novel strategy using an alternative server scheduler that enables resources to be interleaved.
This improves the visual progress for some websites, with minor modifications to the deployment.
Still, our results highlight the limits of Server Push: a deep understanding of web engineering is required to make optimal use of it, and not every site will benefit.
Low-density parity-check (LDPC) decoders assume that the channel state information (CSI) is known and that they have the true a posteriori probability (APP) for each transmitted bit.
But in most cases of interest, the CSI needs to be estimated with the help of a short training sequence and the LDPC decoder has to decode the received word using faulty APP estimates.
In this paper, we study the uncertainty in the CSI estimate and how it affects the bit error rate (BER) output by the LDPC decoder.
To improve these APP estimates, we propose a Bayesian equalizer that takes into consideration not only the uncertainty due to the noise in the channel, but also the uncertainty in the CSI estimate, reducing the BER after the LDPC decoder.
Dense subgraph discovery is a key primitive in many graph mining applications, such as detecting communities in social networks and mining gene correlation from biological data.
Most studies on dense subgraph mining only deal with one graph.
However, in many applications, we have more than one graph describing relations among the same group of entities.
In this paper, given two graphs sharing the same set of vertices, we investigate the problem of detecting subgraphs that contrast the most with respect to density.
We call such subgraphs Density Contrast Subgraphs, or DCS in short.
Two widely used graph density measures, average degree and graph affinity, are considered.
For both density measures, mining DCS is equivalent to mining the densest subgraph from a "difference" graph, which may have both positive and negative edge weights.
Due to the existence of negative edge weights, existing dense subgraph detection algorithms cannot identify the subgraph we need.
We prove the computational hardness of mining DCS under the two graph density measures and develop efficient algorithms to find DCS.
We also conduct extensive experiments on several real-world datasets to evaluate our algorithms.
The experimental results show that our algorithms are both effective and efficient.
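The reduction described above can be sketched under illustrative assumptions: build the difference graph with edge weights w_A - w_B, then search it for a dense subgraph under the average-degree measure. The greedy peeling step below is a standard heuristic stand-in, not the authors' algorithm, which handles the negative weights with guarantees; the toy graphs are assumptions.

```python
# Hedged sketch: Density Contrast Subgraphs via a "difference" graph.

def difference_graph(edges_a, edges_b):
    """Edge-weight maps {frozenset({u, v}): w}; returns weights w_A - w_B."""
    diff = dict(edges_a)
    for e, w in edges_b.items():
        diff[e] = diff.get(e, 0.0) - w
    return diff

def score(alive, diff):
    """Average-degree density: total edge weight over number of vertices."""
    total = sum(w for e, w in diff.items() if e <= alive)
    return total / len(alive)

def greedy_peel(vertices, diff):
    """Repeatedly remove the vertex of smallest weighted degree; return the
    best intermediate vertex set by average-degree density (heuristic)."""
    def degree(v, alive):
        return sum(w for e, w in diff.items() if v in e and e <= alive)
    alive = set(vertices)
    best, best_score = set(alive), score(alive, diff)
    while len(alive) > 1:
        v = min(alive, key=lambda u: degree(u, alive))
        alive.discard(v)
        s = score(alive, diff)
        if s > best_score:
            best, best_score = set(alive), s
    return best, best_score

# Graph A is dense on {1, 2, 3}; graph B is dense on {3, 4, 5}.
edges_a = {frozenset({1, 2}): 1.0, frozenset({1, 3}): 1.0, frozenset({2, 3}): 1.0}
edges_b = {frozenset({3, 4}): 1.0, frozenset({3, 5}): 1.0, frozenset({4, 5}): 1.0}
diff = difference_graph(edges_a, edges_b)
best, best_score = greedy_peel({1, 2, 3, 4, 5}, diff)
print(best, best_score)   # {1, 2, 3} contrasts most in density
```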
Training a Deep Neural Network (DNN) from scratch requires a large amount of labeled data.
For a classification task where only a small amount of training data is available, a common solution is to perform fine-tuning on a DNN that is pre-trained with related source data.
This consecutive training process is time consuming and does not consider explicitly the relatedness between different source and target tasks.
In this paper, we propose a novel method to jointly fine-tune a Deep Neural Network with source data and target data.
By adding an Optimal Transport loss (OT loss) between source and target classifier predictions as a constraint on the source classifier, the proposed Joint Transfer Learning Network (JTLN) can effectively learn useful knowledge for target classification from source data.
Furthermore, by using different kinds of metrics as the cost matrix for the OT loss, JTLN can incorporate different prior knowledge about the relatedness between target categories and source categories.
We carried out experiments with JTLN based on Alexnet on image classification datasets and the results verify the effectiveness of the proposed JTLN in comparison with standard consecutive fine-tuning.
This joint transfer learning with an OT loss is general and can also be applied to other kinds of neural networks.
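The abstract does not specify the OT formulation; as a generic sketch, an entropy-regularized OT distance between two toy class distributions can be computed with Sinkhorn iterations as follows (the cost matrix, marginals, and regularization weight are illustrative assumptions, not JTLN's actual configuration):

```python
# Hedged sketch: entropy-regularized optimal transport via Sinkhorn.
import math

def sinkhorn(cost, a, b, reg=0.1, iters=200):
    """Returns the transport plan and OT cost between marginals a and b."""
    K = [[math.exp(-c / reg) for c in row] for row in cost]  # Gibbs kernel
    n, m = len(a), len(b)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    ot = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, ot

# Toy source and target class distributions; matching classes cost 0.
a = [0.5, 0.5]
b = [0.5, 0.5]
cost = [[0.0, 1.0],
        [1.0, 0.0]]
plan, ot = sinkhorn(cost, a, b)
print(ot)   # near zero: mass flows along the cheap diagonal
```

In a JTLN-like setting, `cost` would encode prior knowledge about the relatedness of source and target categories, and the resulting OT value would act as a loss term constraining the source classifier.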
People's interests and people's social relationships are intuitively connected, but understanding their interplay and whether they can help predict each other has remained an open question.
We examine the interface of two decisive structures forming the backbone of online social media: the graph structure of social networks - who connects with whom - and the set structure of topical affiliations - who is interested in what.
In studying this interface, we identify key relationships whereby each of these structures can be understood in terms of the other.
The context for our analysis is Twitter, a complex social network of both follower relationships and communication relationships.
On Twitter, "hashtags" are used to label conversation topics, and we examine hashtag usage alongside these social structures.
We find that the hashtags that users adopt can predict their social relationships, and also that the social relationships between the initial adopters of a hashtag can predict the future popularity of that hashtag.
By studying weighted social relationships, we observe that while strong reciprocated ties are the easiest to predict from hashtag structure, they are also much less useful than weak directed ties for predicting hashtag popularity.
Importantly, we show that computationally simple structural determinants can provide remarkable performance in both tasks.
While our analyses focus on Twitter, we view our findings as broadly applicable to topical affiliations and social relationships in a host of diverse contexts, including the movies people watch, the brands people like, or the locations people frequent.
This paper concerns the maximum coding rate at which data can be transmitted over a noncoherent, single-antenna, Rayleigh block-fading channel using an error-correcting code of a given blocklength with a block-error probability not exceeding a given value.
A high-SNR normal approximation of the maximum coding rate is presented that becomes accurate as the signal-to-noise ratio (SNR) and the number of coherence intervals L over which we code tend to infinity.
Numerical analyses suggest that the approximation is accurate already at SNR values of 15 dB and when the number of coherence intervals is 10 or more.
Firewalls have long been in use to protect local networks from threats of the larger Internet.
Although firewalls are effective in preventing attacks initiated from outside, they are vulnerable to insider threats, e.g., malicious insiders may access and alter firewall configurations, and disable firewall services.
In this paper, we develop an innovative distributed architecture to obliviously manage and evaluate firewalls to prevent both insider and external attacks oriented to the firewalls.
Our proposed structure alleviates these issues by obfuscating the firewall rules or policies themselves, then distributing the function of evaluating these rules across multiple servers.
Thus, both accessing and altering the rules are considerably more difficult thereby providing better protection to the local network as well as greater security for the firewall itself.
We achieve this by integrating multiple areas of research such as secret sharing schemes and multi-party computation, as well as Bloom filters and Byzantine agreement protocols.
Our resulting solution is an efficient and secure means by which a firewall may be distributed and obfuscated while maintaining the ability for multiple servers to obliviously evaluate its functionality.
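A minimal sketch of the core idea of combining a Bloom filter with secret sharing: the rule set is encoded as a Bloom filter whose bits are XOR-shared across servers, so no single server learns the rules, yet the servers can jointly answer membership queries. This is an illustrative toy, not the paper's actual protocol (which further uses multi-party computation and Byzantine agreement); all names and parameters are hypothetical.

```python
import hashlib
import secrets

M, K = 4096, 3  # Bloom filter size and number of hash functions (illustrative)

def positions(item):
    # K index positions derived from independent hashes of the item
    return [int(hashlib.sha256(f"{i}:{item}".encode()).hexdigest(), 16) % M
            for i in range(K)]

def build_filter(blocked_ips):
    bits = [0] * M
    for ip in blocked_ips:
        for p in positions(ip):
            bits[p] = 1
    return bits

def share_filter(bits, n_servers=3):
    # XOR secret sharing: any single server's share is uniformly random,
    # so it reveals nothing about the firewall rules on its own
    shares = [[secrets.randbits(1) for _ in range(M)] for _ in range(n_servers - 1)]
    last = list(bits)
    for s in shares:
        last = [a ^ b for a, b in zip(last, s)]
    return shares + [last]

def is_blocked(shares, ip):
    # Each server contributes only its share bits at the queried positions;
    # XOR-combining the responses reconstructs the Bloom filter lookup
    hit = True
    for p in positions(ip):
        bit = 0
        for s in shares:
            bit ^= s[p]
        hit = hit and bool(bit)
    return hit

shares = share_filter(build_filter({"10.0.0.1", "192.168.1.7"}))
print(is_blocked(shares, "10.0.0.1"))  # True
print(is_blocked(shares, "8.8.8.8"))   # False (up to Bloom false positives)
```

A Bloom filter answers membership queries probabilistically (rare false positives, no false negatives), which is why it pairs naturally with distributed evaluation: the servers only ever exchange individual share bits.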
In this work we present a deep learning framework for video compressive sensing.
The proposed formulation enables recovery of video frames in a few seconds at significantly improved reconstruction quality compared to previous approaches.
Our investigation starts by learning a linear mapping between video sequences and corresponding measured frames which turns out to provide promising results.
We then extend the linear formulation to deep fully-connected networks and explore the performance gains using deeper architectures.
Our analysis is always driven by the applicability of the proposed framework on existing compressive video architectures.
Extensive simulations on several video sequences document the superiority of our approach both quantitatively and qualitatively.
Finally, our analysis offers insights into understanding how dataset sizes and number of layers affect reconstruction performance while raising a few points for future investigation.
Code is available at Github: https://github.com/miliadis/DeepVideoCS
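The linear-mapping baseline described above can be sketched in a few lines: given compressive measurements y = Phi x, a linear decoder W is fitted by least squares on training pairs. This is a toy numpy illustration with synthetic data and made-up dimensions, not the released code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, T = 64, 16, 500               # frame dim, measurement dim, training samples (toy)
Phi = rng.standard_normal((m, n))   # random sensing matrix (assumed known)

X = rng.standard_normal((n, T))     # stand-in for vectorized video frames
Y = Phi @ X                         # compressive measurements

# Learn a linear decoder W minimizing ||W Y - X||_F by least squares
W = X @ np.linalg.pinv(Y)

X_hat = W @ X_reco if False else W @ Y  # reconstruct training frames
err = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
print(f"relative training error: {err:.3f}")
```

The residual of a least-squares fit is orthogonal to the measurement subspace, which is one way to check the decoder is optimal for the linear model; the deep fully-connected networks in the abstract replace W with a nonlinear mapping.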
As examples such as the Monty Hall puzzle show, applying conditioning to update a probability distribution on a "naive space", which does not take into account the protocol used, can often lead to counterintuitive results.
Here we examine why.
A criterion known as CAR (coarsening at random) in the statistical literature characterizes when "naive" conditioning in a naive space works.
We show that the CAR condition holds rather infrequently.
We then consider more generalized notions of update such as Jeffrey conditioning and minimizing relative entropy (MRE).
We give a generalization of the CAR condition that characterizes when Jeffrey conditioning leads to appropriate answers, but show that there are no such conditions for MRE.
This generalizes and interconnects previous results obtained in the literature on CAR and MRE.
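The Monty Hall failure of naive conditioning mentioned above is easy to check by simulation (a sketch only; the paper's treatment is formal, via the CAR condition). Naive conditioning on the event "the prize is not behind the opened door" would predict probability 1/2 for the contestant's door, but the protocol-aware answer is 1/3.

```python
import random

def trial(rng):
    prize = rng.randrange(3)
    pick = 0                       # contestant always picks door 0
    # Protocol: host opens a door that is neither the pick nor the prize,
    # choosing uniformly when both remaining doors are openable
    openable = [d for d in range(3) if d != pick and d != prize]
    opened = rng.choice(openable)
    return prize, opened

rng = random.Random(42)
results = [trial(rng) for _ in range(100_000)]

# Condition on the actual observation "host opened door 2"
cond = [prize for prize, opened in results if opened == 2]
p_stay = sum(p == 0 for p in cond) / len(cond)
print(f"P(prize behind pick | host opened door 2) ~ {p_stay:.3f}")  # ~ 1/3, not 1/2
```

The discrepancy arises exactly because the naive space ignores the host's protocol, which is the phenomenon the CAR condition characterizes.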
Existing region-based object detectors are limited to regions with fixed box geometry to represent objects, even if those are highly non-rectangular.
In this paper we introduce DP-FCN, a deep model for object detection which explicitly adapts to shapes of objects with deformable parts.
Without additional annotations, it learns to focus on discriminative elements and to align them, and simultaneously brings more invariance for classification and geometric information to refine localization.
DP-FCN is composed of three main modules: a Fully Convolutional Network to efficiently maintain spatial resolution, a deformable part-based RoI pooling layer to optimize positions of parts and build invariance, and a deformation-aware localization module explicitly exploiting displacements of parts to improve accuracy of bounding box regression.
We experimentally validate our model and show significant gains.
DP-FCN achieves state-of-the-art performance of 83.1% and 80.9% on PASCAL VOC 2007 and 2012 with VOC data only.
Although outdoor navigation systems are mostly dependent on GPS, indoor systems have to rely upon different techniques for localizing the user, due to unavailability of GPS signals in indoor environments.
Over the past decade various indoor navigation systems have been developed.
In this paper, an overview of some existing indoor navigation systems for visually impaired people is presented, and the systems are compared from different perspectives.
The evaluated techniques are ultrasonic systems, RFID-based solutions, computer vision aided navigation systems, and smartphone-based applications.
We present a novel method for high detail-preserving human avatar creation from monocular video.
A parameterized body model is refined and optimized to maximally resemble subjects from a video showing them from all sides.
Our avatars feature a natural face, hairstyle, clothes with garment wrinkles, and high-resolution texture.
Our paper contributes facial landmark and shading-based human body shape refinement, a semantic texture prior, and a novel texture stitching strategy, resulting in the most sophisticated-looking human avatars obtained from a single video to date.
Numerous results show the robustness and versatility of our method.
A user study illustrates its superiority over the state-of-the-art in terms of identity preservation, level of detail, realism, and overall user preference.
Most tabular data visualization techniques focus on overviews, yet many practical analysis tasks are concerned with investigating individual items of interest.
At the same time, relating an item to the rest of a potentially large table is important.
In this work we present Taggle, a tabular visualization technique for exploring and presenting large and complex tables.
Taggle takes an item-centric, spreadsheet-like approach, visualizing each row in the source data individually using visual encodings for the cells.
At the same time, Taggle introduces data-driven aggregation of data subsets.
The aggregation strategy is complemented by interaction methods tailored to answer specific analysis questions, such as sorting based on multiple columns and rich data selection and filtering capabilities.
We evaluate Taggle using a qualitative user study and a case study conducted by a domain expert on complex genomics data analysis for the purpose of drug discovery.
A compiler approach for generating low-level computer code from high-level input for discontinuous Galerkin finite element forms is presented.
The input language mirrors conventional mathematical notation, and the compiler generates efficient code in a standard programming language.
This facilitates the rapid generation of efficient code for general equations in varying spatial dimensions.
Key concepts underlying the compiler approach and the automated generation of computer code are elaborated.
The approach is demonstrated for a range of common problems, including the Poisson, biharmonic, advection-diffusion and Stokes equations.
In this paper, we propose, in the Dezert-Smarandache Theory (DSmT) framework, a new probabilistic transformation, called DSmP, in order to build a subjective probability measure from any basic belief assignment defined on any model of the frame of discernment.
Several examples are given to show how the DSmP transformation works, and we compare it to the main existing transformations proposed in the literature so far.
We show the advantages of DSmP over classical transformations in terms of Probabilistic Information Content (PIC).
The direct extension of this transformation for dealing with qualitative belief assignments is also presented.
When modeling geo-spatial data, it is critical to capture spatial correlations for achieving high accuracy.
Spatial Auto-Regression (SAR) is a common tool used to model such data, where the spatial contiguity matrix (W) encodes the spatial correlations.
However, the efficacy of SAR is limited by two factors.
First, it depends on the choice of contiguity matrix, which is typically not learnt from data, but instead, is assumed to be known a priori.
Second, it assumes that the observations can be explained by linear models.
In this paper, we propose a Convolutional Neural Network (CNN) framework to model geo-spatial data (specifically housing prices), to learn the spatial correlations automatically.
We show that neighborhood information embedded in satellite imagery can be leveraged to achieve the desired spatial smoothing.
An additional upside of our framework is the relaxation of linear assumption on the data.
Specific challenges we tackle while implementing our framework include: (i) how much of the neighborhood is relevant while estimating housing prices?
(ii) what is the right approach to capture multiple resolutions of satellite imagery? and (iii) what other data-sources can help improve the estimation of spatial correlations?
We demonstrate a marked improvement of 57% on top of the SAR baseline through the use of features from deep neural networks for the cities of London, Birmingham and Liverpool.
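The SAR baseline that the CNN framework is compared against can be sketched as follows: observations satisfy y = rho*W*y + X*beta + eps, where the contiguity matrix W encodes which locations are neighbors. This is a toy numpy illustration with a made-up ring-shaped W, precisely the kind of a priori assumption the abstract argues against.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 3
rho = 0.4                                   # spatial autocorrelation strength

# Toy contiguity matrix W: each location's two ring neighbors, row-normalized
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

X = rng.standard_normal((n, k))             # covariates (e.g., house features)
beta = np.array([1.0, -2.0, 0.5])
eps = 0.1 * rng.standard_normal(n)

# SAR: y = rho W y + X beta + eps  =>  y = (I - rho W)^{-1} (X beta + eps)
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)
print(np.allclose(y, rho * W @ y + X @ beta + eps))  # True
```

The reduced form makes the two limitations explicit: the model is linear in X, and its quality hinges entirely on the hand-chosen W; the CNN framework sidesteps both by learning spatial structure from imagery.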
Empirical software engineering has received much attention in recent years and has marked the shift from a more design-science-driven engineering discipline to an insight-oriented, theory-centric one.
Yet, we still face many challenges, among which some increase the need for interdisciplinary research.
This is especially true for the investigation of human-centric aspects of software engineering.
Although we can already observe an increased recognition of the need for more interdisciplinary research in (empirical) software engineering, such research configurations come with challenges barely discussed from a scientific point of view.
In this position paper, we critically reflect upon the epistemological setting of empirical software engineering and elaborate its configuration as an Interdiscipline.
In particular, we (1) elaborate a pragmatic view on empirical research for software engineering reflecting a cyclic process for knowledge creation, (2) motivate a path towards symmetrical interdisciplinary research, and (3) adopt five rules of thumb from other interdisciplinary collaborations in our field before concluding with new emerging challenges.
This shall support treating empirical software engineering not as a developing discipline moving towards a paradigmatic stage of normal science, but as a configuration of symmetric interdisciplinary teams and research methods.
How does the collaboration network of researchers coalesce around a scientific topic?
What sort of social restructuring occurs as a new field develops?
Previous empirical explorations of these questions have examined the evolution of co-authorship networks associated with several fields of science, each noting a characteristic shift in network structure as fields develop.
Historically, however, such studies have tended to rely on manually annotated datasets and therefore only consider a handful of disciplines, calling into question the universality of the observed structural signature. To overcome this limitation and test the robustness of this phenomenon, we use a comprehensive dataset of over 189,000 scientific articles and develop a framework for partitioning articles and their authors into coherent, semantically-related groups representing scientific fields of varying size and specificity.
We then use the resulting population of fields to study the structure of evolving co-authorship networks.
Consistent with earlier findings, we observe a global topological transition as the co-authorship networks coalesce from a disjointed aggregate into a dense giant connected component that dominates the network.
We validate these results using a separate, complementary corpus of scientific articles, and, overall, we find that the previously reported characteristic structural evolution of a scientific field's associated co-authorship network is robust across a large number of scientific fields of varying size, scope, and specificity.
Additionally, the framework developed in this study may be used in other scientometric contexts in order to extend studies to compare across a larger range of scientific disciplines.
Processing of multi-word expressions (MWEs) is a known problem for any natural language processing task.
Even neural machine translation (NMT) struggles to overcome it.
This paper presents results of experiments on investigating NMT attention allocation to the MWEs and improving automated translation of sentences that contain MWEs in English->Latvian and English->Czech NMT systems.
Two improvement strategies were explored: (1) bilingual pairs of automatically extracted MWE candidates were added to the parallel corpus used to train the NMT system, and (2) full sentences containing the automatically extracted MWE candidates were added to the parallel corpus.
Both approaches increased automated evaluation scores.
The best result - a 0.99 BLEU point increase - was reached with the first approach, while the second approach achieved only minimal improvements.
We also provide open-source software and tools used for MWE extraction and alignment inspection.
Tax manipulation comes in a variety of forms with different motivations and of varying complexities.
In this paper, we deal with a specific technique used by tax-evaders known as circular trading.
In particular, we define algorithms for the detection and analysis of circular trade.
To achieve this, we have modelled the whole system as a directed graph with the actors being vertices and the transactions among them as directed edges.
We illustrate the results obtained after running the proposed algorithm on the commercial tax dataset of the government of Telangana, India, which contains the transaction details of a set of participants involved in a known circular trade.
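A minimal sketch of the graph formulation: actors are vertices, transactions are directed edges, and a circular trade shows up as a directed cycle. The DFS below is a generic cycle finder for illustration, not the paper's detection algorithm; actor names are hypothetical.

```python
def find_cycle(edges):
    """Return one directed cycle (as a list of vertices) or None.

    edges: iterable of (seller, buyer) transaction pairs.
    """
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])

    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}
    stack = []

    def dfs(u):
        color[u] = GRAY
        stack.append(u)
        for v in graph[u]:
            if color[v] == GRAY:                 # back edge closes a cycle
                return stack[stack.index(v):] + [v]
            if color[v] == WHITE:
                cycle = dfs(v)
                if cycle:
                    return cycle
        color[u] = BLACK
        stack.pop()
        return None

    for u in graph:
        if color[u] == WHITE:
            cycle = dfs(u)
            if cycle:
                return cycle
    return None

txns = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(find_cycle(txns))  # ['A', 'B', 'C', 'A']
```

Real circular-trade analysis additionally has to weigh transaction amounts and timing, which is where the paper's algorithms go beyond plain cycle detection.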
In this paper, we present Arap-Tweet, which is a large-scale and multi-dialectal corpus of Tweets from 11 regions and 16 countries in the Arab world representing the major Arabic dialectal varieties.
To build this corpus, we collected data from Twitter and we provided a team of experienced annotators with annotation guidelines that they used to annotate the corpus for age categories, gender, and dialectal variety.
During the data collection effort, we based our search on distinctive keywords that are specific to the different Arabic dialects, and we also validated the location using the Twitter API.
In this paper, we report on the corpus data collection and annotation efforts.
We also present some issues that we encountered during these phases.
Then, we present the results of the evaluation performed to ensure the consistency of the annotation.
The provided corpus will enrich the limited set of available language resources for Arabic and will be an invaluable enabler for developing author profiling tools and NLP tools for Arabic.
Deep Neural Networks have been shown to succeed at a range of natural language tasks such as machine translation and text summarization.
While tasks on source code (i.e., formal languages) have been considered recently, most work in this area does not attempt to capitalize on the unique opportunities offered by its known syntax and structure.
In this work, we introduce SmartPaste, a first task that requires using such information.
The task is a variant of the program repair problem that requires adapting a given (pasted) snippet of code to surrounding, existing source code.
As first solutions, we design a set of deep neural models that learn to represent the context of each variable location and variable usage in a data flow-sensitive way.
Our evaluation suggests that our models can learn to solve the SmartPaste task in many cases, achieving 58.6% accuracy, while learning meaningful representation of variable usages.
Despite the performance advantages of modern sampling-based motion planners, solving high dimensional planning problems in near real-time remains a challenge.
Applications include hyper-redundant manipulators, snake-like and humanoid robots.
Based on the intuition that many of these problem instances do not require the robots to exercise every degree of freedom independently, we introduce an enhancement to popular sampling-based planning algorithms aimed at circumventing the exponential dependence on dimensionality.
We propose beginning the search in a lower dimensional subspace of the configuration space in the hopes that a simple solution will be found quickly.
After a certain number of samples are generated, if no solution is found, we increase the dimension of the search subspace by one and continue sampling in the higher dimensional subspace.
In the worst case, the search subspace expands to include the full configuration space, making the completeness properties identical to the underlying sampling-based planner.
Our experiments comparing the enhanced and traditional versions of RRT, RRT-Connect, and Bidirectional T-RRT on both a planar hyper-redundant manipulator and the Baxter humanoid robot indicate that a solution is typically found much faster using this approach, and the run time appears to be less sensitive to the dimension of the full configuration space.
We explore important implementation issues in the sampling process and discuss its limitations.
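The sampling schedule described above can be sketched as follows: only the first few coordinates are sampled freely while the rest stay locked, and the active subspace grows by one dimension whenever a sample budget is exhausted without a solution. This is a schematic, with a pluggable `try_solve` standing in for the underlying sampling-based planner; all names are illustrative.

```python
import random

def sample_config(q_start, active_dims, rng):
    """Sample only the first `active_dims` coordinates uniformly; the
    remaining coordinates are locked to the start configuration."""
    q = list(q_start)
    for i in range(active_dims):
        q[i] = rng.uniform(-3.14, 3.14)
    return q

def plan(q_start, q_goal, try_solve, dof, samples_per_level=100, rng=None):
    # q_goal would seed goal-biased sampling in a real planner; the dummy
    # try_solve below only inspects the sampled batch.
    rng = rng or random.Random(0)
    # Start in a low-dimensional subspace; widen it one dimension at a
    # time until the planner succeeds or the full C-space is reached.
    for active_dims in range(1, dof + 1):
        batch = [sample_config(q_start, active_dims, rng)
                 for _ in range(samples_per_level)]
        path = try_solve(batch)
        if path is not None:
            return path, active_dims
    return None, dof
```

In the worst case the loop reaches `active_dims == dof`, i.e., the full configuration space, which is why the completeness guarantee of the underlying planner is preserved.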
We present a complexity reduction algorithm for a family of parameter-dependent linear systems when the system parameters belong to a compact semi-algebraic set.
This algorithm potentially describes the underlying dynamical system with fewer parameters or state variables.
To do so, it minimizes the distance (i.e., H-infinity-norm of the difference) between the original system and its reduced version.
We present a sub-optimal solution to this problem using sum-of-squares optimization methods.
We present the results for both continuous-time and discrete-time systems.
Lastly, we illustrate the applicability of our proposed algorithm on numerical examples.
Over the past years, literature has shown that attacks exploiting the microarchitecture of modern processors pose a serious threat to the privacy of mobile phone users.
This is because applications leave distinct footprints in the processor, which can be used by malware to infer user activities.
In this work, we show that these inference attacks are considerably more practical when combined with advanced AI techniques.
In particular, we focus on profiling the activity in the last-level cache (LLC) of ARM processors.
We employ a simple Prime+Probe based monitoring technique to obtain cache traces, which we classify with Deep Learning methods including Convolutional Neural Networks.
We demonstrate our approach on an off-the-shelf Android phone by launching a successful attack from an unprivileged, zero-permission App in well under a minute.
The App thereby detects running applications with an accuracy of 98% and reveals opened websites and streaming videos by monitoring the LLC for at most 6 seconds.
This is possible, since Deep Learning compensates for measurement disturbances stemming from the inherently noisy LLC monitoring and unfavorable cache characteristics such as random line replacement policies.
In summary, our results show that thanks to advanced AI techniques, inference attacks are becoming alarmingly easy to implement and execute in practice.
This once more calls for countermeasures that confine microarchitectural leakage and protect mobile phone applications, especially those valuing the privacy of their users.
XML access control policies involving updates may contain security flaws, here called inconsistencies, in which a forbidden operation may be simulated by performing a sequence of allowed operations.
This paper investigates the problem of deciding whether a policy is consistent, and if not, how its inconsistencies can be repaired.
We consider policies expressed in terms of annotated DTDs defining which operations are allowed or denied for the XML trees that are instances of the DTD.
We show that consistency is decidable in PTIME for such policies and that consistent partial policies can be extended to unique "least-privilege" consistent total policies.
We also consider repair problems based on deleting privileges to restore consistency, show that finding minimal repairs is NP-complete, and give heuristics for finding repairs.
Housing costs have a significant impact on individuals, families, businesses, and governments.
Recently, online companies such as Zillow have developed proprietary systems that provide automated estimates of housing prices without the immediate need of professional appraisers.
Yet, our understanding of what drives the value of houses is very limited.
In this paper, we use multiple sources of data to disentangle the economic contribution of the neighborhood's characteristics such as walkability and security perception.
We also develop and release a framework able to nowcast housing prices from open data, without the need for historical transactions.
Experiments involving 70,000 houses in 8 Italian cities highlight that the neighborhood's vitality and walkability seem to drive more than 20% of the housing value.
Moreover, the use of this information improves the nowcast by 60%.
Hence, the use of property's surroundings' characteristics can be an invaluable resource to appraise the economic and social value of houses after neighborhood changes and, potentially, anticipate gentrification.
Instance segmentation has attracted recent attention in computer vision and existing methods in this domain mostly have an object detection stage.
In this paper, we study the intrinsic challenge of the instance segmentation problem, the presence of a quotient space (swapping the labels of different instances leads to the same result), and propose new methods that are object proposal- and object detection- free.
We propose three alternative methods, namely pixel-based affinity mapping, superpixel-based affinity learning, and boundary-based component segmentation, all focusing on performing labeling transformations to cope with the quotient space problem.
By adopting fully convolutional neural networks (FCN) like models, our framework attains competitive results on both the PASCAL dataset (object-centric) and the Gland dataset (texture-centric), which the existing methods are not able to do.
Our work also has advantages in its transparency, simplicity, and fully segmentation-based design.
A unified method for extracting geometric shape features from binary image data using a steady state partial differential equation (PDE) system as a boundary value problem is presented in this paper.
The PDE and functions are formulated to extract the thickness, orientation, and skeleton simultaneously.
The main advantages of the proposed method are that the orientation is defined without derivatives and that the thickness computation does not impose a topological constraint on the target shape.
A one-dimensional analytical solution is provided to validate the proposed method.
In addition, two- and three-dimensional numerical examples are presented to confirm the usefulness of the proposed method.
Estimating scene flow in RGB-D videos is attracting much interest from computer vision researchers, due to its potential applications in robotics.
The state-of-the-art techniques for scene flow estimation typically rely on the knowledge of scene structure of the frame and the correspondence between frames.
However, with the increasing amount of RGB-D data captured from sophisticated sensors like Microsoft Kinect, and the recent advances in deep learning, the introduction of an efficient deep learning technique for scene flow estimation is becoming important.
This paper introduces a first effort to apply a deep learning method for direct estimation of scene flow by presenting a fully convolutional neural network with an encoder-decoder (ED) architecture.
The proposed network SceneEDNet involves estimation of three dimensional motion vectors of all the scene points from sequence of stereo images.
The training for direct estimation of scene flow is done using consecutive pairs of stereo images and corresponding scene flow ground truth.
The proposed architecture is applied to a large dataset and provides meaningful results.
Deep Learning is increasingly being adopted by industry for computer vision applications running on embedded devices.
While Convolutional Neural Networks' accuracy has achieved a mature and remarkable state, inference latency and throughput are a major concern especially when targeting low-cost and low-power embedded platforms.
CNNs' inference latency may become a bottleneck for Deep Learning adoption by industry, as it is a crucial specification for many real-time processes.
Furthermore, deployment of CNNs across heterogeneous platforms presents major compatibility issues due to vendor-specific technology and acceleration libraries.
In this work, we present QS-DNN, a fully automatic search based on Reinforcement Learning which, combined with an inference engine optimizer, efficiently explores through the design space and empirically finds the optimal combinations of libraries and primitives to speed up the inference of CNNs on heterogeneous embedded devices.
We show that an optimized combination can achieve a 45x speedup in inference latency on CPU compared to a dependency-free baseline and 2x on average on GPGPU compared to the best vendor library.
Further, we demonstrate that the quality of results and time-to-solution are much better than with Random Search, achieving up to 15x better results for a short-time search.
In this paper we consider the task of recognizing human actions in realistic video where human actions are dominated by irrelevant factors.
We first study the benefits of removing non-action video segments, which are the ones that do not portray any human action.
We then learn a non-action classifier and use it to down-weight irrelevant video segments.
The non-action classifier is trained using ActionThread, a dataset with shot-level annotation for the occurrence or absence of a human action.
The non-action classifier can be used to identify non-action shots with high precision and subsequently used to improve the performance of action recognition systems.
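The down-weighting step described above can be sketched as a weighted pooling of per-shot scores: each shot's action score is scaled by the probability that the shot actually portrays an action. This is a schematic illustration; the function name and all numbers are made up.

```python
import numpy as np

def reweighted_video_score(action_scores, non_action_probs):
    """Aggregate per-shot action scores, down-weighting shots that the
    non-action classifier flags as unlikely to contain any action."""
    w = 1.0 - np.asarray(non_action_probs)   # weight = P(shot shows an action)
    s = np.asarray(action_scores)
    return float((w * s).sum() / w.sum())

scores = [0.9, 0.2, 0.8, 0.1]          # per-shot action-classifier scores
non_action = [0.05, 0.95, 0.10, 0.90]  # per-shot non-action probabilities
print(reweighted_video_score(scores, non_action))
```

Shots the non-action classifier confidently rejects contribute almost nothing to the video-level score, so the aggregate is dominated by the segments that actually portray the action.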
Research on generative models is a central project in the emerging field of network science, and it studies how statistical patterns found in real networks could be generated by formal rules.
Output from these generative models is then the basis for designing and evaluating computational methods on networks, and for verification and simulation studies.
During the last two decades, a variety of models has been proposed with an ultimate goal of achieving comprehensive realism for the generated networks.
In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how ReCoN and some existing models can be fitted to an original network to produce a structurally similar replica; (c) use ReCoN to produce networks much larger than the original exemplar; and finally (d) discuss open problems and promising research directions.
In a comparative experimental study, we find that ReCoN is often superior to many other state-of-the-art network generation methods.
We argue that ReCoN is a scalable and effective tool for modeling a given network while preserving important properties at both micro- and macroscopic scales, and for scaling the exemplar data by orders of magnitude in size.
Molecular communication via diffusion (MCvD) is a molecular communication method that utilizes the free diffusion of carrier molecules to transfer information at the nano-scale.
Due to the random propagation of carrier molecules, inter-symbol interference (ISI) is a major issue in an MCvD system.
Alongside ISI, inter-link interference (ILI) is also an issue that increases the total interference for MCvD-based multiple-input-multiple-output (MIMO) approaches.
Inspired by the antenna index modulation (IM) concept in traditional communication systems, this paper introduces novel IM-based transmission schemes for MCvD systems.
In the paper, molecular space shift keying (MSSK) is proposed as a novel modulation for molecular MIMO systems, and it is found that this method combats ISI and ILI considerably better than existing MIMO approaches.
For nano-machines that have access to two different molecules, the direct extension of MSSK, quadrature molecular space shift keying (QMSSK) is also proposed.
QMSSK is found to combat ISI considerably well whilst not performing well against ILI-caused errors.
In order to combat ILI more effectively, another dual-molecule-based novel modulation scheme called the molecular spatial modulation (MSM) is proposed.
Combined with the Gray mapping imposed on the antenna indices, MSM is observed to yield reliable error rates for molecular MIMO systems.
Motion planning is a key tool that allows robots to navigate through an environment without collisions.
The problem of robot motion planning has been studied in great detail over the last several decades, with researchers initially focusing on systems such as planar mobile robots and low degree-of-freedom (DOF) robotic arms.
The increased use of high DOF robots that must perform tasks in real time in complex dynamic environments spurs the need for fast motion planning algorithms.
In this overview, we discuss several types of strategies for motion planning in high dimensional spaces and dissect some of them, namely grid-search-based, sampling-based, and trajectory-optimization-based approaches.
We compare them and outline their advantages and disadvantages, and finally, provide insight into future research opportunities.
Low-rank learning has attracted much attention recently due to its efficacy in a rich variety of real-world tasks, e.g., subspace segmentation and image categorization.
Most low-rank methods are incapable of capturing low-dimensional subspace for supervised learning tasks, e.g., classification and regression.
This paper aims to learn both the discriminant low-rank representation (LRR) and the robust projecting subspace in a supervised manner.
To achieve this goal, we cast the problem into a constrained rank minimization framework by adopting the least squares regularization.
Naturally, the data label structure tends to resemble that of the corresponding low-dimensional representation, which is derived from the robust subspace projection of clean data by low-rank learning.
Moreover, the low-dimensional representation of original data can be paired with some informative structure by imposing an appropriate constraint, e.g., Laplacian regularizer.
Therefore, we propose a novel constrained LRR method.
The objective function is formulated as a constrained nuclear norm minimization problem, which can be solved by the inexact augmented Lagrange multiplier algorithm.
Extensive experiments on image classification, human pose estimation, and robust face recovery have confirmed the superiority of our method.
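The inexact augmented Lagrange multiplier solver mentioned above repeatedly applies the proximal operator of the nuclear norm, i.e., singular value thresholding. A numpy sketch of that core subroutine (a generic illustration, not the paper's full constrained-LRR solver):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: the proximal operator of the nuclear
    norm, U * shrink(S, tau) * Vt. This is the core per-iteration update
    inside inexact-ALM solvers for nuclear-norm minimization."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    S_shrunk = np.maximum(S - tau, 0.0)
    return U @ np.diag(S_shrunk) @ Vt

rng = np.random.default_rng(0)
L = rng.standard_normal((30, 4)) @ rng.standard_normal((4, 30))  # rank-4 matrix
noisy = L + 0.01 * rng.standard_normal((30, 30))

denoised = svt(noisy, tau=1.0)
print(np.linalg.matrix_rank(denoised, tol=1e-8))  # recovers rank 4
```

Thresholding suppresses the small singular values introduced by noise while retaining the dominant ones, which is exactly why nuclear-norm minimization yields low-rank representations.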
Inspired by the principles of speed reading, we introduce Skim-RNN, a recurrent neural network (RNN) that dynamically decides to update only a small fraction of the hidden state for relatively unimportant input tokens.
Skim-RNN gives computational advantage over an RNN that always updates the entire hidden state.
Skim-RNN uses the same input and output interfaces as a standard RNN and can be easily used instead of RNNs in existing models.
In our experiments, we show that Skim-RNN can achieve significantly reduced computational cost without losing accuracy compared to standard RNNs across five different natural language tasks.
In addition, we demonstrate that the trade-off between accuracy and speed of Skim-RNN can be dynamically controlled during inference time in a stable manner.
Our analysis also shows that Skim-RNN running on a single CPU offers lower latency compared to standard RNNs on GPUs.
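The skimming mechanism can be sketched with plain numpy: a large cell updates the full hidden state on important tokens, while a small cell updates only the first few entries on skimmed tokens and copies the rest through. This is an illustrative toy with simple tanh cells; in the paper the skim decision itself is made by a learned policy, whereas here it is passed in as a flag.

```python
import numpy as np

d_big, d_small, d_in = 8, 2, 4      # full and "skim" hidden sizes (toy values)
rng = np.random.default_rng(0)
Wb = rng.standard_normal((d_big, d_big + d_in)) * 0.1    # big RNN cell weights
Ws = rng.standard_normal((d_small, d_big + d_in)) * 0.1  # small RNN cell weights

def step(h, x, skim):
    """One Skim-RNN step: the small cell rewrites only the first d_small
    entries of the hidden state; the remaining entries are copied through."""
    z = np.concatenate([h, x])
    if skim:
        h_new = h.copy()
        h_new[:d_small] = np.tanh(Ws @ z)   # cheap update on unimportant tokens
        return h_new
    return np.tanh(Wb @ z)                  # full update on important tokens

h = np.zeros(d_big)
h = step(h, rng.standard_normal(d_in), skim=False)   # important token
h_before = h.copy()
h = step(h, rng.standard_normal(d_in), skim=True)    # skimmed token
print(np.array_equal(h[d_small:], h_before[d_small:]))  # True: tail untouched
```

The computational saving comes from the small cell's matrix being d_small x (d_big + d_in) instead of d_big x (d_big + d_in), which is what makes skimming cheap on a CPU.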
Wireless on-chip communication is a promising candidate to address the performance and efficiency issues that arise when scaling current Network-on-Chip (NoC) techniques to manycore processors.
A Wireless Network-on-Chip (WNoC) can serve global and broadcast traffic with ultra-low latency even in thousand-core chips, thus acting as a natural complement of conventional and throughput-oriented wireline NoCs.
However, the development of Medium Access Control (MAC) strategies needed to efficiently share the wireless medium among the increasing number of cores remains as a considerable challenge given the singularities of the environment and the novelty of the research area.
In this position paper, we present a context analysis describing the physical constraints, performance objectives, and traffic characteristics of the on-chip communication paradigm.
We summarize the main differences with respect to traditional wireless scenarios, to then discuss their implications on the design of MAC protocols for manycore WNoCs, with the ultimate goal of kickstarting this arguably unexplored research area.
Neural approaches to sequence labeling often use a Conditional Random Field (CRF) to model their output dependencies, while Recurrent Neural Networks (RNN) are used for the same purpose in other tasks.
We set out to establish RNNs as an attractive alternative to CRFs for sequence labeling.
To do so, we address one of the RNN's most prominent shortcomings: under maximum-likelihood training, it is never exposed to its own errors.
We frame the prediction of the output sequence as a sequential decision-making process, where we train the network with an adjusted actor-critic algorithm (AC-RNN).
We comprehensively compare this strategy with maximum-likelihood training for both RNNs and CRFs on three structured-output tasks.
The proposed AC-RNN efficiently matches the performance of the CRF on NER and CCG tagging, and outperforms it on Machine Transliteration.
We also show that our training strategy is significantly better than other techniques for addressing the RNN's exposure bias, such as Scheduled Sampling and Self-Critical policy training.
In this paper, we compare the individual rate of MIMO-NOMA and MIMO-OMA when users are paired into clusters.
A power allocation (PA) strategy is proposed, which ensures that MIMO-NOMA achieves a higher individual rate for each user than MIMO-OMA with arbitrary PA and optimal degrees of freedom split.
In addition, a special case with equal degrees of freedom and arbitrary PA for OMA is considered, for which the individual rate superiority of NOMA still holds.
Moreover, it is shown that NOMA can attain better fairness through appropriate PA.
Finally, simulations are carried out, which validate the developed analytical results.
Exploring contextual information in the local region is important for shape understanding and analysis.
Existing studies often employ hand-crafted or explicit ways to encode contextual information of local regions.
However, it is hard to capture fine-grained contextual information in hand-crafted or explicit manners, such as the correlation between different areas in a local region, which limits the discriminative ability of learned features.
To resolve this issue, we propose a novel deep learning model for 3D point clouds, named Point2Sequence, to learn 3D shape features by capturing fine-grained contextual information in a novel implicit way.
Point2Sequence employs a novel sequence learning model for point clouds to capture the correlations by aggregating multi-scale areas of each local region with attention.
Specifically, Point2Sequence first learns the feature of each area scale in a local region.
Then, it captures the correlation between area scales in the process of aggregating all area scales using a recurrent neural network (RNN) based encoder-decoder structure, where an attention mechanism is proposed to highlight the importance of different area scales.
Experimental results show that Point2Sequence achieves state-of-the-art performance in shape classification and segmentation tasks.
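The attention-weighted aggregation over area scales can be sketched as generic softmax attention (toy dimensions; this is a simplified stand-in for the paper's RNN encoder-decoder, not its exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 8  # number of area scales per local region, feature dim (toy sizes)
scale_feats = rng.normal(size=(T, d))   # per-scale features of one region
query = rng.normal(size=d)              # decoder state (illustrative)

def attention_aggregate(feats, q):
    """Softmax attention over area scales: weight each scale by its
    relevance to the query, then take the weighted sum."""
    scores = feats @ q
    w = np.exp(scores - scores.max())   # numerically stable softmax
    w = w / w.sum()
    return w, w @ feats

w, region_feat = attention_aggregate(scale_feats, query)
```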
The relationship between reading and writing (RRW) is one of the major themes in learning science.
One of its obstacles is that it is difficult to define or measure the latent background knowledge of the individual.
However, in an academic research setting, scholars are required to explicitly list their background knowledge in the citation sections of their manuscripts.
We take advantage of this unique opportunity to observe RRW, particularly in the setting of published academic commentaries.
RRW was visualized under a proposed topic process model by using a state-of-the-art version of latent Dirichlet allocation (LDA).
The empirical study showed that the academic commentary is modulated both by its target paper and the author's background knowledge.
Although this conclusion was obtained in a unique environment, we suggest its implications can also shed light on other similar interesting areas, such as dialog and conversation, group discussion, and social media.
We present a new approach for building source-to-source transformations that can run on multiple programming languages, based on a new way of representing programs called incremental parametric syntax.
We implement this approach in Haskell in our Cubix system, and construct incremental parametric syntaxes for C, Java, JavaScript, Lua, and Python.
We demonstrate a whole-program refactoring tool that runs on all of them, along with three smaller transformations that each run on several.
Our evaluation shows that (1) once a transformation is written, little work is required to configure it for a new language, (2) transformations built this way output readable code that preserves the structure of the original, according to participants in our human study, and (3) our transformations can still handle language corner cases, as validated on compiler test suites.
We introduce a new entity typing task: given a sentence with an entity mention, the goal is to predict a set of free-form phrases (e.g. skyscraper, songwriter, or criminal) that describe appropriate types for the target entity.
This formulation allows us to use a new type of distant supervision at large scale: head words, which indicate the type of the noun phrases they appear in.
We show that these ultra-fine types can be crowd-sourced, and introduce new evaluation sets that are much more diverse and fine-grained than existing benchmarks.
We present a model that can predict open types, and is trained using a multitask objective that pools our new head-word supervision with prior supervision from entity linking.
Experimental results demonstrate that our model is effective in predicting entity types at varying granularity; it achieves state of the art performance on an existing fine-grained entity typing benchmark, and sets baselines for our newly-introduced datasets.
Our data and model can be downloaded from: http://nlp.cs.washington.edu/entity_type
Automatically determining the optimal size of a neural network for a given task without prior information currently requires an expensive global search and training many networks from scratch.
In this paper, we address the problem of automatically finding a good network size during a single training cycle.
We introduce *nonparametric neural networks*, a non-probabilistic framework for conducting optimization over all possible network sizes and prove its soundness when network growth is limited via an L_p penalty.
We train networks under this framework by continuously adding new units while eliminating redundant units via an L_2 penalty.
We employ a novel optimization algorithm, which we term *adaptive radial-angular gradient descent* or *AdaRad*, and obtain promising results.
This letter describes a network that is able to capture spatiotemporal correlations over arbitrary timestamps.
The proposed scheme operates as a complementary, extended network over spatiotemporal regions.
Recently, multimodal fusion has been extensively researched in deep learning.
For action recognition, the spatial and temporal streams are vital components of deep Convolutional Neural Networks (CNNs), but reducing overfitting and fusing these two streams remain open problems.
The prevailing fusion approach is simply to average the two streams.
To address these problems, we propose a correlation network with Shannon fusion that learns on top of an already-trained CNN.
Long-range video may consist of spatiotemporal correlation over arbitrary times.
This correlation can be captured using simple fully connected layers to form the correlation network.
This is found to be complementary to the existing network fusion methods.
We evaluate our approach on the UCF-101 and HMDB-51 datasets, and the resulting improvement in accuracy demonstrates the importance of multimodal correlation.
Representing the semantics of words is a long-standing problem for the natural language processing community.
Most methods compute word semantics given their textual context in large corpora.
More recently, researchers attempted to integrate perceptual and visual features.
Most of these works consider the visual appearance of objects to enhance word representations but they ignore the visual environment and context in which objects appear.
We propose to unify text-based techniques with vision-based techniques by simultaneously leveraging textual and visual context to learn multimodal word embeddings.
We explore various choices for what can serve as a visual context and present an end-to-end method to integrate visual context elements in a multimodal skip-gram model.
We provide experiments and extensive analysis of the obtained results.
In this paper we discuss the stability properties of convolutional neural networks.
Convolutional neural networks are widely used in machine learning.
In classification they are mainly used as feature extractors.
Ideally, we expect similar features when the inputs are from the same class.
That is, we hope to see a small change in the feature vector with respect to a deformation on the input signal.
This can be established mathematically, and the key step is to derive the Lipschitz properties.
Further, we establish that the stability results can be extended for more general networks.
We give a formula for computing the Lipschitz bound, and compare it with other methods to show it is closer to the optimal value.
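For a network of linear layers with 1-Lipschitz activations, a standard (often loose) upper bound on the Lipschitz constant is the product of the layers' spectral norms; the sketch below computes this generic textbook bound, not the tighter formula derived in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
layers = [rng.normal(size=(16, 32)), rng.normal(size=(8, 16))]  # toy weights

def lipschitz_upper_bound(weight_matrices):
    """Product of layer spectral norms: upper-bounds the Lipschitz
    constant of the composed linear layers (with 1-Lipschitz
    activations in between)."""
    bound = 1.0
    for W in weight_matrices:
        bound *= np.linalg.norm(W, 2)  # largest singular value
    return bound

bound = lipschitz_upper_bound(layers)
```

By submultiplicativity of the spectral norm, this bound always dominates the norm of the composed linear map, which is why tighter, data-dependent bounds like the paper's are of interest.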
In this work, we study the effects of finite buffers on the throughput and delay of line networks with erasure links.
We show that calculating performance parameters such as throughput and delay is equivalent to determining the stationary distribution of an irreducible Markov chain.
We note that the number of states in the Markov chain grows exponentially in the size of the buffers with the exponent scaling linearly with the number of hops in a line network.
We then propose a simplified iterative scheme to approximately identify the steady-state distribution of the chain by decoupling the chain to smaller chains.
The approximate solution is then used to understand the effect of buffer sizes on throughput and distribution of packet delay.
Further, we classify nodes based on congestion, which yields an intelligent scheme for memory allocation within the proposed framework.
Finally, by simulations we confirm that our framework yields an accurate prediction of the variation of the throughput and delay distribution.
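The core computation, finding the stationary distribution of one (sub-)chain, can be sketched on a toy 3-state chain (the transition matrix below is illustrative, not one arising from an actual buffer model):

```python
import numpy as np

# Transition matrix of a toy irreducible 3-state chain (rows sum to 1),
# standing in for one decoupled sub-chain in the iterative scheme.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.4, 0.6]])

def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 via the left eigenvector
    of P associated with eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(vals - 1.0))
    pi = np.real(vecs[:, idx])
    return pi / pi.sum()

pi = stationary(P)
```

The exponential state growth noted above means this direct eigen-decomposition is only feasible for the small decoupled chains, which motivates the iterative approximation.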
When dealing with process calculi and automata which express both nondeterministic and probabilistic behavior, it is customary to introduce the notion of scheduler to solve the nondeterminism.
It has been observed that for certain applications, notably those in security, the scheduler needs to be restricted so as not to reveal the outcome of the protocol's random choices; otherwise, the adversary model would be too strong even for "obviously correct" protocols.
We propose a process-algebraic framework in which the control on the scheduler can be specified in syntactic terms, and we show how to apply it to solve the problem mentioned above.
We also consider the definition of (probabilistic) may and must preorders, and we show that they are precongruences with respect to the restricted schedulers.
Furthermore, we show that all the operators of the language, except replication, distribute over probabilistic summation, which is a useful property for verification.
This paper presents an adaptive and intelligent sparse model for digital image sampling and recovery.
In the proposed sampler, we adaptively determine the number of samples required to retrieve an image based on the space-frequency-gradient information content of its patches.
By leveraging texture in space, sparsity locations in DCT domain, and directional decomposition of gradients, the sampler structure consists of a combination of uniform, random, and nonuniform sampling strategies.
For reconstruction, we model the recovery problem as a two-state cellular automaton that iteratively restores the image with scalable windows from generation to generation.
We demonstrate that the recovery algorithm converges after only a few generations for images with an arbitrary degree of texture.
For a given number of measurements, extensive experiments on standard image-sets, infra-red, and mega-pixel range imaging devices show that the proposed measurement matrix considerably increases the overall recovery performance, or equivalently decreases the number of sampled pixels for a specific recovery quality compared to random sampling matrix and Gaussian linear combinations employed by the state-of-the-art compressive sensing methods.
In practice, the proposed measurement-adaptive sampling/recovery framework includes various applications from intelligent compressive imaging-based acquisition devices to computer vision and graphics, and image processing technology.
Simulation codes are available online for reproduction purposes.
Most existing knowledge graphs (KGs) in academic domains suffer from problems of insufficient multi-relational information, name ambiguity and improper data format for large-scale machine processing.
In this paper, we present AceKG, a new large-scale KG in academic domain.
AceKG not only provides clean academic information, but also offers a large-scale benchmark dataset for researchers to conduct challenging data mining projects including link prediction, community detection and scholar classification.
Specifically, AceKG describes 3.13 billion triples of academic facts based on a consistent ontology, including necessary properties of papers, authors, fields of study, venues and institutes, as well as the relations among them.
To enrich the proposed knowledge graph, we also perform entity alignment with existing databases and rule-based inference.
Based on AceKG, we conduct experiments on three typical academic data mining tasks and evaluate several state-of-the-art knowledge embedding and network representation learning approaches on the benchmark datasets built from AceKG.
Finally, we discuss several promising research directions that benefit from AceKG.
In this paper, a novel multiple criteria decision making (MCDM) methodology is presented for assessing and prioritizing medical tourism destinations in uncertain environment.
A systematic evaluation and assessment method is proposed by integrating rough number based AHP (Analytic Hierarchy Process) and rough number based MABAC (Multi-Attributive Border Approximation area Comparison).
Rough number is used to aggregate individual judgments and preferences to deal with vagueness in decision making due to limited data.
Rough AHP analyzes the relative importance of the criteria based on the preferences given by experts.
Rough MABAC evaluates the alternative sites based on the criteria weights.
The proposed methodology is explained through a case study considering different cities for healthcare service in India.
The validity of the obtained ranking for the given decision making problem is established by testing criteria proposed by Wang and Triantaphyllou (2008) along with further analysis and discussion.
In the real world, everything is an object, and every object represents particular classes.
Every object can be fully described by its attributes.
Real-world datasets typically contain a large number of attributes and objects, and classifiers perform poorly when such huge datasets are fed to them directly.
Therefore, the most useful attributes, those contributing the most to the decision, need to be extracted from these datasets.
In the paper, attribute set is reduced by generating reducts using the indiscernibility relation of Rough Set Theory (RST).
The method measures similarity among the attributes using relative indiscernibility relation and computes attribute similarity set.
Then the set is minimized, and an attribute similarity table is constructed, from which the attribute similar to the maximum number of other attributes is selected so that the resulting minimum set of selected attributes (called a reduct) covers all attributes of the attribute similarity table.
The method has been applied on glass dataset collected from the UCI repository and the classification accuracy is calculated by various classifiers.
The result shows the efficiency of the proposed method.
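The indiscernibility relation underlying RST can be sketched on a toy decision table (the objects and attribute values below are invented for illustration; the paper's relative indiscernibility and reduct construction build on this partition):

```python
from collections import defaultdict

# Toy decision table: rows are objects, columns are attribute values.
objects = [
    ("sunny", "hot"), ("sunny", "mild"),
    ("rainy", "hot"), ("sunny", "hot"),
]

def indiscernibility_classes(objects, attrs):
    """Partition objects by equality on the chosen attribute
    indices: the indiscernibility relation IND(attrs)."""
    classes = defaultdict(set)
    for i, obj in enumerate(objects):
        key = tuple(obj[a] for a in attrs)
        classes[key].add(i)
    return sorted(map(sorted, classes.values()))

by_first = indiscernibility_classes(objects, [0])     # coarser partition
by_both = indiscernibility_classes(objects, [0, 1])   # finer partition
```

Adding attributes refines the partition; a reduct is a minimal attribute subset that preserves the discriminating power of the full set.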
Many important real-world applications-such as social networks or distributed data bases-can be modeled as hypergraphs.
In such a model, vertices represent entities-such as users or data records-whereas hyperedges model a group membership of the vertices-such as the authorship in a specific topic or the membership of a data record in a specific replicated shard.
To optimize such applications, we need an efficient and effective solution to the NP-hard balanced k-way hypergraph partitioning problem.
However, existing hypergraph partitioners that scale to very large graphs do not effectively exploit the hypergraph structure when performing the partitioning decisions.
We propose HYPE, a hypergraph partitioner that exploits the neighborhood relations between vertices in the hypergraph using an efficient implementation of neighborhood expansion.
HYPE improves partitioning quality by up to 95% and reduces runtime by up to 39% compared to streaming partitioning.
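Neighborhood expansion can be sketched as a greedy growth rule on a plain graph (assumed mechanics for illustration; HYPE's actual algorithm operates on hypergraphs and uses a more efficient frontier implementation):

```python
def grow_partition(adjacency, seed, capacity):
    """Grow one partition from a seed vertex by repeatedly adding the
    frontier vertex with the most neighbours already inside it."""
    part = {seed}
    frontier = set(adjacency[seed])
    while len(part) < capacity and frontier:
        best = max(frontier, key=lambda v: len(adjacency[v] & part))
        part.add(best)
        frontier = (frontier | adjacency[best]) - part
    return part

# Toy undirected graph as adjacency sets.
adjacency = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4}, 4: {3, 5}, 5: {4},
}
part = grow_partition(adjacency, 0, 3)
```

Growing along neighborhoods keeps densely connected vertices together, which is the structural signal streaming partitioners tend to miss.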
With crimes on the rise all around the world, video surveillance is becoming more important day by day.
Due to the lack of human resources to monitor this increasing number of cameras manually, new computer vision algorithms are being developed to perform lower- and higher-level tasks.
We have developed a new method that combines the widely acclaimed Histograms of Oriented Gradients (HOG), the theory of Visual Saliency, and the saliency prediction model Deep Multi-Level Network to detect human beings in video sequences.
Furthermore, we applied the k-Means algorithm to cluster the HOG feature vectors of the positively detected windows and determined the path followed by a person in the video.
We achieved a detection precision of 83.11% and a recall of 41.27%.
We obtained these results 76.866 times faster than classification on normal images.
This paper details the application of a genetic programming framework for building decision-tree classifiers of soil data to classify soil texture.
The database contains measurements of soil profile data.
We have applied GATree to generate the classification decision tree.
GATree is a decision tree builder that is based on Genetic Algorithms (GAs).
The idea behind it is rather simple but powerful.
Instead of using statistical metrics that are biased towards specific trees, it uses a more flexible, global metric of tree quality that tries to optimize both accuracy and size.
GATree offers some unique features not found in other tree inducers, while at the same time it can produce better results for many difficult problems.
Experimental results are presented that illustrate its performance in generating the best decision tree for classifying soil texture on the soil dataset.
Current research environments are witnessing an enormous number of presentations across different sessions at academic conferences.
This situation makes it difficult for researchers (especially juniors) to attend the right presentation session(s) for effective collaboration.
In this paper, we propose an innovative venue recommendation algorithm to enhance smart conference participation.
Our proposed algorithm, Social Aware Recommendation of Venues and Environments (SARVE), computes the Pearson Correlation and social characteristic information of conference participants.
SARVE further incorporates the current context of both the smart conference community and participants in order to model a recommendation process using distributed community detection.
Through the integration of the above computations and techniques, we are able to recommend presentation sessions of active participant presenters that may be of high interest to a particular participant.
We evaluate SARVE using a real world dataset.
Our experimental results demonstrate that SARVE outperforms other state-of-the-art methods.
Generative adversarial networks (GANs) are powerful tools for learning generative models.
In practice, the training may suffer from lack of convergence.
GANs are commonly viewed as a two-player zero-sum game between two neural networks.
Here, we leverage this game theoretic view to study the convergence behavior of the training process.
Inspired by the fictitious play learning process, a novel training method, referred to as Fictitious GAN, is introduced.
Fictitious GAN trains the deep neural networks using a mixture of historical models.
Specifically, the discriminator (resp. generator) is updated according to the best-response to the mixture outputs from a sequence of previously trained generators (resp. discriminators).
It is shown that Fictitious GAN can effectively resolve some convergence issues that cannot be resolved by the standard training approach.
It is proved that asymptotically the average of the generator outputs has the same distribution as the data samples.
Unsupervised learning permits the development of algorithms that are able to adapt to a variety of different data sets using the same underlying rules thanks to the autonomous discovery of discriminating features during training.
Recently, a new class of Hebbian-like and local unsupervised learning rules for neural networks have been developed that minimise a similarity matching cost-function.
These have been shown to perform sparse representation learning.
This study tests the effectiveness of one such learning rule for learning features from images.
The rule implemented is derived from a nonnegative classical multidimensional scaling cost-function, and is applied to both single and multi-layer architectures.
The features learned by the algorithm are then used as input to an SVM to test their effectiveness in classification on the established CIFAR-10 image dataset.
The algorithm performs well in comparison to other unsupervised learning algorithms and multi-layer networks, thus suggesting its validity in the design of a new class of compact, online learning networks.
Modern biological science produces vast amounts of genomic sequence data.
This is fuelling the need for efficient algorithms for sequence compression and analysis.
Data compression and the associated techniques coming from information theory are often perceived as being of interest for data communication and storage.
In recent years, a substantial effort has been made for the application of textual data compression techniques to various computational biology tasks, ranging from storage and indexing of large datasets to comparison of genomic databases.
This paper presents a differential compression algorithm that produces difference sequences according to an op-code table in order to optimize the compression of homologous sequences in a dataset.
The stored data thus consist of a reference sequence, the set of differences, and the difference locations, instead of each sequence being stored individually.
This algorithm does not require a priori knowledge about the statistics of the sequence set.
The algorithm was applied to three different datasets of genomic sequences and achieved up to a 195-fold compression rate, corresponding to 99.4% space saving.
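The reference-plus-differences idea can be sketched with a substitution-only encoder (the actual algorithm uses an op-code table that also covers other edit types; this toy version assumes equal-length sequences):

```python
def diff_encode(reference, sequence):
    """Store only (position, base) pairs where `sequence` differs
    from `reference` (equal lengths assumed in this sketch)."""
    return [(i, b) for i, (a, b) in enumerate(zip(reference, sequence)) if a != b]

def diff_decode(reference, diffs):
    """Rebuild a sequence by applying stored differences to the reference."""
    seq = list(reference)
    for i, b in diffs:
        seq[i] = b
    return "".join(seq)

ref = "ACGTACGT"
seq = "ACGAACGT"
d = diff_encode(ref, seq)
```

For homologous sequences the difference list is short, so storing one reference plus per-sequence diffs is far smaller than storing every sequence in full.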
The problem of quickest detection of an anomalous process among M processes is considered.
At each time, a subset of the processes can be observed, and the observations from each chosen process follow two different distributions, depending on whether the process is normal or abnormal.
The objective is a sequential search strategy that minimizes the expected detection time subject to an error probability constraint.
This problem can be considered a special case of active hypothesis testing, first studied by Chernoff in 1959, where a randomized strategy, referred to as the Chernoff test, was proposed and shown to be asymptotically optimal as the error probability approaches zero.
For the special case considered in this paper, we show that a simple deterministic test achieves asymptotic optimality and offers better performance in the finite regime.
We further extend the problem to the case where multiple anomalous processes are present.
In particular, we examine the case where only an upper bound on the number of anomalous processes is known.
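The flavour of a deterministic sequential test can be sketched for binary observations (the distributions, threshold, and probe-the-highest-score rule below are illustrative assumptions, not the paper's exact policy):

```python
import math

p_normal, p_abnormal = 0.2, 0.8   # P(observe 1) under each hypothesis (toy values)

def llr(obs):
    """Log-likelihood ratio of 'abnormal' vs 'normal' for one observation."""
    if obs:
        return math.log(p_abnormal / p_normal)
    return math.log((1 - p_abnormal) / (1 - p_normal))

def sequential_search(samplers, threshold=5.0, max_steps=10_000):
    """Probe the process whose accumulated LLR is currently highest;
    declare it anomalous once its score crosses the threshold."""
    scores = [0.0] * len(samplers)
    for _ in range(max_steps):
        m = max(range(len(scores)), key=scores.__getitem__)
        scores[m] += llr(samplers[m]())
        if scores[m] >= threshold:
            return m
    return max(range(len(scores)), key=scores.__getitem__)

# Caricature: normal processes always emit 0, the anomalous one always 1.
samplers = [lambda: 0, lambda: 0, lambda: 1]
found = sequential_search(samplers)
```

Probing the current front-runner concentrates observations on the likeliest anomaly, which is the intuition behind the finite-regime advantage of deterministic rules over randomized ones.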
Sequence data is challenging for machine learning approaches, because the lengths of the sequences may vary between samples.
In this paper, we present an unsupervised learning model for sequence data, called the Integrated Sequence Autoencoder (ISA), to learn a fixed-length vectorial representation by minimizing the reconstruction error.
Specifically, we propose to integrate two classical mechanisms for sequence reconstruction which takes into account both the global silhouette information and the local temporal dependencies.
Furthermore, we propose a stop feature that serves as a temporal stamp to guide the reconstruction process, which results in a higher-quality representation.
The learned representation is able to effectively summarize not only the apparent features, but also the underlying and high-level style information.
Take for example a speech sequence sample: our ISA model can not only recognize the spoken text (apparent feature), but can also discriminate the speaker who utters the audio (more high-level style).
One promising application of the ISA model is that it can be readily used in the semi-supervised learning scenario, in which a large amount of unlabeled data is leveraged to extract high-quality sequence representations and thus to improve the performance of the subsequent supervised learning tasks on limited labeled data.
It is desirable for detection and classification algorithms to generalize to unfamiliar environments, but suitable benchmarks for quantitatively studying this phenomenon are not yet available.
We present a dataset designed to measure recognition generalization to novel environments.
The images in our dataset are harvested from twenty camera traps deployed to monitor animal populations.
Camera traps are fixed at one location, hence the background changes little across images; capture is triggered automatically, hence there is no human bias.
The challenge is learning recognition in a handful of locations, and generalizing animal detection and classification to new locations where no training data is available.
In our experiments state-of-the-art algorithms show excellent performance when tested at the same location where they were trained.
However, we find that generalization to new locations is poor, especially for classification systems.
Using time series of US patents per million inhabitants, knowledge-generating cycles can be distinguished.
These cycles partly coincide with Kondratieff long waves.
The changes in the slopes between them indicate discontinuities in the knowledge-generating paradigms.
The knowledge-generating paradigms can be modeled in terms of interacting dimensions (for example, in university-industry-government relations) that set limits to the maximal efficiency of innovation systems.
The maximum values of the parameters in the model are of the same order as the regression coefficients of the empirical waves.
The mechanism of the increase in the dimensionality is specified as self-organization which leads to the breaking of existing relations into the more diversified structure of a fractal-like network.
This breaking can be modeled in analogy to 2D and 3D (Koch) snowflakes.
The boost of knowledge generation leads to newly emerging technologies that can be expected to be more diversified and show shorter life cycles than before.
Time spans of the knowledge-generating cycles can also be analyzed in terms of Fibonacci numbers.
This perspective allows for forecasting expected dates of future possible paradigm changes.
In terms of policy implications, this suggests a shift in focus from manufacturing technologies to developing new organizational technologies and formats of human interaction.
Large-scale collection of human behavioral data by companies raises serious privacy concerns.
We show that behavior captured in the form of application usage data collected from smartphones is highly unique even in very large datasets encompassing millions of individuals.
This makes behavior-based re-identification of users across datasets possible.
We study 12 months of data from 3.5 million users and show that four apps are enough to uniquely re-identify 91.2% of users using a simple strategy based on public information.
Furthermore, we show that there is seasonal variability in uniqueness and that application usage fingerprints drift over time at an average constant rate.
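The uniqueness computation can be sketched on toy data (the user base and app sets below are invented; the study's re-identification strategy additionally uses public information and operates at the scale of millions of users):

```python
from collections import Counter
from itertools import combinations

# Toy app-usage data: each user's set of installed apps (illustrative).
users = {
    "u1": {"mail", "maps", "chess", "bank"},
    "u2": {"mail", "maps", "music"},
    "u3": {"mail", "chess", "bank", "music"},
}

def unique_fraction(users, k):
    """Fraction of users with at least one k-app combination shared
    with no other user, i.e. a re-identifying fingerprint."""
    counts = Counter()
    for apps in users.values():
        for combo in combinations(sorted(apps), k):
            counts[combo] += 1
    unique = sum(
        1 for apps in users.values()
        if any(counts[c] == 1 for c in combinations(sorted(apps), k))
    )
    return unique / len(users)
```

Even in this tiny example, single apps identify nobody while app pairs identify everyone, mirroring how quickly uniqueness grows with fingerprint size.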
The paper presents some theoretical and practical considerations regarding the TV information distribution in local (small and medium) networks, using different technologies and architectures.
The SMATV concept is chosen to be presented extensively.
The most important design formulae are presented with a software package supporting the network planner to design and optimize the network.
A case study is realized, using standard components in SMATV, for a 5 floor building.
The study proved that it is possible to design and optimize the entire network, without realizing first a costly experimental setup.
It is also possible to run different architectures, optimizing also the costs of the final solution of network.
We trained Binarized Neural Networks (BNNs) on the high-resolution ImageNet ILSVRC-2012 dataset classification task and achieved good performance.
With a moderate-size network of 13 layers, we obtained a top-5 classification accuracy of 84.1% on the validation set through network distillation, much better than the previously published results of 73.2% for the XNOR network and 69.1% for binarized GoogLeNet.
We expect networks of better performance can be obtained by following our current strategies.
We provide a detailed discussion and preliminary analysis on strategies used in the network training.
We propose an extension of the recursive neural network that makes use of a variant of the long short-term memory architecture.
The extension allows information low in parse trees to be stored in a memory register (the `memory cell') and used much later higher up in the parse tree.
This provides a solution to the vanishing gradient problem and allows the network to capture long range dependencies.
Experimental results show that our composition outperformed the traditional neural-network composition on the Stanford Sentiment Treebank.
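A child-sum-style Tree-LSTM cell sketches the memory-register idea (toy dimensions and tied weights; this is a generic variant of the architecture family, not necessarily the exact composition proposed here):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy embedding/state dimension
# One weight pair per gate: input (i), output (o), forget (f), update (u).
W = {g: rng.normal(size=(d, d)) * 0.1 for g in "iofu"}
U = {g: rng.normal(size=(d, d)) * 0.1 for g in "iofu"}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tree_lstm(node):
    """node = (embedding, children). Returns (h, c); the cell state c
    is the 'memory cell' carrying information from deep in the tree."""
    x, children = node
    states = [tree_lstm(ch) for ch in children]
    h_sum = sum((h for h, _ in states), np.zeros(d))
    i = sigmoid(W["i"] @ x + U["i"] @ h_sum)
    o = sigmoid(W["o"] @ x + U["o"] @ h_sum)
    u = np.tanh(W["u"] @ x + U["u"] @ h_sum)
    c = i * u
    for h_k, c_k in states:          # per-child forget gates
        f_k = sigmoid(W["f"] @ x + U["f"] @ h_k)
        c = c + f_k * c_k
    return o * np.tanh(c), c

leaf = (rng.normal(size=d), [])
root = (rng.normal(size=d), [leaf, (rng.normal(size=d), [])])
h, c = tree_lstm(root)
```

The additive path through the child cell states `c_k` is what lets gradients flow from the root down to deep leaves without vanishing.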
Open forms of global constraints allow the addition of new variables to an argument during the execution of a constraint program.
Such forms are needed for difficult constraint programming problems where problem construction and problem solving are interleaved, and fit naturally within constraint logic programming.
However, in general, filtering that is sound for a global constraint can be unsound when the constraint is open.
This paper provides a simple characterization, called contractibility, of the constraints where filtering remains sound when the constraint is open.
With this characterization we can easily determine whether a constraint has this property or not.
In the latter case, we can use it to derive a contractible approximation to the constraint.
We demonstrate this work on both hard and soft constraints.
In the process, we formulate two general classes of soft constraints.
Sections are the building blocks of Wikipedia articles.
They enhance readability and can be used as a structured entry point for creating and expanding articles.
Structuring a new or already existing Wikipedia article with sections is a hard task for humans, especially for newcomers or less experienced editors, as it requires significant knowledge about how a well-written article looks for each possible topic.
Inspired by this need, the present paper defines the problem of section recommendation for Wikipedia articles and proposes several approaches for tackling it.
Our systems can help editors by recommending what sections to add to already existing or newly created Wikipedia articles.
Our basic paradigm is to generate recommendations by sourcing sections from articles that are similar to the input article.
We explore several ways of defining similarity for this purpose (based on topic modeling, collaborative filtering, and Wikipedia's category system).
We use both automatic and human evaluation approaches for assessing the performance of our recommendation system, concluding that the category-based approach works best, achieving precision@10 of about 80% in the human evaluation.
We consider the problem of jointly optimizing channel pairing, channel-user assignment, and power allocation, to maximize the weighted sum-rate, in a single-relay cooperative system with multiple channels and multiple users.
Common relaying strategies are considered, and transmission power constraints are imposed on both individual transmitters and the aggregate over all transmitters.
The joint optimization problem naturally leads to a mixed-integer program.
Despite the general expectation that such problems are intractable, we construct an efficient algorithm to find an optimal solution, which incurs computational complexity that is polynomial in the number of channels and the number of users.
We further demonstrate through numerical experiments that the jointly optimal solution can significantly improve system performance over its suboptimal alternatives.
Current state-of-the-art semantic role labeling (SRL) uses a deep neural network with no explicit linguistic features.
However, prior work has shown that gold syntax trees can dramatically improve SRL decoding, suggesting the possibility of increased accuracy from explicit modeling of syntax.
In this work, we present linguistically-informed self-attention (LISA): a neural network model that combines multi-head self-attention with multi-task learning across dependency parsing, part-of-speech tagging, predicate detection and SRL.
Unlike previous models which require significant pre-processing to prepare linguistic features, LISA can incorporate syntax using merely raw tokens as input, encoding the sequence only once to simultaneously perform parsing, predicate detection and role labeling for all predicates.
Syntax is incorporated by training one attention head to attend to syntactic parents for each token.
Moreover, if a high-quality syntactic parse is already available, it can be beneficially injected at test time without re-training our SRL model.
In experiments on CoNLL-2005 SRL, LISA achieves new state-of-the-art performance for a model using predicted predicates and standard word embeddings, attaining an absolute improvement of 2.5 F1 over the previous state of the art on newswire and more than 3.5 F1 on out-of-domain data, a nearly 10% reduction in error.
On CoNLL-2012 English SRL we also show an improvement of more than 2.5 F1.
LISA also outperforms the state of the art with contextually-encoded (ELMo) word representations, by nearly 1.0 F1 on news and more than 2.0 F1 on out-of-domain text.
Packet parsing is a key step in SDN-aware devices.
Packet parsers in SDN networks need to be both reconfigurable and fast, to support the evolving network protocols and the increasing multi-gigabit data rates.
The combination of packet processing languages with FPGAs seems to be the perfect match for these requirements.
In this work, we develop an open-source FPGA-based configurable architecture for arbitrary packet parsing to be used in SDN networks.
We generate low latency and high-speed streaming packet parsers directly from a packet processing program.
Our architecture is pipelined and entirely modeled using templated C++ classes.
The pipeline layout is derived from a parser graph that corresponds to a P4 program after a series of graph transformation rounds.
The RTL code is generated from the C++ description using Xilinx Vivado HLS and synthesized with Xilinx Vivado.
Our architecture achieves 100 Gb/s data rate in a Xilinx Virtex-7 FPGA while reducing the latency by 45% and the LUT usage by 40% compared to the state-of-the-art.
To better manage premiums and encourage safe driving, many commercial insurance companies (e.g., Geico, Progressive) offer their customers the option to install sensors on their vehicles that collect the individual vehicle's traveling data.
The driver's insurance is linked to his/her driving behavior.
At the other end, through analyzing the historical traveling data from a large number of vehicles, the insurance company could build a classifier to predict a new driver's driving style: aggressive or defensive.
However, collection of such vehicle traveling data explicitly breaches the drivers' personal privacy.
To tackle such privacy concerns, this paper presents a privacy-preserving driving style recognition technique to securely predict aggressive and defensive drivers for the insurance company without compromising the privacy of all the participating parties.
The insurance company cannot learn any private information from the vehicles, and vice-versa.
Finally, the effectiveness and efficiency of the privacy-preserving driving style recognition technique are validated with experimental results.
This letter introduces a 3D space-time-space block code for future digital TV systems.
The code is based on a double-layer structure for inter-cell and intra-cell transmission modes in single frequency networks.
Without increasing the complexity of the receiver, the proposed code is very efficient for different transmission scenarios.
Midpoint subdivision generalizes the Lane-Riesenfeld algorithm for uniform tensor product splines and can also be applied to non-regular meshes.
For example, midpoint subdivision of degree 2 is a specific Doo-Sabin algorithm and midpoint subdivision of degree 3 is a specific Catmull-Clark algorithm.
In 2001, Zorin and Schroeder were able to prove C1-continuity for midpoint subdivision surfaces analytically up to degree 9.
Here, we develop general analysis tools to show that the limiting surfaces under midpoint subdivision of any degree >= 2 are C1-continuous at their extraordinary points.
In this work we present a simple grapheme-based system for low-resource speech recognition using Babel data for Turkish spontaneous speech (80 hours).
We investigated the performance of different neural network architectures, including fully-convolutional, recurrent, and ResNet with GRU.
Different features and normalization techniques are compared as well.
We also propose a CTC-loss modification that uses segmentation during training, which leads to improvement when decoding with a small beam size.
Our best model achieved a word error rate of 45.8%, which is, to the best of our knowledge, the best reported result for end-to-end systems using in-domain data for this task.
In this article, a theoretical justification is presented for one type of skew-symmetric optimal translational motion (moving in the minimal acceptable time) of a flexible object carried by a robot from its initial to its final position of absolute quiescence, with no residual oscillations at the end of the motion.
The Hamilton-Ostrogradsky principle is used as a criterion for searching an optimal control.
Experimental verification of the control is presented using the Orthoglide robot for translational motions, with several masses attached to a flexible beam.
The standard reasoning problem, concept satisfiability, in the basic description logic ALC is PSPACE-complete, and it is EXPTIME-complete in the presence of unrestricted axioms.
Several fragments of ALC, notably logics in the FL, EL, and DL-Lite families, have an easier satisfiability problem; sometimes it is even tractable.
We classify the complexity of the standard satisfiability problems for all possible Boolean and quantifier fragments of ALC in the presence of general axioms.
In zero-shot learning (ZSL), a classifier is trained to recognize visual classes without any image samples.
Instead, it is given semantic information about the class, like a textual description or a set of attributes.
Learning from attributes could benefit from explicitly modeling the structure of the attribute space.
Unfortunately, learning of general structure from empirical samples is hard with typical dataset sizes.
Here we describe LAGO, a probabilistic model designed to capture natural soft and-or relations across groups of attributes.
We show how this model can be learned end-to-end with a deep attribute-detection model.
The soft group structure can be learned from data jointly as part of the model, and can also readily incorporate prior knowledge about groups if available.
The soft and-or structure succeeds in capturing meaningful and predictive structure, improving the accuracy of zero-shot learning on two of three benchmarks.
Finally, LAGO reveals a unified formulation over two ZSL approaches: DAP (Lampert et al., 2009) and ESZSL (Romera-Paredes & Torr, 2015).
Interestingly, taking only one singleton group for each attribute introduces a new soft relaxation of DAP that outperforms DAP by 40.
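The and-or idea can be made concrete with a small sketch: soft-OR within each attribute group, soft-AND across groups. The attribute names, grouping, and probabilities below are illustrative assumptions, not LAGO's exact parameterization:

```python
import numpy as np

def soft_and_or(attr_probs, groups):
    """Combine per-attribute detection probabilities into a class score:
    soft-OR within each attribute group, soft-AND across groups."""
    score = 1.0
    for group in groups:
        p = np.asarray([attr_probs[a] for a in group])
        group_or = 1.0 - np.prod(1.0 - p)  # soft-OR within a group
        score *= group_or                  # soft-AND across groups
    return score

# Illustrative: ("has stripes" OR "has spots") AND "lives in savanna"
probs = {"stripes": 0.9, "spots": 0.2, "savanna": 0.8}
print(soft_and_or(probs, [["stripes", "spots"], ["savanna"]]))  # ~0.736
```

In the full model these group structures are learned end-to-end rather than hand-specified.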
Predicting issue lifetime can help software developers, managers, and stakeholders effectively prioritize work, allocate development resources, and better understand project timelines.
Progress has been made on this prediction problem, but prior work has reported low precision and high false-alarm rates.
The latest results also use complex models, such as random forests, that detract from readability.
We solve both issues by using small, readable decision trees (under 20 lines long) and correlation feature selection to predict issue lifetime, achieving high precision and low false alarms (medians of 71% and 13% respectively).
We also address the problem of high class imbalance within issue datasets - when local data fails to train a good model, we show that cross-project data can be used in place of the local data.
In fact, cross-project data works so well that we argue it should be the default approach for learning predictors for issue lifetime.
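"Correlation feature selection" commonly refers to Hall's CFS heuristic, which scores a feature subset by high feature-class correlation and low feature-feature redundancy. A minimal sketch of the merit score (the formula follows Hall; the data setup is illustrative and the paper's exact pipeline may differ):

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff),
    where r_cf is mean feature-class correlation and r_ff is mean
    feature-feature correlation (redundancy)."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
                    for i in subset for j in subset if i < j])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)
```

A greedy forward search over subsets, keeping the highest-merit features, would then feed the small decision tree learner.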
The detection of weapons concealed underneath a person's clothes is important for improving public security and the safety of public assets such as airports, buildings, and railway stations.
With the advent of drones, aerial video analysis becomes increasingly important; yet, it has received scant attention in the literature.
This paper addresses a new problem of parsing low-resolution aerial videos of large spatial areas, in terms of 1) grouping, 2) recognizing events and 3) assigning roles to people engaged in events.
We propose a novel framework aimed at conducting joint inference of the above tasks, as reasoning about each in isolation typically fails in our setting.
Given noisy tracklets of people and detections of large objects and scene surfaces (e.g., building, grass), we use a spatiotemporal AND-OR graph to drive our joint inference, using Markov Chain Monte Carlo and dynamic programming.
We also introduce a new formalism of spatiotemporal templates characterizing latent sub-events.
For evaluation, we have collected and released a new aerial videos dataset using a hex-rotor flying over picnic areas rich with group events.
Our results demonstrate that we successfully address the above inference tasks under challenging conditions.
Many social media researchers and data scientists collected geo-tagged tweets to conduct spatial analysis or identify spatiotemporal patterns of filtered messages for specific topics or events.
This paper provides a systematic view to illustrate the characteristics (data noises, user biases, and system errors) of geo-tagged tweets from the Twitter Streaming API.
First, we found that a small percentage (1%) of active Twitter users can create a large portion (16%) of geo-tagged tweets.
Second, there is a significant amount (57.3%) of geo-tagged tweets located outside the Twitter Streaming API's bounding box in San Diego.
Third, we can detect spam, bot, cyborg tweets (data noises) by examining the "source" metadata field.
The portion of data noises in geo-tagged tweets is significant (29.42% in San Diego, CA and 53.47% in Columbus, OH) in our case study.
Finally, the majority of geo-tagged tweets are not created by the generic Twitter apps in Android or iPhone devices, but by other platforms, such as Instagram and Foursquare.
We recommend a multi-step procedure to remove these noises for the future research projects utilizing geo-tagged tweets.
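One step of such a procedure can be sketched as filtering on the "source" metadata field. The whitelist below is a hypothetical example; a real project would curate its own list of generic clients from the data:

```python
# Hypothetical whitelist of generic Twitter clients; curate from your own data.
GENERIC_SOURCES = {
    "Twitter for Android",
    "Twitter for iPhone",
    "Twitter Web Client",
}

def filter_noise(tweets):
    """Keep only tweets whose 'source' field is a generic Twitter app,
    dropping likely spam/bot/cyborg tweets and cross-posted content
    (e.g., Instagram, Foursquare)."""
    return [t for t in tweets if t.get("source") in GENERIC_SOURCES]
```

Further steps would handle the bounding-box mismatch and per-user activity skew noted above.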
The increasing demand for higher data rates, better quality of service, and fully mobile and connected wireless networks leads researchers to seek new solutions beyond 4G wireless systems.
It is anticipated that 5G wireless networks, which are expected to be introduced around 2020, will achieve ten times higher spectral and energy efficiency than current 4G wireless networks and will support data rates up to 10 Gbps for low mobility users.
The ambitious goals set for 5G wireless networks require dramatic changes in the design of different layers for next generation communications systems.
Massive multiple-input multiple-output (MIMO) systems, filter bank multi-carrier (FBMC) modulation, relaying technologies, and millimeter-wave communications have been considered as some of the strong candidates for the physical layer design of 5G networks.
In this article, we shed light on the potential and implementation of index modulation (IM) techniques for MIMO and multi-carrier communications systems which are expected to be two of the key technologies for 5G systems.
Specifically, we focus on two promising applications of IM: spatial modulation (SM) and orthogonal frequency division multiplexing with IM (OFDM-IM), and we discuss the recent advances and future research directions in IM technologies towards spectral and energy-efficient 5G wireless networks.
The Internet of Things (IoT) is a crucial component of Industry 4.0.
Due to the growing demands of customers, the current IoT architecture will not be reliable and responsive enough for next-generation IoT applications and upcoming services.
In this paper, the next generation IoT architecture based on new technologies is proposed in which the requirements of future applications, services, and generated data are addressed.
Particularly, this architecture consists of Nano-chip, millimeter Wave (mmWave), Heterogeneous Networks (HetNet), device-to-device (D2D) communication, 5G-IoT, Machine-Type Communication (MTC), Wireless Network Function Virtualization (WNFV), Wireless Software Defined Networks (WSDN), Advanced Spectrum Sharing and Interference Management (Advanced SSIM), Mobile Edge Computing (MEC), Mobile Cloud Computing (MCC), Data Analytics, and Big Data.
This combination of technologies is able to satisfy requirements of new applications.
The proposed novel architecture is modular, efficient, agile, scalable, and simple, and it is able to satisfy the high data volumes and application demands.
In sentence classification tasks, additional contexts, such as the neighboring sentences, may improve the accuracy of the classifier.
However, such contexts are domain-dependent and thus cannot be used for another classification task with an inappropriate domain.
In contrast, we propose the use of translated sentences as context that is always available regardless of the domain.
We find that naive feature expansion of translations gains only marginal improvements and may decrease the performance of the classifier, because possibly inaccurate translations produce noisy sentence vectors.
To this end, we present multiple context fixing attachment (MCFA), a series of modules attached to multiple sentence vectors to fix the noise in the vectors using the other sentence vectors as context.
We show that our method performs competitively compared to previous models, achieving best classification performance on multiple data sets.
We are the first to use translations as domain-free contexts for sentence classification.
An approach to the formal description of service contracts is presented in terms of automata.
We focus on the basic property of guaranteeing that in the multi-party composition of principals each of them gets his requests satisfied, so that the overall composition reaches its goal.
Depending on whether requests are satisfied synchronously or asynchronously, we construct an orchestrator that at static time either yields composed services enjoying the required properties or detects the principals responsible for possible violations.
To do that in the asynchronous case we resort to Linear Programming techniques.
We also relate our automata with two logically based methods for specifying contracts.
We present a novel distributed Gauss-Newton method for the non-linear state estimation (SE) model based on a probabilistic inference method called belief propagation (BP).
The main novelty of our work comes from applying BP sequentially over a sequence of linear approximations of the SE model, akin to what is done by the Gauss-Newton method.
The resulting iterative Gauss-Newton belief propagation (GN-BP) algorithm can be interpreted as a distributed Gauss-Newton method with the same accuracy as the centralized SE, however, introducing a number of advantages of the BP framework.
The paper provides extensive numerical study of the GN-BP algorithm, provides details on its convergence behavior, and gives a number of useful insights for its implementation.
Traditional event detection methods heavily rely on manually engineered rich features.
Recent deep learning approaches alleviate this problem by automatic feature engineering.
But such efforts, like traditional methods, have so far focused only on single-token event mentions, whereas in practice events can also be phrases.
We instead use forward-backward recurrent neural networks (FBRNNs) to detect events that can be either words or phrases.
To the best of our knowledge, this is one of the first efforts to handle multi-word events and also the first attempt to use RNNs for event detection.
Experimental results demonstrate that FBRNN is competitive with the state-of-the-art methods on the ACE 2005 and the Rich ERE 2015 event detection tasks.
In this paper, we propose a novel method to search for precise locations of paired note onsets and offsets in a singing voice signal.
In comparison with existing onset detection algorithms, our approach differs in two key respects.
First, we employ correntropy, a generalized correlation function inspired by Rényi's entropy, as a detection function to capture the instantaneous flux while remaining insensitive to outliers.
Next, a novel peak picking algorithm is specially designed for this detection function.
By calculating the fitness of a pre-defined inverse hyperbolic kernel to a detection function, it is possible to find an onset and its corresponding offset simultaneously.
Experimental results show that the proposed method achieves performance significantly better than or comparable to other state-of-the-art techniques for onset detection in singing voice.
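The outlier insensitivity of correntropy is easy to see in a minimal sketch (the Gaussian-kernel form below is the standard sample estimator; the kernel width and signals are illustrative):

```python
import numpy as np

def correntropy(x, y, sigma=1.0):
    """Sample correntropy: mean Gaussian kernel of pairwise differences.
    Large outlier differences map to near-zero kernel values, so they
    barely move the estimate, unlike squared-error measures."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.mean(np.exp(-d**2 / (2 * sigma**2))))
```

For identical signals the value is 1; a single large outlier lowers it only by roughly 1/N rather than dominating it.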
Principal Component Analysis (PCA) is a classical feature extraction and data representation technique widely used in pattern recognition.
It is one of the most successful techniques in face recognition.
However, it has the drawback of high computational cost, especially for large databases.
This paper conducts a study to optimize the time complexity of PCA (eigenfaces) without affecting recognition performance.
The authors minimize the number of participating eigenvectors, which consequently decreases the computation time.
A comparison is made between the recognition time of the original algorithm and that of the enhanced algorithm.
The performance of the original and the enhanced proposed algorithm is tested on the face94 face database.
Experimental results show that the recognition time is reduced by 35% by applying our proposed enhanced algorithm.
DET curves are used to illustrate the experimental results.
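The core idea, projecting onto fewer eigenvectors to cut matching time, can be sketched as follows (a generic eigenfaces computation, not the paper's exact implementation):

```python
import numpy as np

def eigenfaces(X, k):
    """Project face vectors onto the top-k eigenvectors of the covariance
    matrix. A smaller k means shorter projection and matching time."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Direct (d x d) eigen-decomposition; for high-dimensional face images
    # one would use the (n x n) Gram-matrix trick or an SVD instead.
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # top-k eigenfaces as columns
    return mean, top, Xc @ top  # low-dimensional codes for recognition
```

Recognition then compares a probe's k-dimensional code against the gallery codes, so cost scales with k rather than the image dimension.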
Domain adversarial learning aligns the feature distributions across the source and target domains in a two-player minimax game.
Existing domain adversarial networks generally assume identical label space across different domains.
In the presence of big data, there is strong motivation of transferring deep models from existing big domains to unknown small domains.
This paper introduces partial domain adaptation as a new domain adaptation scenario, which relaxes the fully shared label space assumption to that the source label space subsumes the target label space.
Previous methods typically match the whole source domain to the target domain, making them vulnerable to negative transfer in the partial domain adaptation setting due to the large mismatch between label spaces.
We present Partial Adversarial Domain Adaptation (PADA), which simultaneously alleviates negative transfer by down-weighing the data of outlier source classes for training both source classifier and domain adversary, and promotes positive transfer by matching the feature distributions in the shared label space.
Experiments show that PADA exceeds state-of-the-art results for partial domain adaptation tasks on several datasets.
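The down-weighting idea can be sketched simply: average the classifier's predicted class probabilities over target data, so source classes absent from the target get small weights. This is a simplified rendering of PADA's weighting, not the exact training procedure:

```python
import numpy as np

def class_weights(target_probs):
    """Per-class weights from mean softmax predictions on target data.
    Outlier source classes (rarely predicted on the target) get weights
    near zero and are down-weighed in the classifier and domain adversary."""
    w = np.asarray(target_probs, dtype=float).mean(axis=0)
    return w / w.max()  # normalize so the largest weight is 1
```

During training these weights multiply the per-example losses according to each example's source label.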
The 2016 U.S. presidential election has witnessed the major role of Twitter in the year's most important political event.
Candidates used this social media platform extensively for online campaigns.
Meanwhile, social media has been filled with rumors, which might have had huge impacts on voters' decisions.
In this paper, we present a thorough analysis of rumor tweets from the followers of two presidential candidates: Hillary Clinton and Donald Trump.
To overcome the difficulty of labeling a large amount of tweets as training data, we detect rumor tweets by matching them with verified rumor articles.
We analyze over 8 million tweets collected from the followers of the two candidates.
Our results provide answers to several primary concerns about rumors in this election, including: which side of the followers posted the most rumors, who posted these rumors, what rumors they posted, and when they posted these rumors.
The insights of this paper can help us understand the online rumor behaviors in American politics.
The image-to-GPS verification problem asks whether a given image is taken at a claimed GPS location.
In this paper, we treat it as an image verification problem -- whether a query image is taken at the same place as a reference image retrieved at the claimed GPS location.
We make three major contributions: 1) we propose a novel custom bottom-up pattern matching (BUPM) deep neural network solution; 2) we demonstrate that the verification can be directly done by cross-checking a perspective-looking query image and a panorama reference image; and 3) we collect and clean a dataset of 30K query-and-reference image pairs.
Our experimental results show that the proposed BUPM solution outperforms the state-of-the-art solutions in terms of both verification and localization.
Intrinsic image decomposition is a severely under-constrained problem.
User interactions can help to reduce the ambiguity of the decomposition considerably.
The traditional way of user interaction is to draw scribbles that indicate regions with constant reflectance or shading.
However, the effective scope of each scribble is quite limited, so dozens of scribbles are often needed to rectify the whole decomposition, which is time-consuming.
In this paper we propose an efficient form of user interaction in which users need only annotate the color composition of the image.
Color composition reveals the global distribution of reflectance, so it can help to adapt the whole decomposition directly.
We build a generative model of the process that the albedo of the material produces both the reflectance through imaging and the color labels by color naming.
Our model fuses effectively the physical properties of image formation and the top-down information from human color perception.
Experimental results show that color naming can improve the performance of intrinsic image decomposition, especially in cleaning the shadows left in reflectance and solving the color constancy problem.
Recent years have seen growing interest in exploiting dual- and multi-energy measurements in computed tomography (CT) in order to characterize material properties as well as object shape.
Material characterization is performed by decomposing the scene into constitutive basis functions, such as Compton scatter and photoelectric absorption functions.
While well motivated physically, the joint recovery of the spatial distribution of photoelectric and Compton properties is severely complicated by the fact that the data are several orders of magnitude more sensitive to Compton scatter coefficients than to photoelectric absorption, so small errors in Compton estimates can create large artifacts in the photoelectric estimate.
To address these issues, we propose a model-based iterative approach which uses patch-based regularization terms to stabilize inversion of photoelectric coefficients, and we solve the resulting problem through use of computationally attractive Alternating Direction Method of Multipliers (ADMM) solution techniques.
Using simulations and experimental data acquired on a commercial scanner, we demonstrate that the proposed processing can lead to more stable material property estimates which should aid materials characterization in future dual- and multi-energy CT systems.
Recently ensemble selection for consensus clustering has emerged as a research problem in Machine Intelligence.
Normally, consensus clustering algorithms take into account the entire ensemble of clusterings, and a very large ensemble tends to be generated before computing its consensus.
One can avoid considering the entire ensemble and instead judiciously select a few partitions from it without compromising the quality of the consensus.
This may result in an efficient consensus computation technique and may save unnecessary computational overheads.
The ensemble selection problem addresses this issue of consensus clustering.
In this paper, we propose an efficient method of ensemble selection for a large ensemble.
We prioritize the partitions in the ensemble based on diversity and frequency.
Our method selects top K of the partitions in order of priority, where K is decided by the user.
We observe that considering jointly the diversity and frequency helps in identifying few representative partitions whose consensus is qualitatively better than the consensus of the entire ensemble.
Experimental analysis on a large number of datasets shows that our method gives better results than earlier ensemble selection methods.
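A toy sketch of priority-based selection: rank partitions by frequency (how often the same clustering re-appears in the ensemble) and diversity (mean label disagreement with the other candidates), then keep the top K. The scoring below is an illustrative stand-in for the paper's exact priority scheme, and raw label disagreement is not permutation-invariant as a real diversity measure would be:

```python
import numpy as np
from collections import Counter

def select_top_k(partitions, k):
    """Select k representative partitions from an ensemble, prioritizing
    frequent partitions first and, among ties, more diverse ones."""
    keys = [tuple(p) for p in partitions]
    freq = Counter(keys)
    uniq = list(dict.fromkeys(keys))  # deduplicate, preserve order

    def diversity(p):
        others = [q for q in uniq if q != p]
        if not others:
            return 0.0
        return float(np.mean([np.mean(np.array(p) != np.array(q))
                              for q in others]))

    ranked = sorted(uniq, key=lambda p: (freq[p], diversity(p)), reverse=True)
    return [list(p) for p in ranked[:k]]
```

Consensus is then computed only over the selected K partitions instead of the full ensemble.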
Mining the silent members of an online community, also called lurkers, has been recognized as an important problem that accompanies the extensive use of online social networks (OSNs).
Existing solutions to the ranking of lurkers can aid understanding the lurking behaviors in an OSN.
However, they are limited to using only structural properties of the static network graph, thus ignoring relevant information concerning the time dimension.
Our goal in this work is to push forward research in lurker mining in a twofold manner: (i) to provide an in-depth analysis of temporal aspects that aims to unveil the behavior of lurkers and their relations with other users, and (ii) to enhance existing methods for ranking lurkers by integrating different time-aware properties concerning information-production and information-consumption actions.
Network analysis and ranking evaluation performed on Flickr, FriendFeed and Instagram networks allowed us to draw interesting remarks on both the understanding of lurking dynamics and on transient and cumulative scenarios of time-aware ranking.
Humans and most animals can learn new tasks without forgetting old ones.
However, training artificial neural networks (ANNs) on new tasks typically causes them to forget previously learned tasks.
This phenomenon is the result of "catastrophic forgetting", in which training an ANN disrupts connection weights that were important for solving previous tasks, degrading task performance.
Several recent studies have proposed methods to stabilize connection weights of ANNs that are deemed most important for solving a task, which helps alleviate catastrophic forgetting.
Here, drawing inspiration from algorithms that are believed to be implemented in vivo, we propose a complementary method: adding a context-dependent gating signal, such that only sparse, mostly non-overlapping patterns of units are active for any one task.
This method is easy to implement, requires little computational overhead, and allows ANNs to maintain high performance across large numbers of sequentially presented tasks when combined with weight stabilization.
This work provides another example of how neuroscience-inspired algorithms can benefit ANN design and capability.
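The gating mechanism can be sketched in a few lines: one fixed, sparse binary mask per task, applied to the hidden units. The keep fraction and sizes below are illustrative assumptions, not the paper's tuned values:

```python
import numpy as np

def make_gates(n_tasks, n_units, keep=0.2, seed=0):
    """One fixed random binary mask per task; only ~keep of the units are
    active for any task, so tasks use mostly non-overlapping sub-networks."""
    rng = np.random.default_rng(seed)
    return (rng.random((n_tasks, n_units)) < keep).astype(float)

def gated_forward(h, gates, task_id):
    """Apply the current task's gate to a hidden activation vector."""
    return h * gates[task_id]
```

Because each task touches a different sparse subset of units, new-task gradients mostly avoid the weights other tasks rely on, complementing weight-stabilization methods.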
Autonomous vehicles (AVs) will revolutionize ground transport and take a substantial role in the future transportation system.
Most AVs are likely to be electric vehicles (EVs) and they can participate in the vehicle-to-grid (V2G) system to support various V2G services.
Although it is generally infeasible to dictate the routes of EVs, we can design AV travel plans to fulfill certain system-wide objectives.
In this paper, we focus on the AVs looking for parking and study how they can be led to appropriate parking facilities to support V2G services.
We formulate the Coordinated Parking Problem (CPP), which can be solved by a standard integer linear program solver but requires long computational time.
To make it more practical, we develop a distributed algorithm to address CPP based on dual decomposition.
We carry out a series of simulations to evaluate the proposed solution methods.
Our results show that the distributed algorithm can produce nearly optimal solutions with substantially less computational time.
A coarser time scale can improve computation time but degrades the solution quality, possibly resulting in infeasible solutions.
Even with communication loss, the distributed algorithm still performs well and converges with only a little degradation in speed.
What features are associated with higher tweet effectiveness?
Through the mining of 122 million engagements of 2.5 million original tweets, we present a systematic review of tweet time, entities, composition, and user account features.
We show that the relationship between various features and tweeting effectiveness is non-linear; for example, tweets that use a few hashtags are more effective than those using none or too many.
This research closely relates to various industrial applications that are based on tweet features, including the analysis of advertising campaigns, the prediction of user engagement, the extraction of signals for automated trading, etc.
Computer science provides an in-depth understanding of technical aspects of programming concepts, but if we want to understand how programming concepts evolve, how programmers think and talk about them and how they are used in practice, we need to consider a broader perspective that includes historical, philosophical and cognitive aspects.
In this paper, we develop such broader understanding of monads, a programming concept that has an infamous formal definition, syntactic support in several programming languages and a reputation for being elegant and powerful, but also intimidating and difficult to grasp.
This paper is not a monad tutorial.
It will not tell you what a monad is.
Instead, it helps you understand how computer scientists and programmers talk about monads and why they do so.
To answer these questions, we review the history of monads in the context of programming and study the development through the perspectives of philosophy of science, philosophy of mathematics and cognitive sciences.
More generally, we present a framework for understanding programming concepts that considers them at three levels: formal, metaphorical and implementation.
We base such observations on established results about the scientific method and mathematical entities -- cognitive sciences suggest that the metaphors used when thinking about monads are more important than widely accepted, while philosophy of science explains how the research paradigm from which monads originate influences and restricts their use.
Finally, we provide evidence for why a broader philosophical, sociological look at programming concepts should be of interest for programmers.
It lets us understand programming concepts better and, fundamentally, choose more appropriate abstractions, as illustrated in a number of case studies that conclude the paper.
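Although the paper deliberately avoids a tutorial-style definition, the "implementation level" it distinguishes can be glimpsed in a minimal sketch. The Maybe-style class below is our own illustrative example, not the paper's formulation:

```python
# A minimal Maybe-monad sketch: `unit` wraps a value and `bind` threads it
# through failure-aware steps, short-circuiting once a step fails.

class Maybe:
    def __init__(self, value, ok=True):
        self.value, self.ok = value, ok

    @staticmethod
    def unit(value):              # 'return' in Haskell terminology
        return Maybe(value)

    @staticmethod
    def nothing():                # the failure case
        return Maybe(None, ok=False)

    def bind(self, f):            # '>>=': apply f, propagate failure untouched
        return f(self.value) if self.ok else self

def safe_div(x, y):
    return Maybe.nothing() if y == 0 else Maybe.unit(x / y)

# chaining: a failure anywhere propagates without explicit checks
r1 = Maybe.unit(10).bind(lambda v: safe_div(v, 2)).bind(lambda v: safe_div(v, 5))
r2 = Maybe.unit(10).bind(lambda v: safe_div(v, 0)).bind(lambda v: safe_div(v, 5))
```

The formal level states the laws such a structure must obey; the metaphorical level is how programmers talk about it (a "container", a "computation"); the code above is one concrete inhabitant of the implementation level.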
Mobile edge cloud is emerging as a promising technology to the internet of things and cyber-physical system applications such as smart home and intelligent video surveillance.
In a smart home, various sensors are deployed to monitor the home environment and physiological health of individuals.
The data collected by sensors are sent to an application, where numerous algorithms for emotion and sentiment detection, activity recognition and situation management are applied to provide healthcare- and emergency-related services and to manage resources at the home.
The executions of these algorithms require a vast amount of computing and storage resources.
To address the issue, the conventional approach is to send the collected data to an application on an internet cloud.
This approach has several problems such as high communication latency, communication energy consumption and unnecessary data traffic to the core network.
To overcome the drawbacks of the conventional cloud-based approach, a new system called mobile edge cloud is proposed.
In mobile edge cloud, multiple mobiles and stationary devices interconnected through wireless local area networks are combined to create a small cloud infrastructure at a local physical area such as a home.
Compared to traditional mobile distributed computing systems, mobile edge cloud introduces several complex challenges due to the heterogeneous computing environment, heterogeneous and dynamic network environment, node mobility, and limited battery power.
The real-time requirements associated with the internet of things and cyber-physical system applications make the problem even more challenging.
In this paper, we describe the applications and challenges associated with the design and development of mobile edge cloud system and propose an architecture based on a cross layer design approach for effective decision making.
Adaptive Computation Time for Recurrent Neural Networks (ACT) is one of the most promising architectures for variable computation.
ACT adapts to the input sequence by looking at each sample more than once and learning how many times it should do so.
In this paper, we compare ACT to Repeat-RNN, a novel architecture based on repeating each sample a fixed number of times.
Surprisingly, we found that Repeat-RNN performs as well as ACT on the selected tasks.
Source code in TensorFlow and PyTorch is publicly available at https://imatge-upc.github.io/danifojo-2018-repeatrnn/
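The core idea of Repeat-RNN, feeding each sample to the recurrent cell a fixed number of times, can be sketched with a toy scalar cell. The update rule below is illustrative; the actual models are the TensorFlow/PyTorch implementations linked above:

```python
import math

# Sketch of the Repeat-RNN idea: each input sample is fed to the recurrent
# cell a fixed number of times before moving on. The scalar tanh update is a
# stand-in for a real RNN cell.

def repeat_rnn(inputs, repeats=3, hidden=0.0, alpha=0.5):
    states = []
    for x in inputs:
        for _ in range(repeats):  # the cell sees the same sample `repeats` times
            hidden = math.tanh(alpha * hidden + x)
        states.append(hidden)
    return states
```

Unlike ACT, there is nothing to learn about how long to ponder each sample; `repeats` is a plain hyperparameter.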
Almost all known secret sharing schemes work on numbers.
Such methods will have difficulty in sharing graphs since the number of graphs increases exponentially with the number of nodes.
We propose a secret sharing scheme for graphs where we use graph intersection for reconstructing the secret, which is hidden as a subgraph in the shares.
Our method does not rely on heavy computational operations such as modular arithmetic or polynomial interpolation but makes use of very basic operations like assignment and checking for equality, and graph intersection can also be performed visually.
In certain cases, the secret could be reconstructed using just pencil and paper by authorised parties but cannot be broken by an adversary even with unbounded computational power.
The method achieves perfect secrecy for (2, n) scheme and requires far fewer operations compared to Shamir's algorithm.
The proposed method could be used to share objects such as matrices, sets, plain text and even a heterogeneous collection of these.
Since we do not require a previously agreed upon encoding scheme, the method is very suitable for sharing heterogeneous collection of objects in a dynamic fashion.
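The reconstruct-by-intersection idea can be sketched as follows. The decoy-edge construction here is our own simplification for illustration; it does not reproduce the paper's perfect-secrecy guarantees:

```python
# Toy sketch of graph secret sharing: the secret is hidden as a subgraph in
# every share, padded with share-specific decoy edges. Because the decoy
# pools are pairwise disjoint, intersecting any two shares recovers exactly
# the secret edge set -- no modular arithmetic or interpolation needed.

def make_shares(secret_edges, decoy_pools):
    # each share = secret edges plus its own disjoint pool of decoy edges
    return [set(secret_edges) | set(pool) for pool in decoy_pools]

def reconstruct(share_a, share_b):
    # any two shares intersect exactly in the secret subgraph
    return share_a & share_b

secret = {(1, 2), (2, 3)}
shares = make_shares(secret, [[(4, 5), (5, 6)], [(7, 8)], [(9, 10), (10, 11)]])
```

Note that the intersection could equally be carried out visually on drawings of the share graphs, which is the property the authors highlight.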
In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways.
This topic has been thoroughly studied on recurrent architectures.
In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture.
We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical.
We evaluate our methods on tasks of multimodal translation and translation with multiple source languages.
The experiments show that the models are able to use multiple sources and improve over single source baselines.
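One of the four strategies, the parallel combination, can be sketched numerically: the decoder query attends to each source independently and the resulting context vectors are averaged. Shapes and the averaging choice below are our own illustrative assumptions, not the paper's Transformer implementation:

```python
import numpy as np

# Sketch of a "parallel" multi-source encoder-decoder attention: one scaled
# dot-product attention per source, contexts combined by averaging.

def attention(query, keys, values):
    scores = keys @ query / np.sqrt(query.size)  # scaled dot-product scores
    weights = np.exp(scores - scores.max())      # stable softmax
    weights /= weights.sum()
    return weights @ values                      # weighted sum of values

def parallel_combination(query, sources):
    # sources: list of (keys, values) pairs, one per input source
    contexts = [attention(query, k, v) for k, v in sources]
    return np.mean(contexts, axis=0)
```

The serial, flat, and hierarchical variants differ in whether sources are attended in sequence, concatenated into one key/value set, or combined through a second attention over the per-source contexts.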
Machine learning has celebrated a lot of achievements on computer vision tasks such as object detection, but the traditionally used models work with relatively low resolution images.
The resolution of recording devices is gradually increasing and there is a rising need for new methods of processing high resolution data.
We propose an attention pipeline method which uses two staged evaluation of each image or video frame under rough and refined resolution to limit the total number of necessary evaluations.
For both stages, we make use of the fast object detection model YOLO v2.
Our implementation distributes the work across multiple GPUs.
We maintain high accuracy while reaching an average performance of 3-6 fps on 4K video and 2 fps on 8K video.
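The two-stage control flow can be sketched with stub detectors standing in for YOLO v2; the scaling scheme and box format below are illustrative assumptions:

```python
# Sketch of a coarse-to-fine attention pipeline: a cheap pass on a
# downscaled frame proposes active regions, and only those regions are
# re-evaluated at full resolution, limiting total detector evaluations.

def two_stage(frame_w, frame_h, coarse_detect, fine_detect, scale=4):
    # stage 1: rough pass at 1/scale resolution
    rough = coarse_detect(frame_w // scale, frame_h // scale)
    # stage 2: refined pass only inside up-scaled candidate boxes (x, y, w, h)
    results = []
    for (x, y, w, h) in rough:
        region = (x * scale, y * scale, w * scale, h * scale)
        results.extend(fine_detect(region))
    return results
```

When the coarse pass finds few active regions, most of the 4K/8K frame is never evaluated at full resolution, which is where the speedup comes from.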
Conversational agents are exploding in popularity.
However, much work remains in the area of social conversation as well as free-form conversation over a broad range of domains and topics.
To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million-dollar university competition where sixteen selected university teams were challenged to build conversational agents, known as socialbots, to converse coherently and engagingly with humans on popular topics such as Sports, Politics, Entertainment, Fashion and Technology for 20 minutes.
The Alexa Prize offers the academic community a unique opportunity to perform research with a live system used by millions of users.
The competition provided university teams with real user conversational data at scale, along with the user-provided ratings and feedback augmented with annotations by the Alexa team.
This enabled teams to effectively iterate and make improvements throughout the competition while being evaluated in real-time through live user interactions.
To build their socialbots, university teams combined state-of-the-art techniques with novel strategies in the areas of Natural Language Understanding, Context Modeling, Dialog Management, Response Generation, and Knowledge Acquisition.
To support the efforts of participating teams, the Alexa Prize team made significant scientific and engineering investments to build and improve Conversational Speech Recognition, Topic Tracking, Dialog Evaluation, Voice User Experience, and tools for traffic management and scalability.
This paper outlines the advances created by the university teams as well as the Alexa Prize team to achieve the common goal of solving the problem of Conversational AI.
This is a reflection on the author's experience in teaching logic at the graduate level in a computer science department.
The main lesson is that model building and the process of modelling must be placed at the centre stage of logic teaching.
Furthermore, effective use must be supported with adequate tools.
Finally, logic is the methodology underlying many applications, it is hence paramount to pass on its principles, methods and concepts to computer science audiences.
We present the first method to capture the 3D total motion of a target person from a monocular view input.
Given an image or a monocular video, our method reconstructs the motion from body, face, and fingers represented by a 3D deformable mesh model.
We use an efficient representation called 3D Part Orientation Fields (POFs), to encode the 3D orientations of all body parts in the common 2D image space.
POFs are predicted by a Fully Convolutional Network (FCN), along with the joint confidence maps.
To train our network, we collect a new 3D human motion dataset capturing diverse total body motion of 40 subjects in a multiview system.
We leverage a 3D deformable human model to reconstruct total body pose from the CNN outputs by exploiting the pose and shape prior in the model.
We also present a texture-based tracking method to obtain temporally coherent motion capture output.
We perform thorough quantitative evaluations including comparison with the existing body-specific and hand-specific methods, and performance analysis on camera viewpoint and human pose changes.
Finally, we demonstrate the results of our total body motion capture on various challenging in-the-wild videos.
Our code and newly collected human motion dataset will be publicly shared.
Nauticle is a general-purpose simulation tool for the flexible and highly configurable application of particle-based methods of either discrete or continuum phenomena.
Nauticle has three distinct layers for users and developers; the top two layers are discussed here in detail.
The paper introduces the Symbolic Form Language (SFL) of Nauticle, which facilitates the formulation of user-defined numerical models at the top level in text-based configuration files, and provides simple application examples.
At the intermediate level, we show that the SFL can be intuitively extended with new particle methods without tedious recoding or even knowledge of the bottom level.
Finally, the efficiency of the code is also tested through a performance benchmark.
This paper proposes a novel approach for efficiently evaluating regular path queries over provenance graphs of workflows that may include recursion.
The approach assumes that an execution g of a workflow G is labeled with query-agnostic reachability labels using an existing technique.
At query time, given g, G and a regular path query R, the approach decomposes R into a set of subqueries R1, ..., Rk that are safe for G. For each safe subquery Ri, G is rewritten so that, using the reachability labels of nodes in g, whether or not there is a path which matches Ri between two nodes can be decided in constant time.
The results of each safe subquery are then composed, possibly with some small unsafe remainder, to produce an answer to R. The approach results in an algorithm that significantly reduces the number of subqueries k over existing techniques by increasing their size and complexity, and that evaluates each subquery in time bounded by its input and output size.
Experimental results demonstrate the benefit of this approach.
We propose a new algorithm to the problem of polygonal curve approximation based on a multiresolution approach.
This algorithm is suboptimal but still maintains some optimality between successive levels of resolution using dynamic programming.
We show theoretically and experimentally that this algorithm has a linear complexity in time and space.
We experimentally compare the outcomes of our algorithm to the optimal "full search" dynamic programming solution and finally to classical merge and split approaches.
The experimental evaluations confirm the theoretical derivations and show that, on 2D coastal maps, the proposed approach either has a lower time complexity or provides polygonal approximations closer to the input discrete curves.
Mild traumatic brain injury is a growing public health problem with an estimated incidence of over 1.7 million people annually in US.
Diagnosis is based on clinical history and symptoms, and accurate, concrete measures of injury are lacking.
This work aims to directly use diffusion MR images obtained within one month of trauma to detect injury, by incorporating deep learning techniques.
To overcome the challenge of limited training data, we describe each brain region using a bag-of-words representation, which specifies the distribution of representative patch patterns.
We apply a convolutional auto-encoder to learn patch-level features, in an unsupervised manner, from overlapping image patches extracted from the diffusion MR images.
Our experimental results show that the bag-of-words representation using patch-level features learnt by the auto-encoder performs similarly to that using the raw patch patterns; both significantly outperform earlier work relying on the mean values of MR metrics in selected brain regions.
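The bag-of-words step itself can be sketched independently of the auto-encoder: assign each patch feature to its nearest codebook entry and summarise a region by the normalised codeword histogram. The codebook here is a placeholder for the learned features:

```python
import numpy as np

# Sketch of a bag-of-words region descriptor: each patch-level feature vector
# is mapped to its nearest codeword, and the region is described by the
# normalised histogram of codeword counts.

def bag_of_words(patch_features, codebook):
    # pairwise distances between every patch feature and every codeword
    dists = np.linalg.norm(patch_features[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)                  # nearest codeword per patch
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                      # distribution over codewords
```

In the paper's setting, `patch_features` would be auto-encoder activations of overlapping MR patches, and the histogram is the fixed-length descriptor fed to the classifier.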
Surrogate models are a well established approach to reduce the number of expensive function evaluations in continuous optimization.
In the context of genetic programming, surrogate modeling still poses a challenge, due to the complex genotype-phenotype relationships.
We investigate how different genotypic and phenotypic distance measures can be used to learn Kriging models as surrogates.
We compare the measures and suggest to use their linear combination in a kernel.
We test the resulting model in an optimization framework, using symbolic regression problem instances as a benchmark.
Our experiments show that the model provides valuable information.
Firstly, the model enables an improved optimization performance compared to a model-free algorithm.
Furthermore, the model provides information on the contribution of different distance measures.
The data indicates that a phenotypic distance measure is important during the early stages of an optimization run when less data is available.
In contrast, genotypic measures, such as the tree edit distance, contribute more during the later stages.
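The suggested linear combination of distances inside a kernel can be sketched as follows; the exponential form, weights, and toy distances are illustrative assumptions, not the paper's fitted Kriging model:

```python
import math

# Sketch of a combined-distance kernel: a weighted sum of a genotypic and a
# phenotypic distance plugged into an exponential correlation function, as
# would be used inside a Kriging surrogate.

def combined_kernel(a, b, d_geno, d_pheno, w_geno=0.5, w_pheno=0.5, theta=1.0):
    d = w_geno * d_geno(a, b) + w_pheno * d_pheno(a, b)
    return math.exp(-theta * d)
```

Tuning `w_geno` and `w_pheno` on data is what lets the model report how much each distance measure contributes at different stages of the run.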
Understanding the financial burden of chronic diseases in developing regions remains an important economic factor influencing the successful implementation of sensor-based applications for the continuous monitoring of chronic conditions.
Our research compares literature-based data with the real costs of managing and treating chronic diseases in a developing country, using Kosovo as an example.
The results reveal that the actual living costs exceed the minimum expenses that chronic diseases impose.
Given the potential positive economic impact of sensor-based platforms for monitoring chronic conditions, we further examined users' perception of digital technology.
The purpose of this paper is to present the varying costs of treating chronic diseases, identify users' concerns and requirements regarding digital technology, and discuss the issues and challenges that the application of sensor-based platforms implies in low- and middle-income countries.
Visual Object tracking research has undergone significant improvement in the past few years.
The emergence of tracking by detection approach in tracking paradigm has been quite successful in many ways.
Recently, deep convolutional neural networks have been extensively used in most successful trackers.
Yet, the standard approach has been based on correlation or feature selection with minimal consideration given to motion consistency.
Thus, there is still a need to capture various physical constraints through motion consistency which will improve accuracy, robustness and more importantly rotation adaptiveness.
Therefore, one of the major aspects of this paper is to investigate the outcome of rotation adaptiveness in visual object tracking.
Among other key contributions, the paper also incorporates various consistency constraints that turn out to be far more effective on numerous challenging sequences than the current state-of-the-art.
Authcoin is an alternative approach to the commonly used public key infrastructures such as central authorities and the PGP web of trust.
It combines a challenge-response-based validation and authentication process for domains, certificates, email accounts and public keys with the advantages of a blockchain-based storage system.
As a result, Authcoin does not suffer from the downsides of existing solutions and is much more resilient to sybil attacks.
Regular K-10 curricula often lack affordable technology that supports interactive ways of teaching the prescribed curriculum while building effective analytical skills.
In this paper, we present "PlutoAR", a paper-based augmented reality interpreter which is scalable, affordable, portable and can be used as a platform for skill building for the kids.
PlutoAR manages to overcome the conventional albeit non-interactive ways of teaching by incorporating augmented reality (AR) through an interactive toolkit to provide students with the best of both worlds.
Students cut out paper "tiles" and place these tiles one by one on a larger paper surface called "Launchpad" and use the PlutoAR mobile application which runs on any Android device with a camera and uses augmented reality to output each step of the program like an interpreter.
PlutoAR has inbuilt AR experiences like stories, maze solving using conditional loops, simple elementary mathematics and the intuition of gravity.
Network visualization allows a quick glance at how nodes (or actors) are connected by edges (or ties).
A conventional network diagram of "contact tree" maps out a root and branches that represent the structure of nodes and edges, often without further specifying leaves or fruits that would have grown from small branches.
By furnishing such a network structure with leaves and fruits, we reveal details about "contacts" in our ContactTrees that underline ties and relationships.
Our elegant design employs a bottom-up approach that resembles a recent attempt to understand subjective well-being by means of a series of emotions.
Such a bottom-up approach to social-network studies decomposes each tie into a series of interactions or contacts, which help deepen our understanding of the complexity embedded in a network structure.
Unlike previous network visualizations, ContactTrees can highlight how relationships form and change based upon interactions among actors, and how relationships and networks vary by contact attributes.
Based on a botanical tree metaphor, the design is easy to construct and the resulting tree-like visualization can display many properties at both tie and contact levels, a key ingredient missing from conventional techniques of network visualization.
We first demonstrate ContactTrees using a dataset consisting of three waves of 3-month contact diaries over the 2004-2012 period, then compare ContactTrees with alternative tools and discuss how this tool can be applied to other types of datasets.
In social choice settings with linear preferences, random dictatorship is known to be the only social decision scheme satisfying strategyproofness and ex post efficiency.
When also allowing indifferences, random serial dictatorship (RSD) is a well-known generalization of random dictatorship that retains both properties.
RSD has been particularly successful in the special domain of random assignment where indifferences are unavoidable.
While executing RSD is obviously feasible, we show that computing the resulting probabilities is #P-complete and thus intractable, both in the context of voting and assignment.
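The contrast between easy execution and hard probability computation can be made concrete with a brute-force sketch: enumerating all agent orderings computes the exact RSD probabilities, but the factorial blow-up this exhibits is exactly what the #P-hardness result rules out avoiding in general. The encoding of preferences as indifference classes is our own illustrative choice:

```python
from itertools import permutations
from fractions import Fraction

# Brute-force RSD probabilities: run the serial dictatorship under every
# ordering of the agents, splitting probability uniformly over the
# alternatives that remain tied at the end.

def rsd_probabilities(alternatives, preferences):
    # preferences[i] is a list of indifference classes, most preferred first
    prob = {a: Fraction(0) for a in alternatives}
    orders = list(permutations(range(len(preferences))))
    for order in orders:
        feasible = set(alternatives)
        for agent in order:
            for cls in preferences[agent]:
                best = feasible & set(cls)
                if best:               # refine to the agent's best survivors
                    feasible = best
                    break
        share = Fraction(1, len(orders) * len(feasible))
        for a in feasible:             # remaining ties broken uniformly
            prob[a] += share
    return prob
```

For n agents this loop runs n! dictatorship simulations, which is feasible only for very small instances.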
Toxic online content has become a major issue in today's world due to an exponential increase in the use of internet by people of different cultures and educational background.
Differentiating hate speech and offensive language is a key challenge in automatic detection of toxic text content.
In this paper, we propose an approach to automatically classify tweets on Twitter into three classes: hateful, offensive and clean.
Using a Twitter dataset, we perform experiments considering n-grams as features and passing their term frequency-inverse document frequency (TFIDF) values to multiple machine learning models.
We perform comparative analysis of the models considering several values of n in n-grams and TFIDF normalization methods.
After tuning the model giving the best results, we achieve 95.6% accuracy upon evaluating it on test data.
We also create a module which serves as an intermediate between user and Twitter.
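The n-gram TF-IDF featurisation described above can be sketched from scratch; the tokenisation and weighting details below are standard textbook choices, not necessarily the exact variants the paper tuned:

```python
import math
from collections import Counter

# Sketch of word n-gram TF-IDF vectors: term frequency within each document
# scaled by inverse document frequency across the corpus. Such sparse
# vectors are what gets passed to the downstream classifiers.

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tfidf(docs, n=1):
    grams = [Counter(ngrams(d.split(), n)) for d in docs]
    df = Counter(g for c in grams for g in c)      # document frequency
    vectors = []
    for c in grams:
        total = sum(c.values())
        vectors.append({g: (cnt / total) * math.log(len(docs) / df[g])
                        for g, cnt in c.items()})
    return vectors
```

A term appearing in every document gets weight zero, so only discriminative n-grams survive into the feature vectors.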
Automatic annotation of images with descriptive words is a challenging problem with vast applications in the areas of image search and retrieval.
This problem can be viewed as a label-assignment problem by a classifier dealing with a very large set of labels, i.e., the vocabulary set.
We propose a novel annotation method that employs two layers of sparse coding and performs coarse-to-fine labeling.
Themes extracted from the training data are treated as coarse labels.
Each theme is a set of training images that share a common subject in their visual and textual contents.
Our system extracts coarse labels for training and test images without requiring any prior knowledge.
Vocabulary words are the fine labels to be associated with images.
Most of the annotation methods achieve low recall due to the large number of available fine labels, i.e., vocabulary words.
These systems also tend to achieve high precision for highly frequent words only while relatively rare words are more important for search and retrieval purposes.
Our system not only outperforms various previously proposed annotation systems, but also achieves symmetric response in terms of precision and recall.
Our system scores and maintains high precision for words with a wide range of frequencies.
Such behavior is achieved by intelligently reducing the number of available fine labels or words for each image based on coarse labels assigned to it.
This paper reports a new reading for wavelets, which is based on the classical 'De Broglie' principle.
The wave-particle duality principle is adapted to wavelets.
Every continuous basic wavelet is associated with a proper probability density, allowing defining the Shannon entropy of a wavelet.
Further entropy definitions are considered, such as Jumarie or Renyi entropy of wavelets.
We prove that any wavelet of the same family has the same Shannon entropy as its mother wavelet.
Finally, the Shannon entropy for a few standard wavelet families is determined.
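The entropy computation can be sketched numerically: normalise the squared wavelet into a probability density and evaluate the Shannon integral on a grid. The Mexican-hat mother wavelet and the grid parameters below are our own illustrative choices:

```python
import math

# Numerical sketch of wavelet Shannon entropy: p(t) = psi(t)^2 / ||psi||^2
# is treated as a probability density and H = -integral p log p is
# approximated by a Riemann sum. Translating the wavelet leaves H unchanged.

def mexican_hat(t):
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def shannon_entropy(wavelet, shift=0.0, lo=-10.0, hi=10.0, steps=4000):
    dt = (hi - lo) / steps
    vals = [wavelet(lo + i * dt - shift) ** 2 for i in range(steps)]
    norm = sum(vals) * dt                      # ||psi||^2 on the grid
    h = 0.0
    for v in vals:
        p = v / norm
        if p > 0.0:
            h -= p * math.log(p) * dt
    return h
```

Invariance under translation is immediate from the change of variables in the integral; the family-wide invariance the paper proves is the analogous statement across the whole wavelet family.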
There is a plethora of datasets in various formats which are usually stored in files, hosted in catalogs, or accessed through SPARQL endpoints.
In most cases, these datasets cannot be straightforwardly explored by end users, for satisfying recall-oriented information needs.
To fill this gap, in this paper we present the design and implementation of Facetize, an editor that allows users to transform (in an interactive manner) datasets, either static (i.e. stored in files), or dynamic (i.e. being the results of SPARQL queries), to datasets that can be directly explored effectively by themselves or other users.
The latter (exploration) is achieved through the familiar interaction paradigm of Faceted Search (and Preference-enriched Faceted Search).
Specifically in this paper we describe the requirements, we introduce the required set of transformations, and then we detail the functionality and the implementation of the editor Facetize that realizes these transformations.
The supported operations cover a wide range of tasks (selection, visibility, deletions, edits, definition of hierarchies, intervals, derived attributes, and others) and Facetize enables the user to carry them out in a user-friendly and guided manner, without presupposing any technical background (regarding data representation or query languages).
Finally we present the results of an evaluation with users.
To the best of our knowledge, this is the first editor for this kind of task.
Using Low-Cost Portable Eye Tracking for Biometric Identification or Verification: Eye tracking technologies have in recent years become available outside of specialised labs, and are starting to be integrated in tablets and virtual reality headsets.
This offers new opportunities for use in common office- and home environments, such as for biometric recognition (identification or verification), alone or in combination with other technologies.
This paper exposes two fundamentally different approaches that have been suggested, based on spatial and temporal signatures respectively.
While deploying different stimulation paradigms for recording, it also proposes an alternative way to analyze spatial domain signatures using Fourier transformation.
Empirical data recorded from two subjects over two weeks, three months apart, are found to support previous results.
Further, variations and stability of some of the proposed signatures are analyzed over the extended timeframe and under slightly varying conditions.
Convex optimization problems arise frequently in diverse machine learning (ML) applications.
First-order methods, i.e., those that solely rely on the gradient information, are most commonly used to solve these problems.
This choice is motivated by their simplicity and low per-iteration cost.
Second-order methods that rely on curvature information through the dense Hessian matrix have, thus far, proven to be prohibitively expensive at scale, both in terms of computational and memory requirements.
We present a novel multi-GPU distributed formulation of a second order (Newton-type) solver for convex finite sum minimization problems for multi-class classification.
Our distributed formulation relies on the Alternating Direction Method of Multipliers (ADMM), which requires only one round of communication per iteration -- significantly reducing communication overheads while incurring minimal convergence overhead.
By leveraging the computational capabilities of GPUs, we demonstrate that per-iteration costs of Newton-type methods can be significantly reduced to be on-par with, if not better than, state-of-the-art first-order alternatives.
Given their significantly faster convergence rates, we demonstrate that our methods can process large data-sets in much shorter time (orders of magnitude in many cases) compared to existing first and second order methods, while yielding similar test-accuracy results.
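The first- versus second-order contrast discussed above can be sketched on a one-dimensional quadratic f(x) = 0.5·h·x²: gradient descent needs many iterations, while one Newton step using the curvature solves it exactly. This is a textbook illustration, not the paper's distributed solver:

```python
# First-order vs second-order updates on f(x) = 0.5 * h * x^2,
# with f'(x) = h * x and f''(x) = h.

def gradient_descent(x, h, lr=0.05, iters=50):
    for _ in range(iters):
        x -= lr * h * x            # x <- x - lr * f'(x)
    return x

def newton_step(x, h):
    return x - (h * x) / h         # x <- x - f'(x) / f''(x): exact on a quadratic
```

The catch at scale is the cost of forming and inverting the Hessian, which is precisely what the GPU/ADMM formulation above is designed to amortise.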
Our results demonstrate that a previously reported protein-name co-occurrence method (5-mention PubGene), which was not based on a hypothesis-testing framework, is generally statistically more significant than the 99th-percentile Poisson-distribution-based method of calculating co-occurrence.
It agrees with previous methods that use natural language processing to extract protein-protein interactions from text, as more than 96% of the interactions found by those methods overlap with the results of the 5-mention PubGene method.
However, less than 2% of the gene co-expressions analyzed by microarray were found from direct co-occurrence or interaction information extraction from the literature.
At the same time, combining microarray and literature analyses, we derive a novel set of 7 potential functional protein-protein interactions that had not been previously described in the literature.
Articulated and flexible objects constitute a challenge for robot manipulation tasks but are present in different real-world settings, including home and industrial environments.
Current approaches to the manipulation of articulated and flexible objects employ ad hoc strategies to sequence and perform actions on them depending on a number of physical or geometrical characteristics related to those objects, as well as on an a priori classification of target object configurations.
In this paper, we propose an action planning and execution framework, which (i) considers abstract representations of articulated or flexible objects, (ii) integrates action planning to reason upon such configurations and to sequence an appropriate set of actions with the aim of obtaining a target configuration provided as a goal, and (iii) is able to cooperate with humans to collaboratively carry out the plan.
On the one hand, we show that a trade-off exists between the way articulated or flexible objects are perceived and how the system represents them.
Such a trade-off greatly impacts on the complexity of the planning process.
On the other hand, we demonstrate the system's capabilities in allowing humans to interrupt robot action execution, and - in general - to contribute to the whole manipulation process.
Results related to planning performance are discussed, and examples of a Baxter dual-arm manipulator performing actions collaboratively with humans are shown.
Information extraction traditionally focuses on extracting relations between identifiable entities, such as <Monterey, locatedIn, California>.
Yet, texts often also contain Counting information, stating that a subject is in a specific relation with a number of objects, without mentioning the objects themselves, for example, "California is divided into 58 counties".
Such counting quantifiers can help in a variety of tasks such as query answering or knowledge base curation, but are neglected by prior work.
This paper develops the first full-fledged system for extracting counting information from text, called CINEX.
We employ distant supervision using fact counts from a knowledge base as training seeds, and develop novel techniques for dealing with several challenges: (i) non-maximal training seeds due to the incompleteness of knowledge bases, (ii) sparse and skewed observations in text sources, and (iii) high diversity of linguistic patterns.
Experiments with five human-evaluated relations show that CINEX can achieve 60% average precision for extracting counting information.
In a large-scale experiment, we demonstrate the potential for knowledge base enrichment by applying CINEX to 2,474 frequent relations in Wikidata.
CINEX can assert the existence of 2.5M facts for 110 distinct relations, which is 28% more than the existing Wikidata facts for these relations.
We propose an online framework to detect cyber attacks on Automatic Generation Control (AGC).
A cyber attack detection algorithm is designed based on the approach of Dynamic Watermarking.
The detection algorithm provides a theoretical guarantee of detection of cyber attacks launched by sophisticated attackers possessing extensive knowledge of the physical and statistical models of targeted power systems.
The proposed framework is practically implementable, as it needs no hardware update on generation units.
The efficacy of the proposed framework is validated in both a four-area system and a 140-bus system.
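The Dynamic Watermarking principle can be sketched in a toy form: the controller superimposes a private random excitation on its commands and checks that reported measurements stay correlated with it; an attack that replays or fabricates measurements breaks the correlation. The noise model and thresholds below are illustrative assumptions, not the paper's AGC-specific tests:

```python
import random

# Toy Dynamic Watermarking sketch: compare the correlation between the
# private excitation and the reported measurements under honest sensing
# versus an attack that ignores the excitation.

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def run(attacked, steps=2000, seed=0):
    rng = random.Random(seed)
    watermark, measured = [], []
    for _ in range(steps):
        e = rng.gauss(0.0, 1.0)            # private excitation
        noise = rng.gauss(0.0, 1.0)
        watermark.append(e)
        # honest sensor reflects the excitation; the attack does not
        measured.append(noise if attacked else e + 0.1 * noise)
    return abs(correlation(watermark, measured))
```

Because the excitation is known only to the controller, a sophisticated attacker cannot reproduce the correlation, which is the source of the detection guarantee.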
Cities are engines of the knowledge-based economy, because they are the primary sites of knowledge production activities that subsequently shape the rate and direction of technological change and economic growth.
Patents provide a wealth of information to analyse the knowledge specialization at specific places, such as technological details and information on inventors and entities involved, including address information.
The technology codes on each patent document indicate the specialization and scope of the underlying technological knowledge of a given invention.
In this paper we introduce tools for portfolio analysis in terms of patents that provide insights into the technological specialization of cities.
The mapping and analysis of patent portfolios of cities using data from the United States Patent and Trademark Office (USPTO) website (at http://www.uspto.gov) and dedicated tools (at http://www.leydesdorff.net/portfolio) can be used to analyse the specialisation patterns of inventive activities among cities.
The results allow policy makers and other stakeholders to identify promising areas of further knowledge development and 'smart specialisation' strategies.
The task of unsupervised domain adaptation is proposed to transfer the knowledge of a label-rich domain (source domain) to a label-scarce domain (target domain).
Matching feature distributions between different domains is a widely applied method for the aforementioned task.
However, the method does not perform well when classes in the two domains are not identical.
Specifically, when the classes of the target correspond to a subset of those of the source, target samples can be incorrectly aligned with the classes that exist only in the source.
This problem setting is termed as partial domain adaptation (PDA).
In this study, we propose a novel method called Two Weighted Inconsistency-reduced Networks (TWINs) for PDA.
We utilize two classification networks to estimate the ratio of the target samples in each class with which a classification loss is weighted to adapt the classes present in the target domain.
Furthermore, to extract discriminative features for the target, we propose to minimize the divergence between domains measured by the classifiers' inconsistency on target samples.
We empirically demonstrate that reducing the inconsistency between two networks is effective for PDA and that our method outperforms other existing methods by a large margin on several datasets.
Predicting how Congressional legislators will vote is important for understanding their past and future behavior.
However, previous work on roll-call prediction has been limited to single-session settings and thus did not consider generalization across sessions.
In this paper, we show that metadata is crucial for modeling voting outcomes in new contexts, as changes between sessions lead to changes in the underlying data generation process.
We show how augmenting bill text with the sponsors' ideologies in a neural network model can achieve an average of a 4% boost in accuracy over the previous state-of-the-art.
Language provides simple ways of communicating generalizable knowledge to each other (e.g., "Birds fly", "John hikes", "Fire makes smoke").
Though found in every language and emerging early in development, the language of generalization is philosophically puzzling and has resisted precise formalization.
Here, we propose the first formal account of generalizations conveyed with language that makes quantitative predictions about human understanding.
We test our model in three diverse domains: generalizations about categories (generic language), events (habitual language), and causes (causal language).
The model explains the gradience in human endorsement through the interplay between a simple truth-conditional semantic theory and diverse beliefs about properties, formalized in a probabilistic model of language understanding.
This work opens the door to understanding precisely how abstract knowledge is learned from language.
Heart failure (HF) is one of the leading causes of hospital admissions in the US.
Readmission within 30 days after a HF hospitalization is both a recognized indicator for disease progression and a source of considerable financial burden to the healthcare system.
Consequently, the identification of patients at risk for readmission is a key step in improving disease management and patient outcome.
In this work, we used a large administrative claims dataset to (1) explore the systematic application of neural network-based models versus logistic regression for predicting 30-day all-cause readmission after discharge from a HF admission, and (2) examine the additive value of patients' hospitalization timelines on prediction performance.
Based on data from 272,778 (49% female) patients with a mean (SD) age of 73 years (14) and 343,328 HF admissions (67% of total admissions), we trained and tested our predictive readmission models following a stratified 5-fold cross-validation scheme.
Among the deep learning approaches, a recurrent neural network (RNN) combined with conditional random fields (CRF) model (RNNCRF) achieved the best performance in readmission prediction with 0.642 AUC (95% CI, 0.640-0.645).
Other models, such as those based on RNN, convolutional neural networks and CRF alone had lower performance, with a non-timeline based model (MLP) performing worst.
A competitive model based on logistic regression with LASSO achieved a performance of 0.643 AUC (95%CI, 0.640-0.646).
We conclude that data from patient timelines improve 30-day readmission prediction for neural network-based models, that logistic regression with LASSO performs as well as the best neural network model, and that the use of administrative data results in competitive performance compared to published approaches based on richer clinical datasets.
In this paper we compare several Python tools for automatic differentiation.
To assess differences in performance and precision, we use the problem of finding the optimal geometrical structure of a cluster of identical atoms.
First, we compare the performance of calculating gradients of the objective function.
We show that the PyADOL-C and PyCppAD tools have much better performance for large clusters than the other tools.
Second, we assess the precision of these two tools by calculating the difference between the gradient norms obtained at the optimal configuration.
We conclude that PyCppAD has the best overall performance, while having almost the same precision as the second-best performing tool, PyADOL-C.
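The comparison above hinges on differentiating a cluster-energy objective. As a rough, pure-Python illustration of the forward-mode machinery that tools like PyADOL-C and PyCppAD implement natively (the toy energy function and all names here are hypothetical, not the paper's benchmark), a dual-number class can be checked against central finite differences:

```python
class Dual:
    """Minimal forward-mode dual number: value plus derivative part."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __sub__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val - o.val, self.dot - o.dot)
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__
    def __pow__(self, n):  # integer powers are enough for the toy energy
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.dot)

def energy(x):
    """Toy 1D 'cluster' energy: pairwise quartic-minus-quadratic terms."""
    e = Dual(0.0) if isinstance(x[0], Dual) else 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            d = x[i] - x[j]
            e = e + d ** 4 - d ** 2
    return e

def grad_forward(f, x):
    """One forward pass per coordinate: seed dot=1 on coordinate k."""
    g = []
    for k in range(len(x)):
        duals = [Dual(v, 1.0 if i == k else 0.0) for i, v in enumerate(x)]
        g.append(f(duals).dot)
    return g

def grad_fd(f, x, h=1e-6):
    """Central finite differences, used only as a correctness check."""
    g = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g
```

Real tools trace arbitrary code and scale to large clusters; the sketch only conveys why the autodiff gradient is exact up to floating-point error while the finite-difference one is not.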
Conventional seq2seq chatbot models only try to find the sentences with the highest probabilities conditioned on the input sequences, without considering the sentiment of the output sentences.
Some research works trying to modify the sentiment of the output sequences were reported.
In this paper, we propose five models to scale or adjust the sentiment of the chatbot response: persona-based model, reinforcement learning, plug and play model, sentiment transformation network and cycleGAN, all based on the conventional seq2seq model.
We also develop two evaluation metrics to estimate if the responses are reasonable given the input.
These metrics together with other two popularly used metrics were used to analyze the performance of the five proposed models on different aspects, and reinforcement learning and cycleGAN were shown to be very attractive.
The evaluation metrics were also found to be well correlated with human evaluation.
The present paper proposes a novel transmission strategy, referred to as cocktail BPSK, in which two independent BPSKs are superposed with a non-orthogonal basis in a parallel transmission.
In contrast to conventional signal superpositions, the proposed scheme avoids interference between the two symbols, allows the symbols to reuse each other's energy, and gains extra energy.
Based on a formulation of the mutual information, the theoretical analysis shows that the cocktail BPSK scheme can achieve a high data rate beyond the channel capacity at very low SNR, and the numerical results confirm this analysis.
End-to-end approaches have drawn much attention recently for significantly simplifying the construction of an automatic speech recognition (ASR) system.
RNN transducer (RNN-T) is one of the popular end-to-end methods.
Previous studies have shown that RNN-T is difficult to train and a very complex training process is needed for a reasonable performance.
In this paper, we explore RNN-T for a Chinese large vocabulary continuous speech recognition (LVCSR) task and aim to simplify the training process while maintaining performance.
First, a new strategy of learning rate decay is proposed to accelerate the model convergence.
Second, we find that adding convolutional layers at the beginning of the network and using ordered data can discard the pre-training process of the encoder without loss of performance.
Besides, we design experiments to find a balance among GPU memory usage, training cycle and model performance.
Finally, we achieve 16.9% character error rate (CER) on our test set, a 2% absolute improvement over a strong BLSTM CE system with a language model trained on the same text corpus.
Verifying the identity of a person using handwritten signatures is challenging in the presence of skilled forgeries, where a forger has access to a person's signature and deliberately attempts to imitate it.
In offline (static) signature verification, the dynamic information of the signature writing process is lost, and it is difficult to design good feature extractors that can distinguish genuine signatures and skilled forgeries.
This is reflected in relatively poor performance, with verification errors around 7% for the best systems in the literature.
To address both the difficulty of obtaining good features, as well as improve system performance, we propose learning the representations from signature images, in a Writer-Independent format, using Convolutional Neural Networks.
In particular, we propose a novel formulation of the problem that includes knowledge of skilled forgeries from a subset of users in the feature learning process, that aims to capture visual cues that distinguish genuine signatures and forgeries regardless of the user.
Extensive experiments were conducted on four datasets: GPDS, MCYT, CEDAR and Brazilian PUC-PR datasets.
On GPDS-160, we obtained a large improvement in state-of-the-art performance, achieving 1.72% Equal Error Rate, compared to 6.97% in the literature.
We also verified that the features generalize beyond the GPDS dataset, surpassing the state-of-the-art performance in the other datasets, without requiring the representation to be fine-tuned to each particular dataset.
qPCF is a paradigmatic quantum programming language that extends PCF with quantum circuits and a quantum co-processor.
Quantum circuits are treated as classical data that can be duplicated and manipulated in flexible ways by means of a dependent type system.
The co-processor is essentially a standard QRAM device, although we avoid permanently storing quantum states between two co-processor calls.
Despite its quantum features, qPCF retains the classic programming approach of PCF.
We introduce qPCF syntax, typing rules, and its operational semantics.
We prove fundamental properties of the system, such as Preservation and Progress Theorems.
Moreover, we provide some higher-order examples of circuit encoding.
The Resource Description Framework (RDF) is a W3C standard for representing graph-structured data, and SPARQL is the standard query language for RDF.
Recent advances in Information Extraction, Linked Data Management and the Semantic Web have led to a rapid increase in both the volume and the variety of RDF data that are publicly available.
As businesses start to capitalize on RDF data, RDF data management systems are being exposed to workloads that are far more diverse and dynamic than what they were designed to handle.
Consequently, there is a growing need for developing workload-adaptive and self-tuning RDF data management systems.
To realize this vision, we introduce a fast and efficient method for dynamically clustering records in an RDF data management system.
Specifically, we assume nothing about the workload upfront, but as SPARQL queries are executed, we keep track of records that are co-accessed by the queries in the workload and physically cluster them.
To decide dynamically (hence, in constant-time) where a record needs to be placed in the storage system, we develop a new locality-sensitive hashing (LSH) scheme, Tunable-LSH.
Using Tunable-LSH, records that are co-accessed across similar sets of queries can be hashed to the same or nearby physical pages in the storage system.
What sets Tunable-LSH apart from existing LSH schemes is that it can auto-tune to achieve the aforementioned clustering objective with high accuracy even when the workloads change.
Experimental evaluation of Tunable-LSH in our prototype RDF data management system, chameleon-db, as well as in a standalone hashtable shows significant end-to-end improvements over existing solutions.
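Tunable-LSH itself is the paper's contribution; purely as a loose, hypothetical illustration of the underlying objective — records whose query-access sets are similar should collide into the same buckets — a plain MinHash signature over integer query IDs can be sketched (this is standard MinHash, not Tunable-LSH, and has none of its auto-tuning):

```python
import random

def minhash_signature(access_set, num_hashes=8, seed=7):
    """Signature over the set of query IDs that accessed a record.
    Records co-accessed by similar query sets agree on many components,
    so bucketing by signature components groups them onto nearby pages."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(32) for _ in range(num_hashes)]
    # the min of a salted hash emulates the first element of the set
    # under a random permutation of the query-ID universe
    return tuple(min(hash((s, q)) for q in access_set) for s in salts)

def agreement(sig_a, sig_b):
    """Fraction of matching components: an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

A static scheme like this degrades when the workload drifts; Tunable-LSH's distinguishing feature, per the abstract, is that it re-tunes in constant time as co-access patterns change.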
The reconfigurability, energy-efficiency, and massive parallelism on FPGAs make them one of the best choices for implementing efficient deep learning accelerators.
However, state-of-the-art implementations seldom consider the balance between high computational throughput and the ability of the memory subsystem to support it.
In this paper, we implement an accelerator on FPGA by combining the sparse Winograd convolution, clusters of small-scale systolic arrays, and a tailored memory layout design.
We also provide an analytical model analysis for the general Winograd convolution algorithm as a design reference.
Experimental results on VGG16 show that it achieves very high computational resource utilization, 20x ~ 30x energy efficiency, and more than 5x speedup compared with the dense implementation.
Recurrent neural networks (RNNs) have shown excellent performance in processing sequence data.
However, they are both complex and memory intensive due to their recursive nature.
These limitations make RNNs difficult to embed on mobile devices requiring real-time processes with limited hardware resources.
To address the above issues, we introduce a method that can learn binary and ternary weights during the training phase to facilitate hardware implementations of RNNs.
As a result, this approach replaces all multiply-accumulate operations by simple accumulations, bringing significant benefits to custom hardware in terms of silicon area and power consumption.
On the software side, we evaluate the performance (in terms of accuracy) of our method using long short-term memories (LSTMs) on various sequential models including sequence classification and language modeling.
We demonstrate that our method achieves competitive results on the aforementioned tasks while using binary/ternary weights during the runtime.
On the hardware side, we present custom hardware for accelerating the recurrent computations of LSTMs with binary/ternary weights.
Ultimately, we show that LSTMs with binary/ternary weights can achieve up to 12x memory saving and 10x inference speedup compared to the full-precision implementation on an ASIC platform.
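The claim that binary/ternary weights remove multiplications can be illustrated with a tiny sketch (the threshold-based quantization rule below is an assumed scheme for illustration, not necessarily the paper's training-time method):

```python
def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1 by a fixed magnitude threshold."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1)
            for w in weights]

def dot_ternary(tw, x):
    """Multiply-free dot product: only additions and subtractions remain,
    which is what shrinks silicon area and power in custom hardware."""
    acc = 0.0
    for w, xi in zip(tw, x):
        if w == 1:
            acc += xi      # +1 weight: accumulate the activation
        elif w == -1:
            acc -= xi      # -1 weight: subtract the activation
        # a 0 weight contributes nothing (and can be skipped entirely)
    return acc
```

In hardware, the three-way branch becomes a sign-controlled adder, and zero weights can be pruned from memory altogether.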
Embedding data into vector spaces is a very popular strategy of pattern recognition methods.
When distances between embeddings are quantized, performance metrics become ambiguous.
In this paper, we present an analysis of the ambiguity that quantized distances introduce and provide bounds on the effect.
We demonstrate that it can have a measurable effect in empirical data in state-of-the-art systems.
We also approach the phenomenon from a computer security perspective and demonstrate how someone being evaluated by a third party can exploit this ambiguity and greatly outperform a random predictor without even having access to the input data.
We also suggest a simple solution making the performance metrics, which rely on ranking, totally deterministic and impervious to such exploits.
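A minimal toy sketch of the phenomenon (hypothetical gallery and grid step, not the paper's data): two items with distinct true distances become tied after quantization, and a fixed secondary sort key is one simple way to make the ranking deterministic, in the spirit of the suggested fix.

```python
def quantize(d, step=0.5):
    """Quantize a distance to a fixed grid, as a low-precision system might."""
    return round(d / step) * step

# Toy 1D "embeddings": query at 0.0, gallery items with distinct true distances.
gallery = {"a": 0.51, "b": 0.74, "c": 1.30}
true_dist = {k: abs(v - 0.0) for k, v in gallery.items()}
quant_dist = {k: quantize(d) for k, d in true_dist.items()}

# Under quantization, "a" and "b" become indistinguishable (both 0.5),
# so any ranking that breaks the tie arbitrarily is ambiguous.
ties = [k for k, d in quant_dist.items() if d == quant_dist["a"]]

# Deterministic repair: break ties with a fixed secondary key (here the
# item identifier), making ranking-based metrics reproducible.
ranking = sorted(quant_dist, key=lambda k: (quant_dist[k], k))
```

An adversary free to pick the tie-breaking order can move tied items to the top of the ranking, which is the exploit the abstract alludes to.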
The rapid growth of the World Wide Web has imposed great challenges on researchers seeking to improve search efficiency over the internet.
Nowadays, web document clustering has become an important research topic for providing the most relevant documents among the huge volumes of results returned in response to a simple query.
In this paper, we first propose a novel approach to precisely define clusters based on maximal frequent item sets (MFI) obtained by the Apriori algorithm.
We then utilize the same MFI-based similarity measure for hierarchical document clustering.
By considering maximal frequent item sets, the dimensionality of the document set is decreased.
Second, we provide privacy preservation of open web documents by avoiding duplicate documents, thereby protecting the individual copyrights of documents.
This is achieved using an equivalence relation.
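As a simplified sketch of mining maximal frequent item sets with an Apriori-style level-wise search (documents modeled as term sets; the full Apriori subset-pruning step is omitted for brevity, and the example data is hypothetical):

```python
def maximal_frequent_itemsets(docs, min_support):
    """Level-wise Apriori-style search over term sets, keeping only the
    maximal frequent item sets (those with no frequent proper superset)."""
    items = sorted({t for d in docs for t in d})
    frequent = []
    level = [frozenset([i]) for i in items]
    while level:
        # prune: keep candidates whose support meets the threshold
        counted = [s for s in level
                   if sum(1 for d in docs if s <= d) >= min_support]
        frequent.extend(counted)
        # join: merge surviving k-sets into (k+1)-set candidates
        level = list({a | b for a in counted for b in counted
                      if len(a | b) == len(a) + 1})
    return [s for s in frequent if not any(s < t for t in frequent)]
```

Keeping only maximal sets is what reduces the dimensionality: every frequent subset is represented by its maximal superset.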
Human beings do not wish to perform tiring work, and so have turned to machines to work on their behalf.
The latest Internet-based technologies provide platforms that allow people to relax and feel unburdened.
The Internet of Things (IoT) efficiently helps human beings make smart decisions through Machine-to-Machine (M2M) communication all over the world.
With the development of applications such as the smartphone, the importance of the IoT field has become difficult to ignore in the present era.
IoT sensors play a vital role in sensing intelligent objects/things and making intelligent decisions after sensing them.
The rapid worldwide development of new smartphone applications has confronted the IoT community with a major security challenge in the form of side-channel attacks against highly intensive 3D printing systems.
Smartphone-based side-channel attacks against the intellectual property (IP) of 3D printers are investigated in the physical domain by reconstructing the G-code file of primitive operations.
Using a smartphone (Nexus 5), the main problems, such as orientation fixing and the model accuracy of the frame size, were solved, and the feasibility and effectiveness of the attack were validated in real case studies against a 3D printer.
The estimated value of 3D printing reached 20.2 billion dollars in 2021.
A thermal camera is used to explore side-channel attacks by reconstructing objects printed by 3D printers.
We analyze IoT security issues that can be avoided in the future through enhanced security mechanism strategies, encryption, machine learning-based algorithms, and the efficient use of the latest technologies, schemes and protocols.
Keywords: Internet of Things (IoT), Machine-to-Machine (M2M), security, 3D printer, smartphone
We describe a neural network-based system for text-to-speech (TTS) synthesis that is able to generate speech audio in the voice of many different speakers, including those unseen during training.
Our system consists of three independently trained components: (1) a speaker encoder network, trained on a speaker verification task using an independent dataset of noisy speech from thousands of speakers without transcripts, to generate a fixed-dimensional embedding vector from seconds of reference speech from a target speaker; (2) a sequence-to-sequence synthesis network based on Tacotron 2, which generates a mel spectrogram from text, conditioned on the speaker embedding; (3) an auto-regressive WaveNet-based vocoder that converts the mel spectrogram into a sequence of time domain waveform samples.
We demonstrate that the proposed model is able to transfer the knowledge of speaker variability learned by the discriminatively-trained speaker encoder to the new task, and is able to synthesize natural speech from speakers that were not seen during training.
We quantify the importance of training the speaker encoder on a large and diverse speaker set in order to obtain the best generalization performance.
Finally, we show that randomly sampled speaker embeddings can be used to synthesize speech in the voice of novel speakers dissimilar from those used in training, indicating that the model has learned a high quality speaker representation.
This paper introduces the ongoing integration of Contiki's uIP stack into the OMNeT++ port of the Network Simulation Cradle (NSC).
The NSC utilizes code from real-world stack implementations and allows for an accurate simulation and comparison of different TCP/IP stacks. uIP(v6) provides resource-constrained devices with an RFC-compliant TCP/IP stack and promotes the use of IPv6 in the rapidly growing field of Internet of Things scenarios.
This work-in-progress report discusses our motivation to integrate uIP into the NSC, our chosen approach and possible use cases for the simulation of uIP in OMNeT++.
This paper describes an architecture for robots that combines the complementary strengths of probabilistic graphical models and declarative programming to represent and reason with logic-based and probabilistic descriptions of uncertainty and domain knowledge.
An action language is extended to support non-boolean fluents and non-deterministic causal laws.
This action language is used to describe tightly-coupled transition diagrams at two levels of granularity, with a fine-resolution transition diagram defined as a refinement of a coarse-resolution transition diagram of the domain.
The coarse-resolution system description, and a history that includes (prioritized) defaults, are translated into an Answer Set Prolog (ASP) program.
For any given goal, inference in the ASP program provides a plan of abstract actions.
To implement each such abstract action, the robot automatically zooms to the part of the fine-resolution transition diagram relevant to this action.
A probabilistic representation of the uncertainty in sensing and actuation is then included in this zoomed fine-resolution system description, and used to construct a partially observable Markov decision process (POMDP).
The policy obtained by solving the POMDP is invoked repeatedly to implement the abstract action as a sequence of concrete actions, with the corresponding observations being recorded in the coarse-resolution history and used for subsequent reasoning.
The architecture is evaluated in simulation and on a mobile robot moving objects in an indoor domain, to show that it supports reasoning with violation of defaults, noisy observations and unreliable actions, in complex domains.
Temporal difference (TD) learning is an important approach in reinforcement learning, as it combines ideas from dynamic programming and Monte Carlo methods in a way that allows for online and incremental model-free learning.
A key idea of TD learning is that it is learning predictive knowledge about the environment in the form of value functions, from which it can derive its behavior to address long-term sequential decision making problems.
The agent's horizon of interest, that is, how immediate or long-term a TD learning agent predicts into the future, is adjusted through a discount rate parameter.
In this paper, we introduce an alternative view on the discount rate, with insight from digital signal processing, to include complex-valued discounting.
Our results show that setting the discount rate to appropriately chosen complex numbers allows for online and incremental estimation of the Discrete Fourier Transform (DFT) of a signal of interest with TD learning.
We thereby extend the types of knowledge representable by value functions, which we show are particularly useful for identifying periodic effects in the reward sequence.
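The link between complex discounting and the DFT can be sketched with a toy identity: a return "discounted" by γ = e^{-2πik/N} and accumulated one reward at a time over one period of the signal equals the k-th DFT coefficient. This is only an illustration of the connection, not the paper's online TD estimator:

```python
import cmath

def dft_coefficient(rewards, k):
    """Direct DFT of the reward sequence at frequency index k."""
    N = len(rewards)
    return sum(r * cmath.exp(-2j * cmath.pi * k * t / N)
               for t, r in enumerate(rewards))

def online_complex_return(rewards, k):
    """Accumulate the same quantity one reward at a time, in the spirit
    of an incremental TD update: each step multiplies the running phase
    by the complex 'discount rate'."""
    N = len(rewards)
    gamma = cmath.exp(-2j * cmath.pi * k / N)  # complex-valued discount
    acc, phase = 0.0, 1.0
    for r in rewards:
        acc += phase * r
        phase *= gamma
    return acc
```

The TD formulation in the paper additionally handles ongoing streams and bootstrapped value estimates; the sketch only shows why a complex discount rate picks out one frequency component of the reward sequence.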
This paper describes a technique to compare large text sources using word vector representations (word2vec) and dimensionality reduction (t-SNE) and how it can be implemented using Python.
The technique provides a bird's-eye view of text sources, e.g. text summaries and their source material, and enables users to explore text sources like a geographical map.
Word vector representations capture many linguistic properties such as gender, tense, plurality and even semantic concepts like "capital city of".
Using dimensionality reduction, a 2D map can be computed where semantically similar words are close to each other.
The technique uses the word2vec model from the gensim Python library and t-SNE from scikit-learn.
We overview dataflow matrix machines as a Turing complete generalization of recurrent neural networks and as a programming platform.
We describe a vector space of finite prefix trees with numerical leaves which allows us to combine the expressive power of dataflow matrix machines with the simplicity of traditional recurrent neural networks.
In this paper, linear systems with a crisp real coefficient matrix and with a vector of fuzzy triangular numbers on the right-hand side are studied.
A new method, which is based on the geometric representations of linear transformations, is proposed to find solutions.
The method uses the fact that a vector of fuzzy triangular numbers forms a rectangular prism in n-dimensional space and that the image of a parallelepiped is also a parallelepiped under a linear transformation.
The suggested method clarifies why, in the general case, different approaches do not generate solutions that are fuzzy numbers.
It is geometrically proved that if the coefficient matrix is a generalized permutation matrix, then the solution of a fuzzy linear system (FLS) is a vector of fuzzy numbers irrespective of the vector on the right-hand side.
The most important difference between this and previous papers on FLS is that the solution is sought as a fuzzy set of vectors (with real components) rather than a vector of fuzzy numbers.
Each vector in the solution set solves the given FLS with a certain possibility.
The suggested method can also be applied in the case when the right-hand side is a vector of fuzzy numbers in parametric form.
However, in this case, α-cuts of the solution cannot be determined by geometric similarity and additional computations are needed.
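The generalized-permutation case admits a very small sketch: with exactly one nonzero entry per row and column, each triangular fuzzy component of the right-hand side maps to one scaled (and possibly reflected) triangular component of the solution. Triangular fuzzy numbers are modeled as (left, peak, right) triples; the example matrix is a toy, not from the paper.

```python
def solve_fls_gen_perm(A, b_fuzzy):
    """Solve A x = b where A is a generalized permutation matrix and b is
    a vector of triangular fuzzy numbers (l, m, r). Each x_j is again
    triangular: x_j = b_i / A[i][j], with the interval endpoints swapped
    when the scaling factor is negative (the prism maps to a prism)."""
    n = len(A)
    x = [None] * n
    for i in range(n):
        j = next(k for k in range(n) if A[i][k] != 0)  # the one nonzero column
        a = A[i][j]
        l, m, r = b_fuzzy[i]
        lo, mid, hi = l / a, m / a, r / a
        x[j] = (lo, mid, hi) if a > 0 else (hi, mid, lo)
    return x
```

For a general matrix the image of the rectangular prism is a slanted parallelepiped, whose preimage components are no longer independent fuzzy numbers, which is exactly the paper's point.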
In the paper, we analyze the distribution of complexities in the Vai script, an indigenous syllabic writing system from Liberia.
It is found that the uniformity hypothesis for complexities fails for this script.
The models using Poisson distribution for the number of components and hyper-Poisson distribution for connections provide good fits in the case of the Vai script.
Outsourcing jobs to a public cloud is a cost-effective way to address the problem of satisfying the peak resource demand when the local cloud has insufficient resources.
In this paper, we study the management of deadline-constrained bag-of-tasks jobs on hybrid clouds.
We present a binary nonlinear programming (BNP) problem to model hybrid cloud management, in which the utilization of physical machines (PMs) in the local cloud/cluster is maximized when local resources are enough to satisfy the deadline constraints of jobs, and when they are not, the rental cost from the public cloud is minimized.
To solve this BNP problem in polynomial time, we propose a heuristic algorithm.
Its main idea is to assign the task closest to its deadline to the current core until the core cannot finish any task within its deadline.
When there is no available core, the algorithm adds an available PM with the most capacity or rents a new VM with the highest cost-performance ratio.
Extensive experimental results show that our heuristic algorithm saves 16.2%-76% of the rental cost and improves resource utilization by 47.3%-182.8% while satisfying deadline constraints, compared with the first-fit decreasing algorithm.
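The core greedy step can be sketched compactly under simplifying assumptions (homogeneous cores, no PM/VM cost model, hypothetical task tuples — the real algorithm also chooses between adding PMs and renting VMs):

```python
def schedule_edf_greedy(tasks, max_cores):
    """Greedy sketch of the heuristic: give the current core the unassigned
    task with the closest deadline it can still finish; open a new core only
    when the current one cannot finish any remaining task in time.
    tasks: (duration, deadline) tuples. Returns per-core task lists, or
    None when max_cores is insufficient (renting from the public cloud
    would then be needed)."""
    pending = sorted(tasks, key=lambda t: t[1])  # closest deadline first
    cores, cur_time, cur_tasks = [], 0.0, []
    while pending:
        pick = next((t for t in pending if cur_time + t[0] <= t[1]), None)
        if pick is not None:
            pending.remove(pick)
            cur_time += pick[0]
            cur_tasks.append(pick)
        else:
            cores.append(cur_tasks)            # current core is full
            if len(cores) >= max_cores:
                return None
            cur_time, cur_tasks = 0.0, []      # open a fresh core
    cores.append(cur_tasks)
    return cores
```

Packing each core until no remaining task fits is what drives up PM utilization before any new resource is provisioned.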
In this paper, we introduce iBoW-LCD, a novel appearance-based loop closure detection method.
The presented approach makes use of an incremental Bag-of-Words (BoW) scheme based on binary descriptors to retrieve previously seen similar images, avoiding any vocabulary training stage usually required by classic BoW models.
In addition, to detect loop closures, iBoW-LCD builds on the concept of dynamic islands, a simple but effective mechanism to group similar images close in time, which reduces the computational times typically associated with Bayesian frameworks.
Our approach is validated using several indoor and outdoor public datasets, taken under different environmental conditions, achieving a high accuracy and outperforming other state-of-the-art solutions.
This review considers methods of nonlinear dynamics to apply for analysis of time series corresponding to information streams on the Internet.
In the main, these methods are based on correlation, fractal, multifractal, wavelet, and Fourier analysis.
The article is dedicated to a detailed description of these approaches and interconnections among them.
The methods and corresponding algorithms presented can be used for detecting key points in the dynamics of information processes; identifying periodicity, anomalies, self-similarity, and correlations; and forecasting various information processes.
The methods discussed can form the basis for detecting information attacks, campaigns, operations, and wars.
Datasets are important for researchers to build models and test how well their machine learning algorithms perform.
This paper presents the Rainforest Automation Energy (RAE) dataset to help smart grid researchers test their algorithms which make use of smart meter data.
This initial release of RAE contains 1Hz data (mains and sub-meters) from two residential houses.
In addition to power data, environmental and sensor data from the house's thermostat is included.
Sub-meter data from one of the houses includes heat pump and rental suite captures which is of interest to power utilities.
We also show an energy breakdown for each house and demonstrate (by example) how RAE can be used to test non-intrusive load monitoring (NILM) algorithms.
Several variants of the Constraint Satisfaction Problem have been proposed and investigated in the literature for modelling those scenarios where solutions are associated with some given costs.
Within these frameworks computing an optimal solution is an NP-hard problem in general; yet, when restricted over classes of instances whose constraint interactions can be modelled via (nearly-)acyclic graphs, this problem is known to be solvable in polynomial time.
In this paper, larger classes of tractable instances are singled out, by discussing solution approaches based on exploiting hypergraph acyclicity and, more generally, structural decomposition methods, such as (hyper)tree decompositions.
This paper studies the problem of predicting the coding effort for a subsequent year of development by analysing metrics extracted from project repositories, with an emphasis on projects containing XML code.
The study considers thirteen open source projects and applies machine learning algorithms to generate models to predict one-year coding effort, measured in terms of lines of code added, modified and deleted.
Both organisational and code metrics associated to revisions are taken into account.
The results show that coding effort is highly determined by the expertise of developers while source code metrics have little effect on improving the accuracy of estimations of coding effort.
The study also shows that models trained on one project are unreliable at estimating effort in other projects.
The Statistical Learning Theory (SLT) provides the theoretical guarantees for supervised machine learning based on the Empirical Risk Minimization Principle (ERMP).
Such principle defines an upper bound to ensure the uniform convergence of the empirical risk Remp(f), i.e., the error measured on a given data sample, to the expected value of risk R(f) (a.k.a. actual risk), which depends on the Joint Probability Distribution P(X x Y) mapping input examples x in X to class labels y in Y.
The uniform convergence is only ensured when the Shattering coefficient N(F,2n) has a polynomial growing behavior.
This paper proves the Shattering coefficient for any Hilbert space H containing the input space X and discusses its effects in terms of learning guarantees for supervised machine learning algorithms.
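For orientation, one classical form of the uniform-convergence guarantee behind the ERMP (the constants vary across references, e.g. in Vapnik's symmetrization argument) reads:

```latex
P\left( \sup_{f \in \mathcal{F}} \left| R(f) - R_{\mathrm{emp}}(f) \right| > \varepsilon \right)
\;\le\; 2\, \mathcal{N}(\mathcal{F}, 2n)\, e^{-n \varepsilon^{2} / 4}
```

The right-hand side vanishes as n grows whenever the shattering coefficient N(F,2n) grows only polynomially in n, which is exactly the condition on learning guarantees the abstract refers to.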
Metrics play an increasingly fundamental role in the design, development, deployment and operation of telecommunication systems.
Despite their importance, the studies of metrics are usually limited to a narrow area or a well-defined objective.
Our study aims to more broadly survey the metrics that are commonly used for analyzing, developing and managing telecommunication networks in order to facilitate understanding of the current metrics landscape.
The metrics are simple abstractions of systems, and they directly influence how the systems are perceived by different stakeholders.
However, defining and using metrics for telecommunication systems with ever increasing complexity is a complicated matter which has not been so far systematically and comprehensively considered in the literature.
The common metrics sources are identified, and how the metrics are used and selected is discussed.
The most commonly used metrics for telecommunication systems are categorized and presented as energy and power metrics, quality-of-service metrics, quality-of-experience metrics, security metrics, and reliability and resilience metrics.
Finally, research directions and recommendations on how metrics can evolve and be defined and used more effectively are outlined.
The proliferation of online biometric authentication has heightened the security requirements for biometric templates.
The existing secure biometric authentication schemes feature a server-centric model, where a service provider maintains a biometric database and is fully responsible for the security of the templates.
The end-users have to fully trust the server in storing, processing and managing their private templates.
As a result, the end-users' templates could be compromised by outside attackers or even the service provider itself.
In this paper, we propose a user-centric biometric authentication scheme (PassBio) that enables end-users to encrypt their own templates with our proposed lightweight encryption scheme.
During authentication, all the templates remain encrypted such that the server will never see them directly.
However, the server is able to determine whether the distance of two encrypted templates is within a pre-defined threshold.
Our security analysis shows that no critical information of the templates can be revealed under both passive and active attacks.
PassBio follows a "compute-then-compare" computational model over encrypted data.
More specifically, our proposed Threshold Predicate Encryption (TPE) scheme can encrypt two vectors x and y in such a manner that the inner product of x and y can be evaluated and compared to a pre-defined threshold.
TPE guarantees that only the comparison result is revealed and no key information about x and y can be learned.
Furthermore, we show that TPE can be utilized as a flexible building block to evaluate different distance metrics such as Hamming distance and Euclidean distance over encrypted data.
Such a compute-then-compare computational model, enabled by TPE, can be widely applied in many interesting applications such as searching over encrypted data while ensuring data security and privacy.
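As a plaintext illustration of why an inner-product threshold test suffices for biometric matching (the TPE scheme evaluates the same predicate over encrypted vectors; the function names below are ours, not PassBio's API):

```python
def hamming_via_inner_product(x, y):
    """For binary vectors, Hamming(x, y) = sum(x) + sum(y) - 2 * <x, y>,
    so a Hamming-distance threshold reduces to an inner-product threshold."""
    inner = sum(a * b for a, b in zip(x, y))
    return sum(x) + sum(y) - 2 * inner

def within_threshold(x, y, t):
    # In PassBio, the server learns only this boolean, never x or y.
    return hamming_via_inner_product(x, y) <= t

x = [1, 0, 1, 1, 0]
y = [1, 1, 1, 0, 0]  # differs from x at positions 1 and 3
```

The same reduction underlies the Euclidean-distance case, since squared Euclidean distance also expands into inner products.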
Supervised feature learning using convolutional neural networks (CNNs) can provide concise and disease relevant representations of medical images.
However, training CNNs requires annotated image data.
Annotating medical images can be a time-consuming task and even expert annotations are subject to substantial inter- and intra-rater variability.
Assessing visual similarity of images instead of indicating specific pathologies or estimating disease severity could allow non-experts to participate, help uncover new patterns, and possibly reduce rater variability.
We consider the task of assessing emphysema extent in chest CT scans.
We derive visual similarity triplets from visually assessed emphysema extent and learn a low dimensional embedding using CNNs.
We evaluate the networks on 973 images, and show that the CNNs can learn disease relevant feature representations from derived similarity triplets.
To our knowledge, this is the first medical image application where similarity triplets have been used to learn a feature representation that can be used for embedding unseen test images.
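Embeddings of this kind are typically learned with a margin-based triplet loss; a minimal numpy sketch of such a loss (illustrative, not necessarily the paper's exact formulation):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that pulls the anchor toward the more-similar image
    (positive) and pushes it away from the less-similar one (negative)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])   # embedding of the anchor scan
p = np.array([0.1, 0.0])   # visually similar emphysema extent
n = np.array([2.0, 0.0])   # visually dissimilar extent
```

When the positive already sits much closer than the negative plus the margin, the loss is zero and the triplet no longer drives training.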
An important property of programming language semantics is that they should be compositional.
However, unstructured low-level code contains goto-like commands making it hard to define a semantics that is compositional.
In this paper, we follow the ideas of Saabas and Uustalu to structure low-level code.
This gives us the possibility to define a compositional denotational semantics based on least fixed points to allow for the use of inductive verification methods.
We capture the semantics of communication using finite traces similar to the denotations of CSP.
In addition, we examine properties of this semantics and give an example that demonstrates reasoning about communication and jumps.
With this semantics, we lay the foundations for a proof calculus that captures both the semantics of unstructured low-level code and communication.
The amount of Android malware has increased greatly during the last few years.
Static analysis is widely used in detecting such malware by analyzing the code without execution.
The effectiveness of current tools relies on the app model as well as the malware detection algorithm which analyzes the app model.
If the model and/or the algorithm is inadequate, then sophisticated attacks that are triggered by specific sequences of events will not be detected.
This paper presents a static analysis framework called Dexteroid, which uses reverse-engineered life cycle models to accurately capture the behaviors of Android components.
Dexteroid systematically derives event sequences from the models, and uses them to detect attacks launched by specific ordering of events.
A prototype implementation of Dexteroid detects two types of attacks: (1) leakage of private information, and (2) sending SMS to premium-rate numbers.
A series of experiments are conducted on 1526 Google Play apps, 1259 Genome Malware apps, and a suite of benchmark apps called DroidBench and the results are compared with a state-of-the-art static analysis tool called FlowDroid.
The evaluation results show that the proposed framework is effective and efficient in terms of precision, recall, and execution time.
Visual reasoning is a special visual question answering problem that is multi-step and compositional by nature, and also requires intensive text-vision interactions.
We propose CMM: Cascaded Mutual Modulation as a novel end-to-end visual reasoning model.
CMM includes a multi-step comprehension process for both question and image.
In each step, we use a Feature-wise Linear Modulation (FiLM) technique to enable textual/visual pipeline to mutually control each other.
Experiments show that CMM significantly outperforms most related models and reaches state-of-the-art performance on two visual reasoning benchmarks, CLEVR and NLVR, which cover both synthetic and natural languages.
Ablation studies confirm that both our multistep framework and our visual-guided language modulation are critical to the task.
Our code is available at https://github.com/FlamingHorizon/CMM-VR.
Predicting business process behaviour is an important aspect of business process management.
Motivated by research in natural language processing, this paper describes an application of deep learning with recurrent neural networks to the problem of predicting the next event in a business process.
This is both a novel method in process prediction, which has largely relied on explicit process models, and also a novel application of deep learning methods.
The approach is evaluated on two real datasets and our results surpass the state-of-the-art in prediction precision.
Distant supervision for relation extraction is an efficient method to reduce labor costs and has been widely used to seek novel relational facts in large corpora, which can be identified as a multi-instance multi-label problem.
However, existing distant supervision methods struggle to select important words within a sentence and to extract valid sentences from the bag.
Towards this end, we propose a novel approach to address these problems in this paper.
Firstly, we propose a linear attenuation simulation to reflect the importance of words in the sentence with respect to the distances between entities and words.
Secondly, we propose a non-independent and identically distributed (non-IID) relevance embedding to capture the relevance of sentences in the bag.
Our method can not only capture complex information of words about hidden relations, but also express the mutual information of instances in the bag.
Extensive experiments on a benchmark dataset have well-validated the effectiveness of the proposed method.
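A minimal sketch of one reading of the linear attenuation idea — a token's weight decays linearly with its distance to the nearer entity mention. The decay horizon and function names are our assumptions, not the paper's formulation:

```python
def linear_attenuation_weights(tokens, e1_idx, e2_idx, max_dist=10):
    """Weight each token by closeness to the nearer entity mention;
    weights decay linearly with distance and floor at zero."""
    weights = []
    for i in range(len(tokens)):
        d = min(abs(i - e1_idx), abs(i - e2_idx))
        weights.append(max(0.0, 1.0 - d / max_dist))
    return weights

toks = "Obama was born in Honolulu".split()
w = linear_attenuation_weights(toks, e1_idx=0, e2_idx=4)
# Entity tokens get full weight; words between them are attenuated.
```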
We investigate the 'Digital Synaptic Neural Substrate' (DSNS) computational creativity approach further with respect to the size and quality of images that can be used to seed the process.
In previous work we demonstrated how combining photographs of people and sequences taken from chess games between weak players can be used to generate chess problems or puzzles of higher aesthetic quality, on average, compared to alternative approaches.
In this work we show experimentally that using larger images as opposed to smaller ones improves the output quality even further.
The same is also true for using clearer or less corrupted images.
The reasons why these factors influence the DSNS process are presently not well understood and remain debatable, but the findings are nevertheless immediately applicable for obtaining better results.
In this paper, we present a step-by-step knowledge acquisition process, choosing a structured method and using a questionnaire as the knowledge acquisition tool.
The problem domain we address is how to evaluate teachers' performance in higher education through the use of expert system technology.
The problem is how to acquire the specific knowledge for a selected problem efficiently and effectively from human experts and encode it in a suitable computer format.
Acquiring knowledge from human experts in the process of expert system development is one of the most commonly cited problems.
The questionnaire was sent to 87 domain experts at public and private universities in Pakistan.
Among them 25 domain experts sent their valuable opinions.
Most of the domain experts were highly qualified, well experienced and highly responsible persons.
The whole questionnaire was divided into 15 main groups of factors, which were further divided into 99 individual questions.
These facts were analyzed further to give a final shape to the questionnaire.
This knowledge acquisition technique may be used as a learning tool for further research work.
Many computer vision problems are formulated as the optimization of a cost function.
This approach faces two main challenges: (i) designing a cost function with a local optimum at an acceptable solution, and (ii) developing an efficient numerical method to search for one (or multiple) of these local optima.
While designing such functions is feasible in the noiseless case, the stability and location of local optima are mostly unknown under noise, occlusion, or missing data.
In practice, this can result in undesirable local optima or not having a local optimum in the expected place.
On the other hand, numerical optimization algorithms in high-dimensional spaces are typically local and often rely on expensive first or second order information to guide the search.
To overcome these limitations, this paper proposes Discriminative Optimization (DO), a method that learns search directions from data without the need of a cost function.
Specifically, DO explicitly learns a sequence of updates in the search space that leads to stationary points that correspond to desired solutions.
We provide a formal analysis of DO and illustrate its benefits in the problem of 3D point cloud registration, camera pose estimation, and image denoising.
We show that DO performed comparably or outperformed state-of-the-art algorithms in terms of accuracy, robustness to perturbations, and computational efficiency.
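The core of DO — learning a sequence of update maps x_{t+1} = x_t − D_t h(x_t) by regressing toward known solutions, instead of descending a hand-crafted cost — can be sketched on a toy 1-D problem (the feature function h and the setup are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x):
    # Illustrative feature function; DO uses task-specific features
    # such as point-cloud registration residuals.
    return np.tanh(x)

# Training set: perturbed states paired with the desired solution x* = 0.
X = rng.normal(0.0, 1.0, size=(200, 1))
target = np.zeros_like(X)

# Learn update maps D_t by least squares so each step moves toward x*.
maps, Xt = [], X.copy()
for t in range(5):
    H = h(Xt)
    residual = Xt - target
    D = float((H * residual).sum() / (H * H).sum())  # 1-D least squares
    maps.append(D)
    Xt = Xt - D * h(Xt)  # apply the learned update
# After the learned sequence, the states sit near the solution x* = 0.
```

At test time, the same fixed sequence of maps is applied to a new starting point, so no cost function is ever evaluated during the search.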
Particular attention has been paid to the frequent exchange of Cooperative Awareness Messages (CAMs), on which many road safety applications depend.
A marginal problem asks whether a given family of marginal distributions for some set of random variables arises from some joint distribution of these variables.
Here we point out that the existence of such a joint distribution imposes non-trivial conditions already on the level of Shannon entropies of the given marginals.
These entropic inequalities are necessary (but not sufficient) criteria for the existence of a joint distribution.
For every marginal problem, a list of such Shannon-type entropic inequalities can be calculated by Fourier-Motzkin elimination, and we offer a software interface to a Fourier-Motzkin solver for doing so.
For the case that the hypergraph of given marginals is a cycle graph, we provide a complete analytic solution to the problem of classifying all relevant entropic inequalities, and use this result to bound the decay of correlations in stochastic processes.
Furthermore, we show that Shannon-type inequalities for differential entropies are not relevant for continuous-variable marginal problems; non-Shannon-type inequalities are, both in the discrete and in the continuous case.
In contrast to other approaches, our general framework easily adapts to situations where one has additional (conditional) independence requirements on the joint distribution, as in the case of graphical models.
We end with a list of open problems.
A complementary article discusses applications to quantum nonlocality and contextuality.
Semantic annotation is fundamental to deal with large-scale lexical information, mapping the information to an enumerable set of categories over which rules and algorithms can be applied, and foundational ontology classes can be used as a formal set of categories for such tasks.
A previous alignment between WordNet noun synsets and DOLCE provided a starting point for ontology-based annotation, but in NLP tasks verbs are also of substantial importance.
This work presents an extension to the WordNet-DOLCE noun mapping, aligning verbs according to their links to nouns denoting perdurants, transferring to the verb the DOLCE class assigned to the noun that best represents that verb's occurrence.
To evaluate the usefulness of this resource, we implemented a foundational ontology-based semantic annotation framework that assigns a high-level foundational category to each word or phrase in a text, and compared it to a similar annotation tool, obtaining an increase of 9.05% in accuracy.
Clinical trial registries can be used to monitor the production of trial evidence and signal when systematic reviews become out of date.
However, this use has been limited to date due to the extensive manual review required to search for and screen relevant trial registrations.
Our aim was to evaluate a new method that could partially automate the identification of trial registrations that may be relevant for systematic review updates.
We identified 179 systematic reviews of drug interventions for type 2 diabetes, which included 537 clinical trials that had registrations in ClinicalTrials.gov.
We tested a matrix factorisation approach that uses a shared latent space to learn how to rank relevant trial registrations for each systematic review, comparing the performance to document similarity to rank relevant trial registrations.
The two approaches were tested on a holdout set of the newest trials from the set of type 2 diabetes systematic reviews and an unseen set of 141 clinical trial registrations from 17 updated systematic reviews published in the Cochrane Database of Systematic Reviews.
The matrix factorisation approach outperformed the document similarity approach with a median rank of 59 and recall@100 of 60.9%, compared to a median rank of 138 and recall@100 of 42.8% in the document similarity baseline.
In the second set of systematic reviews and their updates, the highest performing approach used document similarity and gave a median rank of 67 (recall@100 of 62.9%).
The proposed method was useful for ranking trial registrations to reduce the manual workload associated with finding relevant trials for systematic review updates.
The results suggest that the approach could be used as part of a semi-automated pipeline for monitoring potentially new evidence for inclusion in a review update.
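The shared-latent-space idea can be sketched with a truncated SVD standing in for the learned factorisation (toy data; the paper's model and training procedure differ):

```python
import numpy as np

# Toy review-by-trial relevance matrix (1 = trial included in the review).
R = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 1, 1]], dtype=float)

# Shared latent space via truncated SVD.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
review_vecs = U[:, :k] * s[:k]   # latent vector per systematic review
trial_vecs = Vt[:k].T            # latent vector per trial registration

# Rank all trial registrations for review 0 by latent-space score.
scores = trial_vecs @ review_vecs[0]
ranking = np.argsort(-scores)
```

New trial registrations are then screened in ranked order, which is where the reported reduction in manual workload comes from.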
We propose a new iterative segmentation model which can be accurately learned from a small dataset.
A common approach is to train a model to directly segment an image, requiring a large collection of manually annotated images to capture the anatomical variability in a cohort.
In contrast, we develop a segmentation model that recursively evolves a segmentation in several steps, and implement it as a recurrent neural network.
We learn model parameters by optimizing the intermediate steps of the evolution in addition to the final segmentation.
To this end, we train our segmentation propagation model by presenting incomplete and/or inaccurate input segmentations paired with a recommended next step.
Our work aims to alleviate challenges in segmenting heart structures from cardiac MRI for patients with congenital heart disease (CHD), which encompasses a range of morphological deformations and topological changes.
We demonstrate the advantages of this approach on a dataset of 20 images from CHD patients, learning a model that accurately segments individual heart chambers and great vessels.
Compared to direct segmentation, the iterative method yields more accurate segmentation for patients with the most severe CHD malformations.
This paper studies different signaling techniques on the continuous spectrum (CS) of nonlinear optical fiber defined by nonlinear Fourier transform.
Three different signaling techniques are proposed and analyzed based on the statistics of the noise added to CS after propagation along the nonlinear optical fiber.
The proposed methods are compared in terms of error performance, distance reach, and complexity.
Furthermore, the effect of chromatic dispersion on the data rate and noise in nonlinear spectral domain is investigated.
It is demonstrated that, for a given sequence of CS symbols, an optimal bandwidth (or symbol rate) can be determined so that the temporal duration of the propagated signal at the end of the fiber is minimized.
In effect, the required guard interval between the subsequently transmitted data packets in time is minimized and the effective data rate is significantly enhanced.
Moreover, by selecting the proper signaling method and design criteria, a reach distance of 7100 km is reported by signaling only on the CS at a rate of 9.6 Gbps.
A new model of quantum computation is proposed, for which an effective algorithm for solving any task in NP is described.
The work is based on and inspired by Grover's algorithm, which solves NP tasks with a quadratic speedup compared to the classical computation model.
The provided model and algorithm exhibit an exponential speedup over that described by Grover.
This contribution reports an application of a novel feature extraction technique based on MultiFractal Detrended Fluctuation Analysis (MFDFA) for automated detection of epilepsy.
In fractal geometry, MFDFA is a popular technique to examine the self-similarity of a nonlinear, chaotic and noisy time series.
In the present research work, EEG signals representing healthy, interictal (seizure free) and ictal activities (seizure) are acquired from an existing available database.
The acquired EEG signals of different states are at first analyzed using MFDFA.
To quantify the singularity of the time series at local and global scales, a novel set of fourteen different features is extracted.
Suitable feature ranking employing Student's t-test has been performed to select the most statistically significant features, which are henceforth used as inputs to a support vector machine (SVM) classifier for the classification of the different EEG signals.
Eight different classification problems have been presented in this paper and it has been observed that the overall classification accuracy using MFDFA based features are reasonably satisfactory for all classification problems.
The performance of the proposed method is also found to be comparable, and in some cases superior, to results published in the existing literature on the same data set.
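The t-test ranking step can be sketched with synthetic data standing in for the fourteen MFDFA features (a minimal illustration; the paper uses real EEG-derived features):

```python
import numpy as np

def t_statistic(a, b):
    """Welch's t-statistic between two groups of one feature's values."""
    va, vb = a.var(ddof=1), b.var(ddof=1)
    return (a.mean() - b.mean()) / np.sqrt(va / len(a) + vb / len(b))

rng = np.random.default_rng(1)
# Three synthetic features for two classes (e.g. interictal vs. ictal);
# feature 0 is discriminative, features 1 and 2 are pure noise.
interictal = rng.normal([0.0, 0.0, 0.0], 1.0, size=(50, 3))
ictal = rng.normal([3.0, 0.0, 0.0], 1.0, size=(50, 3))

t_abs = [abs(t_statistic(interictal[:, j], ictal[:, j])) for j in range(3)]
ranked = np.argsort(t_abs)[::-1]   # most statistically significant first
```

The top-ranked features would then be fed to the SVM classifier.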
Message Passing Interface (MPI) is widely used to implement parallel programs.
Although Windows-based architectures provide facilities for parallel execution and multi-threading, little attention has been focused on using MPI on these platforms.
In this paper we use a dual-core Windows-based platform to study the effect of the number of parallel processes and the number of cores on the performance of three parallel MPI implementations of sorting algorithms.
Timbre and pitch are the two main perceptual properties of musical sounds.
Depending on the target applications, we sometimes prefer to focus on one of them, while reducing the effect of the other.
Researchers have managed to hand-craft such timbre-invariant or pitch-invariant features using domain knowledge and signal processing techniques, but it remains difficult to disentangle them in the resulting feature representations.
Drawing upon state-of-the-art techniques in representation learning, we propose in this paper two deep convolutional neural network models for learning disentangled representation of musical timbre and pitch.
Both models use encoders/decoders and adversarial training to learn music representations, but the second model additionally uses skip connections to deal with the pitch information.
As music is an art of time, the two models are supervised by frame-level instrument and pitch labels using a new dataset collected from MuseScore.
We compare the result of the two disentangling models with a new evaluation protocol called "timbre crossover", which leads to interesting applications in audio-domain music editing.
Via various objective evaluations, we show that the second model can better change the instrumentation of a multi-instrument music piece without much affecting the pitch structure.
By disentangling timbre and pitch, we envision that the model can contribute to generating more realistic music audio as well.
Cloud Computing emerges from the global economic crisis as an option to use computing resources from a more rational point of view.
In other words, a cheaper way to have IT resources.
However, issues such as security and privacy, SLAs (Service Level Agreements), resource sharing, and billing have left open questions about the real gains of that model.
This study aims to investigate the state of the art in Cloud Computing, identify gaps and challenges, synthesize the available evidence on both its use and development, and provide relevant information, clarifying open questions and commonly discussed issues about the model found in the literature.
The good practices of the systematic mapping study methodology were adopted in order to reach those objectives.
Although Cloud Computing is based on a business model with over 50 years of existence, evidence found in this study indicates that Cloud Computing still presents limitations that prevent the full use of its on-demand proposal.
One of the open challenges in designing robots that operate successfully in the unpredictable human environment is how to make them able to predict what actions they can perform on objects, and what their effects will be, i.e., the ability to perceive object affordances.
Since modeling all the possible world interactions is unfeasible, learning from experience is required, posing the challenge of collecting a large amount of experiences (i.e., training data).
Typically, a manipulative robot operates on external objects using its own hands (or similar end-effectors), but in some cases the use of tools may be desirable; nevertheless, it is reasonable to assume that while a robot can collect many sensorimotor experiences using its own hands, this cannot happen for all possible human-made tools.
Therefore, in this paper we investigate the developmental transition from hand to tool affordances: what sensorimotor skills that a robot has acquired with its bare hands can be employed for tool use?
By employing a visual and motor imagination mechanism to represent different hand postures compactly, we propose a probabilistic model to learn hand affordances, and we show how this model can generalize to estimate the affordances of previously unseen tools, ultimately supporting planning, decision-making and tool selection tasks in humanoid robots.
We present experimental results with the iCub humanoid robot, and we publicly release the collected sensorimotor data in the form of a hand posture affordances dataset.
On-device intelligence is gaining significant attention recently as it offers local data processing and low power consumption.
In this research, an on-device training circuitry for threshold-current memristors integrated in a crossbar structure is proposed.
Furthermore, alternate approaches of mapping the synaptic weights into fully-trained and semi-trained crossbars are investigated.
In a semi-trained crossbar, a confined subset of memristors is tuned and the remaining memristors are not programmed.
This translates to optimal resource utilization and power consumption, compared to a fully programmed crossbar.
The semi-trained crossbar architecture is applicable to a broad class of neural networks.
System level verification is performed with an extreme learning machine for binomial and multinomial classification.
The total power for a single 4x4 layer network, when implemented in the IBM 65nm node, is estimated to be ~42.16 µW and the area is estimated to be 26.48 µm x 22.35 µm.
Attention distributions of the generated translations are a useful bi-product of attention-based recurrent neural network translation models and can be treated as soft alignments between the input and output tokens.
In this work, we use attention distributions as a confidence metric for output translations.
We present two strategies of using the attention distributions: filtering out bad translations from a large back-translated corpus, and selecting the best translation in a hybrid setup of two different translation systems.
While manual evaluation indicated only a weak correlation between our confidence score and human judgments, the use-cases showed improvements of up to 2.22 BLEU points for filtering and 0.99 points for hybrid translation, tested on English<->German and English<->Latvian translation.
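One simple instantiation of such a confidence score — not necessarily the authors' exact metric — penalizes dispersed attention via the average entropy of each target token's attention distribution:

```python
import math

def attention_confidence(attn_rows):
    """Average negative entropy of per-target-token attention distributions:
    focused (peaky) attention yields a higher confidence score."""
    total = 0.0
    for row in attn_rows:
        total += -sum(p * math.log(p) for p in row if p > 0)
    return -total / len(attn_rows)

# Rows: attention over source tokens for each produced target token.
focused = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]]
dispersed = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
```

Filtering a back-translated corpus then amounts to keeping only the sentence pairs whose score exceeds a threshold, and the hybrid setup keeps whichever system's output scores higher.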
The Grey Wolf Optimizer (GWO) is a swarm intelligence meta-heuristic algorithm inspired by the hunting behaviour and social hierarchy of grey wolves in nature.
This paper analyses the use of chaos theory in this algorithm to improve its ability to escape local optima by replacing the key parameters by chaotic variables.
The optimal choice of chaotic maps is then used to apply the Chaotic Grey Wolf Optimizer (CGWO) to the problem of factoring a large semiprime into its prime factors.
Assuming the number of digits of the factors to be equal, this is a computationally difficult task upon which the RSA-cryptosystem relies.
This work proposes the use of a new objective function to solve the problem and uses the CGWO to optimize it and compute the factors.
It is shown that this function performs better than its predecessor for large semiprimes and that CGWO is an efficient algorithm to optimize it.
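A minimal sketch of the two ingredients — a chaotic map replacing a key GWO parameter, and an objective that is minimized exactly at a prime factor. Both the specific map and the objective form are our illustrative choices, not necessarily the paper's:

```python
def logistic_map(x, r=4.0):
    """Chaotic logistic map, a common choice in chaos-enhanced
    metaheuristics; it replaces a deterministic key parameter of GWO."""
    return r * x * (1.0 - x)

def factor_objective(n, p):
    """Zero exactly when round(p) divides the semiprime n, positive otherwise."""
    p = max(2, int(round(p)))
    return n % p

N = 101 * 103  # a small semiprime (10403)
chaos = [0.7]
for _ in range(9):
    chaos.append(logistic_map(chaos[-1]))
```

The optimizer's candidate positions are scored with the objective while the chaotic sequence perturbs the search, helping it escape local optima.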
The Intel Core i7 processor code named Nehalem provides a feature named Turbo Boost which opportunistically varies the frequencies of the processor's cores.
The frequency of a core is determined by core temperature, the number of active cores, the estimated power consumption, the estimated current consumption, and operating system frequency scaling requests.
For a chip multi-processor(CMP) that has a small number of physical cores and a small set of performance states, deciding the Turbo Boost frequency to use on a given core might not be difficult.
However, we do not know the complexity of this decision making process in the context of a large number of cores, scaling to the 100s, as predicted by researchers in the field.
Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.
Existing superpixel algorithms are not differentiable, making them difficult to integrate into otherwise end-to-end trainable deep neural networks.
We develop a new differentiable model for superpixel sampling that leverages deep networks for learning superpixel segmentation.
The resulting "Superpixel Sampling Network" (SSN) is end-to-end trainable, which allows learning task-specific superpixels with flexible loss functions and has fast runtime.
Extensive experimental analysis indicates that SSNs not only outperform existing superpixel algorithms on traditional segmentation benchmarks, but can also learn superpixels for other tasks.
In addition, SSNs can be easily integrated into downstream deep networks resulting in performance improvements.
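The differentiable core of SSN is a soft SLIC-style iteration: soft pixel-to-superpixel assignments followed by a soft center update. A numpy sketch of one such iteration (the real network computes assignments from learned deep features):

```python
import numpy as np

def soft_superpixel_step(features, centers, scale=1.0):
    """One differentiable iteration: soft assignments via a softmax over
    negative squared distances, then a weighted center update."""
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    q = np.exp(-scale * d2)
    q /= q.sum(axis=1, keepdims=True)            # soft assignment per pixel
    new_centers = (q.T @ features) / q.sum(axis=0)[:, None]
    return q, new_centers

# Two well-separated pixel clusters and two rough initial centers.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers = np.array([[1.0, 1.0], [4.0, 4.0]])
q, new_centers = soft_superpixel_step(feats, centers, scale=2.0)
```

Because every operation is smooth, gradients flow through the assignments, which is what lets task-specific losses shape the superpixels.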
Despite being the appearance-based classifier of choice in recent years, relatively few works have examined how much convolutional neural networks (CNNs) can improve performance on accepted expression recognition benchmarks and, more importantly, what it is they actually learn.
In this work, not only do we show that CNNs can achieve strong performance, but we also introduce an approach to decipher which portions of the face influence the CNN's predictions.
First, we train a zero-bias CNN on facial expression data and achieve, to our knowledge, state-of-the-art performance on two expression recognition benchmarks: the extended Cohn-Kanade (CK+) dataset and the Toronto Face Dataset (TFD).
We then qualitatively analyze the network by visualizing the spatial patterns that maximally excite different neurons in the convolutional layers and show how they resemble Facial Action Units (FAUs).
Finally, we use the FAU labels provided in the CK+ dataset to verify that the FAUs observed in our filter visualizations indeed align with the subject's facial movements.
In this paper, probabilistic shaping is numerically and experimentally investigated for increasing the transmission reach of wavelength division multiplexed (WDM) optical communication system employing quadrature amplitude modulation (QAM).
An optimized probability mass function (PMF) of the QAM symbols is first found from a modified Blahut-Arimoto algorithm for the optical channel.
A turbo coded bit interleaved coded modulation system is then applied, which relies on many-to-one labeling to achieve the desired PMF, thereby achieving shaping gain.
Pilot symbols at a rate of at most 2% are used for synchronization and equalization, making it possible to receive input constellations as large as 1024QAM.
The system is evaluated experimentally on a 10 GBaud, 5 channels WDM setup.
The maximum system reach is increased w.r.t. standard 1024QAM by 20% at input data rate of 4.65 bits/symbol and up to 75% at 5.46 bits/symbol.
It is shown that rate adaptation does not require changing of the modulation format.
The performance of the proposed 1024QAM shaped system is validated on all 5 channels of the WDM signal for selected distances and rates.
Finally, it was shown via EXIT charts and BER analysis that iterative demapping, while generally beneficial to the system, is not a requirement for achieving the shaping gain.
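Optimized PMFs for QAM shaping typically take a Maxwell-Boltzmann form, with low-energy symbols sent more often; a sketch of that shape (the paper instead derives its PMF with a modified Blahut-Arimoto algorithm):

```python
import numpy as np

def shaped_pmf(constellation, lam):
    """Maxwell-Boltzmann-style PMF: P(x) proportional to exp(-lam * |x|^2)."""
    energy = np.abs(constellation) ** 2
    p = np.exp(-lam * energy)
    return p / p.sum()

# One real axis of a 16QAM constellation (illustrative).
points = np.array([-3.0, -1.0, 1.0, 3.0])
pmf = shaped_pmf(points, lam=0.1)
# Inner (low-energy) points get higher probability than outer ones.
```

Varying the shaping parameter lam trades rate against average transmit energy, which is how rate adaptation without changing the modulation format becomes possible.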
This paper presents a bionic reflex control strategy for a kinematically constrained robotic finger.
Here, the bionic reflex is achieved through a force tracking impedance control strategy.
The dynamic model of the finger is reduced subject to kinematic constraints.
Thereafter, an impedance control strategy that allows exact tracking of forces is discussed.
Simulation results for a single finger holding a rectangular object against a flat surface are presented.
Bionic reflex response time is of the order of milliseconds.
The common feature of nearly all logic and memory devices is that they make use of stable units to represent 0's and 1's.
A completely different paradigm is based on three-terminal stochastic units, which could be called "p-bits", where the output is a random telegraphic signal continuously fluctuating between 0 and 1 with a tunable mean.
p-bits can be interconnected to receive weighted contributions from others in a network, and these weighted contributions can be chosen not only to solve problems of optimization and inference but also to implement precise Boolean functions in an inverted mode.
This inverted operation of Boolean gates is particularly striking: they provide inputs consistent with a given output, along with unique outputs for a given set of inputs.
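A minimal behavioral sketch of such a network (bipolar +/-1 convention, common in the p-bit literature; illustrative only, not the microcontroller implementation described here) could look like:

```python
import random, math

def pbit(I):
    """One p-bit sample: a random +/-1 whose mean is tanh(I)."""
    return 1 if random.uniform(-1, 1) < math.tanh(I) else -1

def run_network(J, h, steps=4000, beta=1.0):
    """Sequentially update interconnected p-bits; each receives a
    weighted contribution I_i = beta * (h_i + sum_j J[i][j] * m_j).
    Returns the time-averaged probability of each p-bit being 1."""
    n = len(h)
    m = [random.choice([-1, 1]) for _ in range(n)]
    counts = [0] * n
    for _ in range(steps):
        for i in range(n):
            I = beta * (h[i] + sum(J[i][j] * m[j] for j in range(n)))
            m[i] = pbit(I)
        for i in range(n):
            counts[i] += (m[i] + 1) // 2
    return [c / steps for c in counts]
```

With zero couplings and a positive bias h, the time-averaged output approaches (1 + tanh(beta*h))/2, illustrating the tunable mean described above.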
The existing demonstrations of accurate invertible logic are intriguing, but will these striking properties observed in computer simulations carry over to hardware implementations?
This paper uses individual microcontrollers to emulate p-bits, and we present results for a 4-bit ripple-carry adder with 48 p-bits and a 4-bit multiplier with 46 p-bits working in inverted mode as a factorizer.
Our results constitute a first step towards implementing p-bits with nanodevices, like stochastic Magnetic Tunnel Junctions.
Integer factorization is one of the vital algorithms discussed in the analysis of any black-box cipher suite whose cipher algorithm is based on number theory.
The problem originates from the discrete logarithm problem, which arises in the analysis of cryptographic algorithms as seen by a cryptanalyst.
Integer factorization also poses a challenge in computational science: obtaining the factors of a very large number is difficult with limited computing infrastructure.
This paper analyses the Pollard's Rho heuristic with varying input sizes to evaluate its performance in a multi-core environment and to estimate the threshold for each computing infrastructure.
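As a reference point for the heuristic being benchmarked, a minimal single-threaded sketch of Pollard's Rho (Floyd's cycle-detection variant; the multi-core setup evaluated in the paper is not reproduced here) is:

```python
import math

def pollards_rho(n, c=1):
    """Pollard's Rho heuristic: find a non-trivial factor of composite n.

    Iterates the pseudo-random map x -> (x*x + c) mod n with Floyd's
    tortoise-and-hare cycle detection; a collision modulo an unknown
    prime factor p of n reveals p via gcd, in roughly O(n**0.25) steps.
    """
    if n % 2 == 0:
        return 2
    x = y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n          # tortoise: one step
        y = (y * y + c) % n          # hare: two steps
        y = (y * y + c) % n
        d = math.gcd(abs(x - y), n)
    # d == n means the walk collided trivially; retry with a new constant
    return d if d != n else pollards_rho(n, c + 1)
```

For example, `pollards_rho(8051)` returns a non-trivial divisor of 8051 (= 83 x 97).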
The volume of convolutional neural network (CNN) models proposed for face recognition has been growing continuously to better fit the large amounts of training data.
When training data are obtained from the Internet, the labels are likely to be ambiguous and inaccurate.
This paper presents a Light CNN framework to learn a compact embedding on the large-scale face data with massive noisy labels.
First, we introduce a variation of maxout activation, called Max-Feature-Map (MFM), into each convolutional layer of CNN.
Unlike maxout activation, which uses many feature maps to linearly approximate an arbitrary convex activation function, MFM does so via a competitive relationship.
MFM can not only separate noisy and informative signals but also play the role of feature selection between two feature maps.
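As a sketch (NumPy, channels-first layout assumed), MFM over a convolutional output with 2k feature maps is simply an element-wise maximum between the two halves of the channel axis:

```python
import numpy as np

def max_feature_map(x):
    """Max-Feature-Map (MFM) activation: split the channel axis in
    half and take the element-wise maximum of the two halves.

    x: array of shape (channels, H, W) with an even channel count.
    Returns (channels // 2, H, W); the competitive max suppresses the
    weaker of each feature-map pair, acting as feature selection.
    """
    c = x.shape[0]
    assert c % 2 == 0, "MFM needs an even number of input channels"
    return np.maximum(x[: c // 2], x[c // 2 :])
```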
Second, three networks are carefully designed to obtain better performance while reducing the number of parameters and computational costs.
Lastly, a semantic bootstrapping method is proposed to make the prediction of the networks more consistent with noisy labels.
Experimental results show that the proposed framework can utilize large-scale noisy data to learn a Light model that is efficient in computational costs and storage spaces.
The learned single network with a 256-D representation achieves state-of-the-art results on various face benchmarks without fine-tuning.
The code is released on https://github.com/AlfredXiangWu/LightCNN.
Online social networks (OSNs) have become the main medium for connecting people, sharing knowledge and information, and for communication.
The social connections between people using these OSNs are formed as virtual links (e.g., friendship and following connections).
These links are the heart of today's OSNs, as they facilitate all of the activities that the members of a social network can do.
However, many of these networks suffer from noisy links, i.e., links that do not reflect a real relationship or that have a low intensity, which change the structure of the network and prevent accurate analysis.
Hence, a process for assessing and ranking the links in a social network is crucial in order to sustain a healthy and genuine network.
Here, we define link assessment as the process of identifying noisy and non-noisy links in a network.
In this paper, we address the problem of link assessment and link ranking in social networks using external interaction networks.
Beyond the friendship social network itself, exogenous interaction networks are utilized to make the assessment process more meaningful.
We employed machine learning classifiers for assessing and ranking the links in the social network of interest using the data from exogenous interaction networks.
The method was tested with two different datasets, each containing the social network of interest, with the ground truth, along with the exogenous interaction networks.
The results show that it is possible to effectively assess the links of a social network using only the structure of a single network of the exogenous interaction networks, and also using the structure of the whole set of exogenous interaction networks.
The experiments showed that some classifiers do better than others regarding both link classification and link ranking.
We present and evaluate a compiler from Prolog (and extensions) to JavaScript which makes it possible to use (constraint) logic programming to develop the client side of web applications while being compliant with current industry standards.
Targeting JavaScript makes (C)LP programs executable in virtually every modern computing device with no additional software requirements from the point of view of the user.
In turn, the use of a very high-level language facilitates the development of high-quality, complex software.
The compiler is a back end of the Ciao system and supports most of its features, including its module system and its rich language extension mechanism based on packages.
We present an overview of the compilation process and a detailed description of the run-time system, including the support for modular compilation into separate JavaScript code.
We demonstrate the maturity of the compiler by testing it with complex code such as a CLP(FD) library written in Prolog with attributed variables.
Finally, we validate our proposal by measuring the performance of some LP and CLP(FD) benchmarks running on top of major JavaScript engines.
We propose a novel training algorithm for reinforcement learning which combines the strength of deep Q-learning with a constrained optimization approach to tighten optimality and encourage faster reward propagation.
Our novel technique makes deep reinforcement learning more practical by drastically reducing the training time.
We evaluate the performance of our approach on the 49 games of the challenging Arcade Learning Environment, and report significant improvements in both training time and accuracy.
We present a straightforward procedure to evaluate the scientific contribution of territories and institutions that combines the size-dependent geometric mean, Q, of the number of research documents (N) and citations (C), and a scale-free measure of quality, q=C/N.
We introduce a Global Research Output (GRO-index) as the geometric mean of Q and q.
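From these definitions, the index reduces to a two-line computation (sketch; variable names are ours):

```python
import math

def gro_index(n_docs, citations):
    """GRO-index sketch from the definitions above: the geometric
    mean of the size-dependent Q = sqrt(N * C) (itself the geometric
    mean of documents N and citations C) and the scale-free quality
    q = C / N."""
    Q = math.sqrt(n_docs * citations)   # geometric mean of N and C
    q = citations / n_docs              # citations per document
    return math.sqrt(Q * q)
```

For instance, a territory with N = 100 documents and C = 400 citations has Q = 200, q = 4, and GRO = sqrt(800) ≈ 28.3.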
We show that the GRO-index correlates with the h-index, but appears to be more strongly correlated with other well-known, widely used bibliometric indicators.
We also compute relative GRO-indexes (GROr) associated with the scientific production within research fields.
We note that although total sums of GROr values are larger than the GRO-index, due to the non-linearity in the computation of the geometric means, both counts are nevertheless highly correlated.
That enables us to make useful comparative analyses among territories and institutions.
Furthermore, to identify strengths and weaknesses of a given country or institution, we compute a Relative Research Output count (RROr-index) to tackle variations of the C/N ratio across research fields.
Moreover, by using a wealth-index also based on quantitative and qualitative variables, we show that the GRO and RRO indexes are highly correlated with the wealth of the countries and the states of the USA.
Given the simplicity of the procedures introduced in this paper and the fact that their results are easily understandable by non-specialists, we believe they could become as useful for the assessment of the research output of countries and institutions as the impact factor is for journals or the h-index for individuals.
Hybrid beamforming (HB) has been widely studied for reducing the number of costly radio frequency (RF) chains in massive multiple-input multiple-output (MIMO) systems.
However, previous works on HB are limited to a single user equipment (UE) or a single group of UEs, employing the frequency-flat first-level analog beamforming (AB) that cannot be applied to multiple groups of UEs served in different frequency resources in an orthogonal frequency-division multiplexing (OFDM) system.
In this paper, a novel HB algorithm with unified AB based on the spatial covariance matrix (SCM) knowledge of all UEs is proposed for a massive MIMO-OFDM system in order to support multiple groups of UEs.
The proposed HB method, with a much smaller number of RF chains, achieves more than 95% of the performance of full digital beamforming.
In addition, a novel practical subspace construction (SC) algorithm based on partial channel state information is proposed to estimate the required SCM.
The proposed SC method offers more than 97% of the performance of the perfect-SCM case.
With the proposed methods, significant cost and power savings can be achieved without large loss in performance.
Furthermore, the proposed methods can be applied to massive MIMO-OFDM systems in both time-division duplex and frequency-division duplex.
Recent developments in the field of networking have provided opportunities for networks to efficiently cater to the application-specific needs of a user.
In this context, a routing path not only depends upon the network states but is also calculated in the best interest of an application using the network.
These advanced routing algorithms can exploit application state data to enhance advanced network services such as anycast, edge cloud computing and cyber physical systems (CPS).
In this work, we aim to design such a routing algorithm where the router decisions are based upon convex optimization techniques.
This paper presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics.
Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability.
In order to resolve the instability issue, existing methods have assumed that the system model is known and used pre-actuation or inverse approximation techniques.
In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input-output data.
Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking.
Our results also show that including more information in the training, as is commonly assumed to be useful, does not lead to better performance and may instead trigger instability and impair the effectiveness of the overall approach.
Centroid-based methods, including k-means and fuzzy c-means, are known as effective and easy-to-implement approaches to clustering in many applications.
However, these algorithms cannot be directly applied to supervised tasks.
This paper thus presents a generative model extending the centroid-based clustering approach to be applicable to classification and regression tasks.
Given an arbitrary loss function, the proposed approach, termed Supervised Fuzzy Partitioning (SFP), incorporates label information into its objective function through a surrogate term penalizing the empirical risk.
Entropy-based regularization is also employed to fuzzify the partition and to weight features, enabling the method to capture more complex patterns, identify significant features, and yield better performance on high-dimensional data.
An iterative algorithm based on block coordinate descent scheme is formulated to efficiently find a local optimum.
Extensive classification experiments on synthetic, real-world, and high-dimensional datasets demonstrate that the predictive performance of SFP is competitive with state-of-the-art algorithms such as random forest and SVM.
The SFP has a major advantage over such methods, in that it not only leads to a flexible, nonlinear model but also can exploit any convex loss function in the training phase without compromising computational efficiency.
Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions.
In the present work, we extend the usage of LRP to recurrent neural networks.
We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent network architectures such as LSTMs and GRUs.
We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a gradient-based related method which was used in previous work.
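A minimal sketch of the two relevance rules involved (epsilon-stabilized LRP for linear connections, and the rule for multiplicative gate-signal connections in its "gate receives zero relevance" form; array shapes and names are ours):

```python
import numpy as np

def lrp_linear(R_out, w, x, eps=1e-3):
    """Epsilon-LRP for a linear layer z_j = sum_i x_i * w[i, j]:
    redistribute the output relevance R_out to the inputs in
    proportion to each input's contribution to z_j."""
    z = x @ w                                    # pre-activations, shape (out,)
    denom = z + eps * np.sign(z)                 # stabilizer avoids division by 0
    return ((w * x[:, None]) / denom) @ R_out    # relevance per input, shape (in,)

def lrp_multiplicative(R_out):
    """Rule for a two-way multiplicative connection z = gate * source,
    as found in LSTM/GRU cells: the gate acts as a switch and gets
    zero relevance; the source signal inherits all of R_out."""
    return 0.0 * R_out, R_out                    # (R_gate, R_source)
```

For a linear unit with two equal inputs, the output relevance is split evenly between them, while in a gated product the gate neuron is bypassed entirely.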
We present a dockerized version of the real-time strategy game StarCraft: Brood War, commonly used as a domain for AI research, with a pre-installed collection of AI development tools supporting all the major types of StarCraft bots.
This provides a convenient way to deploy StarCraft AIs on numerous hosts at once and across multiple platforms despite limited OS support of StarCraft.
In this technical report, we describe the design of our Docker images and present a few use cases.
Compressive sensing is a method to recover the original image from undersampled measurements.
In order to overcome the ill-posedness of this inverse problem, image priors are used such as sparsity in the wavelet domain, minimum total-variation, or self-similarity.
Recently, deep learning based compressive image recovery methods have been proposed and have yielded state-of-the-art performances.
They used deep learning based data-driven approaches instead of hand-crafted image priors to solve the ill-posed inverse problem with undersampled data.
Ironically, training deep neural networks for them requires "clean" ground truth images, but obtaining the best quality images from undersampled data requires well-trained deep neural networks.
To resolve this dilemma, we propose novel methods based on two well-grounded theories: denoiser-approximate message passing and Stein's unbiased risk estimator.
Our proposed methods were able to train deep learning based image denoisers from undersampled measurements without ground truth images and without image priors, and to recover images with state-of-the-art qualities from undersampled data.
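One of the two ingredients, Stein's unbiased risk estimator with a Monte-Carlo divergence estimate, can be sketched as a ground-truth-free loss for a denoiser f on noisy data y = x + N(0, sigma^2) (illustrative sketch; the exact training objective in the paper may differ):

```python
import numpy as np

def mc_sure_loss(f, y, sigma, eps=1e-3, rng=np.random):
    """Monte-Carlo SURE: an unbiased estimate of the denoising MSE of
    f on y = x + N(0, sigma^2) that needs no ground-truth x.

    SURE = ||f(y) - y||^2 / n - sigma^2 + (2 sigma^2 / n) * div f(y),
    with the divergence estimated by a single random probe b:
    div f(y) ~ b . (f(y + eps*b) - f(y)) / eps.
    """
    n = y.size
    fy = f(y)
    b = rng.standard_normal(y.shape)
    div = b.ravel() @ ((f(y + eps * b) - fy).ravel()) / eps
    return np.sum((fy - y) ** 2) / n - sigma ** 2 + 2 * sigma ** 2 * div / n
```

Minimizing this quantity over the denoiser's parameters approximates minimizing the true (unobservable) MSE against the clean image.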
We evaluated our methods for various compressive sensing recovery problems with Gaussian random, coded diffraction pattern, and compressive sensing MRI measurement matrices.
Our methods yielded state-of-the-art performances for all cases without ground truth images and without image priors.
They also yielded comparable performances to the methods with ground truth data.
We prove that an auxiliary two-point boundary value problem presented in V. L. Kharitonov, Lyapunov matrices for a class of time delay systems, Systems & Control Letters 55 (2006) 610-617 has linearly dependent boundary conditions, and consequently a unique solution does not exist.
Therefore, the two-point boundary value problem presented therein fails to be a basis for constructing Lyapunov matrices for the class of time delay systems investigated.
Linear programming is now included in undergraduate and postgraduate algorithms courses for computer science majors.
We give a self-contained treatment of an interior-point method which is particularly tailored to the typical mathematical background of CS students.
In particular, only limited knowledge of linear algebra and calculus is assumed.
We present a versatile and fast MATLAB program (UmUTracker) that automatically detects and tracks particles by analyzing video sequences acquired by either light microscopy or digital in-line holographic microscopy.
Our program detects the 2D lateral positions of particles with an algorithm based on the isosceles triangle transform, and reconstructs their 3D axial positions by a fast implementation of the Rayleigh-Sommerfeld model using a radial intensity profile.
To validate the accuracy and performance of our program, we first track the 2D position of polystyrene particles using bright field and digital holographic microscopy.
Second, we determine the 3D particle position by analyzing synthetic and experimentally acquired holograms.
Finally, to highlight the full program features, we profile the microfluidic flow in a 100-micrometer-high flow chamber.
This result agrees with computational fluid dynamic simulations.
On a regular desktop computer UmUTracker can detect, analyze, and track multiple particles at 5 frames per second for a template size of 201 x 201 in a 1024 x 1024 image.
To enhance usability and to make it easy to implement new functions we used object-oriented programming.
UmUTracker is suitable for studies of particle dynamics, cell localization, colloids, and microfluidic flow measurement.
We present Web-STAR, an online platform for story understanding built on top of the STAR reasoning engine for STory comprehension through ARgumentation.
The platform includes a web-based IDE, integration with the STAR system, and a web service infrastructure to support integration with other systems that rely on story understanding functionality to complete their tasks.
The platform also delivers a number of "social" features, including a community repository for public story sharing with a built-in commenting system, and tools for collaborative story editing that can be used for team development projects and for educational purposes.
In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy Policy Evaluation (OPE) when the true behaviour policy is unknown.
Via a series of empirical studies, we demonstrate that accurate OPE is strongly dependent on the calibration of estimated behaviour policy models, i.e., how precisely the behaviour policy is estimated from data.
We show how powerful parametric models such as neural networks can result in highly uncalibrated behaviour policy models on a real-world medical dataset, and illustrate how a simple, non-parametric, k-nearest neighbours model produces better calibrated behaviour policy estimates and can be used to obtain superior importance sampling-based OPE estimates.
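A sketch of the two pieces (a k-nearest-neighbours behaviour-policy estimate and a plain importance-sampling OPE estimator; the interfaces and names are ours, not the paper's):

```python
import numpy as np

def knn_behaviour_policy(states, actions, n_actions, k=5):
    """Estimate pi_b(a|s) non-parametrically: the action frequencies
    among the k nearest neighbours of s in the logged data."""
    states = np.asarray(states, float)
    actions = np.asarray(actions)
    def pi_b(s):
        d = np.linalg.norm(states - np.asarray(s, float), axis=1)
        nn = np.argsort(d)[:k]
        return np.bincount(actions[nn], minlength=n_actions) / k
    return pi_b

def importance_sampling_ope(trajs, pi_e, pi_b):
    """Per-trajectory importance-sampling estimate of the target
    policy's value: average of (prod_t pi_e(a|s)/pi_b(a|s)) * return."""
    vals = []
    for traj in trajs:                           # traj: list of (s, a, r)
        w, ret = 1.0, 0.0
        for s, a, r in traj:
            w *= pi_e(s)[a] / max(pi_b(s)[a], 1e-8)
            ret += r
        vals.append(w * ret)
    return float(np.mean(vals))
```

A poorly calibrated pi_b inflates or deflates the importance weights w, which is exactly the failure mode the studies above document for uncalibrated parametric models.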
We present a framework for online inference in the presence of a nonexhaustively defined set of classes that incorporates supervised classification with class discovery and modeling.
A Dirichlet process prior (DPP) model defined over class distributions ensures that both known and unknown class distributions originate according to a common base distribution.
In an attempt to automatically discover potentially interesting class formations, the prior model is coupled with a suitably chosen data model, and sequential Monte Carlo sampling is used to perform online inference.
Our research is driven by a biodetection application, where a new class of pathogen may suddenly appear, and the rapid increase in the number of samples originating from this class indicates the onset of an outbreak.
It is difficult to estimate the midsagittal plane of human subjects with craniomaxillofacial (CMF) deformities.
We have developed a LAndmark GEometric Routine (LAGER), which automatically estimates a midsagittal plane for such subjects.
The LAGER algorithm is based on the assumption that the optimal midsagittal plane of a patient with a deformity is the premorbid midsagittal plane of the patient (i.e., hypothetically normal, without the deformity).
The LAGER algorithm consists of three steps.
The first step quantifies the asymmetry of the landmarks using a Euclidean distance matrix analysis and ranks the landmarks according to their degree of asymmetry.
The second step uses a recursive algorithm to drop outlier landmarks.
The third step inputs the remaining landmarks into an optimization algorithm to determine an optimal midsagittal plane.
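As an illustration of what the final step might compute (a hypothetical closed-form stand-in, not the authors' actual optimizer), given the retained bilateral landmark pairs one could fit a plane whose normal is the average left-right direction, passing through the centroid of the pair midpoints:

```python
import numpy as np

def fit_midsagittal_plane(left, right):
    """Hypothetical plane-fitting sketch: left, right are (n, 3)
    arrays of mirrored landmark pairs. Returns (normal, point)
    defining the plane n . (x - p) = 0."""
    left, right = np.asarray(left, float), np.asarray(right, float)
    diffs = left - right                                  # left-to-right directions
    diffs /= np.linalg.norm(diffs, axis=1, keepdims=True)
    normal = diffs.mean(axis=0)
    normal /= np.linalg.norm(normal)
    point = ((left + right) / 2).mean(axis=0)             # centroid of midpoints
    return normal, point
```

For perfectly symmetric landmarks this recovers the true plane of symmetry; asymmetric outlier pairs (dropped in step two) would otherwise bias both the normal and the centroid.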
We validate LAGER on 20 synthetic models mimicking the skulls of real patients with CMF deformities.
The results indicated that all the LAGER algorithm-generated midsagittal planes met clinical criteria.
Thus it can be used clinically to determine the midsagittal plane for patients with CMF deformities.
Distantly-supervised Relation Extraction (RE) methods train an extractor by automatically aligning relation instances in a Knowledge Base (KB) with unstructured text.
In addition to relation instances, KBs often contain other relevant side information, such as aliases of relations (e.g., founded and co-founded are aliases for the relation founderOfCompany).
RE models usually ignore such readily available side information.
In this paper, we propose RESIDE, a distantly-supervised neural relation extraction method which utilizes additional side information from KBs for improved relation extraction.
It uses entity type and relation alias information for imposing soft constraints while predicting relations.
RESIDE employs Graph Convolution Networks (GCN) to encode syntactic information from text and improves performance even when limited side information is available.
Through extensive experiments on benchmark datasets, we demonstrate RESIDE's effectiveness.
We have made RESIDE's source code available to encourage reproducible research.
Optimal use of computing resources requires extensive coding, tuning and benchmarking.
To boost developer productivity in these time consuming tasks, we introduce the Experimental Linear Algebra Performance Studies framework (ELAPS), a multi-platform open source environment for fast yet powerful performance experimentation with dense linear algebra kernels, algorithms, and libraries.
ELAPS allows users to construct experiments to investigate how performance and efficiency vary depending on factors such as caching, algorithmic parameters, problem size, and parallelism.
Experiments are designed either through Python scripts or a specialized GUI, and run on the whole spectrum of architectures, ranging from laptops to clusters, accelerators, and supercomputers.
The resulting experiment reports provide various metrics and statistics that can be analyzed both numerically and visually.
We demonstrate the use of ELAPS in four concrete application scenarios and in as many computing environments, illustrating its practical value in supporting critical performance decisions.
Keeping students engaged with the course content outside the classroom is a challenging task.
Since learning during the undergraduate years occurs not only through student engagement in class but also during out-of-class activities, we need to redesign and reinvent such activities for this and future generations of students.
Although active learning has been widely used to improve in-class student learning and engagement, its usage outside the classroom is neither widespread nor well researched.
Active learning is often not utilized for out-of-class activities, and traditional unsupervised activities are mostly used to keep students engaged with the content after they leave the classroom.
While tremendous research has been performed on improving student learning and engagement in the classroom, there is little research on improving out-of-class learning and engagement.
This poster presents an approach to redesign traditional out-of-class activities with the help of mobile apps that are interactive and adaptive, and that provide personalization to satisfy students' needs outside the classroom so that an optimal learning experience can be achieved.
The conventional high-speed Wi-Fi has recently become a contender for low-power Internet-of-Things (IoT) communications.
OFDM continues its adoption in the new IoT Wi-Fi standard due to its spectrum efficiency that can support the demand of massive IoT connectivity.
While the IoT Wi-Fi standard offers many new features to improve power and spectrum efficiency, the basic physical layer (PHY) structure of transceiver design still conforms to its conventional design rationale where access points (AP) and clients employ the same OFDM PHY.
In this paper, we argue that current Wi-Fi PHY design does not take full advantage of the inherent asymmetry between AP and IoT.
To fill the gap, we propose an asymmetric design where IoT devices transmit uplink packets using the lowest power while pushing all the decoding burdens to the AP side.
Such a design utilizes the sufficient power and computational resources at AP to trade for the transmission (TX) power of IoT devices.
The core technique enabling this asymmetric design is that the AP exploits its high clock rate to boost its decoding ability.
We provide an implementation of our design and show that it can reduce the IoT's TX power by boosting the decoding capability at the receivers.
The recent trend toward increasingly deep convolutional neural networks (CNNs) leads to a higher demand of computational power and memory storage.
Consequently, the deployment of CNNs in hardware has become more challenging.
In this paper, we propose an Intra-Kernel Regular (IKR) pruning scheme to reduce the size and computational complexity of the CNNs by removing redundant weights at a fine-grained level.
Unlike other pruning methods such as Fine-Grained pruning, IKR pruning maintains regular kernel structures that are exploitable in a hardware accelerator.
Experimental results demonstrate up to 10x parameter reduction and 7x computational reduction at a cost of less than 1% degradation in accuracy versus the un-pruned case.
We provide theoretical investigation of curriculum learning in the context of stochastic gradient descent when optimizing the convex linear regression loss.
We prove that the rate of convergence of an ideal curriculum learning method is monotonically increasing with the difficulty of the examples.
Moreover, among all equally difficult points, convergence is faster when using points which incur higher loss with respect to the current hypothesis.
We then analyze curriculum learning in the context of training a CNN.
We describe a method which infers the curriculum by way of transfer learning from another network, pre-trained on a different task.
While this approach can only approximate the ideal curriculum, we observe empirically similar behavior to the one predicted by the theory, namely, a significant boost in convergence speed at the beginning of training.
When the task is made more difficult, improvement in generalization performance is also observed.
Finally, curriculum learning exhibits robustness against unfavorable conditions such as excessive regularization.
A graph G=(V,E) is a pairwise compatibility graph (PCG) if there exists an edge-weighted tree T and two non-negative real numbers d and D such that each leaf u of T is a node of V, and the edge (u,v) belongs to E iff d <= d_T(u,v) <= D, where d_T(u,v) is the sum of the weights of the edges on the unique path from u to v in T. The main issue concerning these graphs consists in characterizing them.
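The definition translates directly into a construction (sketch; the tree is given as an adjacency map, and the PCG is returned as an edge set over the leaves):

```python
from collections import deque

def pcg_from_tree(adj, leaves, dmin, dmax):
    """Build the pairwise compatibility graph of an edge-weighted tree:
    leaves u, v are adjacent iff dmin <= d_T(u, v) <= dmax.

    adj: {node: [(neighbour, weight), ...]} describing the tree T.
    """
    def dists_from(src):
        # BFS accumulating path weights; exact on a tree (unique paths)
        d = {src: 0.0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v, w in adj[u]:
                if v not in d:
                    d[v] = d[u] + w
                    q.append(v)
        return d

    edges = set()
    for i, u in enumerate(leaves):
        du = dists_from(u)
        for v in leaves[i + 1:]:
            if dmin <= du[v] <= dmax:
                edges.add((u, v))
    return edges
```

For example, a star with center x and leaves a, b, c at weights 1, 1, 3 gives d(a,b)=2 and d(a,c)=d(b,c)=4, so with dmin=1, dmax=3 only the edge (a,b) survives.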
In this note we prove the inclusion in the PCG class of threshold tolerance graphs and the non-inclusion of a number of intersection graphs, such as disk and grid intersection graphs, circular arc and tolerance graphs.
The non-inclusion of some superclasses (trapezoid, permutation and rectangle intersection graphs) follows.
Lurking is a complex user-behavioral phenomenon that occurs in all large-scale online communities and social networks.
It generally refers to the behavior characterizing users that benefit from the information produced by others in the community without actively contributing back to the production of social content.
The amount and evolution of lurkers may strongly affect an online social environment, therefore understanding the lurking dynamics and identifying strategies to curb this trend are relevant problems.
In this regard, we introduce the Lurker Game, i.e., a model for analyzing the transitions from a lurking to a non-lurking (i.e., active) user role, and vice versa, in terms of evolutionary game theory.
We evaluate the proposed Lurker Game by arranging agents on complex networks and analyzing the system evolution, seeking relations between the network topology and the final equilibrium of the game.
Results suggest that the Lurker Game is suitable to model the lurking dynamics, showing how the adoption of rewarding mechanisms combined with the modeling of hypothetical heterogeneity of users' interests may lead users in an online community towards a cooperative behavior.
The paper discusses various applications of permutation group theory in the synthesis of reversible logic circuits consisting of Toffoli gates with negative control lines.
An asymptotically optimal synthesis algorithm for circuits consisting of gates from the NCT library is described.
An algorithm for gate complexity reduction, based on equivalent replacements of gates compositions, is introduced.
A new approach for combining a group-theory-based synthesis algorithm with a Reed-Muller-spectra-based synthesis algorithm is described.
Experimental results are presented to show that the proposed synthesis techniques allow a reduction in input lines count, gate complexity or quantum cost of reversible circuits for various benchmark functions.
The instance segmentation can be considered an extension of the object detection problem where bounding boxes are replaced by object contours.
Strictly speaking, the problem requires identifying each pixel's instance and class independently of the means used to this end.
The advantage of instance segmentation over the usual object detection lies in the precise delineation of objects improving object localization.
Additionally, object contours allow the evaluation of partial occlusion with basic image processing algorithms.
This work approaches the instance segmentation problem as an annotation problem and presents a novel technique to encode and decode ground truth annotations.
We propose a mathematical representation of instances that any deep semantic segmentation model can learn and generalize.
Each individual instance is represented by a center of mass and a field of vectors pointing to it.
This encoding technique has been denominated Distance to Center of Mass Encoding (DCME).
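A sketch of the encoding step (NumPy; decoding, by having pixels vote for the center they point to, is analogous and omitted here):

```python
import numpy as np

def dcme_encode(instance_map):
    """Sketch of Distance-to-Center-of-Mass Encoding: for every
    instance pixel, store the 2D vector pointing from the pixel to
    its instance's center of mass (background pixels keep zeros).

    instance_map: (H, W) int array, 0 = background, k > 0 = instance id.
    Returns an (H, W, 2) float array of (dy, dx) offsets.
    """
    h, w = instance_map.shape
    field = np.zeros((h, w, 2))
    ys, xs = np.mgrid[:h, :w]
    for k in np.unique(instance_map):
        if k == 0:
            continue
        mask = instance_map == k
        cy, cx = ys[mask].mean(), xs[mask].mean()   # center of mass
        field[mask, 0] = cy - ys[mask]              # vector toward center
        field[mask, 1] = cx - xs[mask]
    return field
```

Since the target is a dense per-pixel regression map, any semantic segmentation backbone can learn it, which is the key property claimed above.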
In this article we consider the basic ideas, approaches, and results of developing mathematical knowledge management technologies based on ontologies.
These solutions form the basis of a specialized digital ecosystem, OntoMath, which consists of the Mocassin ontology of the logical structure of mathematical documents, the OntoMathPRO ontology of mathematical knowledge, text analysis tools, a recommender system, and other applications for managing mathematical knowledge.
These studies are in line with the ideas of creating a distributed system of interconnected repositories of digitized mathematical documents and with the project to create a World Digital Mathematical Library.
Automated melodic phrase detection and segmentation is a classical task in content-based music information retrieval and also the key towards automated music structure analysis.
However, traditional methods still cannot satisfy practical requirements.
In this paper, we explore and adapt various neural network architectures to see if they can be generalized to work with the symbolic representation of music and produce satisfactory melodic phrase segmentation.
The main issue of applying deep-learning methods to phrase detection is the sparse labeling problem of training sets.
We propose two tailored label-engineering schemes with corresponding training techniques for different neural networks in order to make decisions at the sequence level.
Experimental results show that the CNN-CRF architecture performs best, offering finer segmentation and faster training, while CNN, Bi-LSTM-CNN, and Bi-LSTM-CRF are acceptable alternatives.
We present in this paper a new family of implicit functions for synthesizing a wide variety of 3D surfaces.
The basis of this family consists of common functions: the rectangular pulse function, the sawtooth pulse function, the triangular pulse function, the staircase function, and the power function.
By combining these common functions, named constituent functions, into one implicit function and by varying some of its parameters, we can synthesize a wide variety of 3D surfaces and control their deformations.
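A minimal sketch of such constituent functions and one illustrative combination (the specific combination and parameter names are our own, not the paper's exact formula):

```python
import math

def rect_pulse(x, period=2.0):
    """1 on the first half of each period, 0 on the second half."""
    return 1.0 if (x % period) < period / 2 else 0.0

def sawtooth(x, period=2.0):
    """Ramps linearly from 0 to 1 over each period."""
    return (x % period) / period

def triangle(x, period=2.0):
    """Rises 0 -> 1 over the first half-period, falls back over the second."""
    t = (x % period) / period
    return 2 * t if t < 0.5 else 2 * (1 - t)

def staircase(x, step=1.0):
    return float(math.floor(x / step))

def implicit_surface(x, y, z, a=1.0, b=0.5):
    """F(x, y, z) = 0 defines the surface; varying a and b deforms it."""
    return triangle(x) + a * sawtooth(y) + b * rect_pulse(z) - 1.0
```

Evaluating the implicit function on a grid and extracting its zero level set (e.g., via marching cubes) yields the synthesized surface.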
The lack of reliable data in developing countries is a major obstacle to sustainable development, food security, and disaster relief.
Poverty data, for example, is typically scarce, sparse in coverage, and labor-intensive to obtain.
Remote sensing data such as high-resolution satellite imagery, on the other hand, is becoming increasingly available and inexpensive.
Unfortunately, such data is highly unstructured and currently no techniques exist to automatically extract useful insights to inform policy decisions and help direct humanitarian efforts.
We propose a novel machine learning approach to extract large-scale socioeconomic indicators from high-resolution satellite imagery.
The main challenge is that training data is very scarce, making it difficult to apply modern techniques such as Convolutional Neural Networks (CNN).
We therefore propose a transfer learning approach where nighttime light intensities are used as a data-rich proxy.
We train a fully convolutional CNN model to predict nighttime lights from daytime imagery, simultaneously learning features that are useful for poverty prediction.
The model learns filters identifying different terrains and man-made structures, including roads, buildings, and farmlands, without any supervision beyond nighttime lights.
We demonstrate that these learned features are highly informative for poverty mapping, even approaching the predictive performance of survey data collected in the field.
Many recent works that study the performance of multi-input multi-output (MIMO) systems in practice assume a Kronecker model where the variances of the channel entries, upon decomposition on to the transmit and the receive eigen-bases, admit a separable form.
Measurement campaigns, however, show that the Kronecker model results in poor estimates for capacity.
Motivated by these observations, a channel model that does not impose a separable structure has been recently proposed and shown to fit the capacity of measured channels better.
In this work, we show that this recently proposed modeling framework can be viewed as a natural consequence of channel decomposition on to its canonical coordinates, the transmit and/or the receive eigen-bases.
Using tools from random matrix theory, we then establish the theoretical basis behind the Kronecker mismatch at the low- and the high-SNR extremes: 1) Sparsity of the dominant statistical degrees of freedom (DoF) in the true channel at the low-SNR extreme, and 2) Non-regularity of the sparsity structure (disparities in the distribution of the DoF across the rows and the columns) at the high-SNR extreme.
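The separability at issue can be made concrete: under the Kronecker model, the variance profile V, with V[i][j] the variance of the (i, j) channel entry in the eigen-bases, factors as an outer product u v^T, so every 2x2 minor of V must vanish. A sketch of that test (our illustration, not the papers' procedure):

```python
def is_kronecker_separable(V, tol=1e-9):
    """Return True iff the (nonnegative) variance profile V factors as an
    outer product u v^T, i.e. all 2x2 minors of V are (near) zero."""
    m, n = len(V), len(V[0])
    for i in range(m):
        for k in range(i + 1, m):
            for j in range(n):
                for l in range(j + 1, n):
                    if abs(V[i][j] * V[k][l] - V[i][l] * V[k][j]) > tol:
                        return False
    return True
```

A non-separable variance profile, as in the measured channels discussed above, fails this minor test.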
The Model / View / Controller design pattern divides an application environment into three components to handle the user-interactions, computations and output respectively.
This separation greatly favors architectural reusability.
The pattern works well in the case of a single address space, but it has not proven efficient for web applications involving multiple address spaces.
Web applications force the designers to decide which of the components of the pattern are to be partitioned between the server and client(s) before the design phase commences.
For any rapidly growing web application, it is very difficult to incorporate future changes in policies related to partitioning.
One solution to this problem is to duplicate the Model and Controller components at both the server and client(s).
However, this may add further problems like delayed data fetch, security and scalability issues.
In order to overcome this, a new architecture SPIM has been proposed that deals with the partitioning problem in an alternative way.
SPIM shows substantial performance improvements when compared with a similar architecture.
With the considerable development of customer-to-customer (C2C) e-commerce in recent years, there is a big demand for an effective recommendation system that suggests suitable websites where users can sell their items according to their specified needs.
Nonetheless, e-commerce recommendation systems are mostly designed for business-to-customer (B2C) websites, where the systems offer the consumers the products that they might like to buy.
Almost none of the related research works focus on choosing selling sites for target items.
In this paper, we introduce an approach that recommends the selling websites based upon the item's description, category, and desired selling price.
This approach employs NoSQL data-based machine learning techniques for building and training topic models and classification models.
The trained models can then be used to rank the websites dynamically with respect to the user needs.
The experimental results with real-world datasets from Vietnamese C2C websites demonstrate the effectiveness of our proposed method.
Snapshot compressive imaging (SCI) refers to compressive imaging systems where multiple frames are mapped into a single measurement, with video compressive imaging and hyperspectral compressive imaging as two representative applications.
Though exciting results on high-speed videos and hyperspectral images have been demonstrated, poor reconstruction quality precludes SCI from wide application.
This paper aims to boost the reconstruction quality of SCI by exploiting the high-dimensional structure in the desired signal.
We build a joint model to integrate the nonlocal self-similarity of video/hyperspectral frames and the rank minimization approach with the SCI sensing process.
Following this, an alternating minimization algorithm is developed to solve this non-convex problem.
We further investigate the special structure of the sampling process in SCI to tackle the computational workload and memory issues in SCI reconstruction.
Both simulation and real data (captured by four different SCI cameras) results demonstrate that our proposed algorithm leads to significant improvements compared with current state-of-the-art algorithms.
We hope our results will encourage the researchers and engineers to pursue further in compressive imaging for real applications.
How do you learn to navigate an Unmanned Aerial Vehicle (UAV) and avoid obstacles?
One approach is to use a small dataset collected by human experts: however, high capacity learning algorithms tend to overfit when trained with little data.
An alternative is to use simulation.
But the gap between simulation and real world remains large especially for perception problems.
The reason most research avoids using large-scale real data is the fear of crashes!
In this paper, we propose to bite the bullet and collect a dataset of crashes itself!
We build a drone whose sole purpose is to crash into objects: it samples naive trajectories and crashes into random objects.
We crash our drone 11,500 times to create one of the biggest UAV crash datasets.
This dataset captures the different ways in which a UAV can crash.
We use all this negative flying data in conjunction with positive data sampled from the same trajectories to learn a simple yet powerful policy for UAV navigation.
We show that this simple self-supervised model is quite effective in navigating the UAV even in extremely cluttered environments with dynamic obstacles including humans.
For supplementary video see: https://youtu.be/u151hJaGKUo
Learning the language of protein sequences, which captures non-local interactions between amino acids that are close in the spatial structure, is a long-standing bioinformatics challenge that requires at least context-free grammars.
However, the complex character of protein interactions impedes unsupervised learning of context-free grammars.
Using structural information to constrain the syntactic trees proved effective in learning probabilistic natural and RNA languages.
In this work, we establish a framework for learning probabilistic context-free grammars for protein sequences from syntactic trees partially constrained using amino acid contacts obtained from wet experiments or computational predictions, whose reliability has substantially increased recently.
Within the framework, we implement the maximum-likelihood and contrastive estimators of parameters for simple yet practical grammars.
Tested on samples of protein motifs, grammars developed within the framework showed improved precision in recognition and higher fidelity to protein structures.
The framework is applicable to other biomolecular languages and beyond wherever knowledge of non-local dependencies is available.
In this work we propose, implement, and evaluate novel models called Third-Order Hidden Markov Models (HMM3s) to improve the poor performance of text-independent speaker identification in shouted talking environments.
The proposed models have been tested on our collected speech database using Mel-Frequency Cepstral Coefficients (MFCCs).
Our results demonstrate that HMM3s significantly improve speaker identification performance in such talking environments by 11.3% and 166.7% compared to second-order hidden Markov models (HMM2s) and first-order hidden Markov models (HMM1s), respectively.
The achieved results based on the proposed models are close to those obtained in subjective assessment by human listeners.
In mobile crowdsensing, finding the best match between tasks and users is crucial to ensure both the quality and effectiveness of a crowdsensing system.
Existing works usually assume a centralized task assignment by the crowdsensing platform, without addressing the need of fine-grained personalized task matching.
In this paper, we argue that it is essential to match tasks to users based on a careful characterization of both the users' preference and reliability.
To that end, we propose a personalized task recommender system for mobile crowdsensing, which recommends tasks to users based on a recommendation score that jointly takes each user's preference and reliability into consideration.
We first present a hybrid preference metric to characterize users' preference by exploiting their implicit feedback.
Then, to profile users' reliability levels, we formalize the problem as a semi-supervised learning model, and propose an efficient block coordinate descent algorithm to solve the problem.
For some tasks that lack users' historical information, we further propose a matrix factorization method to infer the users' reliability levels on those tasks.
We conduct extensive experiments to evaluate the performance of our system, and the evaluation results demonstrate that our system can achieve superior performance to the benchmarks in both user profiling and personalized task recommendation.
This study proposes a logic architecture for high-speed and power-efficient training of a gradient boosting decision tree (GBDT) model for binary classification.
We implemented the proposed logic architecture on an FPGA and compared training time and power efficiency with three general GBDT software libraries running on CPU and GPU.
The training speed of the logic architecture on the FPGA was 26-259 times faster than the software libraries.
The power efficiency of the logic architecture was 90-1,104 times higher than the software libraries.
The results show that the logic architecture is well suited for high-performance and edge computing.
This paper proposes a general framework for structure-preserving model reduction of a second-order network system based on graph clustering.
In this approach, vertex dynamics are captured by the transfer functions from inputs to individual states, and the dissimilarities of vertices are quantified by the H2-norms of the transfer function discrepancies.
A greedy hierarchical clustering algorithm is proposed to place vertices with similar dynamics into the same clusters.
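The greedy step can be sketched as repeatedly merging the pair of clusters with the smallest dissimilarity (single-linkage merging is shown here for concreteness; in the paper the dissimilarity is the H2-norm of the transfer-function discrepancy):

```python
def greedy_cluster(dissim, n_clusters):
    """Greedy hierarchical clustering on a dissimilarity matrix: repeatedly
    merge the two clusters whose closest members have the smallest
    dissimilarity, until n_clusters remain (illustrative sketch)."""
    clusters = [[i] for i in range(len(dissim))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dissim[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]   # merge b into a
        del clusters[b]
    return clusters
```

The resulting clustering defines the characteristic matrix used for the Petrov-Galerkin projection.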
Then, the reduced-order model is generated by the Petrov-Galerkin method, where the projection is formed by the characteristic matrix of the resulting network clustering.
It is shown that the simplified system preserves an interconnection structure, i.e., it can be again interpreted as a second-order system evolving over a reduced graph.
Furthermore, this paper generalizes the definition of network controllability Gramian to second-order network systems.
Based on it, we develop an efficient method to compute H2-norms and derive the approximation error between the full-order and reduced-order models.
Finally, the approach is illustrated by the example of a small-world network.
Solving tasks in Reinforcement Learning is no easy feat.
As the goal of the agent is to maximize the accumulated reward, it often learns to exploit loopholes and misspecifications in the reward signal resulting in unwanted behavior.
While constraints may solve this issue, there is no closed form solution for general constraints.
In this work we present a novel multi-timescale approach for constrained policy optimization, called `Reward Constrained Policy Optimization' (RCPO), which uses an alternative penalty signal to guide the policy towards a constraint satisfying one.
We prove the convergence of our approach and provide empirical evidence of its ability to train constraint satisfying policies.
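The core idea can be sketched as a two-timescale update: the policy sees a penalized reward r - λc, while the Lagrange multiplier λ is adjusted on a slower timescale toward satisfying the constraint (a simplified scalar sketch of the RCPO idea, not the paper's full actor-critic algorithm):

```python
def rcpo_signals(reward, cost, lam, lam_lr, threshold):
    """One conceptual RCPO-style step: return the penalized reward that
    guides the policy, and the multiplier updated on its slower timescale
    (projected to stay nonnegative)."""
    penalized_reward = reward - lam * cost
    new_lam = max(0.0, lam + lam_lr * (cost - threshold))
    return penalized_reward, new_lam
```

When the expected cost stays below the threshold, λ decays toward zero and the agent optimizes the unpenalized reward.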
It has been well demonstrated that adversarial examples, i.e., natural images with visually imperceptible perturbations added, generally exist for deep networks to fail on image classification.
In this paper, we extend adversarial examples to semantic segmentation and object detection which are much more difficult.
Our observation is that both segmentation and detection are based on classifying multiple targets on an image (e.g., the basic target is a pixel or a receptive field in segmentation, and an object proposal in detection), which inspires us to optimize a loss function over a set of pixels/proposals for generating adversarial perturbations.
Based on this idea, we propose a novel algorithm named Dense Adversary Generation (DAG), which generates a large family of adversarial examples, and applies to a wide range of state-of-the-art deep networks for segmentation and detection.
We also find that the adversarial perturbations can be transferred across networks with different training data, based on different architectures, and even for different recognition tasks.
In particular, the transferability across networks with the same architecture is more significant than in other cases.
Besides, summing up heterogeneous perturbations often leads to better transfer performance, which provides an effective method of black-box adversarial attack.
Gait is an important biometric trait for surveillance and forensic applications, which can be used to identify individuals at a large distance through CCTV cameras.
However, it is very difficult to develop robust automated gait recognition systems, since gait may be affected by many covariate factors such as clothing, walking surface, walking speed, camera view angle, etc.
Among them, large view angle was deemed the most challenging factor, since it may alter the overall gait appearance substantially.
Recently, some deep learning approaches (such as CNNs) have been employed to extract view-invariant features, and achieved encouraging results on small datasets.
However, they do not scale well to large datasets, and their performance decreases significantly with the number of subjects, which is impractical for large-scale surveillance applications.
To address this issue, in this work we propose a Discriminant Gait Generative Adversarial Network (DiGGAN) framework, which not only can learn view-invariant gait features for cross-view gait recognition tasks, but also can be used to reconstruct the gait templates in all views --- serving as important evidences for forensic applications.
We evaluated our DiGGAN framework on the world's largest multi-view OU-MVLP dataset (which includes more than 10000 subjects), and our method outperforms state-of-the-art algorithms significantly on various cross-view gait identification scenarios (e.g., cooperative/uncooperative mode).
Our DiGGAN framework also has the best results on the popular CASIA-B dataset, and it shows great generalisation capability across different datasets.
Neural networks are capable of learning rich, nonlinear feature representations shown to be beneficial in many predictive tasks.
In this work, we use these models to explore the use of geographical features in predicting colorectal cancer survival curves for patients in the state of Iowa, spanning the years 1989 to 2012.
Specifically, we compare model performance using a newly defined metric -- area between the curves (ABC) -- to assess (a) whether survival curves can be reasonably predicted for colorectal cancer patients in the state of Iowa, (b) whether geographical features improve predictive performance, and (c) whether a simple binary representation or richer, spectral clustering-based representation perform better.
Our findings suggest that survival curves can be reasonably estimated on average, with predictive performance deviating at the five-year survival mark.
We also find that geographical features improve predictive performance, and that the best performance is obtained using richer, spectral analysis-elicited features.
With the advent of the 5th generation of wireless standards and an increasing demand for higher throughput, methods to improve the spectral efficiency of wireless systems have become very important.
In the context of cognitive radio, a substantial increase in throughput is possible if the secondary user can make smart decisions regarding which channel to sense and when or how often to sense.
Here, we propose an algorithm to not only select a channel for data transmission but also to predict how long the channel will remain unoccupied so that the time spent on channel sensing can be minimized.
Our algorithm learns in two stages: a reinforcement learning approach for channel selection, and a Bayesian approach to determine the optimal duration for which sensing can be skipped.
Comparisons with other learning methods are provided through extensive simulations.
We show that the number of sensing operations is minimized with a negligible increase in primary interference; this implies that the secondary user spends less energy on sensing and also achieves higher throughput by saving on sensing time.
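The two stages can be sketched generically: an epsilon-greedy rule as a stand-in for the channel-selection stage, and a conjugate-prior posterior mean for the idle-duration estimate that decides how long sensing may be skipped (both are simplified stand-ins, not the paper's exact learners):

```python
import random

def select_channel(q_values, epsilon=0.1):
    """Epsilon-greedy channel selection: mostly pick the channel with the
    highest learned value, occasionally explore a random one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda c: q_values[c])

def expected_idle_time(idle_durations, prior_mean=1.0, prior_weight=1.0):
    """Bayesian posterior-mean estimate of how long the chosen channel stays
    idle, used to decide how long sensing can be skipped."""
    n = len(idle_durations)
    return (prior_weight * prior_mean + sum(idle_durations)) / (prior_weight + n)
```

The secondary user then senses only after (a safety fraction of) the estimated idle time has elapsed, rather than sensing continuously.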
Due to its simplicity and versatility, k-means remains popular since it was proposed three decades ago.
The performance of k-means has been enhanced from different perspectives over the years.
Unfortunately, a good trade-off between quality and efficiency is hardly reached.
In this paper, a novel k-means variant is presented.
Different from most of k-means variants, the clustering procedure is driven by an explicit objective function, which is feasible for the whole l2-space.
The classic chicken-and-egg loop in k-means is simplified to a pure stochastic optimization procedure.
The procedure of k-means becomes simpler and converges to a considerably better local optimum.
The effectiveness of this new variant has been studied extensively in different contexts, such as document clustering, nearest neighbor search and image clustering.
Superior performance is observed across different scenarios.
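A plain mini-batch (stochastic) k-means illustrates what replacing the assign/update loop with per-sample stochastic updates looks like; the paper's variant additionally optimizes an explicit objective over the whole l2-space, which this sketch does not capture:

```python
import random

def minibatch_kmeans(points, k, iters=100, seed=0):
    """Stochastic k-means: each step samples one point, assigns it to its
    nearest center, and moves that center with a decaying per-center
    learning rate (generic sketch, not the paper's variant)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [1] * k
    for _ in range(iters):
        p = rng.choice(points)
        j = min(range(k), key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(centers[c], p)))
        counts[j] += 1
        eta = 1.0 / counts[j]                      # decaying learning rate
        centers[j] = [c + eta * (a - c) for c, a in zip(centers[j], p)]
    return centers
```

With eta = 1/count, each center is exactly the running average of the samples assigned to it.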
Electric Vehicles (EVs) play a significant role in distribution energy management systems, since the power consumption level of EVs is much higher than that of other regular home appliances.
The randomness of EV driver behavior makes optimal charging or discharging scheduling even more difficult due to uncertain charging session parameters.
To minimize the impact of behavioral uncertainties, it is critical to develop effective methods to predict EV load for smart EV energy management.
Using the EV smart charging infrastructures on UCLA campus and city of Santa Monica as testbeds, we have collected real-world datasets of EV charging behaviors, based on which we proposed an EV user modeling technique which combines statistical analysis and machine learning approaches.
Specifically, an unsupervised clustering algorithm and a multilayer perceptron are applied to historical charging records to make day-ahead EV parking and load predictions.
Experimental results with cross-validation show that our model can achieve good performance for charging control scheduling and online EV load forecasting.
The same-origin policy is a fundamental part of the Web.
Despite the restrictions imposed by the policy, embedding of third-party JavaScript code is allowed and commonly used.
Nothing is guaranteed about the integrity of such code.
To tackle this deficiency, solutions such as the subresource integrity standard have been recently introduced.
Given this background, this paper presents the first empirical study on the temporal integrity of cross-origin JavaScript code.
According to the empirical results based on a ten day polling period of over 35 thousand scripts collected from popular websites, (i) temporal integrity changes are relatively common; (ii) the adoption of the subresource integrity standard is still in its infancy; and (iii) it is possible to statistically predict whether a temporal integrity change is likely to occur.
With these results and the accompanying discussion, the paper contributes to the ongoing attempts to better understand security and privacy in the current Web.
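For context, a subresource integrity value is the base64-encoded cryptographic digest of the script, declared in the script tag's integrity attribute so that the browser can reject modified code; the sha384 form below follows the standard format (the helper function itself is our illustration):

```python
import base64
import hashlib

def sri_hash(script_bytes):
    """Compute a subresource integrity value of the standard form
    "sha384-<base64 digest>" for the given script content."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")
```

Any byte-level change to the fetched script changes the digest, which is exactly the kind of temporal integrity change the study measures.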
We present a general-purpose tagger based on convolutional neural networks (CNN), used for both composing word vectors and encoding context information.
The CNN tagger is robust across different tagging tasks: without task-specific tuning of hyper-parameters, it achieves state-of-the-art results in part-of-speech tagging, morphological tagging and supertagging.
The CNN tagger is also robust against the out-of-vocabulary problem: it performs well on artificially unnormalized texts.
A recommendation system is a type of information filtering system that recommends, from a vast variety and quantity of items, the objects that are of interest to the user.
This guides an individual, in a personalized way, to interesting or useful objects in a large space of possible options.
Such systems also help many businesses achieve greater profits and sustain themselves in their field against their rivals.
But given the amount of information a business holds, it becomes difficult to identify the items of user interest.
Therefore, personalization or user profiling is one of the challenging tasks that gives access to user-relevant information, which can be used to solve the difficult tasks of classifying and ranking items according to an individual's interest.
Profiling can be done in various ways, such as supervised or unsupervised, individual or group profiling, and distributive or non-distributive profiling.
Our focus in this paper is on the dataset we use; we identify, using the Weka tool, some interesting facts that can be used for recommending items from the dataset.
Our aim is to present a novel technique to achieve user profiling in recommendation systems.
Impedance control is a well-established technique to control interaction forces in robotics.
However, real implementations of impedance control with an inner loop may suffer from several limitations.
Although common practice in designing nested control systems is to maximize the bandwidth of the inner loop to improve tracking performance, it may not be the most suitable approach when a certain range of impedance parameters has to be rendered.
In particular, it turns out that the viable range of stable stiffness and damping values can be strongly affected by the bandwidth of the inner control loops (e.g. a torque loop) as well as by the filtering and sampling frequency.
This paper provides an extensive analysis on how these aspects influence the stability region of impedance parameters as well as the passivity of the system.
This will be supported by both simulations and experimental data.
Moreover, a methodology for designing joint impedance controllers based on an inner torque loop and a positive velocity feedback loop will be presented.
The goal of the velocity feedback is to increase (given the constraints to preserve stability) the bandwidth of the torque loop without the need of a complex controller.
This paper proposes an Agile Aggregating Multi-Level feaTure framework (Agile Amulet) for salient object detection.
The Agile Amulet builds on previous works to predict saliency maps using multi-level convolutional features.
Compared to previous works, the Agile Amulet employs several key innovations to improve training and testing speed while also increasing prediction accuracy.
More specifically, we first introduce a contextual attention module that can rapidly highlight most salient objects or regions with contextual pyramids.
Thus, it effectively guides the learning of low-layer convolutional features and tells the backbone network where to look.
The contextual attention module is a fully convolutional mechanism that simultaneously learns complementary features and predicts saliency scores at each pixel.
In addition, we propose a novel method to aggregate multi-level deep convolutional features.
As a result, we are able to use the integrated side-output features of pre-trained convolutional networks alone, which significantly reduces the model parameters leading to a model size of 67 MB, about half of Amulet.
Compared to other deep learning based saliency methods, the Agile Amulet is much more lightweight, runs faster (30 fps in real time) and achieves higher performance on seven public benchmarks in terms of both quantitative and qualitative evaluation.
Networking technology is moving from wired to wireless connections.
A basic problem in the wireless domain is random packet loss on end-to-end connections.
In this paper we show the performance and the impact of packet loss and delay, in terms of bit error rate, throughput, etc., with respect to a real-world scenario: a vehicular ad hoc network in three-dimensional space (VANET in 3D).
Over the years, software development has responded to the increasing growth of wireless connectivity by developing network-enabled software.
We consider a real-world physical problem in the three-dimensional wireless domain and map it to an analytical problem.
We then simulate this analytical problem with respect to the real-world scenario using an enhanced antenna positioning system (EAPS) mounted on a mobile node in 3D space.
In this way, we convert the real-world problem into a lab-oriented one using the EAPS and report the resulting performance in the three-dimensional wireless domain.
Excluding irrelevant features in a pattern recognition task plays an important role in maintaining a simpler machine learning model and optimizing the computational efficiency.
Nowadays with the rise of large scale datasets, feature selection is in great demand as it becomes a central issue when facing high-dimensional datasets.
The present study provides a new measure of saliency for features by employing a Sensitivity Analysis (SA) technique called the extended Fourier amplitude sensitivity test, and a well-trained Feedforward Neural Network (FNN) model, which ultimately leads to the selection of a promising optimal feature subset.
The ideas of the paper are mainly demonstrated by adopting the FNN model for feature selection in classification problems.
Finally, a generalization framework is discussed to give insights into its usage in regression problems and to show how other function approximation models can be deployed.
Effectiveness of the proposed method is verified by result analysis and data visualization for a series of experiments over several well-known datasets drawn from UCI machine learning repository.
The hyperlink prediction task, that of proposing new links between webpages, can be used to improve search engines, expand the visibility of web pages, and increase the connectivity and navigability of the web.
Hyperlink prediction is typically performed on webgraphs composed of thousands or millions of vertices, where on average each webpage contains fewer than fifty links.
Algorithms processing graphs so large and sparse must be both scalable and precise, a challenging combination.
Similarity-based algorithms are among the most scalable solutions within the link prediction field, due to their parallel nature and computational simplicity.
These algorithms independently explore the nearby topological features of every missing link from the graph in order to determine its likelihood.
Unfortunately, the precision of similarity-based algorithms is limited, which has prevented their broad application so far.
In this work we explore the performance of similarity-based algorithms for the particular problem of hyperlink prediction on large webgraphs, and propose a novel method which assumes the existence of hierarchical properties.
We evaluate this new approach on several webgraphs and compare its performance with that of the current best similarity-based algorithms.
Its remarkable performance leads us to argue for the applicability of the proposal, identifying several use cases of hyperlink prediction.
We also describe the approach we took for the computation of large-scale graphs from the perspective of high-performance computing, providing details on the implementation and parallelization of the code.
Easy access to vast amounts of data, especially over long periods of time, allows a social network to be divided into timeframes, creating a temporal social network.
Such a network enables analysis of its dynamics.
One aspect of these dynamics is the analysis of social community evolution, i.e., how a particular group changes over time.
To do so, the complete group evolution history is needed.
That is why, in this paper, a new method for group evolution extraction, called GED, is presented.
Current remote sensing image classification problems have to deal with an unprecedented amount of heterogeneous and complex data sources.
Upcoming missions will soon provide large data streams that will make land cover/use classification difficult.
Machine learning classifiers can help with this, and many methods are currently available.
A popular kernel classifier is the Gaussian process classifier (GPC), since it approaches the classification problem with a solid probabilistic treatment, thus yielding confidence intervals for the predictions as well as very competitive results to state-of-the-art neural networks and support vector machines.
However, its computational cost is prohibitive for large scale applications, and constitutes the main obstacle precluding wide adoption.
This paper tackles this problem by introducing two novel efficient methodologies for Gaussian Process (GP) classification.
We first include the standard random Fourier features approximation into GPC, which largely decreases its computational cost and permits large scale remote sensing image classification.
In addition, we propose a model which avoids randomly sampling a number of Fourier frequencies, and alternatively learns the optimal ones within a variational Bayes approach.
The performance of the proposed methods is illustrated in complex problems of cloud detection from multispectral imagery and infrared sounding data.
Excellent empirical results support the proposal in both computational cost and accuracy.
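The random Fourier features approximation referred to above replaces the RBF kernel with an inner product of explicit low-dimensional features, z(x)·z(x') ≈ exp(-||x - x'||² / (2l²)); a self-contained sketch (pure Python, standalone of the papers' GP machinery):

```python
import math
import random

def random_fourier_features(x, n_features=100, lengthscale=1.0, seed=0):
    """Random Fourier feature map for the RBF kernel: sample frequencies
    from the kernel's spectral density N(0, 1/l^2) and phases uniformly,
    then return sqrt(2/D) * cos(w.x + b) for each of the D features.
    Using the same seed for all inputs shares the frequencies."""
    rng = random.Random(seed)
    d = len(x)
    feats = []
    for _ in range(n_features):
        w = [rng.gauss(0.0, 1.0 / lengthscale) for _ in range(d)]
        b = rng.uniform(0.0, 2.0 * math.pi)
        feats.append(math.sqrt(2.0 / n_features)
                     * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b))
    return feats
```

Plugging these finite features into a linear model is what makes the GP classifier scale to large remote sensing datasets; the paper's variational method additionally learns the frequencies instead of sampling them.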
We combine conditional variational autoencoders (VAE) with adversarial censoring in order to learn invariant representations that are disentangled from nuisance/sensitive variations.
In this method, an adversarial network attempts to recover the nuisance variable from the representation, which the VAE is trained to prevent.
Conditioning the decoder on the nuisance variable enables clean separation of the representation, since they are recombined for model learning and data reconstruction.
We show this natural approach is theoretically well-founded with information-theoretic arguments.
Experiments demonstrate that this method achieves invariance while preserving model learning performance, and results in visually improved performance for style transfer and generative sampling tasks.
Many people take photos and videos with smartphones and more recently with 360-degree cameras at popular places and events, and share them in social media.
Such visual content is produced in large volumes in urban areas, and it is a source of information that online users could exploit to learn what has got the interest of the general public on the streets of the cities where they live or plan to visit.
A key step to providing users with that information is to identify the most popular k spots in specified areas.
In this paper, we propose a clustering and incremental sampling (C&IS) approach that trades off accuracy of top-k results for detection speed.
It uses clustering to determine areas with high density of visual content, and incremental sampling, controlled by stopping criteria, to limit the amount of computational work.
It leverages spatial metadata, which represent the scenes in the visual content, to rapidly detect the hotspots, and uses a recently proposed Gaussian probability model to describe the capture intention distribution in the query area.
We evaluate the approach with metadata, derived from a non-synthetic, user-generated dataset, for regular mobile and 360-degree visual content.
Our results show that the C&IS approach offers 2.8x-19x reductions in processing time over an optimized baseline, while in most cases correctly identifying 4 out of 5 top locations.
Cloud gaming enables playing high end games, originally designed for PC or game console setups, on low end devices, such as net-books and smartphones, by offloading graphics rendering to GPU powered cloud servers.
However, transmitting the high end graphics requires a large amount of available network bandwidth, even though it is a compressed video stream.
Foveated video encoding (FVE) reduces the bandwidth requirement by taking advantage of the non-uniform acuity of human visual system and by knowing where the user is looking.
We have designed and implemented a system for cloud gaming with foveated graphics using a consumer grade real-time eye tracker and an open source cloud gaming platform.
In this article, we describe the system and its evaluation through measurements with representative games from different genres to understand the effect of parameterization of the FVE scheme on bandwidth requirements and to understand its feasibility from the latency perspective.
We also present results from a user study.
The results suggest that it is possible to find a "sweet spot" for the encoding parameters so that the users hardly notice the presence of foveated encoding but at the same time the scheme yields most of the bandwidth savings achievable.
In this paper we introduce Smooth Particle Networks (SPNets), a framework for integrating fluid dynamics with deep networks.
SPNets adds two new layers to the neural network toolbox: ConvSP and ConvSDF, which enable computing physical interactions with unordered particle sets.
We use these layers in combination with standard neural network layers to directly implement fluid dynamics inside a deep network, where the parameters of the network are the fluid parameters themselves (e.g., viscosity, cohesion, etc.).
Because SPNets are implemented as a neural network, the resulting fluid dynamics are fully differentiable.
We then show how this can be successfully used to learn fluid parameters from data, perform liquid control tasks, and learn policies to manipulate liquids.
This paper develops a novel methodology for using symbolic knowledge in deep learning.
From first principles, we derive a semantic loss function that bridges between neural output vectors and logical constraints.
This loss function captures how close the neural network is to satisfying the constraints on its output.
An experimental evaluation shows that it effectively guides the learner to achieve (near-)state-of-the-art results on semi-supervised multi-class classification.
Moreover, it significantly increases the ability of the neural network to predict structured objects, such as rankings and paths.
These discrete concepts are tremendously difficult to learn, and benefit from a tight integration of deep learning and symbolic reasoning methods.
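For intuition, the semantic loss for a simple exactly-one constraint sums the probability mass the network places on constraint-satisfying assignments; a minimal sketch (the function name is illustrative):

```python
import math

def semantic_loss_exactly_one(probs):
    """Semantic loss for the constraint 'exactly one output is true'.

    Sums, over every assignment satisfying the constraint, the probability
    the network assigns to that assignment, and returns its negative log.
    """
    total = 0.0
    for i in range(len(probs)):
        term = probs[i]
        for j, p in enumerate(probs):
            if j != i:
                term *= (1.0 - p)
        total += term
    return -math.log(total)

# A near-one-hot prediction incurs low loss; an ambiguous one is penalized.
print(semantic_loss_exactly_one([0.9, 0.05, 0.05]))  # small
print(semantic_loss_exactly_one([0.5, 0.5, 0.5]))    # larger
```

Because the loss is a differentiable function of the output probabilities, it can simply be added to the usual training objective.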
In this paper we present the ADAPT system built for the Basque to English Low Resource MT Evaluation Campaign.
Basque is a low-resourced, morphologically-rich language.
This poses a challenge for Neural Machine Translation models which usually achieve better performance when trained with large sets of data.
Accordingly, we used synthetic data to improve the translation quality produced by a model built using only authentic data.
Our proposal uses back-translated data to: (a) create new sentences, so the system can be trained with more data; and (b) translate sentences that are close to the test set, so the model can be fine-tuned to the document to be translated.
The universal scalability law (USL) is an analytic model used to quantify application scaling.
It is universal because it subsumes Amdahl's law and Gustafson linearized scaling as special cases.
Using simulation, we show: (i) that the USL is equivalent to synchronous queueing in a load-dependent machine repairman model and (ii) how USL, Amdahl's law, and Gustafson scaling can be regarded as boundaries defining three scalability zones.
Typical throughput measurements lie across all three zones.
Simulation scenarios provide deeper insight into queueing effects and thus provide a clearer indication of which application features should be tuned to get into the optimal performance zone.
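The USL itself is a closed-form expression; a minimal sketch using the customary coefficient names (sigma for contention, kappa for coherency delay):

```python
def usl_capacity(N, sigma, kappa, gamma=1.0):
    """Universal Scalability Law: relative capacity at concurrency N.

    sigma: contention (serialization) coefficient; with kappa = 0 the
    formula reduces to Amdahl's law. kappa: coherency (crosstalk)
    coefficient; kappa > 0 makes throughput peak and then fall off
    (retrograde scaling). gamma: throughput at N = 1."""
    return gamma * N / (1.0 + sigma * (N - 1) + kappa * N * (N - 1))

# kappa = 0 gives an Amdahl-limited plateau; kappa > 0 a peak then decline.
print(usl_capacity(32, sigma=0.02, kappa=0.0))
print(usl_capacity(32, sigma=0.02, kappa=0.001))
```

Fitting sigma and kappa to measured throughput locates the application within the scalability zones discussed above.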
A social network grows over a period of time with the formation of new connections and relations.
In recent years we have witnessed massive growth of online social networks like Facebook and Twitter.
Predicting the evolution of these networks has therefore become a problem of great importance.
A good model for evolution of a social network can help in understanding the properties responsible for the changes occurring in a network structure.
In this paper we propose such a model for evolution of social networks.
We model the social network as an undirected graph where nodes represent people and edges represent the friendship between them.
We define the evolution process as a set of rules that closely resembles how a social network grows in real life.
We simulate the evolution process and show, how starting from an initial network, a network evolves using this model.
We also discuss how our model can be used to model complex social networks other than online ones, such as political networks and various organizations.
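The abstract does not spell out the model's actual rules; as a generic illustration of rule-based evolution on an undirected friendship graph (all names and probabilities here are hypothetical), one step might combine triadic closure with occasional random ties:

```python
import random

def evolve_step(adj, p_closure=0.5, p_random=0.05, rng=random.Random(0)):
    """One hypothetical growth step: close open triangles (friend-of-a-friend
    ties) and occasionally add a random long-range tie.

    adj: dict mapping node -> set of neighbours (undirected friendships).
    """
    nodes = list(adj)
    for u in nodes:
        # Triadic closure: befriend a friend-of-a-friend with probability p_closure.
        candidates = {w for v in adj[u] for w in adj[v]} - adj[u] - {u}
        for w in sorted(candidates):
            if rng.random() < p_closure:
                adj[u].add(w)
                adj[w].add(u)
        # Random long-range tie with probability p_random.
        v = rng.choice(nodes)
        if v != u and rng.random() < p_random:
            adj[u].add(v)
            adj[v].add(u)
    return adj

adj = {1: {2}, 2: {1, 3}, 3: {2}}
evolve_step(adj)
print(sum(len(s) for s in adj.values()) // 2)  # edge count after one step
```

Iterating such a step from an initial network yields a simulated evolution trajectory of the kind the paper studies.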
We consider distributed optimization over orthogonal collision channels in spatial random access networks.
Users are spatially distributed and each user is in the interference range of a few other users.
Each user is allowed to transmit over a subset of the shared channels with a certain attempt probability.
We study both the non-cooperative and cooperative settings.
In the former, the goal of each user is to maximize its own rate irrespective of the utilities of other users.
In the latter, the goal is to achieve proportionally fair rates among users.
Simple distributed learning algorithms are developed to solve these problems.
The efficiencies of the proposed algorithms are demonstrated via both theoretical analysis and simulation results.
In the era of big data and cloud computing, large amounts of data are generated from user applications and need to be processed in the datacenter.
Data-parallel computing frameworks, such as Apache Spark, are widely used to perform such data processing at scale.
Specifically, Spark leverages distributed memory to cache the intermediate results, represented as Resilient Distributed Datasets (RDDs).
This gives Spark an advantage over other parallel frameworks for implementations of iterative machine learning and data mining algorithms, by avoiding repeated computation or hard disk accesses to retrieve RDDs.
By default, caching decisions are left at the programmer's discretion, and the LRU policy is used for evicting RDDs when the cache is full.
However, when the objective is to minimize total work, LRU is woefully inadequate, leading to arbitrarily suboptimal caching decisions.
In this paper, we design an algorithm for multi-stage big data processing platforms to adaptively determine and cache the most valuable intermediate datasets that can be reused in the future.
Our solution automates the decision of which RDDs to cache: this amounts to identifying nodes in a direct acyclic graph (DAG) representing computations whose outputs should persist in the memory.
Our experimental results show that our proposed cache optimization solution can improve the performance of machine learning applications on Spark, decreasing the total work to recompute RDDs by 12%.
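The flavor of such caching decisions can be illustrated with a simple greedy heuristic (not the paper's algorithm; the cost model and names are assumptions): cache the RDDs whose recomputation savings per unit of memory are largest.

```python
def choose_rdds_to_cache(rdds, capacity):
    """Greedy illustration of the caching decision.

    rdds: dict name -> (compute_cost, reuse_count, size).
    Saving from caching = compute_cost * (reuse_count - 1), since every
    reuse beyond the first avoids one recomputation of the lineage.
    """
    scored = []
    for name, (cost, reuses, size) in rdds.items():
        saving = cost * (reuses - 1)
        if saving > 0:
            scored.append((saving / size, saving, size, name))
    cached, used = [], 0
    for _, saving, size, name in sorted(scored, reverse=True):
        if used + size <= capacity:
            cached.append(name)
            used += size
    return cached

rdds = {"parsed": (10.0, 3, 4), "features": (50.0, 2, 8), "raw": (1.0, 1, 2)}
print(choose_rdds_to_cache(rdds, capacity=10))  # ['features']
```

In practice the reuse counts come from the computation DAG: an RDD's out-degree tells how many downstream stages would otherwise recompute it.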
Automatic sleep staging has been often treated as a simple classification problem that aims at determining the label of individual target polysomnography (PSG) epochs one at a time.
In this work, we tackle the task as a sequence-to-sequence classification problem that receives a sequence of multiple epochs as input and classifies all of their labels at once.
For this purpose, we propose a hierarchical recurrent neural network named SeqSleepNet.
At the epoch processing level, the network consists of a filterbank layer tailored to learn frequency-domain filters for preprocessing and an attention-based recurrent layer designed for short-term sequential modelling.
At the sequence processing level, a recurrent layer is placed on top of the learned epoch-wise features for long-term modelling of sequential epochs.
The classification is then carried out on the output vectors at every time step of the top recurrent layer to produce the sequence of output labels.
Despite being hierarchical, we present a strategy to train the network in an end-to-end fashion.
We show that the proposed network outperforms state-of-the-art approaches, achieving an overall accuracy, macro F1-score, and Cohen's kappa of 87.1%, 83.3%, and 0.815 on a publicly available dataset with 200 subjects.
There is a resurging interest in developing a neural-network-based solution to the supervised machine learning problem.
The convolutional neural network (CNN) will be studied in this note.
To begin with, we introduce a RECOS transform as a basic building block of CNNs.
The "RECOS" is an acronym for "REctified-COrrelations on a Sphere".
It consists of two main concepts: 1) data clustering on a sphere and 2) rectification.
Afterwards, we interpret a CNN as a network that implements the guided multi-layer RECOS transform with three highlights.
First, we compare the traditional single-layer and modern multi-layer signal analysis approaches, point out key ingredients that enable the multi-layer approach, and provide a full explanation to the operating principle of CNNs.
Second, we discuss how guidance is provided by labels through backpropagation (BP) in the training.
Third, we show that a trained network can be greatly simplified in the testing stage demanding only one-bit representation for both filter weights and inputs.
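A single RECOS unit, as described, amounts to correlating a unit-normalized input with unit-norm anchor vectors on the sphere and rectifying the result; a minimal sketch (the anchor values are illustrative):

```python
import numpy as np

def recos_transform(x, anchors):
    """One RECOS unit: correlate the unit-normalized input with unit-norm
    anchor vectors (data clustering on a sphere), then rectify negative
    correlations, mirroring convolution followed by ReLU."""
    x = x / np.linalg.norm(x)
    A = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return np.maximum(A @ x, 0.0)  # rectified correlations

x = np.array([1.0, 2.0, 2.0])
anchors = np.array([[1.0, 2.0, 2.0],     # aligned with x  -> correlation 1
                    [-1.0, -2.0, -2.0],  # anti-aligned    -> rectified to 0
                    [2.0, -1.0, 0.0]])   # orthogonal      -> 0
print(recos_transform(x, anchors))  # approximately [1, 0, 0]
```

Rectification discards anti-aligned anchors, which is the interpretation the note gives to ReLU activations in a trained CNN.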
The beer game is a widely used in-class game that is played in supply chain management classes to demonstrate the bullwhip effect.
The game is a decentralized, multi-agent, cooperative problem that can be modeled as a serial supply chain network in which agents cooperatively attempt to minimize the total cost of the network even though each agent can only observe its own local information.
Each agent chooses order quantities to replenish its stock.
Under some conditions, a base-stock replenishment policy is known to be optimal.
However, in a decentralized supply chain in which some agents (stages) may act irrationally (as they do in the beer game), there is no known optimal policy for an agent wishing to act optimally.
We propose a machine learning algorithm, based on deep Q-networks, to optimize the replenishment decisions at a given stage.
When playing alongside agents who follow a base-stock policy, our algorithm obtains near-optimal order quantities.
It performs much better than a base-stock policy when the other agents use a more realistic model of human ordering behavior.
Unlike most other algorithms in the literature, our algorithm does not have any limits on the beer game parameter values.
Like any deep learning algorithm, training the algorithm can be computationally intensive, but this can be performed ahead of time; the algorithm executes in real time when the game is played.
Moreover, we propose a transfer learning approach so that the training performed for one agent and one set of cost coefficients can be adapted quickly for other agents and costs.
Our algorithm can be extended to other decentralized multi-agent cooperative games with partially observed information, which is a common type of situation in real-world supply chain problems.
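The base-stock policy referenced above is a standard inventory rule: order whatever is needed to raise the current inventory position back to a fixed target. A minimal sketch:

```python
def base_stock_order(base_stock_level, on_hand, on_order, backorders):
    """Base-stock replenishment: order the amount needed to raise the
    inventory position (on hand + on order - backorders) back up to the
    base-stock level; never order a negative amount."""
    inventory_position = on_hand + on_order - backorders
    return max(0, base_stock_level - inventory_position)

# With a target of 20 units and an inventory position of 5 + 8 - 2 = 11, order 9.
print(base_stock_order(base_stock_level=20, on_hand=5, on_order=8, backorders=2))  # 9
```

The deep Q-network agent must learn to deviate from this rule when its neighbors order irrationally, which is exactly the setting the beer game creates.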
Lane detection is to detect lanes on the road and provide the accurate location and shape of each lane.
It serves as one of the key techniques enabling modern assisted and autonomous driving systems.
However, several unique properties of lanes challenge the detection methods.
The lack of distinctive features makes lane detection algorithms tend to be confused by other objects with similar local appearance.
Moreover, the inconsistent number of lanes on a road as well as diverse lane line patterns, e.g. solid, broken, single, double, merging, and splitting lines further hamper the performance.
In this paper, we propose a deep neural network based method, named LaneNet, to break down the lane detection into two stages: lane edge proposal and lane line localization.
Stage one uses a lane edge proposal network for pixel-wise lane edge classification, and the lane line localization network in stage two then detects lane lines based on lane edge proposals.
Note that LaneNet is designed to detect lane lines only, which makes it harder to suppress false detections on similar road markings such as arrows and characters.
Despite these difficulties, our method is shown to be robust in both highway and urban road scenarios, without relying on any assumptions about the number of lanes or the lane line patterns.
The high running speed and low computational cost endow our LaneNet the capability of being deployed on vehicle-based systems.
Experiments validate that our LaneNet consistently delivers outstanding performances on real world traffic scenarios.
Estimation of Distribution Algorithms (EDAs) require flexible probability models that can be efficiently learned and sampled.
Autoencoders (AE) are generative stochastic networks with these desired properties.
We integrate a special type of AE, the Denoising Autoencoder (DAE), into an EDA and evaluate the performance of DAE-EDA on several combinatorial optimization problems with a single objective.
We assess the number of fitness evaluations as well as the required CPU times.
We compare the performance to that of the Bayesian Optimization Algorithm (BOA) and of RBM-EDA, another EDA based on a generative neural network, which has proven competitive with BOA.
For the considered problem instances, DAE-EDA is considerably faster than BOA and RBM-EDA, sometimes by orders of magnitude.
The number of fitness evaluations is higher than for BOA, but competitive with RBM-EDA.
These results show that DAEs can be useful tools for problems with low but non-negligible fitness evaluation costs.
Identification of causal effects is one of the most fundamental tasks of causal inference.
We consider an identifiability problem where some experimental and observational data are available but neither data alone is sufficient for the identification of the causal effect of interest.
Instead of the outcome of interest, surrogate outcomes are measured in the experiments.
This problem is a generalization of identifiability using surrogate experiments and we label it as surrogate outcome identifiability.
We show that the concept of transportability provides a sufficient criterion for determining surrogate outcome identifiability for a large class of queries.
The notion of a spiral unfolding of a convex polyhedron, resulting by flattening a special type of Hamiltonian cut-path, is explored.
The Platonic and Archimedean solids all have nonoverlapping spiral unfoldings, although among generic polyhedra, overlap is more the rule than the exception.
The structure of spiral unfoldings is investigated, primarily by analyzing one particular class, the polyhedra of revolution.
This paper presents data analysis from a course on Software Engineering in an effort to identify metrics and techniques that would allow instructors to act proactively and identify patterns of low engagement and inefficient peer collaboration.
Over the last two terms, 106 students in their second year of studies formed 20 groups and worked collaboratively to develop video games.
Throughout the lab, students have to use a variety of tools for managing and developing their projects, such as software version control, static analysis tools, wikis, mailing lists, etc.
The students are also supported by weekly meetings with teaching assistants and instructors regarding group progress, code quality, and management issues.
Through these meetings and their interactions with the software tools, students leave a detailed trace of data related to their individual engagement and their collaboration behavior in their groups.
The paper discusses the different sources of data that can be monitored, and presents preliminary results on how these data can be used to analyze students' activity.
We propose a new method to estimate the 6-dof trajectory of a flying object such as a quadrotor UAV within a 3D airspace monitored using multiple fixed ground cameras.
It is based on a new structure from motion formulation for the 3D reconstruction of a single moving point with known motion dynamics.
Our main contribution is a new bundle adjustment procedure which in addition to optimizing the camera poses, regularizes the point trajectory using a prior based on motion dynamics (or specifically flight dynamics).
Furthermore, we can infer the underlying control input sent to the UAV's autopilot that determined its flight trajectory.
Our method requires neither perfect single-view tracking nor appearance matching across views.
For robustness, we allow the tracker to generate multiple detections per frame in each video.
The true detections and the data association across videos is estimated using robust multi-view triangulation and subsequently refined during our bundle adjustment procedure.
Quantitative evaluation on simulated data and experiments on real videos from indoor and outdoor scenes demonstrates the effectiveness of our method.
Existing scrubbing techniques for SEU mitigation on FPGAs do not guarantee error-free operation after SEU recovery if the affected configuration bits belong to feedback loops of the implemented circuits.
In this paper, we a) provide a netlist-based circuit analysis technique to distinguish so-called critical configuration bits from essential bits, in order to identify which configuration bits also require state-restoring actions after a recovered SEU and which do not.
Furthermore, b) an alternative classification approach using fault injection is developed in order to compare both classification techniques.
Moreover, c) we propose a floorplanning approach for reducing the effective number of scrubbed frames, and d) experimental results give evidence that our optimization methodology not only allows errors to be detected earlier but also minimizes the Mean-Time-To-Repair (MTTR) of a circuit considerably.
In particular, we show that by using our approach, the MTTR for datapath-intensive circuits can be reduced by up to 48.5% in comparison to standard approaches.
In this paper, we introduce the syndrome loss, an alternative loss function for neural error-correcting decoders based on a relaxation of the syndrome.
The syndrome loss penalizes the decoder for producing outputs that do not correspond to valid codewords.
We show that training with the syndrome loss yields decoders with consistently lower frame error rate for a number of short block codes, at little additional cost during training and no additional cost during inference.
The proposed method does not depend on knowledge of the transmitted codeword, making it a promising tool for online adaptation to changing channel conditions.
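One way to relax the syndrome into a differentiable penalty is to replace each parity check's hard XOR with a product of soft signs; a minimal sketch (an illustrative relaxation, the paper's exact formulation may differ):

```python
import numpy as np

def soft_syndrome_loss(p, H):
    """Relaxed syndrome penalty for soft bit estimates p in (0, 1).

    For hard bits x, the parity of a check is (1 - prod(1 - 2x)) / 2.
    Replacing x with soft probabilities gives a differentiable penalty
    that is near 0 when every check is confidently satisfied."""
    signs = 1.0 - 2.0 * p  # +1 for a confident 0, -1 for a confident 1
    check_products = np.array([np.prod(signs[row.astype(bool)]) for row in H])
    return float(np.mean((1.0 - check_products) / 2.0))

# A (7,4) Hamming-style parity-check matrix.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
print(soft_syndrome_loss(np.full(7, 0.005), H))  # confident all-zero codeword: near 0
print(soft_syndrome_loss(np.full(7, 0.5), H))    # maximally uncertain bits: 0.5
```

Because the penalty depends only on the parity-check matrix, no knowledge of the transmitted codeword is needed, which is what enables online adaptation.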
In this paper, we propose Emo2Vec which encodes emotional semantics into vectors.
We train Emo2Vec by multi-task learning six different emotion-related tasks, including emotion/sentiment analysis, sarcasm classification, stress detection, abusive language classification, insult detection, and personality recognition.
Our evaluation of Emo2Vec shows that it outperforms existing affect-related representations, such as Sentiment-Specific Word Embedding and DeepMoji embeddings with much smaller training corpora.
When concatenated with GloVe, Emo2Vec achieves competitive performances to state-of-the-art results on several tasks using a simple logistic regression classifier.
The Landau collision integral is an accurate model for the small-angle dominated Coulomb collisions in fusion plasmas.
We investigate a high order accurate, fully conservative, finite element discretization of the nonlinear multi-species Landau integral with adaptive mesh refinement using the PETSc library (www.mcs.anl.gov/petsc).
We develop algorithms and techniques to efficiently utilize emerging architectures with an approach that minimizes memory usage and movement and is suitable for vector processing.
The Landau collision integral is vectorized with Intel AVX-512 intrinsics and the solver sustains as much as 22% of the theoretical peak flop rate of the Second Generation Intel Xeon Phi, Knights Landing, processor.
In this work, a study on Variable Neighborhood Search algorithms for multi-depot dial-a-ride problems is presented.
In dial-a-ride problems patients need to be transported from pre-specified pickup locations to pre-specified delivery locations, under different considerations.
The addressed problem presents several constraints and features, such as heterogeneous vehicles, distributed in different depots, and heterogeneous patients.
The aim is to minimize the total routing cost while respecting time-window, ride-time, capacity, and route-duration constraints.
The objective of the study is to determine the best algorithm configuration in terms of initial solution, neighborhood, and local search procedures.
To this aim, two different procedures for computing an initial solution, six different types of neighborhoods, and five local search procedures, in which only intra-route changes are made, have been considered and compared.
We have also evaluated an "adjusting procedure" that aims to produce feasible solutions from infeasible solutions with small constraints violations.
The different VNS algorithms have been tested on instances from literature as well as on random instances arising from a real-world healthcare application.
This paper studies the problem of self-organizing heterogeneous LTE systems.
We propose a model that jointly considers several important characteristics of heterogeneous LTE system, including the usage of orthogonal frequency division multiple access (OFDMA), the frequency-selective fading for each link, the interference among different links, and the different transmission capabilities of different types of base stations.
We also consider the cost of energy by taking into account the power consumption, including that for wireless transmission and that for operation, of base stations and the price of energy.
Based on this model, we aim to propose a distributed protocol that improves the spectrum efficiency of the system, which is measured in terms of the weighted proportional fairness among the throughputs of clients, and reduces the cost of energy.
We identify that there are several important components involved in this problem.
We propose distributed strategies for each of these components.
Each of the proposed strategies requires small computational and communicational overheads.
Moreover, the interactions between components are also considered in the proposed strategies.
Hence, these strategies result in a solution that jointly considers all factors of heterogeneous LTE systems.
Simulation results also show that our proposed strategies achieve much better performance than existing ones.
This paper explores the spatial and temporal diffusion of political violence in North and West Africa.
It does so by endeavoring to represent the mental landscape that lives in the back of a group leader's mind as he contemplates strategic targeting.
We assume that this representation is a combination of the physical geography of the target environment, and the mental and physical cost of following a seemingly random pattern of attacks.
Focusing on the distance and time between attacks and taking into consideration the transaction costs that state boundaries impose, we wish to understand what constrains a group leader to attack at a location other than the one that would seem to yield the greatest overt payoff.
By its very nature, the research problem defies the collection of a full set of structural data.
Instead, we leverage functional data from the Armed Conflict Location and Event Data project (ACLED) dataset that, inter alia, meticulously catalogues violent extremist incidents in North and West Africa since 1997, to generate a network whose nodes are administrative regions.
These nodes are connected by edges of qualitatively different types: undirected edges representing geographic distance, undirected edges representing borders, and directed edges representing consecutive attacks by the same group at the two endpoints.
We analyze the resulting network using novel spectral embedding techniques that are able to account fully for the different types of edges.
The result is a map of North and West Africa that depicts the permeability to violence.
A better understanding of how location, time, and borders condition attacks enables planning, prepositioning, and response.
Gender inequality starts before birth.
Parents tend to prefer boys over girls, which is manifested in reproductive behavior, marital life, and parents' pastimes and investments in their children.
While social media and sharing information about children (so-called "sharenting") have become an integral part of parenthood, it is not well-known if and how gender preference shapes online behavior of users.
In this paper, we investigate public mentions of daughters and sons on social media.
We use data from a popular social networking site on public posts from 635,665 users.
We find that both men and women mention sons more often than daughters in their posts.
We also find that posts featuring sons get more "likes" on average.
Our results indicate that girls are underrepresented in parents' digital narratives about their children.
This gender imbalance may send a message that girls are less important than boys, or that they deserve less attention, thus reinforcing gender inequality.
This paper presents a novel design of a crawler robot which is capable of transforming its chassis from an Omni crawler mode to a large-sized wheel mode using a novel mechanism.
The transformation occurs without any additional actuators.
Interestingly, the robot can transform into a large-diameter, small-width wheel, which enhances its maneuverability with a small turning radius and fast, efficient locomotion.
This paper contributes to improving the locomotion modes of the previously developed hybrid compliant omnicrawler robot CObRaSO.
In addition to legged and tracked mechanism, CObRaSO can now display large wheel mode which contributes to its locomotion capabilities.
The mechanical design of the robot is explained in detail, and the transformation experiments and torque analysis are clearly presented.
This paper describes an approach to automatically extracting floor plans from the kinds of incomplete measurements that could be acquired by an autonomous mobile robot.
The approach proceeds by reasoning about extended structural layout surfaces which are automatically extracted from the available data.
The scheme can be run in an online manner to build watertight representations of the environment.
The system effectively speculates about room boundaries and free space regions which provides useful guidance to subsequent motion planning systems.
Experimental results are presented on multiple data sets.
Integration Adapters are a fundamental part of an integration system, since they provide (business) applications access to its messaging channel.
However, their modeling and configuration remain under-represented.
In previous work, the integration control and data flow syntax and semantics have been expressed in the Business Process Model and Notation (BPMN) as a semantic model for message-based integration, while adapter and the related quality of service modeling were left for further studies.
In this work we specify common adapter capabilities and derive general modeling patterns, for which we define a compliant representation in BPMN.
The patterns extend previous work by the adapter flow, evaluated syntactically and semantically for common adapter characteristics.
This paper presents a formal specification of the Ad hoc On-Demand Distance Vector (AODV) routing protocol using AWN (Algebra for Wireless Networks), a recent process algebra which has been tailored for the modelling of Mobile Ad Hoc Networks and Wireless Mesh Network protocols.
Our formalisation models the exact details of the core functionality of AODV, such as route discovery, route maintenance and error handling.
We demonstrate how AWN can be used to reason about critical protocol properties by providing detailed proofs of loop freedom and route correctness.
Applications of perceptual image quality assessment (IQA) in image and video processing, such as image acquisition, image compression, image restoration and multimedia communication, have led to the development of many IQA metrics.
In this paper, a reliable full-reference IQA model is proposed that utilizes gradient similarity (GS), chromaticity similarity (CS), and deviation pooling (DP).
By considering the shortcomings of the commonly used GS to model human visual system (HVS), a new GS is proposed through a fusion technique that is more likely to follow HVS.
We propose an efficient and effective formulation to calculate the joint similarity map of two chromatic channels for the purpose of measuring color changes.
In comparison with a commonly used formulation in the literature, the proposed CS map is shown to be more efficient and provide comparable or better quality predictions.
Motivated by a recent work that utilizes the standard deviation pooling, a general formulation of the DP is presented in this paper and used to compute a final score from the proposed GS and CS maps.
This proposed formulation of DP benefits from both Minkowski pooling and a proposed power pooling.
The experimental results on six datasets of natural images, a synthetic dataset, and a digitally retouched dataset show that the proposed index provides comparable or better quality predictions than the most recent and competing state-of-the-art IQA metrics in the literature, while remaining reliable and of low complexity.
The MATLAB source code of the proposed metric is available at https://www.mathworks.com/matlabcentral/fileexchange/59809.
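The MATLAB code linked above is authoritative; as a rough illustration of the deviation-pooling idea only (not the authors' exact formulation), a similarity map can be pooled into a single score like this, with `p` and `q` standing in for the Minkowski and power-pooling exponents:

```python
import math

def minkowski_mean(values, p):
    """Generalized (Minkowski) mean of absolute values with exponent p."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

def deviation_pooling(similarity_map, p=1.0, q=1.0):
    """Pool a similarity map into one quality score via its deviations.

    Classical standard-deviation pooling is recovered with p = 2, q = 1;
    p and q are illustrative knobs for the Minkowski and power-pooling
    generalizations mentioned in the abstract, not the paper's exact ones.
    """
    mean = sum(similarity_map) / len(similarity_map)
    deviations = [v - mean for v in similarity_map]
    return minkowski_mean(deviations, p) ** q
```

A perfectly uniform similarity map pools to zero deviation, i.e. no perceived distortion.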
We compare the visibility of Latin American and Caribbean (LAC) publications in the Core Collection indexes of the Web of Science (WoS)--Science Citation Index Expanded, Social Sciences Citation Index, and Arts & Humanities Citation Index--and the SciELO Citation Index (SciELO CI) which was integrated into the larger WoS platform in 2014.
The purpose of this comparison is to contribute to our understanding of the communication of scientific knowledge produced in Latin America and the Caribbean, and to provide some reflections on the potential benefits of the articulation of regional indexing exercises into WoS for a better understanding of geographic and disciplinary contributions.
How is the regional level of SciELO CI related to the global range of WoS?
In WoS, LAC authors are integrated at the global level in international networks, while SciELO has provided a platform for interactions among LAC researchers.
The articulation of SciELO into WoS may improve the international visibility of the regional journals, but at the cost of independent journal inclusion criteria.
We address the problem of localisation of objects as bounding boxes in images with weak labels.
This weakly supervised object localisation problem has been tackled in the past using discriminative models where each object class is localised independently from other classes.
We propose a novel framework based on Bayesian joint topic modelling.
Our framework has three distinctive advantages over previous works: (1) All object classes and image backgrounds are modelled jointly together in a single generative model so that "explaining away" inference can resolve ambiguity and lead to better learning and localisation.
(2) The Bayesian formulation of the model enables easy integration of prior knowledge about object appearance to compensate for limited supervision.
(3) Our model can be learned with a mixture of weakly labelled and unlabelled data, allowing the large volume of unlabelled images on the Internet to be exploited for learning.
Extensive experiments on the challenging VOC dataset demonstrate that our approach outperforms the state-of-the-art competitors.
False information can be created and spread easily through the web and social media platforms, resulting in widespread real-world impact.
Characterizing how false information proliferates on social platforms and why it succeeds in deceiving readers are critical to develop efficient detection algorithms and tools for early detection.
A recent surge of research in this area has aimed to address the key issues using methods based on feature engineering, graph mining, and information modeling.
The majority of the research has primarily focused on two broad categories of false information: opinion-based (e.g., fake reviews) and fact-based (e.g., false news and hoaxes).
In this work, we present a comprehensive survey spanning diverse aspects of false information, namely (i) the actors involved in spreading false information, (ii) the rationale behind successfully deceiving readers, (iii) quantifying the impact of false information, (iv) measuring its characteristics across different dimensions, and finally, (v) algorithms developed to detect false information.
In doing so, we create a unified framework to describe these recent methods and highlight a number of important directions for future research.
Despite the importance of predicting evacuation mobility dynamics after large scale disasters for effective first response and disaster relief, our general understanding of evacuation behavior remains limited because of the lack of empirical evidence on the evacuation movement of individuals across multiple disaster instances.
Here we investigate the GPS trajectories of a total of more than 1 million anonymized mobile phone users whose positions are tracked for a period of 2 months before and after four of the major earthquakes that occurred in Japan.
Through a cross comparative analysis between the four disaster instances, we find that in contrast with the assumed complexity of evacuation decision making mechanisms in crisis situations, the individuals' evacuation probability is strongly dependent on the seismic intensity that they experience.
In fact, we show that the evacuation probabilities in all earthquakes collapse into a similar pattern, with a critical threshold at around seismic intensity 5.5.
This indicates that despite the diversity in the earthquakes profiles and urban characteristics, evacuation behavior is similarly dependent on seismic intensity.
Moreover, we found that probability density functions of the distances that individuals evacuate are not dependent on seismic intensities that individuals experience.
These insights from empirical analysis of evacuation across multiple earthquake instances, based on large-scale mobility data, contribute to a deeper understanding of how people react to earthquakes. They can also help decision makers simulate and predict the number of evacuees in urban areas with little computational time and cost, using population density information and seismic intensity, which can be observed instantaneously after the shock.
In this paper, we construct MDS Euclidean self-dual codes that are extended cyclic duadic codes, and thereby obtain many new MDS Euclidean self-dual codes.
We also construct MDS Hermitian self-dual codes from generalized Reed-Solomon codes and constacyclic codes, and give some results on Hermitian self-dual codes that are extended cyclic duadic codes.
At the dawn of computer science and the eve of neuroscience, we are participating in a rebirth of neuroscience, driven by new technology that allows us to explore, deeply and precisely, the whole new world that dwells in our brains.
Increasingly, Software Engineering (SE) researchers use search-based optimization techniques to solve SE problems with multiple conflicting objectives.
These techniques often apply CPU-intensive evolutionary algorithms to explore generations of mutations to a population of candidate solutions.
An alternative approach, proposed in this paper, is to start with a very large population and sample down to just the better solutions.
We call this method "SWAY", short for "the sampling way".
SWAY is very simple to implement and, in studies with various software engineering models, this sampling approach was found to be competitive with corresponding state-of-the-art evolutionary algorithms while requiring far less computational cost.
Considering the simplicity and effectiveness of SWAY, we propose this approach as a baseline method for search-based software engineering models, especially for models that are very slow to execute.
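A hedged sketch of the sampling idea, not the authors' exact algorithm: generate a large random population once, then repeatedly split it, compare a representative from each half, and keep the better half.

```python
import random

def sway(population, score, enough=10):
    """Recursively sample a large population down to its better region.

    Minimal sketch of the 'sampling way': split candidates in two, keep
    the half whose representative scores better (lower is better here),
    and recurse. The split and comparison strategies are placeholders.
    """
    if len(population) <= enough:
        return population
    random.shuffle(population)
    mid = len(population) // 2
    west, east = population[:mid], population[mid:]
    # Compare one representative per half: only two model evaluations,
    # instead of evaluating every mutation of every generation.
    if score(west[0]) <= score(east[0]):
        return sway(west, score, enough)
    return sway(east, score, enough)
```

For example, `sway([random.uniform(-10, 10) for _ in range(1024)], lambda x: (x - 3) ** 2)` returns a small subset after only about twenty evaluations.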
Numerous propagation models describing social influence in social networks can be found in the literature.
This makes the choice of an appropriate model in a given situation difficult.
Selecting the most relevant model requires the ability to objectively compare them.
Such a comparison can only be made by describing the models in a common formalism that is nevertheless independent of any particular model.
We propose to use graph rewriting to formally describe propagation mechanisms as local transformation rules applied according to a strategy.
This approach makes sense when it is supported by a visual analytics framework dedicated to graph rewriting.
The paper first presents our methodology to describe some propagation models as a graph rewriting problem.
Then, we illustrate how our visual analytics framework allows models to be manipulated interactively, and we underline their differences based on measures computed on simulation traces.
The Bitcoin blockchain faces a scalability problem, since Bitcoin's blocks must contain all the transactions on the Bitcoin network.
The on-chain transaction processing capacity of the bitcoin network is limited by the average block creation time of 10 minutes and the block size limit.
These jointly constrain the network's throughput.
The maximum transaction processing capacity is estimated at between 3.3 and 7 transactions per second (TPS).
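As a sanity check on these figures (assuming a 1 MB block size limit and average transaction sizes between 250 and 500 bytes; these specific numbers are illustrative assumptions, not stated in the text):

```python
BLOCK_INTERVAL_S = 600        # average block creation time: 10 minutes
BLOCK_SIZE_BYTES = 1_000_000  # assumed 1 MB block size limit

# Assumed average transaction sizes in bytes (illustrative only).
for tx_bytes in (250, 500):
    tps = BLOCK_SIZE_BYTES / tx_bytes / BLOCK_INTERVAL_S
    print(f"{tx_bytes}-byte transactions -> {tps:.1f} TPS")
```

This yields roughly 3.3 to 6.7 TPS, consistent with the quoted range.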
A Layer-2 network, the Lightning Network (LN), has been proposed and activated to address this problem.
LN operates on top of the Bitcoin network as a cache, allowing payments to be effected without being immediately put on the blockchain.
However, it also brings some drawbacks.
In this paper, we observe a specific payment issue in the current LN that requires additional claims to the blockchain and is time-consuming.
We call this the shares issue.
We therefore propose Rapido to explicitly address the shares issue.
Furthermore, a new smart contract, D-HTLC, is equipped with Rapido as the payment protocol.
We finally provide a proof-of-concept implementation and simulation for both Rapido and LN, in which Rapido not only mitigates the shares issue but also the skewness issue, and is thus shown to be more applicable to various transactions than LN.
In this paper we establish a connection between non-convex optimization methods for training deep neural networks and nonlinear partial differential equations (PDEs).
Relaxation techniques arising in statistical physics which have already been used successfully in this context are reinterpreted as solutions of a viscous Hamilton-Jacobi PDE.
Using a stochastic control interpretation, we prove that the modified algorithm performs better in expectation than stochastic gradient descent.
Well-known PDE regularity results allow us to analyze the geometry of the relaxed energy landscape, confirming empirical evidence.
The PDE is derived from a stochastic homogenization problem, which arises in the implementation of the algorithm.
The algorithms scale well in practice and can effectively tackle the high dimensionality of modern neural networks.
Few ideas have enjoyed as large an impact on deep learning as convolution.
For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate.
In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space.
Although convolutional networks would seem appropriate for this task, we show that they fail spectacularly.
We demonstrate and carefully analyze the failure first on a toy problem, at which point a simple fix becomes obvious.
We call this solution CoordConv, which works by giving convolution access to its own input coordinates through the use of extra coordinate channels.
Without sacrificing the computational and parametric efficiency of ordinary convolution, CoordConv allows networks to learn either complete translation invariance or varying degrees of translation dependence, as required by the end task.
CoordConv solves the coordinate transform problem with perfect generalization, 150 times faster and with 10--100 times fewer parameters than convolution.
This stark contrast raises the question: to what extent has this inability of convolution persisted insidiously inside other tasks, subtly hampering performance from within?
A complete answer to this question will require further investigation, but we show preliminary evidence that swapping convolution for CoordConv can improve models on a diverse set of tasks.
Using CoordConv in a GAN produced less mode collapse as the transform between high-level spatial latents and pixels becomes easier to learn.
A Faster R-CNN detection model trained on MNIST showed 24% better IOU when using CoordConv, and in the RL domain agents playing Atari games benefit significantly from the use of CoordConv layers.
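The core CoordConv mechanism, giving an ordinary convolution access to where it is in the image, can be sketched by appending coordinate channels to the input (pure Python for illustration; a real implementation would concatenate tensors in a deep learning framework, and assumes height and width greater than one):

```python
def add_coord_channels(image):
    """Append normalized x- and y-coordinate channels to an image.

    `image` is a list of channels, each an H x W nested list. A
    convolution applied afterwards sees its own input coordinates,
    which is the essence of a CoordConv layer.
    """
    h = len(image[0])
    w = len(image[0][0])
    # Coordinates scaled to [-1, 1], following the paper's description.
    x_chan = [[2 * j / (w - 1) - 1 for j in range(w)] for _ in range(h)]
    y_chan = [[2 * i / (h - 1) - 1 for _ in range(w)] for i in range(h)]
    return image + [x_chan, y_chan]
```

If the network learns to zero out the coordinate channels' weights, it recovers plain translation-invariant convolution; otherwise it can use any degree of translation dependence.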
Management of data in the education sector, particularly for big universities with many employees, departments, and students, is a very challenging task.
There are also problems such as a lack of proper funds and manpower for managing such data in universities.
The education sector can easily and effectively take advantage of cloud computing capabilities for data management.
It can enhance the learning experience as a whole and can add entirely new dimensions to the way in which education is imbibed.
Several benefits of cloud computing, such as monetary savings, environmental benefits, and remote data access, can be exploited in the education sector for managing data such as a university database.
Therefore, in this paper we propose an effective framework for managing university data in a cloud-based environment.
We also propose a cloud data management simulator: a new simulation framework that demonstrates the applicability of the cloud in the current education sector.
The framework consists of a cloud developed for processing a university database of staff and students.
It has the following features (i) support for modeling cloud computing infrastructure, which includes data centers containing university database; (ii) a user friendly interface; (iii) flexibility to switch between the different types of users; and (iv) virtualized access to cloud data.
Organizing data into semantically meaningful groups is one of the most fundamental modes of understanding and learning.
Cluster analysis is the formal study of methods for such understanding and of algorithms for such learning.
The K-means clustering algorithm is one of the most fundamental and simple clustering algorithms.
When there is no prior knowledge about the distribution of the data sets, K-means is the first choice for clustering with an initial number of clusters.
In this paper, a novel distance metric called the Design Specification (DS) distance measure function is integrated with the K-means clustering algorithm to improve cluster accuracy.
The K-means algorithm with the proposed distance measure maximizes the cluster accuracy to 99.98% at P = 1.525, which is determined through an iterative procedure.
The performance of the DS distance measure function with the K-means algorithm is compared with the performance of other standard distance functions, such as the Euclidean, squared Euclidean, City Block, and Chebyshev similarity measures, deployed with the K-means algorithm. The proposed method is evaluated on an engineering materials database.
The experiments on cluster analysis and outlier profiling show that there is an excellent improvement in the performance of the proposed method.
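The DS measure itself is not specified in the abstract; as a sketch of how a custom distance plugs into Lloyd's K-means algorithm, the following uses a Minkowski-style distance with the reported exponent P = 1.525 as a stand-in (an assumption of this sketch, not the authors' definition):

```python
import random

def minkowski(a, b, p):
    """Minkowski distance between two equal-length point tuples."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def kmeans(points, k, distance, iters=50, seed=0):
    """Lloyd's K-means with a pluggable distance function."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for pt in points:
            idx = min(range(k), key=lambda c: distance(pt, centers[c]))
            clusters[idx].append(pt)
        # Update step: recompute centers; keep old center if cluster empties.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

# Stand-in for the DS measure: Minkowski with the reported P = 1.525.
def ds_distance(a, b):
    return minkowski(a, b, 1.525)
```

Swapping `ds_distance` for Euclidean, City Block, or Chebyshev variants reproduces the kind of comparison the abstract describes.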
SDN controllers must be periodically modified to add features, improve performance, and fix bugs, but current techniques for implementing dynamic updates are inadequate.
Simply halting old controllers and bringing up new ones can cause state to be lost, which often leads to incorrect behavior: e.g., if the state represents hosts blacklisted by a firewall, then traffic that should be blocked may be allowed to pass through.
Techniques based on record and replay can reconstruct state automatically, but they are expensive to deploy and can lead to incorrect behavior.
Problematic scenarios are especially likely to arise in distributed controllers and with semantics-altering updates.
This paper presents a new approach to implementing dynamic controller updates based on explicit state transfer.
Instead of attempting to infer state changes automatically, an approach that is expensive and fundamentally incomplete, our framework gives programmers effective tools for implementing correct updates that avoid major disruptions.
We develop primitives that enable programmers to directly (and easily, in most cases) initialize the new controller's state as a function of old state and we design protocols that ensure consistent behavior during the transition.
We also present a prototype implementation called Morpheus, and evaluate its effectiveness on representative case studies.
Convolutional auto-encoders have shown remarkable performance when stacked into deep convolutional neural networks for classifying image data over the past several years.
However, they are unable to construct the state-of-the-art convolutional neural networks due to their intrinsic architectures.
In this regard, we propose a flexible convolutional auto-encoder by eliminating the constraints on the numbers of convolutional layers and pooling layers from the traditional convolutional auto-encoder.
We also design an architecture discovery method by using particle swarm optimization, which is capable of automatically searching for the optimal architectures of the proposed flexible convolutional auto-encoder with much less computational resource and without any manual intervention.
We use the designed architecture optimization algorithm to test the proposed flexible convolutional auto-encoder through utilizing one graphic processing unit card on four extensively used image classification datasets.
Experimental results show that our work in this paper significantly outperforms the peer competitors, including the state-of-the-art algorithm.
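A minimal sketch of the underlying particle swarm optimization update rule, with a generic real-valued vector standing in for an encoded auto-encoder architecture (the paper's actual encoding and fitness function are not shown here):

```python
import random

def pso(fitness, dim, n_particles=20, iters=100, seed=1,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Plain PSO (minimization). Each position would encode an
    architecture in the paper's setting; here it is just a vector."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f
```

In architecture search, `fitness` would train the candidate auto-encoder briefly and return its validation error, which is where the GPU cost in the paper goes.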
This paper is about detecting functional objects and inferring human intentions in surveillance videos of public spaces.
People in the videos are expected to intentionally take shortest paths toward functional objects subject to obstacles, where people can satisfy certain needs (e.g., a vending machine can quench thirst), by following one of three possible intent behaviors: reach a single functional object and stop, or sequentially visit several functional objects, or initially start moving toward one goal but then change the intent to move toward another.
Since detecting functional objects in low-resolution surveillance videos is typically unreliable, we call them "dark matter" characterized by the functionality to attract people.
We formulate the Agent-based Lagrangian Mechanics wherein human trajectories are probabilistically modeled as motions of agents in many layers of "dark-energy" fields, where each agent can select a particular force field to affect its motions, and thus define the minimum-energy Dijkstra path toward the corresponding source "dark matter".
For evaluation, we compiled and annotated a new dataset.
The results demonstrate our effectiveness in predicting human intent behaviors and trajectories, and localizing functional objects, as well as discovering distinct functional classes of objects by clustering human motion behavior in the vicinity of functional objects.
High-Level Synthesis (HLS) is emerging as a mainstream design methodology, allowing software designers to enjoy the benefits of a hardware implementation.
Significant work has led to effective compilers that produce high-quality hardware designs from software specifications.
However, in order to fully benefit from the promise of HLS, a complete ecosystem that provides the ability to analyze, debug, and optimize designs is essential.
This ecosystem has to be accessible to software designers.
This is challenging, since software developers view their designs very differently than how they are physically implemented on-chip.
Rather than individual sequential lines of code, the implementation consists of gates operating in parallel across multiple clock cycles.
In this paper, we report on our efforts to create an ecosystem that allows software designers to debug HLS-generated circuits in a familiar manner.
We have implemented our ideas in a debug framework that will be included in the next release of the popular LegUp high-level synthesis tool.
Reinforcement learning has enjoyed multiple successes in recent years.
However, these successes typically require very large amounts of data before an agent achieves acceptable performance.
This paper introduces a novel way of combating such requirements by leveraging existing (human or agent) knowledge.
In particular, this paper uses demonstrations from agents and humans, allowing an untrained agent to quickly achieve high performance.
We empirically compare with, and highlight the weakness of, HAT and CHAT, methods of transferring knowledge from a source agent/human to a target agent.
This paper introduces an effective transfer approach, DRoP, combining the offline knowledge (demonstrations recorded before learning) with online confidence-based performance analysis.
DRoP dynamically involves the demonstrator's knowledge, integrating it into the reinforcement learning agent's online learning loop to achieve efficient and robust learning.
With the rise of machine learning, there is a great deal of interest in treating programs as data to be fed to learning algorithms.
However, programs do not start off in a form that is immediately amenable to most off-the-shelf learning techniques.
Instead, it is necessary to transform the program to a suitable representation before a learning technique can be applied.
In this paper, we use abstractions of traces obtained from symbolic execution of a program as a representation for learning word embeddings.
We trained a variety of word embeddings under hundreds of parameterizations, and evaluated each learned embedding on a suite of different tasks.
In our evaluation, we obtain 93% top-1 accuracy on a benchmark consisting of over 19,000 API-usage analogies extracted from the Linux kernel.
In addition, we show that embeddings learned from (mainly) semantic abstractions provide nearly triple the accuracy of those learned from (mainly) syntactic abstractions.
Owing to high device density, scalability and non-volatility, Magnetic Tunnel Junction-based crossbars have garnered significant interest for implementing the weights of an artificial neural network.
The existence of only two stable states in MTJs implies a high overhead of obtaining optimal binary weights in software.
We illustrate that the inherent parallelism in the crossbar structure makes it highly appropriate for in-situ training, wherein the network is taught directly on the hardware.
It leads to significantly smaller training overhead as the training time is independent of the size of the network, while also circumventing the effects of alternate current paths in the crossbar and accounting for manufacturing variations in the device.
We show how the stochastic switching characteristics of MTJs can be leveraged to perform probabilistic weight updates using the gradient descent algorithm.
We describe how the update operations can be performed on crossbars both with and without access transistors and perform simulations on them to demonstrate the effectiveness of our techniques.
The results reveal that stochastically trained MTJ-crossbar NNs achieve a classification accuracy nearly the same as that of real-valued-weight networks trained in software, and exhibit immunity to device variations.
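A rough sketch of the stochastic-update idea for binary weights: flip a weight toward the sign the gradient asks for, with probability proportional to the gradient magnitude. The mapping from MTJ programming pulses to switching probability is abstracted into a single `scale` parameter, which is an assumption of this sketch, not the paper's device model.

```python
import random

def stochastic_update(weights, grads, scale=1.0, rng=random):
    """Probabilistic gradient-descent update of binary (+1/-1) weights.

    Larger-magnitude gradients give a higher chance of switching, which
    is how an MTJ's stochastic switching can approximate real-valued
    weight updates on average.
    """
    new_w = []
    for w, g in zip(weights, grads):
        target = -1 if g > 0 else 1   # descend: move against the gradient
        p_switch = min(1.0, scale * abs(g))
        if w != target and rng.random() < p_switch:
            w = target
        new_w.append(w)
    return new_w
```

Averaged over many updates, the expected weight change tracks the (scaled) gradient, even though each individual weight only ever holds one of two values.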
Tumor segmentation from magnetic resonance imaging (MRI) data is an important but time consuming manual task performed by medical experts.
Automating this process is a challenging task because of the high diversity in the appearance of tumor tissues among different patients and in many cases similarity with the normal tissues.
MRI is an advanced medical imaging technique providing rich information about the human soft-tissue anatomy.
Various detection and segmentation methods exist to detect and segment a brain tumor from MRI images.
These detection and segmentation approaches are reviewed, with an emphasis on highlighting the advantages and drawbacks of these methods for brain tumor detection and segmentation.
The use of MRI image detection and segmentation in different procedures is also described.
Here, a brief review of different segmentation methods for detecting brain tumors from brain MRI is presented.
Learning structured representations has emerged as an important problem in many domains, including document and Web data mining, bioinformatics, and image analysis.
One approach to learning complex structures is to integrate many smaller, incomplete and noisy structure fragments.
In this work, we present an unsupervised probabilistic approach that extends affinity propagation to combine the small ontological fragments into a collection of integrated, consistent, and larger folksonomies.
This is a challenging task because the method must aggregate similar structures while avoiding structural inconsistencies and handling noise.
We validate the approach on a real-world social media dataset, comprised of shallow personal hierarchies specified by many individual users, collected from the photosharing website Flickr.
Our empirical results show that our proposed approach is able to construct deeper and denser structures, compared to an approach using only the standard affinity propagation algorithm.
Additionally, the approach yields better overall integration quality than a state-of-the-art approach based on incremental relational clustering.
We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events.
The framework leverages the power of convolutional recurrent neural network architectures; convolutional layers learn effective features over which higher recurrent layers perform sequential modelling.
Furthermore, the output layer is designed to handle arbitrary degrees of event overlap.
At each time step in the recurrent output sequence, an output triple is dedicated to each event category of interest to jointly model event occurrence and temporal boundaries.
That is, the network jointly determines whether an event of this category occurs, and when it occurs, by estimating onset and offset positions at each recurrent time step.
We then introduce three sequential losses for network training: multi-label classification loss, distance estimation loss, and confidence loss.
We demonstrate good generalization on two datasets: ITC-Irst for isolated audio event detection, and TUT-SED-Synthetic-2016 for overlapping audio event detection.
By the Gibbard--Satterthwaite theorem, every reasonable voting rule for three or more alternatives is susceptible to manipulation: there exist elections where one or more voters can change the election outcome in their favour by unilaterally modifying their vote.
When a given election admits several such voters, strategic voting becomes a game among potential manipulators: a manipulative vote that leads to a better outcome when other voters are truthful may lead to disastrous results when other voters choose to manipulate as well.
We consider this situation from the perspective of a boundedly rational voter, and use the cognitive hierarchy framework to identify good strategies.
We then investigate the associated algorithmic questions under the k-approval voting rule.
We obtain positive algorithmic results for k=1 and 2, and NP- and coNP-hardness results for k>3.
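For reference, the k-approval rule itself is straightforward to state in code (tie-breaking is a detail the abstract does not fix; lexicographic order is assumed here):

```python
from collections import Counter

def k_approval_winner(profile, k):
    """Winner under the k-approval voting rule.

    Each ballot is a ranking (best first); every voter awards one point
    to each of their top-k alternatives. Ties are broken by taking the
    lexicographically smallest alternative (an assumption here).
    """
    scores = Counter()
    for ballot in profile:
        for alt in ballot[:k]:
            scores[alt] += 1
    # Iterate in sorted order so max() keeps the lexicographically
    # smallest alternative among those with the top score.
    return max(sorted(scores), key=lambda a: scores[a])
```

With k = 1 this is plurality; a manipulator's question is whether reporting a different top-k set changes this winner in their favour.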
Grid and cloud computing systems have been extensively used to solve large and complex problems in science and engineering areas.
These systems include powerful computing resources connected through high-speed networks.
Due to recent advances in mobile computing and networking technologies, it has become feasible to integrate various mobile devices such as robots, aerial vehicles, sensors, and smartphones with grid and cloud computing systems.
This integration enables the design and development of a next generation of applications through resource sharing in mobile environments, but it also introduces several challenges due to the dynamic and unpredictable network conditions.
This paper discusses applications, research challenges involved in design and development of mobile grid and cloud computing systems, and recent advances in the field.
In today's typical industrial environments, the computation of the data distribution schedules is highly centralised.
Typically, a central entity configures the data forwarding paths so as to guarantee low delivery delays between data producers and consumers.
However, these requirements might become impossible to meet later on, due to link or node failures, or excessive degradation of their performance.
In this paper, we focus on maintaining the network functionality required by the applications after such events.
We avoid continuously recomputing the configuration centrally, by designing an energy efficient local and distributed path reconfiguration method.
Specifically, given the operational parameters required by the applications, we provide several algorithmic functions which locally reconfigure the data distribution paths, when a communication link or a network node fails.
We compare our method through simulations to other state of the art methods and we demonstrate performance gains in terms of energy consumption and data delivery success rate as well as some emerging key insights which can lead to further performance gains.
Many popular form factors of digital assistants---such as Amazon Echo, Apple HomePod or Google Home---enable the user to hold a conversation with the assistant based only on the speech modality.
The lack of a screen from which the user can read text or watch supporting images or video presents unique challenges.
In order to satisfy the information need of a user, we believe that the presentation of the answer needs to be optimized for such voice-only interactions.
In this paper we propose a task of evaluating usefulness of prosody modifications for the purpose of voice-only question answering.
We describe a crowd-sourcing setup where we evaluate the quality of these modifications along multiple dimensions corresponding to the informativeness, naturalness, and ability of the user to identify the key part of the answer.
In addition, we propose a set of simple prosodic modifications that highlight important parts of the answer using various acoustic cues.
Topological aspects, like community structure, and temporal activity patterns, like burstiness, have been shown to severely influence the speed of spreading in temporal networks.
We study the influence of the topology on the susceptible-infected (SI) spreading on time stamped communication networks, as obtained from a dataset of mobile phone records.
We consider city level networks with intra- and inter-city connections.
Networks using only intra-city links are usually sparse, and there the spreading depends mainly on the average degree.
The inter-city links serve as bridges in spreading, speeding up considerably the process.
We also demonstrate the effect in model simulations.
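A minimal sketch of SI spreading on a static network (the temporal, time-stamped structure and call weights of the real data are omitted here for brevity):

```python
import random

def si_spread(adj, seed_node, beta, steps, rng=random):
    """Susceptible-infected (SI) spreading on a static contact network.

    `adj` maps node -> list of neighbors. At each step, every infected
    node infects each susceptible neighbor independently with
    probability `beta`. Returns the infected count over time.
    """
    infected = {seed_node}
    history = [len(infected)]
    for _ in range(steps):
        newly = set()
        for u in infected:
            for v in adj[u]:
                if v not in infected and rng.random() < beta:
                    newly.add(v)
        infected |= newly
        history.append(len(infected))
    return history
```

On a sparse intra-city graph the infected count grows along local paths; adding inter-city bridge links to `adj` shortens those paths and speeds up the process, which is the effect the abstract describes.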
Autonomous unmanned aerial vehicles (UAVs) that can execute aggressive (i.e., high-speed and high-acceleration) maneuvers have attracted significant attention in the past few years.
In this paper, we propose a novel control law for accurate tracking of aggressive quadcopter trajectories.
The proposed method tracks position and yaw angle with their derivatives of up to fourth order, specifically, the position, velocity, acceleration, jerk, and snap along with the yaw angle, yaw rate and yaw acceleration.
Two key aspects of the proposed method are the following.
First, the controller exploits the differential flatness of the quadcopter dynamics to generate feedforward inputs for attitude rate and attitude acceleration in order to track the jerk and snap references.
The tracking is enabled by direct control of body torque using closed-loop control of all four propeller speeds based on optical encoders attached to the motors.
Second, the controller utilizes the incremental nonlinear dynamic inversion (INDI) method for accurate tracking of linear and angular accelerations despite external disturbances.
Hence, no prior modeling of aerodynamic effects is required.
We rigorously analyze the proposed controller through response analysis, and we demonstrate it in experiments.
The proposed control law enables a 1-kg quadcopter UAV to track complex 3D trajectories, reaching speeds up to 8.2 m/s and accelerations up to 2g, while keeping the root-mean-square tracking error down to 4 cm, in a flight volume that is roughly 6.5 m long, 6.5 m wide, and 1.5 m tall.
We also demonstrate the robustness of the controller by attaching a drag plate to the UAV in flight tests and by pulling on the UAV with a rope during hover.
Heterogeneous face recognition (HFR) refers to matching face images acquired from different sources (i.e., different sensors or different wavelengths) for identification.
HFR plays an important role in both biometrics research and industry.
Despite promising progress achieved in recent years, HFR is still a challenging problem due to the difficulty of representing two heterogeneous images in a homogeneous manner.
Existing HFR methods either represent an image ignoring the spatial information, or rely on a transformation procedure which complicates the recognition task.
Considering these problems, we propose a novel graphical representation based HFR method (G-HFR) in this paper.
Markov networks are employed to represent heterogeneous image patches separately, taking the spatial compatibility between neighboring image patches into consideration.
A coupled representation similarity metric (CRSM) is designed to measure the similarity between obtained graphical representations.
Extensive experiments conducted on multiple HFR scenarios (viewed sketch, forensic sketch, near infrared image, and thermal infrared image) show that the proposed method outperforms state-of-the-art methods.
Breast tissue segmentation into dense and fat tissue is important for determining the breast density in mammograms.
Knowing the breast density is important both in diagnostic and computer-aided detection applications.
There are many different ways to express the density of a breast and good quality segmentation should provide the possibility to perform accurate classification no matter which classification rule is being used.
Knowing the correct breast density, and tracking changes in it, can hint at a process beginning within a patient.
Mammograms generally suffer from a problem of different tissue overlapping which results in the possibility of inaccurate detection of tissue types.
Fibroglandular tissue presents rather high attenuation of X-rays and is visible as brighter in the resulting image but overlapping fibrous tissue and blood vessels could easily be replaced with fibroglandular tissue in automatic segmentation algorithms.
Small blood vessels and microcalcifications also appear as bright objects with intensities similar to dense tissue, but they have properties that make it possible to suppress them from the final results.
In this paper, we separate dense and fat tissue by suppressing the scattered structures that represent neither glandular nor dense tissue, in order to divide mammograms more accurately into the two major tissue types.
To suppress blood vessels and microcalcifications, we use Gabor filters of different sizes and orientations, combined with morphological operations on the contrast-enhanced filtered image.
Fixing a software error requires understanding its root cause.
In this paper, we introduce ''causality traces'', crafted execution traces augmented with the information needed to reconstruct the causal chain from the root cause of a bug to an execution error.
We propose an approach and a tool, called Casper, for dynamically constructing causality traces for null dereference errors.
The core idea of Casper is to inject special values, called ''ghosts'', into the execution stream to construct the causality trace at runtime.
We evaluate our contribution by providing and assessing the causality traces of 14 real null dereference bugs collected over six large, popular open-source projects.
Over this data set, Casper builds a causality trace in less than 5 seconds.
Continuous Integration (CI) implies that a whole developer team works together on the mainline of a software project.
CI systems automate software builds.
Sometimes a developer checks in code that breaks the build.
A broken build might not be a problem by itself, but it has the potential to disrupt co-workers, hence it affects the performance of the team.
In this study, we investigate the interplay between nonfunctional requirements (NFRs) and build statuses from 1,283 software projects.
We found significant differences among the build statuses related to different NFRs.
Thus, tools can be proposed to improve CI with a focus on new ways to prevent failures, especially for efficiency- and usability-related builds.
Also, the time required to put a broken build back on track follows a bimodal distribution across all NFRs, with higher peaks within a day and lower peaks around six weeks.
Our results suggest that a more carefully planned schedule for maintainability in Ruby, and for functionality and reliability in Java, would decrease delays related to broken builds.
Working adults spend nearly one third of their daily time at their jobs.
In this paper, we study job-related social media discourse from a community of users.
We use both crowdsourcing and local expertise to train a classifier to detect job-related messages on Twitter.
Additionally, we analyze the linguistic differences in a job-related corpus of tweets between individual users vs. commercial accounts.
The volumes of job-related tweets from individual users indicate that people use Twitter with distinct monthly, daily, and hourly patterns.
We further show that the moods associated with jobs, positive and negative, have unique diurnal rhythms.
One of the major restrictions in the brain-computer interface (BCI) field is the very limited number of training samples; it is difficult to build a reliable and usable system with such limited data.
Inspired by generative adversarial networks, we propose a conditional Deep Convolutional Generative Adversarial Network (cDCGAN) method to automatically generate additional artificial EEG signals for data augmentation, improving the performance of convolutional neural networks in BCI tasks and mitigating the small-training-set problem.
We evaluate the proposed cDCGAN method on BCI competition dataset of motor imagery.
The results show that the artificial EEG data generated from Gaussian noise captures the features of the raw EEG data and achieves classification accuracy no lower than that of the raw EEG data on the test set.
Moreover, augmenting limited training data with the generated artificial data effectively improves the classification accuracy of the same model.
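The core conditioning idea can be sketched as a single generator forward pass. The code below is a toy stand-in, not the paper's network: dense layers replace the transposed-convolution stack, and all sizes (noise dimension, hidden width, signal length, four motor-imagery classes) are hypothetical; the class label is one-hot encoded and concatenated to the noise vector, which is the cDCGAN conditioning mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

def conditional_generator(noise, labels, n_classes, w1, w2):
    """Minimal cDCGAN-style generator forward pass: the class label is
    one-hot encoded and concatenated to the noise, so each class gets
    its own conditional distribution of synthetic EEG windows."""
    onehot = np.eye(n_classes)[labels]            # (batch, n_classes)
    z = np.concatenate([noise, onehot], axis=1)   # condition the noise
    h = np.tanh(z @ w1)                           # hidden layer
    return np.tanh(h @ w2)                        # fake EEG window in [-1, 1]

n_classes, z_dim, hid, sig_len = 4, 16, 32, 64   # hypothetical sizes
w1 = rng.normal(0, 0.1, (z_dim + n_classes, hid))
w2 = rng.normal(0, 0.1, (hid, sig_len))
noise = rng.normal(size=(8, z_dim))
labels = np.array([0, 1, 2, 3, 0, 1, 2, 3])
fake_eeg = conditional_generator(noise, labels, n_classes, w1, w2)
print(fake_eeg.shape)  # (8, 64): eight synthetic single-channel windows
```

In the full method these weights would be trained adversarially against a discriminator; here they are random, so only the conditioning plumbing is shown.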
We present a simple yet effective approach for linking entities in queries.
The key idea is to search sentences similar to a query from Wikipedia articles and directly use the human-annotated entities in the similar sentences as candidate entities for the query.
Then, we employ a rich set of features, such as link-probability, context-matching, word embeddings, and relatedness among candidate entities as well as their related entities, to rank the candidates under a regression based framework.
The advantages of our approach lie in two aspects, which contribute to the ranking process and final linking result.
First, it can greatly reduce the number of candidate entities by filtering out irrelevant entities with the words in the query.
Second, we can obtain the query sensitive prior probability in addition to the static link-probability derived from all Wikipedia articles.
We conduct experiments on two benchmark datasets on entity linking for queries, namely the ERD14 dataset and the GERDAQ dataset.
Experimental results show that our method outperforms state-of-the-art systems and yields 75.0% in F1 on the ERD14 dataset and 56.9% on the GERDAQ dataset.
While objects from different categories can be reliably decoded from fMRI brain response patterns, it has proved more difficult to distinguish visually similar inputs, such as different instances of the same category.
Here, we apply a recently developed deep learning system to the reconstruction of face images from human fMRI patterns.
We trained a variational auto-encoder (VAE) neural network using a GAN (Generative Adversarial Network) unsupervised training procedure over a large dataset of celebrity faces.
The auto-encoder latent space provides a meaningful, topologically organized 1024-dimensional description of each image.
We then presented several thousand face images to human subjects, and learned a simple linear mapping between the multi-voxel fMRI activation patterns and the 1024 latent dimensions.
Finally, we applied this mapping to novel test images, turning the obtained fMRI patterns into VAE latent codes, and ultimately the codes into face reconstructions.
Qualitative and quantitative evaluation of the reconstructions revealed robust pairwise decoding (>95% correct), and a strong improvement relative to a baseline model (PCA decomposition).
Furthermore, this brain-decoding model can readily be recycled to probe human face perception along many dimensions of interest; for example, the technique allowed for accurate gender classification, and even for decoding which face a subject imagined rather than saw.
We hypothesize that the latent space of modern deep learning generative models could serve as a valid approximation for human brain representations.
In this paper, the problem of secure transmission of sensitive contents over the public network Internet is addressed by proposing a novel data hiding method in encrypted images with dual-level security.
The secret information is divided into three blocks using a specific pattern, followed by an encryption mechanism based on the three-level encryption algorithm (TLEA).
The input image is scrambled using a secret key, and the encrypted sub-message blocks are then embedded in the scrambled image by cyclic18 least significant bit (LSB) substitution method, utilizing LSBs and intermediate LSB planes.
Furthermore, the cover image and its planes are rotated at different angles using a secret key prior to embedding, deceiving the attacker during data extraction.
The combined use of message-block division, the TLEA, image scrambling, and the cyclic18 LSB method yields an advanced security system that maintains the visual transparency of the resulting images while increasing the security of the embedded data.
In addition, employing various secret keys for image scrambling, data encryption, and data hiding using the cyclic18 LSB method makes the data recovery comparatively more challenging for attackers.
Experimental results not only validate the effectiveness of the proposed framework in terms of visual quality and security compared to other state-of-the-art methods, but also suggest its feasibility for secure transmission of diagnostically important keyframes to healthcare centers and gastroenterologists during wireless capsule endoscopy.
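To make the embedding principle concrete, the sketch below shows plain LSB substitution on a toy array. This is only the base mechanism: the paper's cyclic18 scheme additionally uses keyed plane rotation, intermediate bit planes, and TLEA encryption, all of which are omitted here:

```python
import numpy as np

def embed_lsb(cover, bits):
    """Embed a bit list into the least significant bits of a cover
    image (flattened, row-major). Toy stand-in for the paper's
    cyclic18 LSB scheme (keyed rotations and intermediate planes
    are not modeled)."""
    stego = cover.copy()
    flat = stego.ravel()
    for i, b in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | b   # clear the LSB, then set it to b
    return stego

def extract_lsb(stego, n_bits):
    return [int(p) & 1 for p in stego.ravel()[:n_bits]]

cover = np.arange(16, dtype=np.uint8).reshape(4, 4)
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, message)
print(extract_lsb(stego, len(message)))  # recovers the message
print(int(np.abs(stego.astype(int) - cover.astype(int)).max()))  # at most 1
```

Because only the lowest bit plane changes, per-pixel distortion is bounded by one intensity level, which is why LSB methods preserve visual transparency.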
Today, online privacy is the domain of regulatory measures and privacy-enhancing technologies.
Transparency in the form of external and public assessments has been proposed for improving privacy and security because it exposes otherwise hidden deficiencies.
Previous work has studied privacy attitudes and behavior of consumers.
However, little is known on how organizations react to measures that employ public "naming and shaming" as an incentive for improvement.
We performed the first study on this aspect by conducting a qualitative survey with 152 German health insurers.
We scanned their websites with PrivacyScore.org to generate a public ranking and confronted the insurers with the results.
We obtained a response rate of 27%.
Responses ranged from positive feedback to legal threats.
Only 12% of the sites - mostly non-responders - improved during our study.
Our results show that insurers struggle due to unawareness, reluctance, and incapability, and demonstrate the general difficulties of transparency-based approaches.
In digital painting software, layers organize paintings.
However, layers are not explicitly represented, transmitted, or published with the final digital painting.
We propose a technique to decompose a digital painting into layers.
In our decomposition, each layer represents a coat of paint of a single paint color applied with varying opacity throughout the image.
Our decomposition is based on the painting's RGB-space geometry.
In RGB-space, a geometric structure is revealed due to the linear nature of the standard Porter-Duff "over" pixel compositing operation.
The vertices of the convex hull of pixels in RGB-space suggest paint colors.
Users choose the degree of simplification to perform on the convex hull, as well as a layer order for the colors.
We solve a constrained optimization problem to find maximally translucent, spatially coherent opacity for each layer, such that the composition of the layers reproduces the original image.
We demonstrate the utility of the resulting decompositions for re-editing.
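The RGB-space geometry the decomposition relies on comes from the linearity of "over" compositing: each composited pixel is a convex combination of the layer colors, so pixels cluster inside the convex hull of those colors. The toy demo below (hypothetical colors and opacities, not the paper's optimization) composites two single-color coats onto a white canvas:

```python
import numpy as np

def over(dst, src_color, alpha):
    """Porter-Duff "over": composite a coat of a single paint color with
    per-pixel opacity `alpha` onto `dst`. Linearity in both arguments is
    what gives pixels their convex-hull structure in RGB space."""
    return alpha[..., None] * src_color + (1.0 - alpha[..., None]) * dst

# White canvas, then a red coat and a blue coat, each with spatially
# varying opacity -- the layer model used by the decomposition.
canvas = np.ones((2, 2, 3))
red, blue = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
a1 = np.array([[0.0, 0.5], [1.0, 0.25]])
a2 = np.array([[0.0, 0.0], [0.5, 1.0]])
image = over(over(canvas, red, a1), blue, a2)

# Every pixel lies in the convex hull of {white, red, blue} in RGB space,
# which is why hull vertices suggest the paint colors.
print(image[1, 1])  # fully blue: a2 == 1 there
```

The paper's method runs this reasoning in reverse: from the hull vertices it hypothesizes layer colors, then solves for the opacities that reproduce the image.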
This paper proposes a novel method for understanding daily hand-object manipulation by developing computer vision-based techniques.
Specifically, we focus on recognizing hand grasp types, object attributes, and manipulation actions within a unified framework by exploring their contextual relationships.
Our hypothesis is that it is necessary to jointly model hands, objects and actions in order to accurately recognize multiple tasks that are correlated to each other in hand-object manipulation.
In the proposed model, we explore various semantic relationships between actions, grasp types and object attributes, and show how the context can be used to boost the recognition of each component.
We also explore the spatial relationship between the hand and the object in order to detect the manipulated object near the hand in cluttered environments.
Experiment results on all three recognition tasks show that our proposed method outperforms traditional appearance-based methods which are not designed to take into account contextual relationships involved in hand-object manipulation.
The visualization and generalizability study of the learned context further supports our hypothesis.
The problem of finding conflict-free trajectories for multiple agents of identical circular shape, operating in a shared 2D workspace, is addressed in this paper, and a decoupled, i.e., prioritized, approach is used to solve it.
The agents' workspace is tessellated into a square grid on which any-angle moves are allowed, i.e., each agent can move in an arbitrary direction as long as the move follows a straight line segment whose endpoints are tied to distinct grid elements.
A novel any-angle planner based on Safe Interval Path Planning (SIPP) algorithm is proposed to find trajectories for an agent moving amidst dynamic obstacles (other agents) on a grid.
This algorithm is then used as part of a prioritized multi-agent planner AA-SIPP(m).
On the theoretical side, we show that AA-SIPP(m) is complete under well-defined conditions.
On the experimental side, in simulation tests with up to 200 agents involved, we show that our planner finds much better solutions in terms of cost (up to 20%) compared to the planners relying on cardinal moves only.
Graph databases in many applications---semantic web, transport or biological networks among others---are not only large, but also frequently modified.
Evaluating graph queries in this dynamic context is a challenging task, as those queries often combine first-order and navigational features.
Motivated by recent results on maintaining dynamic reachability, we study the dynamic evaluation of traditional query languages for graphs in the descriptive complexity framework.
Our focus is on maintaining regular path queries, and extensions thereof, by first-order formulas.
In particular, we are interested in path queries defined by non-regular languages and in extended conjunctive regular path queries (which allow comparing path labels based on word relations).
Further we study the closely related problems of maintaining distances in graphs and reachability in product graphs.
In this preliminary study we obtain upper bounds for those problems in restricted settings, such as undirected and acyclic graphs, or under insertions only, and negative results regarding quantifier-free update formulas.
In addition we point out interesting directions for further research.
This article studies the consensus of a saturated second-order multi-agent system with non-switching dynamics that can be represented by a directed graph.
The system is affected by data-processing (input) delays and communication time delays, which are assumed to be asynchronous.
The agents have saturation nonlinearities; each is approximated by separate linear and nonlinear elements.
Nonlinear elements are represented by describing functions.
Describing functions and stability of linear elements are used to estimate the existence of limit cycles in the system with multiple control laws.
Stability analysis of the linear element is performed using Lyapunov-Krasovskii functions and frequency domain analysis.
A comparison of the pros and cons of both analyses with respect to time-delay ranges, applicability, and computational complexity is presented.
Simulation and corresponding hardware implementation results are demonstrated to support theoretical results.
We consider the flow network model to solve the multiprocessor real-time task scheduling problems.
Using the flow network model or its generic form, linear programming (LP) formulation, for the problems is not new.
However, the previous works have limitations; for example, they are classified as offline scheduling techniques because they construct a flow network model or an LP problem over a very long time interval.
In this study, we propose how to construct the flow network model for online scheduling periodic real-time tasks on multiprocessors.
Our key idea is to construct the flow network only for the active instances of tasks at the current scheduling time, while guaranteeing the existence of an optimal schedule for the future instances of the tasks.
The optimal scheduling is here defined to ensure that all real-time tasks meet their deadlines when the total utilization demand of the given tasks does not exceed the total processing capacity.
We then propose the flow network model-based polynomial-time scheduling algorithms.
Advantageously, the flow network model allows the task workload to be collected unfairly within a certain time interval without losing the optimality.
This leads us to design three unfair-but-optimal scheduling algorithms on both continuous- and discrete-time models.
In particular, our unfair-but-optimal scheduling algorithm on a discrete-time model is, to the best of our knowledge, the first in this problem domain.
We experimentally demonstrate that it significantly alleviates scheduling overheads, reducing the number of preemptions while keeping the number of task migrations across processors comparable.
Balance between generation and load is required in the economic scheduling of generating units in the smart grid.
Variable energy generation, particularly from wind and solar resources, is growing rapidly, and it is anticipated that at a certain level of penetration it will become a noteworthy source of uncertainty.
As in the case of load demand, energy forecasting can also be used to mitigate some of the challenges that arise from the uncertainty in the resource.
While wind energy forecasting research is considered mature, solar energy forecasting is witnessing a steadily growing attention from the research community.
This paper presents a support vector regression model to produce solar power forecasts on a rolling basis for 24 hours ahead over an entire year, to mimic the practical business of energy forecasting.
Twelve weather variables are considered from a high-quality benchmark dataset and new variables are extracted.
The added value of the heat index and wind speed as additional variables to the model is studied across different seasons.
The support vector regression model performance is compared with artificial neural networks and multiple linear regression models for energy forecasting.
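The rolling 24-hours-ahead protocol can be sketched as follows. Everything below is a toy stand-in: ordinary least squares replaces the paper's support vector regression model, and a clipped daily sine with noise replaces the benchmark's twelve weather variables; only the refit-then-predict-next-day loop mirrors the evaluation protocol:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic hourly "solar power": clipped daily sine plus weather noise.
hours = np.arange(24 * 60)
power = np.clip(np.sin(2 * np.pi * hours / 24), 0, None)
power += 0.1 * rng.normal(size=hours.size)

def day_features(h):
    """Hour-of-day harmonics as regressors (a tiny stand-in feature set)."""
    return np.stack([np.sin(2 * np.pi * h / 24),
                     np.cos(2 * np.pi * h / 24),
                     np.ones(len(h))], axis=1)

# Rolling evaluation: refit on all history each "day", predict the next
# 24 hours, then roll forward (OLS standing in for the paper's SVR).
preds = []
for day in range(30, 60):
    X, y = day_features(hours[:day * 24]), power[:day * 24]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    preds.append(day_features(hours[day * 24:(day + 1) * 24]) @ w)
pred = np.concatenate(preds)

rmse = float(np.sqrt(np.mean((pred - power[30 * 24:]) ** 2)))
print(round(rmse, 2))
```

Swapping the `lstsq` fit for an SVR (and the harmonics for real weather variables such as the heat index and wind speed) recovers the paper's setup without changing the rolling loop.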
The Internet of Things (IoT) propagates the paradigm of interconnecting billions of heterogeneous devices by various manufacturers.
To enable IoT applications, the communication between IoT devices follows specifications defined by standard developing organizations.
In this paper, we present a case study that investigates disclosed insecurities of the popular IoT standard ZigBee, and derive general lessons about security economics in IoT standardization efforts.
We discuss the motivation of IoT standardization efforts that are primarily driven from an economic perspective, in which large investments in security are not considered necessary since the consumers do not reward them.
Success in the market is achieved by being quick to market, providing functional features, and offering easy integration for complementors.
Nevertheless, manufacturers should not only consider economic reasons but also see their responsibility to protect humans and technological infrastructures from being threatened by insecure IoT products.
In this context, we propose a number of recommendations to strengthen the security design in future IoT standardization efforts, ranging from the definition of a precise security model to the enforcement of an update policy.
The success of various applications, including robotics, digital content creation, and visualization, demands a structured and abstract representation of the 3D world from limited sensor data.
Inspired by the nature of human perception of 3D shapes as a collection of simple parts, we explore such an abstract shape representation based on primitives.
Given a single depth image of an object, we present 3D-PRNN, a generative recurrent neural network that synthesizes multiple plausible shapes composed of a set of primitives.
Our generative model encodes symmetry characteristics of common man-made objects, preserves long-range structural coherence, and describes objects of varying complexity with a compact representation.
We also propose a method based on Gaussian Fields to generate a large scale dataset of primitive-based shape representations to train our network.
We evaluate our approach on a wide range of examples and show that it outperforms nearest-neighbor based shape retrieval methods and is on-par with voxel-based generative models while using a significantly reduced parameter space.
Recently, discriminatively learned correlation filters (DCF) have drawn much attention in the visual object tracking community.
The success of DCF is potentially attributed to the fact that a large number of samples are utilized to train the ridge regression model and predict the location of the object.
To solve the regression problem in an efficient way, these samples are all generated by circularly shifting from a search patch.
However, these synthetic samples also induce some negative effects which weaken the robustness of DCF based trackers.
In this paper, we propose a Convolutional Regression framework for visual tracking (CRT).
Instead of learning the linear regression model in a closed form, we try to solve the regression problem by optimizing a one-channel-output convolution layer with Gradient Descent (GD).
In particular, the receptive field size of the convolution layer is set to the size of object.
In contrast to DCF, it is possible to incorporate all "real" samples clipped from the whole image.
A critical issue of the GD approach is that most of the convolutional samples are negative and the contribution of positive samples will be suppressed.
To address this problem, we propose a novel Automatic Hard Negative Mining method to eliminate easy negatives and enhance positives.
Extensive experiments are conducted on a widely-used benchmark with 100 sequences.
The results show that the proposed algorithm achieves outstanding performance and outperforms almost all the existing DCF based algorithms.
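The regression-by-gradient-descent idea can be illustrated on a much-simplified objective. Below, a flattened filter (standing in for the one-channel-output convolution layer) is trained so its response is 1 on a target patch and 0 on background patches; the data are random, and the paper's hard negative mining and deep features are omitted:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy convolutional-regression objective: filter response 1 on the
# target, 0 on negatives, trained by plain gradient descent on "real"
# samples rather than DCF's circularly shifted ones.
d = 25                              # a 5x5 filter, flattened
patch = rng.normal(size=d)          # target appearance
neg = rng.normal(size=(200, d))     # negative (background) patches

w = np.zeros(d)
lr = 0.01
for _ in range(500):
    grad = 2 * (w @ patch - 1.0) * patch          # pull target response to 1
    grad += 2 * (neg.T @ (neg @ w)) / len(neg)    # push negatives toward 0
    w -= lr * grad

print(round(float(w @ patch), 2))  # target response, close to 1
```

In the full tracker the same objective is optimized over a convolution layer whose receptive field matches the object size, so the filter slides over the search region to localize the target.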
In this paper, we propose a novel deep learning architecture for multi-label zero-shot learning (ML-ZSL), which is able to predict multiple unseen class labels for each input instance.
Inspired by the way humans utilize semantic knowledge between objects of interests, we propose a framework that incorporates knowledge graphs for describing the relationships between multiple labels.
Our model learns an information propagation mechanism from the semantic label space, which can be applied to model the interdependencies between seen and unseen class labels.
With such investigation of structured knowledge graphs for visual reasoning, we show that our model can be applied for solving multi-label classification and ML-ZSL tasks.
Compared to state-of-the-art approaches, comparable or improved performances can be achieved by our method.
Multilingual knowledge graph (KG) embeddings provide latent semantic representations of entities and structured knowledge with cross-lingual inferences, which benefit various knowledge-driven cross-lingual NLP tasks.
However, precisely learning such cross-lingual inferences is usually hindered by the low coverage of entity alignment in many KGs.
Since many multilingual KGs also provide literal descriptions of entities, in this paper, we introduce an embedding-based approach which leverages a weakly aligned multilingual KG for semi-supervised cross-lingual learning using entity descriptions.
Our approach performs co-training of two embedding models, i.e. a multilingual KG embedding model and a multilingual literal description embedding model.
The models are trained on a large Wikipedia-based trilingual dataset where most entity alignments are unknown during training.
Experimental results show that the performance of the proposed approach on the entity alignment task improves at each iteration of co-training, and eventually reaches a stage at which it significantly surpasses previous approaches.
We also show that our approach has promising abilities for zero-shot entity alignment, and cross-lingual KG completion.
Context: Visual aesthetics is increasingly seen as an essential factor in perceived usability, interaction, and overall appraisal of user interfaces especially with respect to mobile applications.
Yet, a question that remains is how to assess, and to what extent users agree on, visual aesthetics.
Objective: This paper analyzes the inter-rater agreement on visual aesthetics of user interfaces of Android apps as a basis for guidelines and evaluation models.
Method: We systematically collected ratings on the visual aesthetics of 100 user interfaces of Android apps from 10 participants and analyzed the frequency distribution, reliability and influencing design aspects.
Results: In general, user interfaces of Android apps are perceived as more ugly than beautiful.
Yet, raters only moderately agree on the visual aesthetics.
Disagreements seem to be related to subtle differences with respect to layout, shapes, colors, typography, and background images.
Conclusion: Visual aesthetics is a key factor for the success of apps.
However, the considerable disagreement of raters on the perceived visual aesthetics indicates the need for a better understanding of this software quality with respect to mobile apps.
Despite advances in deep learning, neural networks can only learn multiple tasks when trained on them jointly.
When tasks arrive sequentially, they lose performance on previously learnt tasks.
This phenomenon called catastrophic forgetting is a fundamental challenge to overcome before neural networks can learn continually from incoming data.
In this work, we derive inspiration from human memory to develop an architecture capable of learning continuously from sequentially incoming tasks, while averting catastrophic forgetting.
Specifically, our contributions are: (i) a dual memory architecture emulating the complementary learning systems (hippocampus and the neocortex) in the human brain, (ii) memory consolidation via generative replay of past experiences, (iii) demonstrating advantages of generative replay and dual memories via experiments, and (iv) improved performance retention on challenging tasks even for low capacity models.
Our architecture displays many characteristics of the mammalian memory and provides insights on the connection between sleep and learning.
Logic programs have been used as a representation of object-oriented source code in academic prototypes for about a decade.
This representation allows a clear and concise implementation of analyses of the object-oriented source code.
The full potential of this approach is far from being explored.
In this paper, we report on an application of the well-established theory of update propagation within logic programs.
Given the representation of the object-oriented code as facts in a logic program, a change to the code corresponds to an update of these facts.
We demonstrate how update propagation provides a generic way to generate incremental versions of such analyses.
The natural way to use Answer Set Programming (ASP) to represent knowledge in Artificial Intelligence or to solve a combinatorial problem is to elaborate a first order logic program with default negation.
In a preliminary step, this program with variables is translated into an equivalent propositional one by a first tool: the grounder.
Then, the propositional program is given to a second tool: the solver.
The solver computes (if they exist) one or more answer sets (stable models) of the program, each answer set encoding one solution of the initial problem.
To date, almost all ASP systems apply this two-step computation.
In this article, the project ASPeRiX is presented as a first order forward chaining approach for Answer Set Computing.
This project was among the first to introduce an approach to answer set computing that escapes the preliminary phase of rule instantiation by integrating it into the search process.
The methodology applies a forward chaining of first order rules that are grounded on the fly by means of previously produced atoms.
Theoretical foundations of the approach are presented, the main algorithms of the ASP solver ASPeRiX are detailed and some experiments and comparisons with existing systems are provided.
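The fire-rules-whose-bodies-hold loop at the heart of forward chaining can be sketched in a few lines. The toy below works on already-ground Horn rules and omits exactly what makes ASPeRiX interesting (default negation, answer-set branching, and on-the-fly grounding of first-order rules), so it only illustrates the basic fixpoint iteration:

```python
def forward_chain(facts, rules):
    """Naive forward chaining on ground Horn rules: repeatedly fire any
    rule whose body atoms are all derived, until a fixpoint is reached.
    Default negation and answer-set branching are omitted in this sketch."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return derived

facts = {"edge(a,b)", "edge(b,c)"}
rules = [
    ("path(a,b)", ["edge(a,b)"]),
    ("path(b,c)", ["edge(b,c)"]),
    ("path(a,c)", ["path(a,b)", "edge(b,c)"]),
]
print(sorted(forward_chain(facts, rules) - facts))
# derives path(a,b), path(a,c), path(b,c)
```

ASPeRiX replaces the pre-grounded rule list with first-order rules instantiated on the fly against the atoms produced so far, which is what lets it skip the separate grounder phase.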
Event management in sensor networks is a multidisciplinary field involving several steps across the processing chain.
In this paper, we discuss the major steps that should be performed in real- or near real-time event handling including event detection, correlation, prediction and filtering.
First, we discuss existing univariate and multivariate change detection schemes for the online event detection over sensor data.
Next, we propose an online event correlation scheme that intends to unveil the internal dynamics that govern the operation of a system and are responsible for the generation of various types of events.
We show that representation of event dependencies can be accommodated within a probabilistic temporal knowledge representation framework that allows the formulation of rules.
We also address the important issue of identifying outdated dependencies among events by setting up a time-dependent framework for filtering the extracted rules over time.
The proposed theory is applied on the maritime domain and is validated through extensive experimentation with real sensor streams originating from large-scale sensor networks deployed in ships.
Monte Carlo Tree Search (MCTS) methods have proven powerful in planning for sequential decision-making problems such as Go and video games, but their performance can be poor when the planning depth and sampling trajectories are limited or when the rewards are sparse.
We present an adaptation of PGRD (policy-gradient for reward-design) for learning a reward-bonus function to improve UCT (an MCTS algorithm).
Unlike previous applications of PGRD in which the space of reward-bonus functions was limited to linear functions of hand-coded state-action-features, we use PGRD with a multi-layer convolutional neural network to automatically learn features from raw perception as well as to adapt the non-linear reward-bonus function parameters.
We also adopt a variance-reducing gradient method to improve PGRD's performance.
The new method improves UCT's performance on multiple ATARI games compared to UCT without the reward bonus.
Combining PGRD and Deep Learning in this way should make adapting rewards for MCTS algorithms far more widely and practically applicable than before.
We present a temporal 6-DOF tracking method which leverages deep learning to achieve state-of-the-art performance on challenging datasets of real-world captures.
Our method is both more accurate and more robust to occlusions than the existing best performing approaches while maintaining real-time performance.
To assess its efficacy, we evaluate our approach on several challenging RGBD sequences of real objects in a variety of conditions.
Notably, we systematically evaluate robustness to occlusions through a series of sequences where the object to be tracked is increasingly occluded.
Finally, our approach is purely data-driven and does not require any hand-designed features: robust tracking is automatically learned from data.
Robust and lane-level positioning is essential for autonomous vehicles.
As an irreplaceable sensor, LiDAR can provide continuous and high-frequency pose estimation by means of mapping, on condition that enough environment features are available.
The error of mapping can accumulate over time.
Therefore, LiDAR is usually integrated with other sensors.
In diverse urban scenarios, the environment feature availability relies heavily on the traffic (moving and static objects) and the degree of urbanization.
Common LiDAR-based SLAM demonstrations tend to be studied in light traffic and less urbanized areas.
However, its performance can be severely challenged in deep urbanized cities, such as Hong Kong, Tokyo, and New York with dense traffic and tall buildings.
This paper proposes to analyze the performance of standalone NDT-based graph SLAM and its reliability estimation in diverse urban scenarios to further evaluate the relationship between the performance of LiDAR-based SLAM and scenario conditions.
The normal distribution transform (NDT) is employed to calculate the transformation between frames of point clouds.
Then, the LiDAR odometry is performed based on the calculated continuous transformation.
The state-of-the-art graph-based optimization is used to integrate the LiDAR odometry measurements to implement optimization.
The 3D building models are generated and the definition of the degree of urbanization based on Skyplot is proposed.
Experiments are implemented in different scenarios with different degrees of urbanization and traffic conditions.
The results show that the performance of the LiDAR-based SLAM using NDT is strongly related to the traffic condition and degree of urbanization.
Link prediction appears as a central problem of network science, as it calls for unfolding the mechanisms that govern the micro-dynamics of the network.
In this work, we are interested in ego-networks, that is, only the information about a node's interactions with its neighbors, in the context of social relationships.
As the structural information is very poor, we rely on another source of information to predict links among egos' neighbors: the timing of interactions.
We define several features to capture different kinds of temporal information and apply machine learning methods to combine these various features and improve the quality of the prediction.
We demonstrate the efficiency of this temporal approach on a cellphone interaction dataset, pointing out features which prove themselves to perform well in this context, in particular the temporal profile of interactions and elapsed time between contacts.
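To make this concrete, here is an illustrative sketch of two temporal features of the flavor the abstract highlights (elapsed time since last contact, and temporal co-activity of two of the ego's neighbors); the paper's exact feature definitions are not reproduced.

```python
# Hedged sketch: temporal features for predicting a link between two of an
# ego's neighbors, u and v, from the ego's interaction timestamps with each.

def temporal_features(call_times_u, call_times_v, now):
    # call_times_*: sorted timestamps of the ego's interactions with u and v
    recency_u = now - call_times_u[-1]          # elapsed time since last contact
    recency_v = now - call_times_v[-1]
    # co-activity: fraction of u-events that have a v-event within a window
    window = 10.0                               # illustrative window size
    close = sum(any(abs(t - s) <= window for s in call_times_v)
                for t in call_times_u)
    co_activity = close / len(call_times_u)
    return {"recency_u": recency_u, "recency_v": recency_v,
            "co_activity": co_activity}
```

Such features would then be fed to a standard classifier, as in the machine learning combination step described above.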
We present MBIS (Multivariate Bayesian Image Segmentation tool), a clustering tool based on the mixture of multivariate normal distributions model.
MBIS supports multi-channel bias field correction based on a B-spline model.
A second methodological novelty is the inclusion of graph-cuts optimization for the stationary anisotropic hidden Markov random field model.
Along with MBIS, we release an evaluation framework that contains three different experiments on multi-site data.
We first validate the accuracy of segmentation and the estimated bias field for each channel.
MBIS outperforms a widely used segmentation tool in a cross-comparison evaluation.
The second experiment demonstrates the robustness of results on atlas-free segmentation of two image sets from scan-rescan protocols on 21 healthy subjects.
Multivariate segmentation is more replicable than the monospectral counterpart on T1-weighted images.
Finally, we provide a third experiment to illustrate how MBIS can be used in a large-scale study of tissue volume change with increasing age in 584 healthy subjects.
This last result is meaningful as multivariate segmentation performs robustly without the need for prior knowledge.
Channel estimation is useful in millimeter wave (mmWave) MIMO communication systems.
Channel state information allows optimized designs of precoders and combiners under different metrics such as mutual information or signal-to-interference-noise (SINR) ratio.
At mmWave, MIMO precoders and combiners are usually hybrid, since this architecture provides a means to trade-off power consumption and achievable rate.
Channel estimation is challenging when using these architectures, however, since there is no direct access to the outputs of the different antenna elements in the array.
The MIMO channel can only be observed through the analog combining network, which acts as a compression stage of the received signal.
Most prior work on channel estimation for hybrid architectures assumes a frequency-flat mmWave channel model.
In this paper, we consider a frequency-selective mmWave channel and propose compressed-sensing-based strategies to estimate the channel in the frequency domain.
We evaluate different algorithms and compute their complexity to expose trade-offs in complexity-overhead-performance as compared to those of previous approaches.
Organisations store huge amounts of data from multiple heterogeneous sources in the form of Knowledge Graphs (KGs).
One of the ways to query these KGs is to use SPARQL queries over a database engine.
Since SPARQL follows exact match semantics, the queries may return too few or no results.
Recent works have proposed query relaxation where the query engine judiciously replaces a query predicate with similar predicates using weighted relaxation rules mined from the KG.
The space of possible relaxations is potentially too large to fully explore and users are typically interested in only top-k results, so such query engines use top-k algorithms for query processing.
However, they may still process all the relaxations, many of whose answers do not contribute towards top-k answers.
This leads to computation overheads and delayed response times.
We propose Spec-QP, a query planning framework that speculatively determines which relaxations will have their results in the top-k answers.
Only these relaxations are processed using the top-k operators.
We, therefore, reduce the computation overheads and achieve faster response times without adversely affecting the quality of results.
We tested Spec-QP over two datasets - XKG and Twitter, to demonstrate the efficiency of our planning framework at reducing runtimes with reasonable accuracy for query engines supporting relaxations.
AISHELL-1 is by far the largest open-source speech corpus available for Mandarin speech recognition research.
It was released with a baseline system containing solid training and testing pipelines for Mandarin ASR.
In AISHELL-2, 1000 hours of clean read-speech data recorded on iOS devices are published, free for academic usage.
On top of the AISHELL-2 corpus, an improved recipe is developed and released, containing key components for industrial applications, such as Chinese word segmentation, flexible vocabulary expansion, and phone set transformation.
The pipelines support various state-of-the-art techniques, such as time-delay neural networks and the Lattice-Free MMI objective function.
In addition, we also release dev and test data from other channels (Android and Mic).
For research community, we hope that AISHELL-2 corpus can be a solid resource for topics like transfer learning and robust ASR.
For industry, we hope AISHELL-2 recipe can be a helpful reference for building meaningful industrial systems and products.
We consider a gossip approach for finding a Nash equilibrium in a distributed multi-player network game.
We extend previous results on Nash equilibrium seeking to the case when the players' cost functions may be affected by the actions of any subset of players.
An interference graph is employed to illustrate the partially-coupled cost functions and the asymmetric information requirements.
For a given interference graph, we design a generalized communication graph so that players with possibly partially-coupled cost functions exchange only their required information and make decisions based on them.
Using a set of standard assumptions on the cost functions, interference and communication graphs, we prove almost sure convergence to a Nash equilibrium for diminishing step sizes.
We then quantify the effect of the second largest eigenvalue of the expected communication matrix on the convergence rate, and illustrate the trade-off between the parameters associated with the communication and the interference graphs.
Finally, the efficacy of the proposed algorithm on a large-scale networked game is demonstrated via simulation.
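A stripped-down deterministic analogue of Nash equilibrium seeking with diminishing step sizes can be illustrated on a two-player partially-coupled quadratic game; the paper's gossip scheme additionally randomizes which neighbors exchange information over the communication graph, which is not modeled here.

```python
# Illustration (not the paper's algorithm): synchronous gradient play with
# diminishing step sizes on the two-player quadratic game
#   J1(x) = (x1 - 0.5*x2 - 1)^2,  J2(x) = (x2 - 0.5*x1 - 1)^2,
# whose unique Nash equilibrium is x* = (2, 2).

def nash_gradient_play(steps=5000):
    x1, x2 = 0.0, 0.0
    for k in range(steps):
        a = 1.0 / (k + 2)                   # diminishing step size
        g1 = 2.0 * (x1 - 0.5 * x2 - 1.0)    # dJ1/dx1
        g2 = 2.0 * (x2 - 0.5 * x1 - 1.0)    # dJ2/dx2
        x1, x2 = x1 - a * g1, x2 - a * g2
    return x1, x2
```

Each player descends only its own cost along its own action, using only its neighbor's action, yet the iterates converge to the equilibrium; the step-size decay mirrors the diminishing-step assumption in the almost-sure convergence result.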
In cloud computing environments, the number of cloud virtual machines (VMs) keeps growing, so VM security and management face giant challenges.
In order to address the security issues of cloud computing virtualization environments, this paper presents an efficient VM security management model based on dynamic deployment, state migration, and scheduling. We study the corresponding virtual machine security architecture, a VM deployment and scheduling method based on AHP (Analytic Hierarchy Process), and a DDoS attack detection algorithm based on CUSUM (Cumulative Sum), and we functionally test and validate the above methods.
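The one-sided CUSUM test at the heart of such DDoS detection accumulates deviations of a traffic metric (e.g. SYN rate) above its expected level and alarms when the statistic crosses a threshold; the parameter names below (mu, k, h) are illustrative, not the paper's.

```python
# One-sided CUSUM change detection for a traffic metric.

def cusum_detect(samples, mu, k, h):
    """Return index of first alarm, or -1.
    mu: normal mean, k: slack (tolerated drift), h: alarm threshold."""
    s = 0.0
    for i, x in enumerate(samples):
        s = max(0.0, s + (x - mu - k))  # accumulate positive deviations only
        if s > h:
            return i
    return -1
```

Small fluctuations around the normal rate keep the statistic at zero, while a sustained surge trips the alarm within a few samples.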
A current trend in networking and cloud computing is to provide compute resources over widely dispersed places exemplified by initiatives like Network Function Virtualisation.
This paves the way for a widespread service deployment and can improve service quality; a nearby server can reduce the user-perceived response times.
But always using the nearest server is a bad decision if that server is already highly utilized.
This paper investigates the optimal assignment of users to widespread resources -- a convex capacitated facility location problem with integrated queuing systems.
We determine the response times depending on the number of used resources.
This enables service providers to balance between resource costs and the corresponding service quality.
We also present a linear problem reformulation showing small optimality gaps and faster solving times; this speed-up enables a swift reaction to demand changes.
Finally, we compare solutions by either considering or ignoring queuing systems and discuss the response time reduction by using the more complex model.
Our investigations are backed by large-scale numerical evaluations.
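To illustrate why ignoring queueing can misroute users, consider a toy M/M/1-aware assignment rule: the expected response time is the network delay plus the M/M/1 sojourn time 1/(mu − lambda). All rates below are illustrative, and the paper solves the full capacitated facility location problem rather than this greedy per-user rule.

```python
# Toy version of the trade-off: nearest-server assignment ignores queueing,
# while an M/M/1-aware assignment adds the expected sojourn time.

def response_time(net_delay, load, mu):
    if load >= mu:
        return float("inf")            # unstable queue, never assign here
    return net_delay + 1.0 / (mu - load)

def best_server(servers, extra=1.0):
    # servers: list of (net_delay, current_load, service_rate)
    # returns the index minimizing response time once this user's
    # arrival rate `extra` is added to the server's load
    times = [response_time(d, lo + extra, mu) for d, lo, mu in servers]
    return min(range(len(servers)), key=lambda i: times[i])
```

A nearby but nearly saturated server loses to a farther, lightly loaded one, which is exactly the effect the integrated queueing model captures.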
ERP systems contain huge amounts of data related to the actual execution of business processes.
These systems have a particular way of recording activities which results in an unclear display of business processes in event logs.
Several works have been conducted on ERP systems, most of them focusing on the development of new algorithms for the automatic discovery of business processes.
We focus on how organizations with ERP systems can apply process mining to analyze and improve their business processes.
The data handling aspect of ERP systems contrasts with those of BPMS or workflow based systems, whose systematical storage of events facilitates the application of process mining techniques.
CRISP-DM has emerged as the de facto standard for developing data mining and knowledge discovery projects.
Successful data mining requires three families of analytical capabilities namely reporting, classification and forecasting.
A data miner uses more than one analytical method to get the best results.
The objective of this paper is to improve the usability and understandability of process mining techniques, by implementing CRISP-DM methodology for their application in ERP contexts, detailed in terms of specific implementation tools and step by step coordination.
Our study confirms that data discovery from ERP system improves strategic and operational decision making.
Analog/digital hybrid precoders and combiners have been widely used in millimeter wave (mmWave) multiple-input multiple-output (MIMO) systems due to their energy efficiency and cost advantages.
Infinite-resolution phase shifters (PSs) for the analog beamformer can closely approach the performance of the fully digital scheme, but result in high complexity and intensive power consumption.
Thus, more cost-effective and energy-efficient low-resolution PSs are typically used in practical mmWave MIMO systems.
In this paper, we consider the joint hybrid precoder and combiner design with one-bit quantized PSs in mmWave MIMO systems.
We propose to firstly design the analog precoder and combiner pair for each data stream successively, aiming at conditionally maximizing the spectral efficiency.
We present a novel binary analog precoder and combiner optimization algorithm under a Rank-1 approximation of the interference-included equivalent channel with lower than quadratic complexity.
Then the digital precoder and combiner are computed based on the obtained baseband effective channel to further enhance the spectral efficiency.
Simulation results demonstrate that the proposed algorithm outperforms the existing one-bit PSs based hybrid beamforming scheme.
Sensor networks aim at monitoring their surroundings for event detection and object tracking.
However, due to sensor failure or death, false signals can be transmitted.
In this paper, we consider the problem of distributed fault detection in wireless sensor networks (WSNs).
In particular, we consider how to take decision regarding fault detection in a noisy environment as a result of false detection or false response of event by some sensors, where the sensors are placed at the center of regular hexagons and the event can occur at only one hexagon.
We propose fault detection schemes that explicitly introduce the error probabilities into the optimal event detection process.
We introduce two types of detection probabilities, one for the center node, where the event occurs and the other one for the adjacent nodes.
This second type of detection probability is new in sensor network literature.
We develop schemes under the model selection procedure, multiple model selection procedure and use the concept of Bayesian model averaging to identify a set of likely fault sensors and obtain an average predictive error.
Agriculture is vital for human survival and remains a major driver of several economies around the world; more so in underdeveloped and developing economies.
With increasing demand for food and cash crops, due to a growing global population and the challenges posed by climate change, there is a pressing need to increase farm outputs while incurring minimal costs.
Previous machine vision technologies developed for selective weeding have faced the challenge of reliable and accurate weed detection.
We present approaches for plant seedlings classification with a dataset that contains 4,275 images of approximately 960 unique plants belonging to 12 species at several growth stages.
We compare the performances of two traditional algorithms and a Convolutional Neural Network (CNN), a deep learning technique widely applied to image recognition, for this task.
Our findings show that CNN-driven seedling classification applications, when appropriately designed and used in farming automation, have the potential to optimize crop yield and improve productivity and efficiency.
Thinning is the removal of contour pixels/points of connected components in an image to produce their skeleton with retained connectivity and structural properties.
The output requirements of a thinning procedure often vary with application.
This paper proposes a sequential algorithm that is very easy to understand and modify based on application to perform the thinning of multi-dimensional binary patterns.
The algorithm was tested on 2D and 3D patterns and showed very good results.
Moreover, comparisons were also made with two of the state-of-the-art methods used for 2D patterns.
The results obtained prove the validity of the procedure.
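The paper's own sequential, application-tunable procedure is not reproduced here; as a reference point for what a 2D thinning pass looks like, this is the classic Zhang-Suen two-subiteration algorithm on a binary image given as lists of 0/1 rows.

```python
# Classic Zhang-Suen thinning for 2D binary patterns (1 = foreground).
# Iteratively deletes contour pixels that satisfy connectivity-preserving
# conditions, in two alternating subiterations, until no change occurs.

def zhang_suen_thin(img):
    img = [row[:] for row in img]
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9 in clockwise order starting from the pixel above
        return [img[r-1][c], img[r-1][c+1], img[r][c+1], img[r+1][c+1],
                img[r+1][c], img[r+1][c-1], img[r][c-1], img[r-1][c-1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    b = sum(n)                       # number of foreground neighbours
                    # a = number of 0->1 transitions around the pixel
                    a = sum(n[i] == 0 and n[(i + 1) % 8] == 1 for i in range(8))
                    if step == 0:
                        cond = n[0]*n[2]*n[4] == 0 and n[2]*n[4]*n[6] == 0
                    else:
                        cond = n[0]*n[2]*n[6] == 0 and n[0]*n[4]*n[6] == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((r, c))
            for r, c in to_delete:
                img[r][c] = 0
                changed = True
    return img
```

Applied to a thick bar, the procedure peels contour pixels until a connected one-pixel-wide skeleton remains, which is the output requirement the abstract describes.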
Fads, product adoption, mobs, rumors, memes, and emergent norms are diverse social contagions that have been modeled as network cascades.
Empirical study of these cascades is vulnerable to what we describe as the "opacity problem": the inability to observe the critical level of peer influence required to trigger an individual's behavioral change.
Even with maximal information, network cascades reveal intervals that bound critical levels of peer exposure, rather than critical values themselves.
Existing practice uses interval maxima, which systematically over-estimates the social influence required for behavioral change.
Simulations reveal that the over-estimation is likely common and large in magnitude.
This is confirmed by an empirical study of hashtag cascades among 3.2 million Twitter users: one in five hashtag adoptions suffers critical value uncertainty due to the opacity problem.
Different assumptions about these intervals lead to qualitatively different conclusions about the role of peer reinforcement in diffusion.
We introduce a solution that combines identifying tightly bounded intervals with predicting uncertain critical values using node-level information.
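The interval logic behind the opacity problem can be sketched in a few lines: in a discrete-time cascade, an ego who adopts at step t with e_t adopting neighbors, having not adopted at step t−1 with e_{t−1} adopting neighbors, only reveals that the critical exposure lies in the interval (e_{t−1}, e_t]. The function names below are illustrative.

```python
# Bound an ego's critical peer-exposure level from observed adoption times.

def critical_exposure_interval(neighbor_adopt_steps, ego_adopt_step):
    exposure = lambda step: sum(1 for s in neighbor_adopt_steps if s < step)
    lower = exposure(ego_adopt_step - 1)   # this exposure did NOT trigger adoption
    upper = exposure(ego_adopt_step)       # interval maximum, used by existing practice
    return lower, upper
```

If five neighbors adopt before the ego but only two had adopted one step earlier, the critical value lies anywhere in (2, 5]; taking the maximum 5, as existing practice does, systematically over-estimates the influence required.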
Ontologies are one of the core foundations of the Semantic Web.
To participate in Semantic Web projects, domain experts need to be able to understand the ontologies involved.
Visual notations can provide an overview of the ontology and help users to understand the connections among entities.
However, the users first need to learn the visual notation before they can interpret it correctly.
Controlled natural language representation would be readable right away and might be preferred in case of complex axioms, however, the structure of the ontology would remain less apparent.
We propose to combine ontology visualizations with contextual ontology verbalizations of selected ontology (diagram) elements, displaying controlled natural language (CNL) explanations of OWL axioms corresponding to the selected visual notation elements.
Thus, the domain experts will benefit from both the high-level overview provided by the graphical notation and the detailed textual explanations of particular elements in the diagram.
Modal logics are widely used in computer science.
The complexity of their satisfiability problems has been an active field of research since the 1970s.
We prove that even very "simple" modal logics can be undecidable: We show that there is an undecidable modal logic that can be obtained by restricting the allowed models with a first-order formula in which only universal quantifiers appear.
Cops and robbers is a vertex-pursuit game played on graphs.
In the classical cops-and-robbers game, a set of cops and a robber occupy the vertices of the graph and move alternately along the graph's edges with perfect information about each other's positions.
If a cop eventually occupies the same vertex as the robber, then the cops win; the robber wins if she can indefinitely evade capture.
Aigner and Frommer established that in every connected planar graph, three cops are sufficient to capture a single robber.
In this paper, we consider a recently studied variant of the cops-and-robbers game, alternately called the one-active-cop game, one-cop-moves game or the lazy-cops-and-robbers game, where at most one cop can move during any round.
We show that Aigner and Frommer's result does not generalise to this game variant by constructing a connected planar graph on which a robber can indefinitely evade three cops in the one-cop-moves game.
This answers a question recently raised by Sullivan, Townsend and Werzanski.
Distributed Denial-of-Service (DDoS) attacks are a menace to service providers and a prominent issue in network security.
Defeating or defending against DDoS is a prime challenge.
A DDoS attack makes a service unavailable for a certain time.
This phenomenon harms service providers and causes loss of business revenue.
Therefore, DDoS is a grand challenge to defeat.
There are numerous mechanisms to defend against DDoS; this paper surveys the deployment of Bloom Filters in defending against DDoS attacks.
The Bloom Filter is a probabilistic data structure for membership queries that returns either true or false.
A Bloom Filter uses tiny memory to store information about large data.
Therefore, packet information is stored in a Bloom Filter to defend against and defeat DDoS.
This paper presents a survey of DDoS defense techniques using Bloom Filters.
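A minimal Bloom filter of the kind the survey discusses looks as follows: k hash functions set bits in an m-bit array, so membership queries can return false positives but never false negatives, and per-packet state fits in tiny memory. Parameters m and k below are illustrative defaults.

```python
import hashlib

# Minimal Bloom filter: k hash positions over an m-bit array.

class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0                      # m-bit array stored as an int

    def _positions(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        # true if all k bits are set (may be a false positive)
        return all(self.bits >> p & 1 for p in self._positions(item))
```

In a DDoS-defense setting, source identifiers of seen packets are inserted, and a membership query then answers in O(k) time with a few bits per element rather than storing full packet records.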
Deep neural networks have shown excellent performance for stereo matching.
Many efforts focus on the feature extraction and similarity measurement of the matching cost computation step while less attention is paid on cost aggregation which is crucial for stereo matching.
In this paper, we present a learning-based cost aggregation method for stereo matching by a novel sub-architecture in the end-to-end trainable pipeline.
We reformulate the cost aggregation as a learning process of the generation and selection of cost aggregation proposals which indicate the possible cost aggregation results.
The cost aggregation sub-architecture is realized by a two-stream network: one for the generation of cost aggregation proposals, the other for the selection of the proposals.
The criterion for the selection is determined by the low-level structure information obtained from a light convolutional network.
The two-stream network offers a global view guidance for the cost aggregation to rectify the mismatching value stemming from the limited view of the matching cost computation.
The comprehensive experiments on challenge datasets such as KITTI and Scene Flow show that our method outperforms the state-of-the-art methods.
Owing to the expeditious growth in the information and communication technologies, smart cities have raised the expectations in terms of efficient functioning and management.
One key aspect of residents' daily comfort is assured through affording reliable traffic management and route planning.
The majority of present trip planning applications and service providers base their recommendations on shortest paths and/or fastest routes.
However, such suggestions may discount drivers' preferences with respect to safe and less disturbing trips.
Road anomalies such as cracks, potholes, and manholes induce risky driving scenarios and can lead to vehicle damage and costly repairs.
Accordingly, in this paper, we propose a crowdsensing based dynamic route planning system.
Leveraging both the vehicle motion sensors and the inertial sensors within the smart devices, road surface types and anomalies have been detected and categorized.
In addition, the monitored events are geo-referenced utilizing GPS receivers on both vehicles and smart devices.
Consequently, road segment assessments are conducted using fuzzy system models based on aspects such as the number of anomalies and their severity levels in each road segment.
Afterward, another fuzzy model is adopted to recommend the best trip routes based on the road segment quality in each potential route.
Extensive road experiments are conducted to build and show the potential of the proposed system.
We report on an extended robot control application of a contactless and airborne ultrasonic tactile display (AUTD) stimulus-based brain-computer interface (BCI) paradigm, which received the Annual BCI Research Award 2014.
In the award-winning human communication augmentation paradigm, six palm positions are used to evoke somatosensory brain responses, in order to define a novel contactless tactile BCI.
An example application is also presented in which the users manage and control a small robot online.
We discuss deep reinforcement learning in an overview style.
We draw a big picture, filled with details.
We discuss six core elements, six important mechanisms, and twelve applications, focusing on contemporary work and placing it in historical context.
We start with background of artificial intelligence, machine learning, deep learning, and reinforcement learning (RL), with resources.
Next we discuss RL core elements, including value function, policy, reward, model, exploration vs. exploitation, and representation.
Then we discuss important mechanisms for RL, including attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn.
After that, we discuss RL applications, including games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer systems, and science, engineering, and art.
Finally we summarize briefly, discuss challenges and opportunities, and close with an epilogue.
This paper presents the probability hypothesis density filter (PHD) and the cardinality PHD (CPHD) filter for sets of trajectories, which are referred to as the trajectory PHD (TPHD) and trajectory CPHD (TCPHD) filters.
Contrary to the PHD/CPHD filters, the TPHD/TCPHD filters are able to produce trajectory estimates from first principles.
The TPHD filter is derived by recursively obtaining the best Poisson multitrajectory density approximation to the posterior density over the alive trajectories by minimising the Kullback-Leibler divergence.
The TCPHD is derived in the same way but propagating an independent identically distributed (IID) cluster multitrajectory density approximation.
We also propose the Gaussian mixture implementations of the TPHD and TCPHD recursions, the Gaussian mixture TPHD (GMTPHD) and the Gaussian mixture TCPHD (GMTCPHD), and the L-scan computationally efficient implementations, which only update the density of the trajectory states of the last L time steps.
Secure email is increasingly being touted as usable by novice users, with a push for adoption based on recent concerns about government surveillance.
To determine whether secure email is ready for grassroots adoption, we employ a laboratory user study that recruits pairs of novices to install and use several of the latest systems to exchange secure messages.
We present quantitative and qualitative results from 25 pairs of novice users as they use Pwm, Tutanota, and Virtru.
Participants report being more at ease with this type of study and better able to cope with mistakes since both participants are "on the same page".
We find that users prefer integrated solutions over depot-based solutions, and that tutorials are important in helping first-time users.
Hiding the details of how a secure email system provides security can lead to a lack of trust in the system.
Participants expressed a desire to use secure email, but few wanted to use it regularly and most were unsure of when they might use it.
Nowadays, the major challenge in machine learning is the Big Data challenge.
In big data problems, due to a large number of data points or a large number of features in each data point, or both, the training of models has become very slow.
The training time has two major components: Time to access the data and time to process (learn from) the data.
So far, the research has focused only on the second part, i.e., learning from the data.
In this paper, we have proposed one possible solution to handle the big data problems in machine learning.
The idea is to reduce the training time through reducing data access time by proposing systematic sampling and cyclic/sequential sampling to select mini-batches from the dataset.
To prove the effectiveness of the proposed sampling techniques, we have used Empirical Risk Minimization, a commonly used machine learning problem, for the strongly convex and smooth case.
The problem has been solved using SAG, SAGA, SVRG, SAAG-II and MBSGD (Mini-batched SGD), each using two step determination techniques, namely, constant step size and backtracking line search method.
Theoretical results prove the same convergence for systematic sampling, cyclic sampling and the widely used random sampling technique, in expectation.
Experimental results with benchmark datasets prove the efficacy of the proposed sampling techniques and show up to six times faster training.
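As a sketch of the batch-selection idea (under our own naming, not the paper's code): systematic sampling draws one random offset and then strides through the data at a fixed interval, while cyclic/sequential sampling sweeps the data in order; both reduce random-access cost relative to uniformly random mini-batches.

```python
import random

# Two structured mini-batch selection schemes for a dataset of n points.

def systematic_batch(n, batch_size, rng=random):
    # one random start, then every (n // batch_size)-th index
    stride = n // batch_size
    start = rng.randrange(stride)
    return [(start + i * stride) % n for i in range(batch_size)]

def cyclic_batches(n, batch_size):
    # deterministic sweep: consecutive index blocks covering one epoch
    for start in range(0, n, batch_size):
        yield list(range(start, min(start + batch_size, n)))
```

Because both schemes touch memory in a regular pattern, data access time drops, which is exactly the component of training time the paper targets.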
In this paper, we propose a complete preprocessing methodology for discovering patterns in the web usage mining process that improves data quality by reducing data quantity.
We also propose a dynamic ART1 neural network clustering algorithm, with a neat architecture, to group users according to their Web access patterns.
Several experiments are conducted and the results show the proposed methodology reduces the size of Web log files down to 73-82% of the initial size and the proposed ART1 algorithm is dynamic and learns relatively stable quality clusters.
This paper studies dynamic spectrum leasing in a cognitive radio network.
There are two spectrum sellers, who are two primary networks, each with an amount of licensed spectrum bandwidth.
When a seller has some unused spectrum, it would like to lease the unused spectrum to secondary users.
A coordinator helps to perform the spectrum leasing stage-by-stage.
As the two sellers may have different leasing periods, there are three epochs, in which seller 1 has spectrum to lease in Epochs II and III, while seller 2 has spectrum to lease in Epochs I and II.
Each seller needs to decide how much spectrum it should lease to secondary users in each stage of its leasing period, with a target at revenue maximization.
It is shown that, when the two sellers both have spectrum to lease (i.e., in Epoch II), the spectrum leasing can be formulated as a non-cooperative game.
Nash equilibria of the game are found in closed form.
Solutions for the two sellers in all three epochs are derived.
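The flavor of the Epoch-II game can be illustrated with a standard Cournot-style model, which is not the paper's exact formulation: both sellers lease simultaneously, the unit price falls linearly with the total leased bandwidth, p = a − b(q1 + q2), and each seller maximizes revenue q_i · p, giving the closed-form Nash equilibrium q1 = q2 = a/(3b).

```python
# Illustrative Cournot-style spectrum leasing game (not the paper's model).
# Price: p = a - b*(q1 + q2); seller i's revenue: q_i * p.

def best_response(q_other, a, b):
    # argmax_q of q * (a - b * (q + q_other)) = (a - b * q_other) / (2b)
    return (a - b * q_other) / (2.0 * b)

def leasing_nash_equilibrium(a, b):
    # symmetric fixed point of the best responses
    q = a / (3.0 * b)
    return q, q
```

At the equilibrium, each seller's quantity is a best response to the other's, so neither can raise its revenue by unilaterally leasing more or less spectrum.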
Nowadays, stochastic approximation methods are one of the major research directions for dealing with large-scale machine learning problems.
From stochastic first-order methods, the focus is now shifting to stochastic second-order methods due to their faster convergence.
In this paper, we propose a novel Stochastic Trust RegiOn inexact Newton method, called STRON, which uses the conjugate gradient (CG) method to solve the trust region subproblem.
The method uses progressive subsampling in the calculation of gradient and Hessian values to take the advantage of both stochastic approximation and full batch regimes.
We have extended STRON using existing variance reduction techniques to deal with the noisy gradients, and using preconditioned conjugate gradient (PCG) as subproblem solver.
We further extend STRON to solve SVM.
Finally, theoretical results prove superlinear convergence for STRON, and empirical results on benchmark datasets prove the efficacy of the proposed method against existing methods.
Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems.
The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is L2.
In this paper, we bring attention to alternative choices for image restoration.
In particular, we show the importance of perceptually-motivated losses when the resulting image is to be evaluated by a human observer.
We compare the performance of several losses, and propose a novel, differentiable error function.
We show that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged.
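A toy comparison of the default L2 loss against L1, one of the alternative restoration losses discussed above (the paper's perceptually-motivated loss is more involved; this only illustrates why the choice matters): L2 penalizes a single large error quadratically, while L1 grows linearly.

```python
import numpy as np

def l2_loss(pred, target):
    return np.mean((pred - target) ** 2)

def l1_loss(pred, target):
    return np.mean(np.abs(pred - target))

target = np.zeros((8, 8))
pred = target.copy()
pred[0, 0] = 2.0   # one large outlier error of magnitude 2

# L2 charges 2**2 = 4 for the outlier pixel, L1 only 2,
# so L2 lets a few large errors dominate the training signal.
l2 = l2_loss(pred, target)   # 4 / 64 = 0.0625
l1 = l1_loss(pred, target)   # 2 / 64 = 0.03125
```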
Semantic segmentation, like other fields of computer vision, has seen a remarkable performance advance through the use of deep convolutional neural networks.
However, considering that neighboring pixels are heavily dependent on each other, both learning and testing of these methods involve many redundant operations.
To resolve this problem, the proposed network is trained and tested with only 0.37% of the total pixels via superpixel-based sampling, which greatly reduces the complexity of the upsampling calculation.
The hypercolumn feature maps are constructed by pyramid module in combination with the convolution layers of the base network.
Since the proposed method uses a very small number of sampled pixels, the end-to-end learning of the entire network is difficult with a common learning rate for all the layers.
In order to resolve this problem, the learning rate after sampling is controlled by statistical process control (SPC) of gradients in each layer.
The proposed method performs as well as or better than conventional methods that use far more samples on the Pascal Context and SUN-RGBD datasets.
Deep learning is an effective approach to solving image recognition problems.
People draw intuitive conclusions from trading charts; this study uses the characteristics of deep learning to train computers in imitating this kind of intuition in the context of trading charts.
The three steps involved are as follows: 1. Before training, we pre-process the input data from quantitative data to images.
2. We use a convolutional neural network (CNN), a type of deep learning, to train our trading model.
3. We evaluate the model's performance in terms of the accuracy of classification.
A trading model is obtained with this approach to help devise trading strategies.
The main application is designed to help clients automatically obtain personalized trading strategies.
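A hypothetical sketch of the pre-processing step described above: rendering a window of quantitative price data as a small grayscale "chart" image that a CNN can consume. The rendering choices (image height, one marked pixel per time step) are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def series_to_image(prices, height=32):
    """Render a price series as a binary chart image (rows x time)."""
    prices = np.asarray(prices, dtype=float)
    lo, hi = prices.min(), prices.max()
    # Normalize each price into [0, 1], then map to a pixel row.
    scaled = (prices - lo) / (hi - lo + 1e-12)
    rows = np.round(scaled * (height - 1)).astype(int)
    img = np.zeros((height, len(prices)), dtype=np.float32)
    # Row 0 is the top of the image, so invert the row index.
    img[height - 1 - rows, np.arange(len(prices))] = 1.0
    return img

img = series_to_image([1, 2, 3, 4, 5], height=4)
```

Each column carries exactly one lit pixel, so an upward trend appears as a rising line, which is the kind of visual pattern the CNN is trained to classify.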
Model pruning has become a useful technique that improves the computational efficiency of deep learning, making it possible to deploy solutions in resource-limited scenarios.
A widely-used practice in relevant work assumes that a smaller-norm parameter or feature plays a less informative role at the inference time.
In this paper, we propose a channel pruning technique for accelerating the computations of deep convolutional neural networks (CNNs) that does not critically rely on this assumption.
Instead, it focuses on direct simplification of the channel-to-channel computation graph of a CNN, without the need to perform the computationally difficult, and not always useful, task of making the high-dimensional tensors of a CNN structured-sparse.
Our approach takes two stages: first, we adopt an end-to-end stochastic training method that eventually forces the outputs of some channels to be constant; then, we prune those constant channels from the original neural network by adjusting the biases of their impacting layers, such that the resulting compact model can be quickly fine-tuned.
Our approach is mathematically appealing from an optimization perspective and easy to reproduce.
We evaluated our approach on several image learning benchmarks and demonstrate its interesting aspects and competitive performance.
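A toy illustration of the pruning step (not the authors' full training procedure): once training has forced a channel's output to a constant, that channel can be removed and its effect absorbed into the bias of the consuming layer, leaving the model's outputs exactly unchanged. Fully connected layers stand in for convolutions here.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)

W1 = rng.normal(size=(3, 4)); b1 = rng.normal(size=3)
W2 = rng.normal(size=(2, 3)); b2 = rng.normal(size=2)

# Force channel 2 of layer 1 to be constant: zero weights, bias c >= 0
# so that ReLU passes the constant c through unchanged.
W1[2] = 0.0
c = 0.7
b1[2] = c

def forward(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)   # ReLU
    return W2 @ h + b2

y_full = forward(x, W1, b1, W2, b2)

# Prune channel 2: drop its row/column and fold c into the next bias.
W1p, b1p = np.delete(W1, 2, axis=0), np.delete(b1, 2)
W2p = np.delete(W2, 2, axis=1)
b2p = b2 + W2[:, 2] * c

y_pruned = forward(x, W1p, b1p, W2p, b2p)
assert np.allclose(y_full, y_pruned)   # outputs are identical
```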
Advancements in technology and culture lead to changes in our language.
These changes create a gap between the language known by users and the language stored in digital archives.
This affects users' ability, first, to find content and, second, to interpret that content.
In previous work we introduced our approach for Named Entity Evolution Recognition~(NEER) in newspaper collections.
Lately, increasing efforts in Web preservation have led to the increased availability of Web archives covering longer time spans.
However, language on the Web is more dynamic than in traditional media and many of the basic assumptions from the newspaper domain do not hold for Web data.
In this paper we discuss the limitations of existing methodology for NEER.
We approach these by adapting an existing NEER method to work on noisy data like the Web and the Blogosphere in particular.
We develop novel filters that reduce the noise and make use of Semantic Web resources to obtain more information about terms.
Our evaluation shows the potentials of the proposed approach.
We propose an algebraic setup for end-to-end physical-layer network coding based on submodule transmission.
We introduce a distance function between modules, describe how it relates to information loss and errors, and show how to compute it.
Then we propose a definition of submodule error-correcting code, and investigate bounds and constructions for such codes.
The main challenge of online multi-object tracking is to reliably associate object trajectories with detections in each video frame based on their tracking history.
In this work, we propose the Recurrent Autoregressive Network (RAN), a temporal generative modeling framework to characterize the appearance and motion dynamics of multiple objects over time.
The RAN couples an external memory and an internal memory.
The external memory explicitly stores previous inputs of each trajectory in a time window, while the internal memory learns to summarize long-term tracking history and associate detections by processing the external memory.
We conduct experiments on the MOT 2015 and 2016 datasets to demonstrate the robustness of our tracking method in highly crowded and occluded scenes.
Our method achieves top-ranked results on the two benchmarks.
A new wave of decision-support systems is being built today using AI services that draw insights from data (like text and video) and incorporate them in human-in-the-loop assistance.
However, just as we expect humans to be ethical, the same expectation needs to be met by automated systems that increasingly get delegated to act on their behalf.
A very important aspect of an ethical behavior is to avoid (intended, perceived, or accidental) bias.
Bias occurs when the data distribution is not representative enough of the natural phenomenon one wants to model and reason about.
The possibly biased behavior of a service is hard to detect and handle if the AI service is merely being used and not developed from scratch, since the training data set is not available.
In this situation, we envisage a 3rd party rating agency that is independent of the API producer or consumer and has its own set of biased and unbiased data, with customizable distributions.
We propose a 2-step rating approach that generates bias ratings signifying whether the AI service is unbiased compensating, data-sensitive biased, or biased.
The approach also works on composite services.
We implement it in the context of text translation and report interesting results.
We develop a novel method, based on the statistical concept of the Vapnik-Chervonenkis dimension, to evaluate the selectivity (output cardinality) of SQL queries - a crucial step in optimizing the execution of large scale database and data-mining operations.
The major theoretical contribution of this work, which is of independent interest, is an explicit bound to the VC-dimension of a range space defined by all possible outcomes of a collection (class) of queries.
We prove that the VC-dimension is a function of the maximum number of Boolean operations in the selection predicate and of the maximum number of select and join operations in any individual query in the collection, but it is neither a function of the number of queries in the collection nor of the size (number of tuples) of the database.
We leverage this result and develop a method that, given a class of queries, builds a concise random sample of a database, such that with high probability the execution of any query in the class on the sample provides an accurate estimate of the selectivity of the query on the original large database.
The error probability holds simultaneously for the selectivity estimates of all queries in the collection, thus the same sample can be used to evaluate the selectivity of multiple queries, and the sample needs to be refreshed only following major changes in the database.
The sample representation computed by our method is typically sufficiently small to be stored in main memory.
We present extensive experimental results, validating our theoretical analysis and demonstrating the advantage of our technique when compared to complex selectivity estimation techniques used in PostgreSQL and Microsoft SQL Server.
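An illustrative sketch of the core idea: estimating a query's selectivity on a small random sample instead of the full table. The paper's contribution is the VC-dimension bound on how large the sample must be for the estimate to be accurate for every query in a class simultaneously; the code below only shows the estimator itself, with made-up table data.

```python
import random

def selectivity(table, predicate):
    """Fraction of rows satisfying the selection predicate."""
    return sum(1 for row in table if predicate(row)) / len(table)

def estimate_selectivity(table, predicate, sample_size, seed=0):
    """Estimate selectivity on a uniform random sample of the table."""
    sample = random.Random(seed).sample(table, sample_size)
    return selectivity(sample, predicate)

table = [{"age": a} for a in range(100)]
pred = lambda r: r["age"] < 30        # true selectivity = 0.30
```

The same sample can then be reused to estimate the selectivity of any other predicate in the query class, as the abstract notes.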
Remote sensing image classification is a fundamental task in remote sensing image processing.
However, the remote sensing field still lacks a large-scale benchmark comparable to ImageNet or Places2.
We propose a remote sensing image classification benchmark (RSI-CB) based on crowdsourced data, which is massive, scalable, and diverse.
Using crowdsourced data, we can efficiently annotate ground objects in remote sensing images via points of interest, vector data from OSM, or other crowdsourced data.
Based on this method, we construct a worldwide large-scale benchmark for remote sensing image classification.
In this benchmark, there are two sub-datasets with image sizes of 256 x 256 and 128 x 128 respectively, since different convolutional neural networks require different image sizes.
The former sub-dataset contains 6 categories with 35 subclasses and a total of more than 24,000 images; the latter contains 6 categories with 45 subclasses and a total of more than 36,000 images.
The six categories are agricultural land, construction land and facilities, transportation and facilities, water and water conservancy facilities, woodland, and other land, and each category has several subclasses.
This classification system is defined according to the national standard of land use classification in China, and is inspired by the hierarchy mechanism of ImageNet.
Finally, we have conducted a large number of experiments comparing RSI-CB with the SAT-4 and UC-Merced datasets on handcrafted features, such as SIFT, and classical CNN models, such as AlexNet, VGG, GoogleNet, and ResNet.
We also show that CNN models trained on RSI-CB perform well when transferred to other datasets, i.e., UC-Merced, and have good generalization ability.
The experiments show that RSI-CB is more suitable as a benchmark for the remote sensing image classification task than existing ones in the big data era, and can potentially be used in practical applications.
Understanding the loss surface of neural networks is essential for the design of models with predictable performance and their success in applications.
Experimental results suggest that sufficiently deep and wide neural networks are not negatively impacted by suboptimal local minima.
Despite recent progress, the reason for this outcome is not fully understood.
Could deep networks have very few, if any, suboptimal local optima? Or could all of them be equally good?
We provide a construction to show that suboptimal local minima (i.e. non-global ones), even though degenerate, exist for fully connected neural networks with sigmoid activation functions.
The local minima obtained by our proposed construction belong to a connected set of local solutions that can be escaped from via a non-increasing path on the loss curve.
For extremely wide neural networks with two hidden layers, we prove that every suboptimal local minimum belongs to such a connected set.
This provides a partial explanation for the successful application of deep neural networks.
In addition, we also characterize under what conditions the same construction leads to saddle points instead of local minima for deep neural networks.
The most data-efficient algorithms for reinforcement learning (RL) in robotics are based on uncertain dynamical models: after each episode, they first learn a dynamical model of the robot, then they use an optimization algorithm to find a policy that maximizes the expected return given the model and its uncertainties.
It is often believed that this optimization can be tractable only if analytical, gradient-based algorithms are used; however, these algorithms require using specific families of reward functions and policies, which greatly limits the flexibility of the overall approach.
In this paper, we introduce a novel model-based RL algorithm, called Black-DROPS (Black-box Data-efficient RObot Policy Search) that: (1) does not impose any constraint on the reward function or the policy (they are treated as black-boxes), (2) is as data-efficient as the state-of-the-art algorithm for data-efficient RL in robotics, and (3) is as fast (or faster) than analytical approaches when several cores are available.
The key idea is to replace the gradient-based optimization algorithm with a parallel, black-box algorithm that takes into account the model uncertainties.
We demonstrate the performance of our new algorithm on two standard control benchmark problems (in simulation) and a low-cost robotic manipulator (with a real robot).
Advances in machine learning have produced systems that attain human-level performance on certain visual tasks, e.g., object identification.
Nonetheless, other tasks requiring visual expertise are unlikely to be entrusted to machines for some time, e.g., satellite and medical imagery analysis.
We describe a human-machine cooperative approach to visual search, the aim of which is to outperform either human or machine acting alone.
The traditional route to augmenting human performance with automatic classifiers is to draw boxes around regions of an image deemed likely to contain a target.
Human experts typically reject this type of hard highlighting.
We propose instead a soft highlighting technique in which the saliency of regions of the visual field is modulated in a graded fashion based on classifier confidence level.
We report on experiments with both synthetic and natural images showing that soft highlighting achieves a performance synergy surpassing that attained by hard highlighting.
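A minimal sketch of the soft highlighting idea, under the assumption of a grayscale image and a per-pixel classifier confidence map in [0, 1] (both names and the `floor` parameter are illustrative): pixel saliency is modulated in a graded fashion, dimming low-confidence regions while leaving high-confidence regions intact, rather than drawing hard boxes.

```python
import numpy as np

def soft_highlight(image, confidence, floor=0.3):
    """Scale each pixel by a gain in [floor, 1] based on confidence."""
    gain = floor + (1.0 - floor) * np.clip(confidence, 0.0, 1.0)
    return image * gain

img = np.ones((2, 2))
conf = np.array([[1.0, 0.0],
                 [0.5, 1.0]])
out = soft_highlight(img, conf)   # graded, not binary, emphasis
```

A confidence of 1 leaves the pixel untouched, 0 dims it to the floor, and intermediate values interpolate smoothly.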
In this paper, the effect of possibilistic or mixed background risk on the level of optimal prevention is studied.
In the framework of five purely possibilistic or mixed models, necessary and sufficient conditions are found under which the level of optimal saving decreases or increases as a result of various types of background risk.
In this way, our results complete those obtained by Courbage and Rey for some prevention models with probabilistic background risk.
Through a combination of experimental and simulation results, we illustrate that passive recommendations encoded in typical computer user-interfaces (UIs) can subdue users' natural proclivity to access diverse information sources.
Inspired by traditional demonstrations of a part-set cueing effect in the cognitive science literature, we performed an online experiment manipulating the operation of the 'New Tab' page for consenting volunteers over a two month period.
Examination of their browsing behavior reveals that typical frequency and recency-based methods for displaying websites in these displays subdues users' propensity to access infrequently visited pages compared to a situation wherein no web page icons are displayed on the new tab page.
Using a carefully designed simulation study, representing user behavior as a random walk on a graph, we inferred quantitative predictions about the extent to which discovery of new sources of information may be hampered by personalized 'New Tab' recommendations in typical computer UIs.
We show that our results are significant at the individual level and explain the potential consequences of the observed suppression in web-exploration.
Although the latent factor model achieves good accuracy in rating prediction, it suffers from many problems including cold-start, non-transparency, and suboptimal results for individual user-item pairs.
In this paper, we exploit textual reviews and item images together with ratings to tackle these limitations.
Specifically, we first apply a proposed multi-modal aspect-aware topic model (MATM) on text reviews and item images to model users' preferences and items' features from different aspects, and also estimate the aspect importance of a user towards an item.
Then the aspect importance is integrated into a novel aspect-aware latent factor model (ALFM), which learns user's and item's latent factors based on ratings.
In particular, ALFM introduces a weight matrix to associate those latent factors with the same set of aspects in MATM, such that the latent factors could be used to estimate aspect ratings.
Finally, the overall rating is computed via a linear combination of the aspect ratings, which are weighted by the corresponding aspect importance.
In this way, our model alleviates the data sparsity problem and gains good interpretability for recommendation.
Besides, every aspect rating is weighted by its aspect importance, which is dependent on the targeted user's preferences and the targeted item's features.
Therefore, it is expected that the proposed method can model a user's preferences on an item more accurately for each user-item pair.
Comprehensive experimental studies have been conducted on the Yelp 2017 Challenge dataset and Amazon product datasets to demonstrate the effectiveness of our method.
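A sketch of the final prediction step described above: the overall rating is a linear combination of per-aspect ratings weighted by the user-item-specific aspect importance. Function names and the example aspects are illustrative, not the authors' implementation.

```python
import numpy as np

def overall_rating(aspect_ratings, aspect_importance):
    """Weighted combination of aspect ratings; weights are normalized."""
    aspect_ratings = np.asarray(aspect_ratings, dtype=float)
    w = np.asarray(aspect_importance, dtype=float)
    w = w / w.sum()
    return float(w @ aspect_ratings)

# Three aspects (e.g., food, service, price) for one user-item pair,
# with the first aspect most important to this user for this item.
r = overall_rating([5.0, 3.0, 4.0], [0.6, 0.2, 0.2])   # 4.4
```

Because the importance weights differ per user-item pair, the same aspect ratings can yield different overall predictions for different users.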
In spite of recent advances in field delineation methods, bibliometricians still don't know the extent to which their topic detection algorithms reconstruct `ground truths', i.e. thematic structures in the scientific literature.
In this paper, we demonstrate a new approach to the delineation of thematic structures that attempts to match the algorithm to theoretically derived and empirically observed properties all thematic structures have in common.
We cluster citation links rather than publication nodes, use predominantly local information and search for communities of links starting from seed subgraphs in order to allow for pervasive overlaps of topics.
We evaluate sets of links with a new cost function and assume that local minima in the cost landscape correspond to link communities.
Because this cost landscape has many local minima we define a valid community as the community with the lowest minimum within a certain range.
Since finding all valid communities is impossible for large networks, we designed a memetic algorithm that combines probabilistic evolutionary strategies with deterministic local searches.
We apply our approach to a network of about 15,000 Astronomy & Astrophysics papers published 2010 and their cited sources, and to a network of about 100,000 Astronomy & Astrophysics papers (published 2003--2010) which are linked through direct citations.
The construction of business process models has become an important requisite in the analysis and optimization of processes.
The success of the analysis and optimization efforts heavily depends on the quality of the models.
Therefore, a research domain emerged that studies the process of process modeling.
This paper contributes to this research by presenting a way of visualizing the different steps a modeler undertakes to construct a process model, in a so-called process of process modeling Chart.
The graphical representation lowers the cognitive efforts to discover properties of the modeling process, which facilitates the research and the development of theory, training and tool support for improving model quality.
The paper contains an extensive overview of applications of the tool that demonstrate its usefulness for research and practice and discusses the observations from the visualization in relation to other work.
The visualization was evaluated through a qualitative study that confirmed its usefulness and added value compared to the Dotted Chart that inspired it.
This paper reports on the advantages of using GitHub and LaTeX for MSc thesis writing.
The GitHub portal provides a great tool for scientists and students to share data and to notify the co-workers, tutors, and supervisors involved in the research about updates.
It enables collaborators to connect, share current results, release datasets and updates, and more.
Using the standard command-line interface, GitHub allows registered users to push repositories to the website.
The availability of both public and private repositories makes it possible to share current data updates with a target audience, e.g., unpublished research work only with co-authors or supervisors, or vice versa.
Therefore, academic centres and universities should strongly popularize and increase the use of GitHub for student works.
A case study is given on a graduate study: an MSc work written and maintained using the open source GitHub service at the University of Twente, Faculty of Geo-Information Science and Earth Observation (Netherlands).
It reports the author's successful experience of writing an MSc thesis based on the effective combination of LaTeX and GitHub.
The safety of infinite state systems can be checked by a backward reachability procedure.
For certain classes of systems, it is possible to prove the termination of the procedure and hence conclude the decidability of the safety problem.
Although backward reachability is property-directed, it can unnecessarily explore (large) portions of the state space of a system which are not required to verify the safety property under consideration.
To avoid this, invariants can be used to dramatically prune the search space.
The problem, however, is to guess such appropriate invariants.
In this paper, we present a fully declarative and symbolic approach to the mechanization of backward reachability of infinite state systems manipulating arrays by Satisfiability Modulo Theories solving.
Theories are used to specify the topology and the data manipulated by the system.
We identify sufficient conditions on the theories to ensure the termination of backward reachability and we show the completeness of a method for invariant synthesis (obtained as the dual of backward reachability), again, under suitable hypotheses on the theories.
We also present a pragmatic approach to interleave invariant synthesis and backward reachability so that a fix-point for the set of backward reachable states is more easily obtained.
Finally, we discuss heuristics that allow us to derive an implementation of the techniques in the model checker MCMT, showing remarkable speed-ups on a significant set of safety problems extracted from a variety of sources.
This paper presents an integrated multi-agent architecture for indexing and retrieving video information. The focus of our work is to elaborate an extensible approach that gathers most of the tools required to address the major intertwined problems raised throughout the video lifecycle (classification, indexing, and retrieval).
Effective and optimal video information retrieval needs a collaborative approach based on multimodal aspects.
Clearly, it must take into account the distributed nature of the data sources, content adaptation, semantic annotation, personalized requests, and active feedback, which constitute the backbone of a robust system that improves its performance in a smart way.
The problem of improving the efficiency of a teaching department through the development of an automated teaching department workplace is described.
The developed automated workplace for teaching department staff allows monitoring of student progress and of students' mastery of disciplines, is synchronized with the automated workplace of a higher school teacher, and automatically completes the report on student contingent movement.
In addition, the designed system increases the efficiency and effectiveness of the activities of teaching department employees.
In a modern recommender system, it is important to understand how products relate to each other.
For example, while a user is looking for mobile phones, it might make sense to recommend other phones, but once they buy a phone, we might instead want to recommend batteries, cases, or chargers.
These two types of recommendations are referred to as substitutes and complements: substitutes are products that can be purchased instead of each other, while complements are products that can be purchased in addition to each other.
Here we develop a method to infer networks of substitutable and complementary products.
We formulate this as a supervised link prediction task, where we learn the semantics of substitutes and complements from data associated with products.
The primary source of data we use is the text of product reviews, though our method also makes use of features such as ratings, specifications, prices, and brands.
Methodologically, we build topic models that are trained to automatically discover topics from text that are successful at predicting and explaining such relationships.
Experimentally, we evaluate our system on the Amazon product catalog, a large dataset consisting of 9 million products, 237 million links, and 144 million reviews.
This short paper reports the algorithms we used and the evaluation performances for ISIC Challenge 2018.
Our team participates in all the tasks in this challenge.
In lesion segmentation task, the pyramid scene parsing network (PSPNet) is modified to segment the lesions.
In lesion attribute detection task, the modified PSPNet is also adopted in a multi-label way.
In disease classification task, the DenseNet-169 is adopted for multi-class classification.
Requirements about the quality of clinical guidelines can be represented by schemata borrowed from the theory of abductive diagnosis, using temporal logic to model the time-oriented aspects expressed in a guideline.
Previously, we have shown that these requirements can be verified using interactive theorem proving techniques.
In this paper, we investigate how this approach can be mapped to the facilities of a resolution-based theorem prover, Otter, and a complementary program that searches for finite models of first-order statements, Mace.
It is shown that the reasoning required for checking the quality of a guideline can be mapped to such fully automated theorem-proving facilities.
The medical quality of an actual guideline concerning diabetes mellitus 2 is investigated in this way.
Cloud Service Providers (CSPs) offer a wide variety of scalable, flexible, and cost-efficient services to cloud users on demand and pay-per-utilization basis.
However, the vast diversity of available cloud service providers poses numerous challenges for users in determining and selecting the most suitable service.
Also, users sometimes need to hire the required services from multiple CSPs, which introduces difficulties in managing interfaces, accounts, security, support, and Service Level Agreements (SLAs).
To circumvent such problems, a Cloud Service Broker (CSB) that is aware of service offerings and users' Quality of Service (QoS) requirements would benefit both the CSPs and the users.
In this work, we propose a Fuzzy Rough Set based Cloud Service Brokerage Architecture, which is responsible for ranking and selecting services based on users' QoS requirements, and finally for monitoring the service execution.
We use the fuzzy rough set technique for dimensionality reduction, and a weighted Euclidean distance to rank the CSPs.
To prioritize user QoS requests, we use user-assigned weights, and also incorporate system-assigned weights to give relative importance to QoS attributes.
We compared the proposed ranking technique with an existing method based on the system response time.
The case study experiment results show that the proposed approach is scalable, resilient, and produces better results with less search time.
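A sketch of the ranking step, assuming each CSP is described by a normalized QoS vector and ranked by weighted Euclidean distance to the user's ideal requirement vector (smaller distance means a better rank). Provider names, attribute values, and weights below are illustrative.

```python
import numpy as np

def rank_csps(csps, ideal, weights):
    """Sort (name, qos_vector) pairs by weighted Euclidean distance to ideal."""
    ideal = np.asarray(ideal, dtype=float)
    w = np.asarray(weights, dtype=float)
    def dist(qos):
        return float(np.sqrt(np.sum(w * (np.asarray(qos, float) - ideal) ** 2)))
    return sorted(csps, key=lambda item: dist(item[1]))

# Two QoS attributes (e.g., availability, throughput), already normalized,
# with the first attribute weighted more heavily by the user.
csps = [("CSP-A", [0.9, 0.8]),
        ("CSP-B", [0.5, 0.4]),
        ("CSP-C", [1.0, 1.0])]
ranking = rank_csps(csps, ideal=[1.0, 1.0], weights=[0.7, 0.3])
```

Here CSP-C matches the ideal exactly and ranks first, followed by CSP-A and CSP-B.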
This paper evaluates eight parallel graph processing systems: Hadoop, HaLoop, Vertica, Giraph, GraphLab (PowerGraph), Blogel, Flink Gelly, and GraphX (SPARK) over four very large datasets (Twitter, World Road Network, UK 200705, and ClueWeb) using four workloads (PageRank, WCC, SSSP and K-hop).
The main objective is to perform an independent scale-out study by experimentally analyzing the performance, usability, and scalability (using up to 128 machines) of these systems.
In addition to performance results, we discuss our experiences in using these systems and suggest some system tuning heuristics that lead to better performance.
In this paper we propose an ensemble of local and deep features for object classification.
We also compare and contrast effectiveness of feature representation capability of various layers of convolutional neural network.
We demonstrate with extensive experiments for object classification that the representation capability of features from deep networks can be complemented with information captured from local features.
We also find that features from various deep convolutional networks encode distinctive characteristic information.
We establish that, as opposed to conventional practice, intermediate layers of deep networks can augment the classification capabilities of features obtained from fully connected layers.
The Hospitals / Residents problem with Couples (HRC) models the allocation of intending junior doctors to hospitals where couples are allowed to submit joint preference lists over pairs of (typically geographically close) hospitals.
It is known that a stable matching need not exist, so we consider MIN BP HRC, the problem of finding a matching that admits the minimum number of blocking pairs (i.e., is "as stable as possible").
We show that this problem is NP-hard and difficult to approximate even in the highly restricted case that each couple finds only one hospital pair acceptable.
However, if we further assume that the preference list of each single resident and hospital has length at most 2, we give a polynomial-time algorithm for this case.
We then present the first Integer Programming (IP) and Constraint Programming (CP) models for MIN BP HRC.
Finally, we discuss an empirical evaluation of these models applied to randomly-generated instances of MIN BP HRC.
We find that on average, the CP model is about 1.15 times faster than the IP model, and when presolving is applied to the CP model, it is on average 8.14 times faster.
We further observe that the number of blocking pairs admitted by a solution is very small, i.e., usually at most 1, and never more than 2, for the (28,000) instances considered.
Disasters lead to devastating structural damage not only to buildings and transport infrastructure, but also to other critical infrastructure, such as the power grid and communication backbones.
Following such an event, the availability of minimal communication services is however crucial to allow efficient and coordinated disaster response, to enable timely public information, or to provide individuals in need with a default mechanism to post emergency messages.
The Internet of Things consists in the massive deployment of heterogeneous devices, most of which battery-powered, and interconnected via wireless network interfaces.
Typical IoT communication architectures enable such IoT devices not only to connect to the communication backbone (i.e., the Internet) using an infrastructure-based wireless network paradigm, but also to communicate with one another autonomously, without the help of any infrastructure, using a spontaneous wireless network paradigm.
In this paper, we argue that the vast deployment of IoT-enabled devices could bring benefits in terms of data network resilience in face of disaster.
Leveraging their spontaneous wireless networking capabilities, IoT devices could enable minimal communication services (e.g. emergency micro-message delivery) while the conventional communication infrastructure is out of service.
We identify the main challenges that must be addressed in order to realize this potential in practice.
These challenges concern various technical aspects, including physical connectivity requirements, network protocol stack enhancements, data traffic prioritization schemes, as well as social and political aspects.
Though deep neural networks have achieved great success in recent studies and applications, they remain vulnerable to adversarial perturbations which are imperceptible to humans.
To address this problem, we propose a novel network called ReabsNet to achieve high classification accuracy in the face of various attacks.
The approach is to augment an existing classification network with a guardian network to detect if a sample is natural or has been adversarially perturbed.
Critically, instead of simply rejecting adversarial examples, we revise them to get their true labels.
We exploit the observation that a sample containing adversarial perturbations has a possibility of returning to its true class after revision.
We demonstrate that our ReabsNet outperforms the state-of-the-art defense method under various adversarial attacks.
This paper presents a study of stability improvement in a single machine connected to an infinite bus (SMIB) power system by using a static compensator (STATCOM).
The gains of the Proportional-Integral-Derivative (PID) controller in the STATCOM are optimized by a heuristic technique based on Particle Swarm Optimization (PSO).
Further, Bacterial Foraging Optimization (BFO), an alternative heuristic method, is also applied to select optimal gains of the PID controller.
The performance of the STATCOM with the above soft-computing techniques is studied and compared with the conventional PID controller under various scenarios.
The simulation results are accompanied with performance indices based quantitative analysis.
The analysis clearly signifies the robustness of the new scheme in terms of stability and voltage regulation when compared with conventional PID.
Compressive sensing (CS) is a new approach for the acquisition and recovery of sparse signals and images that enables sampling rates significantly below the classical Nyquist rate.
Despite significant progress in the theory and methods of CS, little headway has been made in compressive video acquisition and recovery.
Video CS is complicated by the ephemeral nature of dynamic events, which makes direct extensions of standard CS imaging architectures and signal models difficult.
In this paper, we develop a new framework for video CS for dynamic textured scenes that models the evolution of the scene as a linear dynamical system (LDS).
This reduces the video recovery problem to first estimating the model parameters of the LDS from compressive measurements, and then reconstructing the image frames.
We exploit the low-dimensional dynamic parameters (the state sequence) and high-dimensional static parameters (the observation matrix) of the LDS to devise a novel compressive measurement strategy that measures only the dynamic part of the scene at each instant and accumulates measurements over time to estimate the static parameters.
This enables us to lower the compressive measurement rate considerably.
We validate our approach with a range of experiments involving video recovery, sensing of hyperspectral data, and classification of dynamic scenes from compressive data.
Together, these applications demonstrate the effectiveness of the approach.
We consider distributed elections, where there is a center and k sites.
In such distributed elections, each voter has preferences over some set of candidates, and each voter is assigned to exactly one site such that each site is aware only of the voters assigned to it.
The center is able to directly communicate with all sites.
We are interested in designing communication-efficient protocols, allowing the center to maintain a candidate which, with arbitrary high probability, is guaranteed to be a winner, or at least close to being a winner.
We consider various single-winner voting rules, such as variants of Approval voting and scoring rules, tournament-based voting rules, and several round-based voting rules.
For the voting rules we consider, we show that, using communication which is logarithmic in the number of voters, it is possible for the center to maintain such approximate winners; that is, upon a query at any time the center can immediately return a candidate which is guaranteed to be an approximate winner with high probability.
We complement our protocols with lower bounds.
Our results have implications in various scenarios, such as aggregating customer preferences in online shopping websites or supermarket chains and collecting votes from different polling stations of political elections.
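To illustrate the flavor of such communication-efficient protocols (a hypothetical sketch, not one of the paper's protocols), a center can estimate a plurality winner from a small uniform sample of ballots; by a Chernoff bound, a sample of size logarithmic in the number of voters suffices to identify an approximate winner with high probability:

```python
import random

def approx_plurality(votes, sample_size, seed=0):
    """Estimate the plurality winner from a uniform sample of ballots.
    With O(log n / eps^2) samples the estimate is an eps-approximate
    winner with high probability (Chernoff bound)."""
    rng = random.Random(seed)
    sample = rng.choices(votes, k=sample_size)
    counts = {}
    for v in sample:
        counts[v] = counts.get(v, 0) + 1
    return max(counts, key=counts.get)

# 60% / 30% / 10% split: 400 samples identify the leader
# with overwhelming probability, independent of the electorate size.
votes = ["a"] * 6000 + ["b"] * 3000 + ["c"] * 1000
winner = approx_plurality(votes, 400)
```

Note that the sample size is independent of the number of voters; only the desired approximation gap and failure probability matter.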
A recurrent neural network model of phonological pattern learning is proposed.
The model is a relatively simple neural network with one recurrent layer, and displays biases in learning that mimic observed biases in human learning.
Single-feature patterns are learned faster than two-feature patterns, and vowel or consonant-only patterns are learned faster than patterns involving vowels and consonants, mimicking the results of laboratory learning experiments.
In non-recurrent models, capturing these biases requires the use of alpha features or some other representation of repeated features, but with a recurrent neural network, these elaborations are not necessary.
E-voting systems (EVS) have potential advantages over many existing voting schemes. Security, transparency, accuracy, and reliability are the major concerns in these systems. EVS continue to grow as technology advances. They are inexpensive and efficient, as the resources become reusable. Fast and accurate computation of results with voter privacy is an added advantage. In the proposed system, we make use of a secret sharing technique and secure multi-party computation (SMC) to achieve security and reliability. Secret sharing is an important technique used for SMC.
Multi-party computation is typically accomplished using secret sharing, by making shares of the input and manipulating the shares to compute a function of the input. The proposed system makes use of a bitwise representation of votes, and only the shares are used for transmission and computation of the result. Secure sum evaluation can be done with shares distributed using Shamir's secret sharing scheme. The scheme is hence secure and reliable and does not make any number-theoretic assumptions for security. We also propose a unique method which calculates the candidates' individual votes while preserving anonymity.
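As a minimal sketch of the secure-sum building block described above (the field prime, threshold, and number of talliers here are illustrative choices, not the paper's parameters): Shamir sharing is linear, so summing shares component-wise yields shares of the vote tally, which can then be reconstructed without revealing any individual vote.

```python
import random

PRIME = 2**61 - 1  # illustrative Mersenne prime defining the field

def make_shares(secret, k, n):
    """Split `secret` into n Shamir shares; any k reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the shared value."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

# Secure sum: each voter shares a 0/1 vote; talliers add shares
# pointwise, then reconstruct only the aggregate tally.
votes = [1, 0, 1, 1]
all_shares = [make_shares(v, 3, 5) for v in votes]
summed = [(x, sum(s[i][1] for s in all_shares) % PRIME)
          for i, (x, _) in enumerate(all_shares[0])]
tally = reconstruct(summed[:3])  # any 3 of the 5 summed shares suffice
```

Because only summed shares are ever combined, no tallier learns an individual vote, matching the anonymity goal stated above.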
We present a cheap, lightweight, and fast fruit counting pipeline that uses a single monocular camera.
Our pipeline, which relies only on a monocular camera, achieves counting performance comparable to a state-of-the-art fruit counting system that utilizes an expensive sensor suite, including LiDAR and GPS/INS, on a mango dataset.
Our monocular camera pipeline begins with a fruit detection component that uses a deep neural network.
It then uses semantic structure from motion (SFM) to convert these detections into fruit counts by estimating landmark locations of the fruit in 3D, and using these landmarks to identify double counting scenarios.
There are many benefits of developing a low cost and lightweight fruit counting system, including applicability to agriculture in developing countries, where monetary constraints or unstructured environments necessitate cheaper hardware solutions.
Computer algorithms are written with the intent that when run they perform a useful function.
Typically any information obtained is unknown until the algorithm is run.
However, if the behavior of an algorithm can be fully described by precomputing just once how this algorithm will respond when executed on any input, this precomputed result provides a complete specification for all solutions in the problem domain.
We apply this idea to a previous anomaly detection algorithm, and in doing so transform it from one that merely detects individual anomalies when asked to discover potentially anomalous values, into an algorithm also capable of generating a complete specification for those values it would deem to be anomalous.
This specification is derived by examining no more than a small amount of training data, can be obtained in very small constant time, and is inherently far more useful than results obtained by repeated execution of this tool.
For example, armed with such a specification one can ask how close an anomaly is to being deemed normal, and can validate this answer not by exhaustively testing the algorithm but by examining if the specification so generated is indeed correct.
This powerful idea can be applied to any algorithm whose runtime behavior can be recovered from its construction and so has wide applicability.
As grids are in essence heterogeneous, dynamic, shared and distributed environments, managing these kinds of platforms efficiently is extremely complex.
A promising scalable approach to deal with these intricacies is the design of self-managing, autonomic applications.
Autonomic applications adapt their execution by considering knowledge about their own behaviour and environmental conditions. QoS-based user-driven scheduling for the grid provides the self-optimizing ability in autonomic applications.
Computational grids allow a user to solve large-scale problems by spreading a single large computation across multiple machines in different physical locations.
The QoS-based user-driven scheduler for the grid also provides reliability of grid systems and increases the performance of the grid by reducing the execution time of jobs through scheduling policies defined by the user.
The main aim of this paper is to distribute the computational load among the available grid nodes, to develop a QoS-based scheduling algorithm for the grid, and to make the grid more reliable. A grid computing system differs from conventional distributed computing systems in its focus on large-scale resource sharing, where processors and communication have significant influence on grid computing reliability.
Reliability capabilities are initiated by end users from within the applications they submit to the grid for execution. Reliability also depends on the infrastructure and management services that perform essential functions necessary for grid systems to operate, such as resource allocation and scheduling.
Cover song detection is a very relevant task in Music Information Retrieval (MIR) studies and has been mainly addressed using audio-based systems.
Despite its potential impact in industrial contexts, low performances and lack of scalability have prevented such systems from being adopted in practice for large applications.
In this work, we investigate whether textual music information (such as metadata and lyrics) can be used along with audio for the large-scale cover identification problem in a wide digital music library.
We benchmark this problem using standard text and state of the art audio similarity measures.
Our study shows that these methods can significantly increase the accuracy and scalability of cover detection systems on the Million Song Dataset (MSD) and Second Hand Songs (SHS) datasets.
By only leveraging standard tf-idf based text similarity measures on song titles and lyrics, we achieve a 35.5% absolute increase in mean average precision compared to current scalable audio content-based state-of-the-art methods on the MSD.
These experimental results suggest that researchers can be encouraged to leverage and identify more sophisticated NLP-based techniques to improve current cover song identification systems in digital music libraries with metadata.
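A minimal sketch of the kind of tf-idf title similarity used as a baseline of this sort (the tokenized titles and the idf smoothing are illustrative, not the paper's exact setup): covers of the same song tend to share most title tokens, so their tf-idf vectors have high cosine similarity.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build tf-idf vectors for a small corpus of token lists."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed idf variant
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical song titles, tokenized
titles = ["knocking on heaven s door".split(),
          "knockin on heaven s door live".split(),
          "stairway to heaven".split()]
vecs = tfidf_vectors(titles)
```

Here the first two titles (a likely cover pair) score far higher than the unrelated third, despite the spelling variant "knockin".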
The key contribution of this work is to develop transmitter and receiver algorithms in discrete-time for turbo-coded offset QPSK signals.
The proposed synchronization and detection techniques perform effectively at an SNR per bit close to 1.5 dB, in the presence of a frequency offset as large as 30 % of the symbol-rate and a clock offset of 25 ppm (parts per million).
Due to the use of up-sampling and matched filtering and a feedforward approach, the acquisition time for clock recovery is just equal to the length of the preamble.
The carrier recovery algorithm does not exhibit any phase ambiguity, alleviating the need for differentially encoding the data at the transmitter.
The proposed techniques are well suited for discrete-time implementation.
Full-Duplex (FD) wireless and Device-to-Device (D2D) communication are two promising technologies that aspire to enhance the spectrum and energy efficiency of wireless networks, thus fulfilling key requirements of the 5th generation (5G) of mobile networks.
Both technologies, however, generate excessive interference, which, if not managed effectively, threatens to compromise system performance.
To this direction, we propose two transmission policies that enhance the communication of two interfering FD-enabled D2D pairs, derived from game theory and optimization theory.
The game-theoretic policy allows the pairs to choose their transmission modes independently, while the optimization-based policy maximizes their throughput; both achieve significant gains when the pairs interfere strongly with each other.
Kudekar et al. proved that the belief-propagation (BP) threshold for low-density parity-check codes can be boosted up to the maximum-a-posteriori (MAP) threshold by spatial coupling.
In this paper, spatial coupling is applied to randomly-spread code-division multiple-access (CDMA) systems in order to improve the performance of BP-based multiuser detection (MUD).
Spatially-coupled CDMA systems can be regarded as multi-code CDMA systems with two transmission phases.
The large-system analysis shows that spatial coupling can improve the BP performance, while there is a gap between the BP performance and the individually-optimal (IO) performance.
Guetzli is a new JPEG encoder that aims to produce visually indistinguishable images at a lower bit-rate than other common JPEG encoders.
It optimizes both the JPEG global quantization tables and the DCT coefficient values in each JPEG block using a closed-loop optimizer.
Guetzli uses Butteraugli, our perceptual distance metric, as the source of feedback in its optimization process.
We reach a 29-45% reduction in data size for a given perceptual distance, according to Butteraugli, in comparison to other compressors we tried.
Guetzli's computation is currently extremely slow, which limits its applicability to compressing static content and serving as a proof-of-concept that we can achieve significant reductions in size by combining advanced psychovisual models with lossy compression techniques.
In this paper, we make use of channel symmetry properties to determine the capacity region of three types of two-way networks: (a) two-user memoryless two-way channels (TWCs), (b) two-user TWCs with memory, and (c) three-user multiaccess/degraded broadcast (MA/DB) TWCs.
For each network, symmetry conditions under which Shannon's random coding inner bound is tight are given.
For two-user memoryless TWCs, prior results are substantially generalized by viewing a TWC as two interacting state-dependent one-way channels.
The capacity of symmetric TWCs with memory, whose outputs are functions of the inputs and independent stationary and ergodic noise processes, is also obtained.
Moreover, various channel symmetry properties under which Shannon's inner bound is tight are identified for three-user MA/DB TWCs.
The results not only enlarge the class of symmetric TWCs whose capacity region can be exactly determined, but also imply that adaptive coding, which does not improve capacity, is unnecessary for such channels.
Convolutional neural networks are among the most successful image classifiers, but the adaptation of their network architecture to a particular problem is computationally expensive.
We show that an evolutionary algorithm saves training time during the network architecture optimization, if learned network weights are inherited over generations by Lamarckian evolution.
Experiments on typical image datasets show similar or significantly better test accuracies and improved convergence speeds compared to two different baselines without weight inheritance.
On CIFAR-10 and CIFAR-100, a 75% improvement in data efficiency is observed.
This work investigates a central problem in steganography, that is: How much data can safely be hidden without being detected?
To answer this question, a formal definition of steganographic capacity is presented.
Once this has been defined, a general formula for the capacity is developed.
The formula is applicable to a very broad spectrum of channels due to the use of an information-spectrum approach.
This approach allows for the analysis of arbitrary steganalyzers as well as non-stationary, non-ergodic encoder and attack channels.
After the general formula is presented, various simplifications are applied to gain insight into example hiding and detection methodologies.
Finally, the context and applications of the work are summarized in a general discussion.
Mobile manipulation tasks are one of the key challenges in the field of search and rescue (SAR) robotics requiring robots with flexible locomotion and manipulation abilities.
Since the tasks are mostly unknown in advance, the robot has to adapt to a wide variety of terrains and workspaces during a mission.
The centaur-like robot Centauro has a hybrid legged-wheeled base and an anthropomorphic upper body to carry out complex tasks in environments too dangerous for humans.
Due to its high number of degrees of freedom, controlling the robot with direct teleoperation approaches is challenging and exhausting.
Supervised autonomy approaches are promising to increase quality and speed of control while keeping the flexibility to solve unknown tasks.
We developed a set of operator assistance functionalities with different levels of autonomy to control the robot for challenging locomotion and manipulation tasks.
The integrated system was evaluated in disaster response scenarios and showed promising performance.
This article presents a new methodology called deep ToC that estimates the solutions of partial differential equations (PDEs) by combining neural networks with the Theory of Connections (ToC).
ToC is used to transform PDEs with boundary conditions into unconstrained optimization problems by embedding the boundary conditions into a "constrained expression" that contains a neural network.
The loss function for the unconstrained optimization problem is taken to be the square of the residual of the PDE.
Then, the neural network is trained in an unsupervised manner to solve the unconstrained optimization problem.
This methodology has two major advantages over other popular methods used to estimate the solutions of PDEs.
First, this methodology does not need to discretize the domain into a grid, which becomes prohibitive as the dimensionality of the PDE increases.
Instead, this methodology randomly samples points from the domain during the training phase.
Second, after training, this methodology represents a closed form, analytical, differentiable approximation of the solution throughout the entire training domain.
In contrast, other popular methods require interpolation if the estimated solution is desired at points that do not lie on the discretized grid.
The deep ToC method for estimating the solution of PDEs is demonstrated on four problems with a variety of boundary conditions.
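The "constrained expression" idea can be shown in miniature for Dirichlet conditions y(0) = y0, y(1) = y1: for any free function g (the role played by the neural network), the expression below satisfies the boundary conditions exactly. This is a simplified one-dimensional sketch, not the paper's general formulation for PDEs.

```python
import math

def constrained(g, y0, y1):
    """ToC-style constrained expression for y(0) = y0, y(1) = y1.
    Whatever g is, the boundary terms cancel g's boundary values,
    so the result meets the conditions exactly; training only has
    to shape g so the interior residual of the equation vanishes."""
    return lambda x: g(x) + (1 - x) * (y0 - g(0.0)) + x * (y1 - g(1.0))

g = math.sin                      # arbitrary stand-in for the trained network
y = constrained(g, 2.0, -3.0)     # hypothetical boundary values
```

Because the boundary conditions hold identically, the unconstrained optimization over g only needs to minimize the squared PDE residual at randomly sampled interior points, as described above.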
Music recommender systems have become a key technology supporting the access to increasingly larger music catalogs in on-line music streaming services, on-line music shops, and private collections.
The interaction of users with large music catalogs is a complex phenomenon researched from different disciplines.
We survey our works investigating the machine learning and data mining aspects of hybrid music recommender systems (i.e., systems that integrate different recommendation techniques).
We proposed hybrid music recommender systems based solely on data and robust to the so-called "cold-start problem" for new music items, favoring the discovery of relevant but non-popular music.
We thoroughly studied the specific task of music playlist continuation, by analyzing fundamental playlist characteristics, song feature representations, and the relationship between playlists and the songs therein.
A subspace projection to improve channel estimation in massive multi-antenna systems is proposed and analyzed.
Together with power-controlled hand-off, it can mitigate the pilot contamination problem without the need for coordination among cells.
The proposed method is blind in the sense that it does not require pilot data to find the appropriate subspace.
It is based on the theory of large random matrices that predicts that the eigenvalue spectra of large sample covariance matrices can asymptotically decompose into disjoint bulks as the matrix size grows large.
Random matrix and free probability theory are utilized to predict under which system parameters such a bulk decomposition takes place.
Simulation results are provided to confirm that the proposed method outperforms conventional linear channel estimation if bulk separation occurs.
Recurrent neural networks (RNNs) are widely used to model sequential data but their non-linear dependencies between sequence elements prevent parallelizing training over sequence length.
We show that the training of RNNs with only linear sequential dependencies can be parallelized over the sequence length using the parallel scan algorithm, leading to rapid training on long sequences even with small minibatch sizes.
We develop a parallel linear recurrence CUDA kernel and show that it can be applied to immediately speed up training and inference of several state of the art RNN architectures by up to 9x.
We abstract recent work on linear RNNs into a new framework of linear surrogate RNNs and develop a linear surrogate model for the long short-term memory unit, the GILR-LSTM, that utilizes parallel linear recurrence.
We extend sequence learning to new extremely long sequence regimes that were previously out of reach by successfully training a GILR-LSTM on a synthetic sequence classification task with a one million timestep dependency.
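The core trick is that composing two linear steps h → a·h + b is associative, which is what lets a parallel scan evaluate the whole recurrence. A minimal pure-Python sketch (the scan is written as a loop here; in the paper the combine levels run as a parallel CUDA kernel):

```python
def combine(p, q):
    """Compose two linear steps h -> a*h + b.
    Applying (a1, b1) then (a2, b2) gives h -> a2*a1*h + (a2*b1 + b2);
    this operator is associative, enabling a parallel scan."""
    a1, b1 = p
    a2, b2 = q
    return (a1 * a2, a2 * b1 + b2)

def scan(ops):
    """Inclusive prefix scan over `combine` (sequential stand-in;
    a Blelloch/Hillis-Steele scan parallelizes it over log(T) levels)."""
    out = [ops[0]]
    for op in ops[1:]:
        out.append(combine(out[-1], op))
    return out

# h_t = a_t * h_{t-1} + b_t with h_0 = 0
a = [0.5, 2.0, 1.0, 0.25]
b = [1.0, -1.0, 3.0, 2.0]
prefix = scan(list(zip(a, b)))
h_scan = [bt for _, bt in prefix]  # each b-component is h_t given h_0 = 0

# Reference: plain sequential evaluation of the recurrence
h, h_seq = 0.0, []
for at, bt in zip(a, b):
    h = at * h + bt
    h_seq.append(h)
```

The b-component of each prefix composition equals the hidden state at that step, so the scan reproduces the sequential recurrence exactly.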
DNA sequencing is the physical or biochemical process of identifying the location of the four bases (Adenine, Guanine, Cytosine, Thymine) in a DNA strand.
As semiconductor technology revolutionized computing, DNA sequencing technology (termed Next Generation Sequencing, NGS) revolutionized genomic research.
Modern NGS platforms can sequence hundreds of millions of short DNA fragments in parallel.
The output short DNA fragments from NGS platforms are termed reads.
Mapping each output read to a reference genome of the same species (i.e., read mapping) is a common critical first step in a rich and diverse set of emerging bioinformatics applications.
The importance of read mapping has motivated various sequence alignment and mapping algorithms, which are starting to fall short of tackling the growing scale of the problem.
Mapping is a search-heavy, memory-intensive operation that barely requires complex floating-point arithmetic; it can therefore greatly benefit from in- or near-memory processing, where non-volatile memory can accommodate the large memory footprint in an area- and energy-efficient manner.
This paper introduces a scalable, energy-efficient high-throughput near (non-volatile) memory read mapping accelerator: BioMAP.
Instead of optimizing an algorithm developed for general-purpose computers or GPUs, BioMAP rethinks the algorithm and accelerator design together from the ground up.
Thereby BioMAP can improve the throughput of read mapping by 4.0 times while reducing the energy consumption by 26.2 times when compared to a highly-optimized algorithm for modern GPUs.
Today's wireless networks are characterized by fixed spectrum assignment policy.
The limited available spectrum and the inefficiency in the spectrum usage necessitate a new communication paradigm to exploit the existing wireless spectrum opportunistically.
Cognitive radio is a paradigm for wireless communication in which either a network or a wireless node changes its transmission or reception parameters to communicate efficiently avoiding interference with licensed or unlicensed users.
In this work, a fuzzy logic based system for spectrum management is proposed where the radio can share unused spectrum depending on some parameters like distance, signal strength, node velocity and availability of unused spectrum.
The system is simulated and is found to give satisfactory results.
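A toy sketch of the fuzzy-inference style such a system might use (the membership functions, ranges, and the single rule are hypothetical, not the paper's rule base): crisp inputs like distance and signal strength are fuzzified, and rules combine them with a min operator to score a spectrum-sharing decision.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical memberships for two of the inputs named above
def near(d):    return tri(d, -1.0, 0.0, 50.0)    # distance in metres
def strong(s):  return tri(s, -80.0, -40.0, 0.0)  # signal strength in dBm

def share_score(distance, signal):
    # Mamdani-style min for AND, with a single illustrative rule:
    # IF node is near AND signal is strong THEN share spectrum
    return min(near(distance), strong(signal))
```

A real system would aggregate several such rules (also over node velocity and spectrum availability) and defuzzify the result.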
Cars can nowadays record several thousands of signals through the CAN bus technology and potentially provide real-time information on the car, the driver and the surrounding environment.
This paper proposes a new method for the analysis and classification of driver behavior using a selected subset of CAN bus signals, specifically gas pedal position, brake pedal pressure, steering wheel angle, steering wheel momentum, velocity, RPM, frontal and lateral acceleration.
Data has been collected in a completely uncontrolled experiment, where 64 people drove 10 cars for a total of over 2000 driving trips without any type of pre-determined driving instruction, on a wide variety of road scenarios.
We propose an unsupervised learning technique that clusters drivers in different groups, and offers a validation method to test the robustness of clustering in a wide range of experimental settings.
The minimal amount of data needed to preserve robust driver clustering is also computed.
The presented study provides a new methodology for near-real-time classification of driver behavior in uncontrolled environments.
Driven by applications such as Micro Aerial Vehicles (MAVs) and driverless cars, localization has become an active research topic in the past decade.
In recent years, Ultra Wideband (UWB) emerged as a promising technology because of its impressive performance in both indoor and outdoor positioning.
However, algorithms relying only on the UWB sensor usually result in high latency and low bandwidth, which is undesirable in some situations, such as controlling a MAV.
To alleviate this problem, an Extended Kalman Filter (EKF) based algorithm is proposed to fuse the Inertial Measurement Unit (IMU) and UWB measurements, achieving 80 Hz 3D localization with significantly improved accuracy and almost no delay.
To verify the effectiveness and reliability of the proposed approach, a swarm of 6 MAVs is set up to perform a light show in an indoor exhibition hall.
Video and source code are available at https://github.com/lijx10/uwb-localization
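A scalar toy version of the IMU/UWB fusion idea (constant-acceleration motion, made-up noise parameters; the paper's filter is a full 3D EKF): the IMU drives a high-rate predict step, while a ten-times-slower UWB fix corrects accumulated drift, so the fused estimate is available at the IMU rate.

```python
# 1-D Kalman-style filter: IMU acceleration drives the predict step at
# high rate; a slower UWB position fix corrects drift. P is a scalar
# stand-in for the full state covariance.
def predict(x, v, P, a, dt, q):
    x = x + v * dt + 0.5 * a * dt * dt  # integrate IMU acceleration
    v = v + a * dt
    return x, v, P + q                  # uncertainty grows between fixes

def update(x, v, P, z, r):
    K = P / (P + r)                     # Kalman gain
    x = x + K * (z - x)                 # correct position with UWB fix
    return x, v, (1 - K) * P

x, v, P = 0.0, 0.0, 1.0
true_a = 1.0
for step in range(100):                 # 100 IMU steps of 10 ms = 1 s
    x, v, P = predict(x, v, P, true_a, 0.01, 0.01)
    if step % 10 == 9:                  # UWB fix at 1/10 of the IMU rate
        t = (step + 1) * 0.01
        x, v, P = update(x, v, P, 0.5 * true_a * t * t, 0.05)
```

With noiseless inputs the estimate converges to the true position 0.5·a·t² while the covariance shrinks, illustrating why the fused output has both high rate and low drift.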
The course description provided by instructors is an important piece of information as it defines what is expected from the instructor and what he/she is going to deliver during a particular course.
One of the key components of a course description is the Learning Outcomes section.
The contents of this section are used by program managers who are tasked to compare and match two different courses during the development of Transfer Agreements between different institutions.
This research introduces the development of visual tools for understanding the two different courses and making comparisons.
We designed methods to extract the text from a course description document, developed an algorithm to perform semantic analysis, and displayed the results in a web interface.
We achieve the intermediate results of the research, which include extracting, analyzing, and visualizing the data.
Machine learning based solutions have been successfully employed for automatic detection of malware in Android applications.
However, machine learning models are known to lack robustness against inputs crafted by an adversary.
So far, adversarial examples could only deceive Android malware detectors that rely on syntactic features, and the perturbations could only be implemented by simply modifying the Android manifest.
As recent Android malware detectors rely more on semantic features from Dalvik bytecode than on the manifest, existing attacking/defending methods are no longer effective.
In this paper, we introduce a new highly-effective attack that generates adversarial examples of Android malware and evades being detected by the current models.
To this end, we propose a method of applying optimal perturbations onto Android APK using a substitute model.
Based on the transferability concept, the perturbations that successfully deceive the substitute model are likely to deceive the original models as well.
We develop an automated tool to generate the adversarial examples without human intervention to apply the attacks.
In contrast to existing works, the adversarial examples crafted by our method can also deceive recent machine learning based detectors that rely on semantic features such as control-flow-graph.
The perturbations can also be implemented directly onto the APK's Dalvik bytecode, rather than the Android manifest, to evade recent detectors.
We evaluated the proposed manipulation methods for adversarial examples using the same datasets that Drebin and MaMaDroid used (5879 malware samples).
Our results show that the malware detection rates decreased from 96% to 1% in MaMaDroid and from 97% to 1% in Drebin, with just a small distortion generated by our adversarial example manipulation method.
We consider the provision of public goods on networks of strategic agents.
We study different effort outcomes of these network games, namely, the Nash equilibria, Pareto efficient effort profiles, and semi-cooperative equilibria (effort profiles resulting from interactions among coalitions of agents).
We identify necessary and sufficient conditions on the structure of the network for the uniqueness of the Nash equilibrium.
We show that our finding unifies (and strengthens) existing results in the literature.
We also identify conditions for the existence of Nash equilibria for the subclasses of games at the two extremes of our model, namely games of strategic complements and games of strategic substitutes.
We provide a graph-theoretical interpretation of agents' efforts at the Nash equilibrium, as well as the Pareto efficient outcomes and semi-cooperative equilibria, by linking an agent's decision to her centrality in the interaction network.
Using this connection, we separate the effects of incoming and outgoing edges on agents' efforts and uncover an alternating effect over walks of different length in the network.
In this paper, we study a nonconvex continuous relaxation of MAP inference in discrete Markov random fields (MRFs).
We show that for arbitrary MRFs, this relaxation is tight, and a discrete stationary point of it can be easily reached by a simple block coordinate descent algorithm.
In addition, we study the resolution of this relaxation using popular gradient methods, and further propose a more effective solution using a multilinear decomposition framework based on the alternating direction method of multipliers (ADMM).
Experiments on many real-world problems demonstrate that the proposed ADMM significantly outperforms other nonconvex relaxation based methods, and compares favorably with state of the art MRF optimization algorithms in different settings.
Given a set V of n elements on m attributes, we want to find a partition of V on the minimum number of clusters such that the associated R-squared ratio is at least a given threshold.
We denote this problem as Goal Clustering (GC).
This problem represents a new perspective, characterizing a different methodology within unsupervised non-hierarchical clustering.
In effect, while in k-means we set the number of clusters in advance and then test the associated R-squared ratio, in GC we set an R-squared lower threshold in advance and minimize k. We present two Variable Neighborhood Search (VNS) based heuristics for the GC problem.
The two heuristics use different methodologies to start the VNS algorithms.
One is based on Ward's construction and the other resorts to the k-means method.
Computational tests are conducted over a set of large sized instances in order to show the performance of the two proposed heuristics.
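The GC perspective (fix the R-squared threshold, minimize k) can be sketched by wrapping a plain k-means inner solver, in place of the paper's VNS heuristics. The data, threshold, and restart count below are illustrative assumptions.

```python
import numpy as np

def r_squared(X, labels):
    """Between-cluster share of total variance: 1 - SSW/SST."""
    sst = ((X - X.mean(axis=0)) ** 2).sum()
    ssw = sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
              for c in np.unique(labels))
    return 1.0 - ssw / sst

def kmeans(X, k, iters=30, seed=0):
    """Minimal Lloyd's algorithm (stand-in for the paper's VNS heuristics)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# GC outer loop: grow k until the partition meets the R-squared threshold.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
threshold, k = 0.9, 1
while True:
    labels = max((kmeans(X, k, seed=s) for s in range(5)),
                 key=lambda l: r_squared(X, l))
    if r_squared(X, labels) >= threshold:
        break
    k += 1
print(k, round(r_squared(X, labels), 3))
```

On this two-blob toy data, k = 1 gives an R-squared of 0, so the loop necessarily moves to a larger k before the threshold is met.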
Assisted text input techniques can save time and effort and improve text quality.
In this paper, we investigate how grounded and conditional extensions to standard neural language models can bring improvements in the tasks of word prediction and completion.
These extensions incorporate a structured knowledge base and numerical values from the text into the context used to predict the next word.
Our automated evaluation on a clinical dataset shows extended models significantly outperform standard models.
Our best system uses both conditioning and grounding, because of their orthogonal benefits.
For word prediction with a list of 5 suggestions, it improves recall from 25.03% to 71.28%, and for word completion it improves keystroke savings from 34.35% to 44.81%, where the theoretical bound for this dataset is 58.78%.
We also perform a qualitative investigation of how models with lower perplexity occasionally fare better at the tasks.
We found that at test time numbers have more influence on the document level than on individual word probabilities.
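The keystroke-savings metric can be made concrete with a toy completion loop. The one-suggestion protocol, vocabulary, and text below are illustrative assumptions, not the paper's 5-suggestion clinical setup: after each typed character the system offers one completion, which is accepted (one extra keystroke) as soon as it matches the target word.

```python
# Toy vocabulary ranked by assumed frequency (list order = rank).
vocab = ["patient", "pathology", "pain", "pressure"]

def predict(prefix):
    """Return the highest-ranked vocabulary word matching the prefix."""
    for w in vocab:
        if w.startswith(prefix):
            return w
    return None

def keystrokes_needed(word, predict):
    """Characters typed, plus one acceptance keystroke if completed early."""
    for typed in range(len(word) + 1):
        if predict(word[:typed]) == word:
            return typed + (1 if typed < len(word) else 0)
    return len(word)

text = ["patient", "pain", "pressure"]
pressed = sum(keystrokes_needed(w, predict) for w in text)
total = sum(len(w) for w in text)
savings = 1 - pressed / total
print(round(savings, 3))  # → 0.579
```

Here 8 keystrokes replace 19 characters, so keystroke savings are 1 - 8/19 ≈ 57.9%; the paper's metric is the same ratio computed over its clinical test set.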
Temporal object detection has attracted significant attention, but most popular detection methods can not leverage the rich temporal information in videos.
Very recently, many different algorithms have been developed for video detection task, but real-time online approaches are frequently deficient.
In this paper, based on an attention mechanism and convolutional long short-term memory (ConvLSTM), we propose a temporal single-shot detector (TSSD) for real-world detection.
Distinct from previous methods, we take aim at temporally integrating pyramidal feature hierarchy using ConvLSTM, and design a novel structure including a low-level temporal unit as well as a high-level one (HL-TU) for multi-scale feature maps.
Moreover, we develop a creative temporal analysis unit, namely, attentional ConvLSTM (AC-LSTM), in which a temporal attention module is specially tailored for background suppression and scale suppression while a ConvLSTM integrates attention-aware features through time.
An association loss is designed for temporal coherence.
Besides, online tubelet analysis (OTA) is exploited for identification.
Finally, our method is evaluated on the ImageNet VID and 2DMOT15 datasets.
Extensive comparisons on the detection and tracking capability validate the superiority of the proposed approach.
Consequently, the developed TSSD-OTA runs considerably faster and achieves an overall competitive performance in terms of detection and tracking.
The source code will be made available.
An energy-harvesting sensor node that is sending status updates to a destination is considered.
The sensor is equipped with a battery of finite size to save its incoming energy, and consumes one unit of energy per status update transmission, which is delivered to the destination instantly over an error-free channel.
The setting is online, in which the harvested energy is revealed to the sensor causally over time, and the goal is to design a status update transmission policy such that the long term average age of information (AoI) is minimized.
AoI is defined as the time elapsed since the latest update reached the destination.
Two energy arrival models are considered: a random battery recharge (RBR) model, and an incremental battery recharge (IBR) model.
In both models, energy arrives according to a Poisson process with unit rate, with values that completely fill up the battery in the RBR model, and with values that fill up the battery incrementally, unit-by-unit, in the IBR model.
The key approach to characterizing the optimal status update policy for both models is showing the optimality of renewal policies, in which the inter-update times follow a specific renewal process that depends on the energy arrival model and the battery size.
It is then shown that the optimal renewal policy has an energy-dependent threshold structure, in which the sensor sends a status update only if the AoI grows above a certain threshold that depends on the energy available.
For both the RBR and the IBR models, the optimal energy-dependent thresholds are characterized explicitly, i.e., in closed-form, in terms of the optimal long term average AoI.
It is also shown that the optimal thresholds are monotonically decreasing in the energy available in the battery, and that the smallest threshold, which comes in effect when the battery is full, is equal to the optimal long term average AoI.
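The energy-dependent threshold structure can be simulated directly. The threshold values below are hypothetical (the paper derives the optimal ones in closed form); this sketch only shows how such a policy operates under the IBR model, with unit-rate Poisson energy arrivals and a size-B battery.

```python
import random

random.seed(0)
B = 3
tau = {1: 1.5, 2: 1.0, 3: 0.7}   # hypothetical thresholds, decreasing in energy

def aoi_area(t0, t1, last):
    """Integral of the AoI (s - last) from t0 to t1."""
    return ((t1 - last) ** 2 - (t0 - last) ** 2) / 2.0

t = last = area = 0.0
energy = 0
horizon = 20000.0
while t < horizon:
    arrival = t + random.expovariate(1.0)      # next unit of harvested energy
    # While energy is available, update whenever AoI reaches tau[energy].
    while energy > 0:
        upd = max(t, last + tau[energy])
        if upd >= arrival:
            break
        area += aoi_area(t, upd, last)
        t, last, energy = upd, upd, energy - 1
    area += aoi_area(t, arrival, last)
    t = arrival
    energy = min(B, energy + 1)                # IBR: battery fills unit-by-unit

print(area / t)   # long term average AoI under this (suboptimal) policy
```

Lowering the threshold when the battery is fuller spends surplus energy on fresher updates, which is exactly the monotone structure the paper proves optimal.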
In this note, we generalize the results of arXiv:0901.2703v1. We show that all one-way quantum finite automaton (QFA) models that are at least as general as Kondacs-Watrous QFA's are equivalent in power to classical probabilistic finite automata in the unbounded-error setting.
Unlike their probabilistic counterparts, allowing the tape head to stay put for some steps during its traversal of the input does enlarge the class of languages recognized by such QFA's with unbounded error.
(Note that, the proof of Theorem 1 in the abstract was presented in the previous version (arXiv:0901.2703v1).)
Information hiding is an active area of research where secret information is embedded in innocent-looking carriers such as images and videos for hiding its existence while maintaining their visual quality.
Researchers have presented various image steganographic techniques over the last decade, focusing on payload and image quality.
However, there is a trade-off between these two metrics and keeping a better balance between them is still a challenging issue.
In addition, the existing methods fail to achieve better security due to direct embedding of secret data inside images without encryption consideration, making data extraction relatively easy for adversaries.
Therefore, in this work, we propose a secure image steganographic framework based on stego key-directed adaptive least significant bit (SKA-LSB) substitution method and multi-level cryptography.
In the proposed scheme, stego key is encrypted using a two-level encryption algorithm (TLEA); secret data is encrypted using a multi-level encryption algorithm (MLEA), and the encrypted information is then embedded in the host image using an adaptive LSB substitution method, depending on secret key, red channel, MLEA, and sensitive contents.
The quantitative and qualitative experimental results indicate that the proposed framework maintains a better balance between image quality and security, achieving a reasonable payload with relatively less computational complexity, which confirms its effectiveness compared to other state-of-the-art techniques.
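A minimal sketch of LSB substitution with an encrypted bit stream illustrates the embedding and extraction steps. Plain XOR stands in for the paper's TLEA/MLEA, and there is no adaptivity to channel or content; the cover bytes, secret, and key are toy values.

```python
def embed(cover, secret_bits, key_bits):
    """Write key-encrypted secret bits into the LSB of each cover byte."""
    stego = cover.copy()
    for i, (s, k) in enumerate(zip(secret_bits, key_bits)):
        stego[i] = (stego[i] & ~1) | (s ^ k)   # replace LSB with encrypted bit
    return stego

def extract(stego, n, key_bits):
    """Read LSBs back and undo the XOR encryption."""
    return [(stego[i] & 1) ^ key_bits[i] for i in range(n)]

cover = [137, 200, 54, 99, 12, 250, 77, 31]    # hypothetical pixel bytes
secret = [1, 0, 1, 1, 0, 1, 0, 0]
key = [0, 1, 1, 0, 1, 0, 1, 1]                 # toy key stream

stego = embed(cover, secret, key)
print(extract(stego, len(secret), key) == secret)           # prints True
print([abs(a - b) for a, b in zip(cover, stego)])           # distortion <= 1
```

Each byte changes by at most one intensity level, which is why LSB schemes preserve visual quality; the encryption layer is what keeps extraction hard for an adversary without the key.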
Program transformations are widely used in synthesis, optimization, and maintenance of software.
Correctness of program transformations depends on preservation of some important properties of the input program.
By regarding programs as Kripke structures, many interesting properties of programs can be expressed in temporal logics.
In temporal logic, a formula is interpreted on a single program.
However, to prove correctness of transformations, we encounter formulae which contain some subformulae interpreted on the input program and some on the transformed program.
An example where such a situation arises is verification of optimizing program transformations applied by compilers.
In this paper, we present a logic called Temporal Transformation Logic (TTL) to reason about such formulae.
We consider different types of primitive transformations and present TTL inference rules for them.
Our definitions of program transformations and temporal logic operators are novel in their use of the boolean matrix algebra.
This results in specifications that are succinct and constructive.
Further, we use the boolean matrix algebra in a uniform manner to prove soundness of the TTL inference rules.
Multilingual spoken dialogue systems have gained prominence in the recent past necessitating the requirement for a front-end Language Identification (LID) system.
Most of the existing LID systems rely on modeling the language discriminative information from low-level acoustic features.
Due to the variabilities of speech (speaker and emotional variabilities, etc.), large-scale LID systems developed using low-level acoustic features suffer from a degradation in the performance.
In this work, we attempt to model the higher-level language discriminative phonotactic information for developing an LID system.
In this paper, the input speech signal is tokenized to phone sequences by using a language independent phone recognizer.
The language discriminative phonotactic information in the obtained phone sequences are modeled using statistical and recurrent neural network based language modeling approaches.
As this approach relies on higher-level phonotactic information, it is more robust to variabilities of speech.
The proposed approach is computationally lightweight, highly scalable, and can be used to complement existing LID systems.
The idea of style transfer has largely only been explored in image-based tasks, which we attribute in part to the specific nature of loss functions used for style transfer.
We propose a general formulation of style transfer as an extension of generative adversarial networks, by using a discriminator to regularize a generator with an otherwise separate loss function.
We apply our approach to the task of learning to play chess in the style of a specific player, and present empirical evidence for the viability of our approach.
In this paper, we present UNet++, a new, more powerful architecture for medical image segmentation.
Our architecture is essentially a deeply-supervised encoder-decoder network where the encoder and decoder sub-networks are connected through a series of nested, dense skip pathways.
The re-designed skip pathways aim at reducing the semantic gap between the feature maps of the encoder and decoder sub-networks.
We argue that the optimizer would deal with an easier learning task when the feature maps from the decoder and encoder networks are semantically similar.
We have evaluated UNet++ in comparison with U-Net and wide U-Net architectures across multiple medical image segmentation tasks: nodule segmentation in low-dose chest CT scans, nuclei segmentation in microscopy images, liver segmentation in abdominal CT scans, and polyp segmentation in colonoscopy videos.
Our experiments demonstrate that UNet++ with deep supervision achieves an average IoU gain of 3.9 and 3.4 points over U-Net and wide U-Net, respectively.
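The nested skip topology can be sketched as a connectivity table: node X[i][j] at depth i and column j receives all same-depth predecessors plus an upsampled feature map from the depth below. This is our reading of the re-designed pathways, with the network depth of 4 an assumption; only connectivity is shown, no learning.

```python
# Build the UNet++ skip connectivity pattern for a depth-4 backbone.
depth = 4
inputs = {}
for j in range(depth):            # columns (nesting levels)
    for i in range(depth - j):    # depths at this column
        if j == 0:
            # Backbone column: encoder features from the image downward.
            inputs[(i, 0)] = ["down" if i else "image"]
        else:
            # Dense same-depth skips plus one upsampled input from below.
            inputs[(i, j)] = ([f"X[{i}][{k}]" for k in range(j)]
                              + [f"up(X[{i+1}][{j-1}])"])

print(inputs[(0, 3)])  # → ['X[0][0]', 'X[0][1]', 'X[0][2]', 'up(X[1][2])']
```

The top-depth node of the last column aggregates every intermediate top-depth feature map, which is what narrows the semantic gap between encoder and decoder features.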
Neural network accelerators with low latency and low energy consumption are desirable for edge computing.
To create such accelerators, we propose a design flow for accelerating the extremely low bit-width neural network (ELB-NN) in embedded FPGAs with hybrid quantization schemes.
This flow covers both network training and FPGA-based network deployment, which facilitates the design space exploration and simplifies the tradeoff between network accuracy and computation efficiency.
Using this flow helps hardware designers to deliver a network accelerator in edge devices under strict resource and power constraints.
We present the proposed flow by supporting hybrid ELB settings within a neural network.
Results show that our design can deliver very high performance, peaking at 10.3 TOPS, and classify up to 325.3 images/s/watt while running large-scale neural networks at less than 5 W using an embedded FPGA.
To the best of our knowledge, it is the most energy efficient solution in comparison to GPU or other FPGA implementations reported so far in the literature.
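A toy view of hybrid quantization: different bit widths trade reconstruction error against hardware cost, and a hybrid setting assigns widths per layer. The quantizers below (symmetric uniform rounding, plus scaled binarization) are generic illustrations, not the ELB-NN schemes.

```python
import numpy as np

def quantize(w, bits):
    """Quantize a weight tensor to the given bit width."""
    if bits == 1:
        return np.sign(w) * np.abs(w).mean()      # binary weights with scale
    levels = 2 ** (bits - 1) - 1                  # symmetric signed range
    scale = np.abs(w).max() / levels
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)                      # stand-in weight tensor
errors = {bits: np.abs(w - quantize(w, bits)).mean() for bits in (8, 4, 2, 1)}
print(errors)
```

The mean reconstruction error at 8 bits is far below the extremely-low-bit settings, which is the accuracy side of the tradeoff the design flow explores against FPGA resource usage.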
A temporal graph is a data structure consisting of nodes and edges, in which the edges are associated with time labels.
To analyze the temporal graph, the first step is to find a proper graph dataset/benchmark.
While many temporal graph datasets exist online, none could be found that uses interval labels, in which each edge is associated with a starting and an ending time.
Therefore, we create a temporal graph dataset based on the Wikipedia reference graph for temporal analysis.
This report aims to provide more details of this graph benchmark to those who are interested in using it.
In order to boost the performance of data-intensive computing on HPC systems, in-memory computing frameworks, such as Apache Spark and Flink, use local DRAM for data storage.
Optimizing the memory allocation to data storage is critical to delivering performance to traditional HPC compute jobs and throughput to data-intensive applications sharing the HPC resources.
Current practices that statically configure in-memory storage may leave inadequate space for compute jobs or lose the opportunity to utilize more available space for data-intensive applications.
In this paper, we explore techniques to dynamically adjust in-memory storage and make the right amount of space for compute jobs.
We have developed a dynamic memory controller, DynIMS, which infers memory demands of compute tasks online and employs a feedback-based control model to adapt the capacity of in-memory storage.
We test DynIMS using mixed HPCC and Spark workloads on an HPC cluster.
Experimental results show that DynIMS can achieve up to 5X performance improvement compared to systems with static memory allocations.
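The feedback idea behind DynIMS can be sketched as a proportional controller that tracks compute demand. The gains, headroom, and demand trace below are made-up values, not measurements from the paper; the point is only that the in-memory storage cap shrinks when compute tasks need memory and grows back afterwards.

```python
# Toy proportional feedback loop over a node's memory budget (assumed values).
total_mem = 100.0          # GB on the node
cap = 50.0                 # current in-memory storage capacity
demands = [20, 40, 70, 80, 60, 30, 10]   # hypothetical compute demand trace
gain = 0.5                 # controller gain

history = []
for demand in demands:
    target = max(0.0, total_mem - demand - 10.0)  # keep 10 GB headroom
    cap += gain * (target - cap)                  # move partway to target
    history.append(round(cap, 1))

print(history)
```

A real controller would infer demand online rather than read a trace, and the gain would be tuned against oscillation, but the shrink-and-recover shape is the same.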
This paper discusses sample allocation problem (SAP) in frequency-domain Compressive Sampling (CS) of time-domain signals.
An analysis based on two fundamental CS principles, Uniform Random Sampling (URS) and the Uncertainty Principle (UP), is presented.
We show that CS on single- and multi-band signals performs better if the URS is confined to the band and the out-of-band parts are suppressed, compared to ordinary URS that ignores the band limits.
It means that sampling should only be done at the signal support, while the non-support should be masked and suppressed in the reconstruction process.
We also show that for an N-length discrete time signal with K-number of frequency components (Fourier coefficients), given the knowledge of the spectrum, URS leads to exact sampling on the location of the K-spectral peaks.
These results are used to formulate a sampling scheme for when the boundaries of the bands are not sharply distinguishable, such as in triangular- or stacked-band spectral signals.
When analyzing these cases, CS faces a paradox: narrowing the band increases the number of required samples, whereas widening it decreases that number.
Accordingly, instead of dividing the signal's spectrum vertically into frequency bands, slicing it horizontally by magnitude yields fewer required samples and better reconstruction results.
Moreover, it enables sample reuse that reduces the sample number even further.
The horizontal slicing and sample reuse methods imply non-uniform random sampling, where larger-magnitude parts of the spectrum should be allocated more samples than lower-magnitude ones.
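The support-sampling claim can be checked on a toy example: for a length-N signal with K frequency components, taking the spectrum at exactly the K peak locations reconstructs the signal, while samples outside the support carry no information. The sizes and random spectrum below are assumptions.

```python
import numpy as np

# Build a length-N time signal with exactly K nonzero Fourier coefficients.
N, K = 64, 3
rng = np.random.default_rng(0)
support = rng.choice(N, K, replace=False)
spectrum = np.zeros(N, dtype=complex)
spectrum[support] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
signal = np.fft.ifft(spectrum)

# "Sample" only the K support bins of the spectrum and rebuild the signal.
sampled = np.zeros(N, dtype=complex)
sampled[support] = np.fft.fft(signal)[support]
reconstructed = np.fft.ifft(sampled)

print(np.allclose(signal, reconstructed))  # prints True: K samples suffice
```

Every bin outside the support is zero, so sampling there wastes the budget, which is the intuition behind confining URS to the signal support.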
An IC-plane graph is a topological graph where every edge is crossed at most once and no two crossed edges share a vertex.
We show that every IC-plane graph has a visibility drawing where every vertex is an L-shape, and every edge is either a horizontal or vertical segment.
As a byproduct of our drawing technique, we prove that an IC-plane graph has a RAC drawing in quadratic area with at most two bends per edge.
Here a method is presented for detecting precursors of earthquakes from time series data on earthquakes in a target region.
Regional Entropy of Seismic Information, a quantity representing the average influence of an earthquake in the target region on the diversity of clusters to which earthquakes distribute, is introduced.
Based on a rough qualitative model of the dynamics of land crust, it is hypothesized that the saturation after the increase in the Regional Entropy of Seismic Information precedes the activation of earthquakes.
This hypothesis is validated on an open earthquake catalog.
This temporal change turns out to correlate more strongly with the activation of earthquakes in Japanese regions, with one to two years of precedence, than the compared baseline methods.
An important challenge in the process of tracking and detecting the dissemination of misinformation is to understand the political gap between people that engage with the so called "fake news".
A possible factor responsible for this gap is opinion polarization, which may prompt the general public to classify content that they disagree with or want to discredit as fake.
In this work, we study the relationship between political polarization and content reported by Twitter users as related to "fake news".
We investigate how polarization may create distinct narratives on what misinformation actually is.
We perform our study based on two datasets collected from Twitter.
The first dataset contains tweets about US politics in general, from which we compute the degree of polarization of each user towards the Republican and Democratic Party.
In the second dataset, we collect tweets and URLs that co-occurred with "fake news" related keywords and hashtags, such as #FakeNews and #AlternativeFact, as well as reactions towards such tweets and URLs.
We then analyze the relationship between polarization and what is perceived as misinformation, and whether users are designating information that they disagree with as fake.
Our results show an increase in the polarization of users and URLs associated with fake-news keywords and hashtags, when compared to information not labeled as "fake news".
We discuss the impact of our findings on the challenges of tracking "fake news" in the ongoing battle against misinformation.
In recent decade, many state-of-the-art algorithms on image classification as well as audio classification have achieved noticeable successes with the development of deep convolutional neural network (CNN).
However, most of the works only exploit single type of training data.
In this paper, we present a study on classifying bird species by exploiting the combination of both visual (images) and audio (sounds) data using CNN, which has been sparsely treated so far.
Specifically, we propose CNN-based multimodal learning models in three types of fusion strategies (early, middle, late) to settle the issues of combining training data cross domains.
The advantage of our proposed method lies in the fact that we can utilize CNN not only to extract features from image and audio data (spectrograms) but also to combine the features across modalities.
In the experiment, we train and evaluate the network structure on the comprehensive CUB-200-2011 standard data set, combined with our originally collected audio data set matched by species.
We observe that a model which utilizes the combination of both data types outperforms models trained with only either type of data.
We also show that transfer learning can significantly increase the classification performance.
Poor road conditions are a public nuisance, causing passenger discomfort, damage to vehicles, and accidents.
In the U.S., road-related conditions are a factor in 22,000 of the 42,000 traffic fatalities each year.
Although we often complain about bad roads, we have no way to detect or report them at scale.
To address this issue, we developed a system to detect potholes and assess road conditions in real-time.
Our solution is a mobile application that captures data on a car's movement from gyroscope and accelerometer sensors in the phone.
To assess roads using this sensor data, we trained SVM models to classify road conditions with 93% accuracy and potholes with 92% accuracy, beating the base rate for both problems.
As the user drives, the models use the sensor data to classify whether the road is good or bad, and whether it contains potholes.
Then, the classification results are used to create data-rich maps that illustrate road conditions across the city.
Our system will empower civic officials to identify and repair damaged roads which inconvenience passengers and cause accidents.
This paper details our data science process for collecting training data on real roads, transforming noisy sensor data into useful signals, training and evaluating machine learning models, and deploying those models to production through a real-time classification app.
It also highlights how cities can use our system to crowdsource data and deliver road repair resources to areas in need.
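The sensor-to-feature step can be sketched as windowed summary statistics over a simulated accelerometer trace. The window size, feature choices, and spike magnitude below are illustrative assumptions; in the paper's pipeline, an SVM would consume features of this kind.

```python
import numpy as np

def window_features(z, win=50):
    """Summarize each window of a z-axis accelerometer trace as features."""
    feats = []
    for start in range(0, len(z) - win + 1, win):
        w = z[start:start + win]
        feats.append([w.std(),                 # roughness
                      w.max() - w.min(),       # peak-to-peak range
                      np.abs(np.diff(w)).mean()])  # mean jerk proxy
    return np.array(feats)

rng = np.random.default_rng(0)
smooth = rng.normal(0, 0.05, 200)             # smooth road segment
bumpy = rng.normal(0, 0.05, 200)
bumpy[60] += 2.0                              # simulated pothole spike

f_smooth = window_features(smooth)
f_bumpy = window_features(bumpy)
print(f_bumpy[:, 1].max() > f_smooth[:, 1].max())  # spike shows in features
```

The pothole spike dominates the peak-to-peak feature of its window, giving a classifier a clean signal to separate bad road segments from noisy but smooth ones.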
In this paper, we propose a novel approach for verification of on-line signatures based on user dependent feature selection and symbolic representation.
Unlike other signature verification methods, which work with same features for all users, the proposed approach introduces the concept of user dependent features.
It exploits the typicality of each and every user to select different features for different users.
Initially all possible features are extracted for all users and a method of feature selection is employed for selecting user dependent features.
The selected features are clustered using Fuzzy C means algorithm.
In order to preserve the intra-class variation within each user, we recommend representing each cluster in the form of an interval-valued symbolic feature vector.
A method of signature verification based on the proposed cluster based symbolic representation is also presented.
Extensive experiments are conducted on the MCYT-100 User (DB1) and MCYT-330 User (DB2) online signature data sets to demonstrate the effectiveness of the proposed novel approach.
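A minimal sketch of interval-valued symbolic verification: each user's reference features are summarized per dimension as an interval, and a test signature is accepted when enough of its features fall inside the corresponding intervals. The mean ± std intervals and the acceptance threshold are our assumptions, not the paper's exact rule.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, (10, 4))        # 10 reference signatures, 4 features
low = train.mean(axis=0) - train.std(axis=0)
high = train.mean(axis=0) + train.std(axis=0)

def accept(sig, threshold=3):
    """Accept when at least `threshold` features lie within the intervals."""
    inside = int(np.sum((sig >= low) & (sig <= high)))
    return inside >= threshold

genuine = train.mean(axis=0)                  # sits at the interval centers
forgery = genuine + 5.0                       # far outside every interval
print(accept(genuine), accept(forgery))       # → True False
```

Keeping the whole interval, rather than a single prototype, is what preserves each user's intra-class variation at verification time.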
We consider the problem of stealth communication over a multipath network in the presence of an active adversary.
The multipath network consists of multiple parallel noiseless links, and the adversary is able to eavesdrop and jam a subset of links.
We consider two types of jamming: erasure jamming and overwrite jamming.
We require the communication to be both stealthy and reliable, i.e., the adversary should be unable to detect whether or not meaningful communication is taking place, while simultaneously the legitimate receiver should reconstruct any potential messages from the transmitter with high probability.
We provide inner bounds on the robust stealth capacities under both adversarial erasure and adversarial overwrite jamming.
Nature, with the diversity and robustness of its systems, has always been an inspiration to researchers, and Artificial Immune Systems are one such inspiration.
Many algorithms were inspired by ongoing discoveries of biological immune systems techniques and approaches.
One of the most basic and common approaches is the Negative Selection Approach, which is simple and easy to implement.
It has been applied in many fields, but mostly in anomaly detection, given the similarity to its basic idea.
In this paper, a review is given on the application of negative selection approach in network security, specifically the intrusion detection system.
As the work in this field is limited, we need to understand what the challenges of this approach are.
Recommendations are given by the end of the paper for future work.
We introduce an algorithm for detection of bugs in sequential circuits.
This algorithm is incomplete, i.e., its failure to find a bug breaking a property P does not imply that P holds.
The appeal of incomplete algorithms is that they scale better than their complete counterparts.
However, to make an incomplete algorithm effective one needs to guarantee that the probability of finding a bug is reasonably high.
We try to achieve such effectiveness by employing the Test-As-Proofs (TAP) paradigm.
In our TAP based approach, a counterexample is built as a sequence of states extracted from proofs that some local variations of property P hold.
This increases the probability that (a) a representative set of states is examined and that (b) the considered states are relevant to property P.
We describe an algorithm of test generation based on the TAP paradigm and give preliminary experimental results.
Calculi of string diagrams are increasingly used to present the syntax and algebraic structure of various families of circuits, including signal flow graphs, electrical circuits and quantum processes.
In many such approaches, the semantic interpretation for diagrams is given in terms of relations or corelations (generalised equivalence relations) of some kind.
In this paper we show how semantic categories of both relations and corelations can be characterised as colimits of simpler categories.
This modular perspective is important as it simplifies the task of giving a complete axiomatisation for semantic equivalence of string diagrams.
Moreover, our general result unifies various theorems that are independently found in literature and are relevant for program semantics, quantum computation and control theory.
This article presents a measure of semantic similarity in an IS-A taxonomy based on the notion of shared information content.
Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach.
The article presents algorithms that take advantage of taxonomic similarity in resolving syntactic and semantic ambiguity, along with experimental results demonstrating their effectiveness.
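The shared-information-content measure (Resnik similarity) can be sketched on a toy taxonomy: sim(c1, c2) is the maximum information content -log p(c) over the common subsumers c of the two concepts. The taxonomy and concept probabilities below are made up for illustration.

```python
import math

# Toy IS-A taxonomy (child -> parent) and assumed concept probabilities.
parents = {"dog": "canine", "wolf": "canine", "canine": "animal",
           "cat": "feline", "feline": "animal", "animal": None}
prob = {"dog": 0.05, "wolf": 0.01, "canine": 0.06, "cat": 0.05,
        "feline": 0.05, "animal": 1.0}

def ancestors(c):
    """The concept itself plus all of its IS-A ancestors."""
    out = set()
    while c is not None:
        out.add(c)
        c = parents[c]
    return out

def resnik(c1, c2):
    """Information content of the most informative common subsumer."""
    common = ancestors(c1) & ancestors(c2)
    return max(-math.log(prob[c]) for c in common)

print(resnik("dog", "wolf") > resnik("dog", "cat"))  # prints True
```

Dog and wolf share the informative subsumer "canine", while dog and cat share only the root "animal" (information content 0), so the measure ranks the first pair as more similar, unlike a naive edge count that would treat both pairs alike.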
The basic idea behind information algebras is that information comes in pieces, each referring to a certain question, that these pieces can be combined or aggregated and that the part relating to a given question can be extracted.
This algebraic structure can be given different forms.
Questions were originally represented by subsets of variables.
Pieces of information were then represented by valuations associated with the domains of variables.
This leads to an algebraic structure called valuation algebras.
The basic axiomatics of this algebraic structure was in essence proposed by Shenoy and Shafer.
Here a much more general view of systems of questions is proposed and pieces of information are related to the elements of this system of questions.
This leads to a new and extended system of axioms for information algebras.
Classical valuation algebras are essentially a special case of this new system.
A full discussion of the algebraic theory of this new information algebras is given, including local computation, duality between labeled and domain-free versions of the algebras, order of information, finiteness of information and approximation, compact and continuous information algebras.
Finally a rather complete discussion of uncertain information, based on random maps into information algebras is presented.
This is shown to represent a generalisation of classical Dempster-Shafer theory.
Recent networking research has identified that data-driven congestion control (CC) can be more efficient than traditional CC in TCP.
Deep reinforcement learning (RL), in particular, has the potential to learn optimal network policies.
However, RL suffers from instability and over-fitting, deficiencies which so far render it unacceptable for use in datacenter networks.
In this paper, we analyze the requirements for RL to succeed in the datacenter context.
We present a new emulator, Iroko, which we developed to support different network topologies, congestion control algorithms, and deployment scenarios.
Iroko interfaces with the OpenAI gym toolkit, which allows for fast and fair evaluation of different RL and traditional CC algorithms under the same conditions.
We present initial benchmarks on three deep RL algorithms compared to TCP New Vegas and DCTCP.
Our results show that these algorithms are able to learn a CC policy which exceeds the performance of TCP New Vegas on a dumbbell and fat-tree topology.
We make our emulator open-source and publicly available: https://github.com/dcgym/iroko
We consider the problem of minimizing a linear function over an affine section of the cone of positive semidefinite matrices, with the additional constraint that the feasible matrix has prescribed rank.
When the rank constraint is active, this is a non-convex optimization problem, otherwise it is a semidefinite program.
Both find numerous applications especially in systems control theory and combinatorial optimization, but even in more general contexts such as polynomial optimization or real algebra.
While numerical algorithms exist for solving this problem, such as interior-point or Newton-like algorithms, in this paper we propose an approach based on symbolic computation.
We design an exact algorithm for solving rank-constrained semidefinite programs, whose complexity is essentially quadratic on natural degree bounds associated to the given optimization problem: for subfamilies of the problem where the size of the feasible matrix is fixed, the complexity is polynomial in the number of variables.
The algorithm works under assumptions on the input data: we prove that these assumptions are generically satisfied.
We also implement it in Maple and discuss practical experiments.
Weakly supervised temporal action detection is a Herculean task in understanding untrimmed videos, since no supervisory signal except the video-level category label is available on training data.
Under the supervision of category labels, weakly supervised detectors are usually built upon classifiers.
However, there is an inherent contradiction between classifier and detector; i.e., a classifier in pursuit of high classification performance prefers top-level discriminative video clips that are extremely fragmentary, whereas a detector is obliged to discover the whole action instance without missing any relevant snippet.
To reconcile this contradiction, we train a detector by driving a series of classifiers to find new actionness clips progressively, via step-by-step erasion from a complete video.
During the test phase, all we need to do is to collect detection results from the one-by-one trained classifiers at various erasing steps.
To assist in the collection process, a fully connected conditional random field is established to refine the temporal localization outputs.
We evaluate our approach on two prevailing datasets, THUMOS'14 and ActivityNet.
The experiments show that our detector advances state-of-the-art weakly supervised temporal action detection results, and even compares with quite a few strongly supervised methods.
With the increase in the interchange of data, there is a growing need for security.
Considering the volumes of digital data transmitted, it needs to be secured.
Among the many forms of tampering possible, one widespread technique is Copy-Move Forgery (CMF).
This forgery occurs when parts of the image are copied and duplicated elsewhere in the same image.
There exist a number of algorithms to detect such a forgery in which the primary step involved is feature extraction.
The feature extraction techniques employed must have lesser time and space complexity involved for an efficient and faster processing of media.
Also, majority of the existing state of art techniques often tend to falsely match similar genuine objects as copy move forged during the detection process.
To tackle these problems, the paper proposes a novel algorithm that adopts a unique approach of using Hu's Invariant Moments and Log-Polar Transformations to reduce the feature vector dimension to one feature per block, while simultaneously detecting CMF among genuine similar objects in an image.
The qualitative and quantitative results obtained demonstrate the effectiveness of this algorithm.
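As a concrete illustration of the kind of per-block feature the abstract describes, here is a minimal NumPy sketch of Hu's seven invariant moments collapsed to a single scalar per block. The paper's exact block size, preprocessing, and reduction step are not specified here, so taking the first Hu moment as the single feature is our assumption for illustration.

```python
import numpy as np

def hu_moments(block):
    """Compute Hu's seven invariant moments of a 2D intensity block."""
    h, w = block.shape
    y, x = np.mgrid[:h, :w].astype(float)
    m00 = block.sum()
    if m00 == 0:
        return np.zeros(7)
    xc, yc = (x * block).sum() / m00, (y * block).sum() / m00

    def mu(p, q):  # central moment (translation-invariant)
        return ((x - xc) ** p * (y - yc) ** q * block).sum()

    def eta(p, q):  # normalized central moment (scale-invariant)
        return mu(p, q) / m00 ** (1 + (p + q) / 2.0)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
        (n30 - 3 * n12) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
        + (3 * n21 - n03) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2),
        (n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
        + 4 * n11 * (n30 + n12) * (n21 + n03),
        (3 * n21 - n03) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
        - (n30 - 3 * n12) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2),
    ])

def block_feature(block):
    """Collapse a block to a single translation/rotation-invariant scalar
    (here the first Hu moment; the paper's exact reduction may differ)."""
    return hu_moments(block)[0]
```

Because the feature is invariant to translation and rotation, copied-and-moved blocks map to (near-)identical scalars, which is what makes a one-feature-per-block comparison possible.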
Neural machine translation (NMT) is a new approach to machine translation that generates much more fluent results than statistical machine translation (SMT).
However, SMT is usually better than NMT in translation adequacy.
It is therefore a promising direction to combine the advantages of both NMT and SMT.
In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation.
Extensive experiments on the Chinese-to-English translation task show that our model achieves significant improvement of 5.3 BLEU points over the best single system output and 3.4 BLEU points over the state-of-the-art traditional system combination methods.
Deep Neural Networks (DNNs) often struggle with one-shot learning where we have only one or a few labeled training examples per category.
In this paper, we argue that by using side information, we may compensate for the missing information across classes.
We introduce two statistical approaches for fusing side information into data representation learning to improve one-shot learning.
First, we propose to enforce the statistical dependency between data representations and multiple types of side information.
Second, we introduce an attention mechanism to efficiently treat examples belonging to the 'lots-of-examples' classes as quasi-samples (additional training samples) for 'one-example' classes.
We empirically show that our learning architecture improves over traditional softmax regression networks as well as state-of-the-art attentional regression networks on one-shot recognition tasks.
In recent years, the importance of deep learning has significantly increased in pattern recognition, computer vision, and artificial intelligence research, as well as in industry.
However, despite the existence of multiple deep learning frameworks, there is a lack of comprehensible and easy-to-use high-level tools for the design, training, and testing of deep neural networks (DNNs).
In this paper, we introduce Barista, an open-source graphical high-level interface for the Caffe deep learning framework.
While Caffe is one of the most popular frameworks for training DNNs, editing prototxt files in order to specify the net architecture and hyperparameters can become a cumbersome and error-prone task.
Instead, Barista offers a fully graphical user interface with a graph-based net topology editor and provides an end-to-end training facility for DNNs, which allows researchers to focus on solving their problems without having to write code, edit text files, or manually parse logged data.
Privacy definitions provide ways for trading-off the privacy of individuals in a statistical database for the utility of downstream analysis of the data.
In this paper, we present Blowfish, a class of privacy definitions inspired by the Pufferfish framework, that provides a rich interface for this trade-off.
In particular, we allow data publishers to extend differential privacy using a policy, which specifies (a) secrets, or information that must be kept secret, and (b) constraints that may be known about the data.
While the secret specification allows increased utility by lessening protection for certain individual properties, the constraint specification provides added protection against an adversary who knows correlations in the data (arising from constraints).
We formalize policies and present novel algorithms that can handle general specifications of sensitive information and certain count constraints.
We show that there are reasonable policies under which our privacy mechanisms for k-means clustering, histograms and range queries introduce significantly lesser noise than their differentially private counterparts.
We quantify the privacy-utility trade-offs for various policies analytically and empirically on real datasets.
This paper shows a vulnerability of the pay-per-click accounting of Google Ads and proposes a statistical tradeoff-based approach to manage this vulnerability.
The result of this paper is a model to calculate the overhead cost per click necessary to protect the subscribers and a simple algorithm to implement this protection.
Simulations validate the correctness of the model and the economical applicability.
We propose a simple solution to the uncertain delay problem in USRP (Universal Software Radio Peripheral)-based SDR (Software-Defined Radio)-radar systems.
Instead of time-synchronization as employed in (pseudo-)passive radar configurations, which require at least two synchronized receivers, we use the directly received signal in a single-receiver system as a reference for the exact location of the target echoes.
After finding the reference position, the echoes are reordered by a circular shift so that the reference is moved to the origin.
We demonstrate the effectiveness of the proposed method by simulating the problem in MATLAB and implementing a length-128 random-code radar on a USRP.
The random code is constructed from a zero-padded Barker sequence product.
Experiments on measuring multiple echoes of the targets at precise range bins confirm the applicability of the proposed method.
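The reference-based reordering step can be sketched in a few lines of NumPy: locate the direct-reception peak in the matched-filter output and circularly shift the array so that peak sits at bin 0, turning the uncertain USRP delay into a fixed origin. The assumption that the direct path produces the strongest peak is ours, for illustration.

```python
import numpy as np

def align_to_reference(correlation, ref_index=None):
    """Circularly shift a matched-filter output so the direct-reception
    reference peak sits at range bin 0.  If ref_index is not given, the
    strongest peak is assumed to be the direct path."""
    if ref_index is None:
        ref_index = int(np.argmax(np.abs(correlation)))
    return np.roll(correlation, -ref_index)

# Synthetic example: direct path at bin 37, target echo 12 bins later.
corr = np.zeros(128)
corr[37] = 5.0   # direct reception (reference)
corr[49] = 2.0   # target echo
aligned = align_to_reference(corr)
```

After alignment the echo appears at the bin equal to its true delay relative to transmission, regardless of the receiver's unknown start-up latency.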
In this paper, we propose a novel medical image segmentation method using an iterative deep learning framework.
We combine an iterative learning approach with an encoder-decoder network to improve segmentation results, which enables precise localization of regions of interest (ROIs), including complex shapes and detailed textures in medical images, in an iterative manner.
The proposed iterative deep convolutional encoder-decoder network consists of two main paths: convolutional encoder path and convolutional decoder path with iterative learning.
Experimental results show that the proposed iterative deep learning framework is able to yield excellent medical image segmentation performances for various medical images.
The effectiveness of the proposed method is demonstrated by comparison with other state-of-the-art medical image segmentation methods.
Autonomous dual-arm manipulation is an essential skill to deploy robots in unstructured scenarios.
However, this is a challenging undertaking, particularly in terms of perception and planning.
Unstructured scenarios are full of objects with different shapes and appearances that have to be grasped in a very specific manner so they can be functionally used.
In this paper we present an integrated approach to perform dual-arm pick tasks autonomously.
Our method consists of semantic segmentation, object pose estimation, deformable model registration, grasp planning and arm trajectory optimization.
The entire pipeline can be executed on-board and is suitable for on-line grasping scenarios.
For this, our approach makes use of accumulated knowledge expressed as convolutional neural network models and low-dimensional latent shape spaces.
For manipulating objects, we propose a stochastic trajectory optimization that includes a kinematic chain closure constraint.
Evaluation in simulation and on the real robot corroborates the feasibility and applicability of the proposed methods on a task of picking up unknown watering cans and drills using both arms.
Learning from a real-world data stream and continuously updating the model without explicit supervision is a new challenge for NLP applications with machine learning components.
In this work, we have developed an adaptive learning system for text simplification, which improves the underlying learning-to-rank model from usage data, i.e., how users have employed the system for the task of simplification.
Our experimental results show that, over a period of time, the performance of the embedded paraphrase ranking model steadily improves from a score of 62.88% to 75.70% on the NDCG@10 evaluation metric.
To our knowledge, this is the first study where an NLP component is adaptively improved through usage.
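The NDCG@10 metric reported above can be computed as follows; this sketch follows one common DCG convention (linear gain, log2 discount), which may differ in detail from the paper's evaluation script.

```python
import numpy as np

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query: `relevances` are the graded relevance labels
    of the returned items, in ranked order."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    ideal = np.sort(np.asarray(relevances, dtype=float))[::-1][:k]
    idcg = float((ideal * discounts[:ideal.size]).sum())
    return dcg / idcg if idcg > 0 else 0.0
```

A perfect ranking scores 1.0; pushing relevant paraphrases down the list lowers the score, which is what the adaptive system is trained to avoid.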
Minimisation of discrete energies defined over factors is an important problem in computer vision, and a vast number of MAP inference algorithms have been proposed.
Different inference algorithms perform better on factor graph models (GMs) from different underlying problem classes, and in general it is difficult to know which algorithm will yield the lowest energy for a given GM.
To mitigate this difficulty, survey papers advise the practitioner on what algorithms perform well on what classes of models.
We take the next step forward, and present a technique to automatically select the best inference algorithm for an input GM.
We validate our method experimentally on an extended version of the OpenGM2 benchmark, containing a diverse set of vision problems.
On average, our method selects an inference algorithm yielding labellings with 96% of variables the same as the best available algorithm.
Domain knowledge can often be encoded in the structure of a network, such as convolutional layers for vision, which has been shown to increase generalization and decrease sample complexity, or the number of samples required for successful learning.
In this study, we ask whether sample complexity can be reduced for systems where the structure of the domain is unknown beforehand, and the structure and parameters must both be learned from the data.
We show that sample complexity reduction through learning structure is possible for at least two simple cases.
In studying these cases, we also gain insight into how this might be done for more complex domains.
In this paper, we present "FabSearch", a prototype search engine for sourcing manufacturer service providers, by making use of the product manufacturing information contained within a 3D digital file of a product.
FabSearch is designed to take in a query 3D model, such as the .STEP file of a part model which then produces a ranked list of job shop service providers who are best suited to fabricate the part.
Service providers may have potentially built hundreds to thousands of parts with associated part 3D models over time.
FabSearch assumes that these service providers have shared shape signatures of the part models built previously to enable the algorithm to most effectively rank the service providers who have the most experience to build the query part model.
FabSearch has two important features that help it produce relevant results.
First, it makes use of the shape characteristics of the 3D part, computing its spherical harmonics signature to find the most similar shapes built previously by job shop service providers.
Second, FabSearch utilizes meta-data about each part, such as material specifications and tolerance requirements, to help improve the search results based on the specific query model requirements.
The algorithm is tested against a repository containing more than 2000 models distributed across various job shop service providers.
For the first time, we show the potential for utilizing the rich information contained within a 3D part model to automate the sourcing and eventual selection of manufacturing service providers.
We slightly improve the known lower bound on the asymptotic competitive ratio for online bin packing of rectangles.
We present a complete proof for the new lower bound, whose value is above 1.91.
Provided significant future progress in artificial intelligence and computing, it may ultimately be possible to create multiple Artificial General Intelligences (AGIs), and possibly entire societies living within simulated environments.
In that case, it should be possible to improve the problem solving capabilities of the system by increasing the speed of the simulation.
If a minimal simulation with sufficient capabilities is created, it might manage to increase its own speed by accelerating progress in science and technology, in a way similar to the Technological Singularity.
This may ultimately lead to large simulated civilizations unfolding at extreme temporal speedups, achieving what from the outside would look like a Temporal Singularity.
Here we discuss the feasibility of the minimal simulation and the potential advantages, dangers, and connection to the Fermi paradox of the Temporal Singularity.
The medium-term importance of the topic derives from the amount of computational power required to start the process, which could be available within the next decades, making the Temporal Singularity theoretically possible before the end of the century.
Modal analysis is the process of estimating a system's modal parameters such as its natural frequencies and mode shapes.
One application of modal analysis is in structural health monitoring (SHM), where a network of sensors may be used to collect vibration data from a physical structure such as a building or bridge.
There is a growing interest in developing automated techniques for SHM based on data collected in a wireless sensor network.
In order to conserve power and extend battery life, however, it is desirable to minimize the amount of data that must be collected and transmitted in such a sensor network.
In this paper, we highlight the fact that modal analysis can be formulated as an atomic norm minimization (ANM) problem, which can be solved efficiently and, in some cases, can perfectly recover a structure's mode shapes and frequencies.
We survey a broad class of sampling and compression strategies that one might consider in a physical sensor network, and we provide bounds on the sample complexity of these compressive schemes in order to recover a structure's mode shapes and frequencies via ANM.
A main contribution of our paper is to establish a bound on the sample complexity of modal analysis with random temporal compression, and in this scenario we prove that the samples per sensor can actually decrease as the number of sensors increases.
We also extend an atomic norm denoising problem to the multiple measurement vector (MMV) setting in the case of uniform sampling.
In order to handle the complexity and heterogeneity of modern instruction set architectures, analysis platforms share a common design: the adoption of hardware-independent intermediate representations.
The usage of these platforms to verify systems down to binary level is appealing due to the high degree of automation they provide.
However, it introduces the need for trusting the correctness of the translation from binary code to intermediate language.
Achieving a high degree of trust is challenging since this transpilation must handle (i) all the side effects of the instructions, (ii) multiple instruction encodings (e.g., ARM Thumb), and (iii) variable instruction lengths (e.g., Intel).
We overcome these problems by formally modeling one such intermediate language in the interactive theorem prover HOL4 and by implementing a proof-producing transpiler.
This tool translates ARMv8 programs to the intermediate language and generates a HOL4 proof that demonstrates the correctness of the translation in the form of a simulation theorem.
We also show how the transpiler theorems can be used to transfer properties verified on the intermediate language to the binary code.
Systems for symbolic event recognition infer occurrences of events in time using a set of event definitions in the form of first-order rules.
The Event Calculus is a temporal logic that has been used as a basis in event recognition applications, providing among others, direct connections to machine learning, via Inductive Logic Programming (ILP).
We present an ILP system for online learning of Event Calculus theories.
To allow for a single-pass learning strategy, we use the Hoeffding bound for evaluating clauses on a subset of the input stream.
We employ a decoupling scheme for the Event Calculus axioms during the learning process, which allows each clause to be learned in isolation.
Moreover, we use abductive-inductive logic programming techniques to handle unobserved target predicates.
We evaluate our approach on an activity recognition application and compare it to a number of batch learning techniques.
We obtain results of comparable predictive accuracy with significant speed-ups in training time.
We also outperform hand-crafted rules and match the performance of a sound incremental learner that can only operate on noise-free datasets.
This paper is under consideration for acceptance in TPLP.
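The Hoeffding-bound test used for single-pass clause evaluation can be sketched as follows; the function and score names are illustrative, following the standard use of the bound in Hoeffding-tree style online learners rather than this paper's exact formulation.

```python
import math

def hoeffding_epsilon(value_range, n, delta):
    """Hoeffding bound: with probability at least 1 - delta, the true mean
    of a random variable with range `value_range` lies within epsilon of
    the mean observed over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def enough_evidence(score_best, score_second, value_range, n, delta):
    """Single-pass selection rule: commit to the best candidate clause once
    the observed score gap to the runner-up exceeds the Hoeffding epsilon,
    so no second pass over the stream is needed."""
    return (score_best - score_second) > hoeffding_epsilon(value_range, n, delta)
```

As more stream examples arrive, epsilon shrinks as 1/sqrt(n), so even small score gaps eventually become statistically decisive.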
The measurement of the biological tissue's electrical impedance is an active research field that has attracted a lot of attention during the last decades.
Bio-impedances are closely related to a large variety of physiological conditions; therefore, they are useful for diagnosis and monitoring in many medical applications.
Measuring living tissues, however, is a challenging task that poses countless technical and practical problems, in particular if the tissues need to be measured under the skin.
This paper presents a bio-impedance sensor ASIC targeting a battery-free, miniature size, implantable device, which performs accurate 4-point complex impedance extraction in the frequency range from 2 kHz to 2 MHz.
The ASIC is fabricated in 150 nm CMOS, has a size of 1.22 mm x 1.22 mm and consumes 165 uA from a 1.8 V power supply.
The ASIC is embedded in a prototype which communicates with, and is powered by an external reader device through inductive coupling.
The prototype is validated by measuring the impedances of different combinations of discrete components, measuring the electrochemical impedance of physiological solution, and performing ex vivo measurements on animal organs.
The proposed ASIC is able to extract complex impedances with around 1 Ohm resolution, thereby enabling accurate wireless tissue measurements.
The paper approaches the problem of image-to-text with attention-based encoder-decoder networks that are trained to handle sequences of characters rather than words.
We experiment on lines of text from a popular handwriting database with different attention mechanisms for the decoder.
The model trained with softmax attention achieves the lowest test error, outperforming several other RNN-based models.
Our results show that softmax attention is able to learn a linear alignment whereas the alignment generated by sigmoid attention is linear but much less precise.
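The two attention variants compared above differ only in how alignment scores over encoder positions are turned into weights; a minimal NumPy sketch of that difference follows (the actual models use learned scoring functions and RNN decoders, which are omitted here).

```python
import numpy as np

def attention_weights(scores, kind="softmax"):
    """Turn alignment scores over encoder positions into attention weights.
    Softmax couples the positions (weights compete and sum to 1); sigmoid
    scores each position independently, which the abstract finds yields a
    less precise alignment."""
    s = np.asarray(scores, dtype=float)
    if kind == "softmax":
        e = np.exp(s - s.max())   # shift for numerical stability
        return e / e.sum()
    return 1.0 / (1.0 + np.exp(-s))  # independent sigmoid gates

def context_vector(encoder_states, weights):
    """Weighted sum of encoder states consumed by the decoder at one step."""
    return (np.asarray(weights)[:, None] * np.asarray(encoder_states)).sum(axis=0)
```

With softmax, raising one position's score necessarily suppresses the others, which encourages the sharp, near-monotonic alignments observed for handwritten text lines.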
Strongly multiplicative linear secret sharing schemes (LSSS) have been a powerful tool for constructing secure multiparty computation protocols.
However, it remains open whether or not there exist efficient constructions of strongly multiplicative LSSS from general LSSS.
In this paper, we propose the new concept of a 3-multiplicative LSSS, and establish its relationship with strongly multiplicative LSSS.
More precisely, we show that any 3-multiplicative LSSS is a strongly multiplicative LSSS, but the converse is not true; and that any strongly multiplicative LSSS can be efficiently converted into a 3-multiplicative LSSS.
Furthermore, we apply 3-multiplicative LSSS to the computation of unbounded fan-in multiplication, which reduces its round complexity to four (from five of the previous protocol based on strongly multiplicative LSSS).
We also give two constructions of 3-multiplicative LSSS from Reed-Muller codes and algebraic geometric codes.
We believe that the construction and verification of 3-multiplicative LSSS are easier than those of strongly multiplicative LSSS.
This presents a step forward in settling the open problem of efficient constructions of strongly multiplicative LSSS from general LSSS.
Training models for the automatic correction of machine-translated text usually relies on data consisting of (source, MT, human post-edit) triplets providing, for each source sentence, examples of translation errors with the corresponding corrections made by a human post-editor.
Ideally, a large amount of data of this kind should allow the model to learn reliable correction patterns and effectively apply them at test stage on unseen (source, MT) pairs.
In practice, however, their limited availability calls for solutions that also integrate in the training process other sources of knowledge.
Along this direction, state-of-the-art results have been recently achieved by systems that, in addition to a limited amount of available training data, exploit artificial corpora that approximate elements of the "gold" training instances with automatic translations.
Following this idea, we present eSCAPE, the largest freely-available Synthetic Corpus for Automatic Post-Editing released so far. eSCAPE consists of millions of entries in which the MT element of the training triplets has been obtained by translating the source side of publicly-available parallel corpora, and using the target side as an artificial human post-edit.
Translations are obtained both with phrase-based and neural models.
For each MT paradigm, eSCAPE contains 7.2 million triplets for English-German and 3.3 million for English-Italian, resulting in a total of 14.4 and 6.6 million instances, respectively.
The usefulness of eSCAPE is proved through experiments in a general-domain scenario, the most challenging one for automatic post-editing.
For both language directions, the models trained on our artificial data always improve MT quality with statistically significant gains.
The current version of eSCAPE can be freely downloaded from: http://hltshare.fbk.eu/QT21/eSCAPE.html.
Visual reasoning with compositional natural language instructions, e.g., based on the newly-released Cornell Natural Language Visual Reasoning (NLVR) dataset, is a challenging task, where the model needs to have the ability to create an accurate mapping between the diverse phrases and the several objects placed in complex arrangements in the image.
Further, this mapping needs to be processed to answer the question in the statement given the ordering and relationship of the objects across three similar images.
In this paper, we propose a novel end-to-end neural model for the NLVR task, where we first use joint bidirectional attention to build a two-way conditioning between the visual information and the language phrases.
Next, we use an RL-based pointer network to sort and process the varying number of unordered objects (so as to match the order of the statement phrases) in each of the three images and then pool over the three decisions.
Our model achieves strong improvements (of 4-6% absolute) over the state-of-the-art on both the structured representation and raw image versions of the dataset.
In this paper, two robust model predictive control (MPC) schemes are proposed for tracking control of nonholonomic systems with bounded disturbances: tube-MPC and nominal robust MPC (NRMPC).
In tube-MPC, the control signal consists of a control action and a nonlinear feedback law based on the deviation of the actual states from the states of a nominal system.
This confines the actual trajectory to a tube centered along the optimal trajectory of the nominal system.
Recursive feasibility and input-to-state stability are established and the constraints are ensured by tightening the input domain and the terminal region.
While in NRMPC, an optimal control sequence is obtained by solving an optimization problem based on the current state, and the first portion of this sequence is applied to the real system in an open-loop manner during each sampling period.
The state of the nominal system model is updated by the actual state at each step, which provides additional feedback.
By introducing a robust state constraint and tightening the terminal region, recursive feasibility and input-to-state stability are guaranteed.
Simulation results demonstrate the effectiveness of both strategies proposed.
The unprecedented growth of Internet users in recent years has resulted in an abundance of unstructured information in the form of social media text.
A large percentage of this population is actively engaged in health social networks to share health-related information.
In this paper, we address an important and timely topic by analyzing the users' sentiments and emotions w.r.t their medical conditions.
Towards this, we examine users on popular medical forums (Patient.info, dailystrength.org), where they post on important topics such as asthma, allergy, depression, and anxiety.
First, we provide a benchmark setup for the task by crawling the data, and further define the sentiment specific fine-grained medical conditions (Recovered, Exist, Deteriorate, and Other).
We propose an effective architecture that uses a Convolutional Neural Network (CNN) as a data-driven feature extractor and a Support Vector Machine (SVM) as a classifier.
We further develop a sentiment feature which is sensitive to the medical context.
Here, we show that the use of medical sentiment feature along with extracted features from CNN improves the model performance.
In addition to our dataset, we also evaluate our approach on the benchmark "CLEF eHealth 2014" corpora and show that our model outperforms the state-of-the-art techniques.
Unlike many complex networks studied in the literature, social networks rarely exhibit unanimous behavior, or consensus.
This requires a development of mathematical models that are sufficiently simple to be examined and capture, at the same time, the complex behavior of real social groups, where opinions and actions related to them may form clusters of different size.
One such model, proposed by Friedkin and Johnsen, extends the idea of conventional consensus algorithm (also referred to as the iterative opinion pooling) to take into account the actors' prejudices, caused by some exogenous factors and leading to disagreement in the final opinions.
In this paper, we offer a novel multidimensional extension, describing the evolution of the agents' opinions on several topics.
Unlike the existing models, these topics are interdependent, and hence the opinions being formed on these topics are also mutually dependent.
We rigorously examine the stability properties of the proposed model, in particular the convergence of the agents' opinions.
Although our model assumes synchronous communication among the agents, we show that the same final opinions may be reached "on average" via asynchronous gossip-based protocols.
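A minimal sketch of the multidimensional Friedkin-Johnsen iteration the abstract builds on is given below; the notation (row-stochastic influence matrix W, susceptibilities lam, topic-coupling matrix C, prejudices u) and the synchronous update are assumptions based on the standard form of the model, and the paper's exact formulation may differ.

```python
import numpy as np

def fj_multidim(W, lam, C, u, steps=200):
    """Multidimensional Friedkin-Johnsen opinion dynamics: agent i's
    opinion vector over coupled topics evolves as
        x_i(k+1) = lam_i * C @ (sum_j W[i, j] * x_j(k)) + (1 - lam_i) * u_i
    where W (n x n) is row-stochastic influence, lam (n,) the agents'
    susceptibilities to social influence, C (m x m) couples the m topics,
    and u (n x m) holds the agents' fixed prejudices."""
    x = u.copy()
    for _ in range(steps):
        # (W @ x) averages neighbours' opinions; @ C.T applies the topic
        # coupling C to each agent's opinion row.
        x = lam[:, None] * (W @ x) @ C.T + (1.0 - lam[:, None]) * u
    return x
```

Fully stubborn agents (lam = 0) keep their prejudices, and heterogeneous lam produces the persistent disagreement (clustered final opinions) the model is designed to capture.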
Directional or circular statistics pertain to the analysis and interpretation of directions or rotations.
In this work, a novel probability distribution is proposed to model multidimensional sparse directional data.
The Generalised Directional Laplacian Distribution (DLD) is a hybrid between the Laplacian distribution and the von Mises-Fisher distribution.
The distribution's parameters are estimated using Maximum-Likelihood Estimation over a set of training data points.
Mixtures of Directional Laplacian Distributions (MDLD) are also introduced in order to model multiple concentrations of sparse directional data.
The author explores the application of the derived DLD mixture model to cluster sound sources that exist in an underdetermined instantaneous sound mixture.
The proposed model can solve the general K x L (K<L) underdetermined instantaneous source separation problem, offering a fast and stable solution.
Encoder-decoder models typically only employ words that are frequently used in the training corpus to reduce the computational costs and exclude noise.
However, this vocabulary set may still include words that interfere with learning in encoder-decoder models.
This paper proposes a method for selecting more suitable words for learning encoders by utilizing not only frequency, but also co-occurrence information, which we capture using the HITS algorithm.
We apply our proposed method to two tasks: machine translation and grammatical error correction.
For Japanese-to-English translation, this method achieves a BLEU score that is 0.56 points more than that of a baseline.
It also outperforms the baseline method for English grammatical error correction, with an F0.5-measure that is 1.48 points higher.
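The HITS computation on the word co-occurrence graph can be sketched with standard power iteration; how the graph is built from the corpus, and how the resulting hub/authority scores feed into vocabulary selection, are details of the paper not reproduced here.

```python
import numpy as np

def hits(adjacency, iters=100):
    """Power-iteration HITS on a directed graph (rows link to columns):
    a good hub links to good authorities, and a good authority is linked
    to by good hubs."""
    A = np.asarray(adjacency, dtype=float)
    hubs = np.ones(A.shape[0])
    auth = np.ones(A.shape[0])
    for _ in range(iters):
        auth = A.T @ hubs
        auth /= np.linalg.norm(auth)
        hubs = A @ auth
        hubs /= np.linalg.norm(hubs)
    return hubs, auth
```

On a co-occurrence graph over candidate vocabulary words, these scores capture which words are centrally connected rather than merely frequent.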
In this paper we introduce a new nearest-neighbours based approach to clustering and compare it with previous solutions. The resulting algorithm, which takes inspiration from both DBSCAN and minimum-spanning-tree approaches, is deterministic but proves simpler and faster, and doesn't require setting in advance a value for k, the number of clusters.
We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps.
We divide the approaches into three groups based on whether they synthesize grasps for known, familiar or unknown objects.
This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique.
In the case of known objects, we concentrate on the approaches that are based on object recognition and pose estimation.
In the case of familiar objects, the techniques use some form of a similarity matching to a set of previously encountered objects.
Finally for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps.
Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping.
We also draw a parallel to the classical approaches that rely on analytic formulations.
This paper proposes a new approach to a novel value network architecture for the game Go, called a multi-labelled (ML) value network.
In the ML value network, different values (win rates) are trained simultaneously for different settings of komi, a compensation given to balance the initiative of playing first.
The ML value network has three advantages, (a) it outputs values for different komi, (b) it supports dynamic komi, and (c) it lowers the mean squared error (MSE).
This paper also proposes a new dynamic komi method to improve game-playing strength.
This paper also performs experiments to demonstrate the merits of the architecture.
First, the MSE of the ML value network is generally lower than that of the value network alone.
Second, the program based on the ML value network wins by a rate of 67.6% against the program based on the value network alone.
Third, the program with the proposed dynamic komi method significantly improves the playing strength over the baseline that does not use dynamic komi, especially for handicap games.
To our knowledge, no handicap games have been played openly to date by programs using value networks.
This paper provides these programs with a useful approach to playing handicap games.
Public scientists (henceforth simply scientists), understood as members of the teaching and/or research staff of a public university or a public research organization (including the humanities and social sciences), benefit the academic community, industry, and other social collectives through teaching and research.
Active involvement of scientists in culture is part of the richness of developed societies.
Some voices in current debates on the evaluation of societal impact and the role of universities in social development are calling for a refocus from a purely socioeconomic perspective to one that also includes the sociocultural benefits of academia. In this paper we focus on one facet of cultural engagement: writing literary fiction.
We narrow our general objective to local activities, given our interest in the engagement of scientists at this geographic scale.
Do local publishers include the literary work of scientists?
Are works written by scientists more likely to be local than works not written by scientists?
Despite the recent popularity of word embedding methods, there is only a small body of work exploring the limitations of these representations.
In this paper, we consider one aspect of embedding spaces, namely their stability.
We show that even relatively high frequency words (100-200 occurrences) are often unstable.
We provide empirical evidence for how various factors contribute to the stability of word embeddings, and we analyze the effects of stability on downstream tasks.
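A common way to make this notion of stability concrete, and plausibly close to what is measured here, is the overlap of a word's nearest neighbours across two embedding spaces (e.g., trained with different random seeds); this sketch uses cosine similarity and treats the choice of k as an assumption.

```python
import numpy as np

def nearest_neighbours(word, emb, k=10):
    """Indices of the k nearest neighbours of `word` (a row index of the
    embedding matrix) by cosine similarity."""
    M = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = M @ M[word]
    sims[word] = -np.inf  # exclude the word itself
    return set(np.argsort(sims)[::-1][:k])

def stability(word, emb_a, emb_b, k=10):
    """Stability of one word across two embedding spaces: the fraction of
    its k nearest neighbours shared between the spaces."""
    na = nearest_neighbours(word, emb_a, k)
    nb = nearest_neighbours(word, emb_b, k)
    return len(na & nb) / k
```

Note that this measure is invariant to any rotation of an embedding space, so it detects genuine changes in a word's relative position rather than superficial coordinate differences between training runs.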
Graphics Processing Units (GPUs) maintain a large register file to increase the thread level parallelism (TLP).
To increase the TLP further, recent GPUs have increased the number of on-chip registers in every generation.
However, with the increase in the register file size, the leakage power increases.
Also, with the technology advances, the leakage power component has increased and has become an important consideration for the manufacturing process.
The leakage power of a register file can be reduced by turning infrequently used registers into low power (drowsy or off) state after accessing them.
A major challenge in doing so is the lack of runtime register access information.
This paper proposes GREENER (GPU REgister file ENErgy Reducer): a system to minimize leakage energy of the register file of GPUs.
GREENER employs a compile-time analysis to estimate the run-time register access information.
The result of the analysis is used to determine the power state of the registers (ON, SLEEP, or OFF) after each instruction.
We propose a power optimized assembly instruction set that allows GREENER to encode the power state of the registers in the executable itself.
The modified assembly, along with a run-time optimization to update the power state of a register during execution, results in significant power reduction.
We implemented GREENER in GPGPU-Sim simulator, and used GPUWattch framework to measure the register file's leakage power.
Evaluation of GREENER on 21 kernels from the CUDASDK, GPGPU-SIM, Parboil, and Rodinia benchmark suites shows an average register leakage energy reduction of 69.04% and a maximum reduction of 87.95%, with negligible simulation-cycle overhead (0.53% on average).
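The compile-time decision can be illustrated schematically: given the (estimated) instruction indices at which each register is accessed, assign a power state after every access based on the distance to the next access. The threshold below is an arbitrary illustration, not the paper's actual policy:

```python
def power_states(accesses, sleep_threshold=2):
    """For each (instruction, register) access, pick the register's power
    state afterwards from the distance to its next access:
      OFF   - the register is never accessed again,
      SLEEP - the next access is more than sleep_threshold instructions away,
      ON    - the next access is imminent."""
    states = {}
    for reg, points in accesses.items():
        for i, pc in enumerate(points):
            if i + 1 == len(points):
                states[(pc, reg)] = "OFF"
            elif points[i + 1] - pc > sleep_threshold:
                states[(pc, reg)] = "SLEEP"
            else:
                states[(pc, reg)] = "ON"
    return states

# register r1 accessed at instructions 0, 1 and 8; r2 only at instruction 3
print(power_states({"r1": [0, 1, 8], "r2": [3]}))
```

In GREENER these decisions are encoded into the instruction stream itself via the power-optimized assembly format.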
3D pose estimation is a key component of many important computer vision tasks such as autonomous navigation and 3D scene understanding.
Most state-of-the-art approaches to 3D pose estimation solve this problem as a pose-classification problem in which the pose space is discretized into bins and a CNN classifier is used to predict a pose bin.
We argue that the 3D pose space is continuous and propose to solve the pose estimation problem in a CNN regression framework with a suitable representation, data augmentation and loss function that captures the geometry of the pose space.
Experiments on PASCAL3D+ show that the proposed 3D pose regression approach achieves competitive performance compared to the state-of-the-art.
The NIPS 2018 Adversarial Vision Challenge is a competition to facilitate measurable progress towards robust machine vision models and more generally applicable adversarial attacks.
This document is an updated version of our competition proposal that was accepted in the competition track of 32nd Conference on Neural Information Processing Systems (NIPS 2018).
Estimation of surface curvature from range data is important for a range of tasks in computer vision and robotics, such as object segmentation, object recognition and robotic grasping.
This work presents a fast method of robustly computing accurate metric principal curvature values from noisy point clouds which was implemented on GPU.
In contrast to existing readily available solutions which first differentiate the surface to estimate surface normals and then differentiate these to obtain curvature, amplifying noise, our method iteratively fits parabolic quadric surface patches to the data.
Additionally, previous methods with a similar formulation use less robust techniques that are poorly suited to high-noise sensors.
We demonstrate that our method is fast and provides better curvature estimates than existing techniques.
In particular we compare our method to several alternatives to demonstrate the improvement.
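The core fitting step can be sketched as a single least-squares fit (the paper's full method iterates and adds robustness, which is not reproduced here): fit a parabolic quadric z = ax² + bxy + cy² to a patch expressed in a local tangent frame, then read the principal curvatures off the Hessian at the origin.

```python
import numpy as np

def principal_curvatures(points):
    """Fit z = a*x^2 + b*x*y + c*y^2 to points in a local tangent frame
    (surface through the origin, normal along z) and return the principal
    curvatures: eigenvalues of the Hessian [[2a, b], [b, 2c]]."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    A = np.column_stack([x**2, x * y, y**2])
    (a, b, c), *_ = np.linalg.lstsq(A, z, rcond=None)
    return np.linalg.eigvalsh(np.array([[2 * a, b], [b, 2 * c]]))

# synthetic patch of a paraboloid z = 0.5*x^2 + 0.25*y^2 (curvatures 0.5, 1.0)
rng = np.random.default_rng(1)
xy = rng.uniform(-0.1, 0.1, size=(200, 2))
z = 0.5 * xy[:, 0]**2 + 0.25 * xy[:, 1]**2
k1, k2 = principal_curvatures(np.column_stack([xy, z]))
print(k1, k2)  # → ~0.5 and ~1.0
```

Fitting the surface directly avoids the double differentiation of normal-based pipelines, which is what amplifies sensor noise.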
Many machine learning image classifiers are vulnerable to adversarial attacks, inputs with perturbations designed to intentionally trigger misclassification.
Current adversarial methods directly alter pixel colors and evaluate against pixel norm-balls: pixel perturbations smaller than a specified magnitude, according to a measurement norm.
This evaluation, however, has limited practical utility, since perturbations in the pixel space do not correspond to the underlying real-world phenomena of image formation that lead to them, and it has no attached security motivation.
Pixels in natural images are measurements of light that has interacted with the geometry of a physical scene.
As such, we propose a novel evaluation measure, parametric norm-balls, obtained by directly perturbing the physical parameters that underlie image formation: lighting and geometry.
One enabling contribution we present is a physically-based differentiable renderer that allows us to propagate pixel gradients to the parametric space of lighting and geometry.
Our approach enables physically-based adversarial attacks, and our differentiable renderer leverages models from the interactive rendering literature to balance the performance and accuracy trade-offs necessary for a memory-efficient and scalable adversarial data augmentation workflow.
This work focuses on the construction of optimized binary signaling schemes for two-sender uncoded transmission of correlated sources over non-orthogonal Gaussian multiple access channels.
Specifically, signal constellations with binary pulse-amplitude-modulation are designed for two senders to optimize the overall system performance.
Although the two senders transmit their own messages independently, it is observed that the correlation between message sources can be exploited to mitigate the interference present in the non-orthogonal multiple access channel.
Based on a performance analysis under joint maximum-a-posteriori decoding, optimized constellations for various basic waveform correlations between the senders are derived.
Numerical results further confirm the effectiveness of the proposed design.
This paper introduces CLEO, a novel preference elicitation algorithm capable of recommending complex objects in hybrid domains, characterized by both discrete and continuous attributes and constraints defined over them.
The algorithm assumes minimal initial information, i.e., a set of catalog attributes, and defines decisional features as logic formulae combining Boolean and algebraic constraints over the attributes.
The (unknown) utility of the decision maker (DM) is modelled as a weighted combination of features.
CLEO iteratively alternates a preference elicitation step, where pairs of candidate solutions are selected based on the current utility model, and a refinement step where the utility is refined by incorporating the feedback received.
The elicitation step leverages a Max-SMT solver to return optimal hybrid solutions according to the current utility model.
The refinement step is implemented as learning to rank, and a sparsifying norm is used to favour the selection of few informative features in the combinatorial space of candidate decisional features.
CLEO is the first preference elicitation algorithm capable of dealing with hybrid domains, thanks to the use of Max-SMT technology, while retaining uncertainty in the DM utility and noisy feedback.
Experimental results on complex recommendation tasks show the ability of CLEO to quickly focus towards optimal solutions, as well as its capacity to recover from suboptimal initial choices.
While no competitors exist in the hybrid setting, CLEO outperforms a state-of-the-art Bayesian preference elicitation algorithm when applied to a purely discrete task.
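The role of the Max-SMT solver in the elicitation step can be illustrated with a brute-force stand-in that maximizes a weighted feature utility over a (discretized) hybrid domain. The attribute names and features below are invented for illustration; a real Max-SMT solver handles the continuous attributes and constraints symbolically rather than by enumeration.

```python
import itertools

def best_config(weights, features, domains):
    """Brute-force stand-in for the Max-SMT step: enumerate a discretized
    hybrid domain and return the configuration maximizing the weighted
    sum of decisional features under the current utility model."""
    best, best_u = None, float("-inf")
    for cfg in itertools.product(*domains.values()):
        x = dict(zip(domains, cfg))
        u = sum(w * f(x) for w, f in zip(weights, features))
        if u > best_u:
            best, best_u = x, u
    return best, best_u

# toy hybrid catalog: discrete 'color', coarsely discretized 'price'
domains = {"color": ["red", "blue"], "price": [10.0, 20.0, 30.0]}
features = [lambda x: x["color"] == "blue",   # Boolean feature
            lambda x: x["price"] <= 20.0]     # algebraic constraint
best, u = best_config([1.0, 2.0], features, domains)
print(best, u)
```

CLEO's refinement step then adjusts the weights from pairwise preference feedback, with a sparsifying norm keeping only a few informative features active.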
For reliable transmission across a noisy communication channel, classical results from information theory show that it is asymptotically optimal to separate out the source and channel coding processes.
However, this decomposition can fall short in the finite bit-length regime, as it requires non-trivial tuning of hand-crafted codes and assumes infinite computational power for decoding.
In this work, we propose to jointly learn the encoding and decoding processes using a new discrete variational autoencoder model.
By adding noise into the latent codes to simulate the channel during training, we learn to both compress and error-correct given a fixed bit-length and computational budget.
We obtain codes that are not only competitive against several separation schemes, but also learn useful robust representations of the data for downstream tasks such as classification.
Finally, inference amortization yields an extremely fast neural decoder, almost an order of magnitude faster compared to standard decoding methods based on iterative belief propagation.
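Simulating the channel during training amounts to injecting noise into the discrete latent code; for a binary symmetric channel this is independent bit flips. A schematic sketch (not the paper's training code):

```python
import numpy as np

def binary_symmetric_channel(bits, flip_prob, rng):
    """Simulate a noisy channel by flipping each latent bit independently
    with probability flip_prob."""
    flips = rng.random(bits.shape) < flip_prob
    return np.where(flips, 1 - bits, bits)

rng = np.random.default_rng(0)
code = rng.integers(0, 2, size=64)               # a 64-bit latent code
noisy = binary_symmetric_channel(code, flip_prob=0.1, rng=rng)
print((code != noisy).sum())                     # number of flipped bits
```

Training the decoder on such corrupted codes is what forces the learned representation to be simultaneously compressive and error-correcting.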
We present a mathematical model to predict pedestrian motion over a finite horizon, intended for use in collision avoidance algorithms for autonomous driving.
The model is based on a road map structure, and assumes a rational pedestrian behavior.
We compare our model with the state of the art and discuss its accuracy and limitations, both in simulations and against real data.
Markerless tracking of hands and fingers is a promising enabler for human-computer interaction.
However, adoption has been limited because of tracking inaccuracies, incomplete coverage of motions, low framerate, complex camera setups, and high computational requirements.
In this paper, we present a fast method for accurately tracking rapid and complex articulations of the hand using a single depth camera.
Our algorithm uses a novel detection-guided optimization strategy that increases the robustness and speed of pose estimation.
In the detection step, a randomized decision forest classifies pixels into parts of the hand.
In the optimization step, a novel objective function combines the detected part labels and a Gaussian mixture representation of the depth to estimate a pose that best fits the depth.
Our approach requires comparatively few computational resources, which makes it extremely fast (50 fps without GPU support).
The approach also supports varying static, or moving, camera-to-scene arrangements.
We show the benefits of our method by evaluating on public datasets and comparing against previous work.
In general, professionals still set aside scientific evidence in favor of expert opinions in most of their decision-making. For this reason, it is still common to see new software technologies adopted in the field without any scientific basis or well-grounded criteria, relying instead on the opinions of experts.
Experimental Software Engineering is of paramount importance to provide the foundations to understand the limits and applicability of software technologies.
The need to better observe and understand the practice of Software Engineering leads us to look for alternative experimental approaches to support our studies.
Different research strategies can be used to explore different Software Engineering practices.
Action Research can be seen as one alternative to intensify the conducting of important experimental studies with results of great value while investigating the Software Engineering practices in depth.
In this paper, a discussion on the use of Action Research in Software Engineering is presented.
Aiming at better explaining the application of Action Research, an experimental study (in vivo) on the investigation of the subjective decisions of software developers, concerned with the refactoring of source code to improve source code quality in a distributed software development context is depicted.
In addition, some guidance on how to accomplish an Action Research study in Software Engineering supplement the discussions.
In human face-based biometrics, gender classification and age estimation are two typical learning tasks.
Although a variety of approaches have been proposed to handle them, only a few solve the two jointly; even so, these joint methods do not specifically model the semantic difference between human gender and age, which is intuitively helpful for joint learning, leaving room for further improving performance.
To this end, in this work we first propose a general learning framework for jointly estimating human gender and age by formulating such semantic relationships as a form of near-orthogonality regularization, which we then incorporate into the objective of the joint learning framework.
In order to evaluate the effectiveness of the proposed framework, we exemplify it by respectively taking the widely used binary-class SVM for gender classification, and two threshold-based ordinal regression methods (i.e., the discriminant learning for ordinal regression and support vector ordinal regression) for age estimation, and crucially coupling both through the proposed semantic formulation.
Moreover, we develop its kernelized nonlinear counterpart by deriving a representer theorem for the joint learning strategy.
Finally, through extensive experiments on three aging datasets FG-NET, Morph Album I and Morph Album II, we demonstrate the effectiveness and superiority of the proposed joint learning strategy.
Phase retrieval refers to the problem of recovering real- or complex-valued vectors from magnitude measurements.
The best-known algorithms for this problem are iterative in nature and rely on so-called spectral initializers that provide accurate initialization vectors.
We propose a novel class of estimators suitable for general nonlinear measurement systems, called linear spectral estimators (LSPEs), which can be used to compute accurate initialization vectors for phase retrieval problems.
The proposed LSPEs not only provide accurate initialization vectors for noisy phase retrieval systems with structured or random measurement matrices, but also enable the derivation of sharp and nonasymptotic mean-squared error bounds.
We demonstrate the efficacy of LSPEs on synthetic and real-world phase retrieval problems, and show that our estimators significantly outperform existing methods for structured measurement systems that arise in practice.
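For context, the classical spectral initializer that LSPEs relate to is the leading eigenvector of a data-weighted matrix D = (1/m) Σ yᵢ aᵢaᵢᵀ, computable by power iteration. A sketch for real-valued measurements yᵢ = ⟨aᵢ, x⟩² (the paper's LSPEs are a different, linear construction; this shows only the baseline idea):

```python
import numpy as np

def spectral_init(A, y, iters=200):
    """Classical spectral initializer: leading eigenvector of
    D = (1/m) * A^T diag(y) A, found by power iteration."""
    m, n = A.shape
    rng = np.random.default_rng(0)
    v = rng.normal(size=n)
    for _ in range(iters):
        v = A.T @ (y * (A @ v)) / m   # multiply by D without forming it
        v /= np.linalg.norm(v)
    return v

rng = np.random.default_rng(1)
n, m = 20, 400
x = rng.normal(size=n); x /= np.linalg.norm(x)   # unknown unit signal
A = rng.normal(size=(m, n))                      # random measurement matrix
y = (A @ x) ** 2                                 # magnitude-squared data
v = spectral_init(A, y)
print(abs(v @ x))   # correlation with the true signal (sign is ambiguous)
```

With sufficient oversampling the recovered direction correlates strongly with the true signal, which is what makes it a useful initializer for iterative phase retrieval.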
Manipulations of return addresses on the stack are the basis for a variety of attacks on programs written in memory unsafe languages.
Dual stack schemes for protecting return addresses promise an efficient and effective defense against such attacks.
By introducing a second, safe stack to separate return addresses from potentially unsafe stack objects, they prevent attacks that, for example, maliciously modify a return address by overflowing a buffer.
However, the security of dual stacks is based on the concealment of the safe stack in memory.
Unfortunately, all current dual stack schemes are vulnerable to information disclosure attacks that are able to reveal the safe stack location, and therefore effectively break their promised security properties.
In this paper, we present a new, leak-resilient dual stack scheme capable of withstanding sophisticated information disclosure attacks.
We carefully study previous dual stack schemes and systematically develop a novel design for stack separation that eliminates flaws leading to the disclosure of safe stacks.
We show the feasibility and practicality of our approach by presenting a full integration into the LLVM compiler framework with support for the x86-64 and ARM64 architectures.
With an average of 2.7% on x86-64 and 0.0% on ARM64, the performance overhead of our implementation is negligible.
We live in a world where our personal data are both valuable and vulnerable to misappropriation through exploitation of security vulnerabilities in online services.
For instance, Dropbox, a popular cloud storage tool, has certain security flaws that can be exploited to compromise a user's data, one of which being that a user's access pattern is unprotected.
We have thus created an implementation of Path Oblivious RAM (Path ORAM) for Dropbox users to obfuscate path access information to patch this vulnerability.
This implementation differs significantly from the standard usage of Path ORAM, in that we introduce several innovations, including a dynamically growing and shrinking tree architecture, multi-block fetching, block packing and the possibility for multi-client use.
Our optimizations together produce about a 77% throughput increase and a 60% reduction in necessary tree size; these numbers vary with file size distribution.
Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems.
In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants.
We trained 31 word embedding models using FastText, GloVe, Wang2Vec and Word2Vec.
We evaluated them intrinsically on syntactic and semantic analogies and extrinsically on POS tagging and sentence semantic similarity tasks.
The obtained results suggest that word analogies are not appropriate for word embedding evaluation; task-specific evaluations appear to be a better option.
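The intrinsic analogy evaluation referred to is typically the 3CosAdd test ("a is to b as c is to ?"). A minimal sketch with a toy vocabulary and hand-built vectors (real evaluations use the trained embeddings and standard analogy test sets):

```python
import numpy as np

def analogy(emb, vocab, a, b, c):
    """3CosAdd: answer 'a is to b as c is to ?' with argmax cos(b - a + c)."""
    idx = {w: i for i, w in enumerate(vocab)}
    vecs = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    target = vecs[idx[b]] - vecs[idx[a]] + vecs[idx[c]]
    sims = vecs @ target
    for w in (a, b, c):                  # exclude the query words themselves
        sims[idx[w]] = -np.inf
    return vocab[int(np.argmax(sims))]

vocab = ["man", "woman", "king", "queen", "dog"]
emb = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 1.0],
                [0.5, 0.5, 0.0]])
print(analogy(emb, vocab, "man", "king", "woman"))  # → queen
```

One known criticism of this test, consistent with the conclusion above, is that excluding the query words and relying on linear offsets can make scores diverge from downstream usefulness.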
This paper introduces a new IEEE 802.15.4 simulation model for OMNeT++ / INET.
802.15.4 is an important underlying standard for wireless sensor networks and Internet of Things scenarios.
The presented implementation is designed to be compatible with OMNeT++ 4.x and INET 2.x and laid-out to be expandable for newer revisions of the 802.15.4 standard.
The source code is available online https://github.com/michaelkirsche/IEEE802154INET-Standalone
Community detection is one of the most active fields in complex networks analysis, due to its potential value in practical applications.
Many works, inspired by different paradigms, are devoted to developing algorithmic solutions that reveal the network structure in such cohesive subgroups.
Comparative studies reported in the literature usually rely on a performance measure considering the community structure as a partition (Rand Index, Normalized Mutual information, etc.).
However, this type of comparison neglects the topological properties of the communities.
In this article, we present a comprehensive comparative study of a representative set of community detection methods, in which we adopt both types of evaluation.
Community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure.
In order to mimic real-world systems, we use artificially generated realistic networks.
It turns out there is no equivalence between both approaches: a high performance does not necessarily correspond to correct topological properties, and vice-versa.
They can therefore be considered as complementary, and we recommend applying both of them in order to perform a complete and accurate assessment.
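Partition-based measures such as Normalized Mutual Information can be computed directly from community labels; a compact sketch (natural-log formulation; some papers normalize differently, e.g. by the mean rather than the geometric mean of the entropies):

```python
from collections import Counter
from math import log

def nmi(part_a, part_b):
    """Normalized mutual information between two node partitions
    (lists mapping node index -> community label)."""
    n = len(part_a)
    pa, pb = Counter(part_a), Counter(part_b)
    joint = Counter(zip(part_a, part_b))
    mi = sum(c / n * log(c * n / (pa[x] * pb[y]))
             for (x, y), c in joint.items())
    ha = -sum(c / n * log(c / n) for c in pa.values())
    hb = -sum(c / n * log(c / n) for c in pb.values())
    return mi / ((ha * hb) ** 0.5) if ha > 0 and hb > 0 else 1.0

print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # identical up to relabeling → 1.0
```

Note what such a measure ignores: two detected partitions with identical NMI can assign very different sizes, densities or transitivities to their communities, which is exactly the gap the topological evaluation above addresses.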
This article presents the prediction difference analysis method for visualizing the response of a deep neural network to a specific input.
When classifying images, the method highlights areas in a given input image that provide evidence for or against a certain class.
It overcomes several shortcomings of previous methods and provides valuable additional insight into the decision-making process of classifiers.
Making neural network decisions interpretable through visualization is important both to improve models and to accelerate the adoption of black-box classifiers in application areas such as medicine.
We illustrate the method in experiments on natural images (ImageNet data), as well as medical images (MRI brain scans).
Computer vision has made remarkable progress in recent years.
Deep neural network (DNN) models optimized to identify objects in images exhibit unprecedented task-trained accuracy and, remarkably, some generalization ability: new visual problems can now be solved more easily based on previous learning.
Biological vision (learned in life and through evolution) is also accurate and general-purpose.
Is it possible that these different learning regimes converge to similar problem-dependent optimal computations?
We therefore asked whether the human system-level computation of visual perception has DNN correlates and considered several anecdotal test cases.
We found that perceptual sensitivity to image changes has DNN mid-computation correlates, while sensitivity to segmentation, crowding and shape has DNN end-computation correlates.
Our results quantify the applicability of using DNN computation to estimate perceptual loss, and are consistent with the fascinating theoretical view that properties of human perception are a consequence of architecture-independent visual learning.
In this paper, we propose a novel deep convolutional neural network (CNN)-based algorithm for solving ill-posed inverse problems.
Regularized iterative algorithms have emerged as the standard approach to ill-posed inverse problems in the past few decades.
These methods produce excellent results, but can be challenging to deploy in practice due to factors including the high computational cost of the forward and adjoint operators and the difficulty of hyperparameter selection.
The starting point of our work is the observation that unrolled iterative methods have the form of a CNN (filtering followed by point-wise non-linearity) when the normal operator (H*H, the adjoint of H times H) of the forward model is a convolution.
Based on this observation, we propose using direct inversion followed by a CNN to solve normal-convolutional inverse problems.
The direct inversion encapsulates the physical model of the system, but leads to artifacts when the problem is ill-posed; the CNN combines multiresolution decomposition and residual learning in order to learn to remove these artifacts while preserving image structure.
We demonstrate the performance of the proposed network in sparse-view reconstruction (down to 50 views) on parallel beam X-ray computed tomography in synthetic phantoms as well as in real experimental sinograms.
The proposed network outperforms total variation-regularized iterative reconstruction for the more realistic phantoms and requires less than a second to reconstruct a 512 x 512 image on GPU.
The heterogeneous cloud radio access network (H-CRAN) is a promising paradigm which incorporates the cloud computing into heterogeneous networks (HetNets), thereby taking full advantage of cloud radio access networks (C-RANs) and HetNets.
Characterizing cooperative beamforming under fronthaul capacity and queue stability constraints is critical for multimedia applications to improve energy efficiency (EE) in H-CRANs.
An energy-efficient optimization objective function with individual fronthaul capacity and inter-tier interference constraints is presented in this paper for queue-aware multimedia H-CRANs.
To solve this non-convex objective function, a stochastic optimization problem is reformulated by introducing the general Lyapunov optimization framework.
Under the Lyapunov framework, this optimization problem is equivalent to an optimal network-wide cooperative beamformer design algorithm with instantaneous power, average power and inter-tier interference constraints, which can be regarded as the weighted sum EE maximization problem and solved by a generalized weighted minimum mean square error approach.
The mathematical analysis and simulation results demonstrate that a tradeoff between EE and queuing delay can be achieved, and this tradeoff strictly depends on the fronthaul constraint.
Transferring the knowledge of pretrained networks to new domains by means of finetuning is a widely used practice for applications based on discriminative models.
To the best of our knowledge this practice has not been studied within the context of generative deep networks.
Therefore, we study domain adaptation applied to image generation with generative adversarial networks.
We evaluate several aspects of domain adaptation, including the impact of target domain size, the relative distance between source and target domain, and the initialization of conditional GANs.
Our results show that using knowledge from pretrained networks can shorten the convergence time and can significantly improve the quality of the generated images, especially when the target data is limited.
We show that these conclusions can also be drawn for conditional GANs even when the pretrained model was trained without conditioning.
Our results also suggest that density may be more important than diversity, and that a dataset with one or a few densely sampled classes may be a better source model than more diverse datasets such as ImageNet or Places.
Many tasks in music information retrieval, such as recommendation, and playlist generation for online radio, fall naturally into the query-by-example setting, wherein a user queries the system by providing a song, and the system responds with a list of relevant or similar song recommendations.
Such applications ultimately depend on the notion of similarity between items to produce high-quality results.
Current state-of-the-art systems employ collaborative filter methods to represent musical items, effectively comparing items in terms of their constituent users.
While collaborative filter techniques perform well when historical data is available for each item, their reliance on historical data impedes performance on novel or unpopular items.
To combat this problem, practitioners rely on content-based similarity, which naturally extends to novel items but is typically outperformed by collaborative filter methods.
In this article, we propose a method for optimizing content-based similarity by learning from a sample of collaborative filter data.
The optimized content-based similarity metric can then be applied to answer queries on novel and unpopular items, while still maintaining high recommendation accuracy.
The proposed system yields accurate and efficient representations of audio content, and experimental results show significant improvements in accuracy over competing content-based recommendation techniques.
The potential for machine learning (ML) systems to amplify social inequities and unfairness is receiving increasing popular and academic attention.
A surge of recent work has focused on the development of algorithmic tools to assess and mitigate such unfairness.
If these tools are to have a positive impact on industry practice, however, it is crucial that their design be informed by an understanding of real-world needs.
Through 35 semi-structured interviews and an anonymous survey of 267 ML practitioners, we conduct the first systematic investigation of commercial product teams' challenges and needs for support in developing fairer ML systems.
We identify areas of alignment and disconnect between the challenges faced by industry practitioners and solutions proposed in the fair ML research literature.
Based on these findings, we highlight directions for future ML and HCI research that will better address industry practitioners' needs.
Algorithmic and data bias are gaining attention as a pressing issue in the popular press - and rightly so.
However, beyond these calls to action, standard processes and tools for practitioners do not readily exist to assess and address unfair algorithmic and data biases.
The literature is relatively scattered and the needed interdisciplinary approach means that very different communities are working on the topic.
We here provide a number of challenges encountered in assessing and addressing algorithmic and data bias in practice.
We describe an early approach that attempts to translate the literature into processes for (production) teams wanting to assess both intended data and algorithm characteristics and unintended, unfair biases.
In this study, a shell-and-tube heat exchanger (STHX) design based on seven continuous independent design variables is proposed.
Delayed Rejection Adaptive Metropolis hasting (DRAM) was utilized as a powerful tool in the Markov chain Monte Carlo (MCMC) sampling method.
This Reverse Sampling (RS) method was used to find the probability distribution of design variables of the shell and tube heat exchanger.
Thanks to this probability distribution, an uncertainty analysis was also performed to find the quality of these variables.
In addition, a decision-making strategy based on confidence intervals of design variables and on the Total Annual Cost (TAC) provides the final selection of design variables.
Results indicated high accuracies for the estimation of design variables which leads to marginally improved performance compared to commonly used optimization methods.
To verify the capability of the proposed method, a case study is also presented; it shows that a significant cost reduction is feasible with respect to multi-objective and single-objective optimization methods.
Furthermore, the selected variables have good quality (in terms of probability distribution) and a lower TAC was also achieved.
Results show that the costs of the proposed design are lower than those obtained from optimization method reported in previous studies.
The algorithm was also used to determine the impact of using probability values for the design variables rather than single values to obtain the best heat transfer area and pumping power.
In particular, a reduction of the TAC up to 3.5% was achieved in the case considered.
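DRAM builds delayed rejection and proposal adaptation on top of the basic random-walk Metropolis-Hastings loop, which can be sketched as follows (the toy posterior is illustrative; the paper applies the sampler to the STHX design variables):

```python
import numpy as np

def metropolis_hastings(log_post, x0, steps, step_size, rng):
    """Random-walk Metropolis-Hastings: DRAM's delayed-rejection and
    adaptive refinements build on this basic accept/reject loop."""
    x = np.array(x0, dtype=float)
    lp = log_post(x)
    samples = []
    for _ in range(steps):
        prop = x + step_size * rng.normal(size=x.shape)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept with prob p'/p
            x, lp = prop, lp_prop
        samples.append(x.copy())
    return np.array(samples)

# toy posterior: standard normal in 2D, started far from the mode
rng = np.random.default_rng(0)
chain = metropolis_hastings(lambda v: -0.5 * v @ v, [3.0, -3.0],
                            steps=5000, step_size=1.0, rng=rng)
print(chain[1000:].mean(axis=0))   # post-burn-in mean, near [0, 0]
```

The empirical distribution of such a chain is what yields the probability distributions, and hence the confidence intervals, of the design variables described above.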
Convolutional Neural Networks (CNNs) have gained a remarkable success on many real-world problems in recent years.
However, the performance of CNNs relies heavily on their architectures.
For some state-of-the-art CNNs, their architectures are hand-crafted with expertise in both CNNs and the investigated problems.
As a result, it is difficult for researchers without extensive expertise in CNNs to explore CNNs for their own problems of interest.
In this paper, we propose an automatic architecture design method for CNNs by using genetic algorithms, which is capable of discovering a promising architecture of a CNN on handling image classification tasks.
The proposed algorithm does not need any pre-processing before it works, nor any post-processing on the discovered CNN, which means it is completely automatic.
The proposed algorithm is validated on widely used benchmark datasets, by comparing to the state-of-the-art peer competitors covering eight manually designed CNNs, four semi-automatically designed CNNs and additional four automatically designed CNNs.
The experimental results indicate that the proposed algorithm achieves the best classification accuracy consistently among manually and automatically designed CNNs.
Furthermore, the proposed algorithm shows classification accuracy competitive with that of the semi-automatic peer competitors, while using ten times fewer parameters.
In addition, on average the proposed algorithm uses only one percent of the computational resources consumed by the other architecture-discovery algorithms.
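Stripped of CNN training, a genetic algorithm for architecture search is a selection/crossover/mutation loop over encoded architectures. A toy sketch in which an "architecture" is a list of per-layer filter counts and `fitness` is a cheap stand-in for validation accuracy (both are invented for illustration):

```python
import random

def genetic_search(fitness, gene_space, pop_size=20, generations=30, seed=0):
    """Bare-bones genetic algorithm: elitist selection, one-point
    crossover and a single point mutation per child."""
    rng = random.Random(seed)
    pop = [[rng.choice(g) for g in gene_space] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]              # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(gene_space))
            child = a[:cut] + b[cut:]              # one-point crossover
            i = rng.randrange(len(gene_space))     # point mutation
            child[i] = rng.choice(gene_space[i])
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# toy "architecture": four layers of 16/32/64 filters; fitness favors width
gene_space = [[16, 32, 64]] * 4
best = genetic_search(lambda arch: sum(arch), gene_space)
print(best)   # a high-fitness architecture
```

In the real algorithm each fitness evaluation trains the decoded CNN, which is why the reported computational savings matter so much.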
SEMAT/OMG Essence provides a powerful Language and a Kernel for describing software development processes.
How can it be tweaked to apply it to systems engineering methods description?
We must harmonize Essence with various systems engineering standards in order to provide a more formal systems approach and obtain a Systems Engineering Essence.
In this paper, an approach of using Essence for systems engineering is presented.
In this approach we partly modified the Kernel, only within the engineering-solution area of concern, and completely preserved the Language as an excellent situational method engineering foundation.
Cyberbullying is a disturbing online misbehaviour with troubling consequences.
It appears in different forms, and in most of the social networks, it is in textual format.
Automatic detection of such incidents requires intelligent systems.
Most of the existing studies have approached this problem with conventional machine learning models and the majority of the developed models in these studies are adaptable to a single social network at a time.
In recent studies, deep-learning-based models have found their way into the detection of cyberbullying incidents, with claims that they can overcome the limitations of conventional models and improve detection performance.
In this paper, we investigate the findings of a recent study in this regard.
We successfully reproduced and validated its findings using the same datasets, namely Wikipedia, Twitter, and Formspring, used by its authors.
Then we expanded our work by applying the developed methods on a new YouTube dataset (~54k posts by ~4k users) and investigated the performance of the models in new social media platforms.
We also transferred and evaluated the performance of the models trained on one platform to another platform.
Our findings show that the deep learning based models outperform the machine learning models previously applied to the same YouTube dataset.
We believe that the deep learning based models can also benefit from integrating other sources of information and looking into the impact of profile information of the users in social networks.
To improve the accuracy of existing dust concentration measurements, this paper proposes a dust concentration measurement based on the moment of inertia of the Gray-level Rank Co-occurrence Matrix (GRCM), computed from dust image samples acquired by a machine vision system.
Firstly, a polynomial computational model between dust concentration and moment of inertia (PCM) is established by experimental and fitting methods.
Then, computing methods for the GRCM and its moment of inertia are constructed by theoretical and mathematical analysis.
Next, an on-line dust concentration vision measurement experimental system is developed, and the measurement of cement dust concentration in a cement production workshop is taken as a practical example of the system and the PCM measurement.
The results show that the measurement error is within 9% and the measurement range is 0.5-1000 mg/m3.
Finally, compared with filter membrane weighing, light scattering, and laser measurements, the proposed PCM measurement has advantages in error and cost, and can provide a valuable reference for dust concentration vision measurements.
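As a rough illustration of the texture feature the abstract builds on (the paper's rank-based GRCM is a variant of the standard gray-level co-occurrence matrix, which is used here instead; function names and parameters are our own), the following sketch computes a normalized co-occurrence matrix and its moment of inertia:

```python
import numpy as np

def glcm(q, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix of a quantized image q
    (integer values in [0, levels)) for the pixel offset (dx, dy)."""
    M = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            M[q[y, x], q[y + dy, x + dx]] += 1
    return M / M.sum()

def moment_of_inertia(P):
    """Moment of inertia (a.k.a. contrast) of a co-occurrence matrix:
    the sum over (i, j) of (i - j)^2 * P[i, j]."""
    i, j = np.indices(P.shape)
    return float(((i - j) ** 2 * P).sum())
```

An image with stronger local gray-level variation yields a larger moment of inertia; fitting a polynomial from this value to reference concentration readings would give a PCM-style model.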
In this paper we consider the problem of robot navigation in simple maze-like environments where the robot has to rely on its onboard sensors to perform the navigation task.
In particular, we are interested in solutions to this problem that do not require localization, mapping or planning.
Additionally, we require that our solution can quickly adapt to new situations (e.g., changing navigation goals and environments).
To meet these criteria we frame this problem as a sequence of related reinforcement learning tasks.
We propose a successor feature based deep reinforcement learning algorithm that can learn to transfer knowledge from previously mastered navigation tasks to new problem instances.
Our algorithm substantially decreases the required learning time after the first task instance has been solved, which makes it easily adaptable to changing environments.
We validate our method in both simulated and real robot experiments with a Robotino and compare it to a set of baseline methods including classical planning-based navigation.
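The core idea behind successor-feature transfer can be sketched as follows: action values factor as Q(s, a) = psi(s, a) . w, where psi captures the environment dynamics and w encodes the task reward, so a new navigation goal only requires re-estimating w. All quantities below are random stand-ins, not the paper's learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, d = 4, 8
psi = rng.normal(size=(n_actions, d))  # successor features for one state
w_task1 = rng.normal(size=d)           # reward weights describing task 1

q1 = psi @ w_task1                     # Q(s, a) = psi(s, a) . w
best1 = int(np.argmax(q1))             # greedy action for task 1

# Transfer to a new goal: psi is reused unchanged; only the (cheap to
# re-estimate) reward weights change, which is what makes adaptation fast.
w_task2 = rng.normal(size=d)
q2 = psi @ w_task2
```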
This is an intuitive introduction to classic sliding mode control that shows how the associated assumptions and conditions for its use arise naturally in the course of deriving the method.
It derives a controller that obviates the need to assume any particular sign for the control input vector, explains why the method is said to handle only matched disturbances, and explains why a system to which it applies 'must' be linear in the control signal.
Additionally, the derivation may be viewed as an example of how a control design method can be developed, adding to its pedagogical usefulness.
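As a minimal companion example (our own, with illustrative gains), the classic controller u = -K sign(s) applied to a double integrator shows the matched-disturbance point: the disturbance d enters through the same channel as u, so a switching gain K exceeding its bound can dominate it.

```python
import numpy as np

def simulate(K=5.0, c=2.0, dt=1e-3, T=5.0):
    """Double integrator x'' = u + d(t) under the switching law u = -K*sign(s)."""
    x, v = 1.0, 0.0                      # initial tracking error and rate
    for k in range(int(T / dt)):
        d = 0.5 * np.sin(3 * k * dt)     # bounded matched disturbance, |d| < K
        s = c * x + v                    # sliding surface s = c*e + e_dot
        u = -K * np.sign(s)              # sliding mode control
        v += (u + d) * dt                # Euler integration of the dynamics
        x += v * dt
    return x, v
```

With K larger than the disturbance bound, the state reaches the surface s = 0 in finite time and then decays toward the origin, up to the small chattering induced by the sign function.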
We present new algorithms for inference in credal networks, that is, directed acyclic graphs associated with sets of probabilities.
Credal networks are here interpreted as encoding strong independence relations among variables.
We first present a theory of credal networks based on separately specified sets of probabilities.
We also show that inference with polytrees is NP-hard in this setting.
We then introduce new techniques that reduce the computational effort demanded by inference, particularly in polytrees, by exploring separability of credal sets.
Neural encoder-decoder models of machine translation have achieved impressive results, rivalling traditional translation models.
However, their modelling formulation is overly simplistic and omits several key inductive biases built into traditional models.
In this paper we extend the attentional neural translation model to include structural biases from word based alignment models, including positional bias, Markov conditioning, fertility and agreement over translation directions.
We show improvements over a baseline attentional model and standard phrase-based model over several language pairs, evaluating on difficult languages in a low resource setting.
The prevention of domestic violence (DV) has raised serious concerns in Taiwan because of the disparity between the increasing number of reported DV cases, which doubled over the past decade, and the scarcity of social workers.
Additionally, a large amount of data is collected when social workers use the predominant case-management approach to document case report information.
However, these data were not properly stored or organized.
To improve the efficiency of DV prevention and risk management, we worked with Taipei City Government and utilized the 2015 data from its DV database to perform a spatial pattern analysis of the reports of DV cases to build a DV risk map.
However, during our map building process, the issue of confounding bias arose because we were not able to verify whether reported cases truly reflected real violence occurrences or were simply false reports from potential victims' neighbors.
Therefore, we used the random forest method to build a repeat victimization risk prediction model.
The accuracy and F1-measure of our model were 96.3% and 62.8%, respectively.
This model helped social workers differentiate the risk level of new cases, which significantly reduced their workload.
To our knowledge, this is the first project that utilized machine learning in DV prevention.
The research approach and results of this project can not only improve the DV prevention process but also be applied to other social work or crime prevention areas.
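As a hedged sketch of the modeling step (the features and labels below are synthetic stand-ins; the abstract does not disclose the actual case features), a random-forest repeat-victimization classifier looks like:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))                    # synthetic case features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # synthetic "repeat" label

# Train on the first 400 cases, evaluate on the held-out 100.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:400], y[:400])
pred = clf.predict(X[400:])
acc = float((pred == y[400:]).mean())
f1 = float(f1_score(y[400:], pred))
```

The gap between the reported accuracy (96.3%) and F1 (62.8%) is typical of imbalanced risk data, which is why F1 is the more informative of the two metrics here.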
Publishing fast changing dynamic data as open data on the web in a scalable manner is not trivial.
So far, the only documented approaches publish as much data as possible, which leads to problems such as server capacity overload, network latency, and unwanted knowledge disclosure.
With this paper, we show how to publish dynamic data in a scalable, meaningful manner by applying context-dependent publication heuristics.
The outcome shows that the application of the right publication heuristics in the right domain can improve the publication performance significantly.
Good knowledge of the domain helps in choosing the right publication heuristic and hence leads to very good publication results.
Human Activity Recognition (HAR) based on motion sensors has drawn a lot of attention over the last few years, since perceiving the human status enables context-aware applications to adapt their services to users' needs.
However, motion sensor fusion and feature extraction have not reached their full potential and remain an open issue.
In this paper, we introduce PerceptionNet, a deep Convolutional Neural Network (CNN) that applies a late 2D convolution to multimodal time-series sensor data, in order to extract automatically efficient features for HAR.
We evaluate our approach on two publicly available HAR datasets and demonstrate that the proposed model effectively fuses multimodal sensor data and improves HAR performance.
In particular, PerceptionNet surpasses the performance of state-of-the-art HAR methods based on: (i) features extracted from humans, (ii) deep CNNs exploiting early fusion approaches, and (iii) Long Short-Term Memory (LSTM), by an average accuracy of more than 3%.
Toponym Resolution, the task of assigning a location mention in a document to a geographic referent (i.e., latitude/longitude), plays a pivotal role in analyzing location-aware content.
However, the ambiguities of natural language and the huge number of possible interpretations for toponyms constitute formidable hurdles for this task.
In this paper, we study the problem of toponym resolution with no additional information other than a gazetteer and no training data.
We demonstrate that a dearth of large enough annotated data makes supervised methods less capable of generalizing.
Our proposed method estimates the geographic scope of documents and leverages the connections between nearby place names as evidence to resolve toponyms.
We explore the interactions between multiple interpretations of mentions and the relationships between different toponyms in a document to build a model that finds the most coherent resolution.
Our model is evaluated on three news corpora, two from the literature and one collected and annotated by us; then, we compare our methods to the state-of-the-art unsupervised and supervised techniques.
We also examine three commercial products: Reuters OpenCalais, Yahoo! YQL Placemaker, and Google Cloud Natural Language API.
The evaluation shows that our method outperforms the unsupervised technique as well as Reuters OpenCalais and Google Cloud Natural Language API on all three corpora; also, our method shows a performance close to that of the state-of-the-art supervised method and outperforms it when the test data has 40% or more toponyms that are not seen in the training data.
This work examines the mean-square error performance of diffusion stochastic algorithms under a generalized coordinate-descent scheme.
In this setting, the adaptation step by each agent is limited to a random subset of the coordinates of its stochastic gradient vector.
The selection of coordinates varies randomly from iteration to iteration and from agent to agent across the network.
Such schemes are useful in reducing computational complexity at each iteration in power-intensive large data applications.
They are also useful in modeling situations where some partial gradient information may be missing at random.
Interestingly, the results show that the steady-state performance of the learning strategy is not always degraded, while the convergence rate suffers some degradation.
The results provide yet another indication of the resilience and robustness of adaptive distributed strategies.
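A minimal sketch of such a scheme (our own toy construction, not the paper's exact model): diffusion LMS over a fully connected network in which each agent's stochastic-gradient step touches only a random subset of coordinates per iteration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, agents, mu = 5, 4, 0.05
w_true = rng.normal(size=d)                   # unknown parameter vector
W = np.zeros((agents, d))                     # each agent's estimate

for _ in range(2000):
    combined = W.mean(0)                      # combine: uniform averaging
    for a in range(agents):
        x = rng.normal(size=d)                # streaming regressor
        y = x @ w_true + 0.01 * rng.normal()  # noisy scalar measurement
        g = -(y - x @ combined) * x           # stochastic gradient of squared error
        mask = rng.random(d) < 0.5            # random coordinate subset
        W[a] = combined - mu * g * mask       # partial (masked) adaptation step
```

Despite updating only about half the coordinates each iteration, the network mean still converges to w_true, consistent with the abstract's observation that steady-state performance need not degrade while the convergence rate slows.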
Motivation: Recognizing human actions in a video is a challenging task which has applications in various fields.
Previous works in this area have used images from either a 2D or a 3D camera.
Few have used the idea that human actions can be easily identified from the movement of the joints in 3D space, and those that did relied on a Recurrent Neural Network (RNN) for modeling.
Convolutional neural networks (CNNs) have the ability to recognise even complex patterns in data, which makes them suitable for detecting human actions.
We therefore modeled a CNN that predicts human activity from the joint data.
Furthermore, using the joint data representation has the benefit of lower dimensionality than image or video representations.
This makes our model simpler and faster than the RNN models.
In this study, we have developed a six-layer convolutional network that reduces each input feature vector of the form 15x1961x4 to a one-dimensional binary vector giving the predicted activity.
Results: Our model recognises an activity correctly with up to 87% accuracy.
Joint data are taken from the Cornell Activity Datasets, which contain day-to-day activities such as talking, relaxing, eating, and cooking.
This letter concerns a principal weakness of the article published by Li et al. in 2014.
The mentioned work appears to contain a serious conceptual mistake in the presentation of its theoretical approach.
In fact, the work tried to design a new attack, together with an effective solution, for a basic watermarking algorithm published by Zhu et al. in 2013; in practice, however, we show that Li et al.'s approach does not achieve this aim.
To disprove the incorrect approach, we need only a numerical example serving as a counterexample to Li et al.'s approach.
In this paper, we propose a novel CS approach in which the acquisition of non-visible information is also avoided.
Engineering software systems is a multidisciplinary activity, whereby a number of artifacts must be created - and maintained - synchronously.
In this paper we investigate whether production code and the accompanying tests co-evolve by exploring a project's versioning system, code coverage reports and size-metrics.
Three open source case studies teach us that testing activities usually start later on during the lifetime and are more "phased", although we did not observe increasing testing activity before releases.
Furthermore, we note large differences in the levels of test coverage given the proportion of test code.
In neural machine translation (NMT), the most common practice is to stack a number of recurrent or feed-forward layers in the encoder and the decoder.
In general, the addition of each new layer improves the translation quality significantly.
However, this also leads to a significant increase in the number of parameters.
In this paper, we propose to share parameters across all the layers thereby leading to a recurrently stacked NMT model.
We empirically show that the translation quality of a model that recurrently stacks a single layer 6 times is comparable to the translation quality of a model that stacks 6 separate layers.
We also show that using pseudo-parallel corpora by back-translation leads to further significant improvements in translation quality.
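The parameter-sharing idea can be sketched in a few lines (a toy affine-plus-tanh layer stands in for a real recurrent or feed-forward encoder layer; dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 16, 6
W = rng.normal(size=(d, d)) * 0.1        # the single shared layer's weights
b = np.zeros(d)

def layer(h, W, b):
    """One toy encoder layer: affine map followed by tanh."""
    return np.tanh(h @ W + b)

h = rng.normal(size=d)                   # input representation
shared = h
for _ in range(N):                       # recurrent stacking: reuse W, b N times
    shared = layer(shared, W, b)

params_shared = W.size + b.size          # parameters of the shared model
params_separate = N * (W.size + b.size)  # parameters of N distinct layers
```

The recurrently stacked model performs the same N applications of a layer while storing one-sixth of the layer parameters, which is the trade-off the abstract reports.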
The deployment of deep convolutional neural networks (CNNs) in many real world applications is largely hindered by their high computational cost.
In this paper, we propose a novel learning scheme for CNNs to simultaneously 1) reduce the model size; 2) decrease the run-time memory footprint; and 3) lower the number of computing operations, without compromising accuracy.
This is achieved by enforcing channel-level sparsity in the network in a simple but effective way.
Different from many existing approaches, the proposed method directly applies to modern CNN architectures, introduces minimum overhead to the training process, and requires no special software/hardware accelerators for the resulting models.
We call our approach network slimming: it takes wide and large networks as input models, and during training insignificant channels are automatically identified and subsequently pruned, yielding thin and compact models with comparable accuracy.
We empirically demonstrate the effectiveness of our approach with several state-of-the-art CNN models, including VGGNet, ResNet and DenseNet, on various image classification datasets.
For VGGNet, a multi-pass version of network slimming gives a 20x reduction in model size and a 5x reduction in computing operations.
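A hedged sketch of the channel-selection step: training adds an L1 penalty on the batch-normalization scale factors gamma, and channels whose gamma falls below a global percentile threshold are pruned. The gamma values below are fabricated to mimic the bimodal distribution such training produces.

```python
import numpy as np

# During training, the loss would include lam * np.abs(gamma).sum(),
# pushing unimportant channels' BN scale factors toward zero.
rng = np.random.default_rng(0)
gamma = np.concatenate([
    rng.uniform(0.5, 1.5, 40),   # channels that stayed important
    rng.uniform(0.0, 0.01, 60),  # channels driven near zero by the L1 term
])

prune_ratio = 0.6                                 # prune 60% of channels globally
thresh = np.percentile(np.abs(gamma), 100 * prune_ratio)
keep = np.abs(gamma) > thresh                     # surviving channel mask
```

The kept channels define the thin network, which is then fine-tuned; in a multi-pass variant this prune-and-fine-tune cycle is repeated.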
Wireless device-to-device (D2D) communication underlaying cellular network is a promising concept to improve user experience and resource utilization.
Unlike traditional D2D communication where two mobile devices in the proximity establish a direct local link bypassing the base station, in this work we focus on relay-aided D2D communication.
Relay-aided transmission could enhance the performance of D2D communication when D2D user equipments (UEs) are far apart from each other and/or the quality of D2D link is not good enough for direct communication.
Considering the uncertainties in wireless links, we model and analyze the performance of a relay-aided D2D communication network, where the relay nodes serve both the cellular and D2D users.
In particular, we formulate the radio resource allocation problem in a two-hop network to guarantee the data rate of the UEs while protecting other receiving nodes from interference.
Utilizing time sharing strategy, we provide a centralized solution under bounded channel uncertainty.
With a view to reducing the computational burden at relay nodes, we propose a distributed solution approach using stable matching to allocate radio resources in an efficient and computationally inexpensive way.
Numerical results show that the performance of the proposed method is close to the centralized optimal solution and there is a distance margin beyond which relaying of D2D traffic improves network performance.
We present a computational analysis of three language varieties: native, advanced non-native, and translation.
Our goal is to investigate the similarities and differences between non-native language productions and translations, contrasting both with native language.
Using a collection of computational methods we establish three main results: (1) the three types of texts are easily distinguishable; (2) non-native language and translations are closer to each other than each of them is to native language; and (3) some of these characteristics depend on the source or native language, while others do not, reflecting, perhaps, unified principles that similarly affect translations and non-native language.
Community structures are critical towards understanding not only the network topology but also how the network functions.
However, how to evaluate the quality of detected community structures is still challenging and remains unsolved.
The most widely used metric, normalized mutual information (NMI), has been proved to exhibit a finite-size effect, and its improved form, relative normalized mutual information (rNMI), exhibits a reverse finite-size effect.
Corrected normalized mutual information (cNMI) was thus proposed, which has neither the finite-size effect nor the reverse finite-size effect.
However, in this paper we show that cNMI violates the so-called proportionality assumption.
In addition, NMI-type metrics ignore the importance of small communities.
Finally, they cannot be used to evaluate a single community of interest.
In this paper, we map the computed community labels to the ground-truth ones through integer linear programming, then use kappa index and F-score to evaluate the detected community structures.
Experimental results demonstrate the advantages of our method.
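The label-mapping step can be illustrated with an assignment solver (the paper formulates it as integer linear programming; for a one-to-one mapping between equal numbers of communities, the Hungarian algorithm solves the same assignment, which is what this sketch uses):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

truth = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])   # ground-truth communities
found = np.array([2, 2, 2, 0, 0, 1, 1, 1, 1])   # detected labels

k = 3
overlap = np.zeros((k, k), dtype=int)            # overlap[t, f]: shared nodes
for t, f in zip(truth, found):
    overlap[t, f] += 1

row, col = linear_sum_assignment(-overlap)       # maximize total overlap
mapping = dict(zip(col, row))                    # detected label -> truth label
aligned = np.array([mapping[f] for f in found])
accuracy = float((aligned == truth).mean())
```

Once labels are aligned this way, kappa or per-community F-scores can be computed directly against the ground truth, including for a single community of interest.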
This paper presents a practical steganographic implementation for 4-bit images.
The proposed technique converts a 4-bit image into a 4-shade grayscale image, which then acts as a reference image for hiding text: using this grayscale reference image, any text can be hidden.
A single text character is represented by 8 bits, which can be split into four 2-bit pieces of information.
If the reference image and the data file are transmitted through the network separately, the effect of steganography is achieved.
The image is not distorted at all, because it is used only for referencing.
A huge amount of text can be hidden using a very small image, and the text cannot be deciphered by intercepting the image or the data file separately.
The scheme is therefore more secure.
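A hedged sketch of the referencing principle (the concrete encoding below is our own illustration, not the paper's exact format): each character's four 2-bit pieces are represented by coordinates of reference-image pixels carrying the corresponding shade, so the image itself is never modified.

```python
import numpy as np

rng = np.random.default_rng(0)
ref = rng.integers(0, 4, size=(16, 16))          # 4-shade reference image

def encode(ch, ref):
    """Character -> four (y, x) coordinates of pixels holding each 2-bit shade.
    Assumes every shade 0-3 occurs somewhere in ref."""
    pieces = [(ord(ch) >> s) & 0b11 for s in (6, 4, 2, 0)]
    coords = []
    for p in pieces:
        ys, xs = np.nonzero(ref == p)
        coords.append((int(ys[0]), int(xs[0])))  # any pixel of that shade works
    return coords

def decode(coords, ref):
    """Read the shades back from the reference image and reassemble the byte."""
    val = 0
    for y, x in coords:
        val = (val << 2) | int(ref[y, x])
    return chr(val)
```

The data file holds only coordinates; without the matching reference image those coordinates reveal nothing, which is the separation the abstract relies on.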
Triangular meshes have gained much interest in image representation and have been widely used in image processing.
This paper introduces a framework of anisotropic mesh adaptation (AMA) methods to image representation and proposes a GPRAMA method that is based on AMA and greedy-point removal (GPR) scheme.
Unlike many other methods that triangulate sample points to form the mesh, the AMA methods start directly from a triangular mesh and then adapt it, based on a user-defined metric tensor, to represent the image.
The AMA methods have a clear mathematical framework and provide flexibility for both image representation and image reconstruction.
A mesh patching technique is developed for the implementation of the GPRAMA method, which leads to an improved version of the popular GPRFS-ED method.
The GPRAMA method can achieve better quality than the GPRFS-ED method but with lower computational cost.
Word embeddings, which represent a word as a point in a vector space, have become ubiquitous to several NLP tasks.
A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by exploiting crosslingual signals to aid sense identification.
We present a multi-view Bayesian non-parametric algorithm which improves multi-sense word embeddings by (a) using multilingual (i.e., more than two languages) corpora to significantly improve sense embeddings beyond what can be achieved with bilingual information, and (b) using a principled approach to learn a variable number of senses per word in a data-driven manner.
Ours is the first approach with the ability to leverage multilingual corpora efficiently for multi-sense representation learning.
Experiments show that multilingual training significantly improves performance over monolingual and bilingual training, by allowing us to combine different parallel corpora to leverage multilingual context.
Multilingual training yields performance comparable to that of a state-of-the-art monolingual model trained on five times more training data.
Twitter is one of the most popular social media platforms.
Due to the ready availability of data, Twitter is used extensively for research purposes.
Twitter is known to have evolved in many respects since its birth; nevertheless, how its linguistic style has evolved is still relatively unknown.
In this paper, we study the evolution of various sociolinguistic aspects of Twitter over large time scales.
To the best of our knowledge, this is the first comprehensive study on the evolution of such aspects of this OSN.
We performed quantitative analysis both at the word level and at the hashtag level, since the hashtag is perhaps one of the most important linguistic units of this social medium.
We studied the (in)formality of linguistic styles on Twitter and found that it is neither fully formal nor completely informal: on one hand, Out-Of-Vocabulary words are decreasing over time (pointing to a formal style); on the other hand, whitespace usage is being reduced, with a huge prevalence of run-on texts (pointing to an informal style).
We also analyze and propose quantitative reasons for repetition and coalescing of hashtags in Twitter.
We believe that such phenomena may be strongly tied to different evolutionary aspects of human languages.
MmWave communications, one of the cornerstones of future 5G mobile networks, are characterized at the same time by a potential multi-gigabit capacity and by a very dynamic channel, sensitive to blockage, wide fluctuations in the received signal quality, and possibly also sudden link disruption.
While the performance of physical and MAC layer schemes that address these issues has been thoroughly investigated in the literature, the complex interactions between mmWave links and transport layer protocols such as TCP are still relatively unexplored.
This paper uses the ns-3 mmWave module, with its channel model based on real measurements in New York City, to analyze the performance of the Linux TCP/IP stack (i) with and without link-layer retransmissions, showing that they are fundamental to reach a high TCP throughput on mmWave links and (ii) with Multipath TCP (MP-TCP) over multiple LTE and mmWave links, illustrating which are the throughput-optimal combinations of secondary paths and congestion control algorithms in different conditions.
Image-based generative methods, such as generative adversarial networks (GANs), have already been able to generate realistic images with considerable control over content, especially when they are conditioned.
However, most successful frameworks share a common procedure: they perform an image-to-image translation that leaves the pose of the figures in the image untouched.
When the objective is to repose a figure in an image while preserving the rest of the image, the state-of-the-art mainly assumes a single rigid body with a simple background and limited pose shift, which can hardly be extended to images in everyday settings.
In this paper, we introduce an image "inner space" preserving model that assigns an interpretable low-dimensional pose descriptor (LDPD) to an articulated figure in the image.
Figure reposing is then generated by passing the LDPD and the original image through multi-stage augmented hourglass networks in a conditional GAN structure, called inner space preserving generative pose machine (ISP-GPM).
We evaluated ISP-GPM on reposing human figures, which are highly articulated with versatile variations.
Testing a state-of-the-art pose estimator on our reposed dataset gave an accuracy of over 80% on the PCK0.5 metric.
The results also elucidated that our ISP-GPM is able to preserve the background with high accuracy while reasonably recovering the area blocked by the figure to be reposed.
Inspired by the recent success of methods that employ shape priors to achieve robust 3D reconstructions, we propose a novel recurrent neural network architecture that we call the 3D Recurrent Reconstruction Neural Network (3D-R2N2).
The network learns a mapping from images of objects to their underlying 3D shapes from a large collection of synthetic data.
Our network takes in one or more images of an object instance from arbitrary viewpoints and outputs a reconstruction of the object in the form of a 3D occupancy grid.
Unlike most of the previous works, our network does not require any image annotations or object class labels for training or testing.
Our extensive experimental analysis shows that our reconstruction framework i) outperforms the state-of-the-art methods for single view reconstruction, and ii) enables the 3D reconstruction of objects in situations when traditional SFM/SLAM methods fail (because of lack of texture and/or wide baseline).
This paper reviews the requirements for the security mechanisms that are currently being developed in the framework of the European research project INDECT.
An overview of the features of integrated technologies such as Virtual Private Networks (VPNs), Cryptographic Algorithms, Quantum Cryptography, Federated ID Management and Secure Mobile Ad-hoc Networking is given, together with their expected use in INDECT.
We propose an efficient method to generate white-box adversarial examples to trick a character-level neural classifier.
We find that only a few manipulations are needed to greatly decrease the accuracy.
Our method, HotFlip, relies on an atomic flip operation, which swaps one token for another based on the gradients of the one-hot input vectors.
Due to the efficiency of our method, we can perform adversarial training, which makes the model more robust to attacks at test time.
With the use of a few semantics-preserving constraints, we demonstrate that HotFlip can be adapted to attack a word-level classifier as well.
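The flip-selection rule can be sketched via the first-order estimate it relies on: for one-hot inputs, the loss change from swapping position i's token to token j is approximated by grad[i, j] - grad[i, current]. The gradients below are random stand-ins for a real model's backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, vocab = 5, 10
tokens = rng.integers(0, vocab, size=seq_len)    # current character sequence
grad = rng.normal(size=(seq_len, vocab))         # dLoss/d(one-hot inputs)

# First-order loss increase of swapping position i's token for token j:
gain = grad - grad[np.arange(seq_len), tokens][:, None]
gain[np.arange(seq_len), tokens] = -np.inf       # disallow no-op "flips"
pos, new_tok = np.unravel_index(int(np.argmax(gain)), gain.shape)
```

A single forward-backward pass thus ranks all possible flips at once, which is what makes the attack cheap enough for adversarial training.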
This paper proposes a variant of the normalized cut algorithm for spectral clustering.
Although the normalized cut algorithm applies the K-means algorithm to the eigenvectors of a normalized graph Laplacian for finding clusters, our algorithm instead uses a minimum volume enclosing ellipsoid for them.
We show that the algorithm shares similarity with the ellipsoidal rounding algorithm for separable nonnegative matrix factorization.
Our theoretical insight implies that the algorithm can serve as a bridge between spectral clustering and separable NMF.
The K-means algorithm has the issue that the choice of initial points affects the construction of the clusters, and certain choices result in poor clustering performance.
The normalized cut algorithm inherits these issues since K-means is incorporated in it, whereas the algorithm proposed here does not.
An empirical study is presented to examine the performance of the algorithm.
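For reference, the baseline normalized-cut pipeline that the proposed algorithm modifies embeds nodes via eigenvectors of the normalized graph Laplacian; in the two-cluster case the sign of the second eigenvector already yields the partition, which this toy example (our own) uses in place of K-means:

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the single bridge edge (2,3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
d = A.sum(1)
L = np.eye(6) - A / np.sqrt(np.outer(d, d))   # symmetric normalized Laplacian
w, V = np.linalg.eigh(L)                      # eigenvalues in ascending order
labels = (V[:, 1] > 0).astype(int)            # sign of the Fiedler vector
```

With more clusters, K-means (or, in the proposed algorithm, a minimum-volume enclosing ellipsoid) replaces the sign split on the leading eigenvectors.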
Automatic License Plate Recognition (ALPR) has been the focus of much research in recent years.
In general, ALPR is divided into the following problems: detection of on-track vehicles, license plate detection, segmentation of license plate characters, and optical character recognition (OCR).
Even though commercial solutions are available for controlled acquisition conditions, e.g., the entrance of a parking lot, ALPR is still an open problem when dealing with data acquired in uncontrolled environments, such as roads and highways, when relying only on imaging sensors.
Due to the multiple orientations and scales of the license plates captured by the camera, a very challenging part of ALPR is the License Plate Character Segmentation (LPCS) step, whose effectiveness must be (near) optimal to achieve a high recognition rate in the OCR.
To tackle the LPCS problem, this work proposes a novel benchmark composed of a dataset designed to focus specifically on the character segmentation step of the ALPR within an evaluation protocol.
Furthermore, we propose the Jaccard-Centroid coefficient, a new evaluation measure more suitable than the Jaccard coefficient regarding the location of the bounding box within the ground-truth annotation.
The dataset is composed of 2,000 Brazilian license plates consisting of 14,000 alphanumeric symbols and their corresponding bounding box annotations.
We also present a new straightforward approach to perform LPCS efficiently.
Finally, we provide an experimental evaluation for the dataset based on four LPCS approaches and demonstrate the importance of character segmentation for achieving an accurate OCR.
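For context, the standard Jaccard coefficient that the paper's Jaccard-Centroid measure refines is the intersection-over-union of a predicted and a ground-truth box (the coordinate convention below is our own):

```python
def jaccard(a, b):
    """Jaccard (intersection-over-union) of boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0
```

Plain Jaccard ignores where a partial overlap sits inside the ground-truth box, which is the location insensitivity the Jaccard-Centroid coefficient is designed to address.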
Multi-agent cooperation is an important feature of the natural world.
Many tasks involve individual incentives that are misaligned with the common good, yet a wide range of organisms from bacteria to insects and humans are able to overcome their differences and collaborate.
Therefore, the emergence of cooperative behavior amongst self-interested individuals is an important question for the fields of multi-agent reinforcement learning (MARL) and evolutionary theory.
Here, we study a particular class of multi-agent problems called intertemporal social dilemmas (ISDs), where the conflict between the individual and the group is particularly sharp.
By combining MARL with appropriately structured natural selection, we demonstrate that individual inductive biases for cooperation can be learned in a model-free way.
To achieve this, we introduce an innovative modular architecture for deep reinforcement learning agents which supports multi-level selection.
We present results in two challenging environments, and interpret these in the context of cultural and ecological evolution.
Classifying single image patches is important in many different applications, such as road detection or scene understanding.
In this paper, we present convolutional patch networks, which are convolutional networks learned to distinguish different image patches and which can be used for pixel-wise labeling.
We also show how to incorporate spatial information of the patch as an input to the network, which allows for learning spatial priors for certain categories jointly with an appearance model.
In particular, we focus on road detection and urban scene understanding, two application areas where we are able to achieve state-of-the-art results on the KITTI as well as on the LabelMeFacade dataset.
Furthermore, our paper offers a guideline for people working in the area who find themselves desperately wading through all the painstaking details that render training CNs on image patches extremely difficult.
In order to ensure high availability of Web services, a new approach based on the use of communities was recently proposed.
In composition, this approach consists of replacing a failed Web service with another Web service, drawn from a community, that offers the same functionality as the failed one.
However, this substitution may cause inconsistency in the semantics of the composition and disrupt the mediation originally put in place to resolve the semantic heterogeneities between Web services.
This paper presents a context-oriented solution to this problem: the community is forced to adopt the semantics of the failed Web service before the substitution, and all inputs and outputs to and from the substitute are converted according to these adopted semantics, avoiding any alteration of the semantic mediation in the Web service composition.
The recently introduced Deep Q-Networks (DQN) algorithm has gained attention as one of the first successful combinations of deep neural networks and reinforcement learning.
Its promise was demonstrated in the Arcade Learning Environment (ALE), a challenging framework composed of dozens of Atari 2600 games used to evaluate general competency in AI.
It achieved dramatically better results than earlier approaches, showing that its ability to learn good representations is quite robust and general.
This paper attempts to understand the principles that underlie DQN's impressive performance and to better contextualize its success.
We systematically evaluate the importance of key representational biases encoded by DQN's network by proposing simple linear representations that make use of these concepts.
Incorporating these characteristics, we obtain a computationally practical feature set that achieves competitive performance to DQN in the ALE.
Besides offering insight into the strengths and weaknesses of DQN, we provide a generic representation for the ALE, significantly reducing the burden of learning a representation for each game.
Moreover, we also provide a simple, reproducible benchmark for the sake of comparison to future work in the ALE.
Understanding software design practice is critical to understanding modern information systems development.
New developments in empirical software engineering, information systems design science and the interdisciplinary design literature combined with recent advances in process theory and testability have created a situation ripe for innovation.
Consequently, this paper utilizes these breakthroughs to formulate a process theory of software design practice: Sensemaking-Coevolution-Implementation Theory explains how complex software systems are created by collocated software development teams in organizations.
It posits that an independent agent (design team) creates a software system by alternating between three activities: organizing their perceptions about the context, mutually refining their understandings of the context and design space, and manifesting their understanding of the design space in a technological artifact.
This theory development paper defines and illustrates Sensemaking-Coevolution-Implementation Theory, grounds its concepts and relationships in existing literature, conceptually evaluates the theory and situates it in the broader context of information systems development.
In this paper, we claim that Vector Cosine, generally considered one of the most effective unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by a completely unsupervised measure that evaluates the extent of the intersection among the most associated contexts of two target words, weighting that intersection according to the rank of the shared contexts in the dependency-ranked lists.
This claim comes from the hypothesis that similar words do not simply occur in similar contexts, but they share a larger portion of their most relevant contexts compared to other related words.
To prove it, we describe and evaluate APSyn, a variant of Average Precision that, independently of the adopted parameters, outperforms the Vector Cosine and the co-occurrence measure on the ESL and TOEFL test sets.
In the best setting, APSyn reaches 0.73 accuracy on the ESL dataset and 0.70 accuracy on the TOEFL dataset, therefore beating the non-English US college applicants (whose average, as reported in the literature, is 64.50%) and several state-of-the-art approaches.
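The rank-weighted overlap idea can be sketched as follows; this is a hypothetical reading in which each context shared between the two top-N association-ranked context lists contributes the inverse of its average rank (the exact APSyn formulation may differ):

```python
def apsyn(contexts_a, contexts_b, top_n=5):
    """contexts_a/b: context labels sorted by association strength, descending."""
    rank_a = {c: i + 1 for i, c in enumerate(contexts_a[:top_n])}
    rank_b = {c: i + 1 for i, c in enumerate(contexts_b[:top_n])}
    shared = rank_a.keys() & rank_b.keys()
    # Shared contexts count more when they rank high in both lists.
    return sum(1.0 / ((rank_a[c] + rank_b[c]) / 2.0) for c in shared)

# Toy ranked context lists: similar words share their most relevant contexts.
dog = ["bark", "pet", "tail", "walk", "bone"]
cat = ["pet", "tail", "purr", "claw", "bone"]
car = ["drive", "wheel", "engine", "road", "fuel"]
```

Under this reading, `apsyn(dog, cat)` exceeds `apsyn(dog, car)` because "dog" and "cat" share several highly ranked contexts while "dog" and "car" share none.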
Millimeter wave (mmWave) systems are emerging as an essential technology to enable extremely high data rate wireless communications.
The main limiting factors of mmWave systems are blockage (high penetration loss) and deafness (misalignment between the beams of the transmitter and receiver).
To alleviate these problems, it is imperative to incorporate efficient association and relaying between terminals and access points.
Unfortunately, the existing association techniques are designed for traditional interference-limited networks and are thus highly suboptimal for mmWave communications, owing to narrow-beam operation and the resulting largely interference-free behavior.
This paper introduces a distributed approach that solves the joint association and relaying problem in mmWave networks considering the load balancing at access points.
The problem is posed as a novel stochastic optimization problem, which is solved by distributed auction algorithms where the clients and relays act asynchronously to achieve optimal client-relay-access point association.
It is shown that the algorithms provably converge to a solution that maximizes the aggregate logarithmic utility within a desired bound.
Numerical results quantify the performance enhancements introduced by the relays, and the substantial improvements in network throughput and fairness among the clients achieved by the proposed association method compared to standard approaches.
It is concluded that mmWave communications with proper association and relaying mechanisms can support extremely high data rates, connection reliability, and fairness among the clients.
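As a rough illustration of the auction mechanism, the toy below is a centralized, synchronous emulation of an assignment auction (the paper's algorithm is distributed and asynchronous, and covers client-relay-access point association; the utilities here are made up):

```python
def auction_assignment(utility, eps=0.01):
    """utility[i][j]: value of assigning client i to access point j.

    A simple assignment auction: each unassigned client bids for its best
    price-adjusted access point, raising that AP's price by the margin over
    its second-best option plus eps, possibly displacing a previous holder.
    """
    n = len(utility)
    prices = [0.0] * n
    owner = [None] * n                  # owner[j] = client currently holding AP j
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        values = [utility[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=lambda j: values[j])
        second = max((v for j, v in enumerate(values) if j != best),
                     default=values[best])
        prices[best] += values[best] - second + eps     # bid increment
        if owner[best] is not None:
            unassigned.append(owner[best])              # outbid client re-enters
        owner[best] = i
    return {owner[j]: j for j in range(n)}

# Diagonal-dominant utilities: each client strongly prefers "its own" AP.
assignment = auction_assignment([[5, 1, 1], [1, 5, 1], [1, 1, 5]])
```

For a small enough `eps`, such auctions are known to converge to an assignment within a bounded distance of the optimum, which is the flavor of guarantee the abstract refers to.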
Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by using deep neural networks as function approximators to learn directly from raw input images.
However, learning directly from raw images is data inefficient.
The agent must learn feature representation of complex states in addition to learning a policy.
As a result, deep RL typically suffers from slow learning speeds and often requires a prohibitively large amount of training time and data to reach reasonable performance, making it inapplicable to real-world settings where data is expensive.
In this work, we improve data efficiency in deep RL by addressing one of the two learning goals, feature learning.
We leverage supervised learning to pre-train on a small set of non-expert human demonstrations and empirically evaluate our approach using the asynchronous advantage actor-critic algorithms (A3C) in the Atari domain.
Our results show significant improvements in learning speed, even when the provided demonstration is noisy and of low quality.
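The pre-training step can be illustrated schematically: fit a softmax policy on (state, action) demonstration pairs with a cross-entropy loss before handing the weights to the RL stage. The data, model, and hyperparameters below are synthetic placeholders, not the paper's A3C/Atari setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 8, 3                          # demos, state dim, action count
W_true = rng.normal(size=(d, k))
states = rng.normal(size=(n, d))
actions = (states @ W_true).argmax(axis=1)   # "demonstrated" actions

# Supervised pre-training: gradient descent on the cross-entropy between the
# softmax policy and the demonstrated actions.
W = np.zeros((d, k))
for _ in range(300):
    logits = states @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    onehot = np.eye(k)[actions]
    W -= 0.1 * states.T @ (p - onehot) / n   # mean cross-entropy gradient

accuracy = ((states @ W).argmax(axis=1) == actions).mean()
```

In a deep RL pipeline, `W` would be a network's feature layers, carried over as initialization for the actor-critic training rather than a final policy.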
This article presents two area/latency optimized gate level asynchronous full adder designs which correspond to early output logic.
The proposed full adders are constructed using the delay-insensitive dual-rail code and adhere to the four-phase return-to-zero handshaking.
For an asynchronous ripple carry adder (RCA) constructed using the proposed early output full adders, the relative-timing assumption becomes necessary and the inherent advantages of the relative-timed RCA are: (1) computation with valid inputs, i.e., forward latency is data-dependent, and (2) computation with spacer inputs involves a bare minimum constant reverse latency of just one full adder delay, thus resulting in the optimal cycle time.
With respect to different 32-bit RCA implementations, and in comparison with the optimized strong-indication, weak-indication, and early output full adder designs, one of the proposed early output full adders achieves respective latency reductions of 67.8%, 12.3%, and 6.1%, while the other proposed early output full adder achieves corresponding area reductions of 32.6%, 24.6%, and 6.9%, with practically no power penalty.
Further, the proposed early output full adder based asynchronous RCAs enable minimum cycle-time reductions of 83.4%, 15%, and 8.8% when considering carry propagation over the entire 32-bit RCA width, and maximum cycle-time reductions of 97.5%, 27.4%, and 22.4% for a typical carry-chain length of 4 full adder stages, when compared to the least of the cycle-time estimates of various strong-indication, weak-indication, and early output asynchronous RCAs of similar size.
All the asynchronous full adders and RCAs were realized using standard cells in a semi-custom design fashion based on a 32/28 nm CMOS process technology.
In recent years, we have seen an emergence of data-driven approaches in robotics.
However, most existing efforts and datasets are either in simulation or focus on a single task in isolation such as grasping, pushing or poking.
In order to make progress and capture the space of manipulation, we would need to collect a large-scale dataset of diverse tasks such as pouring, opening bottles, stacking objects, etc.
But how does one collect such a dataset?
In this paper, we present the largest available robotic-demonstration dataset (MIME) that contains 8260 human-robot demonstrations over 20 different robotic tasks (https://sites.google.com/view/mimedataset).
These tasks range from the simple task of pushing objects to the difficult task of stacking household objects.
Our dataset consists of videos of human demonstrations and kinesthetic trajectories of robot demonstrations.
We also propose to use this dataset for the task of mapping 3rd person video features to robot trajectories.
Furthermore, we present two different approaches using this dataset and evaluate the predicted robot trajectories against ground-truth trajectories.
We hope our dataset inspires research in multiple areas including visual imitation, trajectory prediction, and multi-task robotic learning.
Background: The speed and precision with which objects are moved by hand or hand-tool interaction under image guidance depend on a specific type of visual and spatial sensorimotor learning.
Novices have to learn to optimally control what their hands are doing in a real-world environment while looking at an image representation of the scene on a video monitor.
Previous research has shown slower task execution times and lower performance scores under image-guidance compared with situations of direct action viewing.
The cognitive processes for overcoming this drawback by training are not yet understood.
Methods: We investigated the effects of training on the time and precision of direct view versus image guided object positioning on targets of a Real-world Action Field (RAF).
Two men and two women had to learn to perform the task as swiftly and as precisely as possible with their dominant hand, using a tool or not and wearing a glove or not.
Individuals were trained in sessions of mixed trial blocks with no feedback.
Results: As predicted, image guidance produced significantly slower times and lower precision in all trainees and sessions compared with direct viewing.
With training, all trainees get faster in all conditions, but only one of them gets reliably more precise in the image-guided conditions.
Speed-accuracy trade-offs in the individual performance data show that the highest precision scores and the steepest learning curves, for both time and precision, were produced by the slowest starter.
Conclusions: Performance evolution towards optimal precision is compromised when novices start by going as fast as they can.
The findings have direct implications for individual skill monitoring in training programmes for image-guided technology applications with human operators.
We propose the neural programmer-interpreter (NPI): a recurrent and compositional neural network that learns to represent and execute programs.
NPI has three learnable components: a task-agnostic recurrent core, a persistent key-value program memory, and domain-specific encoders that enable a single NPI to operate in multiple perceptually diverse environments with distinct affordances.
By learning to compose lower-level programs to express higher-level programs, NPI reduces sample complexity and increases generalization ability compared to sequence-to-sequence LSTMs.
The program memory allows efficient learning of additional tasks by building on existing programs.
NPI can also harness the environment (e.g. a scratch pad with read-write pointers) to cache intermediate results of computation, lessening the long-term memory burden on recurrent hidden units.
In this work we train the NPI with fully-supervised execution traces; each program has example sequences of calls to the immediate subprograms conditioned on the input.
Rather than training on a huge number of relatively weak labels, NPI learns from a small number of rich examples.
We demonstrate the capability of our model to learn several types of compositional programs: addition, sorting, and canonicalizing 3D models.
Furthermore, a single NPI learns to execute these programs and all 21 associated subprograms.
We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints.
ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power.
ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the state-of-the-art semantic segmentation network PSPNet, while its category-wise accuracy is only 8% less.
We evaluated ESPNet on a variety of semantic segmentation datasets including Cityscapes, PASCAL VOC, and a breast biopsy whole slide image dataset.
Under the same constraints on memory and computation, ESPNet outperforms all the current efficient CNN networks such as MobileNet, ShuffleNet, and ENet on both standard metrics and our newly introduced performance metrics that measure efficiency on edge devices.
Our network can process high resolution images at a rate of 112 and 9 frames per second on a standard GPU and edge device, respectively.
Search engines are nowadays one of the most important entry points for Internet users and a central tool to solve most of their information needs.
Still, a substantial number of users' searches obtain unsatisfactory results.
Needless to say, several lines of research aim to increase the relevancy of the results users retrieve.
In this paper the authors frame this problem within the much broader (and older) one of information overload.
They argue that users' dissatisfaction with search engines is a common contemporary manifestation of this problem, and propose a different angle from which to tackle it.
As will be discussed, their approach shares goals with a current hot research topic (namely, learning to rank for information retrieval), but, unlike the techniques commonly applied in that field, it cannot strictly be considered machine learning; additionally, it can be used to change the search engine's response in real time, driven by user behavior.
Their proposal adapts concepts from Swarm Intelligence (in particular, Ant Algorithms) from an Information Foraging point of view.
It will be shown that the technique is not only feasible, but also an elegant solution to the stated problem; what's more, it achieves promising results, both increasing the performance of a major search engine for informational queries, and substantially reducing the time users require to answer complex information needs.
Globalization and the World Wide Web have made academia and science an international, multicultural community forged by researchers and scientists of different ethnicities.
How ethnicity shapes the evolution of membership, status and interactions of the scientific community, however, is not well understood.
This is due to the difficulty of ethnicity identification at the large scale.
We use name ethnicity classification as an indicator of ethnicity.
Based on automatic name ethnicity classification of 1.7+ million authors gathered from the Web, the name ethnicity of computer science scholars is investigated by population size, publication contribution, and collaboration strength.
By showing the evolution of name ethnicity from 1936 to 2010, we discover that ethnicity diversity has increased significantly over time and that different research communities in certain publication venues have different ethnicity compositions.
We notice a clear rise in the number of Asian name ethnicities in papers.
Their fraction of publication contribution increases from approximately 10% to near 50% from 1970 to 2010.
We also find that name ethnicity acts as a homophily factor on coauthor networks, shaping the formation of coauthorship as well as evolution of research communities.
Next generation cellular networks will have to leverage large cell densifications to accomplish the ambitious goals for aggregate multi-user sum rates, for which the cloud radio access network (CRAN) architecture is a favored network design.
This shifts the attention back to resource allocation (RA), which needs to cope with very short radio frames, large and dense sets of radio heads, and large user populations in the coordination area.
So far, mainly CSI-based RA schemes have been proposed for this task.
However, they have considerable complexity and also incur a significant CSI acquisition overhead on the system.
In this paper, we study an alternative approach that promises lower complexity along with lower overhead.
We propose to base the RA in multi-antenna CRAN systems on the position information of user terminals only.
We use Random Forests as supervised machine learning approach to determine the multi-user RAs.
This likely leads to lower overhead costs, as the acquisition of position information requires less radio resources in comparison to the acquisition of instantaneous CSI.
The results show the following findings: I) in general, learning-based RA schemes can achieve spectral efficiency comparable to CSI-based schemes; II) when the system overhead is taken into account, the learning-based RA scheme utilizing position information outperforms the legacy CSI-based scheme by up to 100%; III) despite its dependency on the training data, the Random Forests based RA scheme is robust against position inaccuracies and changes in the propagation scenario; IV) the most important factor influencing the performance of the learning-based RA scheme is the antenna orientation, for which we present three approaches that restore most of the original performance.
To the best of our knowledge, these insights are new and indicate a novel as well as promising approach to master the complexity in future cellular networks.
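The supervised-learning step above can be caricatured with a from-scratch forest of randomized decision stumps (a stand-in for a full Random Forests library) mapping 2-D user positions to a resource-allocation class; the positions, labels, and cell geometry are invented for illustration:

```python
import random

def train_stump(data, tries=10):
    """Best of a few random axis-aligned splits on a bootstrap sample."""
    boot = [random.choice(data) for _ in data]          # bootstrap resample
    best, best_err = None, float("inf")
    for _ in range(tries):
        axis = random.randrange(2)
        thr = random.choice(boot)[0][axis]
        left = [lbl for pos, lbl in boot if pos[axis] <= thr]
        right = [lbl for pos, lbl in boot if pos[axis] > thr]
        if not left or not right:
            continue
        l_maj = max(set(left), key=left.count)          # majority label per side
        r_maj = max(set(right), key=right.count)
        err = sum(l != l_maj for l in left) + sum(r != r_maj for r in right)
        if err < best_err:
            best, best_err = (axis, thr, l_maj, r_maj), err
    return best

def forest_predict(stumps, pos):
    votes = [l_maj if pos[axis] <= thr else r_maj
             for axis, thr, l_maj, r_maj in stumps]
    return max(set(votes), key=votes.count)             # majority vote

random.seed(1)
positions = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(200)]
data = [(p, 0 if p[0] < 50 else 1) for p in positions]  # RA class by cell half
stumps = [s for s in (train_stump(data) for _ in range(25)) if s is not None]
accuracy = sum(forest_predict(stumps, p) == y for p, y in data) / len(data)
```

The point of the sketch is the interface, not the learner: the model consumes only position features, so at run time no instantaneous CSI has to be acquired before producing an RA decision.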
The growing complexity of heterogeneous cellular networks (HetNets) has made it necessary to consider a variety of user and base station (BS) configurations for realistic performance evaluation and system design.
This is directly reflected in the HetNet simulation models considered by standardization bodies, such as the third generation partnership project (3GPP).
Complementary to these simulation models, the stochastic geometry based approach, which models the user and BS locations as independent and homogeneous Poisson point processes (PPPs), has gained prominence in the past few years.
Despite its success in revealing useful insights, this PPP-based model is not rich enough to capture all the spatial configurations that appear in real world HetNet deployments (on which 3GPP simulation models are based).
In this paper, we bridge the gap between the 3GPP simulation models and the popular PPP-based analytical model by developing a new unified HetNet model in which a fraction of users and some BS tiers are modeled as Poisson cluster processes (PCPs).
This model captures both non-uniformity and coupling in the BS and user locations.
For this setup, we derive an exact expression for the downlink coverage probability under the maximum signal-to-interference ratio (SIR) cell association model.
As intermediate results, we define and evaluate sum-product functionals for PPP and PCP.
Special instances of the proposed model are shown to closely resemble different configurations considered in 3GPP HetNet models.
Our results concretely demonstrate that the performance trends are highly sensitive to the assumptions made on the user and small cell BS (SBS) configurations.
We introduce a novel framework for image captioning that can produce natural language explicitly grounded in entities that object detectors find in the image.
Our approach reconciles classical slot filling approaches (that are generally better grounded in images) with modern neural captioning approaches (that are generally more natural sounding and accurate).
Our approach first generates a sentence `template' with slot locations explicitly tied to specific image regions.
These slots are then filled in by visual concepts identified in the regions by object detectors.
The entire architecture (sentence template generation and slot filling with object detectors) is end-to-end differentiable.
We verify the effectiveness of our proposed model on different image captioning tasks.
On standard image captioning and novel object captioning, our model reaches state-of-the-art on both COCO and Flickr30k datasets.
We also demonstrate that our model has unique advantages when the train and test distributions of scene compositions -- and hence language priors of associated captions -- are different.
Code has been made available at: https://github.com/jiasenlu/NeuralBabyTalk
Facilitating the coexistence of radar systems with communication systems has been a major area of research in radar engineering.
The current work presents a new way to sense the environment using the channel equalization block of existing communication systems.
We have named this system CommSense.
In the current paper we demonstrate the feasibility of the system using Global System for Mobile Communications (GSM) signals.
The implementation has been done using open-source Software Defined Radio (SDR) environment.
Our preliminary results show that it is possible to distinguish environmental changes using the proposed system.
The major advantage of the system is that it is inexpensive as channel estimation is an inherent block in any communication system and hence the added cost to make it work as an environment sensor is minimal.
The major challenge, on which we are continuing our work, is how to characterize the features in the environmental changes.
This is an acute challenge given the fact that the bandwidth available is narrow and the system is inherently a forward looking radar.
However, the initial results, as shown in this paper, are encouraging, and we intend to use an application-specific instrumentation (ASIN) scheme to distinguish the environmental changes.
Many software development organizations still lack support for obtaining intellectual control over their software development processes and for determining the performance of their processes and the quality of the produced products.
Systematic support for detecting and reacting to critical project states in order to achieve planned goals is usually missing.
One means to institutionalize measurement on the basis of explicit models is the development and establishment of a so-called Software Project Control Center (SPCC) for systematic quality assurance and management support.
An SPCC is comparable to a control room, which is a well known term in the mechanical production domain.
Its tasks include collecting, interpreting, and visualizing measurement data in order to provide context-, purpose-, and role-oriented information for all stakeholders (e.g., project managers, quality assurance managers, developers) during the execution of a software development project.
The article will present an overview of SPCC concepts, a concrete instantiation that supports goal-oriented data visualization (G-SPCC approach), and experiences from practical applications.
Money laundering is a crime that makes it possible to finance other crimes; for this reason it is important to criminal organizations, and combating it is prioritized by nations around the world.
The anti-money laundering process has not evolved as expected because it has prioritized only the flagging of suspicious transactions.
The constant increase in the volume of transactions has overloaded the indispensable human work of final evaluation of the suspicions.
This article presents a multiagent system that aims to go beyond the capture of suspicious transactions, seeking to assist the human expert in the analysis of suspicions.
The agents created use data mining techniques to create transactional behavioral profiles; apply rules generated in the learning process, in conjunction with specific rules based on legal aspects and the created profiles, to capture suspicious transactions; and analyze these suspicious transactions, indicating to the human expert those that require more detailed analysis.
This article analyzes Twitter as a potential alternative source of external links for use in webometric analysis because of its capacity to embed hyperlinks in different tweets.
Given the limitations on searching Twitter's public API, we decided to use the Topsy search engine as a source for compiling tweets.
To this end, we took a global sample of 200 universities and compiled all the tweets with hyperlinks to any of these institutions.
Further link data was obtained from alternative sources (MajesticSEO and OpenSiteExplorer) in order to compare the results.
Thereafter, various statistical tests were performed to determine the correlation between the indicators and the ability to predict external links from the collected tweets.
The results indicate a high volume of tweets, although they are skewed by the presence and performance of specific universities and countries.
The data provided by Topsy correlated significantly with all link indicators, particularly with OpenSiteExplorer (r=0.769).
Finally, prediction models do not provide optimum results because of high error rates, which fall slightly in nonlinear models applied to specific environments.
We conclude that the use of Twitter (via Topsy) as a source of hyperlinks to universities produces promising results due to its high correlation with link indicators, though limited by policies and culture regarding use and presence in social networks.
This paper addresses the important problem of discerning hateful content in social media.
We propose a detection scheme that is an ensemble of Recurrent Neural Network (RNN) classifiers, and it incorporates various features associated with user-related information, such as the users' tendency towards racism or sexism.
These data are fed as input to the above classifiers along with the word frequency vectors derived from the textual content.
Our approach has been evaluated on a publicly available corpus of 16k tweets, and the results demonstrate its effectiveness in comparison to existing state-of-the-art solutions.
More specifically, our scheme successfully distinguishes racism and sexism messages from normal text, and achieves higher classification quality than current state-of-the-art algorithms.
We give an algorithm to compute a morph between any two convex drawings of the same plane graph.
The morph preserves the convexity of the drawing at any time instant and moves each vertex along a piecewise linear curve with linear complexity.
The linear bound is asymptotically optimal in the worst case.
We present a hybrid control framework for solving a motion planning problem among a collection of heterogenous agents.
The proposed approach utilizes a finite set of low-level motion primitives, each based on a piecewise affine feedback control, to generate complex motions in a gridded workspace.
The constraints on allowable sequences of successive motion primitives are formalized through a maneuver automaton.
At the higher level, a control policy generated by a shortest path non-deterministic algorithm determines which motion primitive is executed in each box of the gridded workspace.
The overall framework yields a highly robust control design on both the low and high levels.
We experimentally demonstrate the efficacy and robustness of this framework for multiple quadrocopters maneuvering in a 2D or 3D workspace.
The standard classification of emotions involves categorizing the expression of emotions.
In this paper, parameters underlying some emotions are identified and a new classification based on these parameters is suggested.
Group centrality is an extension of the classical notion of centrality for individuals, to make it applicable to sets of them.
We perform a SWOT (strengths, weaknesses, opportunities, and threats) analysis of the use of group centrality in semantic networks for different centrality notions: degree, closeness, and betweenness, giving prominence to random walks.
Among our main results are the relevance and NP-hardness of the problem of finding the most central set in a semantic network for a specific centrality measure.
Sparse coding (SC) is attracting more and more attention due to its comprehensive theoretical studies and its excellent performance in many signal processing applications.
However, most existing sparse coding algorithms are nonconvex and are thus prone to getting stuck in bad local minima, especially in the presence of outliers and noisy data.
To enhance learning robustness, in this paper we propose a unified framework named Self-Paced Sparse Coding (SPSC), which gradually includes matrix elements into SC learning, from easy to complex.
We also generalize the self-paced learning schema into different levels of dynamic selection on samples, features and elements respectively.
Experimental results on real-world data demonstrate the efficacy of the proposed algorithms.
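The element-level self-paced schema can be sketched as follows; the thresholding rule and the simple gradient/soft-threshold refit below are illustrative surrogates for the paper's SPSC algorithm, and all data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(size=(10, 20))                          # fixed, known dictionary
codes = rng.normal(size=(20, 30)) * (rng.random((20, 30)) < 0.2)
X = D @ codes                                          # clean data matrix
outliers = rng.random(X.shape) < 0.05
X_noisy = X + 20.0 * outliers                          # a few gross corruptions

A = np.zeros((20, 30))
lam = 1.0
for _ in range(5):
    # Self-paced weights: only "easy" elements (small residual) are included.
    V = ((X_noisy - D @ A) ** 2 < lam).astype(float)
    for _ in range(100):                               # weighted refit + shrinkage
        A += 0.01 * D.T @ (V * (X_noisy - D @ A))      # gradient on included elements
        A = np.sign(A) * np.maximum(np.abs(A) - 0.001, 0.0)   # sparsity
    lam *= 4.0                                         # admit harder elements
```

The intended behavior is that clean elements are admitted as the pace parameter grows while the gross outliers stay excluded, so they never distort the sparse codes.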
Spiking neural networks (SNNs) could play a key role in unsupervised machine learning applications, by virtue of strengths related to learning from the fine temporal structure of event-based signals.
However, some spike-timing-related strengths of SNNs are hindered by the sensitivity of spike-timing-dependent plasticity (STDP) rules to input spike rates, as fine temporal correlations may be obstructed by coarser correlations between firing rates.
In this article, we propose a spike-timing-dependent learning rule that allows a neuron to learn from the temporally-coded information despite the presence of rate codes.
Our long-term plasticity rule makes use of short-term synaptic fatigue dynamics.
We show analytically that, in contrast to conventional STDP rules, our fatiguing STDP (FSTDP) helps learn the temporal code, and we derive the necessary conditions to optimize the learning process.
We showcase the effectiveness of FSTDP in learning spike-timing correlations among processes of different rates in synthetic data.
Finally, we use FSTDP to detect correlations in real-world weather data from the United States in an experimental realization of the algorithm that uses a neuromorphic hardware platform comprising phase-change memristive devices.
Taken together, our analyses and demonstrations suggest that FSTDP paves the way for the exploitation of the spike-based strengths of SNNs in real-world applications.
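One way to picture a fatiguing STDP rule (an illustrative variant, not the exact FSTDP formulation): each presynaptic spike depletes a short-term resource that recovers slowly, and potentiation is scaled by that resource, so dense spike trains drive much weaker updates than they would under conventional STDP:

```python
import math

def potentiation(pre_times, post_time, tau_stdp=20.0, tau_r=50.0,
                 a_plus=0.1, fatigue=True):
    """Total weight change from pre-before-post pairings (times in ms)."""
    r, last, dw = 1.0, None, 0.0
    for t in pre_times:
        if fatigue and last is not None:
            r = 1.0 - (1.0 - r) * math.exp(-(t - last) / tau_r)  # slow recovery
        if t < post_time:
            dw += a_plus * r * math.exp(-(post_time - t) / tau_stdp)
        if fatigue:
            r *= 0.5                                             # per-spike fatigue
        last = t
    return dw

sparse_train = [80.0]                        # one well-timed pre spike
dense_train = [2.0 * i for i in range(50)]   # 500 Hz burst before a post spike at t=100
```

A single well-timed spike is potentiated identically with or without fatigue, while the high-rate burst is strongly attenuated, which is the mechanism that lets timing correlations survive rate differences.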
Though quite challenging, leveraging large-scale unlabeled or partially labeled data in learning systems (e.g., model/classifier training) has attracted increasing attention due to its fundamental importance.
To address this problem, many active learning (AL) methods have been proposed that employ up-to-date detectors to retrieve representative minority samples according to predefined confidence or uncertainty thresholds.
However, these AL methods cause the detectors to ignore the remaining majority samples (i.e., those with low uncertainty or high prediction confidence).
In this work, by developing a principled active sample mining (ASM) framework, we demonstrate that cost-effectively mining samples from these unlabeled majority data is key to training more powerful object detectors while minimizing user effort.
Specifically, our ASM framework involves a switchable sample selection mechanism for determining whether an unlabeled sample should be manually annotated via AL or automatically pseudo-labeled via a novel self-learning process.
The proposed process is compatible with mini-batch-based training (i.e., using a batch of unlabeled or partially labeled data as a one-time input) for object detection.
In addition, a few samples with low-confidence predictions are selected and annotated via AL.
Notably, our method is suitable for object categories that are not seen in the unlabeled data during the learning process.
Extensive experiments clearly demonstrate that our ASM framework can achieve performance comparable to that of alternative methods but with significantly fewer annotations.
Evaluating human-computer interaction is essential as a broadening population uses machines, sometimes in sensitive contexts.
However, traditional evaluation methods may fail to combine real-time measures, an "objective" approach and data contextualization.
In this review we look at how adding neuroimaging techniques can respond to such needs.
We focus on electroencephalography (EEG), as it could be handled effectively during a dedicated evaluation phase.
We identify workload, attention, vigilance, fatigue, error recognition, emotions, engagement, flow and immersion as being recognizable by EEG.
We find that workload, attention and emotions assessments would benefit the most from EEG.
Moreover, we advocate studying error recognition further through neuroimaging to enhance usability and improve user experience.
Conventional Open Information Extraction (Open IE) systems are usually built on hand-crafted patterns from other NLP tools such as syntactic parsing, yet they face problems of error propagation.
In this paper, we propose a neural Open IE approach with an encoder-decoder framework.
Distinct from existing methods, the neural Open IE approach learns highly confident arguments and relation tuples bootstrapped from a state-of-the-art Open IE system.
An empirical study on a large benchmark dataset shows that the neural Open IE system significantly outperforms several baselines, while maintaining comparable computational efficiency.
Under normality and homoscedasticity assumptions, Linear Discriminant Analysis (LDA) is known to be optimal in terms of minimising the Bayes error for binary classification.
In the heteroscedastic case, LDA is not guaranteed to minimise this error.
Assuming heteroscedasticity, we derive a linear classifier, the Gaussian Linear Discriminant (GLD), that directly minimises the Bayes error for binary classification.
In addition, we also propose a local neighbourhood search (LNS) algorithm to obtain a more robust classifier if the data is known to have a non-normal distribution.
We evaluate the proposed classifiers on two artificial and ten real-world datasets that cut across a wide range of application areas including handwriting recognition, medical diagnosis and remote sensing, and then compare our algorithm against existing LDA approaches and other linear classifiers.
The GLD is shown to outperform the original LDA procedure in terms of the classification accuracy under heteroscedasticity.
While it compares favourably with other existing heteroscedastic LDA approaches, the GLD requires up to 60 times less training time on some datasets.
Our comparison with the support vector machine (SVM) also shows that the GLD, together with the LNS, requires up to 150 times less training time to achieve an equivalent classification accuracy on some of the datasets.
Thus, our algorithms can provide a cheap and reliable option for classification in many expert systems.
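To make the idea concrete: for a linear rule sign(w·x + b) and two Gaussian classes, the Bayes error is a sum of two Gaussian tail probabilities and can be evaluated in closed form, so it can be improved directly from the LDA initialization. The sketch below uses a simple greedy random search and is only a stand-in for the paper's GLD optimization; all names and constants are our own.

```python
import math, random
import numpy as np

def bayes_error(w, b, mu0, S0, mu1, S1, p0=0.5):
    """Exact error of sign(w.x + b) for two Gaussian classes (class 1 iff > 0)."""
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    s0 = math.sqrt(w @ S0 @ w)
    s1 = math.sqrt(w @ S1 @ w)
    e0 = 1.0 - phi(-(w @ mu0 + b) / s0)  # class 0 sent to the class-1 side
    e1 = phi(-(w @ mu1 + b) / s1)        # class 1 sent to the class-0 side
    return p0 * e0 + (1.0 - p0) * e1

def heteroscedastic_ld(mu0, S0, mu1, S1, iters=2000, seed=0):
    """Start from the LDA solution (pooled covariance) and greedily perturb
    (w, b), keeping only moves that lower the exact Bayes error."""
    rng = random.Random(seed)
    Sp = 0.5 * (S0 + S1)
    w = np.linalg.solve(Sp, mu1 - mu0)   # LDA direction
    b = -0.5 * w @ (mu0 + mu1)           # LDA threshold (equal priors)
    best = bayes_error(w, b, mu0, S0, mu1, S1)
    for _ in range(iters):
        dw = np.array([rng.gauss(0, 0.05) for _ in w])
        db = rng.gauss(0, 0.05)
        e = bayes_error(w + dw, b + db, mu0, S0, mu1, S1)
        if e < best:
            w, b, best = w + dw, b + db, e
    return w, b, best
```

Under equal covariances the search leaves the LDA solution essentially unchanged; under unequal covariances it shifts the hyperplane toward the tighter class, which is the effect the abstract attributes to the GLD.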
Traditionally, the performance of OCR algorithms and systems has been based on the recognition of isolated characters.
When a system classifies an individual character, its output is typically a character label or a reject marker that corresponds to an unrecognized character.
By comparing output labels with the correct labels, the numbers of correct recognitions, substitution errors (misrecognized characters), and rejects (unrecognized characters) are determined.
Nowadays, although recognition of printed isolated characters is performed with high accuracy, recognition of handwritten characters still remains an open problem in the research arena.
The ability to identify machine-printed characters in an automated or semi-automated manner has obvious applications in numerous fields.
Since creating an algorithm with a one hundred percent correct recognition rate is quite probably impossible in our world of noise and varied font styles, it is important to design character recognition algorithms with these failures in mind so that when mistakes are inevitably made, they will at least be understandable and predictable to the person working with the system.
Recently, deep neural networks based on the tanh activation function have shown impressive power in image denoising.
In this letter, we try to use rectifier function instead of tanh and propose a dual-pathway rectifier neural network by combining two rectifier neurons with reversed input and output weights in the same hidden layer.
We derive the equivalent activation function and compare it with some typical activation functions for image denoising under the same network architecture.
The experimental results show that our model achieves superior performance faster, especially when the noise level is low.
Tokenization is the task of chopping a character sequence up into pieces, called tokens, perhaps at the same time throwing away certain characters, such as punctuation.
A token is an instance of a sequence of characters in some particular document that are grouped together as a useful semantic unit for processing.
A new software tool and algorithm to support the information retrieval system (IRS) in the tokenization process are presented.
Our proposed tool filters out four computer character sequences: IP addresses, Web URLs, dates, and email addresses.
Our tool uses pattern matching algorithms and filtration methods.
After this process, the IRS can start a new tokenization process on the newly retrieved text, which will be free of these sequences.
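A filter of this kind can be sketched with regular-expression pattern matching. The patterns below are deliberately simplified illustrations (not the tool's actual patterns); production-grade versions would need to be considerably stricter.

```python
import re

# Illustrative patterns only; real-world patterns would be stricter.
PATTERNS = {
    "ip":    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url":   re.compile(r"\bhttps?://\S+", re.IGNORECASE),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "date":  re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def filter_sequences(text):
    """Remove IP addresses, URLs, email addresses and dates so a later
    tokenization pass sees text free of these sequences."""
    for pattern in PATTERNS.values():
        text = pattern.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()
```

The remaining text can then be handed to the tokenizer as usual.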
Most open-source software systems are available on the Internet today.
Thus, we need automatic methods to label software code.
Software code can be labeled with a set of keywords.
These keywords are referred to in this paper as software labels.
The goal of this paper is to provide a quick view of the software code vocabulary.
This paper proposes an automatic approach to document the object-oriented software by labeling its code.
The approach exploits all software identifiers to label software code.
The paper presents the results of a study conducted on the ArgoUML and drawing shapes case studies.
Results showed that all code labels were correctly identified.
This work discusses an important issue in the area of human resource management by proposing a novel model for creation and evaluation of software teams.
The model consists of several assessments, including a technical test, a quality of life test and a psychological-sociological test.
Since the technical test requires particular organizational specifications and cannot be examined without reference to a specific company, only the sociological test and the quality of life tests are extensively discussed in this work.
Two strategies are discussed for assigning roles in a project.
Initially, six software projects were selected, and after extensive analysis of the projects, two projects were chosen and corrective actions were applied.
An empirical evaluation was also conducted to assess the model's effectiveness.
The experimental results demonstrate that the application of the model improved the productivity of project teams.
In this paper we introduce Epiphany as a high-performance energy-efficient manycore architecture suitable for real-time embedded systems.
This scalable architecture supports floating point operations in hardware and achieves 50 GFLOPS/W in 28 nm technology, making it suitable for high performance streaming applications like radio base stations and radar signal processing.
Through an efficient 2D mesh Network-on-Chip and a distributed shared memory model, the architecture is scalable to thousands of cores on a single chip.
An Epiphany-based open source computer named Parallella was launched in 2012 through Kickstarter crowd funding and has now shipped to thousands of customers around the world.
The increasing number of applications requiring the solution of large-scale singular value problems has rekindled interest in iterative methods for the SVD.
Some promising recent advances in large-scale iterative methods are still plagued by slow convergence and accuracy limitations when computing the smallest singular triplets.
Furthermore, their current implementations in MATLAB cannot address the required large problems.
Recently, we presented a preconditioned, two-stage method to effectively and accurately compute a small number of extreme singular triplets.
In this research, we present high-performance software, PRIMME SVDS, that implements our hybrid method based on the state-of-the-art eigensolver package PRIMME for both largest and smallest singular values.
PRIMME SVDS fills a gap in production level software for computing the partial SVD, especially with preconditioning.
The numerical experiments demonstrate its superior performance compared to other state-of-the-art software and its good parallel performance under strong and weak scaling.
Support vector machines represent a promising development in machine learning research that is not widely used within the remote sensing community.
This paper reports the results of experiments on multispectral (Landsat-7 ETM+) and hyperspectral (DAIS) data in which multi-class SVMs are compared with maximum likelihood and artificial neural network methods in terms of classification accuracy.
Our results show that the SVM achieves a higher level of classification accuracy than either the maximum likelihood or the neural classifier, and that the support vector machine can be used with small training datasets and high-dimensional data.
This paper presents an approach that exploits Java annotations to provide the meta-information needed to automatically transform plain Java programs into parallel code that can be run on multicore workstations.
Programmers just need to decorate the methods that will eventually be executed in parallel with standard Java annotations.
Annotations are automatically processed at launch-time and parallel byte code is derived.
Once in execution the program automatically retrieves the information about the executing platform and evaluates the information specified inside the annotations to transform the byte-code into a semantically equivalent multithreaded version, depending on the target architecture features.
The results returned by the annotated methods, when invoked, are futures with a wait-by-necessity semantics.
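The paper targets Java annotations and bytecode transformation, but the core idea (mark a method, have its calls run asynchronously, and return a future with wait-by-necessity semantics) can be illustrated language-neutrally. The decorator name and pool size below are our own illustrative choices, not the paper's annotations or processing pipeline.

```python
from concurrent.futures import ThreadPoolExecutor, Future

_pool = ThreadPoolExecutor(max_workers=4)

def parallel(fn):
    """Analogue of a hypothetical @Parallel method annotation: calls return
    a Future immediately; the value is materialized only when first used
    (wait-by-necessity)."""
    def wrapper(*args, **kwargs) -> Future:
        return _pool.submit(fn, *args, **kwargs)
    return wrapper

@parallel
def heavy_sum(n):
    return sum(range(n))

f = heavy_sum(1_000)   # returns immediately with a Future
# ... other work can proceed here ...
result = f.result()    # wait-by-necessity: block only when the value is needed
```

In the paper's setting this rewriting happens at launch time on the bytecode rather than via a runtime wrapper, which lets the transformation adapt to the target architecture.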
Bug localization in object-oriented programs has always been an important issue in software engineering.
In this paper, I propose a source-level bug localization technique for object-oriented embedded programs.
My proposed technique presents the idea of debugging an object-oriented program at the class level, incorporating object state information into the Class Dependence Graph (ClDG).
Given a program (containing a buggy statement) and an input that fails while others pass, my approach uses concrete as well as symbolic execution to synthesize passing inputs that differ only marginally from the failing input in their control-flow behavior.
A comparison of the execution traces of the failing input and the passing input provides necessary clues to the root-cause of the failure.
A state trace difference over the respective nodes of the ClDG is obtained, which leads to the detection of the bug in the program.
We propose a robust classifier to predict buying intentions based on user behaviour within a large e-commerce website.
In this work we compare traditional machine learning techniques with the most advanced deep learning approaches.
We show that both Deep Belief Networks and Stacked Denoising Auto-Encoders achieved a substantial improvement by extracting features from high-dimensional data during the pre-training phase.
They also prove to be better suited to dealing with severe class imbalance.
The problem of MIMO channel estimation at millimeter wave (mmWave) frequencies, both in a single-user and in a multi-user setting, is tackled in this paper.
Using a subspace approach, we develop a protocol enabling the estimation of the right (resp. left) singular vectors at the transmitter (resp. receiver) side; then, we adapt the projection approximation subspace tracking with deflation (PASTd) and the orthogonal Oja (OOJA) algorithms to our framework and obtain two channel estimation algorithms.
We also present an alternative algorithm based on the least squares (LS) approach.
The hybrid analog/digital nature of the beamformer is also explicitly taken into account at the algorithm design stage.
Our results clearly show that the proposed algorithms are very effective in estimating the principal directions of the MIMO channel matrix, and that they compare favorably, in terms of the performance-complexity trade-off, with respect to several competing alternatives.
Sequential neural networks models are powerful tools in a variety of Natural Language Processing (NLP) tasks.
The sequential nature of these models raises the questions: to what extent can these models implicitly learn hierarchical structures typical to human language, and what kind of grammatical phenomena can they acquire?
We focus on the task of agreement prediction in Basque, as a case study for a task that requires implicit understanding of sentence structure and the acquisition of a complex but consistent morphological system.
Analyzing experimental results from two syntactic prediction tasks -- verb number prediction and suffix recovery -- we find that sequential models perform worse on agreement prediction in Basque than one might expect on the basis of previous agreement prediction work in English.
Tentative findings based on diagnostic classifiers suggest the network makes use of local heuristics as a proxy for the hierarchical structure of the sentence.
We propose the Basque agreement prediction task as a challenging benchmark for models that attempt to learn regularities in human language.
In multi-cycle assignment problems with rotational diversity, a set of tasks has to be repeatedly assigned to a set of agents.
Over multiple cycles, the goal is to achieve a high diversity of assignments from tasks to agents.
At the same time, the assignments' profit has to be maximized in each cycle.
Due to changing availability of tasks and agents, planning ahead is infeasible and each cycle is an independent assignment problem but influenced by previous choices.
We approach the multi-cycle assignment problem as a two-part problem: profit maximization and rotation are combined into one objective value, which is then solved as a General Assignment Problem.
Rotational diversity is maintained with a single execution of the costly assignment model.
Our simple, yet effective method is applicable to different domains and applications.
Experiments show the applicability on a multi-cycle variant of the multiple knapsack problem and a real-world case study on the test case selection and assignment problem, an example from the software engineering domain, where test cases have to be distributed over compatible test machines.
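The combined objective can be sketched as follows: each agent-task pair scores its profit plus a rotation bonus when the pair has not been used in recent cycles, and a single assignment solver is run per cycle. A brute-force matcher stands in here for the paper's General Assignment Problem solver, and the weight and bonus values are our own illustrative choices.

```python
from itertools import permutations

def assign_cycle(profit, history, w_rotation=0.5):
    """One cycle: maximize profit + rotation bonus over a perfect matching
    of agents to tasks (brute force; a GAP/ILP solver in practice).

    profit[a][t]: profit of agent a doing task t.
    history: set of (agent, task) pairs used in recent cycles.
    """
    n = len(profit)
    def score(perm):
        return sum(profit[a][t] + (0 if (a, t) in history else w_rotation)
                   for a, t in enumerate(perm))
    best = max(permutations(range(n)), key=score)
    return list(enumerate(best))

def run_cycles(profit, cycles=3, w_rotation=0.5):
    """Repeat assignment, remembering past pairs to encourage rotation."""
    history, plans = set(), []
    for _ in range(cycles):
        plan = assign_cycle(profit, history, w_rotation)
        history |= set(plan)
        plans.append(plan)
    return plans
```

With equal profits, the rotation bonus alone steers successive cycles toward previously unused agent-task pairs; with unequal profits, the bonus only breaks near-ties, so per-cycle profit stays close to optimal.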
Lane mark detection is an important element in the road scene analysis for Advanced Driver Assistant System (ADAS).
Limited by the onboard computing power, it is still a challenge to reduce system complexity and maintain high accuracy at the same time.
In this paper, we propose a Lane Marking Detector (LMD) using a deep convolutional neural network to extract robust lane marking features.
To improve its performance with a target of lower complexity, the dilated convolution is adopted.
A shallower and thinner structure is designed to decrease the computational cost.
Moreover, we also design post-processing algorithms to construct 3rd-order polynomial models to fit into the curved lanes.
Our system shows promising results on the captured road scenes.
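The curve-fitting step can be sketched as a least-squares fit of a 3rd-order polynomial to detected lane-marking pixels. Fitting x as a function of the row coordinate y is a common choice for near-vertical lanes in forward-facing images; the paper's actual pipeline and coordinate frame may differ, and the sample curve below is our own.

```python
import numpy as np

def fit_lane(points, order=3):
    """Fit x = f(y) as a 3rd-order polynomial to lane-marking pixels.

    points: array of (x, y) pixel coordinates on one lane marking.
    Fitting x as a function of y (row coordinate) handles near-vertical
    lanes, which are common in forward-facing road images.
    """
    pts = np.asarray(points, dtype=float)
    coeffs = np.polyfit(pts[:, 1], pts[:, 0], order)
    return np.poly1d(coeffs)  # callable model: x = lane(y)

# Example: samples from a gently curving lane
ys = np.linspace(0, 100, 40)
xs = 0.0005 * ys**3 - 0.02 * ys**2 + 0.5 * ys + 100
lane = fit_lane(np.column_stack([xs, ys]))
```

The resulting polynomial model can then be evaluated per image row to reconstruct the curved lane boundary.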
We propose a type system for a calculus of contracting processes.
Processes can establish sessions by stipulating contracts, and then can interact either by keeping the promises made, or not.
Type safety guarantees that a typeable process is honest - that is, it abides by the contracts it has stipulated in all possible contexts, even in presence of dishonest adversaries.
Type inference is decidable, and it allows one to safely approximate the honesty of processes using either synchronous or asynchronous communication.
The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs.
Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR.
This paper documents some of our experiences with these tools, which ones worked and which did not, and what we've learned from them.
The result is a codec which compares favorably with HEVC on still images, and is on a path to do so for video as well.
The majority of Artificial Neural Network (ANN) implementations in autonomous systems use a fixed/user-prescribed network topology, leading to sub-optimal performance and low portability.
The existing neuro-evolution of augmenting topologies (NEAT) paradigm offers a powerful alternative by allowing the network topology and the connection weights to be simultaneously optimized through an evolutionary process.
However, most NEAT implementations allow the consideration of only a single objective.
There also persists the question of how to tractably introduce topological diversification that mitigates overfitting to training scenarios.
To address these gaps, this paper develops a multi-objective neuro-evolution algorithm.
While adopting the basic elements of NEAT, important modifications are made to the selection, speciation, and mutation processes.
With the backdrop of small-robot path-planning applications, an experience-gain criterion is derived to encapsulate the amount of diverse local environment encountered by the system.
This criterion facilitates the evolution of genes that support exploration, thereby seeking to generalize from a smaller set of mission scenarios than possible with performance maximization alone.
The effectiveness of the single-objective (optimizing performance) and the multi-objective (optimizing performance and experience-gain) neuro-evolution approaches are evaluated on two different small-robot cases, with ANNs obtained by the multi-objective optimization observed to provide superior performance in unseen scenarios.
Stream computation is one of the approaches suitable for FPGA-based custom computing due to its high throughput capability brought by pipelining with regular memory access.
To increase performance of iterative stream computation, we can exploit both temporal and spatial parallelism by deepening and duplicating pipelines, respectively.
However, the performance is constrained by several factors including available hardware resources on FPGA, an external memory bandwidth, and utilization of pipeline stages, and therefore we need to find the best mix of the different parallelism to achieve the highest performance per power.
In this paper, we present a domain-specific language (DSL) based design space exploration for temporally and/or spatially parallel stream computation with FPGA.
We define a DSL where we can easily design a hierarchical structure of parallel stream computation with abstract description of computation.
For iterative stream computation of fluid dynamics simulation, we design hardware structures with a different mix of the temporal and spatial parallelism.
By measuring the performance and the power consumption, we find the best among them.
While motivation is of great interest to computing educators, relatively little work has been done on understanding faculty attitudes toward student motivation.
Two previous qualitative studies of instructor attitudes found results identical to those from other disciplines, but neither study considered whether instructors perceive student motivation to be more important in certain computing classes.
In this work we present quantitative results about the perceived importance of student motivation in computing courses on the part of computing educators.
Our survey results show that while a majority of respondents believe student motivation is necessary in all computing courses, the structure and audience in certain computing classes elevate the importance of student motivation.
We determine necessary conditions on the structure of symbol error rate (SER) optimal quantizers for limited feedback beamforming in wireless networks with one transmitter-receiver pair and R parallel amplify-and-forward relays.
We call a quantizer codebook "small" if its cardinality is less than R, and "large" otherwise.
A "d-codebook" depends on the power constraints and can be optimized accordingly, while an "i-codebook" remains fixed.
It was previously shown that any i-codebook that contains the single-relay selection (SRS) codebook achieves the full diversity order, R. We prove the following: every full-diversity i-codebook contains the SRS codebook, and thus is necessarily large.
In general, as the power constraints grow to infinity, the limit of an optimal large d-codebook contains an SRS codebook, provided that it exists.
For small codebooks, the maximal diversity is equal to the codebook cardinality.
Every diversity-optimal small i-codebook is an orthogonal multiple-relay selection (OMRS) codebook.
Moreover, the limit of an optimal small d-codebook is an OMRS codebook.
We observe that SRS is nothing but a special case of OMRS for codebooks with cardinality equal to R. As a result, we refer to OMRS as "the universal necessary condition" for codebook optimality.
Finally, we confirm our analytical findings through simulations.
We equip dynamic geometry software (DGS) with a user-friendly method that enables massively parallel calculations on the graphics processing unit (GPU).
This interplay of DGS and GPU opens up various applications in education and mathematical research.
The GPU-aided discovery of mathematical properties, interactive visualizations of algebraic surfaces (raycasting), the mathematical deformation of images and footage in real-time, and computationally demanding numerical simulations of PDEs are examples from the long and versatile list of new domains that our approach makes accessible within a DGS.
We ease the development of complex (mathematical) visualizations and provide a rapid-prototyping scheme for general-purpose computations (GPGPU).
The possibility to program both CPU and GPU with the use of only one high-level (scripting) programming language is a crucial aspect of our concept.
We embed shader programming seamlessly within a high-level (scripting) programming environment.
The aforementioned requires transcompiling a high-level programming language into a shader language for the GPU; in this article, we address the challenge of this automatic translation.
To maintain platform independence and the possibility to use our technology on modern devices, we focus on a realization through WebGL.
In the context of efforts to combine category-theoretic and logical methods in the area of knowledge representation, we propose the notion of a conceptory.
We consider intersection/union and other constructions in conceptories as an expressive alternative to category-theoretic (co)limits and show that they have features similar to (pro-, in-)jections.
We then briefly discuss approaches to the development of formal systems built on conceptories and describe a possible application of such a system to a specific ontology.
A challenge in multiagent control systems is to ensure that they are appropriately resilient to communication failures between the various agents.
In many common game-theoretic formulations of these types of systems, it is implicitly assumed that all agents have access to as much information about other agents' actions as needed.
This paper endeavors to augment these game-theoretic methods with policies that would allow agents to react on-the-fly to losses of this information.
Unfortunately, we show that even if a single agent loses communication with one other weakly-coupled agent, this can cause arbitrarily-bad system states to emerge as various solution concepts of an associated game, regardless of how the agent accounts for the communication failure and regardless of how weakly coupled the agents are.
Nonetheless, we show that the harm that communication failures can cause is limited by the structure of the problem; when agents' action spaces are richer, problems are more susceptible to these types of pathologies.
Finally, we undertake an initial study into how a system designer might prevent these pathologies, and explore a few limited settings in which communication failures cannot cause harm.
CANDECOMP/PARAFAC (CPD) approximates multiway data by sum of rank-1 tensors.
Our recent study presented a method for rank-1 tensor deflation, i.e., sequential extraction of the rank-1 components.
In this paper, we extend the method to the block deflation problem.
When at least two factor matrices have full column rank, one can extract two rank-1 tensors simultaneously, and the rank of the data tensor is reduced by 2.
For decomposition of order-3 tensors of size R x R x R and rank-R, the block deflation has a complexity of O(R^3) per iteration which is lower than the cost O(R^4) of the ALS algorithm for the overall CPD.
As a ubiquitous method in natural language processing, word embeddings are extensively employed to map semantic properties of words into a dense vector representation.
They capture semantic and syntactic relations among words, but the vectors corresponding to the words are only meaningful relative to each other.
Neither the vector nor its dimensions have any absolute, interpretable meaning.
We introduce an additive modification to the objective function of the embedding learning algorithm that encourages the embedding vectors of words that are semantically related to a predefined concept to take larger values along a specified dimension, while leaving the original semantic learning mechanism mostly unaffected.
In other words, we align words that are already determined to be related, along predefined concepts.
Therefore, we impart interpretability to the word embedding by assigning meaning to its vector dimensions.
The predefined concepts are derived from an external lexical resource, which in this paper is chosen as Roget's Thesaurus.
We observe that alignment along the chosen concepts is not limited to words in the Thesaurus and extends to other related words as well.
We quantify the extent of interpretability and assignment of meaning from our experimental results.
We also demonstrate the preservation of semantic coherence of the resulting vector space by using word-analogy and word-similarity tests.
These tests show that the interpretability-imparted word embeddings that are obtained by the proposed framework do not sacrifice performances in common benchmark tests.
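The additive term can be sketched as a hinge penalty that pulls concept words toward larger values along one chosen dimension while leaving all other coordinates untouched. The penalty form, the step size, and the toy vocabulary are our own illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def align_concept(emb, vocab, concept_words, dim, lam=0.1, steps=50):
    """Add a penalty lam * max(0, 1 - v[dim]) for each concept word and take
    gradient steps only on that coordinate, so the rest of the embedding
    (and its semantic structure) is left untouched."""
    emb = emb.copy()
    idx = [vocab[w] for w in concept_words if w in vocab]
    for _ in range(steps):
        for i in idx:
            if emb[i, dim] < 1.0:     # hinge is active
                emb[i, dim] += lam    # gradient step on the penalty term
    return emb

# Toy vocabulary: two words related to a "joy" concept, one unrelated.
vocab = {"happy": 0, "joyful": 1, "table": 2}
emb = np.zeros((3, 4))
aligned = align_concept(emb, vocab, ["happy", "joyful"], dim=2)
```

In the full method this term is added to the embedding training objective itself, so alignment also propagates to related words outside the predefined concept lists.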
Ever since the feasibility of in-band full-duplex (FD) at the Physical (PHY) layer has been established, several studies have emerged investigating protocol aspects of enabling FD operation in various legacy wireless technologies.
Recently, the adoption of a simultaneous transmit and receive (STR) mode for next generation wireless local area networks (WLANs) has received significant attention.
Enabling STR mode (FD communication mode) in 802.11 WLANs creates bi-directional FD (BFD) and uni-directional FD (UFD) links.
STR mode in 802.11 WLANs must be enabled with minimal protocol modifications while accounting for the co-existence and compatibility with legacy nodes and protocols.
This paper provides a novel solution that leverages carrier sense multiple access with enhanced collision avoidance (CSMA/ECA) and adaptive sensitivity control mechanisms to enable STR operation.
The key aspects of the proposed solution include co-existence with legacy nodes, identification of eligible nodes for UFD, optimization of secondary BFD and UFD transmissions, and creation of UFD opportunities.
Performance evaluation demonstrates that the proposed solution is effective in achieving the gains provided by STR operation.
Starting from a 3D electrothermal field problem discretized by the Finite Integration Technique, the equivalence to a circuit description is shown by exploiting the analogy to the Modified Nodal Analysis approach.
Using this analogy, an algorithm for the automatic generation of a monolithic SPICE netlist is presented.
Joule losses from the electrical circuit are included as heat sources in the thermal circuit.
The thermal simulation yields nodal temperatures that influence the electrical conductivity.
Apart from the used field discretization, this approach applies no further simplifications.
An example 3D chip package is used to validate the algorithm.
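The netlist-generation idea can be sketched in miniature: map each edge of a discretized grid to a resistor between named nodes and emit SPICE cards. The paper's monolithic electrothermal netlist is far richer (a coupled thermal circuit, controlled sources for the Joule losses), so the function below is only a toy of the purely resistive part, with our own naming scheme.

```python
def grid_to_netlist(nx, ny, conductance):
    """Emit a SPICE netlist for a 2D resistive grid (FIT-style discretization).

    conductance(n1, n2) -> G gives the edge conductance between grid nodes;
    each edge becomes a resistor card 'R<k> <node1> <node2> <value>'.
    """
    name = lambda i, j: f"n_{i}_{j}"
    cards, k = [], 0
    for i in range(nx):
        for j in range(ny):
            for di, dj in ((1, 0), (0, 1)):  # right and down neighbors
                ii, jj = i + di, j + dj
                if ii < nx and jj < ny:
                    g = conductance(name(i, j), name(ii, jj))
                    cards.append(f"R{k} {name(i, j)} {name(ii, jj)} {1.0 / g:g}")
                    k += 1
    return "\n".join(["* auto-generated resistive grid"] + cards + [".end"])
```

The resulting text can be handed to a SPICE simulator directly, which is what makes the monolithic-netlist approach attractive.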
Deep learning techniques have been paramount in recent years, mainly due to their outstanding results in a number of applications that range from speech recognition to face-based user identification.
Among the techniques employed for such purposes, Deep Boltzmann Machines (DBMs) are some of the most used; they are composed of layers of Restricted Boltzmann Machines (RBMs) stacked on top of each other.
In this work, we evaluate the concept of temperature in DBMs, which plays a key role in Boltzmann-related distributions but has never been considered in this context to date.
Therefore, the main contribution of this paper is to take into account this information and to evaluate its influence in DBMs considering the task of binary image reconstruction.
We expect this work can foster future research considering the usage of different temperatures during learning in DBMs.
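Where temperature enters can be illustrated at the level of a single RBM layer: dividing the pre-activation by T flattens the conditional distribution toward 0.5 as T grows and sharpens it as T shrinks. This is a minimal sketch of the mechanism only, not the paper's DBM training procedure, and the toy weights are our own.

```python
import numpy as np

def hidden_probs(v, W, b_h, T=1.0):
    """P(h_j = 1 | v) = sigmoid((W^T v + b_h) / T) for a binary RBM layer.

    T = 1 recovers the standard RBM; large T flattens all probabilities
    toward 0.5, while small T makes the units nearly deterministic.
    """
    pre = (v @ W + b_h) / T
    return 1.0 / (1.0 + np.exp(-pre))

# Toy visible vector and weights for demonstration.
rng = np.random.default_rng(0)
v = rng.integers(0, 2, size=6).astype(float)
W = rng.normal(size=(6, 4))
b_h = np.zeros(4)
```

Evaluating the same layer at different T values shows the flattening/sharpening effect directly, which is the knob whose influence on binary image reconstruction the work evaluates.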
Information quality in social media is an increasingly important issue, but web-scale data hinders experts' ability to assess and correct much of the inaccurate content, or `fake news,' present in these platforms.
This paper develops a method for automating fake news detection on Twitter by learning to predict accuracy assessments in two credibility-focused Twitter datasets: CREDBANK, a crowdsourced dataset of accuracy assessments for events in Twitter, and PHEME, a dataset of potential rumors in Twitter and journalistic assessments of their accuracies.
We apply this method to Twitter content sourced from BuzzFeed's fake news dataset and show models trained against crowdsourced workers outperform models based on journalists' assessment and models trained on a pooled dataset of both crowdsourced workers and journalists.
All three datasets, aligned into a uniform format, are also publicly available.
A feature analysis then identifies features that are most predictive for crowdsourced and journalistic accuracy assessments, results of which are consistent with prior work.
We close with a discussion contrasting accuracy and credibility and why models of non-experts outperform models of journalists for fake news detection in Twitter.
In this paper, we focus on online representation learning in non-stationary environments which may require continuous adaptation of model architecture.
We propose a novel online dictionary-learning (sparse-coding) framework which incorporates the addition and deletion of hidden units (dictionary elements), and is inspired by the adult neurogenesis phenomenon in the dentate gyrus of the hippocampus, known to be associated with improved cognitive function and adaptation to new environments.
In the online learning setting, where new input instances arrive sequentially in batches, the neuronal-birth is implemented by adding new units with random initial weights (random dictionary elements); the number of new units is determined by the current performance (representation error) of the dictionary, higher error causing an increase in the birth rate.
Neuronal-death is implemented by imposing l1/l2-regularization (group sparsity) on the dictionary within the block-coordinate descent optimization at each iteration of our online alternating minimization scheme, which iterates between the code and dictionary updates.
Finally, hidden unit connectivity adaptation is facilitated by introducing sparsity in dictionary elements.
Our empirical evaluation on several real-life datasets (images and language) as well as on synthetic data demonstrates that the proposed approach can considerably outperform the state-of-the-art fixed-size (nonadaptive) online sparse coding of Mairal et al. (2009) in the presence of nonstationary data.
Moreover, we identify certain properties of the data (e.g., sparse inputs with nearly non-overlapping supports) and of the model (e.g., dictionary sparsity) associated with such improvements.
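The birth/death mechanism lends itself to a compact sketch. The following is a minimal illustration under stated assumptions, not the authors' implementation: ISTA is used for the coding step, the birth rate is taken proportional to the representation error, and pruning by column norm stands in for the effect of the l1/l2 group-sparsity regularizer.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_code(D, x, lam=0.1, iters=100):
    """ISTA: proximal-gradient solver for the l1-penalized coding step."""
    lr = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-8)  # step size from spectral norm
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        a = a - lr * (D.T @ (D @ a - x))                      # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0)  # soft threshold
    return a

def adapt_dictionary(D, err, birth_scale=5.0, death_thresh=1e-3):
    """Neuronal birth: add random unit-norm atoms, more when the current
    representation error is high.  Neuronal death: drop atoms whose
    column (group) norm has collapsed, mimicking the effect of l1/l2
    group sparsity on the dictionary."""
    n_new = int(birth_scale * err)
    if n_new > 0:
        new = rng.standard_normal((D.shape[0], n_new))
        new /= np.linalg.norm(new, axis=0)
        D = np.hstack([D, new])
    keep = np.linalg.norm(D, axis=0) > death_thresh
    return D[:, keep]

d, k = 8, 4
D = rng.standard_normal((d, k))
D /= np.linalg.norm(D, axis=0)
x = rng.standard_normal(d)
err = np.linalg.norm(x - D @ sparse_code(D, x))   # current representation error
D_adapted = adapt_dictionary(D, err)              # grow/shrink the dictionary
```

In the full online scheme this adaptation step would alternate with dictionary updates over incoming batches; the sketch shows only one iteration.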
Decreasing costs of vision sensors and advances in embedded hardware have boosted lane-related research (detection, estimation, and tracking) in the past two decades.
The interest in this topic has increased even more with the demand for advanced driver assistance systems (ADAS) and self-driving cars.
Although extensively studied independently, there is still need for studies that propose a combined solution for the multiple problems related to the ego-lane, such as lane departure warning (LDW), lane change detection, lane marking type (LMT) classification, road markings detection and classification, and detection of adjacent lanes (i.e., immediate left and right lanes) presence.
In this paper, we propose a real-time Ego-Lane Analysis System (ELAS) capable of estimating ego-lane position, classifying LMTs and road markings, performing LDW and detecting lane change events.
The proposed vision-based system works on a temporal sequence of images.
Lane marking features are extracted in perspective and Inverse Perspective Mapping (IPM) images that are combined to increase robustness.
The final estimated lane is modeled as a spline using a combination of methods (Hough lines with Kalman filter and spline with particle filter).
Based on the estimated lane, all other events are detected.
To validate ELAS and cover the lack of lane datasets in the literature, a new dataset with more than 20 different scenes (in more than 15,000 frames) and considering a variety of scenarios (urban road, highways, traffic, shadows, etc.) was created.
The dataset was manually annotated and made publicly available to enable evaluation of several events that are of interest for the research community (i.e., lane estimation, change, and centering; road markings; intersections; LMTs; crosswalks and adjacent lanes).
ELAS achieved high detection rates in all real-world events and proved to be ready for real-time applications.
Optical Character Recognition (OCR) has been a topic of interest for many years.
It is defined as the process of digitizing a document image into its constituent characters.
Despite decades of intense research, developing OCR with capabilities comparable to that of human still remains an open challenge.
Due to this challenging nature, researchers from industry and academia have directed their attention towards Optical Character Recognition.
Over the last few years, the number of academic laboratories and companies involved in research on Character Recognition has increased dramatically.
This research aims at summarizing the research so far done in the field of OCR.
It provides an overview of different aspects of OCR and discusses corresponding proposals aimed at resolving issues of OCR.
Yes, it can.
Data augmentation is perhaps the oldest preprocessing step in computer vision literature.
Almost every computer vision model trained on imaging data uses some form of augmentation.
In this paper, we use the inter-vertebral disk segmentation task alongside a deep residual U-Net as the learning model, to explore the effectiveness of augmentation.
In the extreme, we observed that a model trained on patches extracted from just one scan, with each patch augmented 50 times, achieved a Dice score of 0.73 on a validation set of 40 cases.
Qualitative evaluation indicated a clinically usable segmentation algorithm, which appropriately segments regions of interest, alongside limited false positive specks.
When the initial patches are extracted from nine scans the average Dice coefficient jumps to 0.86 and most of the false positives disappear.
While this still falls short of state-of-the-art deep learning based segmentation of discs reported in literature, qualitative examination reveals that it does yield segmentation, which can be amended by expert clinicians with minimal effort to generate additional data for training improved deep models.
Extreme augmentation of training data should thus be construed as a strategy for training deep learning based algorithms when very little manually annotated data is available to work with.
Models trained with extreme augmentation can then be used to accelerate the generation of manually labelled data.
Hence, we show that extreme augmentation can be a valuable tool in addressing scaling up small imaging data sets to address medical image segmentation tasks.
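The "one patch, 50 variants" idea can be sketched with standard geometric and photometric transforms. This is a minimal illustration with assumed transforms (flips, right-angle rotations, additive noise); the paper's exact augmentation pipeline may differ.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(patch, n_aug=50, noise_std=0.05):
    """Generate n_aug randomized variants of a single 2D patch."""
    out = []
    for _ in range(n_aug):
        p = patch
        if rng.random() < 0.5:               # random horizontal flip
            p = p[:, ::-1]
        if rng.random() < 0.5:               # random vertical flip
            p = p[::-1, :]
        p = np.rot90(p, k=rng.integers(4))   # random 90-degree rotation
        p = p + rng.normal(0.0, noise_std, p.shape)  # additive noise
        out.append(p)
    return np.stack(out)

patch = rng.random((32, 32))   # one patch from a single annotated scan
batch = augment(patch)         # -> 50 training samples from that patch
```

Each annotated patch thus yields a 50-sample mini training set, which is the scaling trick the abstract describes.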
The successes of previous and current Mars rovers have encouraged space agencies worldwide to pursue additional planetary exploration missions with more ambitious navigation goals.
For example, NASA's planned Mars Sample Return mission will be a multi-year undertaking that will require a solar-powered rover to drive over 150 metres per sol for approximately three months.
This paper reviews the mobility planning framework used by current rovers and surveys the major challenges involved in continuous long-distance navigation on the Red Planet.
It also discusses recent work related to environment-aware and energy-aware navigation, and provides a perspective on how such work may eventually allow a solar-powered rover to achieve autonomous long-distance navigation on Mars.
GENESIS3 is the new version of the GENESIS software environment for musical creation by means of mass-interaction physics network modeling.
It was designed, and developed from scratch, drawing on more than 10 years of working on and using the previous version.
We take the opportunity of this birth to provide in this article (1) an analysis of the peculiarities of GENESIS, aiming at highlighting its core 'software paradigm'; and (2) an update on the features of the new version as compared to the last.
We investigate conditions under which a co-computably enumerable set in a computable metric space is computable.
Using higher-dimensional chains and spherical chains we prove that in each computable metric space which is locally computable each co-computably enumerable sphere is computable and each co-c.e. cell with co-c.e. boundary sphere is computable.
We present a method for discovering never-seen-before objects in 3D point clouds obtained from sensors like Microsoft Kinect.
We generate supervoxels directly from the point cloud data and use them with a Siamese network, built on a recently proposed 3D convolutional neural network architecture.
We use known objects to train a non-linear embedding of supervoxels, by optimizing the criteria that supervoxels which fall on the same object should be closer than those which fall on different objects, in the embedding space.
We test on unknown objects, which were not seen during training, and perform clustering in the learned embedding space of supervoxels to effectively perform novel object discovery.
We validate the method with extensive experiments, quantitatively showing that it can discover numerous unseen objects while being trained on only a few dense 3D models.
We also show very good qualitative results of object discovery in point cloud data when the test objects, either specific instances or even categories, were never seen during training.
Convolutional Neural Networks (CNNs) have demonstrated great results for the single-image super-resolution (SISR) problem.
Currently, most CNN algorithms promote deep and computationally expensive models to solve SISR.
However, we propose a novel SISR method that requires comparatively few computations.
During training, we obtain group convolutions from which unused connections have been pruned.
We have refined this system specifically for the task at hand by removing unnecessary modules from the original CondenseNet.
Further, a reconstruction network consisting of deconvolutional layers has been used in order to upscale to high resolution.
All these steps significantly reduce the number of computations required at testing time.
Along with this, bicubic upsampled input is added to the network output for easier learning.
Our model is named SRCondenseNet.
We evaluate the method using various benchmark datasets and show that it performs favourably against the state-of-the-art methods in terms of both accuracy and number of computations required.
We consider the two-sided stable matching setting in which there may be uncertainty about the agents' preferences due to limited information or communication.
We consider three models of uncertainty: (1) lottery model --- in which for each agent, there is a probability distribution over linear preferences, (2) compact indifference model --- for each agent, a weak preference order is specified and each linear order compatible with the weak order is equally likely and (3) joint probability model --- there is a lottery over preference profiles.
For each of the models, we study the computational complexity of computing the stability probability of a given matching as well as finding a matching with the highest probability of being stable.
We also examine more restricted problems such as deciding whether a certainly stable matching exists.
We find a rich complexity landscape for these problems, indicating that the form uncertainty takes is significant.
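Computing the stability probability of a matching under the lottery model can be approximated by sampling: draw a preference profile, run the classic blocking-pair test, and average. The profile below is a hypothetical example constructed so the true probability is exactly 1/2; the paper studies the exact computational complexity of this quantity, not this Monte Carlo estimator.

```python
import itertools
import random

random.seed(0)

def is_stable(matching, men_pref, women_pref):
    """Classic stability test: a matching is stable iff no man and woman
    both prefer each other to their assigned partners."""
    wife = dict(matching)
    husband = {w: m for m, w in matching}
    for m, w in itertools.product(men_pref, women_pref):
        if wife[m] == w:
            continue
        m_prefers = men_pref[m].index(w) < men_pref[m].index(wife[m])
        w_prefers = women_pref[w].index(m) < women_pref[w].index(husband[w])
        if m_prefers and w_prefers:
            return False                     # blocking pair found
    return True

def stability_probability(matching, sample_profile, n_samples=2000):
    """Monte Carlo estimate of P(matching is stable) under a lottery model."""
    hits = sum(is_stable(matching, *sample_profile()) for _ in range(n_samples))
    return hits / n_samples

# Hypothetical lottery: woman x's order is a fair coin flip, all else fixed,
# so the matching {a-x, b-y} is stable with probability 1/2
def sample_profile():
    men = {"a": ["x", "y"], "b": ["x", "y"]}
    women = {"x": ["a", "b"] if random.random() < 0.5 else ["b", "a"],
             "y": ["a", "b"]}
    return men, women

p_stable = stability_probability([("a", "x"), ("b", "y")], sample_profile)
```

Sampling gives only an estimate, which is consistent with the hardness results: exact stability-probability computation is the object of study in the paper.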
Current language models have a significant limitation in the ability to encode and decode factual knowledge.
This is mainly because they acquire such knowledge from statistical co-occurrences although most of the knowledge words are rarely observed.
In this paper, we propose a Neural Knowledge Language Model (NKLM) which combines symbolic knowledge provided by the knowledge graph with the RNN language model.
By predicting whether the word to generate has an underlying fact or not, the model can generate such knowledge-related words by copying from the description of the predicted fact.
In experiments, we show that the NKLM significantly improves the performance while generating a much smaller number of unknown words.
Lensless imaging is an important and challenging problem.
One notable solution to lensless imaging is a single pixel camera which benefits from ideas central to compressive sampling.
However, traditional single pixel cameras require many illumination patterns which result in a long acquisition process.
Here we present a method for lensless imaging based on compressive ultrafast sensing.
Each sensor acquisition is encoded with a different illumination pattern and produces a time series where time is a function of the photon's origin in the scene.
Currently available hardware with picosecond time resolution enables time tagging photons as they arrive to an omnidirectional sensor.
This allows lensless imaging with significantly fewer patterns compared to regular single pixel imaging.
To that end, we develop a framework for designing lensless imaging systems that use ultrafast detectors.
We provide an algorithm for ideal sensor placement and an algorithm for optimized active illumination patterns.
We show that efficient lensless imaging is possible with ultrafast measurement and compressive sensing.
This paves the way for novel imaging architectures and remote sensing in extreme situations where imaging with a lens is not possible.
Middleboxes have become a vital part of modern networks by providing service functions such as content filtering, load balancing and optimization of network traffic.
An ordered sequence of middleboxes composing a logical service is called service chain.
Service Function Chaining (SFC) enables us to define these service chains.
Recent optimization models of SFCs assume that the functionality of a middlebox is provided by a single software appliance, commonly known as Virtual Network Function (VNF).
This assumption limits SFCs to the throughput of an individual VNF and resources of a physical machine hosting the VNF instance.
Moreover, typical service providers offer VNFs with heterogeneous throughput and resource configurations.
Thus, deploying a service chain with custom throughput can become a tedious process of stitching heterogeneous VNF instances.
In this paper, we describe how we can overcome these limitations without worrying about underlying VNF configurations and resource constraints.
This prospect is achieved by deploying multiple distributed VNF instances that together provide the functionality of a middlebox, and by modeling the optimal deployment of a service chain as a mixed integer programming problem.
The proposed model optimizes host and bandwidth resources allocation, and determines the optimal placement of VNF instances, while balancing workload and routing traffic among these VNF instances.
We show that this problem is NP-Hard and propose a heuristic solution called Kariz.
Kariz utilizes a tuning parameter to control the trade-off between speed and accuracy of the solution.
Finally, our solution is evaluated using simulations in data-center networks.
A series of sulfathiazole derivatives with antitubercular activity was subjected to Quantitative Structure Activity Relationship (QSAR) analysis in an attempt to derive and understand a correlation between the biological activity as dependent variable and various descriptors as independent variables.
QSAR models were generated using 28 compounds.
Several statistical regression expressions were obtained using Partial Least Squares (PLS) Regression, Multiple Linear Regression (MLR) and Principal Component Regression (PCR) methods.
Among these methods, Partial Least Squares (PLS) Regression showed very promising results compared to the other two.
A QSAR model was generated by the Partial Least Squares Regression method from a training set of 18 molecules, with a squared correlation coefficient (r^2) of 0.9191, a significant cross-validated correlation coefficient (q^2) of 0.8300, an F-test value of 53.5783, an external test set pred_r^2 of -3.6132, a pred_r^2 standard error of 1.4859, and 14 degrees of freedom.
When studying networks using random graph models, one is sometimes faced with situations where the notion of adjacency between nodes reflects multiple constraints.
Traditional random graph models are insufficient to handle such situations.
A simple idea to account for multiple constraints consists in taking the intersection of random graphs.
In this paper we initiate the study of random graphs so obtained through a simple example.
We examine the intersection of an Erdos-Renyi graph and a one-dimensional geometric random graph.
We investigate the zero-one laws for the property that there are no isolated nodes.
When the geometric component is defined on the unit circle, a full zero-one law is established and we determine its critical scaling.
When the geometric component lies in the unit interval, there is a gap in that the obtained zero and one laws are found to express deviations from different critical scalings.
In particular, the first moment method requires a larger critical scaling than in the unit circle case in order to obtain the one law.
This discrepancy is somewhat surprising given that the zero-one laws for the absence of isolated nodes are identical in the geometric random graphs on both the unit interval and unit circle.
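The intersection construction and the isolated-node property are easy to simulate. The sketch below builds the intersection of G(n, p) with a geometric graph on the unit circle (an edge survives only if present in both components); the parameter choices are illustrative, deliberately far above and far below the critical scaling.

```python
import numpy as np

rng = np.random.default_rng(1)

def intersection_graph(n, p, r):
    """Adjacency matrix of the intersection of an Erdos-Renyi graph G(n, p)
    and a random geometric graph on the unit circle with radius r."""
    theta = rng.uniform(0.0, 1.0, n)              # positions on circle of length 1
    d = np.abs(theta[:, None] - theta[None, :])
    circ = np.minimum(d, 1.0 - d)                 # circular distance
    geo = circ < r                                # geometric component
    er = rng.uniform(0.0, 1.0, (n, n)) < p
    er = np.triu(er, 1)
    er = er | er.T                                # symmetric ER component
    adj = geo & er                                # edge must exist in BOTH
    np.fill_diagonal(adj, False)
    return adj

def has_isolated_node(adj):
    return bool((adj.sum(axis=1) == 0).any())

dense = intersection_graph(200, p=0.5, r=0.2)      # far above critical scaling
sparse = intersection_graph(200, p=0.01, r=0.005)  # far below critical scaling
```

With expected degree well above log n the dense graph has no isolated nodes (with overwhelming probability), while the sparse one is dominated by them, illustrating the zero-one behavior the abstract analyzes.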
The angle between two compressed sparse vectors subject to the norm/distance constraints imposed by the restricted isometry property (RIP) of the sensing matrix plays a crucial role in the studies of many compressive sensing (CS) problems.
Assuming that (i) u and v are two sparse vectors separated by an angle theta, and (ii) the sensing matrix Phi satisfies RIP, this paper is aimed at analytically characterizing the achievable angles between Phi*u and Phi*v. Motivated by geometric interpretations of RIP and with the aid of the well-known law of cosines, we propose a plane geometry based formulation for the study of the considered problem.
It is shown that all the RIP-induced norm/distance constraints on Phi*u and Phi*v can be jointly depicted via a simple geometric diagram in the two-dimensional plane.
This allows for a joint analysis of all the considered algebraic constraints from a geometric perspective.
By conducting plane geometry analyses based on the constructed diagram, closed-form formulae for the maximal and minimal achievable angles are derived.
Computer simulations confirm that the proposed solution is tighter than an existing algebraic-based estimate derived using the polarization identity.
The obtained results are used to derive a tighter restricted isometry constant of structured sensing matrices of a certain kind, to wit, those in the form of a product of an orthogonal projection matrix and a random sensing matrix.
Follow-up applications to three CS problems, namely, compressed-domain interference cancellation, RIP-based analysis of the orthogonal matching pursuit algorithm, and the study of democratic nature of random sensing matrices are investigated.
Many clustering problems in computer vision and other contexts are also classification problems, where each cluster shares a meaningful label.
Subspace clustering algorithms in particular are often applied to problems that fit this description, for example with face images or handwritten digits.
While it is straightforward to request human input on these datasets, our goal is to reduce this input as much as possible.
We present a pairwise-constrained clustering algorithm that actively selects queries based on the union-of-subspaces model.
The central step of the algorithm is in querying points of minimum margin between estimated subspaces; analogous to classifier margin, these lie near the decision boundary.
We prove that points lying near the intersection of subspaces are points with low margin.
Our procedure can be used after any subspace clustering algorithm that outputs an affinity matrix.
We demonstrate on several datasets that our algorithm drives the clustering error down considerably faster than the state-of-the-art active query algorithms on datasets with subspace structure and is competitive on other datasets.
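The min-margin selection rule can be sketched directly: a point's margin is the gap between its distances to the two nearest estimated subspaces, and the query candidate is the point with the smallest gap. The subspaces and points below are a hypothetical toy example, not the paper's datasets.

```python
import numpy as np

def subspace_distance(x, U):
    """Distance from x to the span of the orthonormal columns of U."""
    return np.linalg.norm(x - U @ (U.T @ x))

def min_margin_point(X, subspaces):
    """Index of the column of X with the smallest margin, i.e. the
    smallest gap between its two nearest estimated subspaces."""
    margins = []
    for x in X.T:
        d = sorted(subspace_distance(x, U) for U in subspaces)
        margins.append(d[1] - d[0])          # gap between two closest
    return int(np.argmin(margins))

# Two 1-D subspaces in R^2 (the coordinate axes); the third point lies
# near the decision boundary between them, so it has the lowest margin
U1 = np.array([[1.0], [0.0]])
U2 = np.array([[0.0], [1.0]])
X = np.array([[1.00, 0.05, 0.70],
              [0.05, 1.00, 0.72]])
query = min_margin_point(X, [U1, U2])
```

Points near the intersection of subspaces, like the third column here, are exactly the low-margin points the algorithm proves are most informative to query.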
Robotic systems, working together as a team, are becoming valuable players in different real-world applications, from disaster response to warehouse fulfillment services.
Centralized solutions for coordinating multi-robot teams often suffer from poor scalability and vulnerability to communication disruptions.
This paper develops a decentralized multi-agent task allocation (Dec-MATA) algorithm for multi-robot applications.
The task planning problem is posed as a maximum-weighted matching of a bipartite graph, the solution of which using the blossom algorithm allows each robot to autonomously identify the optimal sequence of tasks it should undertake.
The graph weights are determined based on a soft clustering process, which also plays a problem decomposition role seeking to reduce the complexity of the individual-agents' task assignment problems.
To evaluate the new Dec-MATA algorithm, a series of case studies (of varying complexity) are performed, with tasks being distributed randomly over an observable 2D environment.
A centralized approach, based on a state-of-the-art MILP formulation of the multi-Traveling Salesman problem is used for comparative analysis.
While getting within 7-28% of the optimal cost obtained by the centralized algorithm, the Dec-MATA algorithm is found to be 1-3 orders of magnitude faster and minimally sensitive to task-to-robot ratios, unlike the centralized algorithm.
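The task-assignment core, maximum-weighted matching of a bipartite graph, can be illustrated on a toy instance. The utility matrix below is hypothetical, and the exhaustive search is only for exposition; practical solvers (including the blossom algorithm the paper uses) scale far better.

```python
import itertools

def best_assignment(weights):
    """Exhaustive maximum-weight matching of a small complete bipartite
    graph: weights[i][j] is the utility of robot i taking task j."""
    n = len(weights)
    best, best_w = None, float("-inf")
    for perm in itertools.permutations(range(n)):
        w = sum(weights[i][perm[i]] for i in range(n))
        if w > best_w:
            best, best_w = list(perm), w
    return best, best_w

# Hypothetical 3-robot, 3-task utility matrix
tasks, total = best_assignment([[4, 1, 3],
                                [2, 0, 5],
                                [3, 2, 2]])
```

Here the optimum assigns robot 0 to task 0, robot 1 to task 2, and robot 2 to task 1; in Dec-MATA each robot solves such a (decomposed) matching problem on its own.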
Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors.
However, even most recent approaches focus on the case of a single isolated hand.
In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects.
Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error and with collision detection and physics simulation to achieve physically plausible estimates even in case of occlusions and missing visual data.
Since all components are unified in a single objective function which is almost everywhere differentiable, it can be optimized with standard optimization techniques.
Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras.
For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.
Local descriptors based on the image noise residual have proven extremely effective for a number of forensic applications, like forgery detection and localization.
Nonetheless, motivated by promising results in computer vision, the focus of the research community is now shifting on deep learning.
In this paper we show that a class of residual-based descriptors can be actually regarded as a simple constrained convolutional neural network (CNN).
Then, by relaxing the constraints, and fine-tuning the net on a relatively small training set, we obtain a significant performance improvement with respect to the conventional detector.
After the concept of industry cluster was tangibly applied in many countries, SMEs trended to link to each other to maintain their competitiveness in the market.
The major key success factors of the cluster are knowledge sharing and collaboration between partners.
This knowledge is collected in form of tacit and explicit knowledge from experts and institutions within the cluster.
The objective of this study is to enhance the industry cluster with knowledge management by using knowledge engineering, which is one of the most important methods for managing knowledge.
This work analyzed three well-known knowledge engineering methods, i.e., MOKA, SPEDE, and CommonKADS, and compared their capability to be implemented in the cluster context.
Then, we selected one method and proposed the adapted methodology.
At the end of this paper, we validate and demonstrate the proposed methodology, with preliminary results, using a case study of a handicraft cluster in Thailand.
This paper describes the realization of the Ontology Web Search Engine.
The Ontology Web Search Engine is realizable as independent project and as a part of other projects.
The main purpose of this paper is to present the Ontology Web Search Engine realization details as the part of the Semantic Web Expert System and to present the results of the Ontology Web Search Engine functioning.
It is expected that the Semantic Web Expert System will be able to process ontologies from the Web, generate rules from these ontologies and develop its knowledge base.
Modelling, simulation and optimization form an integrated part of modern design practice in engineering and industry.
Tremendous progress has been observed for all three components over the last few decades.
However, many challenging issues remain unresolved, and the current trends tend to use nature-inspired algorithms and surrogate-based techniques for modelling and optimization.
This 4th workshop on Computational Optimization, Modelling and Simulation (COMS 2013) at ICCS 2013 will further summarize the latest developments of optimization and modelling and their applications in science, engineering and industry.
In this review paper, we will analyse the recent trends in modelling and optimization, and their associated challenges.
We will discuss important topics for further research, including parameter-tuning, large-scale problems, and the gaps between theory and applications.
This paper deals with segmentation of organs at risk (OAR) in head and neck area in CT images which is a crucial step for reliable intensity modulated radiotherapy treatment.
We introduce a convolution neural network with encoder-decoder architecture and a new loss function, the batch soft Dice loss function, used to train the network.
The resulting model produces segmentations of every OAR in the public MICCAI 2015 Head And Neck Auto-Segmentation Challenge dataset.
Despite the heavy class imbalance in the data, we improve accuracy of current state-of-the-art methods by 0.33 mm in terms of average surface distance and by 0.11 in terms of Dice overlap coefficient on average.
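A soft Dice loss evaluated on probabilities (so it remains differentiable) can be sketched as below. Pooling the whole batch before taking the ratio is one natural reading of "batch" soft Dice and is one reason it tolerates class imbalance; the paper's exact formulation may differ.

```python
import numpy as np

def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss pooled over a batch: 1 - 2*|P.T| / (|P| + |T|),
    computed on predicted probabilities rather than hard labels."""
    p = np.asarray(probs, dtype=float).ravel()
    t = np.asarray(targets, dtype=float).ravel()
    inter = (p * t).sum()
    return 1.0 - (2.0 * inter + eps) / (p.sum() + t.sum() + eps)

# Perfect overlap -> loss ~ 0; disjoint prediction -> loss ~ 1
perfect = soft_dice_loss([[1.0, 0.0], [0.0, 1.0]],
                         [[1.0, 0.0], [0.0, 1.0]])
disjoint = soft_dice_loss([[1.0, 0.0]], [[0.0, 1.0]])
```

Because the loss normalizes overlap by total foreground mass rather than by pixel count, small organs contribute on the same scale as large ones, which is what makes Dice-style losses attractive under heavy class imbalance.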
Measures of complex network analysis, such as vertex centrality, have the potential to unveil existing network patterns and behaviors.
They contribute to the understanding of networks and their components by analyzing their structural properties, which makes them useful in several computer science domains and applications.
Unfortunately, there is a large number of distinct centrality measures and little is known about their common characteristics in practice.
By means of an empirical analysis, we aim at a clear understanding of the main centrality measures available, unveiling their similarities and differences in a large number of distinct social networks.
Our experiments show that the vertex centrality measures known as information, eigenvector, subgraph, walk betweenness and betweenness can distinguish vertices in all kinds of networks with a granularity performance at 95%, while other metrics achieved a considerably lower result.
In addition, we demonstrate that several pairs of metrics evaluate the vertices in a very similar way, i.e. their correlation coefficient values are above 0.7.
This was unexpected, considering that each metric presents a quite distinct theoretical and algorithmic foundation.
Our work thus contributes towards the development of a methodology for principled network analysis and evaluation.
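The kind of pairwise comparison the study performs can be sketched with two classic centralities and a Pearson correlation. The star graph below is a hypothetical toy network chosen so the two measures agree perfectly; on real social networks the correlations vary, which is the paper's subject.

```python
import numpy as np

def degree_centrality(A):
    """Degree centrality: row sums of the adjacency matrix."""
    return A.sum(axis=1)

def eigenvector_centrality(A, iters=200):
    """Eigenvector centrality via power iteration; iterating on A + I
    avoids oscillation on bipartite graphs without changing the
    eigenvector."""
    M = A + np.eye(len(A))
    v = np.ones(len(A))
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)
    return v

def pearson(a, b):
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy star graph: centre node 0 connected to four leaves
A = np.zeros((5, 5))
A[0, 1:] = 1
A[1:, 0] = 1
r = pearson(degree_centrality(A), eigenvector_centrality(A))
```

Repeating this for each pair of centrality measures across many networks yields exactly the correlation matrix the empirical analysis reports.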
The network transport of 3D video, which contains two views of a video scene, poses significant challenges due to the increased video data compared to conventional single-view video.
Addressing these challenges requires a thorough understanding of the traffic and multiplexing characteristics of the different representation formats of 3D video.
We examine the average bitrate-distortion (RD) and bitrate variability-distortion (VD) characteristics of three main representation formats.
Specifically, we compare multiview video (MV) representation and encoding, frame sequential (FS) representation, and side-by-side (SBS) representation, whereby conventional single-view encoding is employed for the FS and SBS representations.
Our results for long 3D videos in full HD format indicate that the MV representation and encoding achieves the highest RD efficiency, while exhibiting the highest bitrate variabilities.
We examine the impact of these bitrate variabilities on network transport through extensive statistical multiplexing simulations.
We find that when multiplexing a small number of streams, the MV and FS representations require the same bandwidth.
However, when multiplexing a large number of streams or smoothing traffic, the MV representation and encoding reduces the bandwidth requirement relative to the FS representation.
Kernel methods give powerful, flexible, and theoretically grounded approaches to solving many problems in machine learning.
The standard approach, however, requires pairwise evaluations of a kernel function, which can lead to scalability issues for very large datasets.
Rahimi and Recht (2007) suggested a popular approach to handling this problem, known as random Fourier features.
The quality of this approximation, however, is not well understood.
We improve the uniform error bound of that paper, as well as giving novel understandings of the embedding's variance, approximation error, and use in some machine learning methods.
We also point out that, surprisingly, of the two main variants of those features, the more widely used one is strictly higher-variance for the Gaussian kernel and has worse bounds.
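The construction is short enough to sketch. This uses the widely deployed cos(Wx + b) variant, the one the analysis finds to be strictly higher-variance for the Gaussian kernel (the alternative pairs cos and sin features without the phase b); parameters and data here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def rff(X, n_features, gamma=1.0):
    """Random Fourier features approximating the Gaussian kernel
    k(x, y) = exp(-gamma * ||x - y||^2): sample frequencies from the
    kernel's spectral density N(0, 2*gamma*I) and random phases."""
    d = X.shape[1]
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = rng.normal(size=(5, 3))
Z = rff(X, 2000)
K_approx = Z @ Z.T                                   # inner products of features
K_exact = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
max_err = float(np.abs(K_approx - K_exact).max())    # uniform approximation error
```

The uniform error shrinks roughly as the inverse square root of the number of features, and it is exactly this error (and the per-variant variance) that the paper's bounds quantify.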
Computer technology has developed rapidly.
Not so long ago, the first computers were large and bulky.
Now, the latest generation of smartphones has computational power that would have been considered supercomputer-class in 1990.
For a smart environment, the person recognition and re-recognition is an important topic.
The distribution of new technologies like wearable computing is a new approach to the field of person recognition and re-recognition.
This article lays out the idea of identifying and re-identifying wearable computing devices by listening to their wireless communication connectivity like Wi-Fi and Bluetooth and building a classification of interaction scenarios for the combination of human-wearable-environment.
Control system behavior can be analyzed taking into account a large number of parameters: performance, reliability, availability, security.
Each control system presents various security vulnerabilities that affect its functioning to a greater or lesser degree.
In this paper the authors present a method to assess the impact of security issues on the systems availability.
A fuzzy model for estimating the availability of the system based on the security level and achieved availability coefficient (depending on MTBF and MTR) is developed and described.
The results of the fuzzy inference system (FIS) are presented in the last section of the paper.
An object detector performs suboptimally when applied to image data taken from a viewpoint different from the one with which it was trained.
In this paper, we present a viewpoint adaptation algorithm that allows a trained single-view object detector to be adapted to a new, distinct viewpoint.
We first illustrate how a feature space transformation can be inferred from a known homography between the source and target viewpoints.
Second, we show that a variety of trained classifiers can be modified to behave as if that transformation were applied to each testing instance.
The proposed algorithm is evaluated on a person detection task using images from the PETS 2007 and CAVIAR datasets, as well as from a new synthetic multi-view person detection dataset.
It yields substantial performance improvements when adapting single-view person detectors to new viewpoints, and simultaneously reduces computational complexity.
This work has the potential to improve detection performance for cameras viewing objects from arbitrary viewpoints, while simplifying data collection and feature extraction.
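The geometric core of the adaptation, mapping locations through a known homography between source and target viewpoints, can be sketched directly. The translation-only homography below is a hypothetical toy case; the paper infers a full feature-space transformation from such a homography.

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2D points through a 3x3 homography H using homogeneous
    coordinates: lift to (x, y, 1), multiply, then dehomogenize."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Toy homography (a pure translation) applied to two feature locations
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
warped = apply_homography(H, np.array([[0.0, 0.0], [1.0, 1.0]]))
```

Rather than warping every test image this way, the paper's contribution is to fold the induced transformation into the trained classifier itself, so each testing instance behaves as if it had been warped.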
This paper addresses the problem of finding multiple near-optimal, spatially-dissimilar paths that can be considered as alternatives in the decision making process, for finding optimal corridors in which to construct a new road.
We further consider combinations of techniques for reducing the costs associated with the computation and increasing the accuracy of the cost formulation.
Numerical results for five algorithms to solve the dissimilar multipath problem show that a "bidirectional approach" yields the fastest running times and the most robust algorithm.
Further modifications of the algorithms to reduce the running time were tested and it is shown that running time can be reduced by an average of 56 percent without compromising the quality of the results.
Cloud Computing is an emerging area for accessing computing resources.
In general, Cloud service providers offer services that can be clustered into three categories: SaaS, PaaS and IaaS.
This paper discusses Cloud workload analysis and proposes an efficient technique for mapping Cloud workloads to resources. The aim is to provide a means of understanding and investigating IaaS Cloud workloads and the resources they run on. Regression analysis is used to analyze Cloud workloads and to identify the relationship between Cloud workloads and the available resources. Cloud workload analysis enables the effective organization of dynamically changing resources; until workload analysis is treated as a vital capability, Cloud resources cannot be consumed effectively. The proposed technique has been validated with the Z formal specification language, and is effective in minimizing the cost and submission burst time of Cloud workloads.
This paper presents HEALER, a software agent that recommends sequential intervention plans for use by homeless shelters, who organize these interventions to raise awareness about HIV among homeless youth.
HEALER's sequential plans (built using knowledge of social networks of homeless youth) choose intervention participants strategically to maximize influence spread, while reasoning about uncertainties in the network.
While previous work presents influence maximizing techniques to choose intervention participants, they do not address three real-world issues: (i) they completely fail to scale up to real-world sizes; (ii) they do not handle deviations in execution of intervention plans; (iii) constructing real-world social networks is an expensive process.
HEALER handles these issues via four major contributions: (i) HEALER casts this influence maximization problem as a POMDP and solves it using a novel planner which scales up to previously unsolvable real-world sizes; (ii) HEALER allows shelter officials to modify its recommendations, and updates its future plans in a deviation-tolerant manner; (iii) HEALER constructs social networks of homeless youth at low cost, using a Facebook application.
Finally, (iv) we show hardness results for the problem that HEALER solves.
HEALER will be deployed in the real world in early Spring 2016 and is currently undergoing testing at a homeless shelter.
In a vertex-colored graph, an edge is happy if its endpoints have the same color.
Similarly, a vertex is happy if all its incident edges are happy.
Motivated by the computation of homophily in social networks, we consider the algorithmic aspects of the following Maximum Happy Edges (k-MHE) problem: given a partially k-colored graph G, find an extended full k-coloring of G maximizing the number of happy edges.
When we want to maximize the number of happy vertices, the problem is known as Maximum Happy Vertices (k-MHV).
We further study the complexity of the problems and their weighted variants.
For instance, we prove that for every k >= 3, both problems are NP-complete for bipartite graphs and k-MHV remains hard for split graphs.
In terms of exact algorithms, we show both problems can be solved in time O*(2^n), and give an even faster O*(1.89^n)-time algorithm when k = 3.
From a parameterized perspective, we give a linear vertex kernel for Weighted k-MHE, where edges are weighted and the goal is to obtain happy edges of at least a specified total weight.
Finally, we prove both problems are solvable in polynomial-time when the graph has bounded treewidth or bounded neighborhood diversity.
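For concreteness, both objective values can be computed from a full coloring in linear time; this is an illustrative sketch of the definitions, not one of the paper's algorithms:

```python
def count_happy(edges, coloring):
    """Count happy edges (endpoints share a color) and happy vertices
    (all incident edges happy) under a full coloring."""
    happy_edges = sum(1 for u, v in edges if coloring[u] == coloring[v])
    unhappy = set()                       # vertices touching an unhappy edge
    for u, v in edges:
        if coloring[u] != coloring[v]:
            unhappy.update((u, v))
    happy_vertices = sum(1 for v in coloring if v not in unhappy)
    return happy_edges, happy_vertices

edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
coloring = {1: "a", 2: "a", 3: "a", 4: "b"}
print(count_happy(edges, coloring))  # only edge (3, 4) is unhappy -> (3, 2)
```

The hard part of k-MHE/k-MHV is of course choosing the extension of a partial coloring that maximizes these counts, which the abstract shows is NP-complete in general for k >= 3.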
Interactive reinforcement learning (IRL) extends traditional reinforcement learning (RL) by allowing an agent to interact with parent-like trainers during a task.
In this paper, we present an IRL approach using dynamic audio-visual input in terms of vocal commands and hand gestures as feedback.
Our architecture integrates multi-modal information to provide robust commands from multiple sensory cues along with a confidence value indicating the trustworthiness of the feedback.
The integration process also considers the case in which the two modalities convey incongruent information.
Additionally, we modulate the influence of sensory-driven feedback in the IRL task using goal-oriented knowledge in terms of contextual affordances.
We implement a neural network architecture to predict the effect of performed actions with different objects to avoid failed-states, i.e., states from which it is not possible to accomplish the task.
In our experimental setup, we explore the interplay of multimodal feedback and task-specific affordances in a robot cleaning scenario.
We compare the learning performance of the agent under four different conditions: traditional RL, multi-modal IRL, and each of these two setups with the use of contextual affordances.
Our experiments show that the best performance is obtained by using audio-visual feedback with affordance-modulated IRL.
The obtained results demonstrate the importance of multi-modal sensory processing integrated with goal-oriented knowledge in IRL tasks.
Semantic image inpainting is a challenging task where large missing regions have to be filled based on the available visual data.
Existing methods which extract information from only a single image generally produce unsatisfactory results due to the lack of high level context.
In this paper, we propose a novel method for semantic image inpainting, which generates the missing content by conditioning on the available data.
Given a trained generative model, we search for the closest encoding of the corrupted image in the latent image manifold using our context and prior losses.
This encoding is then passed through the generative model to infer the missing content.
In our method, inference is possible irrespective of how the missing content is structured, while the state-of-the-art learning based method requires specific information about the holes in the training phase.
Experiments on three datasets show that our method successfully predicts information in large missing regions and achieves pixel-level photorealism, significantly outperforming the state-of-the-art methods.
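The latent-space search described above can be illustrated on a toy problem where a fixed random linear map stands in for the trained generator; the variable names and the plain gradient descent are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))         # toy "generator" G(z) = W z
z_true = rng.normal(size=4)
image = W @ z_true                   # complete ground-truth image
mask = np.ones(16)
mask[12:] = 0.0                      # last four pixels are missing

z = np.zeros(4)                      # search the latent space by
for _ in range(2000):                # minimizing the masked context loss
    residual = mask * (W @ z - image)
    z -= 0.02 * (W.T @ residual)     # gradient step on ||mask*(G(z)-y)||^2

inpainted = W @ z                    # the generator fills in the hole
print(np.abs(inpainted - image).max())
```

Because the only supervision is the context loss on the known pixels, the structure of the hole never enters the procedure, which mirrors the abstract's point that inference works regardless of how the missing content is shaped.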
Scheduling in Grid computing has been an active area of research since the field's beginning. However, beginners find the related concepts very difficult to understand because of Grid computing's steep learning curve, so a concise treatment of scheduling in Grid computing is needed. This paper strives to present such a concise understanding of scheduling and of the Grid computing systems it operates in. The paper describes the overall picture of Grid computing and discusses the important sub-systems that make Grid computing possible. It then covers the concepts of resource scheduling and application scheduling, and presents a classification of scheduling algorithms. Furthermore, it presents the methodologies used for evaluating scheduling algorithms, covering both real-system and simulation-based approaches. This concise treatment of the scheduling system, scheduling algorithms, and evaluation methodology should be useful to both users and researchers.
Genomics is rapidly transforming medical practice and basic biomedical research, providing insights into disease mechanisms and improving therapeutic strategies, particularly in cancer.
The ability to predict the future course of a patient's disease from high-dimensional genomic profiling will be essential in realizing the promise of genomic medicine, but presents significant challenges for state-of-the-art survival analysis methods.
In this abstract we present an investigation in learning genomic representations with neural networks to predict patient survival in cancer.
We demonstrate the advantages of this approach over existing survival analysis methods using brain tumor data.
Social media are becoming an increasingly important source of information about the public mood regarding issues such as elections, Brexit, and the stock market.
In this paper we focus on sentiment classification of Twitter data.
Construction of sentiment classifiers is a standard text mining task, but here we address the question of how to properly evaluate them as there is no settled way to do so.
Sentiment classes are ordered and unbalanced, and Twitter produces a stream of time-ordered data.
The problem we address concerns the procedures used to obtain reliable estimates of performance measures, and whether the temporal ordering of the training and test data matters.
We collected a large set of 1.5 million tweets in 13 European languages.
We created 138 sentiment models and out-of-sample datasets, which are used as a gold standard for evaluations.
The corresponding 138 in-sample datasets are used to empirically compare six different estimation procedures: three variants of cross-validation, and three variants of sequential validation (where test set always follows the training set).
We find no significant difference between the best cross-validation and sequential validation.
However, we observe that all cross-validation variants tend to overestimate the performance, while the sequential methods tend to underestimate it.
Standard cross-validation with random selection of examples is significantly worse than the blocked cross-validation, and should not be used to evaluate classifiers in time-ordered data scenarios.
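The difference between the two families of estimation procedures comes down to how train/test indices are generated; the sketch below (with hypothetical helper names) contrasts sequential validation, where the test block always follows the training data, with standard randomized cross-validation:

```python
import random

def sequential_splits(n, n_folds):
    """Sequential validation: each test block strictly follows its
    training data, preserving the temporal order of the stream."""
    fold = n // (n_folds + 1)
    for k in range(1, n_folds + 1):
        yield list(range(0, k * fold)), list(range(k * fold, (k + 1) * fold))

def random_cv_splits(n, n_folds, seed=0):
    """Standard cross-validation: examples assigned to folds at random,
    so the training set may postdate the test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    fold = n // n_folds
    for k in range(n_folds):
        test = idx[k * fold:(k + 1) * fold]
        yield [i for i in idx if i not in set(test)], test

# With time-ordered tweets, every sequential split trains only on the past.
for train, test in sequential_splits(100, 3):
    assert max(train) < min(test)
```

The abstract's finding, that randomized splits overestimate performance on time-ordered data while blocked/sequential splits behave better, follows from exactly this leakage of future examples into the training folds.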
In this paper, we propose a new sparse signal recovery algorithm, referred to as sparse Kalman tree search (sKTS), that provides a robust reconstruction of the sparse vector when a sequence of correlated observation vectors is available.
The proposed sKTS algorithm builds on expectation-maximization (EM) algorithm and consists of two main operations: 1) Kalman smoothing to obtain the a posteriori statistics of the source signal vectors and 2) greedy tree search to estimate the support of the signal vectors.
Through numerical experiments, we demonstrate that the proposed sKTS algorithm is effective in recovering the sparse signals and performs close to the Oracle (genie-based) Kalman estimator.
For Nonlinear Frequency-Division-Multiplexed (NFDM) systems, the statistics of the received nonlinear spectrum in the presence of additive white Gaussian noise (AWGN) are an open problem.
We present a novel method, based on the Fourier collocation algorithm, to compute these statistics.
We consider the problem of computing a binary linear transformation using unreliable components when all circuit components are unreliable.
Two noise models of unreliable components are considered: probabilistic errors and permanent errors.
We introduce the "ENCODED" technique that ensures that the error probability of the computation of the linear transformation is kept bounded below a small constant independent of the size of the linear transformation even when all logic gates in the computation are noisy.
Further, we show that the scheme requires fewer operations (in order sense) than its "uncoded" counterpart.
By deriving a lower bound, we show that in some cases, the scheme is order-optimal.
Using these results, we examine the gain in energy efficiency from the use of a "voltage-scaling" scheme in which gate energy is reduced by lowering the supply voltage.
We use a gate energy-reliability model to show that tuning gate-energy appropriately at different stages of the computation ("dynamic" voltage scaling), in conjunction with ENCODED, can lead to order-sense energy-savings over the classical "uncoded" approach.
Finally, we also examine the problem of computing a linear transformation when noiseless decoders can be used, providing upper and lower bounds to the problem.
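The benefit of adding redundancy to noisy computation can be illustrated with a toy repetition-plus-majority-voting scheme; this is not the paper's ENCODED construction, only a minimal sketch of why coding helps when every gate is unreliable:

```python
import random

def noisy_xor(a, b, p, rng):
    """An unreliable XOR gate: flips its output with probability p."""
    out = a ^ b
    return out ^ 1 if rng.random() < p else out

def noisy_parity(bits, p, rng):
    """Compute the parity of `bits` with a chain of noisy XOR gates."""
    acc = 0
    for b in bits:
        acc = noisy_xor(acc, b, p, rng)
    return acc

def parity_with_voting(bits, p, copies, rng):
    """Repetition + majority voting over independent noisy runs
    (a toy stand-in for coded computation)."""
    votes = sum(noisy_parity(bits, p, rng) for _ in range(copies))
    return 1 if 2 * votes > copies else 0

rng = random.Random(42)
bits = [1, 0, 1, 1, 0, 1]            # true parity is 0
trials = 1000
uncoded_err = sum(noisy_parity(bits, 0.05, rng) != 0
                  for _ in range(trials)) / trials
voted_err = sum(parity_with_voting(bits, 0.05, 9, rng) != 0
                for _ in range(trials)) / trials
print(uncoded_err, voted_err)        # voting drives the error rate down
```

Plain repetition is far from order-optimal in the number of operations, which is precisely the gap the ENCODED technique addresses for binary linear transformations.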
There is no known way of giving a domain-theoretic semantics to higher-order probabilistic languages, in such a way that the involved domains are continuous or quasi-continuous - the latter is required to do any serious mathematics.
We argue that the problem naturally disappears for languages with two kinds of types, where one kind is interpreted in a Cartesian-closed category of continuous dcpos, and the other is interpreted in a category that is closed under the probabilistic powerdomain functor.
Such a setting is provided by Paul B. Levy's call-by-push-value paradigm.
Following this insight, we define a call-by-push-value language, with probabilistic choice sitting inside the value types, and where conversion from a value type to a computation type involves demonic non-determinism.
We give both a domain-theoretic semantics and an operational semantics for the resulting language, and we show that they are sound and adequate.
With the addition of statistical termination testers and parallel if, we show that the language is even fully abstract - and those two primitives are required for that.
This paper investigates sparse signal recovery based on expectation propagation (EP) from unitarily invariant measurements.
A rigorous analysis is presented for the state evolution (SE) of an EP-based message-passing algorithm in the large system limit, where both input and output dimensions tend to infinity at an identical speed.
The main result is the justification of an SE formula conjectured by Ma and Ping.
A transformation network describes how one set of resources can be transformed into another via technological processes.
Transformation networks are useful in economics because they can highlight areas for future innovation, whether in the form of new products, new production techniques, or better efficiency.
They also make it easy to detect areas where an economy might be fragile.
In this paper, we use computational simulations to investigate how the density of a transformation network affects the economic performance, as measured by the gross domestic product (GDP), of an artificial economy.
Our results show that on average, the GDP of our economy increases as the density of the transformation network increases.
We also find that while the average performance increases, the maximum possible performance decreases and the minimum possible performance increases.
Attributing the culprit of a cyber-attack is widely considered one of the major technical and policy challenges of cyber-security.
The lack of ground truth for an individual responsible for a given attack has limited previous studies.
Here, we overcome this limitation by leveraging DEFCON capture-the-flag (CTF) exercise data where the actual ground-truth is known.
In this work, we use various classification techniques to identify the culprit in a cyberattack and find that deceptive activities account for the majority of misclassified samples.
We also explore several heuristics to alleviate some of the misclassification caused by deception.
In this paper, we describe a tool for debugging the output and attention weights of neural machine translation (NMT) systems and for improved estimations of confidence about the output based on the attention.
The purpose of the tool is to help researchers and developers find weak and faulty example translations that their NMT systems produce without the need for reference translations.
Our tool also includes an option to directly compare translation outputs from two different NMT engines or experiments.
In addition, we present a demo website of our tool with examples of good and bad translations: http://attention.lielakeda.lv
Effective data analysis ideally requires the analyst to have high expertise as well as high knowledge of the data.
Even with such familiarity, manually pursuing all potential hypotheses and exploring all possible views is impractical.
We present DataSite, a proactive visual analytics system where the burden of selecting and executing appropriate computations is shared by an automatic server-side computation engine.
Salient features identified by these automatic background processes are surfaced as notifications in a feed timeline.
DataSite effectively turns data analysis into a conversation between analyst and computer, thereby reducing the cognitive load and domain knowledge requirements.
We validate the system with a user study comparing it to a recent visualization recommendation system, yielding significant improvement, particularly for complex analyses that existing analytics systems do not support well.
We propose a novel two-layered attention network based on Bidirectional Long Short-Term Memory for sentiment analysis.
The novel two-layered attention network takes advantage of the external knowledge bases to improve the sentiment prediction.
It uses Knowledge Graph Embeddings generated from WordNet.
We build our model by combining the two-layered attention network with the supervised model based on Support Vector Regression using a Multilayer Perceptron network for sentiment analysis.
We evaluate our model on the benchmark dataset of SemEval 2017 Task 5.
Experimental results show that the proposed model surpasses the top system of SemEval 2017 Task 5.
The model performs significantly better by improving the state-of-the-art system at SemEval 2017 Task 5 by 1.7 and 3.7 points for sub-tracks 1 and 2 respectively.
IP networks have become the dominant type of information network today. They offer a wide range of services and make it easy for users to stay connected. Compared with other means of voice communication, IP networks provide an efficient medium with a large number of services, which has driven the migration of voice calls to IP networks. Despite the wide availability and capabilities of IP network services, a large number of security threats still affect IP networks, and these threats inevitably affect the services built on top of them, voice among them. This paper discusses the reasons for migrating voice calls from legacy networks to IP networks, the requirements an IP network must meet to support voice transport, and, in particular, the SPIT (spam over Internet telephony) attack and its detection methods. Experiments were conducted to compare the different approaches used to detect spam over VoIP networks.
Secure communication is a promising technology for wireless networks because it ensures secure transmission of information.
In this paper, we investigate the joint subcarrier (SC) assignment and power allocation problem for non-orthogonal multiple access (NOMA) amplify-and-forward two-way relay wireless networks, in the presence of eavesdroppers.
By exploiting cooperative jamming (CJ) to enhance the security of the communication link, we aim to maximize the achievable secrecy energy efficiency by jointly designing the SC assignment, user pair scheduling and power allocation.
Assuming the perfect knowledge of the channel state information (CSI) at the relay station, we propose a low-complexity subcarrier assignment scheme (SCAS-1), which is equivalent to many-to-many matching games, and then SCAS-2 is formulated as a secrecy energy efficiency maximization problem.
The secure power allocation problem is modeled as a convex geometric programming problem, and then solved by interior point methods.
Simulation results demonstrate the effectiveness of the proposed SSPA algorithms under scenarios with and without CJ, respectively.
3D image processing is nowadays a challenging topic in many scientific fields, such as medicine, computational physics, and informatics, so the development of suitable tools that guarantee the best possible treatment is a necessity. Spherical shapes form a large class of 3D images whose processing requires adapted tools, which has encouraged researchers to develop spherical wavelets and spherical harmonics as special mathematical bases suited to 3D spherical shapes. The present work lies within the topic of 3D image processing with spherical harmonics bases. A spherical harmonics based approach is proposed for the reconstruction of images, together with a spherical harmonics Shannon-type entropy to evaluate the order/disorder of the reconstructed image. The efficiency and accuracy of the approach are demonstrated by a simulation study on several spherical models.
Bug fixing is generally a manually-intensive task.
However, recent work has proposed the idea of automated program repair, which aims to repair (at least a subset of) bugs through techniques such as code mutation.
Following in the same line of work as automated bug repair, in this paper we aim to leverage past fixes to propose fixes of current/future bugs.
Specifically, we propose Ratchet, a corrective patch generation system using neural machine translation.
By learning corresponding pre-correction and post-correction code in past fixes with a neural sequence-to-sequence model, Ratchet is able to generate a fix code for a given bug-prone code query.
We perform an empirical study with five open source projects, namely Ambari, Camel, Hadoop, Jetty and Wicket, to evaluate the effectiveness of Ratchet.
Our findings show that Ratchet can generate syntactically valid statements 98.7% of the time, and achieve an F1-measure between 0.41 and 0.83 with respect to the actual fixes adopted in the code base.
In addition, we perform a qualitative validation using 20 participants to see whether the generated statements can be helpful in correcting bugs.
Our survey showed that Ratchet's output was considered helpful in fixing the bugs on many occasions, even when the fix was not 100% correct.
We perform a statistical analysis of scientific-publication data with a goal to provide quantitative analysis of scientific process.
Such an investigation belongs to the newly established field of scientometrics: a branch of the general science of science that covers all quantitative methods to analyze science and research process.
As a case study we consider download and citation statistics of the journal `Europhysics Letters' (EPL), as Europe's flagship letters journal of broad interest to the physics community.
While citations are usually considered as an indicator of academic impact, downloads reflect rather the level of attractiveness or popularity of a publication.
We discuss peculiarities of both processes and correlations between them.
Cloud computing services provide a scalable solution for the storage and processing of images and multimedia files.
However, concerns about privacy risks prevent users from sharing their personal images with third-party services.
In this paper, we describe the design and implementation of CryptoImg, a library of modular privacy preserving image processing operations over encrypted images.
By using homomorphic encryption, CryptoImg allows the users to delegate their image processing operations to remote servers without any privacy concerns.
Currently, CryptoImg supports a subset of the most frequently used image processing operations such as image adjustment, spatial filtering, edge sharpening, histogram equalization and others.
We implemented our library as an extension to the popular computer vision library OpenCV.
CryptoImg can be used from either mobile or desktop clients.
Our experimental results demonstrate that CryptoImg is efficient, performing operations over encrypted images with negligible error and reasonable time overheads on the supported platforms.
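The additive homomorphism such a pipeline relies on can be illustrated with a toy Paillier cryptosystem; the tiny fixed primes below are purely for demonstration and bear no relation to CryptoImg's actual parameters or implementation, which would use a vetted library and large keys:

```python
from math import gcd

# Toy Paillier keypair with tiny fixed primes (illustration only).
p, q = 293, 433
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
g = n + 1
mu = pow(lam, -1, n)        # since g = n+1, L(g^lam mod n^2) = lam

def encrypt(m, r):
    assert gcd(r, n) == 1   # r must be a unit mod n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    L = (pow(c, lam, n2) - 1) // n
    return (L * mu) % n

# Multiplying ciphertexts adds the plaintexts, so an untrusted server
# can e.g. brighten an encrypted pixel without ever decrypting it.
pixel, delta = 120, 30
c = (encrypt(pixel, 7) * encrypt(delta, 11)) % n2
print(decrypt(c))  # → 150
```

Spatial filtering and image adjustment reduce to exactly such additions and scalar multiplications, which is why additively homomorphic schemes suffice for that subset of operations.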
The paper tailors the so-called wave-based control popular in the field of flexible mechanical structures to the field of distributed control of vehicular platoons.
The proposed solution augments the symmetric bidirectional control algorithm with a wave-absorbing controller implemented on the leader, and/or on the rear-end vehicle.
The wave-absorbing controller actively absorbs an incoming wave of positional changes in the platoon and thus prevents oscillations of inter-vehicle distances.
The proposed controller significantly improves the performance of platoon manoeuvres, such as acceleration/deceleration or changing the distances between vehicles, without making the platoon string unstable.
Numerical simulations show that the wave-absorbing controller performs efficiently even for platoons with a large number of vehicles, for which other platooning algorithms are inefficient or require wireless communication between vehicles.
The availability of corpora is a major factor in building natural language processing applications.
However, the costs of acquiring corpora can prevent some researchers from going further in their endeavours.
Easy access to freely available corpora is urgently needed in the NLP research community, especially for a language such as Arabic. Currently, there is no easy way to access a comprehensive and up-to-date list of freely available Arabic corpora. We present in this paper the results of a recent survey conducted to identify the freely available Arabic corpora and language resources. Our preliminary results showed an initial list of 66 sources. We present our findings in the various categories studied, and provide direct links to the data where possible.
A code of the natural numbers is a uniquely decodable binary encoding of the natural numbers with non-decreasing codeword lengths, which satisfies Kraft's inequality tightly.
We define a natural partial order on the set of codes, and show how to construct effectively a code better than a given sequence of codes, in a certain precise sense.
As an application, we prove that the existence of a scale of codes (a well-ordered set of codes which contains a code better than any given code) is independent of ZFC.
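The defining Kraft-equality property is easy to check numerically for finite length profiles; the snippet below is an illustrative sketch using exact rational arithmetic, not part of the paper's construction:

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Exact Kraft sum of a binary code's codeword lengths."""
    return sum(Fraction(1, 2 ** l) for l in lengths)

# The complete finite prefix code {0, 10, 110, 111} satisfies
# Kraft's inequality tightly:
print(kraft_sum([1, 2, 3, 3]) == 1)   # → True

# The unary code gives codeword length k to the k-th natural number;
# its Kraft sum is a geometric series converging to exactly 1.
partial = kraft_sum(range(1, 21))
print(1 - partial)                    # → 1/1048576, i.e. 2**-20
```

For infinite codes of the naturals the equality holds in the limit, and the partial sums above show how the geometric tail shrinks.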
With the rapid advancements in digital imaging systems and networking, low-cost hand-held image capture devices equipped with network connectivity are becoming ubiquitous.
This ease of digital image capture and sharing is also accompanied by widespread usage of user-friendly image editing software.
Thus, we are in an era where digital images can easily be used for the massive spread of false information, and their integrity needs to be seriously questioned.
Application of multiple lossy compressions on images is an essential part of any image editing pipeline involving lossy compressed images.
This paper aims to address the problem of classifying images based on the number of JPEG compressions they have undergone, by utilizing deep convolutional neural networks in DCT domain.
The proposed system incorporates a well designed pre-processing step before feeding the image data to CNN to capture essential characteristics of compression artifacts and make the system image content independent.
Detailed experiments are performed to optimize different aspects of the system, such as depth of CNN, number of DCT frequencies, and execution time.
Results on the standard UCID dataset demonstrate that the proposed system outperforms existing systems for multiple JPEG compression detection and is capable of classifying more re-compression cycles than existing systems.
The total variation (TV) model and its related variants have already been proposed for image processing in previous literature.
In this paper a novel total variation model based on kernel functions is proposed.
In this novel model, we first map each pixel value of an image into a Hilbert space by using a nonlinear map, and then define a coupled image of an original image in order to construct a kernel function.
Finally, the proposed model is solved in a kernel function space instead of in the projecting space from a nonlinear map.
For the proposed model, we theoretically show under what conditions the mapping image is in the space of bounded variation when the original image is in the space of bounded variation.
It is also found that the proposed model further extends the generalized TV model and the information from three different channels of color images can be fused by adopting various kernel functions.
A series of experiments on some gray and color images are carried out to demonstrate the effectiveness of the proposed model.
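For background, the classical anisotropic discrete TV that the kernel model generalizes can be computed as a sum of absolute forward differences; this is the standard textbook definition, not the paper's kernel-based model:

```python
import numpy as np

def total_variation(u):
    """Anisotropic discrete total variation of a 2D image:
    sum of absolute forward differences in both directions."""
    dx = np.abs(np.diff(u, axis=1)).sum()
    dy = np.abs(np.diff(u, axis=0)).sum()
    return dx + dy

u = np.array([[0., 0., 1.],
              [0., 0., 1.],
              [0., 0., 1.]])      # a sharp vertical edge
print(total_variation(u))        # → 3.0
```

The kernel variant replaces the pixel differences with distances computed through a nonlinear feature map, which is what allows information from the three color channels to be fused via the choice of kernel.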
As semiconductor devices are scaled down dramatically, with additional strain engineering for device enhancement, overall device characteristics are no longer dominated by device size alone but also by the circuit layout. Higher-order layout effects, such as the well proximity effect (WPE), oxide spacing effect (OSE), and poly spacing effect (PSE), play an important role in device performance, so it is critical to understand the Design for Manufacturability (DFM) impact of various layout topologies on overall circuit performance. Currently, the layout effects (WPE, OSE, and PSE) are validated through digital standard-cell and analog differential-pair test structures. However, the impact of two analog layout structures, the guard ring and dummy fill, has not yet been well studied; this paper therefore describes a current-mirror test circuit to examine the DFM impact of guard rings and dummy fills using the TSMC 28nm HPM process.
KAF consists of a process and some templates to guide the planning and execution of audits of knowledge resources, with emphasis on sharing.
KAF is based on the methodological blueprint provided by the Data Audit Framework (DAF) conceived by the JISC-funded DAFD project. KAF enables organisations to find out what knowledge resources are associated with a project, and how they are shared. KAF is available in two versions: KAF-g (generic, domain independent) and KAF-se (targeting systems engineering knowledge).
We present a technique for static enforcement of high-level, declarative information flow policies.
Given a program that manipulates sensitive data and a set of declarative policies on the data, our technique automatically inserts policy-enforcing code throughout the program to make it provably secure with respect to the policies.
We achieve this through a new approach we call type-targeted program synthesis, which enables the application of traditional synthesis techniques in the context of global policy enforcement.
The key insight is that, given an appropriate encoding of policy compliance in a type system, we can use type inference to decompose a global policy enforcement problem into a series of small, local program synthesis problems that can be solved independently.
We implement this approach in Lifty, a core DSL for data-centric applications.
Our experience using the DSL to implement three case studies shows that (1) Lifty's centralized, declarative policy definitions make it easier to write secure data-centric applications, and (2) the Lifty compiler is able to efficiently synthesize all necessary policy-enforcing code, including the code required to prevent several reported real-world information leaks.
Humans learn to solve tasks of increasing complexity by building on top of previously acquired knowledge.
Typically, there exists a natural progression in the tasks that we learn - most do not require completely independent solutions, but can be broken down into simpler subtasks.
We propose to represent a solver for each task as a neural module that calls existing modules (solvers for simpler tasks) in a functional program-like manner.
Lower modules are a black box to the calling module, and communicate only via a query and an output.
Thus, a module for a new task learns to query existing modules and composes their outputs in order to produce its own output.
Our model effectively combines previous skill-sets, does not suffer from forgetting, and is fully differentiable.
We test our model in learning a set of visual reasoning tasks, and demonstrate improved performances in all tasks by learning progressively.
By evaluating the reasoning process using human judges, we show that our model is more interpretable than an attention-based baseline.
The ConditionaL Neural Networks (CLNN) and the Masked ConditionaL Neural Networks (MCLNN) exploit the nature of multi-dimensional temporal signals.
The CLNN captures the conditional temporal influence between the frames in a window and the mask in the MCLNN enforces a systematic sparseness that follows a filterbank-like pattern over the network links.
The mask induces the network to learn about time-frequency representations in bands, allowing the network to sustain frequency shifts.
Additionally, the mask in the MCLNN automates the exploration of a range of feature combinations, usually done through an exhaustive manual search.
We have evaluated the MCLNN performance using the Ballroom and Homburg datasets of music genres.
MCLNN has achieved accuracies that are competitive with state-of-the-art handcrafted attempts and with models based on Convolutional Neural Networks.
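The filterbank-like mask described above can be sketched as a binary matrix in which each hidden unit connects to a contiguous band of input dimensions, with successive units shifting the band. A minimal pure-Python illustration; the parameter names `bandwidth` and `overlap` and the exact indexing are assumptions, not the paper's precise construction:

```python
def band_mask(n_in, n_hidden, bandwidth, overlap):
    """Binary mask enforcing filterbank-like sparseness over network links.

    Each hidden unit j connects to a contiguous band of `bandwidth` input
    dimensions; consecutive units shift the band by (bandwidth - overlap),
    wrapping around the input dimension.
    """
    stride = max(1, bandwidth - overlap)
    mask = [[0] * n_hidden for _ in range(n_in)]
    for j in range(n_hidden):
        start = (j * stride) % n_in
        for b in range(bandwidth):
            mask[(start + b) % n_in][j] = 1  # link from input (start+b) to unit j
    return mask
```

Applying this mask elementwise to a weight matrix leaves each hidden unit sensitive only to its frequency band, which is what allows the network to sustain frequency shifts.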
In this work, we propose a multi-modal Convolutional Neural Network (CNN) approach for brain tumor segmentation.
We investigate how to combine different modalities efficiently in the CNN framework. We adapt various fusion methods, previously employed for video recognition, to the brain tumor segmentation problem, and we investigate their efficiency in terms of memory and performance. Our experiments, performed on the BRATS dataset, lead us to the conclusion that learning separate representations for each modality and combining them for brain tumor segmentation can increase the performance of CNN systems.
The aim of this article is to present an overview of the major families of state-of-the-art database benchmarks, namely relational benchmarks, object and object-relational benchmarks, XML benchmarks, and decision-support benchmarks, and to discuss the issues, tradeoffs, and future trends in database benchmarking.
We particularly focus on XML and decision-support benchmarks, which are currently the most innovative tools that are developed in this area.
Face anti-spoofing is the crucial step to prevent face recognition systems from a security breach.
Previous deep learning approaches formulate face anti-spoofing as a binary classification problem.
Many of them struggle to grasp adequate spoofing cues and generalize poorly.
In this paper, we argue the importance of auxiliary supervision to guide the learning toward discriminative and generalizable cues.
A CNN-RNN model is learned to estimate the face depth with pixel-wise supervision, and to estimate rPPG signals with sequence-wise supervision.
Then we fuse the estimated depth and rPPG to distinguish live vs. spoof faces.
In addition, we introduce a new face anti-spoofing database that covers a large range of illumination, subject, and pose variations.
Experimental results show that our model achieves the state-of-the-art performance on both intra-database and cross-database testing.
In this work, we propose a novel sampling method for Design of Experiments.
This method samples those input values of the parameters of a computational model for which the constructed surrogate model will have the least possible approximation error.
High efficiency of the proposed method is demonstrated by its comparison with other sampling techniques (LHS, Sobol' sequence sampling, and Maxvol sampling) on the problem of least-squares polynomial approximation.
Also, numerical experiments for the Lebesgue constant growth for the points sampled by the proposed method are carried out.
Furui first demonstrated that the identity of both consonant and vowel can be perceived from the C-V transition; later, Stevens proposed that acoustic landmarks are the primary cues for speech perception, and that steady-state regions are secondary or supplemental.
Acoustic landmarks are perceptually salient, even in a language one doesn't speak, and it has been demonstrated that non-speakers of the language can identify features such as the primary articulator of the landmark.
These factors suggest a strategy for developing language-independent automatic speech recognition: landmarks can potentially be learned once from a suitably labeled corpus and rapidly applied to many other languages.
This paper proposes enhancing the cross-lingual portability of a neural network by using landmarks as the secondary task in multi-task learning (MTL).
The network is trained in a well-resourced source language with both phone and landmark labels (English), then adapted to an under-resourced target language with only word labels (Iban).
Landmark-tasked MTL reduces source-language phone error rate by 2.9% relative, and reduces target-language word error rate by 1.9%-5.9% depending on the amount of target-language training data.
These results suggest that landmark-tasked MTL causes the DNN to learn hidden-node features that are useful for cross-lingual adaptation.
The present work provides a new approach to evolving ligand structures, which represent possible drugs to be docked to the active site of the target protein.
The structure is represented as a tree where each non-empty node represents a functional group.
It is assumed that the active site configuration of the target protein is known with position of the essential residues.
In this paper the interaction energy of the ligands with the protein target is minimized.
Moreover, the appropriate size of the tree is difficult to determine in advance, and it will differ for different active sites.
To overcome the difficulty, a variable tree size configuration is used for designing ligands.
The optimization is done using a quantum discrete PSO.
The results using the fixed-length and variable-length configurations are compared.
Efficient exploration is an unsolved problem in Reinforcement Learning which is usually addressed by reactively rewarding the agent for fortuitously encountering novel situations.
This paper introduces an efficient active exploration algorithm, Model-Based Active eXploration (MAX), which uses an ensemble of forward models to plan to observe novel events.
This is carried out by optimizing agent behaviour with respect to a measure of novelty derived from the Bayesian perspective of exploration, which is estimated using the disagreement between the futures predicted by the ensemble members.
We show empirically that in semi-random discrete environments where directed exploration is critical to make progress, MAX is at least an order of magnitude more efficient than strong baselines.
MAX scales to high-dimensional continuous environments where it builds task-agnostic models that can be used for any downstream task.
Network Functions Virtualization (NFV) and Software-Defined Networking (SDN) are new paradigms in the move towards open software and network hardware.
While NFV aims to virtualize network functions and deploy them into general purpose hardware, SDN makes networks programmable by separating the control and data planes.
NFV and SDN are complementary technologies capable of providing one network solution.
SDN can provide connectivity between Virtual Network Functions (VNFs) in a flexible and automated way, whereas NFV can use SDN as part of a service function chain.
There are many studies designing NFV/SDN architectures in different environments.
Researchers have been trying to address reliability, performance, and scalability problems using different architectural designs.
This Systematic Literature Review (SLR) focuses on integrated NFV/SDN architectures, with the following goals: i) to investigate and provide an in-depth review of the state-of-the-art of NFV/SDN architectures, ii) to synthesize their architectural designs, and iii) to identify areas for further improvements.
Broadly, this SLR will encourage researchers to advance the current stage of development (i.e., the state-of-the-practice) of integrated NFV/SDN architectures, and shed some light on future research efforts and the challenges faced.
Facial analysis technologies have recently measured up to the capabilities of expert clinicians in syndrome identification.
To date, these technologies could only identify phenotypes of a few diseases, limiting their role in clinical settings where hundreds of diagnoses must be considered.
We developed a facial analysis framework, DeepGestalt, using computer vision and deep learning algorithms, that quantifies similarities to hundreds of genetic syndromes based on unconstrained 2D images.
DeepGestalt is currently trained with over 26,000 patient cases from a rapidly growing phenotype-genotype database, consisting of tens of thousands of validated clinical cases, curated through a community-driven platform.
DeepGestalt currently achieves 91% top-10-accuracy in identifying over 215 different genetic syndromes and has outperformed clinical experts in three separate experiments.
We suggest that this form of artificial intelligence is ready to support medical genetics in clinical and laboratory practices and will play a key role in the future of precision medicine.
We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene.
This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns.
We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domain.
However, the choice of similarity measure for matching exemplars to a query image is essential to good performance.
For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness.
Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions.
We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images.
Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.
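The multi-channel normalized cross-correlation above amounts to normalizing each feature channel separately and then averaging the per-channel correlations. A minimal pure-Python sketch at a single (zero-shift) alignment, with flattened channels as plain lists; this illustrates the metric, not the paper's full matching pipeline:

```python
import math

def ncc(x, y):
    """Normalized cross-correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    dx = math.sqrt(sum((a - mx) ** 2 for a in x))
    dy = math.sqrt(sum((b - my) ** 2 for b in y))
    return num / (dx * dy) if dx > 0 and dy > 0 else 0.0

def mcncc(X, Y):
    """Multi-channel NCC: normalize each channel independently, then
    average the correlations across channels. X and Y are lists of
    channels (one flattened feature map per channel)."""
    return sum(ncc(x, y) for x, y in zip(X, Y)) / len(X)
```

Normalizing per channel matters because deep feature channels have very different scales; averaging un-normalized correlations would let a few high-energy channels dominate the match score.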
In ultra-wideband (UWB) communication systems with impulse radio (IR) modulation, the bandwidth is usually 1 GHz or more.
To process the received signal digitally, high sampling rate analog-digital-converters (ADC) are required.
Due to the high complexity and large power consumption of such high-rate ADCs, a monobit ADC is an appropriate alternative.
The optimal monobit receiver has been derived.
However, it is not effective at combating intersymbol interference (ISI).
Decision feedback equalization (DFE) is an effective way of dealing with ISI.
In this paper, we propose an algorithm that combines Viterbi decoding and DFE for monobit receivers.
In this way, we suppress the impact of ISI effectively, thus improving the bit error rate (BER) performance.
By state expansion, we achieve better performance.
The simulation results show that the algorithm achieves about 1 dB SNR gain over the separate demodulation and decoding method, and about 1 dB loss relative to the BER performance in a channel without ISI.
Compared to full-resolution detection in a fading channel without ISI, it incurs a 3 dB SNR loss after state expansion.
This work tackles the face recognition task on images captured using thermal camera sensors which can operate in the non-light environment.
While it can greatly increase the scope and benefits of the current security surveillance systems, performing such a task using thermal images is a challenging problem compared to face recognition task in the Visible Light Domain (VLD).
This is partly due to the much smaller amount of thermal imagery data collected compared to the VLD data.
Unfortunately, direct application of the existing very strong face recognition models trained using VLD data into the thermal imagery data will not produce a satisfactory performance.
This is due to the existence of the domain gap between the thermal and VLD images.
To this end, we propose a Thermal-to-Visible Generative Adversarial Network (TV-GAN) that is able to transform thermal face images into their corresponding VLD images whilst maintaining identity information sufficient for existing VLD face recognition models to perform recognition.
Some examples are presented in Figure 1.
Unlike the previous methods, our proposed TV-GAN uses an explicit closed-set face recognition loss to regularize the discriminator network training.
This information is then conveyed to the generator network in the form of gradients.
In the experiment, we show that by using this additional explicit regularization for the discriminator network, the TV-GAN is able to preserve more identity information when translating a thermal image of a person not previously seen by the TV-GAN.
We analyse a generalisation of the Quicksort algorithm in which k pivots, chosen uniformly at random, are used for partitioning an array of n distinct keys.
Specifically, the expected cost of this scheme is obtained, under the assumption of linearity of the cost needed for the partition process.
The integration constants of the expected cost are computed using Vandermonde matrices.
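The k-pivot partitioning scheme analysed above can be sketched as follows: choose k pivots uniformly at random, sort them, partition the remaining keys into the k+1 groups the pivots delimit, and recurse. An illustrative Python sketch of the scheme itself (not of its cost analysis), assuming distinct keys as in the abstract:

```python
import bisect
import random

def quicksort_k(a, k=3):
    """Quicksort with k pivots chosen uniformly at random from the
    (distinct) keys; the k+1 subarrays between pivots are sorted
    recursively."""
    if len(a) <= k:
        return sorted(a)          # base case: too few keys to pick k pivots
    pivots = sorted(random.sample(a, k))
    pivot_set = set(pivots)
    groups = [[] for _ in range(k + 1)]
    for x in a:
        if x in pivot_set:
            continue
        # index of the group x falls into, delimited by the sorted pivots
        groups[bisect.bisect_left(pivots, x)].append(x)
    out = []
    for i, g in enumerate(groups):
        out.extend(quicksort_k(g, k))
        if i < len(pivots):
            out.append(pivots[i])  # interleave pivots between sorted groups
    return out
```

With k = 1 this reduces to classical Quicksort; the abstract's linearity assumption corresponds to the partition loop above costing O(n) per level (up to the log k factor from the binary search).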
We develop a novel framework that aims to create bridges between the computational social choice and the database management communities.
This framework enriches the tasks currently supported in computational social choice with relational database context, thus making it possible to formulate sophisticated queries about voting rules, candidates, voters, issues, and positions.
At the conceptual level, we give rigorous semantics to queries in this framework by introducing the notions of necessary answers and possible answers to queries.
At the technical level, we embark on an investigation of the computational complexity of the necessary answers.
We establish a number of results about the complexity of the necessary answers of conjunctive queries involving positional scoring rules that contrast sharply with earlier results about the complexity of the necessary winners.
The importance of hierarchically structured representations for tractable planning has long been acknowledged.
However, the questions of how people discover such abstractions and how to define a set of optimal abstractions remain open.
This problem has been explored in cognitive science in the problem solving literature and in computer science in hierarchical reinforcement learning.
Here, we emphasize an algorithmic perspective on learning hierarchical representations in which the objective is to efficiently encode the structure of the problem, or, equivalently, to learn an algorithm with minimal length.
We introduce a novel problem-solving paradigm that links problem solving and program induction under the Markov Decision Process (MDP) framework.
Using this task, we target the question of whether humans discover hierarchical solutions by maximizing efficiency in number of actions they generate or by minimizing the complexity of the resulting representation and find evidence for the primacy of representational efficiency.
In this paper we perform a comparative analysis of three models for feature representation of text documents in the context of document classification.
In particular, we consider the most often used family of models bag-of-words, recently proposed continuous space models word2vec and doc2vec, and the model based on the representation of text documents as language networks.
While the bag-of-words models have been extensively used for the document classification task, the performance of the other two models on the same task has not been well understood.
This is especially true for the network-based model, which has rarely been considered for representing text documents for classification.
In this study, we measure the performance of document classifiers trained using the method of random forests on features generated by the three models and their variants.
The results of the empirical comparison show that the commonly used bag-of-words model has performance comparable to the one obtained by the emerging continuous-space model of doc2vec.
In particular, the low-dimensional variants of doc2vec generating up to 75 features are among the top-performing document representation models.
The results finally point out that doc2vec shows a superior performance in the tasks of classifying large documents.
A prognostic watch of the electric power system (EPS) is constructed, which detects threats to the EPS a day ahead according to the characteristic times and the droop forecast for a day ahead.
To this end, a prognostic analysis of the EPS development for a day ahead is carried out.
The power grid state, the electricity market state, and the level of threat to the power grid are also determined for a day ahead.
The accuracy of the constructed prognostic watch is evaluated.
While neural networks demonstrate stronger capabilities in pattern recognition nowadays, they are also becoming larger and deeper.
As a result, the effort needed to train a network also increases dramatically.
In many cases, it is more practical to use a neural network intellectual property (IP) that an IP vendor has already trained.
As we do not know about the training process, there can be security threats in the neural IP: the IP vendor (attacker) may embed hidden malicious functionality, i.e. neural Trojans, into the neural IP.
We show that this is an effective attack and provide three mitigation techniques: input anomaly detection, re-training, and input preprocessing.
All the techniques are proven effective.
The input anomaly detection approach is able to detect 99.8% of Trojan triggers, although with a 12.2% false positive rate.
The re-training approach is able to prevent 94.1% of Trojan triggers from triggering the Trojan although it requires that the neural IP be reconfigurable.
In the input preprocessing approach, 90.2% of Trojan triggers are rendered ineffective and no assumption about the neural IP is needed.
In simulations, probabilistic algorithms and statistical tests, we often generate random integers in an interval (e.g., [0,s)).
For example, random integers in an interval are essential to the Fisher-Yates random shuffle.
Consequently, popular languages like Java, Python, C++, Swift and Go include ranged random integer generation functions as part of their runtime libraries.
Pseudo-random values are usually generated in words of a fixed number of bits (e.g., 32 bits, 64 bits) using algorithms such as a linear congruential generator.
We need functions to convert such random words to random integers in an interval ([0,s)) without introducing statistical biases.
The standard functions in programming languages such as Java involve integer divisions.
Unfortunately, division instructions are relatively expensive.
We review an unbiased function to generate ranged integers from a source of random words that avoids integer divisions with high probability.
To establish the practical usefulness of the approach, we show that this algorithm can multiply the speed of unbiased random shuffling on x64 processors.
Our proposed approach has been adopted by the Go language for its implementation of the shuffle function.
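The division-avoiding technique reviewed above (Lemire's multiply-and-shift rejection method) can be sketched in Python: the fast path uses a single multiplication, and only the rare biased cases fall through to a rejection loop. A minimal sketch driving a Fisher-Yates shuffle, as in the abstract:

```python
import random

def bounded_rand(s, bits=64, rng=random.getrandbits):
    """Unbiased random integer in [0, s): multiply a random word by s and
    keep the high word; reject only when the low word falls in the biased
    region (probability < s / 2^bits), avoiding division with high
    probability."""
    mask = (1 << bits) - 1
    x = rng(bits)
    m = x * s
    low = m & mask
    if low < s:                             # possible bias: compute threshold
        threshold = ((1 << bits) - s) % s   # = 2^bits mod s (one division, rare)
        while low < threshold:
            m = rng(bits) * s
            low = m & mask
    return m >> bits                        # high word: uniform in [0, s)

def fisher_yates(a):
    """Fisher-Yates shuffle driven by the unbiased bounded generator."""
    for i in range(len(a) - 1, 0, -1):
        j = bounded_rand(i + 1)
        a[i], a[j] = a[j], a[i]
    return a
```

The speedup on x64 comes from the fast path being a widening multiply and a shift, whereas the classical approach performs at least one integer division per draw.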
In this paper, we present a detailed design of dynamic video segmentation network (DVSNet) for fast and efficient semantic video segmentation.
DVSNet consists of two convolutional neural networks: a segmentation network and a flow network.
The former generates highly accurate semantic segmentations, but is deeper and slower.
The latter is much faster than the former, but its output requires further processing to generate less accurate semantic segmentations.
We explore the use of a decision network to adaptively assign different frame regions to different networks based on a metric called expected confidence score.
Frame regions with a higher expected confidence score traverse the flow network.
Frame regions with a lower expected confidence score have to pass through the segmentation network.
We have extensively performed experiments on various configurations of DVSNet, and investigated a number of variants for the proposed decision network.
The experimental results show that our DVSNet is able to achieve up to 70.4% mIoU at 19.8 fps on the Cityscapes dataset.
A high speed version of DVSNet is able to deliver an fps of 30.4 with 63.2% mIoU on the same dataset.
DVSNet is also able to reduce up to 95% of the computational workloads.
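The decision-network routing described above can be sketched as a simple dispatcher: regions whose expected confidence score clears a threshold take the fast flow-network path, and the rest take the accurate segmentation-network path. The three networks here are hypothetical callables standing in for the paper's models:

```python
def segment_video_frame(regions, decision_net, flow_net, seg_net, threshold=0.9):
    """Assign each frame region to the flow network (fast) or the
    segmentation network (accurate) based on its expected confidence
    score, as predicted by the decision network."""
    outputs = []
    for region in regions:
        if decision_net(region) >= threshold:
            outputs.append(flow_net(region))   # high confidence: fast path
        else:
            outputs.append(seg_net(region))    # low confidence: accurate path
    return outputs
```

The threshold is the knob that trades accuracy for speed: raising it routes more regions through the slower segmentation network, lowering it favors throughput.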
Face recognition performance has improved remarkably in the last decade.
Much of this success can be attributed to the development of deep learning techniques such as convolutional neural networks (CNNs).
While CNNs have pushed the state-of-the-art forward, their training process requires a large amount of clean and correctly labelled training data.
If a CNN is intended to tolerate facial pose, then we face an important question: should this training data be diverse in its pose distribution, or should face images be normalized to a single pose in a pre-processing step?
To address this question, we evaluate a number of popular facial landmarking and pose correction algorithms to understand their effect on facial recognition performance.
Additionally, we introduce a new, automatic, single-image frontalization scheme that exceeds the performance of current algorithms.
CNNs trained using sets of different pre-processing methods are used to extract features from the Point and Shoot Challenge (PaSC) and CMU Multi-PIE datasets.
We assert that the subsequent verification and recognition performance serves to quantify the effectiveness of each pose correction scheme.
Researchers have proposed various methods to extract 3D keypoints from the surface of 3D mesh models over the last few decades, but most of them are based on geometric methods, which lack the flexibility to meet the requirements of various applications.
In this paper, we propose a new method on the basis of deep learning by formulating the 3D keypoint detection as a regression problem using deep neural network (DNN) with sparse autoencoder (SAE) as our regression model.
Both local information and global information of a 3D mesh model in multi-scale space are fully utilized to detect whether a vertex is a keypoint or not.
SAE can effectively extract the internal structure of these two kinds of information and formulate high-level features for them, which is beneficial to the regression model.
Three SAEs are used to formulate the hidden layers of the DNN and then a logistic regression layer is trained to process the high-level features produced in the third SAE.
Numerical experiments show that the proposed DNN-based 3D keypoint detection algorithm outperforms five current state-of-the-art methods for various 3D mesh models.
We extend the idea of end-to-end learning of communications systems through deep neural network (NN)-based autoencoders to orthogonal frequency division multiplexing (OFDM) with cyclic prefix (CP).
Our implementation has the same benefits as a conventional OFDM system, namely single-tap equalization and robustness against sampling synchronization errors, which turned out to be one of the major challenges in previous single-carrier implementations.
This enables reliable communication over multipath channels and makes the communication scheme suitable for commodity hardware with imprecise oscillators.
We show that the proposed scheme can be realized with state-of-the-art deep learning software libraries as transmitter and receiver solely consist of differentiable layers required for gradient-based training.
We compare the performance of the autoencoder-based system against that of a state-of-the-art OFDM baseline over frequency-selective fading channels.
Finally, the impact of a non-linear amplifier is investigated and we show that the autoencoder inherently learns how to deal with such hardware impairments.
This paper presents a distributed painting algorithm for painting an a priori known rectangular region with a swarm of autonomous mobile robots.
We assume that the region is obstacle-free and rectangular in shape.
The basic approach is to divide the region into cells and let each robot paint one of these cells.
Assignment of different cells to the robots is done by ranking the robots according to their relative positions.
In this algorithm, the robots follow the basic Wait-Observe-Compute-Move model together with the synchronous timing model.
This paper also presents a simulation of the proposed algorithm.
The simulation is performed using the Player/Stage Robotic Simulator on Ubuntu 10.04 (Lucid Lynx) platform.
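The ranking-based cell assignment described above can be sketched as follows: robots are ranked by their relative positions, and the robot with rank r is assigned cell r of the grid in row-major order. An illustrative sketch under the assumption of one robot per cell; the (y, x) lexicographic ranking is an assumption, not necessarily the paper's exact ordering:

```python
def assign_cells(robot_positions, rows, cols):
    """Map each robot (by index) to one cell of a rows x cols grid.

    Robots are ranked lexicographically on their (y, x) position;
    the robot of rank r gets cell (r // cols, r % cols).
    """
    assert len(robot_positions) == rows * cols, "assumes one robot per cell"
    ranked = sorted(range(len(robot_positions)),
                    key=lambda i: (robot_positions[i][1], robot_positions[i][0]))
    return {robot: (rank // cols, rank % cols)
            for rank, robot in enumerate(ranked)}
```

Because the ranking depends only on relative positions that every robot can observe, each robot can compute the full assignment locally without communication, which fits the Wait-Observe-Compute-Move model.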
Parameter sweeping is a widely used algorithmic technique in computational science.
It is especially suited for high-throughput computing since the jobs evaluating the parameter space are loosely coupled or independent.
A tool that integrates the modeling of a parameter study with the control of jobs in a distributed architecture is presented.
The main task is to facilitate the creation and deletion of job templates, which are the elements describing the jobs to be run.
Extra functionality relies upon the GridWay Metascheduler, acting as the middleware layer for job submission and control.
It supports interesting features like multi-dimensional sweeping space, wildcarding of parameters, functional evaluation of ranges, value-skipping and job template automatic indexation.
The use of this tool increases the reliability of the parameter sweep study thanks to the systematic bookkeeping of job templates and their respective job statuses.
Furthermore, it simplifies the porting of the target application to the grid reducing the required amount of time and effort.
We consider the problem of learning optimal reserve price in repeated auctions against non-myopic bidders, who may bid strategically in order to gain in future rounds even if the single-round auctions are truthful.
Previous algorithms, e.g., empirical pricing, do not provide non-trivial regret bounds in this setting in general.
We introduce algorithms that obtain small regret against non-myopic bidders either when the market is large, i.e., no bidder appears in a constant fraction of the rounds, or when the bidders are impatient, i.e., they discount future utility by some factor mildly bounded away from one.
Our approach carefully controls what information is revealed to each bidder, and builds on techniques from differentially private online learning as well as the recent line of works on jointly differentially private algorithms.
This paper presents and analyses the implementation of a novel active queue management (AQM) named FavorQueue that aims to improve delay transfer of short lived TCP flows over best-effort networks.
The idea is to dequeue first those packets that do not belong to a previously enqueued flow.
The rationale is to mitigate the delay induced by long-lived TCP flows over the pace of short TCP data requests and to prevent dropped packets at the beginning of a connection and during recovery period.
Although the main target of this AQM is to accelerate short TCP traffic, we show that FavorQueue does not only improve the performance of short TCP traffic but also improves the performance of all TCP traffic in terms of drop ratio and latency whatever the flow size.
In particular, we demonstrate that FavorQueue reduces the loss of a retransmitted packet, decreases the number of dropped packets recovered by RTO and improves the latency up to 30% compared to DropTail.
Finally, we show that this scheme remains compliant with recent TCP updates such as the increase of the initial slow-start value.
This paper proposes a novel channel estimation method and a cluster-based opportunistic scheduling policy, for a wireless energy transfer (WET) system consisting of multiple low-complex energy receivers (ERs) with limited processing capabilities.
Firstly, in the training stage, the energy transmitter (ET) obtains a set of Received Signal Strength Indicator (RSSI) feedback values from all ERs, and these values are used to estimate the channels between the ET and all ERs.
Next, based on the channel estimates, the ERs are grouped into clusters, and the cluster that has its members closest to its centroid in phase is selected for dedicated WET.
The beamformer that maximizes the minimum harvested energy among all ERs in the selected cluster is found by solving a convex optimization problem.
All ERs have the same chance of being selected regardless of their distances from the ET, and hence, this scheduling policy can be considered to be opportunistic as well as fair.
It is shown that the proposed method achieves significant performance gains over benchmark schemes.
Hybrid multiple-antenna transceivers, which combine large-dimensional analog pre/postprocessing with lower-dimensional digital processing, are the most promising approach for reducing the hardware cost and training overhead in massive MIMO systems.
This paper provides a comprehensive survey of the various incarnations of such structures that have been proposed in the literature.
We provide a taxonomy in terms of the required channel state information (CSI), namely whether the processing adapts to the instantaneous or the average (second-order) CSI; while the former provides somewhat better signal-to-noise and interference ratio (SNIR), the latter has much lower overhead for CSI acquisition.
We furthermore distinguish hardware structures of different complexities.
Finally, we point out the special design aspects for operation at millimeter-wave frequencies.
We propose a new 2D shape decomposition method based on the short-cut rule.
The short-cut rule originates from cognition research, and states that the human visual system prefers to partition an object into parts using the shortest possible cuts.
We propose and implement a computational model for the short-cut rule and apply it to the problem of shape decomposition.
The proposed model generates a set of cut hypotheses passing through the points on the silhouette that are negative minima of curvature.
We then show that most part-cut hypotheses can be eliminated by analysis of local properties of each.
Finally, the remaining hypotheses are evaluated in ascending length order, which guarantees that of any pair of conflicting cuts only the shortest will be accepted.
We demonstrate that, compared with state-of-the-art shape decomposition methods, the proposed approach achieves decomposition results which better correspond to human intuition as revealed in psychological experiments.
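The final selection step above can be sketched as a greedy pass over the cut hypotheses in ascending length order, accepting a cut only if it conflicts with no already-accepted cut; this guarantees that of any conflicting pair only the shorter is accepted. The cut names, lengths, and conflict relation below are illustrative:

```python
def select_cuts(cuts, conflicts):
    """Greedy cut selection in ascending length order.

    cuts: dict mapping cut name -> cut length.
    conflicts: set of frozenset pairs of mutually exclusive cuts.
    Returns the accepted cuts; the shorter of any conflicting pair wins.
    """
    accepted = []
    for name in sorted(cuts, key=cuts.get):          # shortest cuts first
        if all(frozenset((name, a)) not in conflicts for a in accepted):
            accepted.append(name)
    return accepted
```

Processing by ascending length is exactly the computational form of the short-cut rule: whenever two hypotheses conflict, the shorter one has already been accepted by the time the longer one is examined.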
Recent advances in Wireless Sensor Network (WSN) technology have brought new distributed sensing applications such as water quality monitoring.
By sensing parameters such as pH, conductivity, and temperature, the quality of water can be determined.
This paper proposes a novel design based on IEEE 802.15.4 (Zig-Bee protocol) and solar energy called Autonomous Water Quality Monitoring Prototype (AWQMP).
The prototype is designed to use the ECHERP routing protocol and the Arduino Mega 2560, an open-source electronic prototyping platform, for data acquisition.
AWQMP is expected to provide real-time data acquisition and to reduce the cost of manual water quality monitoring thanks to its autonomous operation.
Moreover, the proposed prototype will help to study the behavior of aquatic animals in the water bodies where it is deployed.
Recent accounts from researchers, journalists, as well as federal investigators, reached a unanimous conclusion: social media are systematically exploited to manipulate and alter public opinion.
Some disinformation campaigns have been coordinated by means of bots, social media accounts controlled by computer scripts that try to disguise themselves as legitimate human users.
In this study, we describe one such operation that occurred in the run-up to the 2017 French presidential election.
We collected a massive Twitter dataset of nearly 17 million posts published between April 27 and May 7, 2017 (Election Day).
We then set to study the MacronLeaks disinformation campaign: By leveraging a mix of machine learning and cognitive behavioral modeling techniques, we separated humans from bots, and then studied the activities of the two groups taken independently, as well as their interplay.
We provide a characterization of both the bots and the users who engaged with them, and contrast them with the users who did not.
Prior interests of disinformation adopters pinpoint to the reasons of the scarce success of this campaign: the users who engaged with MacronLeaks are mostly foreigners with a preexisting interest in alt-right topics and alternative news media, rather than French users with diverse political views.
Concluding, anomalous account usage patterns suggest the possible existence of a black-market for reusable political disinformation bots.
This paper shows experimental results on learning based randomized bin-picking combined with iterative visual recognition.
We use a random forest to predict whether or not a robot will successfully pick an object given depth images of the pile, taking into account possible collisions between a finger and a neighboring object.
To make this discriminator accurate, we estimate objects' poses by merging multiple depth images of the pile captured from different points of view with a depth sensor attached at the wrist.
We show that, even if a pick is predicted to fail based on a single depth image due to its large occluded area, it can finally be predicted as a success after merging multiple depth images.
In addition, we show that the random forest can be trained with a small number of training samples.
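As a minimal illustration of the merging step, the sketch below (our own toy code, not the authors' implementation) fuses per-pixel depth values across views, which shrinks the occluded region that makes single-view predictions fail:

```python
# Toy depth-map fusion: depth maps are row-major lists of the same scene;
# None marks an occluded (unobserved) pixel in that view.

def merge_depth_maps(depth_maps):
    """Per-pixel fusion: keep the nearest valid depth across all views."""
    merged = []
    for pixels in zip(*depth_maps):
        valid = [d for d in pixels if d is not None]
        merged.append(min(valid) if valid else None)
    return merged

def occluded_fraction(depth_map):
    return sum(d is None for d in depth_map) / len(depth_map)

# Two views of the same 2x3 scene; each view misses different pixels.
view_a = [0.30, None, 0.25, None, 0.40, None]
view_b = [None, 0.35, 0.27, None, None, 0.41]
merged = merge_depth_maps([view_a, view_b])
```

After merging, the occluded fraction drops from one half to one sixth, which is the kind of reduction that can flip a predicted failure into a predicted success.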
The paper introduces the concentric Echo State Network, an approach to designing reservoir topologies that tries to bridge the gap between deterministically constructed simple cycle models and deep reservoir computing approaches.
We show how to modularize the reservoir into simple unidirectional and concentric cycles with pairwise bidirectional jump connections between adjacent loops.
We provide a preliminary experimental assessment showing that concentric reservoirs yield superior predictive accuracy and memory capacity compared to single-cycle reservoirs and deep reservoir models.
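A rough sketch of how such a topology can be wired, under our own reading of the abstract (the unit counts, weight values, and the pairing rule for the jump connections are illustrative assumptions, not the paper's construction):

```python
# Build the recurrent weight matrix of a two-loop concentric reservoir:
# two unidirectional cycles plus bidirectional jump connections that pair
# each inner unit with a unit on the adjacent outer loop.
# W[target][source] holds the connection weight.

def concentric_reservoir(n_inner, n_outer, w_cycle=0.9, w_jump=0.3):
    n = n_inner + n_outer
    W = [[0.0] * n for _ in range(n)]
    # inner unidirectional cycle: 0 -> 1 -> ... -> n_inner-1 -> 0
    for i in range(n_inner):
        W[(i + 1) % n_inner][i] = w_cycle
    # outer unidirectional cycle over units n_inner .. n-1
    for i in range(n_outer):
        W[n_inner + (i + 1) % n_outer][n_inner + i] = w_cycle
    # bidirectional jumps pairing inner unit i with an outer unit
    for i in range(n_inner):
        j = n_inner + (i * n_outer) // n_inner
        W[j][i] = w_jump
        W[i][j] = w_jump
    return W

W = concentric_reservoir(3, 6)
```

The resulting matrix is extremely sparse and fully deterministic, in the spirit of simple cycle reservoirs.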
As online shopping becomes ever more prevalent, customers rely increasingly on product rating websites for making purchase decisions.
The reliability of online ratings, however, is potentially compromised by the so-called herding effect: when rating a product, customers may be biased to follow other customers' previous ratings of the same product.
This is problematic because it skews long-term customer perception through haphazard early ratings.
The study of herding poses methodological challenges.
In particular, observational studies are impeded by the lack of counterfactuals: simply correlating early with subsequent ratings is insufficient because we cannot know what the subsequent ratings would have looked like had the first ratings been different.
The methodology introduced here exploits a setting that comes close to an experiment, although it is purely observational---a natural experiment.
Our key methodological device consists in studying the same product on two separate rating sites, focusing on products that received a high first rating on one site, and a low first rating on the other.
This largely controls for confounds such as a product's inherent quality, advertising, and producer identity, and lets us isolate the effect of the first rating on subsequent ratings.
In a case study, we focus on beers as products and jointly study two beer rating sites, but our method applies to any pair of sites across which products can be matched.
We find clear evidence of herding in beer ratings.
For instance, if a beer receives a very high first rating, its second rating is on average half a standard deviation higher, compared to a situation where the identical beer receives a very low first rating.
Moreover, herding effects tend to last a long time and are noticeable even after 20 or more ratings.
Our results have important implications for the design of better rating systems.
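The headline comparison can be sketched as follows, with toy numbers standing in for the study's data (the rating values below are made up for illustration):

```python
# For products matched across two sites, compare the mean second rating after
# a very high vs. a very low first rating, in units of the rating
# standard deviation.
import statistics

# (first_rating_bucket, second_rating) pairs for the same matched beers.
ratings = [("high", 4.2), ("high", 4.0), ("high", 4.4),
           ("low", 3.6), ("low", 3.8), ("low", 3.7)]

all_second = [r for _, r in ratings]
sd = statistics.stdev(all_second)

def bucket_mean(bucket):
    return statistics.mean(r for b, r in ratings if b == bucket)

# A positive value means second raters herd toward the first rating.
herding_effect_in_sd = (bucket_mean("high") - bucket_mean("low")) / sd
```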
HDBSCAN*, a state-of-the-art density-based hierarchical clustering method, produces a hierarchical organization of clusters in a dataset w.r.t. a parameter mpts.
While the performance of HDBSCAN* is robust w.r.t. mpts in the sense that a small change in mpts typically leads to only a small or no change in the clustering structure, choosing a "good" mpts value can be challenging: depending on the data distribution, a high or low value for mpts may be more appropriate, and certain data clusters may reveal themselves at different values of mpts.
To explore results for a range of mpts values, however, one has to run HDBSCAN* for each value in the range independently, which is computationally inefficient.
In this paper, we propose an efficient approach to compute all HDBSCAN* hierarchies for a range of mpts values by replacing the graph used by HDBSCAN* with a much smaller graph that is guaranteed to contain the required information.
An extensive experimental evaluation shows that with our approach one can obtain over one hundred hierarchies at a computational cost equivalent to running HDBSCAN* about twice.
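For reference, the graph HDBSCAN* operates on is weighted by mutual-reachability distances, which depend on mpts through the core distance; the sketch below shows that standard definition on toy points (our own code, not the paper's smaller replacement graph):

```python
# Mutual reachability: mreach(p, q) = max(core(p), core(q), dist(p, q)),
# where core(p) is the distance from p to its mpts-th nearest neighbour
# (counting p itself, as in the standard HDBSCAN* formulation).
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def core_distance(p, points, mpts):
    ds = sorted(dist(p, q) for q in points)  # ds[0] == 0 (p to itself)
    return ds[mpts - 1]

def mutual_reachability(p, q, points, mpts):
    return max(core_distance(p, points, mpts),
               core_distance(q, points, mpts),
               dist(p, q))

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
```

Because core distances change with mpts, every edge weight (and hence the hierarchy) changes too, which is why a naive exploration of an mpts range re-runs the whole pipeline per value.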
Silent speech interfaces have been recently proposed as a way to enable communication when the acoustic signal is not available.
This introduces the need to build visual speech recognition systems for silent and whispered speech.
However, almost all the recently proposed systems have been trained on vocalised data only.
This is in contrast with evidence in the literature which suggests that lip movements change depending on the speech mode.
In this work, we introduce a new audiovisual database which is publicly available and contains normal, whispered and silent speech.
To the best of our knowledge, this is the first study which investigates the differences between the three speech modes using the visual modality only.
We show that an absolute decrease in classification rate of up to 3.7% is observed when training on normal speech and testing on whispered speech, and vice versa.
An even higher decrease of up to 8.5% is reported when the models are tested on silent speech.
This reveals that there are indeed visual differences between the three speech modes, and that the common assumption that vocalized training data can be used directly to train a silent speech recognition system may not hold.
In this work, we present an extension of Gaussian process (GP) models with sophisticated parallelization and GPU acceleration.
The parallelization scheme arises naturally from the modular computational structure w.r.t. datapoints in the sparse Gaussian process formulation.
Additionally, the computational bottleneck is implemented with GPU acceleration for further speed up.
Combining both techniques makes it possible to apply Gaussian process models to millions of datapoints.
The efficiency of our algorithm is demonstrated with a synthetic dataset.
Its source code has been integrated into our popular software library GPy.
The complex interconnections between heterogeneous critical infrastructure sectors make the system of systems (SoS) vulnerable to natural or human-made disasters and lead to cascading failures both within and across sectors.
Hence, the robustness and resilience of the interdependent critical infrastructures (ICIs) against extreme events are essential for delivering reliable and efficient services to our society.
To this end, we first establish a holistic probabilistic network model to represent the interdependencies between infrastructure components.
To capture the underlying failure and recovery dynamics of ICIs, we further propose a Markov decision process (MDP) model in which the repair policy determines the long-term performance of the ICIs.
To address the challenges that arise from the curse of dimensionality of the MDP, we reformulate the problem as an approximate linear program and then simplify it using factored graphs.
We further obtain the distributed optimal control for ICIs under mild assumptions.
Finally, we use a case study of the interdependent power and subway systems to corroborate the results and show that the optimal resilience resource planning and allocation can reduce the failure probability and mitigate the impact of failures caused by natural or artificial disasters.
Virtual reality (VR) training simulators of liver needle insertion in the hepatic area of breathing virtual patients currently need 4D data acquisitions as a prerequisite.
Here, a population-based breathing virtual patient 4D atlas is first built; then, the dose-relevant or expensive acquisition of a 4D data set for a new static 3D patient can be avoided by warping the mean atlas motion onto that patient.
The breakthrough contribution of this work is the construction and reuse of population-based learned 4D motion models.
In this paper, we propose a distributed control strategy for the design of an energy market.
The method relies on a hierarchical structure of aggregators for the coordination of prosumers (agents which can produce and consume energy).
The hierarchy reflects the voltage level separations of the electrical grid and allows aggregating prosumers in pools, while taking into account the grid operational constraints.
To reach optimal coordination, the prosumers communicate their forecasted power profile to the upper level of the hierarchy.
Each time the information crosses a level of the hierarchy upwards, it is first aggregated, both to strongly reduce the data flow and to preserve privacy.
In the first part of the paper, the decomposition algorithm, which is based on the alternating direction method of multipliers (ADMM), is presented.
In the second part, we explore how the proposed algorithm scales with an increasing number of prosumers and hierarchical levels, through extensive simulations based on randomly generated scenarios.
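The upward aggregation step can be sketched in a few lines (our own toy illustration; the profile values and pool sizes are made up):

```python
# Each aggregator sums its children's forecasted power profiles before
# forwarding them upwards: the upper level sees only pool totals, which both
# reduces the data flow and hides individual prosumer profiles.

def aggregate(profiles):
    """Element-wise sum of per-prosumer power profiles (kW per time step)."""
    return [sum(vals) for vals in zip(*profiles)]

# Two low-voltage pools, each with two prosumers, over 3 time steps.
pool_a = aggregate([[1.0, -0.5, 2.0], [0.5, 0.5, -1.0]])
pool_b = aggregate([[-2.0, 1.0, 0.0], [1.0, 0.0, 0.5]])
feeder = aggregate([pool_a, pool_b])   # next hierarchy level sees only sums
```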
In many economic, social and political situations individuals carry out activities in groups (coalitions) rather than alone and on their own.
Examples range from households and sport clubs to research networks, political parties and trade unions.
The underlying game theoretic framework is known as coalition formation.
This survey discusses the notion of core stability in hedonic coalition formation (where each player's happiness only depends on the other members of his coalition but not on how the remaining players outside his coalition are grouped).
We present the central concepts and algorithmic approaches in the area, provide many examples, and pose a number of open problems.
Most face super-resolution methods assume that the low-resolution and high-resolution manifolds have similar local geometrical structure, and hence learn local models on the low-resolution manifold (e.g. sparse or locally linear embedding models), which are then applied on the high-resolution manifold.
However, the low-resolution manifold is distorted by the one-to-many relationship between low- and high-resolution patches.
This paper presents a method which learns linear models based on the local geometrical structure on the high-resolution manifold rather than on the low-resolution manifold.
For this, in a first step, the low-resolution patch is used to derive a globally optimal estimate of the high-resolution patch.
The approximated solution is shown to be close in Euclidean space to the ground truth, but is generally smooth and lacks the texture details needed by state-of-the-art face recognizers.
This first estimate allows us to find the support of the high-resolution manifold using sparse coding (SC), which is then used for learning a local projection (or upscaling) model between the low-resolution and high-resolution manifolds using Multivariate Ridge Regression (MRR).
Experimental results show that the proposed method outperforms six face super-resolution methods in terms of both recognition and quality.
These results also reveal that recognition and quality are significantly affected by the method used for stitching all super-resolved patches together, where quilting was found to better preserve the texture details, which helps to achieve higher recognition rates.
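The MRR upscaling step has a standard closed form, sketched below on synthetic patches (our own toy code; the patch dimensions and regularization weight are illustrative assumptions, and the SC-based support selection is omitted):

```python
# Multivariate Ridge Regression: learn W mapping low-res patch vectors to
# high-res patch vectors via W = Y X^T (X X^T + lambda I)^{-1}.
import numpy as np

def learn_mrr(X, Y, lam=1e-3):
    d = X.shape[0]
    return Y @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(d))

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 50))        # columns: 4-dim low-res patches
W_true = rng.standard_normal((16, 4))   # ground-truth upscaling map
Y = W_true @ X                          # columns: 16-dim high-res patches

W = learn_mrr(X, Y)
upscaled = W @ X[:, :1]                 # upscale one low-res patch
```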
A domain analysis & description calculus is introduced.
It is shown to alleviate the issue of implicit semantics.
The claim is made that domain descriptions, whether informal, or as also here, formal, amount to an explicit semantics for what is otherwise implicit if not described.
This paper introduces an analytical framework to investigate optimal design choices for the placement of virtual controllers along the cloud-to-things continuum.
The main application scenarios include low-latency cyber-physical systems in which real-time control actions are required in response to the changes in states of an IoT node.
In such cases, deploying controller software on a cloud server is often not tolerable due to delay from the network edge to the cloud.
Hence, it is desirable to trade reliability for latency by moving controller logic closer to the network edge.
Modeling the IoT node as a dynamical system that evolves linearly in time with quadratic penalty for state deviations, recursive expressions for the optimum control policy and the resulting minimum cost value are obtained by taking virtual fog controller reliability and response time latency into account.
Our results indicate that latency is more critical than reliability in provisioning virtualized control services over fog endpoints, as it determines the swiftness of the fog control system as well as the timeliness of state measurements.
Based on a realistic drone trajectory tracking model, an extensive simulation study is also performed to illustrate the influence of reliability and latency on the control of autonomous vehicles over fog.
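For a linear system with quadratic state penalty, the recursive structure mentioned above is the standard finite-horizon LQR backward recursion, sketched below; the paper's actual recursion additionally folds in controller reliability and response-time latency, which this toy omits:

```python
# Finite-horizon LQR: backward Riccati recursion producing time-varying
# feedback gains, with u_t = -gains[t] @ x_t.
import numpy as np

def lqr_backward(A, B, Q, R, QT, horizon):
    P = QT
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1], P   # gains[0] applies at the initial time step

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy double-integrator dynamics
B = np.array([[0.0], [0.1]])
Q = np.eye(2); R = np.array([[0.1]]); QT = np.eye(2)
gains, P0 = lqr_backward(A, B, Q, R, QT, horizon=50)
```

With a long enough horizon the first gain approaches the stationary LQR gain, so the closed loop is stable.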
Quality diversity is a recent family of evolutionary search algorithms which focus on finding several well-performing (quality) yet different (diversity) solutions, with the aim of maintaining an appropriate balance between divergence and convergence during search.
While quality diversity has already delivered promising results in complex problems, the capacity of divergent search variants for quality diversity remains largely unexplored.
Inspired by the notion of surprise as an effective driver of divergent search, and by its orthogonal nature to novelty, this paper investigates the impact of the former on quality diversity performance.
For that purpose we introduce three new quality diversity algorithms which employ surprise as a diversity measure, either on its own or combined with novelty, and compare their performance against novelty search with local competition, the state-of-the-art quality diversity algorithm.
The algorithms are tested in a robot navigation task across 60 highly deceptive mazes.
Our findings suggest that allowing surprise and novelty to operate synergistically for divergence and in combination with local competition leads to quality diversity algorithms of significantly higher efficiency, speed and robustness.
It is a well-known fact that feedback does not increase the capacity of point-to-point memoryless channels; however, its effect on secure communications is not fully understood yet.
In this work, an achievable scheme for the wiretap channel with generalized feedback is presented.
This scheme, which uses the feedback signal to generate a shared secret key between the legitimate users, encrypts the message to be sent at the bit level.
New capacity results for a class of channels are provided, as well as some new insights into the secret key agreement problem.
Moreover, this scheme recovers previously reported rate regions from the literature, and thus it can be seen as a generalization that unifies several results in the field.
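The bit-level encryption step amounts to XORing the message with the shared key, one-time-pad style; a toy sketch (the key itself would be distilled from the feedback signal by the paper's scheme, which we simply assume here):

```python
# XOR-based bit-level encryption with a shared secret key.

def xor_bits(bits, key):
    return [b ^ k for b, k in zip(bits, key)]

message = [1, 0, 1, 1, 0, 0, 1, 0]
shared_key = [0, 1, 1, 0, 1, 0, 0, 1]   # assumed already agreed via feedback

ciphertext = xor_bits(message, shared_key)
recovered = xor_bits(ciphertext, shared_key)   # legitimate receiver decrypts
```

An eavesdropper without the key sees only the ciphertext, which is why key agreement over the feedback link yields secrecy rates.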
Only a few studies have been reported on human ear recognition in the long-wave infrared band.
Thus, we have created an ear database in this band.
We call this database the long-wave infrared band MIDAS; it consists of 2430 records of 81 subjects.
For human ear recognition, the thermal band provides seamless operation both night and day, robustness against spoofing through live-ear detection, and invariance to illumination conditions.
We propose to use different algorithms to reveal the distinctive features.
We then reduce the dimensionality of the data using subspace methods, in accordance with the classifier methods.
After this, the decision is determined by the best score, or by combining several of the best scores with matching fusion.
The results show that the fusion technique was successful.
We have reached 97.71% for rank-1 with 567 test probes.
Furthermore, we define the perfect rank, which is the rank at which the recognition rate reaches 100% on the cumulative matching curve.
This evaluation is especially important for forensics, for example corpse identification and criminal investigation.
Collaborative filtering algorithms find useful patterns in rating and consumption data and exploit these patterns to guide users to good items.
Many of the patterns in rating datasets reflect important real-world differences between the various users and items in the data; other patterns may be irrelevant or possibly undesirable for social or ethical reasons, particularly if they reflect undesired discrimination, such as discrimination in publishing or purchasing against authors who are women or ethnic minorities.
In this work, we examine the response of collaborative filtering recommender algorithms to the distribution of their input data with respect to a dimension of social concern, namely content creator gender.
Using publicly-available book ratings data, we measure the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data.
We find that common collaborative filtering algorithms differ in the gender distribution of their recommendation lists, and in the relationship of that output distribution to user profile distribution.
This paper addresses the task of community detection and proposes a local approach based on a distributed list building, where each vertex broadcasts basic information that only depends on its degree and that of its neighbours.
A decentralised external process then unveils the community structure.
The relevance of the proposed method is experimentally shown on both artificial and real data.
The quantity of event logs available is increasing rapidly, be they produced by industrial processes, computing systems, or life tracking, for instance.
It is thus important to design effective ways to uncover the information they contain.
Because event logs often record repetitive phenomena, mining periodic patterns is especially relevant when considering such data.
Indeed, capturing such regularities is instrumental in providing condensed representations of the event sequences.
We present an approach for mining periodic patterns from event logs while relying on a Minimum Description Length (MDL) criterion to evaluate candidate patterns.
Our goal is to extract a set of patterns that suitably characterises the periodic structure present in the data.
We evaluate the interest of our approach on several real-world event log datasets.
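The MDL intuition can be sketched with a deliberately simplified cost model (our own toy encoding, not the paper's): a periodic pattern is worth keeping only if encoding (start, period, count) plus small per-occurrence timing corrections is cheaper than listing every occurrence separately.

```python
# Compare two descriptions of the same event occurrences, in bits.
import math

def cost_individual(times, horizon):
    # each occurrence time encoded independently over the horizon
    return len(times) * math.log2(horizon)

def cost_periodic(times, horizon, max_shift=2):
    # encode start, period and count, plus one small shift correction per event
    return 3 * math.log2(horizon) + len(times) * math.log2(2 * max_shift + 1)

times = [5, 15, 26, 35, 45, 56, 65, 75]   # an event firing roughly every 10 ticks
horizon = 1000
```

Under this toy model the periodic description wins, which is exactly the MDL signal that the pattern characterizes real structure in the log.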
A geometrical pattern is a set of points with all pairwise distances (or, more generally, relative distances) specified.
Finding matches to such patterns has applications to spatial data in seismic, astronomical, and transportation contexts.
For example, a particularly interesting geometric pattern in astronomy is the Einstein cross, which is an astronomical phenomenon in which a single quasar is observed as four distinct sky objects (due to gravitational lensing) when captured by earth telescopes.
Finding such crosses, as well as other geometric patterns, is a challenging problem as the potential number of sets of elements that compose shapes is exponentially large in the size of the dataset and the pattern.
In this paper, we refer to geometric patterns as constellation queries and propose algorithms to find them in large data applications.
Our methods combine quadtrees, matrix multiplication, and unindexed join processing to discover sets of points that match a geometric pattern within some additive factor on the pairwise distances.
Our distributed experiments show that the choice of composition algorithm (matrix multiplication or nested loops) depends on the freedom introduced in the query geometry through the distance additive factor.
Three clearly identified blocks of threshold values guide the choice of the best composition algorithm.
Finally, solving the problem for relative distances requires a novel continuous-to-discrete transformation.
To the best of our knowledge this paper is the first to investigate constellation queries at scale.
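As a reference point for the problem definition, a brute-force matcher for a constellation query with an additive distance tolerance can be sketched as follows (our own toy code; the paper's algorithms use quadtrees and matrix multiplication to scale far beyond this):

```python
# A candidate tuple matches the pattern if every pairwise distance agrees
# with the pattern's corresponding distance within an additive factor eps.
import itertools
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def matches(candidate, pattern, eps):
    pairs = itertools.combinations(range(len(pattern)), 2)
    return all(abs(dist(candidate[i], candidate[j])
                   - dist(pattern[i], pattern[j])) <= eps
               for i, j in pairs)

def find_constellations(points, pattern, eps):
    # Exponential in the pattern size: this is the baseline the paper improves on.
    return [c for c in itertools.permutations(points, len(pattern))
            if matches(c, pattern, eps)]

pattern = [(0, 0), (1, 0), (0, 1)]   # a small right-angle pattern
data = [(10, 10), (11, 10.05), (10, 11), (50, 50)]
hits = find_constellations(data, pattern, eps=0.1)
```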
In a world of global trading, maritime safety, security and efficiency are crucial issues.
We propose a multi-task deep learning framework for vessel monitoring using Automatic Identification System (AIS) data streams.
We combine recurrent neural networks with latent variable modeling and an embedding of AIS messages into a new representation space to jointly address the key issues raised by AIS data streams: a massive amount of streaming data, noisy data and irregular time sampling.
We demonstrate the relevance of the proposed deep learning framework on real AIS datasets for a three-task setting, namely trajectory reconstruction, anomaly detection and vessel type identification.
Many geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given noisy measurements of a subset of their pairwise relative transforms.
This problem is typically formulated as a maximum-likelihood estimation that requires solving a nonconvex nonlinear program, which is computationally intractable in general.
Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of this estimation problem in a non-adversarial noise regime.
The crux of our approach is the development of a semidefinite relaxation of the maximum-likelihood estimation whose minimizer provides the exact MLE so long as the magnitude of the noise corrupting the available measurements falls below a certain critical threshold; furthermore, whenever exactness obtains, it is possible to verify this fact a posteriori, thereby certifying the optimality of the recovered estimate.
We develop a specialized optimization scheme for solving large-scale instances of this semidefinite relaxation by exploiting its low-rank, geometric, and graph-theoretic structure to reduce it to an equivalent optimization problem on a low-dimensional Riemannian manifold, and then design a Riemannian truncated-Newton trust-region method to solve this reduction efficiently.
We combine this fast optimization approach with a simple rounding procedure to produce our algorithm, SE-Sync.
Experimental evaluation on a variety of simulated and real-world pose-graph SLAM datasets shows that SE-Sync is capable of recovering globally optimal solutions when the available measurements are corrupted by noise up to an order of magnitude greater than that typically encountered in robotics applications, and does so at a computational cost that scales comparably with that of direct Newton-type local search techniques.
As humans, we regularly interpret images based on the relations between image regions.
For example, a person riding object X, or a plank bridging two objects.
Current methods provide limited support to search for images based on such relations.
We present RAID, a relation-augmented image descriptor that supports queries based on inter-region relations.
The key idea of our descriptor is to capture the spatial distribution of simple point-to-region relationships to describe more complex relationships between two image regions.
We evaluate the proposed descriptor by querying into a large subset of the Microsoft COCO database and successfully extract nontrivial images demonstrating complex inter-region relations, which are easily missed or erroneously classified by existing methods.
In this paper, we propose a neural-based coding scheme in which an artificial neural network is exploited to automatically compress and decompress speech signals by a trainable approach.
Having a two-stage training phase, the system can be fully specialized to each speech frame and achieves robust performance across different speakers and a wide range of spoken utterances.
Indeed, frame-based nonlinear predictive coding (FNPC) codes a frame by training the network to predict that frame's samples.
The motivating objective is to analyze the system behavior in regenerating not only the spectral envelope, but also the spectral phase.
This scheme has been evaluated in the time and discrete cosine transform (DCT) domains, and the output for predicted phonemes shows the potential of the FNPC to reconstruct complicated signals.
The experiments were conducted on three voiced plosive phonemes, /b/, /d/ and /g/, in the time and DCT domains, varying the number of neurons in the hidden layer.
The experiments confirm the FNPC's capability as an automatic coding system, with the /b/, /d/ and /g/ phonemes reproduced with good accuracy.
Evaluations revealed that the FNPC system trained to predict DCT coefficients performs better than the one trained on time samples, particularly for frames with a wider distribution of energy.
This paper proposes an extension to Generative Adversarial Networks (GANs), named ARTGAN, to synthetically generate more challenging and complex images, such as artwork with abstract characteristics.
This is in contrast to most of the current solutions, which focus on generating natural images such as room interiors, birds, flowers and faces.
The key innovation of our work is to allow back-propagation of the loss function w.r.t. the labels (randomly assigned to each generated image) to the generator from the discriminator.
With the feedback from the label information, the generator is able to learn faster and achieve better generated image quality.
Empirically, we show that the proposed ARTGAN is capable of creating realistic artwork, as well as generating compelling real-world images that globally look natural, with clear shapes, on CIFAR-10.
We have been developing a system for recognising human activity given a symbolic representation of video content.
The input of our system is a set of time-stamped short-term activities detected on video frames.
The output of our system is a set of recognised long-term activities, which are pre-defined temporal combinations of short-term activities.
The constraints on the short-term activities that, if satisfied, lead to the recognition of a long-term activity, are expressed using a dialect of the Event Calculus.
We illustrate the expressiveness of the dialect by showing the representation of several typical complex activities.
Furthermore, we present a detailed evaluation of the system through experimentation on a benchmark dataset of surveillance videos.
The emergence of distributed ledger technologies in the vehicular applications' arena is decisively contributing to their improvement and to shaping public opinion about their future.
The Tangle is a technology in its infancy, but it shows enormous potential to become a key solution by addressing several of the blockchain's limitations.
This paper focuses on the use of the Tangle to improve the security of both in-vehicle and off-vehicle functions in vehicular applications.
To this end, key operational performance parameters are identified, evaluated and discussed, with emphasis on their limitations and potential impact on future vehicular applications.
A fruitful approach for solving signal deconvolution problems consists of resorting to a frame-based convex variational formulation.
In this context, parallel proximal algorithms and related alternating direction methods of multipliers have become popular optimization techniques to approximate iteratively the desired solution.
Until now, in most of these methods, either Lipschitz differentiability properties or tight frame representations were assumed.
In this paper, it is shown that it is possible to relax these assumptions by considering a class of not necessarily tight frame representations, thus offering the possibility of addressing a broader class of signal restoration problems.
In particular, it is possible to use not necessarily maximally decimated filter banks with perfect reconstruction, which are common tools in digital signal processing.
The proposed approach allows us to solve both frame analysis and frame synthesis problems for various noise distributions.
In our simulations, it is applied to the deconvolution of data corrupted with Poisson noise or Laplacian noise by using (non-tight) discrete dual-tree wavelet representations and filter bank structures.
It is often the case that the best performing language model is an ensemble of a neural language model with n-grams.
In this work, we propose a method to improve how these two models are combined.
By using a small network which predicts the mixture weight between the two models, we adapt their relative importance at each time step.
Because the gating network is small, it trains quickly on small amounts of held out data, and does not add overhead at scoring time.
Our experiments carried out on the One Billion Word benchmark show a significant improvement over the state of the art ensemble without retraining of the basic modules.
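The gated mixture can be sketched as follows (our own toy illustration, with the "small network" reduced to a single logistic unit and made-up feature and weight values):

```python
# Per-time-step mixture of a neural LM and an n-gram LM:
# p = g(x) * p_neural + (1 - g(x)) * p_ngram, with g predicted from context.
import math

def gate(features, weights, bias):
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))   # logistic unit standing in for the net

def mix(p_neural, p_ngram, features, weights, bias):
    g = gate(features, weights, bias)
    return g * p_neural + (1.0 - g) * p_ngram

# At this time step the context features favour the neural model.
p = mix(p_neural=0.30, p_ngram=0.10,
        features=[1.0, 0.5], weights=[2.0, 1.0], bias=0.0)
```

Because only the gate's few parameters are trained on held-out data, the base models never need retraining, and the mixture weight adapts at every time step.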
Automatic Word problem solving has always posed a great challenge for the NLP community.
Usually a word problem is a narrative comprising a few sentences, and a question is asked about a quantity referred to in the sentences.
Solving a word problem involves reasoning across sentences, identifying the operations and their order and the relevant quantities, and discarding irrelevant quantities.
In this paper, we present a novel approach for automatic arithmetic word problem solving.
Our approach starts with frame identification.
Each frame can either be classified as a state or an action frame.
The frame identification is dependent on the verb in a sentence.
Every frame is unique and is identified by its slots.
The slots are filled using dependency parsed output of a sentence.
The slots are the entity holder, the entity, the quantity of the entity, the recipient, and additional information such as place and time.
The slots and frames help to identify the type of question asked and the entity referred to.
Action frames act on state frame(s) which causes a change in quantities of the state frames.
The frames are then used to build a graph where any change in quantities can be propagated to the neighboring nodes.
Most of the current solvers can only answer questions related to the quantity, while our system can answer different kinds of questions, like `who' and `what', in addition to the quantity-related `how many' questions.
This paper makes three major contributions: 1) a Frame Annotated Corpus (with a frame annotation tool), 2) a Frame Identification Module, and 3) a new, easily understandable framework for word problem solving.
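The quantity-propagation mechanism can be sketched with a toy state/action frame update (our own illustration; the slot names are simplified):

```python
# State frames hold (holder, entity) -> quantity; an action frame transfers
# quantity from one holder to another, propagating the change through the graph.

def apply_action(states, action):
    """action: (giver, recipient, entity, quantity)."""
    giver, recipient, entity, qty = action
    states[(giver, entity)] -= qty
    states[(recipient, entity)] = states.get((recipient, entity), 0) + qty
    return states

# "Tom has 5 apples. Tom gives 2 apples to Jane. How many does Jane have?"
states = {("Tom", "apples"): 5}
apply_action(states, ("Tom", "Jane", "apples", 2))
```

Because the state frames record holders as well as quantities, the same structure also answers `who' and `what' questions, not just `how many'.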
Model transformations are the cornerstone of Model-Driven Engineering, and provide the essential mechanisms for manipulating and transforming models.
Checking whether the output of a model transformation is correct is a manual and error-prone task; this is referred to as the oracle problem in the software testing literature.
The correctness of the model transformation program is crucial for the proper generation of its output, so it should be tested.
Metamorphic testing is a testing technique that alleviates the oracle problem by exploiting relations between different inputs and outputs of the program under test, so-called metamorphic relations.
In this paper we give an insight into our approach to generically defining metamorphic relations for model transformations, which can be automatically instantiated for any specific model transformation.
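A generic metamorphic relation can be sketched on a toy "model transformation" (our own illustration, not the paper's relations): instead of an oracle for the exact expected output, we check a relation between the outputs of two related inputs.

```python
# Toy model transformation: one output node per input class.
def transform(model):
    return ["node_" + cls for cls in model]

# Metamorphic relation: adding one class to the input model must add exactly
# one node to the output model, whatever the model's contents are.
def mr_adding_a_class_adds_one_node(model, new_class):
    before = transform(model)
    after = transform(model + [new_class])
    return len(after) == len(before) + 1

ok = mr_adding_a_class_adds_one_node(["Person", "Order"], "Invoice")
```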
We provide accurate upper bounds on the Boolean circuit complexity of the standard and the Karatsuba methods of integer multiplication.
Detecting PE malware files is now commonly approached using statistical and machine learning models.
While these models commonly use features extracted from the structure of PE files, we propose that icons from these files can also help better predict malware.
We propose an innovative machine learning approach to extract information from icons.
Our proposed approach consists of two steps: 1) extracting icon features using summary statistics, histograms of gradients (HOG), and a convolutional autoencoder, and 2) clustering icons based on the extracted icon features.
Using publicly available data and machine learning experiments, we show that our proposed icon clusters significantly boost the efficacy of malware prediction models.
In particular, our experiments show an average accuracy increase of 10% when icon clusters are used in the prediction model.
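The two-step pipeline described above (feature extraction followed by clustering) can be sketched in pure Python; the summary-statistic features and the deterministic k-means initialization below are simplified stand-ins for the HOG and autoencoder features used in the work:

```python
from statistics import mean, pstdev

def icon_features(pixels):
    """Toy summary-statistic features for a flat list of grayscale pixel values."""
    return (mean(pixels), pstdev(pixels), min(pixels), max(pixels))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on feature tuples; returns one cluster label per point."""
    # deterministic init: the first k distinct points become the initial centers
    centers = []
    for pt in points:
        if pt not in centers:
            centers.append(pt)
        if len(centers) == k:
            break
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(pt, centers[c])))
                  for pt in points]
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(mean(axis) for axis in zip(*members))
    return labels
```

The resulting cluster label can then be fed as a categorical feature into a downstream malware prediction model.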
Agent-Based Computing is a diverse research domain concerned with the building of intelligent software based on the concept of "agents".
In this paper, we use Scientometric analysis to analyze all sub-domains of agent-based computing.
Our data consists of 1,064 journal articles indexed in the ISI Web of Knowledge published during a twenty-year period: 1990-2010.
These were retrieved using a topic search with various keywords commonly used in sub-domains of agent-based computing.
In our proposed approach, we have employed a combination of two applications for analysis, namely Network Workbench and CiteSpace: Network Workbench allowed for the analysis of complex-network aspects of the domain, while detailed visualization-based analysis of the bibliographic data was performed using CiteSpace.
Our results include the identification of the largest cluster based on keywords, the timeline of publication of index terms, the core journals and key subject categories.
We also identify the core authors, top countries of origin of the manuscripts along with core research institutes.
Finally, our results have interestingly revealed the strong presence of agent-based computing in a number of non-computing related scientific domains including Life Sciences, Ecological Sciences and Social Sciences.
The recent progress of computing, machine learning, and especially deep learning, in image recognition has had a meaningful effect on the automatic detection of various diseases from chest X-ray images (CXRs).
Here, the efficiency of lung segmentation and bone shadow exclusion techniques is demonstrated for the analysis of 2D CXRs by a deep learning approach, to help radiologists identify suspicious lesions and nodules in lung cancer patients.
Training and validation were performed on the original JSRT dataset (dataset #01); the BSE-JSRT dataset, i.e., the same JSRT dataset but without clavicle and rib shadows (dataset #02); the original JSRT dataset after segmentation (dataset #03); and the BSE-JSRT dataset after segmentation (dataset #04).
The results demonstrate the high efficiency and usefulness of the considered pre-processing techniques, even in the simplified configuration.
The pre-processed dataset without bones (dataset #02) demonstrates much better accuracy and loss results in comparison to the other pre-processed datasets after lung segmentation (datasets #03 and #04).
In this paper, we aim to improve the state-of-the-art video generative adversarial networks (GANs) with a view towards multi-functional applications.
Our improved video GAN model does not separate foreground from background nor dynamic from static patterns, but learns to generate the entire video clip conjointly.
Our model can thus be trained to generate - and learn from - a broad set of videos with no restriction.
This is achieved by designing a robust one-stream video generation architecture with an extension of the state-of-the-art Wasserstein GAN framework that allows for better convergence.
The experimental results show that our improved video GAN model outperforms state-of-the-art video generative models on multiple challenging datasets.
Furthermore, we demonstrate the superiority of our model by successfully extending it to three challenging problems: video colorization, video inpainting, and future prediction.
To the best of our knowledge, this is the first work using GANs to colorize and inpaint video clips.
User modeling is a very important task for making relevant suggestions of venues to the users.
These suggestions are often based on matching the venues' features with the users' preferences, which can be collected from previously visited locations.
In this paper, we present a set of relevance scores for making personalized suggestions of points of interest.
These scores model each user by focusing on the different types of information extracted from venues that they have previously visited.
In particular, we focus on scores extracted from social information available on location-based social networks.
Our experiments, conducted on the dataset of the TREC Contextual Suggestion Track, show that social scores are more effective than scores based on the venues' content.
In the panoply of pattern classification techniques, few enjoy the intuitive appeal and simplicity of the nearest neighbor rule: given a set of samples in some metric domain space whose value under some function is known, we estimate the function anywhere in the domain by giving the value of the nearest sample per the metric.
More generally, one may use the modal value of the m nearest samples, where m is a fixed positive integer (although m=1 is known to be admissible in the sense that no larger value is asymptotically superior in terms of prediction error).
The nearest neighbor rule is nonparametric and extremely general, requiring in principle only that the domain be a metric space.
The classic paper on the technique, proving convergence under independent, identically-distributed (iid) sampling, is due to Cover and Hart (1967).
Because taking samples is costly, there has been much research in recent years on selective sampling, in which each sample is selected from a pool of candidates ranked by a heuristic; the heuristic tries to guess which candidate would be the most "informative" sample.
Lindenbaum et al. (2004) apply selective sampling to the nearest neighbor rule, but their approach sacrifices the austere generality of Cover and Hart; furthermore, their heuristic algorithm is complex and computationally expensive.
Here we report recent results that enable selective sampling in the original Cover-Hart setting.
Our results pose three selection heuristics and prove that their nearest neighbor rule predictions converge to the true pattern.
Two of the algorithms are computationally cheap, with complexity growing linearly in the number of samples.
We believe that these results constitute an important advance in the art.
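The m-nearest-neighbor rule described above requires nothing beyond a metric, which a short sketch makes explicit; this is a naive O(n log n) implementation for illustration, not one of the selection heuristics proposed in the work:

```python
from collections import Counter

def nn_predict(samples, query, metric, m=1):
    """Modal value of the m nearest labeled samples under the given metric.
    `samples` is a list of (point, label) pairs; any metric function works."""
    nearest = sorted(samples, key=lambda s: metric(s[0], query))[:m]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

For example, on the real line with the absolute-difference metric, `nn_predict([(0.0, 'a'), (1.0, 'b')], 0.3, lambda x, y: abs(x - y))` returns `'a'`.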
We propose a framework for localization and classification of masses in breast ultrasound (BUS) images.
We have experimentally found that training convolutional neural network based mass detectors with large, weakly annotated datasets presents a non-trivial problem, while overfitting may occur with those trained with small, strongly annotated datasets.
To overcome these problems, we use a weakly annotated dataset together with a smaller strongly annotated dataset in a hybrid manner.
We propose a systematic weakly and semi-supervised training scenario with appropriate training loss selection.
Experimental results show that the proposed method can successfully localize and classify masses with less annotation effort.
The results trained with only 10 strongly annotated images along with weakly annotated images were comparable to results trained with 800 strongly annotated images, with a 95% confidence interval for the difference of -3.00% to 5.00%, in terms of the correct localization (CorLoc) measure, i.e., the ratio of images whose intersection over union with the ground truth is higher than 0.5.
With the same number of strongly annotated images, additional weakly annotated images can be incorporated to give a 4.5% point increase in CorLoc, from 80.00% to 84.50% (with 95% confidence intervals 76.00%--83.75% and 81.00%--88.00%).
The effects of different algorithmic details and varied amount of data are presented through ablative analysis.
Time-varying renewable energy generation can result in serious under-/over-voltage conditions in future distribution grids.
Augmenting conventional utility-owned voltage regulating equipment with the reactive power capabilities of distributed generation units is a viable solution.
The goal here is local control options that attain globally optimal voltage regulation at fast convergence rates.
In this context, novel reactive power control rules are analyzed under a unifying linearized grid model.
For single-phase grids, our proximal gradient scheme has computational complexity comparable to that of the rule suggested by the IEEE 1547.8 standard, but it enjoys well-characterized convergence guarantees.
Adding memory to the scheme results in accelerated convergence.
For three-phase grids, it is shown that reactive injections have a counter-intuitive effect on bus voltage magnitudes across phases.
Nevertheless, when our control scheme is applied to unbalanced conditions, it is shown to reach an equilibrium point.
Yet this point may not correspond to the minimizer of a voltage regulation problem.
Numerical tests using the IEEE 13-bus, the IEEE 123-bus, and a Southern California Edison 47-bus feeder with increased renewable penetration verify the convergence properties of the schemes and their resiliency to grid topology reconfigurations.
This paper begins with a discussion of integration over probability types (p-types).
The paper then revisits 3 mainstay problems of classical (non-quantum) Shannon Information Theory (SIT): source coding without distortion, channel coding, and source coding with distortion.
The paper proves well-known, conventional results for each of these 3 problems.
However, the proofs given for these results are not conventional.
They are based on complex integration techniques (approximations obtained by applying the method of steepest descent to p-type integrals) instead of the usual delta-epsilon and typical-sequences arguments.
Another unconventional feature of this paper is that we make ample use of classical Bayesian networks (CB nets).
This paper showcases some of the benefits of using CB nets to do classical SIT.
A family of graphs optimized as the topologies for supercomputer interconnection networks is proposed.
The special requirements of such network topologies, namely minimal diameter and mean path length, are met by special constructions of the weight vectors in a representation of the symplectic algebra.
This theoretical design of topologies can conveniently reconstruct the mesh and hypercube graphs widely used as today's network topologies.
Our symplectic algebraic approach helps generate many classes of graphs suitable for network topologies.
The ability to map descriptions of scenes to 3D geometric representations has many applications in areas such as art, education, and robotics.
However, prior work on the text to 3D scene generation task has used manually specified object categories and language that identifies them.
We introduce a dataset of 3D scenes annotated with natural language descriptions and learn from this data how to ground textual descriptions to physical objects.
Our method successfully grounds a variety of lexical terms to concrete referents, and we show quantitatively that our method improves 3D scene generation over previous work using purely rule-based methods.
We evaluate the fidelity and plausibility of 3D scenes generated with our grounding approach through human judgments.
To ease evaluation on this task, we also introduce an automated metric that strongly correlates with human judgments.
Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images.
In this paper we advocate for explicitly modeling the partial order structure of this hierarchy.
Towards this goal, we introduce a general method for learning ordered representations, and show how it can be applied to a variety of tasks involving images and language.
We show that the resulting representations improve performance over current approaches for hypernym prediction and image-caption retrieval.
A system is described for exchanging encrypted short messages between computers which remain permanently isolated from any network accessible to the attacker.
The main advantage is effective protection of these computers from malware which could circumvent the encryption.
For transmission, the ciphertext is passed between isolated and connected computers in the form of a QR code, which is displayed on and scanned from a screen.
The security of qrypt0 therefore rests on the cryptography and the computer's physical isolation rather than on the computer security of the encrypting device.
Fault Tree Analysis (FTA) is a dependability analysis technique that has been widely used to predict reliability, availability and safety of many complex engineering systems.
Traditionally, these FTA-based analyses are done using paper-and-pencil proof methods or computer simulations, which cannot ascertain absolute correctness due to their inherent limitations.
As a complementary approach, we propose to use the higher-order-logic theorem prover HOL4 to conduct the FTA-based analysis of safety-critical systems where accuracy of failure analysis is a dire need.
In particular, the paper presents a higher-order-logic formalization of generic Fault Tree gates, i.e., AND, OR, NAND, NOR, XOR and NOT and the formal verification of their failure probability expressions.
Moreover, we have formally verified the generic probabilistic inclusion-exclusion principle, which is one of the foremost requirements for conducting the FTA-based failure analysis of any given system.
For illustration purposes, we conduct the FTA-based failure analysis of a solar array that is used as the main source of power for the Dong Fang Hong-3 (DFH-3) satellite.
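For reference, the failure probability expressions verified for the basic gates (assuming independent input failure events) take the following form; this plain Python sketch only mirrors the standard expressions and is not the HOL4 formalization itself:

```python
from math import prod

def and_gate(ps):
    """AND gate: the output event occurs only if every input event occurs."""
    return prod(ps)

def or_gate(ps):
    """OR gate: the output event occurs if at least one input event occurs."""
    return 1 - prod(1 - p for p in ps)

def not_gate(p):
    """NOT gate: the output event occurs iff the input event does not."""
    return 1 - p

def xor_gate(p1, p2):
    """XOR gate (two inputs): exactly one of the input events occurs."""
    return p1 * (1 - p2) + p2 * (1 - p1)
```

NAND and NOR follow by composing `not_gate` with `and_gate` and `or_gate`, respectively.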
We propose a new learning-based method for estimating 2D human pose from a single image, using Dual-Source Deep Convolutional Neural Networks (DS-CNN).
Recently, many methods have been developed to estimate human pose by using pose priors that are estimated from physiologically inspired graphical models or learned from a holistic perspective.
In this paper, we propose to integrate both the local (body) part appearance and the holistic view of each local part for more accurate human pose estimation.
Specifically, the proposed DS-CNN takes a set of image patches (category-independent object proposals for training and multi-scale sliding windows for testing) as the input and then learns the appearance of each local part by considering their holistic views in the full body.
Using DS-CNN, we achieve both joint detection, which determines whether an image patch contains a body joint, and joint localization, which finds the exact location of the joint in the image patch.
Finally, we develop an algorithm to combine these joint detection/localization results from all the image patches for estimating the human pose.
The experimental results show the effectiveness of the proposed method by comparing to the state-of-the-art human-pose estimation methods based on pose priors that are estimated from physiologically inspired graphical models or learned from a holistic perspective.
We have recently seen many successful applications of recurrent neural networks (RNNs) on electronic medical records (EMRs), which contain histories of patients' diagnoses, medications, and other various events, in order to predict the current and future states of patients.
Despite the strong performance of RNNs, it is often challenging for users to understand why the model makes a particular prediction.
The black-box nature of RNNs can impede their wide adoption in clinical practice.
Furthermore, we have no established methods to interactively leverage users' domain expertise and prior knowledge as inputs for steering the model.
Therefore, our design study aims to provide a visual analytics solution to increase interpretability and interactivity of RNNs via a joint effort of medical experts, artificial intelligence scientists, and visual analytics researchers.
Following the iterative design process between the experts, we design, implement, and evaluate a visual analytics tool called RetainVis, which couples a newly improved, interpretable and interactive RNN-based model called RetainEX and visualizations for users' exploration of EMR data in the context of prediction tasks.
Our study shows the effective use of RetainVis for gaining insights into how individual medical codes contribute to making risk predictions, using EMRs of patients with heart failure and cataract symptoms.
Our study also demonstrates how we made substantial changes to the state-of-the-art RNN model called RETAIN in order to make use of temporal information and increase interactivity.
This study will provide a useful guideline for researchers that aim to design an interpretable and interactive visual analytics tool for RNNs.
Optimal data partitioning in parallel and distributed implementations of clustering algorithms is a necessary computation, as it ensures independent task completion, fair distribution, a smaller number of affected points, and better and faster merging.
Though partitioning using a Kd-tree is conventionally used in academia, it suffers from performance drops and bias (unequal distribution) as the dimensionality of the data increases, and hence is not suitable for practical use in industry, where dimensionality can be of the order of hundreds to thousands.
To address these issues, we propose two new partitioning techniques using existing mathematical models and study their feasibility, performance (bias and partitioning speed), and possible variants in choosing initial seeds.
The first method uses an n-dimensional hashed-grid-based approach, mapping the points in space to a set of cubes that hash the points.
The second method uses a tree of Voronoi planes, where each plane corresponds to a partition.
We found that the grid-based approach was computationally impractical, while using a tree of Voronoi planes (with scalable K-Means++ initial seeds) drastically outperformed the Kd-tree method as dimensionality increased.
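The first (hashed-grid) technique can be sketched in a few lines; the fixed cell size and dictionary-based hash here are illustrative choices, not the exact construction studied in the work:

```python
from collections import defaultdict

def grid_partition(points, cell_size):
    """Hash each n-dimensional point to the grid cell (cube) that contains it.
    Returns a mapping from integer cell coordinates to the points in that cell."""
    cells = defaultdict(list)
    for pt in points:
        key = tuple(int(coord // cell_size) for coord in pt)
        cells[key].append(pt)
    return cells
```

Each cell then becomes one partition; as noted above, this approach becomes impractical as dimensionality grows, since the number of candidate cells explodes exponentially with the number of dimensions.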
Context: This paper presents the concept of open programming language interpreters and the implementation of a framework-level metaobject protocol (MOP) to support them.
Inquiry: We address the problem of dynamic interpreter adaptation to tailor the interpreter's behavior on the task to be solved and to introduce new features to fulfill unforeseen requirements.
Many languages provide a MOP that to some degree supports reflection.
However, MOPs are typically language-specific, their reflective functionality is often restricted, and the adaptation and application logic are often mixed, which hinders the understanding and maintenance of the source code.
Our system overcomes these limitations.
Approach: We designed and implemented a system to support open programming language interpreters.
The prototype implementation is integrated in the Neverlang framework.
The system exposes the structure, behavior and the runtime state of any Neverlang-based interpreter with the ability to modify it.
Knowledge: Our system provides complete control over an interpreter's structure, behavior, and runtime state.
The approach is applicable to every Neverlang-based interpreter.
Adaptation code can potentially be reused across different language implementations.
Grounding: Having a prototype implementation we focused on feasibility evaluation.
The paper shows that our approach effectively addresses problems commonly found in the research literature.
We have a demonstrative video and examples that illustrate our approach on dynamic software adaptation, aspect-oriented programming, debugging and context-aware interpreters.
Importance: To our knowledge, our paper presents the first reflective approach targeting a general framework for language development.
Our system provides full reflective support for free to any Neverlang-based interpreter.
We are not aware of any prior application of open implementations to programming language interpreters in the sense defined in this paper.
Rather than substituting other approaches, we believe our system can be used as a complementary technique in situations where other approaches present serious limitations.
Unpaired image-to-image translation is the problem of mapping an image in the source domain to one in the target domain, without requiring corresponding image pairs.
To ensure the translated images are realistically plausible, recent works, such as Cycle-GAN, demand that this mapping be invertible.
While this requirement demonstrates promising results when the domains are unimodal, its performance is unpredictable in multi-modal scenarios such as image segmentation tasks.
This is because invertibility does not necessarily enforce semantic correctness.
To this end, we present a semantically-consistent GAN framework, dubbed Sem-GAN, in which the semantics are defined by the class identities of image segments in the source domain as produced by a semantic segmentation algorithm.
Our proposed framework includes consistency constraints on the translation task that, together with the GAN loss and the cycle-constraints, enforces that the images when translated will inherit the appearances of the target domain, while (approximately) maintaining their identities from the source domain.
We present experiments on several image-to-image translation tasks and demonstrate that Sem-GAN improves the quality of the translated images significantly, sometimes by more than 20% on the FCN score.
Further, we show that semantic segmentation models trained with synthetic images translated via Sem-GAN lead to significantly better segmentation results than other variants.
As the popularity of electric vehicles increases, the demand for more power can increase more rapidly than our ability to install additional generating capacity.
In the long term we expect that the supply and demand will become balanced.
However, in the interim the rate at which electric vehicles can be deployed will depend on our ability to charge these vehicles without inconveniencing their owners.
In this paper, we investigate using fairness mechanisms to distribute power to electric vehicles on a smart grid.
We assume that during peak demand there is insufficient power to charge all the vehicles simultaneously.
In each five minute interval of time we select a subset of the vehicles to charge, based upon information about the vehicles.
We evaluate the selection mechanisms using published data on the current demand for electric power as a function of time of day, current driving habits for commuting, and the current rates at which electric vehicles can be charged on home outlets.
We found that conventional selection strategies, such as first-come-first-served or round robin, may delay a significant fraction of the vehicles by more than two hours, even when the total available power over the course of a day is two or three times the power required by the vehicles.
However, a selection mechanism that minimizes the maximum delay can reduce the delays to a few minutes, even when the capacity available for charging electric vehicles exceeds their requirements by as little as 5%.
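A toy version of such a min-max-delay selection rule simply charges, in each interval, the vehicles that have waited longest; this is a greedy sketch, and the mechanism evaluated above may differ in detail:

```python
def select_to_charge(waiting, capacity):
    """Greedy min-max-delay heuristic: pick the `capacity` vehicles that have
    accumulated the largest delay. `waiting` maps vehicle id -> delay so far."""
    return sorted(waiting, key=waiting.get, reverse=True)[:capacity]
```

Run once per five-minute interval, incrementing the recorded delay of every vehicle that was not selected.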
A new certification authority authorization (CAA) resource record for the domain name system (DNS) was standardized in 2013.
Motivated by the later 2017 decision to enforce mandatory CAA checking for most certificate authorities, this paper surveys the early adoption of CAA by using an empirical sample collected from Alexa's top-million domains.
According to the results, (i) the adoption of CAA is still at a modest level; only a little below two percent of the popular domains sampled have adopted CAA.
Among the domains that have adopted CAA, (ii) authorizations dealing with wildcard certificates are rare compared to conventional certificates.
Interestingly, (iii) the results only partially reflect the market structure of the global certificate business.
With these timely results, the paper contributes to the ongoing large-scale empirical research on the use of encryption technologies.
This paper proposes a parallel optimization algorithm for cooperative automation of large-scale connected vehicles.
The task of cooperative automation is formulated as a centralized optimization problem taking the whole decision space of all vehicles into account.
Considering the uncertainty of the environment, the problem is solved in a receding horizon fashion.
Then, we employ the alternating direction method of multipliers (ADMM) to solve the centralized optimization in a parallel way, which scales more favorably to large-scale instances.
Also, a Taylor series expansion is used to linearize the nonconvex coupling collision-avoidance constraints among interacting vehicles.
Simulations with two typical traffic scenes for multiple vehicles demonstrate the effectiveness and efficiency of our method.
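The parallel structure that ADMM provides can be illustrated on a toy consensus problem, minimizing the sum of (x - a_i)^2 over a shared scalar x, whose solution is the mean of the a_i; this sketch stands in for the vehicles' coupled trajectory optimization, which is far richer:

```python
def consensus_admm(a, rho=1.0, iters=100):
    """Consensus ADMM for min_x sum_i (x - a_i)^2; converges to mean(a)."""
    n = len(a)
    z = 0.0                 # consensus variable
    u = [0.0] * n           # scaled dual variables
    for _ in range(iters):
        # local x-updates: each agent solves its own small problem in parallel
        # x_i = argmin (x - a_i)^2 + (rho/2)(x - z + u_i)^2
        x = [(2 * a_i + rho * (z - u_i)) / (2 + rho) for a_i, u_i in zip(a, u)]
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n   # averaging step
        u = [u_i + x_i - z for u_i, x_i in zip(u, x)]      # dual updates
    return z
```

The key point mirrored here is that the x-updates are independent across agents, so they scale to large-scale instances by running in parallel.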
This paper considers the design of the beamformers for a multiple-input single-output (MISO) downlink system that seeks to mitigate the impact of the imperfections in the channel state information (CSI) that is available at the base station (BS).
The goal of the design is to minimize the outage probability of specified signal-to-interference-and-noise ratio (SINR) targets, while satisfying per-antenna power constraints (PAPCs), and to do so at a low computational cost.
Based on insights from the offset maximization technique for robust beamforming, and observations regarding the structure of the optimality conditions, low-complexity iterative algorithms that involve the evaluation of closed-form expressions are developed.
To further reduce the computational cost, algorithms are developed for per-antenna power-constrained variants of the zero-forcing (ZF) and maximum ratio transmission (MRT) beamforming directions.
In the MRT case, our low-complexity version for systems with a large number of antennas may be of independent interest.
The proposed algorithms are extended to systems with both PAPCs and a total power constraint.
Simulation results show that the proposed robust designs can provide substantial gains in the outage probability while satisfying the PAPCs.
Flow fields are often represented by a set of static arrows in popular-science illustrations, documentary films, meteorology, etc.
This simple schematic representation lets an observer intuitively interpret the main properties of a flow: its orientation and velocity magnitude.
We propose to generate dynamic versions of such representations for 2D unsteady flow fields.
Our algorithm smoothly animates arrows along the flow while controlling their density in the domain over time.
Several strategies have been combined to lower the unavoidable popping artifacts arising when arrows appear and disappear and to achieve visually pleasing animations.
Disturbing arrow rotations in low velocity regions are also handled by continuously morphing arrow glyphs to semi-transparent discs.
To substantiate our method, we provide results for synthetic and real velocity field datasets.
Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do large-scale analysis simultaneously.
Typically, users deploy Data-Intensive Workflows (DIWs) for their analytical tasks.
These DIWs of different users share many common parts (i.e., 50-80%), which can be materialized for reuse in future executions.
The materialization improves the overall processing time of DIWs and also saves computational resources.
Current solutions for materialization store data on Distributed File Systems (DFS) by using a fixed data format.
However, a fixed choice might not be the optimal one for every situation.
For example, it is well-known that different data fragmentation strategies (i.e., horizontal, vertical or hybrid) behave better or worse according to the access patterns of the subsequent operations.
In this paper, we present a cost-based approach which helps decide the most appropriate storage format for each situation.
A generic cost-based storage format selector framework considering the three fragmentation strategies is presented.
Then, we use our framework to instantiate cost models for specific Hadoop data formats (namely SequenceFile, Avro and Parquet), and test it with realistic use cases.
Our solution gives on average a 33% speedup over SequenceFile, an 11% speedup over Avro, and a 32% speedup over Parquet; overall, it provides up to a 25% performance gain.
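At its core, a cost-based selector of this kind evaluates each candidate format against the expected workload and keeps the cheapest; the format names below match those discussed above, but the cost numbers in the example are purely hypothetical:

```python
def choose_format(costs, workload):
    """Pick the storage format with the lowest total estimated cost.
    `costs[fmt][op]` is the estimated per-operation cost of format `fmt`;
    `workload[op]` is how many times operation `op` will run."""
    def total(fmt):
        return sum(costs[fmt].get(op, 0.0) * n for op, n in workload.items())
    return min(costs, key=total)
```

For a projection-heavy workload a columnar format such as Parquet would typically win, while for full scans a row format such as SequenceFile can be cheaper; the cost model supplies these per-operation estimates.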
In recent years, online communities have formed around suicide and self-harm prevention.
While these communities offer support in moments of crisis, they can also normalize harmful behavior, discourage professional treatment, and instigate suicidal ideation.
In this work, we focus on how interaction with others in such a community affects the mental state of users who are seeking support.
We first build a dataset of conversation threads between users in a distressed state and community members offering support.
We then show how to construct a classifier to predict whether distressed users are helped or harmed by the interactions in the thread, and we achieve a macro-F1 score of up to 0.69.
Mobile gaming has emerged as a promising market with billion-dollar revenues.
A variety of mobile game platforms and services have been developed around the world.
One critical challenge for these platforms and services is to understand user churn behavior in mobile games.
Accurate churn prediction will benefit many stakeholders such as game developers, advertisers, and platform operators.
In this paper, we present the first large-scale churn prediction solution for mobile games.
In view of the common limitations of the state-of-the-art methods built upon traditional machine learning models, we devise a novel semi-supervised and inductive embedding model that jointly learns the prediction function and the embedding function for user-app relationships.
We model these two functions by deep neural networks with a unique edge embedding technique that is able to capture both contextual information and relationship dynamics.
We also design a novel attributed random walk technique that takes into consideration both topological adjacency and attribute similarities.
To evaluate the performance of our solution, we collect real-world data from the Samsung Game Launcher platform that includes tens of thousands of games and hundreds of millions of user-app interactions.
The experimental results with this data demonstrate the superiority of our proposed model against existing state-of-the-art methods.
We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development.
We propose a fine-grained analysis of state-of-the-art methods based on key aspects of this framework: gradient estimation, value prediction, optimization landscapes, and trust region enforcement.
We find that from this perspective, the behavior of deep policy gradient algorithms often deviates from what their motivating framework would predict.
Our analysis suggests first steps towards solidifying the foundations of these algorithms, and in particular indicates that we may need to move beyond the current benchmark-centric evaluation methodology.
In this work, we define a parabolic equation on digital spaces and study its properties.
The equation can be used in investigation of mechanical, aerodynamic, structural and technological properties of a Moebius strip, which is used as a basic element of a new configuration of an airplane wing.
Conditions for the existence of exact solutions are studied and determined using a matrix method and the method of separation of variables.
As examples, numerical solutions on the Moebius strip and the projective plane are presented.
The key idea of variational auto-encoders (VAEs) resembles that of traditional auto-encoder models in which spatial information is supposed to be explicitly encoded in the latent space.
However, the latent variables in VAEs are vectors, which can be interpreted as multiple feature maps of size 1x1.
Such representations can only convey spatial information implicitly when coupled with powerful decoders.
In this work, we propose spatial VAEs that use feature maps of larger size as latent variables to explicitly capture spatial information.
This is achieved by allowing the latent variables to be sampled from matrix-variate normal (MVN) distributions whose parameters are computed from the encoder network.
To increase dependencies among locations on latent feature maps and reduce the number of parameters, we further propose spatial VAEs via low-rank MVN distributions.
Experimental results show that the proposed spatial VAEs outperform original VAEs in capturing rich structural and spatial information.
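The matrix-variate sampling step underlying such spatial latent maps can be sketched as follows; the factor shapes and names are illustrative assumptions, since in the paper the MVN parameters (and their low-rank factors) are predicted by the encoder network.

```python
import numpy as np

def sample_matrix_normal(M, A, B, rng):
    """Draw X ~ MN(M, A A^T, B B^T) via X = M + A Z B^T with Z iid N(0, 1).

    A A^T is the row covariance and B B^T the column covariance of the latent
    feature map; choosing tall-thin A and B gives a low-rank variant that cuts
    the parameter count while keeping dependencies among map locations."""
    Z = rng.standard_normal((A.shape[1], B.shape[1]))
    return M + A @ Z @ B.T
```

With A = B = I this reduces to an ordinary i.i.d. Gaussian feature map, so the low-rank factors are precisely what introduces correlations across latent locations.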
Background: Clinical decision support systems (CDSS) are a category of health information technologies that can assist clinicians to choose optimal treatments.
These support systems are based on clinical trials and expert knowledge; however, the amount of data available to these systems is limited.
For this reason, CDSSs could be significantly improved by using the knowledge obtained by treating patients.
This knowledge is mainly contained in patient records, whose usage is restricted due to privacy and confidentiality constraints.
Methods: A treatment effectiveness measure, containing valuable information for treatment prescription, was defined and a method to extract this measure from patient records was developed.
This method uses an advanced cryptographic technology, known as secure Multiparty Computation (henceforth referred to as MPC), to preserve the privacy of the patient records and the confidentiality of the clinicians' decisions.
Results: Our solution makes it possible to compute the effectiveness measure of a treatment from patient records while preserving privacy.
Moreover, clinicians are not burdened with the computational and communication costs introduced by the privacy-preserving techniques that are used.
Our system is able to compute the effectiveness of 100 treatments for a specific patient in less than 24 minutes, querying a database containing 20,000 patient records.
Conclusion: This paper presents a novel and efficient clinical decision support system that harnesses the potential of, and insights acquired from, treatment data, while preserving the privacy of patient records and the confidentiality of clinician decisions.
Computational models for sarcasm detection have often relied on the content of utterances in isolation.
However, the speaker's sarcastic intent is not always apparent without additional context.
Focusing on social media discussions, we investigate three issues: (1) does modeling conversation context help in sarcasm detection; (2) can we identify what part of conversation context triggered the sarcastic reply; and (3) given a sarcastic post that contains multiple sentences, can we identify the specific sentence that is sarcastic.
To address the first issue, we investigate several types of Long Short-Term Memory (LSTM) networks that can model both the conversation context and the current turn.
We show that LSTM networks with sentence-level attention on context and current turn, as well as the conditional LSTM network (Rocktäschel et al., 2016), outperform the LSTM model that reads only the current turn.
As conversation context, we consider the prior turn, the succeeding turn or both.
Our computational models are tested on two types of social media platforms: Twitter and discussion forums.
We discuss several differences between these datasets ranging from their size to the nature of the gold-label annotations.
To address the last two issues, we present a qualitative analysis of attention weights produced by the LSTM models (with attention) and discuss the results compared with human performance on the two tasks.
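The sentence-level attention whose weights that qualitative analysis inspects can be sketched as follows; the learned scoring vector u stands in for the model's trained parameters and is an illustrative assumption.

```python
import numpy as np

def sentence_attention(H, u):
    """Attend over sentence vectors H (num_sentences x dim) with a scoring
    vector u: softmax the scores, return the weighted context vector and the
    attention weights. The weights indicate which sentence (e.g., which part
    of the conversation context) the model focused on."""
    scores = H @ u
    e = np.exp(scores - scores.max())  # numerically stable softmax
    weights = e / e.sum()
    return weights @ H, weights
```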
We propose a measure and a metric on the sets of infinite traces generated by a set of atomic propositions.
To compute these quantities, we first map properties to subsets of the real numbers and then take the Lebesgue measure of the resulting sets.
We analyze how this measure is computed for Linear Temporal Logic (LTL) formulas.
An implementation for computing the measure of bounded LTL properties is provided and explained.
This implementation leverages SAT model counting and performs independence checks on subexpressions to compute the measure and metric compositionally.
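For intuition, a brute-force version of this computation, with exhaustive enumeration standing in for SAT model counting, can be sketched as follows: each k-step trace prefix determines a cylinder set of equal measure, so the measure of a bounded property is just the satisfying fraction.

```python
from itertools import product

def measure_bounded(prop, num_aps, k):
    """Measure of a k-step bounded property over num_aps atomic propositions.

    Under the uniform measure on infinite traces, every length-k prefix
    determines a cylinder set of measure 2**-(num_aps * k); the property's
    measure is therefore (satisfying prefixes) / (total prefixes)."""
    valuations = list(product([False, True], repeat=num_aps))
    sat = sum(1 for trace in product(valuations, repeat=k) if prop(trace))
    return sat / len(valuations) ** k
```

For example, "eventually p within 3 steps" over a single proposition has measure 1 - (1/2)^3 = 7/8.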
We present sum-set inequalities specialized to the generalized degrees of freedom (GDoF) framework.
These are information theoretic lower bounds on the entropy of bounded density linear combinations of discrete, power-limited dependent random variables in terms of the joint entropies of arbitrary linear combinations of new random variables that are obtained by power level partitioning of the original random variables.
These bounds generalize the aligned image sets approach, and are useful instruments to obtain GDoF characterizations for wireless networks, especially with multiple antenna nodes, subject to arbitrary channel strength and channel uncertainty levels.
To demonstrate the utility of these bounds, we consider a non-trivial instance of wireless networks: a two-user interference channel with different numbers of antennas at each node and different levels of partial channel knowledge available to the transmitters.
We obtain a tight GDoF characterization for a specific instance of this channel with the aid of sum-set inequalities.
Loss-of-thrust emergencies, e.g., induced by bird/drone strikes or fuel exhaustion, create the need for dynamic data-driven flight trajectory planning to advise pilots or control UAVs.
While total loss of thrust trajectories to nearby airports can be pre-computed for all initial points in a 3D flight plan, dynamic aspects such as partial power and airplane surface damage must be considered for accuracy.
In this paper, we propose a new Dynamic Data-Driven Avionics Software (DDDAS) approach that updates a damaged aircraft performance model during flight, which is in turn used to generate plausible flight trajectories to a safe landing site.
Our damaged aircraft model is parameterized on a baseline glide ratio for a clean aircraft configuration assuming best gliding airspeed on straight flight.
The model predicts purely geometric criteria for flight trajectory generation, namely, glide ratio and turn radius for different bank angles and drag configurations.
Given actual aircraft performance data, we dynamically infer the baseline glide ratio to update the damaged aircraft model.
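Under simple assumptions these geometric criteria reduce to two closed-form quantities, sketched below; the cos(bank) scaling of the glide ratio is a common first-order approximation, not the paper's fitted model.

```python
import math

def turn_radius(v, bank_deg, g=9.81):
    """Turn radius (m) for a coordinated turn at speed v (m/s) and the given
    bank angle: R = v^2 / (g * tan(bank))."""
    return v**2 / (g * math.tan(math.radians(bank_deg)))

def banked_glide_ratio(baseline_ratio, bank_deg):
    """Effective glide ratio in a banked turn, assuming the extra lift needed
    in the bank raises induced drag so the ratio scales roughly with
    cos(bank). This scaling is an illustrative approximation."""
    return baseline_ratio * math.cos(math.radians(bank_deg))
```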
Our new flight trajectory generation algorithm thus can significantly improve upon prior Dubins based trajectory generation work by considering these data-driven geometric criteria.
We further introduce a trajectory utility function to rank trajectories for safety.
As a use case, we consider the Hudson River ditching of US Airways 1549 in January 2009 using a flight simulator to evaluate our trajectories and to get sensor data.
In this case, a baseline glide ratio of 17.25:1 enabled us to generate trajectories up to 28 seconds after the bird strike, whereas a 19:1 baseline glide ratio enabled us to generate trajectories up to 36 seconds after the strike.
DDDAS can significantly improve the accuracy of generated flight trajectories thereby enabling better decision support systems for pilots in emergency conditions.
This paper proposes a deep learning method to address the challenging facial attractiveness prediction problem.
The method constructs a convolutional neural network for facial beauty prediction using a new deep cascaded fine-tuning scheme with various face input channels, such as the original RGB face image, the detail layer image, and the lighting layer image.
With a carefully designed CNN model of deep structure, large input size and small convolutional kernels, we have achieved a high prediction correlation of 0.88.
This result convinces us that facial attractiveness prediction can be solved by a deep learning approach; it also shows the important roles that facial smoothness, lightness, and color information play in facial beauty perception, which is consistent with recent psychology studies.
Furthermore, we analyze the high-level features learned by the CNN through visualization of its hidden layers, and observe some interesting phenomena.
We find that the contours and appearance of facial features, especially the eyes and mouth, are the most significant attributes for facial attractiveness prediction, which is also consistent with human visual perception intuition.
The Kaczmarz algorithm is popular for iteratively solving an overdetermined system of linear equations.
The traditional Kaczmarz algorithm can approximate the solution in a few sweeps through the equations, but a randomized version of the Kaczmarz algorithm was shown to converge exponentially, at a rate independent of the number of equations.
Recently, an algorithm for finding sparse solutions to a linear system of equations has been proposed, based on the weighted randomized Kaczmarz algorithm.
These algorithms solve the single measurement vector problem; however, there are applications where multiple measurements are available.
In this work, the objective is to solve a multiple measurement vector problem with common sparse support by modifying the randomized Kaczmarz algorithm.
We have also modeled the problem of face recognition from video as the multiple measurement vector problem and solved using our proposed technique.
We have compared the proposed algorithm with the state-of-the-art spectral projected gradient algorithm for multiple measurement vectors on both real and synthetic datasets.
The Monte Carlo simulations confirm that our proposed algorithm has better recovery and convergence rates than the MMV version of the spectral projected gradient algorithm in a fair comparison.
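The randomized Kaczmarz update being modified can be sketched as follows (single measurement vector, Strohmer-Vershynin row sampling); the MMV extension with common sparse support is the paper's contribution and is not reproduced here.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Randomized Kaczmarz for a consistent system Ax = b: at each step pick
    row i with probability proportional to ||a_i||^2 and project the iterate
    onto the hyperplane a_i^T x = b_i."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms = np.sum(A**2, axis=1)
    probs = row_norms / row_norms.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x
```

The per-iteration cost touches a single row, and the expected error contracts geometrically, which is the exponential convergence noted above.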
We investigate sparse representations for control in reinforcement learning.
While these representations are widely used in computer vision, their prevalence in reinforcement learning is limited to sparse coding where extracting representations for new data can be computationally intensive.
Here, we begin by demonstrating that learning a control policy incrementally with a representation from a standard neural network fails in classic control domains, whereas learning with a representation obtained from a neural network that has sparsity properties enforced is effective.
We provide evidence that the reason for this is that the sparse representation provides locality, and so avoids catastrophic interference, and particularly keeps consistent, stable values for bootstrapping.
We then discuss how to learn such sparse representations.
We explore the idea of Distributional Regularizers, where the activation of hidden nodes is encouraged to match a particular distribution that results in sparse activation across time.
We identify a simple but effective way to obtain sparse representations, not afforded by previously proposed strategies, making it more practical for further investigation into sparse representations for reinforcement learning.
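One common instantiation of such a Distributional Regularizer, penalizing each hidden unit's average activation for deviating from a Bernoulli target sparsity level, can be sketched as follows; the KL-to-Bernoulli form and coefficient names are assumptions, not the paper's exact formulation.

```python
import math

def kl_sparsity_penalty(mean_acts, rho=0.1, eps=1e-8):
    """Sum over hidden units of KL(Bernoulli(rho) || Bernoulli(q)), where q
    is the unit's average activation across time. The penalty is zero when
    each unit is active a fraction rho of the time, encouraging sparse
    activation without switching any unit off entirely."""
    total = 0.0
    for q in mean_acts:
        q = min(max(q, eps), 1 - eps)  # clamp for numerical safety
        total += rho * math.log(rho / q) + (1 - rho) * math.log((1 - rho) / (1 - q))
    return total
```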
Data pre-processing is one of the most time consuming and relevant steps in a data analysis process (e.g., classification task).
A given data pre-processing operator (e.g., transformation) can have positive, negative or zero impact on the final result of the analysis.
Expert users have the required knowledge to find the right pre-processing operators.
However, when it comes to non-experts, they are overwhelmed by the amount of pre-processing operators and it is challenging for them to find operators that would positively impact their analysis (e.g., increase the predictive accuracy of a classifier).
Existing solutions either assume that users have expert knowledge, or they recommend pre-processing operators that are only "syntactically" applicable to a dataset, without taking into account their impact on the final analysis.
In this work, we aim at providing assistance to non-expert users by recommending data pre-processing operators that are ranked according to their impact on the final analysis.
We developed a tool, PRESISTANT, that uses Random Forests to learn the impact of pre-processing operators on the performance (e.g., predictive accuracy) of five classification algorithms: J48, Naive Bayes, PART, Logistic Regression, and Nearest Neighbor.
Extensive evaluations of the recommendations provided by our tool show that PRESISTANT can effectively help non-experts achieve improved results in their analytical tasks.
It is well known that matched filtering and sampling (MFS) demodulation together with minimum Euclidean distance (MD) detection constitute the optimal receiver for the additive white Gaussian noise channel.
However, for a general nonlinear transmission medium, MFS does not provide sufficient statistics, and therefore is suboptimal.
Nonetheless, this receiver is widely used in optical systems, where the Kerr nonlinearity is the dominant impairment at high powers.
In this paper, we consider a suite of receivers for a two-user channel subject to a type of nonlinear interference that occurs in wavelength-division-multiplexed channels.
The asymptotes of the symbol error rate (SER) of the considered receivers at high powers are derived or bounded analytically.
Moreover, Monte-Carlo simulations are conducted to evaluate the SER for all the receivers.
Our results show that receivers based on MFS cannot achieve arbitrarily low SERs, whereas the SER goes to zero as the power grows for the optimal receiver.
Furthermore, we devise a heuristic demodulator, which together with the MD detector yields a receiver that is simpler than the optimal one and can achieve arbitrarily low SERs.
The SER performance of the proposed receivers is also evaluated for some single-span fiber-optical channels via split-step Fourier simulations.
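The minimum-distance detection stage shared by these receivers can be sketched as follows; what varies between them is the demodulator (matched-filter-based or heuristic) that produces the samples.

```python
import numpy as np

def md_detect(samples, constellation):
    """Minimum Euclidean distance detection: map each received sample to the
    index of the nearest constellation point. After matched filtering and
    sampling this is optimal for AWGN but, as discussed above, suboptimal for
    nonlinear channels."""
    d = np.abs(samples[:, None] - constellation[None, :])
    return np.argmin(d, axis=1)
```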
In this paper, we propose a new autonomous braking system based on deep reinforcement learning.
The proposed autonomous braking system automatically decides whether to apply the brake at each time step when confronting the risk of collision using the information on the obstacle obtained by the sensors.
The problem of designing brake control is formulated as searching for the optimal policy in a Markov decision process (MDP) model, where the state is given by the relative position of the obstacle and the vehicle's speed, and the action space is defined as whether or not the brake is applied.
The policy used for brake control is learned through computer simulations using the deep reinforcement learning method called deep Q-network (DQN).
In order to derive a desirable braking policy, we propose a reward function that balances the damage imposed on the obstacle in case of an accident against the reward earned when the vehicle escapes the risk as quickly as possible.
The DQN is trained for a scenario in which a vehicle encounters a pedestrian crossing an urban road.
Experiments show that the control agent exhibits desirable control behavior and avoids collision without any mistake in various uncertain environments.
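A reward of the kind described above can be sketched as follows; the functional form and all coefficient names are illustrative assumptions, not the paper's exact reward.

```python
def braking_reward(collision, impact_speed, risk_cleared, step,
                   damage_scale=1.0, escape_bonus=1.0, decay=0.99):
    """Reward balancing collision damage against quickly escaping the risk.

    - On collision: a large negative reward growing with impact speed, a
      proxy for the damage imposed on the pedestrian.
    - On clearing the risk: a positive reward that decays with the time step,
      so escaping sooner earns more.
    - Otherwise: zero."""
    if collision:
        return -damage_scale * impact_speed**2
    if risk_cleared:
        return escape_bonus * decay**step
    return 0.0
```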
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles.
We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations.
The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links.
Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features.
We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.
Let G denote a graph and let K be a subset of vertices of G designated as target vertices. The K-terminal reliability of G is defined as the probability that all target vertices in K are connected, considering the possible failures of non-target vertices of G. The problem of computing K-terminal reliability is known to be #P-complete for polygon-circle graphs and solvable in polynomial time for t-polygon graphs, which are a subclass of polygon-circle graphs.
The class of circle graphs is a subclass of polygon-circle graphs and a superclass of t-polygon graphs.
Therefore, the problem of computing K-terminal reliability for circle graphs is of particular interest.
This paper proves that the problem remains #P-complete even for circle graphs.
Additionally, this paper proposes a linear-time algorithm for solving the problem for proper circular-arc graphs, which are a subclass of circle graphs and a superclass of proper interval graphs.
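For intuition, the quantity in question can be computed exactly on tiny graphs by enumerating failures of non-target vertices; the exponential cost of this brute force is precisely what the #P-completeness result says cannot be avoided in general for circle graphs.

```python
from itertools import product

def k_terminal_reliability(n, edges, targets, p_up):
    """Probability that all target vertices are in one connected component
    when each non-target vertex survives independently with probability p_up
    (targets never fail). Exponential in the number of non-target vertices."""
    non_targets = [v for v in range(n) if v not in targets]
    rel = 0.0
    for states in product([True, False], repeat=len(non_targets)):
        up = set(targets) | {v for v, s in zip(non_targets, states) if s}
        prob = 1.0
        for s in states:
            prob *= p_up if s else (1 - p_up)
        # breadth-first search over the induced subgraph on surviving vertices
        adj = {v: set() for v in up}
        for u, w in edges:
            if u in up and w in up:
                adj[u].add(w)
                adj[w].add(u)
        t0 = next(iter(targets))
        seen, stack = {t0}, [t0]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if set(targets) <= seen:
            rel += prob
    return rel
```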
The labyrinth game is a simple yet challenging platform, not only for humans but also for control algorithms and systems.
The game is easy to understand but still very hard to master.
From a system point of view, the ball behaviour is in general easy to model but close to the obstacles there are severe non-linearities.
Additionally, the far from flat surface on which the ball rolls provides for changing dynamics depending on the ball position.
The general dynamics of the system can easily be handled by traditional automatic control methods.
Taking the obstacles and uneven surface into account would require very detailed models of the system.
Instead, a simple deterministic control algorithm is combined with a learning control method.
The simple control method provides initial training data.
As the learning method is trained, the system can learn from the results of its own actions and the performance improves well beyond the performance of the initial controller.
A vision system and image analysis is used to estimate the ball position while a combination of a PID controller and a learning controller based on LWPR is used to learn to navigate the ball through the maze.
Flow correlation is the core technique used in a multitude of deanonymization attacks on Tor.
Despite the importance of flow correlation attacks on Tor, existing flow correlation techniques are considered to be ineffective and unreliable in linking Tor flows when applied at a large scale, i.e., they suffer from high false positive rates or require impractically long flow observations to make reliable correlations.
In this paper, we show that, unfortunately, flow correlation attacks can be conducted on Tor traffic with drastically higher accuracies than before by leveraging emerging learning mechanisms.
We particularly design a system, called DeepCorr, that outperforms the state-of-the-art by significant margins in correlating Tor connections.
DeepCorr leverages an advanced deep learning architecture to learn a flow correlation function tailored to Tor's complex network; this is in contrast to previous works' use of generic statistical correlation metrics to correlate Tor flows.
We show that with moderate learning, DeepCorr can correlate Tor connections (and therefore break its anonymity) with accuracies significantly higher than existing algorithms, and using substantially shorter lengths of flow observations.
For instance, by collecting only about 900 packets of each target Tor flow (roughly 900KB of Tor data), DeepCorr provides a flow correlation accuracy of 96% compared to 4% by the state-of-the-art system of RAPTOR using the same exact setting.
We hope that our work demonstrates the escalating threat of flow correlation attacks on Tor given recent advances in learning algorithms, calling for the timely deployment of effective countermeasures by the Tor community.
Sequence-to-sequence models have shown promising improvements on the temporal task of video captioning, but they optimize word-level cross-entropy loss during training.
First, using policy gradient and mixed-loss methods for reinforcement learning, we directly optimize sentence-level task-based metrics (as rewards), achieving significant improvements over the baseline, based on both automatic metrics and human evaluation on multiple datasets.
Next, we propose a novel entailment-enhanced reward (CIDEnt) that corrects phrase-matching based metrics (such as CIDEr) to only allow for logically-implied partial matches and avoid contradictions, achieving further significant improvements over the CIDEr-reward model.
Overall, our CIDEnt-reward model achieves the new state-of-the-art on the MSR-VTT dataset.
In this work, we are interested in generalizing convolutional neural networks (CNNs) from low-dimensional regular grids, where image, video and speech are represented, to high-dimensional irregular domains, such as social networks, brain connectomes or words' embedding, represented by graphs.
We present a formulation of CNNs in the context of spectral graph theory, which provides the necessary mathematical background and efficient numerical schemes to design fast localized convolutional filters on graphs.
Importantly, the proposed technique offers the same linear computational complexity and constant learning complexity as classical CNNs, while being universal to any graph structure.
Experiments on MNIST and 20NEWS demonstrate the ability of this novel deep learning system to learn local, stationary, and compositional features on graphs.
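The fast localized filtering behind this formulation can be sketched as a K-th order Chebyshev polynomial in the rescaled graph Laplacian, applied via the three-term recurrence so that no eigendecomposition is needed at filtering time (the exact eigenvalue bound is computed here only for clarity; in practice it is estimated or bounded).

```python
import numpy as np

def cheb_filter(L, x, theta):
    """Apply the spectral filter sum_k theta[k] * T_k(L~) x, where
    L~ = 2 L / lmax - I is the rescaled Laplacian and the Chebyshev
    polynomials follow T_k = 2 L~ T_{k-1} - T_{k-2}. Each step is one
    (sparse) matrix-vector product, giving cost linear in the edges."""
    lmax = np.linalg.eigvalsh(L).max()
    Lt = 2.0 * L / lmax - np.eye(L.shape[0])
    Tkm2, Tkm1 = x, Lt @ x
    out = theta[0] * Tkm2
    if len(theta) > 1:
        out = out + theta[1] * Tkm1
    for k in range(2, len(theta)):
        Tk = 2 * Lt @ Tkm1 - Tkm2
        out = out + theta[k] * Tk
        Tkm2, Tkm1 = Tkm1, Tk
    return out
```

A K-th order filter is exactly K-hop localized on the graph, which is the "localized" property claimed above.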
In this paper, we present a novel way to summarize the structure of large graphs, based on non-parametric estimation of edge density in directed multigraphs.
Following a coclustering approach, we use a clustering of the vertices with a piecewise-constant estimation of the density of the edges across the clusters, and address the problem of automatically and reliably inferring the number of clusters, i.e., the granularity of the coclustering.
We use a model selection technique with data-dependent prior and obtain an exact evaluation criterion for the posterior probability of edge density estimation models.
We demonstrate, both theoretically and empirically, that our data-dependent modeling technique is consistent, resilient to noise, valid non-asymptotically, and asymptotically behaves as a universal approximator of the true edge density in directed multigraphs.
We evaluate our method using artificial graphs and present its practical interest on real world graphs.
The method is both robust and scalable.
It is able to extract insightful patterns in the unsupervised learning setting and to provide state-of-the-art accuracy when used as a preparation step for supervised learning.
Accurately determining dependency structure is critical to discovering a system's causal organization.
We recently showed that the transfer entropy fails in a key aspect of this---measuring information flow---due to its conflation of dyadic and polyadic relationships.
We extend this observation to demonstrate that this is true of all such Shannon information measures when used to analyze multivariate dependencies.
This has broad implications, particularly when employing information to express the organization and mechanisms embedded in complex systems, including the burgeoning efforts to combine complex network theory with information theory.
Here, we do not suggest that any aspect of information theory is wrong.
Rather, the vast majority of its informational measures are simply inadequate for determining the meaningful dependency structure within joint probability distributions.
Therefore, such information measures are inadequate for discovering intrinsic causal relations.
We close by demonstrating that such distributions exist across an arbitrary set of variables.
We investigate the connection between measure, capacity and algorithmic randomness for the space of closed sets.
For any computable measure m, a computable capacity T may be defined by letting T(Q) be the measure of the family of closed sets K which have nonempty intersection with Q.
We prove an effective version of Choquet's capacity theorem by showing that every computable capacity may be obtained from a computable measure in this way.
We establish conditions on the measure m that characterize when the capacity of an m-random closed set equals zero.
This includes new results in classical probability theory as well as results for algorithmic randomness.
For certain computable measures, we construct effectively closed sets with positive capacity and with Lebesgue measure zero.
We show that for computable measures, a real q is upper semi-computable if and only if there is an effectively closed set with capacity q.
How can we design a product or movie that will attract, for example, the interest of Pennsylvania adolescents or liberal newspaper critics?
What should be the genre of that movie and who should be in the cast?
In this work, we seek to identify how we can design new movies with features tailored to a specific user population.
We formulate the movie design as an optimization problem over the inference of user-feature scores and selection of the features that maximize the number of attracted users.
Our approach, PNP, is based on a heterogeneous, tripartite graph of users, movies and features (e.g., actors, directors, genres), where users rate movies and features contribute to movies.
We learn the preferences by leveraging user similarities defined through different types of relations, and show that our method outperforms state-of-the-art approaches, including matrix factorization and other heterogeneous graph-based analysis.
We evaluate PNP on publicly available real-world data and show that it is highly scalable and effectively provides movie designs oriented towards different groups of users, including men, women, and adolescents.
Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between the different image factors, such as the foreground, background and pose information.
In this work, we aim at generating such images based on a novel, two-stage reconstruction pipeline that learns a disentangled representation of the aforementioned image factors and generates novel person images at the same time.
First, a multi-branched reconstruction network is proposed to disentangle and encode the three factors into embedding features, which are then combined to re-compose the input image itself.
Second, three corresponding mapping functions are learned in an adversarial manner in order to map Gaussian noise to the learned embedding feature space, for each factor respectively.
Using the proposed framework, we can manipulate the foreground, background, and pose of the input image, and also sample new embedding features to generate such targeted manipulations, which provides more control over the generation process.
Experiments on Market-1501 and Deepfashion datasets show that our model does not only generate realistic person images with new foregrounds, backgrounds and poses, but also manipulates the generated factors and interpolates the in-between states.
Another set of experiments on Market-1501 shows that our model can also be beneficial for the person re-identification task.
Impact analysis is concerned with the identification of consequences of changes and is therefore an important activity for software evolution.
In model-based software development, models are core artifacts, which are often used to generate essential parts of a software system.
Changes to a model can thus substantially affect different artifacts of a software system.
In this paper, we propose a model-based approach to impact analysis, in which explicit impact rules can be specified in a domain-specific language (DSL).
These impact rules define the consequences of designated UML class diagram changes on software artifacts and the need for dependent activities such as data evolution.
The UML class diagram changes are identified automatically using model differencing.
The advantage of using explicit impact rules is that they enable the formalization of knowledge about a product.
By explicitly defining this knowledge, it is possible to create a checklist with hints about development steps that are (potentially) necessary to manage the evolution.
To validate the feasibility of our approach, we provide results of a case study.
In this paper, we present the system we used for the WASSA 2018 Implicit Emotion Shared Task.
The task is to predict the emotion of a tweet of which the explicit mentions of emotion terms have been removed.
The idea is to come up with a model which has the ability to implicitly identify the emotion expressed given the context words.
We have used a Gated Recurrent Unit (GRU) network and a Capsule Network based model for the task.
Pre-trained word embeddings have been utilized to incorporate contextual knowledge about words into the model.
The GRU layer learns latent representations from the input word embeddings.
A subsequent Capsule Network layer learns high-level features from that hidden representation.
The proposed model managed to achieve a macro-F1 score of 0.692.
A developmental disorder that severely impairs communicative and social functions, Autism Spectrum Disorder (ASD) also presents aspects related to mental rigidity, repetitive behavior, and difficulty with abstract reasoning.
Moreover, imbalances between excitatory and inhibitory brain states, in addition to cortical connectivity disruptions, are at the source of autistic behavior.
Our main goal is to unveil how these local excitatory imbalances and/or disruptions of long brain connections are linked to the above-mentioned cognitive features.
We developed a theoretical model based on Self-Organizing Maps (SOM), where a three-level artificial neural network qualitatively incorporates these kinds of alterations observed in brains of patients with ASD.
Computational simulations of our model indicate that high excitatory states or long-distance under-connectivity are at the origin of cognitive alterations such as difficulty in categorization and mental rigidity.
More specifically, the enlargement of excitatory synaptic reach areas during cortical map development leads to poor categorization (over-selectivity) and poor concept formation.
Both the over-strengthening of local excitatory synapses and long-distance under-connectivity, although through distinct mechanisms, contribute to impaired categorization (under-selectivity) and mental rigidity.
Our results indicate how local and global brain connectivity alterations together give rise to impaired cortical structures in distinct ways and in distinct cortical areas.
These alterations would disrupt the codification of sensory stimuli and the representation of concepts, and thus the process of categorization, thereby imposing serious limits on mental flexibility and on the capacity for generalization in autistic reasoning.
In a world of pervasive cameras, public spaces are often captured from multiple perspectives by cameras of different types, both fixed and mobile.
An important problem is to organize these heterogeneous collections of videos by finding connections between them, such as identifying correspondences between the people appearing in the videos and the people holding or wearing the cameras.
In this paper, we wish to solve two specific problems: (1) given two or more synchronized third-person videos of a scene, produce a pixel-level segmentation of each visible person and identify corresponding people across different views (i.e., determine who in camera A corresponds with whom in camera B), and (2) given one or more synchronized third-person videos as well as a first-person video taken by a mobile or wearable camera, segment and identify the camera wearer in the third-person videos.
Unlike previous work which requires ground truth bounding boxes to estimate the correspondences, we perform person segmentation and identification jointly.
We find that solving these two problems simultaneously is mutually beneficial, because better fine-grained segmentation allows us to better perform matching across views, and information from multiple views helps us perform more accurate segmentation.
We evaluate our approach on two challenging datasets of interacting people captured from multiple wearable cameras, and show that our proposed method performs significantly better than the state-of-the-art on both person segmentation and identification.
This is the preprint version of our paper at the 2015 9th International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth2015).
An assistive training tool for the rehabilitation of dysphonic patients is evaluated based on practical clinical feedback from treatments.
One stroke patient and one Parkinson's patient provided earnest suggestions for improving our tool.
The assistive tool employs a serious game as its engaging front end and runs on a tablet with a normal microphone as the input device.
Seven pitch estimation algorithms have been evaluated and compared on a selected patient voice database.
A series of benchmarks have been generated during the evaluation process for technology selection.
This paper explores the problem of breast tissue classification of microscopy images.
Based on the predominant cancer type, the goal is to classify images into four categories: normal, benign, in situ carcinoma, and invasive carcinoma.
Given a suitable training dataset, we utilize deep learning techniques to address the classification problem.
Due to the large size of each image in the training dataset, we propose a patch-based technique which consists of two consecutive convolutional neural networks.
The first "patch-wise" network acts as an auto-encoder that extracts the most salient features of image patches while the second "image-wise" network performs classification of the whole image.
The first network is pre-trained and aimed at extracting local information while the second network obtains global information of an input image.
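The patch-based pipeline can be sketched in miniature. The feature extractor and classifier below are trivial stand-ins for the paper's two CNNs, and the patch size, stride, and pooling are illustrative assumptions rather than the authors' settings:

```python
import numpy as np

def extract_patches(image, patch_size, stride):
    """Slide a square window over a 2-D image and collect patches."""
    h, w = image.shape
    patches = []
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)

def patch_wise_features(patches):
    """Stand-in for the pre-trained 'patch-wise' encoder: one feature
    vector per patch (here just mean and std of pixel intensities)."""
    return np.stack([[p.mean(), p.std()] for p in patches])

def image_wise_predict(features, n_classes=4):
    """Stand-in for the 'image-wise' network: pool patch features into
    global information, then map them to class scores with a fixed
    random linear head."""
    pooled = features.mean(axis=0)
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(n_classes, pooled.size))
    return int(np.argmax(weights @ pooled))

image = np.random.default_rng(1).random((64, 64))
patches = extract_patches(image, patch_size=16, stride=16)
print(patches.shape)   # (16, 16, 16): a 4x4 grid of 16x16 patches
label = image_wise_predict(patch_wise_features(patches))
```

The key idea carried over from the text is the division of labor: local information is extracted per patch, global information by aggregating over the whole image.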
We trained the networks using the ICIAR 2018 grand challenge on BreAst Cancer Histology (BACH) dataset.
The proposed method yields 95% accuracy on the validation set, compared to the previously reported 77% accuracy in the literature.
Our code is publicly available at https://github.com/ImagingLab/ICIAR2018
This paper focuses on a class of important two-hop relay mobile ad hoc networks (MANETs) with limited-buffer constraint and any mobility model that leads to the uniform distribution of the locations of nodes in steady state, and develops a general theoretical framework for the end-to-end (E2E) delay modeling there.
We first combine the theories of Fixed-Point, Quasi-Birth-and-Death process and embedded Markov chain to model the limiting distribution of the occupancy states of a relay buffer, and then apply the absorbing Markov chain theory to characterize the packet delivery process, such that a complete theoretical framework is developed for the E2E delay analysis.
With the help of this framework, we derive a general and exact expression for the E2E delay based on the modeling of both packet queuing delay and delivery delay.
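The absorbing-Markov-chain step can be illustrated with a toy chain. The states and transition probabilities below are hypothetical, not taken from the paper, but the fundamental-matrix computation is the standard one for expected time to absorption:

```python
import numpy as np

# Toy absorbing Markov chain for a packet delivery process: two
# transient states represent relay-buffer occupancy, one absorbing
# state means "packet delivered". Probabilities are illustrative.
P = np.array([
    [0.6, 0.3, 0.1],   # state 0 -> {0, 1, delivered}
    [0.2, 0.5, 0.3],   # state 1 -> {0, 1, delivered}
    [0.0, 0.0, 1.0],   # delivered (absorbing)
])

Q = P[:2, :2]                       # transient-to-transient block
N = np.linalg.inv(np.eye(2) - Q)    # fundamental matrix
t = N @ np.ones(2)                  # expected steps to absorption
print(t)                            # ≈ [5.71, 4.29]
```

Here `t[i]` is the expected number of steps until delivery starting from transient state `i`, the quantity that (with the queuing delay) enters an E2E delay expression.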
To demonstrate the application of our framework, case studies are further provided under two network scenarios with different MAC protocols to show how the E2E delay can be analytically determined for a given network scenario.
Finally, we present extensive simulation and numerical results to illustrate the efficiency of our delay analysis as well as the impacts of network parameters on delay performance.
This paper focuses on an improved edge model based on Curvelet coefficient analysis.
The Curvelet transform is a powerful tool for the multiresolution representation of objects with anisotropic edges.
Curvelet coefficient contributions have been analyzed using the Scale Invariant Feature Transform (SIFT), commonly used to study local structure in images.
The permutation of Curvelet coefficients from the original image and the edge image obtained from a gradient operator is used to improve the original edges.
Experimental results show that this method brings out details on edges when the decomposition scale increases.
In this paper, we present the results of an online study with the aim to shed light on the impact that semantic context cues have on the user acceptance of tag recommendations.
To this end, we conducted a work-integrated social bookmarking study with 17 university employees in order to compare the user acceptance of a context-aware tag recommendation algorithm called 3Layers with the user acceptance of a simple popularity-based baseline.
In this scenario, we confirmed the hypothesis that semantic context cues have a higher impact on the user acceptance of tag recommendations in a collaborative tagging setting than in an individual tagging setting.
With this paper, we contribute to the sparse line of research presenting online recommendation studies.
Multi-person articulated pose tracking in unconstrained videos is an important yet challenging problem.
In this paper, following the top-down paradigm, we propose a simple and efficient pose tracker based on pose flows.
First, we design an online optimization framework to build the association of cross-frame poses and form pose flows (PF-Builder).
Second, a novel pose flow non-maximum suppression (PF-NMS) is designed to robustly reduce redundant pose flows and re-link temporal disjoint ones.
Extensive experiments show that our method significantly outperforms the best-reported results on two standard pose tracking datasets, by 13 mAP / 25 MOTA and 6 mAP / 3 MOTA, respectively.
Moreover, when working on poses already detected in individual frames, the extra computation of the pose tracker is minor, enabling online tracking at 10 FPS.
Our source codes are made publicly available(https://github.com/YuliangXiu/PoseFlow).
Generative Adversarial Networks (GANs) can successfully approximate a probability distribution and produce realistic samples.
However, open questions such as sufficient convergence conditions and mode collapse still persist.
In this paper, we build on existing work in the area by proposing a novel framework for training the generator against an ensemble of discriminator networks, which can be seen as a one-student/multiple-teachers setting.
We formalize this problem within the full-information adversarial bandit framework, where we evaluate the capability of an algorithm to select mixtures of discriminators for providing the generator with feedback during learning.
To this end, we propose a reward function which reflects the progress made by the generator and dynamically update the mixture weights allocated to each discriminator.
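As a rough illustration of dynamically re-weighting a discriminator mixture from rewards, here is a generic full-information exponential-weights update; the paper's actual reward function and update rule may differ:

```python
import numpy as np

def update_mixture(weights, rewards, eta=0.5):
    """One full-information exponential-weights step: discriminators
    whose feedback yielded higher generator progress get larger
    mixture weight."""
    w = weights * np.exp(eta * rewards)
    return w / w.sum()

# Three discriminators, uniform mixture to start.
mix = np.ones(3) / 3
# Hypothetical per-round rewards, e.g. the generator's progress under
# each discriminator's feedback (higher = more progress).
for rewards in [np.array([0.1, 0.5, 0.2]), np.array([0.0, 0.6, 0.1])]:
    mix = update_mixture(mix, rewards)
print(mix)   # mass shifts toward the second discriminator
```

The multiplicative form keeps the weights a valid probability mixture at every round, which is what lets the generator sample feedback from a distribution over discriminators.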
We also draw connections between our algorithm and stochastic optimization methods and then show that existing approaches using multiple discriminators in literature can be recovered from our framework.
We argue that less expressive discriminators are smoother and have a coarse-grained view of the mode landscape, which forces the generator to cover a wide portion of the data distribution support.
On the other hand, highly expressive discriminators ensure sample quality.
Finally, experimental results show that our approach improves sample quality and diversity over existing baselines by effectively learning a curriculum.
These results also support the claim that weaker discriminators have higher entropy, improving mode coverage.
High resolution magnetic resonance (MR) imaging is desirable in many clinical applications due to its contribution to more accurate subsequent analyses and early clinical diagnoses.
Single image super resolution (SISR) is an effective and cost efficient alternative technique to improve the spatial resolution of MR images.
In the past few years, SISR methods based on deep learning techniques, especially convolutional neural networks (CNNs), have achieved state-of-the-art performance on natural images.
However, the information is gradually weakened and training becomes increasingly difficult as the network deepens.
The problem is more serious for medical images, because the lack of high-quality and effective training samples makes deep models prone to underfitting or overfitting.
Moreover, many current models treat the hierarchical features on different channels equivalently, which does not help the models deal with hierarchical features in a discriminative and targeted manner.
To this end, we present a novel channel splitting network (CSN) to ease the representational burden of deep models.
The proposed CSN model divides the hierarchical features into two branches, i.e., residual branch and dense branch, with different information transmissions.
The residual branch is able to promote feature reuse, while the dense branch is beneficial to the exploration of new features.
We also adopt the merge-and-run mapping to facilitate information integration between the different branches.
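A minimal sketch of a merge-and-run mapping between two parallel branches, assuming the standard form in which each branch adds the average of both branch inputs to its own transformed output; the branch transforms here are placeholders for the residual and dense sub-networks:

```python
import numpy as np

def merge_and_run(x_res, x_dense, f_res, f_dense):
    """Merge-and-run mapping: each branch transforms its own input and
    then adds the *average* of both branch inputs, so information is
    exchanged between the branches at every merge point."""
    merged = 0.5 * (x_res + x_dense)
    return f_res(x_res) + merged, f_dense(x_dense) + merged

# Illustrative branch transforms (stand-ins for the residual and dense
# sub-networks of the CSN model).
f_res = lambda x: 0.1 * x
f_dense = lambda x: -0.1 * x

a, b = merge_and_run(np.ones(4), 3 * np.ones(4), f_res, f_dense)
print(a, b)   # a = 0.1 + 2.0, b = -0.3 + 2.0
```

The averaging term is what integrates information across branches while each branch keeps its own transmission style (feature reuse vs. new-feature exploration).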
Extensive experiments on various MR images, including proton density (PD), T1 and T2 images, show that the proposed CSN model achieves superior performance over other state-of-the-art SISR methods.
Key substitution vulnerable signature schemes are signature schemes that permit an intruder, given a public verification key and a signed message, to compute a pair of signature and verification keys such that the message appears to be signed with the new signature key.
A digital signature scheme is said to be vulnerable to the destructive exclusive ownership (DEO) property if it is computationally feasible for an intruder, given a public verification key and a pair of a message and its valid signature relative to the given public key, to compute a pair of signature and verification keys and a new message such that the given signature appears to be valid for the new message relative to the new verification key.
In this paper, we prove decidability of the insecurity problem of cryptographic protocols where the signature schemes employed in the concrete realisation have these two properties.
In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds.
We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm.
We perform an in-depth study of the latent low-dimensional structure of the high-dimensional interaural spectral data, based on a corpus recorded with a human-like audiomotor robot head.
A non-linear dimensionality reduction technique is used to show that these data lie on a two-dimensional (2D) smooth manifold parameterized by the motor states of the listener, or equivalently, the sound source directions.
We propose a probabilistic piecewise affine mapping model (PPAM) specifically designed to deal with high-dimensional data exhibiting an intrinsic piecewise linear structure.
We derive a closed-form expectation-maximization (EM) procedure for estimating the model parameters, followed by Bayes inversion for obtaining the full posterior density function of a sound source direction.
We extend this solution to deal with missing data and redundancy in real world spectrograms, and hence for 2D localization of natural sound sources such as speech.
We further generalize the model to the challenging case of multiple sound sources and we propose a variational EM framework.
The associated algorithm, referred to as variational EM for source separation and localization (VESSL), yields a Bayesian estimation of the 2D locations and time-frequency masks of all the sources.
Comparisons of the proposed approach with several existing methods reveal that the combination of acoustic-space learning with Bayesian inference enables our method to outperform state-of-the-art methods.
We propose a strikingly novel, simple, and effective approach to model online user behavior: we extract and analyze digital DNA sequences from user online actions and we use Twitter as a benchmark to test our proposal.
We obtain an incisive and compact DNA-inspired characterization of user actions.
Then, we apply standard DNA analysis techniques to discriminate between genuine and spambot accounts on Twitter.
An experimental campaign supports our proposal, showing its effectiveness and viability.
To the best of our knowledge, we are the first to identify and adapt DNA-inspired techniques for online user behavioral modeling.
While Twitter spambot detection is a specific use case on a specific social media, our proposed methodology is platform and technology agnostic, hence paving the way for diverse behavioral characterization tasks.
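A toy version of the DNA-inspired encoding: the three-letter action alphabet and the longest-common-substring comparison below are illustrative assumptions, not the paper's exact alphabet or similarity measure:

```python
# Map each online action to a base, then compare accounts by their
# longest common substring, which tends to be long for coordinated bots.
ALPHABET = {"tweet": "A", "retweet": "C", "reply": "T"}

def digital_dna(actions):
    return "".join(ALPHABET[a] for a in actions)

def longest_common_substring(s, t):
    """Length of the longest common substring, via standard DP."""
    best = 0
    prev = [0] * (len(t) + 1)
    for ch in s:
        cur = [0] * (len(t) + 1)
        for j, ct in enumerate(t, 1):
            if ch == ct:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

bot_a = digital_dna(["retweet"] * 8 + ["tweet"])
bot_b = digital_dna(["retweet"] * 8 + ["reply"])
human = digital_dna(["tweet", "reply", "retweet", "tweet", "reply",
                     "tweet", "retweet", "reply", "tweet"])
print(longest_common_substring(bot_a, bot_b))   # 8: near-identical behavior
print(longest_common_substring(bot_a, human))   # short: varied behavior
```

The point the abstract makes survives the simplification: once behavior is a string, any off-the-shelf sequence-analysis technique becomes applicable, independent of the platform that produced the actions.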
This thesis describes the development of fast algorithms for the computation of PERcentage CLOSure of eyes (PERCLOS) and Saccadic Ratio (SR).
PERCLOS and SR are two ocular parameters reported to be measures of alertness levels in human beings.
PERCLOS is the percentage of time in which at least 80% of the eyelid remains closed over the pupil.
Saccades are fast, simultaneous movements of both eyes in the same direction.
SR is the ratio of peak saccadic velocity to the saccadic duration.
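The two definitions translate directly into code; the sampling rates and values below are made up for illustration:

```python
def perclos(closure_fractions, threshold=0.8):
    """PERCLOS: percentage of time samples in which at least
    `threshold` (80%) of the eyelid covers the pupil."""
    closed = sum(1 for c in closure_fractions if c >= threshold)
    return 100.0 * closed / len(closure_fractions)

def saccadic_ratio(velocities, dt):
    """SR: peak saccadic velocity divided by saccade duration."""
    duration = len(velocities) * dt
    return max(velocities) / duration

# One second of eyelid-closure samples at a hypothetical 10 Hz
# (fraction of the pupil covered at each sample):
samples = [0.1, 0.2, 0.9, 0.95, 0.85, 0.3, 0.1, 0.9, 0.2, 0.1]
print(perclos(samples))             # 40.0 (4 of 10 samples >= 0.8)

# A 50 ms saccade sampled every 10 ms (angular velocity in deg/s):
vel = [120.0, 300.0, 450.0, 280.0, 90.0]
print(saccadic_ratio(vel, dt=0.01)) # peak 450 over a 0.05 s duration
```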
This thesis addresses issues in image-based estimation of PERCLOS and SR prevailing in the literature, such as illumination variation, poor illumination conditions, and head rotations.
In this work, algorithms for real-time PERCLOS computation have been developed and implemented on an embedded platform.
The platform has been used as a case study for assessment of loss of attention in automotive drivers.
The SR estimation has been carried out offline, as real-time implementation requires high processing frame rates that are difficult to achieve due to hardware limitations.
The accuracy in estimation of the loss of attention using PERCLOS and SR has been validated using brain signals, which are reported to be an authentic cue for estimating the state of alertness in human beings.
The major contributions of this thesis include database creation, design and implementation of fast algorithms for estimating PERCLOS and SR on embedded computing platforms.
Recently, many approaches have been introduced by several researchers to identify plants.
Now, applications of texture, shape, color and vein features are common practices.
However, many methods could still be developed to improve the performance of such identification systems.
Therefore, several experiments were conducted in this research.
As a result, a novel approach combining Gray-Level Co-occurrence Matrix, lacunarity, and Shen features with a Bayesian classifier gives better results than other plant identification systems.
For comparison, this research used two datasets that are commonly used for testing the performance of plant identification systems.
The results show that the system gives an accuracy rate of 97.19% when using the Flavia dataset and 95.00% when using the Foliage dataset and outperforms other approaches.
In this paper, we introduce a shape-based, time-scale invariant feature descriptor for 1-D sensor signals.
The time-scale invariance of the feature allows us to use features from one training event to describe events of the same semantic class that may take place over varying time scales, such as slow and fast walking.
It therefore requires a smaller training set.
The descriptor takes advantage of the invariant location detection in the scale space theory and employs a high level shape encoding scheme to capture invariant local features of events.
Based on this descriptor, a scale-invariant classifier with "R" metric (SIC-R) is designed to recognize multi-scale events of human activities.
The R metric combines the number of matches of keypoint in scale space with the Dynamic Time Warping score.
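A standard Dynamic Time Warping distance, the alignment score that the R metric combines with scale-space keypoint matches; the sequences and the local distance used here are illustrative:

```python
import math

def dtw(a, b):
    """Dynamic Time Warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

slow_walk = [0, 1, 2, 3, 2, 1, 0]
fast_walk = [0, 2, 3, 1, 0]          # same shape, compressed in time
print(dtw(slow_walk, fast_walk))     # small: shapes align despite scale
print(dtw(slow_walk, slow_walk))     # 0.0: identical sequences
```

DTW tolerates the time-scale differences that the feature descriptor is designed for, which is why combining it with keypoint-match counts gives a scale-robust score.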
SIC-R is tested on various types of 1-D sensor data from passive infrared, accelerometer, and seismic sensors, achieving more than 90% classification accuracy.
Sequence to sequence (SEQ2SEQ) models often lack diversity in their generated translations.
This can be attributed to the limitation of SEQ2SEQ models in capturing lexical and syntactic variations in a parallel corpus resulting from different styles, genres, topics, or ambiguity of the translation process.
In this paper, we develop a novel sequence to sequence mixture (S2SMIX) model that improves both translation diversity and quality by adopting a committee of specialized translation models rather than a single translation model.
Each mixture component selects its own training dataset via optimization of the marginal log-likelihood, which leads to a soft clustering of the parallel corpus.
Experiments on four language pairs demonstrate the superiority of our mixture model compared to a SEQ2SEQ baseline with standard or diversity-boosted beam search.
Our mixture model uses negligible additional parameters and incurs no extra computation cost during decoding.
We present monaa, a monitoring tool over a real-time property specified by either a timed automaton or a timed regular expression.
It implements a timed pattern matching algorithm that combines 1) features suited for online monitoring, and 2) acceleration by automata-based skipping.
Our experiments demonstrate monaa's performance advantage, especially in online usage.
In 2002 Jurdzinski and Lorys settled a long-standing conjecture that palindromes are not a Church-Rosser language.
Their proof required a sophisticated theory about computation graphs of 2-stack automata.
We present their proof in terms of 1-tape Turing machines.
We also provide an alternative proof of Buntrock and Otto's result that the set of non-square bitstrings, which is context-free, is not Church-Rosser.
We solve the problem of output feedback stabilization of a class of nonlinear systems, which may have unstable zero dynamics.
We allow for any globally stabilizing full state feedback control scheme to be used as long as it satisfies a particular ISS condition.
We show semi-global stability of the origin of the closed-loop system and also the recovery of the performance of an auxiliary system using a full-order observer.
This observer is based on the use of an extended high-gain observer to provide estimates of the output and its derivatives plus a signal used by an extended Kalman filter to provide estimates of the remaining states.
Finally, we provide a simulation example that illustrates the design procedure.
In this paper, we generalize a secure direct communication protocol between N users with partial and full cooperation of a quantum server.
The security analysis of the authentication and communication processes against many types of attacks proves that the attacker cannot gain any information by intercepting either the authentication or the communication process.
Hence, the security of the message transmitted among the N users is ensured, as the attacker introduces an error probability irrespective of the sequence of measurement.
In this work, we consider diffusion-based molecular communication with and without drift between two static nano-machines.
We employ type-based information encoding, releasing a single molecule per information bit.
At the receiver, we consider an asynchronous detection algorithm which exploits the arrival order of the molecules.
In such systems, transposition errors fundamentally undermine reliability and capacity.
Thus, in this work we study the impact of transpositions on the system performance.
Towards this, we present an analytical expression for the exact bit error probability (BEP) caused by transpositions and derive computationally tractable approximations of the BEP for diffusion-based channels with and without drift.
Based on these results, we analyze the BEP when background noise is not negligible and derive the optimal bit interval that minimizes the BEP.
Simulation results confirm the theoretical results and show the error and goodput performance for different parameters such as block size or noise generation rate.
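A crude Monte Carlo sketch of how transpositions arise: two molecules released one bit interval apart can arrive out of order when propagation delays are random. The random-walk channel model and all parameters below are illustrative, not the paper's analytical model:

```python
import random

random.seed(42)

def arrival_delay(distance=1.0, drift=0.5, step=0.01):
    """Crude 1-D random-walk model of a molecule's propagation delay
    (diffusion with positive drift toward the receiver)."""
    pos, t = 0.0, 0.0
    while pos < distance:
        pos += drift * step + random.gauss(0.0, step ** 0.5)
        t += step
    return t

def transposition_prob(bit_interval, trials=300):
    """Fraction of consecutive releases whose arrival order flips."""
    flips = 0
    for _ in range(trials):
        t_first = arrival_delay()
        t_second = bit_interval + arrival_delay()
        flips += t_first > t_second
    return flips / trials

p_short = transposition_prob(bit_interval=0.5)
p_long = transposition_prob(bit_interval=5.0)
print(p_short, p_long)   # longer bit intervals make transpositions rarer
```

The trade-off the abstract optimizes is visible even in this sketch: stretching the bit interval suppresses transpositions but lowers the data rate, so some intermediate interval minimizes the BEP for a target goodput.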
While real-world data is essential for evaluating the robustness of eye tracking algorithms, there are many applications where simulated, synthetic eye images are advantageous.
They can provide labelled ground-truth data for appearance-based gaze estimation algorithms, or enable the development of model-based gaze estimation techniques by showing the influence of different model factors on gaze estimation error, so that these factors can then be simplified or extended.
We extend the generation of synthetic eye images by a simulation of refraction and reflection for eyeglasses.
On the one hand this allows for the testing of pupil and glint detection algorithms under different illumination and reflection conditions, on the other hand the error of gaze estimation routines can be estimated in conjunction with different eyeglasses.
We show how a polynomial function fitting calibration performs equally well with and without eyeglasses, and how a geometrical eye model behaves when exposed to glasses.
Quantum error correction theory is, as a rule, formulated in a rather convoluted way in comparison to classical algebraic theory.
This work revisits the error correction in a noisy quantum channel so as to make it intelligible to engineers.
An illustrative example is presented of a naive perfect quantum code (a Hamming-like code) with five qubits for transmitting a single qubit of information.
The (9,1) Shor code is also addressed.
This article describes our experiments in neural machine translation using the recent Tensor2Tensor framework and the Transformer sequence-to-sequence model (Vaswani et al., 2017).
We examine some of the critical parameters that affect the final translation quality, memory usage, training stability and training time, concluding each experiment with a set of recommendations for fellow researchers.
In addition to confirming the general mantra "more data and larger models", we address scaling to multiple GPUs and provide practical tips for improved training regarding batch size, learning rate, warmup steps, maximum sentence length and checkpoint averaging.
We hope that our observations will allow others to get better results given their particular hardware and data constraints.
Future cellular systems based on the use of above-6 GHz frequencies, the so-called millimeter wave (mmWave) bandwidths, will heavily rely on the use of antenna arrays both at the transmitter and at the receiver, possibly with a large number of elements.
For complexity reasons, fully digital precoding and postcoding structures may turn out to be infeasible, and thus suboptimal structures, making use of simplified hardware and a limited number of RF chains, have been investigated.
This paper considers and makes a comparative assessment, both from a spectral efficiency and energy efficiency point of view, of several suboptimal precoding and postcoding beamforming structures for the downlink of a cellular multiuser MIMO (MU-MIMO) system.
Based on the most recently available data for the energy consumption of phase shifters and switches, we show that there are cases where fully-digital beamformers may achieve a larger energy efficiency than lower-complexity solutions, as well as that structures based on the exclusive use of switches achieve quite unsatisfactory performance in realistic scenarios.
Medical images with specific pathologies are scarce, but a large amount of data is usually required for a deep convolutional neural network (DCNN) to achieve good accuracy.
We consider the problem of segmenting the left ventricular (LV) myocardium on late gadolinium enhancement (LGE) cardiovascular magnetic resonance (CMR) scans of which only some of the scans have scar tissue.
We propose ScarGAN to simulate scar tissue on healthy myocardium using chained generative adversarial networks (GAN).
Our novel approach factorizes the simulation process into 3 steps: 1) a mask generator to simulate the shape of the scar tissue; 2) a domain-specific heuristic to produce the initial simulated scar tissue from the simulated shape; 3) a refining generator to add details to the simulated scar tissue.
Unlike other approaches that generate samples from scratch, we simulate scar tissue on normal scans resulting in highly realistic samples.
We show that experienced radiologists are unable to distinguish between real and simulated scar tissue.
Training a U-Net with additional scans with scar tissue simulated by ScarGAN increases the percentage of scar pixels correctly included in LV myocardium prediction from 75.9% to 80.5%.
Optic disk segmentation is a prerequisite step in automatic retinal screening systems.
In this paper, we propose an algorithm for optic disk segmentation based on a local adaptive thresholding method.
The location of the optic disk is validated using the intensity and average vessel width of the retinal images.
Then adaptive thresholding is applied to the temporal and nasal parts of the optic disk separately.
Adaptive thresholding makes our algorithm robust to illumination variations and various image acquisition conditions.
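A minimal local adaptive threshold, sketched with a block-mean rule on a synthetic image with a brightness ramp; the block size, offset, and test image are illustrative, and the paper's method additionally treats the temporal and nasal parts separately:

```python
import numpy as np

def local_adaptive_threshold(image, block=11, offset=0.1):
    """Threshold each pixel against the mean of its (block x block)
    neighborhood, so uneven illumination needs no global cutoff."""
    h, w = image.shape
    r = block // 2
    padded = np.pad(image, r, mode="edge")
    out = np.zeros_like(image, dtype=bool)
    for i in range(h):
        for j in range(w):
            local_mean = padded[i:i + block, j:j + block].mean()
            out[i, j] = image[i, j] > local_mean + offset
    return out

# A bright disk on a background whose brightness ramps left to right:
# a single global threshold would merge the disk with the bright side.
x = np.linspace(0.0, 0.6, 32)
img = np.tile(x, (32, 1))
yy, xx = np.mgrid[:32, :32]
img[(yy - 16) ** 2 + (xx - 16) ** 2 < 9] += 0.5   # the "optic disk"
mask = local_adaptive_threshold(img)
print(mask.sum())   # roughly the 25-pixel disk region, nothing else
```

Because each pixel competes only with its own neighborhood, the ramp (a stand-in for illumination variation) does not leak into the segmentation.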
Moreover, experimental results on the DRIVE and KHATAM databases show promising results compared to the recent literature.
In the DRIVE database, the optic disk in all images is correctly located and the mean overlap reaches 43.21%.
In the KHATAM database, the optic disk is correctly detected in 98% of the images with a mean overlap of 36.32%.
Applications such as web search and social networking have been moving from centralized to decentralized cloud architectures to improve their scalability.
MapReduce, a programming framework for processing large amounts of data using thousands of machines in a single cloud, also needs to be scaled out to multiple clouds to adapt to this evolution.
The challenge of building a multi-cloud distributed architecture is substantial.
However, the ability to deal with the new types of faults introduced by such a setting, such as the outage of a whole datacenter or an arbitrary fault caused by a malicious cloud insider, increases the difficulty considerably.
In this paper we propose Medusa, a platform that allows MapReduce computations to scale out to multiple clouds and tolerate several types of faults.
Our solution fulfills four objectives.
First, it is transparent to the user, who writes her typical MapReduce application without modification.
Second, it does not require any modification to the widely used Hadoop framework.
Third, the proposed system goes well beyond the fault-tolerance offered by MapReduce to tolerate arbitrary faults, cloud outages, and even malicious faults caused by corrupt cloud insiders.
Fourth, it achieves this increased level of fault tolerance at reasonable cost.
We performed an extensive experimental evaluation in the ExoGENI testbed, demonstrating that our solution significantly reduces execution time when compared to traditional methods that achieve the same level of resilience.
Given the cost of HPC clusters, making best use of them is crucial to improve infrastructure ROI.
Likewise, reducing failed HPC jobs and related waste in terms of user wait times is crucial to improve HPC user productivity (aka human ROI).
While most efforts (e.g., debugging HPC programs) explore technical aspects to improve the ROI of HPC clusters, we hypothesize that non-technical (human) aspects are worth exploring to make non-trivial ROI gains; specifically, understanding non-technical aspects and how they contribute to the failure of HPC jobs.
In this regard, we conducted a case study in the context of Beocat cluster at Kansas State University.
The purpose of the study was to learn the reasons why users terminate jobs and to quantify wasted computations in such jobs in terms of system utilization and user wait time.
The data from the case study helped identify interesting and actionable reasons why users terminate HPC jobs.
It also helped confirm that user-terminated jobs may be associated with a non-trivial amount of wasted computation which, if reduced, can help improve the ROI of HPC clusters.
When performing a national research assessment, some countries rely on citation metrics whereas others, such as the UK, primarily use peer review.
In the influential Metric Tide report, a low agreement between metrics and peer review in the UK Research Excellence Framework (REF) was found.
However, earlier studies observed much higher agreement between metrics and peer review in the REF and argued in favour of using metrics.
This shows that there is considerable ambiguity in the discussion on agreement between metrics and peer review.
We provide clarity in this discussion by considering four important points: (1) the level of aggregation of the analysis; (2) the use of either a size-dependent or a size-independent perspective; (3) the suitability of different measures of agreement; and (4) the uncertainty in peer review.
In the context of the REF, we argue that agreement between metrics and peer review should be assessed at the institutional level rather than at the publication level.
Both a size-dependent and a size-independent perspective are relevant in the REF.
The interpretation of correlations may be problematic and as an alternative we therefore use measures of agreement that are based on the absolute or relative differences between metrics and peer review.
To get an idea of the uncertainty in peer review, we rely on a model to bootstrap peer review outcomes.
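A bare-bones illustration of bootstrapping institution-level outcomes from per-publication scores; the scores and the naive resampling scheme below are hypothetical, whereas the paper relies on a model-based bootstrap of peer review outcomes:

```python
import random

random.seed(0)

# Hypothetical per-publication peer-review scores (REF-style 0-4 stars)
# for one institution; resampling publications gauges how much the
# institution-level outcome could vary under review uncertainty.
scores = [4, 3, 3, 2, 4, 3, 1, 3, 2, 4, 3, 3]

def bootstrap_means(scores, n_boot=2000):
    means = []
    for _ in range(n_boot):
        sample = [random.choice(scores) for _ in scores]
        means.append(sum(sample) / len(sample))
    return sorted(means)

means = bootstrap_means(scores)
low, high = means[49], means[1949]   # ~95% interval from 2000 resamples
print(round(low, 2), round(high, 2))
```

The width of this interval is the kind of uncertainty that has to be exceeded before a metric-versus-peer-review disagreement at the institutional level can be called meaningful.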
We conclude that particularly in Physics, Clinical Medicine, and Public Health, metrics agree quite well with peer review and may offer an alternative to peer review.
In this paper we develop a new framework that captures the landscape common to non-convex low-rank matrix problems, including matrix sensing, matrix completion and robust PCA.
In particular, we show for all of the above problems (including asymmetric cases): 1) all local minima are also globally optimal; 2) no high-order saddle points exist.
These results explain why simple algorithms such as stochastic gradient descent converge globally and efficiently optimize these non-convex objective functions in practice.
Our framework connects and simplifies the existing analyses on optimization landscapes for matrix sensing and symmetric matrix completion.
The framework naturally leads to new results for asymmetric matrix completion and robust PCA.
Vertex colouring is a well-known problem in combinatorial optimisation, whose alternative integer programming formulations have recently attracted considerable attention.
This paper briefly surveys seven known formulations of vertex colouring and introduces a formulation of vertex colouring using a suitable clique partition of the graph.
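For reference, the classical assignment formulation, the baseline from which most surveyed alternatives depart, can be sketched as follows (with $w_k$ indicating whether colour $k$ is used and $x_{vk}$ assigning vertex $v$ to colour $k$):

```latex
\min \sum_{k=1}^{K} w_k
\quad \text{s.t.} \quad
\sum_{k=1}^{K} x_{vk} = 1 \;\; \forall v \in V,
\qquad
x_{uk} + x_{vk} \le w_k \;\; \forall \{u,v\} \in E,\; k = 1,\dots,K,
\qquad
x_{vk},\, w_k \in \{0,1\}.
```

A natural strengthening, in the spirit of a clique-partition formulation, replaces each per-edge constraint with a per-clique constraint $\sum_{v \in Q} x_{vk} \le w_k$ for every clique $Q$ in the partition, since vertices of a clique must all receive distinct colours.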
This formulation is applicable in timetabling applications, where such a clique partition of the conflict graph is given implicitly.
In contrast with some alternatives, the presented formulation can also be easily extended to accommodate complex performance indicators (``soft constraints'') imposed in a number of real-life course timetabling applications.
Its performance depends on the quality of the clique partition, but encouraging empirical results for the Udine Course Timetabling problem are reported.
This paper presents a simple, robust and (almost) unsupervised dictionary-based method, qwn-ppv (Q-WordNet as Personalized PageRanking Vector) to automatically generate polarity lexicons.
We show that qwn-ppv outperforms other automatically generated lexicons for the four extrinsic evaluations presented here.
It also shows very competitive and robust results with respect to manually annotated ones.
Results suggest that no single lexicon is best for every task and dataset and that the intrinsic evaluation of polarity lexicons is not a good performance indicator on a Sentiment Analysis task.
The qwn-ppv method allows one to easily create quality polarity lexicons whenever no domain-based annotated corpora are available for a given language.
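The core operation behind a Personalized PageRanking Vector is propagating mass from polarity seed words over a lexical graph. A minimal power-iteration sketch, over an invented five-word graph with an invented seed set (the graph, seeds and damping factor are illustrative assumptions, not the authors' lexicon):

```python
# Hedged sketch: personalized PageRank over a tiny, invented synonym graph.
# Mass teleports only to the seed set, so polarity spreads from the seeds.

def personalized_pagerank(graph, seeds, damping=0.85, iters=100):
    nodes = list(graph)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iters):
        new = {n: (1 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    new[m] += share
        rank = new
    return rank

graph = {
    "good": ["nice", "fine"],
    "nice": ["good"],
    "fine": ["good"],
    "bad": ["awful"],
    "awful": ["bad"],
}
pr = personalized_pagerank(graph, seeds={"good"})
```

Words reachable from the seed accumulate rank, while the disconnected negative cluster receives none; ranking words by such vectors for positive and negative seeds yields a polarity lexicon.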
The interpretation of propositional dynamic logic (PDL) through Kripke models requires the relations constituting the interpreting Kripke model to closely observe the syntax of the modal operators.
This poses a significant challenge for an interpretation of PDL through stochastic Kripke models, because the programs' operations do not always have a natural counterpart in the set of stochastic relations.
We use rewrite rules for building up an interpretation of PDL.
It is shown that each program corresponds to an essentially unique irreducible tree, which in turn is assigned a predicate lifting, serving as the program's interpretation.
The paper establishes and studies this interpretation.
It discusses the expressivity of probabilistic models for PDL and relates properties like logical and behavioral equivalence or bisimilarity to the corresponding properties of a Kripke model for a closely related non-dynamic logic of the Hennessy-Milner type.
We present a C-language implementation of the lambda-pi calculus by extending the (call-by-need) stack machine of Ariola, Chang and Felleisen to hold types, using a typeless-tagless-final interpreter strategy.
It has the advantage of expressing all operations as folds over terms, including by-need evaluation, recovery of the initial syntax-tree encoding for any term, and eliminating most garbage-collection tasks.
These are made possible by a disciplined approach to handling the spine of each term, along with a robust stack-based API.
Type inference is not covered in this work, but it also derives several advantages from the present stack transformation.
Timing and maximum stack space usage results for executing benchmark problems are presented.
We discuss how the design choices for this interpreter allow the language to be used as a high-level scripting language for automatic distributed parallel execution of common scientific computing workflows.
With the rapid increase in software project size and maintenance cost, adherence to coding standards, especially through disciplined identifier naming, is attracting pressing concern from both computer science educators and software managers.
Software developers mainly use identifier names to represent the knowledge recorded in source code.
However, the popularity and adoption consistency of identifier naming conventions have not yet been systematically studied.
Taking forty-eight popular open-source projects written in three top-ranking programming languages (Java, C and C++) as examples, we developed an identifier extraction tool based on regular-expression matching.
In the subsequent investigation, some interesting findings are obtained.
For the identifier naming popularity, it is found that Camel and Pascal naming conventions are leading the road while Hungarian notation is vanishing.
For the identifier naming consistency, we have found that the projects written in Java have a much better performance than those written in C and C++.
Finally, academia and the software industry are urged to adopt the most popular naming conventions consistently in their practices, so as to steer identifier naming toward a standard, unified and high-quality practice.
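The kind of regular-expression matching such an extraction tool relies on can be sketched as follows. The exact patterns used in the study are not published here; these are illustrative assumptions (e.g. the Hungarian prefixes checked).

```python
import re

# Hedged sketch: classify an identifier into one of the naming conventions
# discussed (Camel, Pascal, snake_case, Hungarian). Patterns are assumptions.
# Hungarian is tried first because identifiers like "nCount" are also
# syntactically valid camelCase.

CONVENTIONS = [
    ("hungarian", re.compile(r"^(?:sz|str|n|b|p|lp)[A-Z][A-Za-z0-9]*$")),
    ("pascal",    re.compile(r"^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+$")),
    ("camel",     re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)+$")),
    ("snake",     re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$")),
]

def classify(identifier):
    for name, pattern in CONVENTIONS:
        if pattern.match(identifier):
            return name
    return "other"
```

For example, `classify("parseTree")` reports camel case while `classify("parse_tree")` reports snake case; counting such labels per project gives the popularity and consistency statistics the study reports.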
Bilateral filters are in widespread use due to their edge-preserving properties.
The common use case is to manually choose a parametric filter type, usually a Gaussian filter.
In this paper, we generalize the parametrization and, in particular, derive a gradient descent algorithm so that the filter parameters can be learned from data.
This derivation allows us to learn high-dimensional linear filters that operate in sparsely populated feature spaces.
We build on the permutohedral lattice construction for efficient filtering.
The ability to learn more general forms of high-dimensional filters can be used in several diverse applications.
First, we demonstrate their use in applications where a single filter application is desired for runtime reasons.
Further, we show how this algorithm can be used to learn the pairwise potentials in densely connected conditional random fields and apply these to different image segmentation tasks.
Finally, we introduce layers of bilateral filters in CNNs and propose bilateral neural networks for processing high-dimensional sparse data.
This view provides new ways to encode model structure into network architectures.
A diverse set of experiments empirically validates the usage of general forms of filters.
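For reference, the classical fixed-parameter case that this work generalizes is the Gaussian bilateral filter, sketched below in 1-D; the signal and parameter values are illustrative, not the paper's learned filters.

```python
import math

# Hedged sketch: fixed-parameter 1-D Gaussian bilateral filter. Each output
# is a weighted average whose weights combine a spatial Gaussian (distance
# between positions) and a range Gaussian (distance between values), so
# smoothing stops at large intensity jumps.

def bilateral_1d(signal, sigma_s=2.0, sigma_r=0.3, radius=3):
    out = []
    for i, v in enumerate(signal):
        acc, norm = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)) *        # spatial
                 math.exp(-((v - signal[j]) ** 2) / (2 * sigma_r ** 2)))  # range
            acc += w * signal[j]
            norm += w
        out.append(acc / norm)
    return out

step = [0.0] * 5 + [1.0] * 5
smoothed = bilateral_1d(step)   # the step edge survives the smoothing
```

The learned filters of the paper replace these hand-chosen Gaussian weights with parameters fitted by gradient descent in higher-dimensional feature spaces.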
This paper presents a novel hierarchical spatiotemporal orientation representation for spacetime image analysis.
It is designed to combine the benefits of the multilayer architecture of ConvNets and a more controlled approach to spacetime analysis.
A distinguishing aspect of the approach is that unlike most contemporary convolutional networks no learning is involved; rather, all design decisions are specified analytically with theoretical motivations.
This approach makes it possible to understand what information is being extracted at each stage and layer of processing as well as to minimize heuristic choices in design.
Another key aspect of the network is its recurrent nature, whereby the output of each layer of processing feeds back to the input.
To keep the network size manageable across layers, a novel cross-channel feature pooling is proposed.
The multilayer architecture that results systematically reveals hierarchical image structure in terms of multiscale, multiorientation properties of visual spacetime.
To illustrate its utility, the network has been applied to the task of dynamic texture recognition.
Empirical evaluation on multiple standard datasets shows that it sets a new state-of-the-art.
We propose to use deep convolutional neural networks to address the problem of cross-view image geolocalization, in which the geolocation of a ground-level query image is estimated by matching to georeferenced aerial images.
We use state-of-the-art feature representations for ground-level images and introduce a cross-view training approach for learning a joint semantic feature representation for aerial images.
We also propose a network architecture that fuses features extracted from aerial images at multiple spatial scales.
To support training these networks, we introduce a massive database that contains pairs of aerial and ground-level images from across the United States.
Our methods significantly out-perform the state of the art on two benchmark datasets.
We also show, qualitatively, that the proposed feature representations are discriminative at both local and continental spatial scales.
Faster R-CNN is one of the most representative and successful methods for object detection, and has become increasingly popular in various object detection applications.
In this report, we propose a robust deep face detection approach based on Faster R-CNN.
In our approach, we exploit several new techniques including new multi-task loss function design, online hard example mining, and multi-scale training strategy to improve Faster R-CNN in multiple aspects.
The proposed approach is well suited for face detection, so we call it Face R-CNN.
Extensive experiments are conducted on the two most popular and challenging face detection benchmarks, FDDB and WIDER FACE, to demonstrate the superiority of the proposed approach over the state of the art.
Here we explore a new algorithmic framework for multi-agent reinforcement learning, called Malthusian reinforcement learning, which extends self-play to include fitness-linked population size dynamics that drive ongoing innovation.
In Malthusian RL, increases in a subpopulation's average return drive subsequent increases in its size, just as Thomas Malthus argued in 1798 was the relationship between preindustrial income levels and population growth.
Malthusian reinforcement learning harnesses the competitive pressures arising from growing and shrinking population size to drive agents to explore regions of state and policy spaces that they could not otherwise reach.
Furthermore, in environments where there are potential gains from specialization and division of labor, we show that Malthusian reinforcement learning is better positioned to take advantage of such synergies than algorithms based on self-play.
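The fitness-linked population-size dynamics can be sketched with a simple replicator-style rule in which a subpopulation grows when its average return exceeds the population-wide mean. The exact update rule and growth rate below are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch (assumed update rule): Malthusian population dynamics.
# A subpopulation's size grows in proportion to how much its average
# return exceeds the size-weighted mean return of the whole population.

def update_populations(sizes, avg_returns, rate=0.5):
    total = sum(sizes)
    mean_return = sum(s * r for s, r in zip(sizes, avg_returns)) / total
    return [
        max(1, round(s * (1 + rate * (r - mean_return))))  # never go extinct
        for s, r in zip(sizes, avg_returns)
    ]

sizes = update_populations([10, 10], avg_returns=[1.0, 0.0])
# the higher-return subpopulation grows, the other shrinks
```

Repeating this update between training phases is what couples learning progress to population size and creates the competitive pressure described above.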
The basic idea behind active queue management (AQM) is to sense the congestion level within the network and inform the packet sources about it, so that they reduce their sending rate.
Many AQM mechanisms have been studied in the literature.
However, they have not been used in the context of the DiffServ architecture, where different types of packets with different QoS requirements share the same link.
In this paper, we study an access control mechanism for RT and NRT packets arriving in a buffer implemented at an end user in HSDPA.
The mechanism uses thresholds to manage access to the buffer and gives access priority to RT packets.
In order to control the arrival rate of the NRT packets in the buffer an active queue management is used.
We study the effect of the feedback function on the QoS parameters for both kinds of packets. A mathematical description and analytical results are given, and numerical results show that the proposed function achieves higher QoS for the NRT packets in the system.
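A threshold-based admission rule of this kind, together with a RED-style feedback function for slowing NRT sources, can be sketched as follows; the function shapes and threshold values are illustrative assumptions, not the paper's exact mechanism.

```python
# Hedged sketch (assumed rules): RT packets may use the whole buffer,
# NRT packets are admitted only while the queue is below a threshold.

def admit(queue_len, capacity, nrt_threshold, is_rt):
    if is_rt:
        return queue_len < capacity
    return queue_len < nrt_threshold

# RED-style feedback: the probability of marking/slowing NRT sources
# rises linearly between two queue-length thresholds.
def nrt_mark_prob(queue_len, min_th, max_th):
    if queue_len <= min_th:
        return 0.0
    if queue_len >= max_th:
        return 1.0
    return (queue_len - min_th) / (max_th - min_th)
```

Under these rules an RT packet is still admitted when the queue sits above the NRT threshold, which is what gives RT traffic its access priority.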
A longstanding question in computer vision concerns the representation of 3D shapes for recognition: should 3D shapes be represented with descriptors operating on their native 3D formats, such as voxel grid or polygon mesh, or can they be effectively represented with view-based descriptors?
We address this question in the context of learning to recognize 3D shapes from a collection of their rendered views on 2D images.
We first present a standard CNN architecture trained to recognize the shapes' rendered views independently of each other, and show that a 3D shape can be recognized even from a single view at an accuracy far higher than using state-of-the-art 3D shape descriptors.
Recognition rates further increase when multiple views of the shapes are provided.
In addition, we present a novel CNN architecture that combines information from multiple views of a 3D shape into a single and compact shape descriptor offering even better recognition performance.
The same architecture can be applied to accurately recognize human hand-drawn sketches of shapes.
We conclude that a collection of 2D views can be highly informative for 3D shape recognition and is amenable to emerging CNN architectures and their derivatives.
Political identity is often manifested in language variation, but the relationship between the two is still relatively unexplored from a quantitative perspective.
This study examines the use of Catalan, a language local to the semi-autonomous region of Catalonia in Spain, on Twitter in discourse related to the 2017 independence referendum.
We corroborate prior findings that pro-independence tweets are more likely to include the local language than anti-independence tweets.
We also find that Catalan is used more often in referendum-related discourse than in other contexts, contrary to prior findings on language variation.
This suggests a strong role for the Catalan language in the expression of Catalonian political identity.
This note concerns some papers and notes submitted to, or presented at, the second congress of the International Torah Codes Society in Jerusalem, Israel, June 2000.
Perception is often described as a predictive process based on an optimal inference with respect to a generative model.
We study here the principled construction of a generative model specifically crafted to probe motion perception.
In that context, we first provide an axiomatic, biologically-driven derivation of the model.
This model synthesizes random dynamic textures which are defined by stationary Gaussian distributions obtained by the random aggregation of warped patterns.
Importantly, we show that this model can equivalently be described as a stochastic partial differential equation.
This characterization of motion in images allows us to recast motion-energy models into a principled Bayesian inference framework.
Finally, we apply these textures in order to psychophysically probe speed perception in humans.
In this framework, while the likelihood is derived from the generative model, the prior is estimated from the observed results and accounts for the perceptual bias in a principled fashion.
We provide an up-to-date view on the knowledge management system ScienceWISE (SW) and address issues related to the automatic assignment of articles to research topics.
So far, SW has been proven to be an effective platform for managing large volumes of technical articles by means of ontological concept-based browsing.
However, as the publication of research articles accelerates, the expressivity and the richness of the SW ontology turns into a double-edged sword: a more fine-grained characterization of articles is possible, but at the cost of introducing more spurious relations among them.
In this context, the challenge of continuously recommending relevant articles to users lies in tackling a network partitioning problem, where nodes represent articles and co-occurring concepts create edges between them.
In this paper, we discuss the three research directions we have taken for solving this issue: i) the identification of generic concepts to reinforce inter-article similarities; ii) the adoption of a bipartite network representation to improve scalability; iii) the design of a clustering algorithm to identify concepts for cross-disciplinary articles and obtain fine-grained topics for all articles.
Most of the semi-supervised classification methods developed so far use unlabeled data for regularization purposes under particular distributional assumptions such as the cluster assumption.
In contrast, recently developed methods of classification from positive and unlabeled data (PU classification) use unlabeled data for risk evaluation, i.e., label information is directly extracted from unlabeled data.
In this paper, we extend PU classification to also incorporate negative data and propose a novel semi-supervised classification approach.
We establish generalization error bounds for our novel methods and show that the bounds decrease with respect to the number of unlabeled data without the distributional assumptions that are required in existing semi-supervised classification methods.
Through experiments, we demonstrate the usefulness of the proposed methods.
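The risk-evaluation idea underlying PU classification can be sketched with the standard unbiased PU risk estimator: given the class prior, the risk of a classifier is expressed using only positive and unlabeled samples. The toy data, classifier and prior below are invented for illustration.

```python
# Hedged sketch: unbiased PU risk estimation with the zero-one loss.
# Uses the identity E_u[l(g,-1)] = pi_p * E_p[l(g,-1)] + (1-pi_p) * E_n[l(g,-1)]
# to rewrite the classification risk in terms of P and U samples only.

def zero_one(pred, label):
    return 0.0 if pred == label else 1.0

def pu_risk(g, positives, unlabeled, pi_p):
    r_p_pos = sum(zero_one(g(x), +1) for x in positives) / len(positives)
    r_p_neg = sum(zero_one(g(x), -1) for x in positives) / len(positives)
    r_u_neg = sum(zero_one(g(x), -1) for x in unlabeled) / len(unlabeled)
    return pi_p * r_p_pos + r_u_neg - pi_p * r_p_neg

g = lambda x: 1 if x > 0 else -1   # invented threshold classifier
risk = pu_risk(g, positives=[2.0, 3.0],
               unlabeled=[2.5, -1.0, -2.0, 3.5], pi_p=0.5)
```

The semi-supervised extension described above additionally folds labeled negative data into this estimator, which is what removes the need for distributional assumptions.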
Unfortunately, the article "A Comparative Study to Benchmark Cross-project Defect Prediction Approaches" has a problem in the statistical analysis, which was pointed out almost immediately after the pre-print of the article appeared online.
While the problem does not negate the contribution of the article and all key findings remain the same, it does alter some rankings of the approaches used in the study.
Within this correction, we will explain the problem, how we resolved it, and present the updated results.
Networks created from real-world data contain some inaccuracies or noise, manifested as small changes in the network structure.
An important question is whether these small changes can significantly affect the analysis results.
In this paper, we study the effect of noise in changing ranks of the high centrality vertices.
We compare, using the Jaccard Index (JI), how many of the top-k high centrality nodes from the original network are also part of the top-k ranked nodes from the noisy network.
We deem a network as stable if the JI value is high.
We observe two features that affect the stability.
First, the stability is dependent on the number of top-ranked vertices considered.
When the vertices are ordered according to their centrality values, they group into clusters.
Perturbations to the network can change the relative ranking within the cluster, but vertices rarely move from one cluster to another.
Second, the stability is dependent on the local connections of the high ranking vertices.
The network is highly stable if the high ranking vertices are connected to each other.
Our findings show that the stability of a network is affected by the local properties of high centrality vertices, rather than the global properties of the entire network.
Based on these local properties we can identify the stability of a network, without explicitly applying a noise model.
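The stability measure described above can be sketched directly: compute the top-k centrality vertices of a graph and of a perturbed copy, and take the Jaccard index of the two sets. The toy graph, the rewired edge, and the use of degree centrality are illustrative assumptions.

```python
# Hedged sketch: Jaccard-index stability of top-k degree-centrality vertices
# under a small perturbation (rewiring one edge of an invented graph).

def degree_topk(adj, k):
    ranked = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    return set(ranked[:k])

def jaccard(a, b):
    return len(a & b) / len(a | b)

adj = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"a"},
}
noisy = {v: set(ns) for v, ns in adj.items()}
noisy["a"].discard("e"); noisy["e"].discard("a")   # noise: rewire edge a-e
noisy["b"].add("e");     noisy["e"].add("b")       # ...to b-e

stability_top1 = jaccard(degree_topk(adj, 1), degree_topk(noisy, 1))
stability_top2 = jaccard(degree_topk(adj, 2), degree_topk(noisy, 2))
```

Here the single rewired edge swaps the two leading vertices, so the top-1 set changes completely while the top-2 set is unchanged, illustrating how stability depends on the number of top-ranked vertices considered.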
The upcoming technology support for the semantic web promises fresh directions for the Software Engineering community.
Moreover, the semantic web has its roots in knowledge engineering, which prompts software engineers to look for applications of ontologies throughout the Software Engineering life cycle.
The internal components of a semantic web application are lightweight and may be held to lower quality standards than the externally visible modules.
In fact, the internal components are generated from the external (ontological) components.
That is why agile development approaches such as feature-driven development are suitable for developing an application's internal components.
As yet, there is no particular procedure that describes the role of ontology in these processes.
Therefore, we propose an ontology-based feature-driven development for semantic web applications that can be used from application-model development to feature design and implementation.
Features are precisely defined in the OWL-based domain model.
The transition from the OWL-based domain model to the feature list is directly defined by transformation rules.
On the other hand, the ontology-based overall model can be easily validated through automated tools.
Advantages of ontology-based feature-driven development are also discussed.
In principle, a network can transfer data at nearly the speed of light.
Today's Internet, however, is much slower: our measurements show that latencies are typically more than one, and often more than two orders of magnitude larger than the lower bound implied by the speed of light.
Closing this gap would not only add value to today's Internet applications, but might also open the door to exciting new applications.
Thus, we propose a grand challenge for the networking research community: building a speed-of-light Internet.
Towards addressing this goal, we begin by investigating the causes of latency inflation in the Internet across the network stack.
Our analysis reveals that while protocol overheads, which have dominated the community's attention, are indeed important, infrastructural inefficiencies are a significant and under-explored problem.
Thus, we propose a radical, yet surprisingly low-cost approach to mitigating latency inflation at the lowest layers and building a nearly speed-of-light Internet infrastructure.
Reinforcement learning in multi-agent systems has been studied in the fields of economic game theory, artificial intelligence and statistical physics by developing an analytical understanding of the learning dynamics (often in relation to the replicator dynamics of evolutionary game theory).
However, the majority of these analytical studies focuses on repeated normal form games, which only have a single environmental state.
Environmental dynamics, i.e. changes in the state of an environment affecting the agents' payoffs, have received less attention, and a universal method to obtain deterministic equations from established multi-state reinforcement learning algorithms has been lacking.
In this work we present a novel methodology to derive the deterministic limit resulting from an interaction-adaptation time scales separation of a general class of reinforcement learning algorithms, called temporal difference learning.
This form of learning is equipped to function in more realistic multi-state environments by using the estimated value of future environmental states to adapt the agent's behavior.
We demonstrate the potential of our method with three well-established learning algorithms: Q-learning, SARSA and Actor-Critic learning.
Illustrations of their dynamics on two multi-agent, multi-state environments reveal a wide range of different dynamical regimes, such as convergence to fixed points, limit cycles and even deterministic chaos.
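The temporal-difference update that this class of algorithms shares can be sketched on a toy two-state chain (s0 -> s1 -> terminal, reward 1 on the last step); with one action per state, the Q-learning, SARSA and critic updates coincide. The chain, step size and discount are illustrative assumptions.

```python
# Hedged sketch: tabular TD(0) on a two-state deterministic chain.
# Each update moves the estimate toward reward + gamma * next-state value,
# i.e. the "estimated value of future environmental states" used above.

def td_chain(alpha=0.1, gamma=0.9, episodes=2000):
    V = [0.0, 0.0]                                   # values of s0, s1
    for _ in range(episodes):
        V[0] += alpha * (0.0 + gamma * V[1] - V[0])  # s0 -> s1, reward 0
        V[1] += alpha * (1.0 + gamma * 0.0 - V[1])   # s1 -> terminal, reward 1
    return V

V = td_chain()   # converges with V[1] close to 1 and V[0] close to gamma
```

The deterministic limit analysed in the paper is obtained by averaging exactly such stochastic updates over many interactions per adaptation step.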
We introduce a novel notion of invariance feedback entropy to quantify the state information that is required by any controller that enforces a given subset of the state space to be invariant.
We establish a number of elementary properties; e.g., we provide conditions that ensure the invariance feedback entropy is finite, and show that in the deterministic case we recover the well-known notion of entropy for deterministic control systems.
We prove the data rate theorem, which shows that the invariance entropy is a tight lower bound of the data rate of any coder-controller that achieves invariance in the closed loop.
We analyze uncertain linear control systems and derive a universal lower bound of the invariance feedback entropy.
The lower bound depends on the absolute value of the determinant of the system matrix and a ratio involving the volume of the invariant set and the set of uncertainties.
Furthermore, we derive a lower bound of the data rate of any static, memoryless coder-controller.
Both lower bounds are intimately related and for certain cases it is possible to bound the performance loss due to the restriction to static coder-controllers by 1 bit/time unit.
We provide various examples throughout the paper to illustrate and discuss different definitions and results.
This paper investigates how retailers at different stages of e-commerce maturity evaluate their entry to e-commerce activities.
The study was conducted using qualitative approach interviewing 16 retailers in Saudi Arabia.
It identifies 22 factors believed to be the most influential for retailers in Saudi Arabia.
Interestingly, retailers in companies at different maturity stages appear to have different attitudes regarding the use of e-commerce.
The businesses that have reached a high stage of e-commerce maturity provide practical evidence of positive and optimistic attitudes and practices regarding use of e-commerce, whereas the businesses that have not reached higher levels of maturity provide practical evidence of more negative and pessimistic attitudes and practices.
The study, therefore, should contribute to efforts leading to greater e-commerce development in Saudi Arabia and other countries with similar context.
We present a novel approach for the reconstruction of dynamic geometric shapes using a single hand-held consumer-grade RGB-D sensor at real-time rates.
Our method does not require a pre-defined shape template to start with and builds up the scene model from scratch during the scanning process.
Geometry and motion are parameterized in a unified manner by a volumetric representation that encodes a distance field of the surface geometry as well as the non-rigid space deformation.
Motion tracking is based on a set of extracted sparse color features in combination with a dense depth-based constraint formulation.
This enables accurate tracking and drastically reduces drift inherent to standard model-to-depth alignment.
We cast finding the optimal deformation of space as a non-linear regularized variational optimization problem by enforcing local smoothness and proximity to the input constraints.
The problem is tackled in real-time at the camera's capture rate using a data-parallel flip-flop optimization strategy.
Our results demonstrate robust tracking even for fast motion and scenes that lack geometric features.
Image captioning is an important but challenging task, applicable to virtual assistants, editing tools, image indexing, and support of the disabled.
Its challenges are due to the variability and ambiguity of possible image descriptions.
In recent years significant progress has been made in image captioning, using Recurrent Neural Networks powered by long short-term memory (LSTM) units.
Despite mitigating the vanishing gradient problem, and despite their compelling ability to memorize dependencies, LSTM units are complex and inherently sequential across time.
To address this issue, recent work has shown benefits of convolutional networks for machine translation and conditional image generation.
Inspired by their success, in this paper, we develop a convolutional image captioning technique.
We demonstrate its efficacy on the challenging MSCOCO dataset and demonstrate performance on par with the baseline, while having a faster training time per number of parameters.
We also perform a detailed analysis, providing compelling reasons in favor of convolutional language generation approaches.
Smart contracts are computer programs that can be consistently executed by a network of mutually distrusting nodes, without the arbitration of a trusted authority.
Because of their resilience to tampering, smart contracts are appealing in many scenarios, especially in those which require transfers of money to respect certain agreed rules (like in financial services and in games).
Over the last few years many platforms for smart contracts have been proposed, and some of them have been actually implemented and used.
We study how the notion of smart contract is interpreted in some of these platforms.
Focussing on the two most widespread ones, Bitcoin and Ethereum, we quantify the usage of smart contracts in relation to their application domain.
We also analyse the most common programming patterns in Ethereum, where the source code of smart contracts is available.
We study a general class of dynamic games with asymmetric information where agents' beliefs are strategy dependent, i.e. signaling occurs.
We show that the notion of sufficient information, introduced in the companion paper, can be used to effectively compress the agents' information in a mutually consistent manner that is sufficient for decision-making purposes.
We present instances of dynamic games with asymmetric information where we can characterize a time-invariant information state for each agent.
Based on the notion of sufficient information, we define a class of equilibria for dynamic games called Sufficient Information Based Perfect Bayesian Equilibrium (SIB-PBE).
Utilizing the notion of SIB-PBE, we provide a sequential decomposition of dynamic games with asymmetric information over time; this decomposition leads to a dynamic program that determines SIB-PBE of dynamic games.
Furthermore, we provide conditions under which we can guarantee the existence of SIB-PBE.
Recurrent Neural Networks (RNNs) are an important class of neural networks designed to retain and incorporate context into current decisions.
RNNs are particularly well suited for machine learning problems in which context is important, such as speech recognition or language translation.
This work presents RNNFast, a hardware accelerator for RNNs that leverages an emerging class of non-volatile memory called domain-wall memory (DWM).
We show that DWM is very well suited for RNN acceleration due to its very high density and low read/write energy.
At the same time, the sequential nature of input/weight processing of RNNs mitigates one of the downsides of DWM, which is the linear (rather than constant) data access time.
RNNFast is very efficient and highly scalable, with flexible mapping of logical neurons to RNN hardware blocks.
The basic hardware primitive, the RNN processing element (PE), includes custom DWM-based multiplication, sigmoid and tanh units for high density and low energy.
The accelerator is designed to minimize data movement by closely interleaving DWM storage and computation.
We compare our design with a state-of-the-art GPGPU and find 21.8x better performance with 70x lower energy.
Photovoltaic (PV) power production has increased drastically in Europe over the last years.
About 6% of the electricity in Italy comes from PV, and efficient management of the power grid requires accurate and reliable production forecasting.
Starting from a dataset of the electricity production of 65 Italian solar plants for the years 2011-2012, we investigate the possibility of forecasting daily production at one to ten days of lead time without using on-site measurements.
Our study is divided into two parts: an assessment of the predictability of meteorological variables using weather forecasts, and an analysis of the application of data-driven modelling in predicting solar power production.
We calibrate a SVM model using available observations and then we force the same model with the predicted variables from weather forecasts with a lead time from one to ten days.
As expected, solar power production is strongly influenced by cloudiness and clear sky: we observe that while in summer the error is generally under 10% (slightly lower in southern Italy), in winter it is well above 20%.
This paper describes a Naive-Bayesian predictive model for 2016 U.S. Presidential Election based on Twitter data.
We use 33,708 tweets gathered since December 16, 2015 until February 29, 2016.
We introduce a simpler data preprocessing method to label the data and train the model.
The model achieves 95.8% accuracy on 10-fold cross validation and predicts Ted Cruz and Bernie Sanders as Republican and Democratic nominee respectively.
It achieves results comparable to those of competing methods.
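A multinomial Naive Bayes text classifier with Laplace smoothing, the model family trained on the labeled tweets, can be sketched as follows; the toy training texts and labels are invented for illustration.

```python
import math
from collections import Counter

# Hedged sketch: multinomial Naive Bayes with Laplace (add-one) smoothing.
# The training data below are invented, not the paper's tweet corpus.

train = [
    ("great rally tonight", "pos"),
    ("great debate win", "pos"),
    ("terrible policy fail", "neg"),
    ("bad terrible night", "neg"),
]

class_docs = Counter(label for _, label in train)
word_counts = {c: Counter() for c in class_docs}
for text, label in train:
    word_counts[label].update(text.split())
vocab = {w for c in word_counts for w in word_counts[c]}

def predict(text):
    best, best_lp = None, -math.inf
    for c in class_docs:
        lp = math.log(class_docs[c] / len(train))        # log prior
        total = sum(word_counts[c].values())
        for w in text.split():
            lp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

Training on tweets labeled by candidate preference and classifying held-out tweets is the same procedure at scale, which is how accuracy under 10-fold cross validation is measured.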
In contrast to traditional media streaming services, where a unique media content is delivered to different users, interactive multiview navigation applications enable users to choose their own viewpoints and freely navigate in a 3-D scene.
The interactivity brings new challenges in addition to the classical rate-distortion trade-off, which considers only the compression performance and viewing quality.
On the one hand, interactivity necessitates sufficient viewpoints for richer navigation; on the other hand, it requires low bandwidth and delay costs for smooth navigation during view transitions.
In this paper, we formally describe the novel trade-offs posed by the navigation interactivity and classical rate-distortion criterion.
Based on an original formulation, we look for the optimal design of the data representation by introducing novel rate and distortion models and practical solving algorithms.
Experiments show that the proposed data representation method outperforms the baseline solution by providing lower resource consumptions and higher visual quality in all navigation configurations, which certainly confirms the potential of the proposed data representation in practical interactive navigation systems.
This paper considers the problem of efficiently answering reachability queries over views of provenance graphs, derived from executions of workflows that may include recursion.
Such views include composite modules and model fine-grained dependencies between module inputs and outputs.
A novel view-adaptive dynamic labeling scheme is developed for efficient query evaluation, in which view specifications are labeled statically (i.e. as they are created) and data items are labeled dynamically as they are produced during a workflow execution.
Although the combination of fine-grained dependencies and recursive workflows entail, in general, long (linear-size) data labels, we show that for a large natural class of workflows and views, labels are compact (logarithmic-size) and reachability queries can be evaluated in constant time.
Experimental results demonstrate the benefit of this approach over the state-of-the-art technique when applied for labeling multiple views.
We present the state of the art in representing and reasoning with fuzzy knowledge in Semantic Web Languages such as triple languages RDF/RDFS, conceptual languages of the OWL 2 family and rule languages.
We further show how one may generalise them to so-called annotation domains, which also cover, e.g., temporal and provenance extensions.
The use of educational games in pedagogical practice can provide new conceptions of teaching and learning in an interactive environment that stimulates the acquisition of new knowledge.
The so-called serious games are focused on the goal of transmitting educational content or training to the user.
In the context of entrepreneurship, serious games appear to have greater importance due to the multidisciplinary nature of the knowledge needed.
Therefore, we propose the adoption of the Entrexplorer game in the context of a university classroom.
The game is a cloud-based serious game about the theme of entrepreneurship where users can access learning contents that will assist them in the acquisition of entrepreneurial skills.
The organization of the game in eight levels with six additional floors lets students learn the different dimensions of an entrepreneurship project while progressing during the gameplay.
This paper formulates a novel problem on graphs: find the minimal subset of edges in a fully connected graph, such that the resulting graph contains all spanning trees for a set of specified sub-graphs.
This formulation is motivated by an unsupervised grammar induction problem from computational linguistics.
We present a reduction to some known problems and algorithms from graph theory, provide computational complexity results, and describe an approximation algorithm.
Nonnegative matrix factorization (NMF) is one of the most frequently-used matrix factorization models in data analysis.
A significant reason for the popularity of NMF is its interpretability and the `parts of whole' interpretation of its components.
Recently, max-times, or subtropical, matrix factorization (SMF) has been introduced as an alternative model with an equally interpretable `winner takes all' interpretation.
In this paper we propose a new mixed linear--tropical model, and a new algorithm, called Latitude, that combines NMF and SMF, being able to smoothly alternate between the two.
In our model, the data is modeled using the latent factors and latent parameters that control whether the factors are interpreted as NMF or SMF features, or their mixtures.
We present an algorithm for our novel matrix factorization.
Our experiments show that our algorithm improves over both baselines, and can yield interpretable results that reveal more of the latent structure than either NMF or SMF alone.
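The two products that the Latitude model interpolates between can be shown side by side. A minimal sketch, with made-up factor matrices: NMF reconstructs each entry as a sum over rank-one contributions, while SMF keeps only the single dominant contribution.

```python
import numpy as np

def nmf_product(B, C):
    """Standard (plus-times) product used by NMF: sum_k B[i,k] * C[k,j]."""
    return B @ C

def subtropical_product(B, C):
    """Max-times (subtropical) product used by SMF: max_k B[i,k] * C[k,j].
    Each entry is `won' by a single latent factor."""
    # Broadcast to shape (rows, rank, cols) and take the max over the rank axis.
    return np.max(B[:, :, None] * C[None, :, :], axis=1)

B = np.array([[1.0, 0.5],
              [0.0, 2.0]])
C = np.array([[2.0, 0.0],
              [1.0, 3.0]])

print(nmf_product(B, C))          # [[2.5 1.5] [2.  6. ]]
print(subtropical_product(B, C))  # [[2.  1.5] [2.  6. ]]
```

Entry (0,0) is the only one where the two models disagree here: NMF adds 2.0 + 0.5, while SMF keeps max(2.0, 0.5).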
Autonomous vehicles require knowledge of the surrounding road layout, which can be predicted by state-of-the-art CNNs.
This work addresses the current lack of data for determining lane instances, which are needed for various driving manoeuvres.
The main issue is the time-consuming manual labelling process, typically applied per image.
We notice that driving the car is itself a form of annotation.
Therefore, we propose a semi-automated method that allows for efficient labelling of image sequences by utilising an estimated road plane in 3D based on where the car has driven and projecting labels from this plane into all images of the sequence.
The average labelling time per image is reduced to 5 seconds and only an inexpensive dash-cam is required for data capture.
We are releasing a dataset of 24,000 images and additionally show experimental semantic segmentation and instance segmentation results.
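The core geometric step of this labelling approach, projecting points on an estimated 3D road plane into an image, reduces to a pinhole camera projection. A minimal sketch with hypothetical intrinsics and pose; in the paper the plane and camera poses come from the estimated driven path, not from these made-up numbers.

```python
import numpy as np

def project_points(K, R, t, points_3d):
    """Project 3D world points into the image with a pinhole camera model.

    K: 3x3 intrinsics, R: 3x3 rotation, t: 3-vector translation (world -> camera).
    Returns Nx2 pixel coordinates.
    """
    cam = (R @ points_3d.T).T + t      # world -> camera coordinates
    uvw = (K @ cam.T).T                # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]    # perspective divide

# Hypothetical camera: 800 px focal length, principal point at (320, 240),
# mounted 1.5 m above the road plane, looking straight ahead.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 1.5, 0.0])

# Points along the driven lane on the ground plane y = 0 (x right, z forward).
lane = np.array([[0.0, 0.0, 10.0],
                 [0.0, 0.0, 20.0]])
print(project_points(K, R, t, lane))  # [[320. 360.] [320. 300.]]
```

As expected, the farther lane point projects higher in the image (smaller v coordinate).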
Online fashion sales present a challenging use case for personalized recommendation: Stores offer a huge variety of items in multiple sizes.
Small stocks, high return rates, seasonality, and changing trends cause continuous turnover of articles for sale on all time scales.
Customers tend to shop rarely, but often buy multiple items at once.
We report on backtest experiments with sales data of 100k frequent shoppers at Zalando, Europe's leading online fashion platform.
To model changing customer and store environments, our recommendation method employs a pair of neural networks: To overcome the cold start problem, a feedforward network generates article embeddings in "fashion space," which serve as input to a recurrent neural network that predicts a style vector in this space for each client, based on their past purchase sequence.
We compare our results with a static collaborative filtering approach, and a popularity ranking baseline.
Satellite Communication systems are a promising solution to extend and complement terrestrial networks in unserved or under-served areas.
This aspect is reflected by recent commercial and standardisation endeavours.
In particular, 3GPP recently initiated a Study Item for New Radio-based, i.e., 5G, Non-Terrestrial Networks aimed at deploying satellite systems either as a stand-alone solution or as an integration to terrestrial networks in mobile broadband and machine-type communication scenarios.
However, typical satellite channel impairments, such as large path losses, delays, and Doppler shifts, pose severe challenges to the realisation of a satellite-based NR network.
In this paper, based on the architecture options currently being discussed in the standardisation fora, we discuss and assess the impact of the satellite channel characteristics on the physical and Medium Access Control layers, both in terms of transmitted waveforms and procedures for enhanced Mobile BroadBand (eMBB) and NarrowBand-Internet of Things (NB-IoT) applications.
The proposed analysis shows that the main technical challenges are related to the PHY/MAC procedures, in particular Random Access (RA), Timing Advance (TA), and Hybrid Automatic Repeat reQuest (HARQ) and, depending on the considered service and architecture, different solutions are proposed.
Target-oriented sentiment classification aims at classifying sentiment polarities over individual opinion targets in a sentence.
RNN with attention seems a good fit for the characteristics of this task, and indeed it achieves the state-of-the-art performance.
After re-examining the drawbacks of the attention mechanism and the obstacles that prevent CNNs from performing well in this classification task, we propose a new model to overcome these issues.
Instead of attention, our model employs a CNN layer to extract salient features from the transformed word representations originating from a bi-directional RNN layer.
Between the two layers, we propose a component to generate target-specific representations of words in the sentence, while incorporating a mechanism for preserving the original contextual information from the RNN layer.
Experiments show that our model achieves a new state-of-the-art performance on a few benchmarks.
Workflows specify collections of tasks that must be executed under the responsibility or supervision of human users.
Workflow management systems and workflow-driven applications need to enforce security policies in the form of access control, specifying which users can execute which tasks, and authorization constraints, such as Separation of Duty, further restricting the execution of tasks at run-time.
Enforcing these policies is crucial to avoid frauds and malicious use, but it may lead to situations where a workflow instance cannot be completed without the violation of the policy.
The Workflow Satisfiability Problem (WSP) asks whether there exists an assignment of users to tasks in a workflow such that every task is executed and the policy is not violated.
The WSP is inherently hard, but solutions to this problem have a practical application in reconciling business compliance and business continuity.
Solutions to related problems, such as workflow resiliency (i.e., whether a workflow instance is still satisfiable even when some users are absent), are important to help in policy design.
Several variations of the WSP and similar problems have been defined in the literature and there are many solution methods available.
In this paper, we survey the work done on these problems in the past 20 years.
In this paper we present a queueing network approach to the problem of routing and rebalancing a fleet of self-driving vehicles providing on-demand mobility within a capacitated road network.
We refer to such systems as autonomous mobility-on-demand systems, or AMoD.
We first cast an AMoD system into a closed, multi-class BCMP queueing network model.
Second, we present analysis tools that allow the characterization of performance metrics for a given routing policy, in terms, e.g., of vehicle availabilities, and first and second order moments of vehicle throughput.
Third, we propose a scalable method for the synthesis of routing policies, with performance guarantees in the limit of large fleet sizes.
Finally, we validate our theoretical results on a case study of New York City.
Collectively, this paper provides a unifying framework for the analysis and control of AMoD systems, which subsumes earlier Jackson and network flow models, provides a quite large set of modeling options (e.g., the inclusion of road capacities and general travel time distributions), and allows the analysis of second and higher-order moments for the performance metrics.
Deep learning has been successfully applied to various tasks, but its underlying mechanism remains unclear.
Neural networks associate similar inputs in the visible layer with the same state of hidden variables in deep layers.
The fraction of inputs that are associated with the same state is a natural measure of similarity and is simply related to the cost in bits required to represent these inputs.
The degeneracy of states with the same information cost instead provides a natural measure of noise and is simply related to the entropy of the frequency of states, which we call relevance.
Representations with minimal noise, at a given level of similarity (resolution), are those that maximise the relevance.
A signature of such efficient representations is that frequency distributions follow power laws.
We show, in extensive numerical experiments, that deep neural networks extract a hierarchy of efficient representations from data, because they i) achieve low levels of noise (i.e. high relevance) and ii) exhibit power law distributions.
We also find that the layer that is most efficient to reliably generate patterns of training data is the one for which relevance and resolution are traded at the same price, which implies that frequency distribution follows Zipf's law.
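Reading the abstract's definitions literally, both quantities can be computed from the mapping of inputs to hidden states. A toy sketch under that reading: resolution is the entropy of the state distribution, and relevance is the entropy of the distribution of inputs over state frequencies.

```python
import numpy as np
from collections import Counter

def resolution_and_relevance(states):
    """Given the hidden state assigned to each input, compute:
    - resolution: entropy H[s] of the state distribution (bits per input);
    - relevance: entropy H[k] of the probability that an input falls in
      a state occurring k times (the frequency-of-states distribution)."""
    n = len(states)
    counts = Counter(states)                        # k_s: inputs per state
    p_state = np.array([k / n for k in counts.values()])
    resolution = -np.sum(p_state * np.log2(p_state))

    freq_mass = Counter()                           # total inputs in states of size k
    for k in counts.values():
        freq_mass[k] += k
    p_k = np.array([m / n for m in freq_mass.values()])
    relevance = -np.sum(p_k * np.log2(p_k))
    return resolution, relevance

# Toy example: 4 inputs map to state 'a', 2 to 'b', 1 each to 'c' and 'd'.
states = ['a'] * 4 + ['b'] * 2 + ['c', 'd']
res, rel = resolution_and_relevance(states)
print(round(res, 3), round(rel, 3))  # 1.75 1.5
```

Relevance is lower than resolution here because states 'c' and 'd' are degenerate: they carry the same information cost and are merged in the frequency distribution.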
The unrelenting increase in the population of mobile users and their traffic demands drive cellular network operators to densify their network infrastructure.
Network densification shrinks the footprint of base stations (BSs) and reduces the number of users associated with each BS, leading to an improved spatial frequency reuse and spectral efficiency, and thus, higher network capacity.
However, the densification gain comes at the expense of higher handover rates and network control overhead.
Hence, user mobility can diminish or even nullify the foreseen densification gain.
In this context, splitting the control plane (C-plane) and user plane (U-plane) is proposed as a potential solution to harvest densification gain with reduced cost in terms of handover rate and network control overhead.
In this article, we use stochastic geometry to develop a tractable mobility-aware model for a two-tier downlink cellular network with ultra-dense small cells and C-plane/U-plane split architecture.
The developed model is then used to quantify the effect of mobility on the foreseen densification gain with and without C-plane/U-plane split.
To this end, we shed light on the handover problem in dense cellular environments, show scenarios where the network fails to support certain mobility profiles, and obtain network design insights.
The strategy of sustainable development in the governance of information and communication technology (ICT) is an advanced research area facing rising challenges posed by social and environmental requirements in the implementation and establishment of the governance strategy.
This paper offers a new-generation governance model that we call "ICT Green Governance".
The proposed framework provides an original model based on the Corporate Social Responsibility (CSR) concept and Green IT strategy.
Facing increasing pressure from stakeholders, the model offers a new vision of ICT governance to ensure effective and efficient use of ICT in enabling an enterprise to achieve its goals.
We present here the relevance of our model, on the basis of a literature review, and provide guidelines and principles for effective ICT governance oriented toward sustainable development, in order to improve the economic, social and environmental performance of companies.
This work presents a supervised learning based approach to the computer vision problem of frame interpolation.
The presented technique could also be used in cartoon animation, since drawing each individual frame consumes a noticeable amount of time.
Most existing solutions to this problem use unsupervised methods and focus only on real-life videos with an already high frame rate.
However, experiments show that such methods do not work as well when the frame rate becomes low and object displacements between frames become large.
This is because interpolation of large-displacement motion requires knowledge of the motion structure, so simple techniques such as frame averaging start to fail.
In this work a deep convolutional neural network is used to solve the frame interpolation problem.
In addition, it is shown that incorporating prior information such as optical flow improves the interpolation quality significantly.
In this paper, cyber attack detection and isolation is studied on a network of UAVs in a formation flying setup.
As the UAVs communicate to reach consensus on their states while making the formation, the communication network among the UAVs makes them vulnerable to a potential attack from malicious adversaries.
Two types of attacks pertinent to a network of UAVs have been considered: a node attack on the UAVs and a deception attack on the communication between the UAVs.
UAV formation control is implemented using a consensus algorithm to reach a pre-specified formation.
A node attack and a communication-path deception attack on the UAV network are considered, together with their respective models, in the formation setup.
To detect these cyber attacks, a distributed fault detection scheme based on a bank of Unknown Input Observers (UIOs) is proposed to detect and identify the compromised UAV in the formation.
A rule based on the residuals generated by the bank of UIOs is used to detect attacks and identify the compromised UAV in the formation.
Further, an algorithm is developed to remove the faulty UAV from the network once an attack is detected and the compromised UAV is isolated, while maintaining formation flight with a missing UAV node.
Semantic labeling for numerical values is a task of assigning semantic labels to unknown numerical attributes.
The semantic labels could be numerical properties in ontologies, instances in knowledge bases, or labeled data that are manually annotated by domain experts.
In this paper, we refer to semantic labeling as a retrieval setting where the label of an unknown attribute is assigned by the label of the most relevant attribute in labeled data.
One of the greatest challenges is that an unknown attribute rarely has the same set of values as a similar attribute in the labeled data.
To overcome the issue, statistical interpretation of value distribution is taken into account.
However, existing studies assume a specific form of distribution.
This assumption is particularly inappropriate for open data, where there is no prior knowledge of the data.
To address these problems, we propose a neural numerical embedding model (EmbNum) to learn useful representation vectors for numerical attributes without prior assumptions on the distribution of data.
Then, the "semantic similarities" between the attributes are measured on these representation vectors by the Euclidean distance.
Our empirical experiments on City Data and Open Data show that EmbNum significantly outperforms state-of-the-art methods for the task of numerical attribute semantic labeling regarding effectiveness and efficiency.
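The retrieval setting described above reduces, at query time, to nearest-neighbor search in the embedding space. A minimal sketch with hypothetical two-dimensional embeddings and labels; in EmbNum the vectors come from the learned network, not from these made-up numbers.

```python
import numpy as np

def semantic_label(query_vec, labeled_vecs, labels):
    """Retrieval-style semantic labeling: assign the label of the labeled
    attribute whose embedding is closest (Euclidean) to the query embedding."""
    d = np.linalg.norm(labeled_vecs - query_vec, axis=1)
    return labels[int(np.argmin(d))]

# Hypothetical labeled attributes and their embedding vectors.
labeled = np.array([[0.1, 0.9],
                    [0.8, 0.2],
                    [0.5, 0.5]])
labels = ["population", "latitude", "elevation"]

# An unknown numerical attribute, embedded into the same space.
print(semantic_label(np.array([0.75, 0.25]), labeled, labels))  # latitude
```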
The field of satellite communications is enjoying a renewed interest in the global telecom market, and very high throughput satellites (V/HTS), with their multiple spot-beams, are key for delivering the future rate demands.
In this article, the state-of-the-art and open research challenges of signal processing techniques for V/HTS systems are presented for the first time, with focus on novel approaches for efficient interference mitigation.
The main signal processing topics for the ground, satellite, and user segment are addressed.
Also, the critical components for the integration of satellite and terrestrial networks are studied, such as cognitive satellite systems and satellite-terrestrial backhaul for caching.
All the reviewed techniques are essential in empowering satellite systems to support the increasing demands of the upcoming generation of communication networks.
The connected autonomous vehicle has been often touted as a technology that will become pervasive in society in the near future.
Rather than treating them as stand-alone, we examine the need for autonomous vehicles to cooperate and interact within their socio-cyber-physical environments, including the problems cooperation will solve, but also the issues and challenges.
Lung cancer is the deadliest type of cancer for both men and women.
Feature selection plays a vital role in cancer classification.
This paper investigates the feature selection process in Computed Tomographic (CT) lung cancer images using soft set theory.
We propose a new soft set based unsupervised feature selection algorithm.
Nineteen features are extracted from the segmented lung images using the gray level co-occurrence matrix (GLCM) and gray level difference matrix (GLDM).
In this paper, an efficient Unsupervised Soft Set based Quick Reduct (SSUSQR) algorithm is presented.
This method is used to select features from the data set and compared with existing rough set based unsupervised feature selection methods.
Then K-Means and Self Organizing Map (SOM) clustering algorithms are used to cluster the data.
The performance of the feature selection algorithms is evaluated based on performance of clustering techniques.
The results show that the proposed method effectively removes redundant features.
We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accurate count estimation as well as present high-quality density maps.
The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN for the back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations.
CSRNet is an easy-to-train model because of its pure convolutional structure.
We demonstrate CSRNet on four datasets (ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and we deliver the state-of-the-art performance.
In the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method.
We extend the targeted applications to counting other objects, such as vehicles in the TRANCOS dataset.
Results show that CSRNet significantly improves the output quality with 15.4% lower MAE than the previous state-of-the-art approach.
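Why dilated kernels can stand in for pooling is simple receptive-field arithmetic: each stride-1 convolution layer adds (k - 1) * d pixels to the field, so dilation widens the field without losing resolution. A sketch of that arithmetic (the layer counts here are illustrative, not CSRNet's exact configuration):

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 convolutions.
    Each layer with kernel size k and dilation d adds (k - 1) * d."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three 3x3 layers: ordinary vs dilated with rate 2.
print(receptive_field([3, 3, 3], [1, 1, 1]))  # 7
print(receptive_field([3, 3, 3], [2, 2, 2]))  # 13
```

The dilated stack nearly doubles the receptive field at the same depth and parameter count, while keeping the full spatial resolution needed for density maps.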
Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch learning setting, which requires the entire training data to be made available prior to the learning task.
This is not scalable for many real-world scenarios where new data arrives sequentially in a stream form.
We aim to address an open challenge of "Online Deep Learning" (ODL) for learning DNNs on the fly in an online setting.
Unlike traditional online learning that often optimizes some convex objective function with respect to a shallow model (e.g., a linear/kernel-based hypothesis), ODL is significantly more challenging since the optimization of the DNN objective function is non-convex, and regular backpropagation does not work well in practice, especially for online learning settings.
In this paper, we present a new online deep learning framework that attempts to tackle the challenges by learning DNN models of adaptive depth from a sequence of training data in an online learning setting.
In particular, we propose a novel Hedge Backpropagation (HBP) method for online updating the parameters of DNN effectively, and validate the efficacy of our method on large-scale data sets, including both stationary and concept drifting scenarios.
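The "Hedge" in Hedge Backpropagation refers to the classic multiplicative-weights update from online learning. A minimal sketch of that weight update over classifiers attached at different depths, assuming losses normalized to [0, 1] and a fixed discount beta (the full HBP method also backpropagates through the shared layers, which is omitted here):

```python
import numpy as np

def hedge_update(weights, losses, beta=0.9):
    """Multiplicative Hedge update: discount each depth's weight by its
    loss, then renormalize so the weights remain a distribution."""
    w = weights * beta ** losses
    return w / w.sum()

# Three classifiers attached at different depths; depth 2 predicts best here.
w = np.ones(3) / 3
losses = np.array([0.9, 0.5, 0.1])
for _ in range(10):
    w = hedge_update(w, losses)
print(np.round(w, 3))  # mass shifts toward the lowest-loss depth
```

Over the stream, the ensemble therefore adapts its effective depth: shallow classifiers dominate early (or under concept drift), deeper ones take over as they start to pay off.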
Recently, Yuan et al. (2016) have shown the effectiveness of using Long Short-Term Memory (LSTM) for performing Word Sense Disambiguation (WSD).
Their proposed technique outperformed the previous state-of-the-art with several benchmarks, but neither the training data nor the source code was released.
This paper presents the results of a reproduction study of this technique using only openly available datasets (GigaWord, SemCore, OMSTI) and software (TensorFlow).
From them, it emerged that state-of-the-art results can be obtained with much less data than hinted by Yuan et al.
All code and trained models are made freely available.
Beatmania is a rhythm action game where players take on the role of a DJ who performs music by pressing specific controller buttons to mix "Keysounds" (audio samples) at the correct time.
Unlike other rhythm action games such as Dance Dance Revolution, players must play certain notes from up to eight different instruments.
Creating game stages, called "charts," is considered a difficult and time-consuming task, and in this paper we explore approaches in computer generation for these maps.
We present a deep neural network based process for automatically generating Beatmania charts for arbitrary pieces of music.
Given a raw audio track of a song, we identify notes with their corresponding instruments, and use a neural network to classify each note as playable or nonplayable.
The final chart is produced by mapping playable notes to controls.
We achieve an F1-score on the core task of Sample Selection that significantly beats LSTM baselines.
Maurice Gross (1934-2001) was both a great linguist and a pioneer in natural language processing.
This article is written in homage to his memory.
The acknowledged model for networks of collaborations is the hypergraph model.
Nonetheless, when it comes to visualization, hypergraphs are transformed into simple graphs.
Very often, the transformation is made by clique expansion of the hyperedges, resulting in a loss of information for the user and in artificially more complex graphs due to the high number of edges represented.
The extra-node representation gives substantial improvement in the visualisation of hypergraphs and in the retrieval of information.
This paper aims at showing qualitatively and quantitatively how the extra-node representation can improve the visualisation of hypergraphs without loss of information.
The goal of the DSLDI workshop is to bring together researchers and practitioners interested in sharing ideas on how DSLs should be designed, implemented, supported by tools, and applied in realistic application contexts.
We are interested both in discovering how well-known domains such as graph processing or machine learning can best be supported by DSLs, and in exploring new domains that could be targeted by DSLs.
More generally, we are interested in building a community that can drive forward the development of modern DSLs.
These informal post-proceedings contain the submitted talk abstracts to the 3rd DSLDI workshop (DSLDI'15), and a summary of the panel discussion on Language Composition.
As wireless devices boom, and bandwidth-hungry applications (e.g., video and cloud uploading) get popular, today's Wireless Local Area Networks (WLANs) become not only crowded but also stressed at throughput.
Multi-user Multiple-Input and Multiple-Output (MU-MIMO), an advanced form of MIMO, has gained attention due to its huge potential in improving the performance of WLANs.
This paper surveys random access based MAC protocols for MU-MIMO enabled WLANs.
It first provides background information about the evolution and the fundamental MAC schemes of IEEE 802.11 Standards and Amendments, and then identifies the key requirements of designing MU-MIMO MAC protocols for WLANs.
After that, the most representative MU-MIMO MAC proposals in the literature are overviewed by benchmarking their MAC procedures and examining the key components, such as the channel state information acquisition, de/pre-coding and scheduling schemes.
Classifications and discussions on important findings of the surveyed MAC protocols are provided, based on which, the research challenges for designing effective MU-MIMO MAC protocols, as well as the envisaged MAC's role in the future heterogeneous networks, are highlighted.
Several statistical and machine learning methods are proposed to estimate the type and intensity of physical load and accumulated fatigue.
They are based on the statistical analysis of accumulated and moving window data subsets with construction of a kurtosis-skewness diagram.
This approach was applied to the data gathered by the wearable heart monitor for various types and levels of physical activities, and for people with various physical conditions.
The different levels of physical activities, loads, and fitness can be distinguished from the kurtosis-skewness diagram, and their evolution can be monitored.
Several metrics for estimation of the instant effect and accumulated effect (physical fatigue) of physical loads were proposed.
The data and results presented allow extending the application of these methods to modeling and characterization of complex human activity patterns, for example, to estimate the actual and accumulated physical load and fatigue, model potentially dangerous developments, and give cautions and advice in real time.
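The core of the kurtosis-skewness diagram is straightforward: slide a window over the signal and plot the third and fourth standardized moments of each window. A sketch with synthetic data standing in for heart-monitor readings (the rest/load distributions here are hypothetical, chosen only so the two regimes separate):

```python
import numpy as np

def skew_kurt(x):
    """Sample skewness and excess kurtosis of a 1-D array."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean(), (z ** 4).mean() - 3.0

def kurtosis_skewness_track(signal, window):
    """Non-overlapping moving windows -> one (skewness, kurtosis) point each;
    the trajectory of these points is the kurtosis-skewness diagram."""
    return np.array([skew_kurt(signal[i:i + window])
                     for i in range(0, len(signal) - window + 1, window)])

rng = np.random.default_rng(0)
rest = rng.normal(60, 2, 300)            # hypothetical resting heart rate
load = rng.gamma(2.0, 10.0, 300) + 80    # right-skewed distribution under load
track = kurtosis_skewness_track(np.concatenate([rest, load]), window=100)
print(track.shape)  # (6, 2): six windows, two moments each
```

The near-symmetric resting windows cluster near zero skewness, while the load windows drift toward high skewness, which is the separation the diagram exploits.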
The evaluation of machine learning algorithms in biomedical fields for applications involving sequential data lacks standardization.
Common quantitative scalar evaluation metrics such as sensitivity and specificity can often be misleading depending on the requirements of the application.
Evaluation metrics must ultimately reflect the needs of users yet be sufficiently sensitive to guide algorithm development.
Feedback from critical care clinicians who use automated event detection software in clinical applications has been overwhelmingly emphatic that a low false alarm rate, typically measured in units of the number of errors per 24 hours, is the single most important criterion for user acceptance.
Though using a single metric is not often as insightful as examining performance over a range of operating conditions, there is a need for a single scalar figure of merit.
In this paper, we discuss the deficiencies of existing metrics for a seizure detection task and propose several new metrics that offer a more balanced view of performance.
We demonstrate these metrics on a seizure detection task based on the TUH EEG Corpus.
We show that two promising metrics are a measure based on a concept borrowed from the spoken term detection literature, Actual Term-Weighted Value, and a new metric, Time-Aligned Event Scoring (TAES), that accounts for the temporal alignment of the hypothesis to the reference annotation.
We also demonstrate that state of the art technology based on deep learning, though impressive in its performance, still needs significant improvement before it will meet very strict user acceptance guidelines.
Automatic speech recognition (ASR) has been widely researched with supervised approaches, while many low-resourced languages lack audio-text aligned data, and supervised methods cannot be applied on them.
In this work, we propose a framework to achieve unsupervised ASR on a read English speech dataset, where audio and text are unaligned.
In the first stage, each word-level audio segment in the utterances is represented by a vector representation extracted by a sequence-to-sequence autoencoder, in which phonetic information and speaker information are disentangled.
Secondly, semantic embeddings of audio segments are trained from the vector representations using a skip-gram model.
Finally, an unsupervised method is utilized to transform the semantic embeddings of audio segments into the text embedding space, and the transformed embeddings are mapped to words.
With the above framework, we move towards unsupervised ASR trained on unaligned text and speech only.
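The last stage maps one embedding space onto another. The paper learns this transform without paired data; as a minimal supervised stand-in that shows what such a map looks like, here is an orthogonal Procrustes fit on synthetic paired embeddings (all data below is made up):

```python
import numpy as np

def procrustes_map(X, Y):
    """Best orthogonal map W minimizing ||X W - Y||_F,
    via the SVD of X^T Y (orthogonal Procrustes problem)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(1)
W_true = np.linalg.qr(rng.normal(size=(4, 4)))[0]  # hidden orthogonal map
X = rng.normal(size=(50, 4))                       # audio-segment embeddings
Y = X @ W_true                                     # corresponding text embeddings
W = procrustes_map(X, Y)
print(np.allclose(W, W_true))  # True: the map is recovered exactly
```

In the noiseless, exactly-orthogonal case the map is recovered perfectly; the unsupervised setting replaces the paired fit with distribution-matching techniques.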
Recent technology advancements in the areas of compute, storage and networking, along with the increased demand for organizations to cut costs while remaining responsive to increasing service demands have led to the growth in the adoption of cloud computing services.
Cloud services provide the promise of improved agility, resiliency, scalability and a lowered Total Cost of Ownership (TCO).
This research introduces a framework for minimizing cost and maximizing resource utilization by using an Integer Linear Programming (ILP) approach to optimize the assignment of workloads to servers on Amazon Web Services (AWS) cloud infrastructure.
The model is based on the classical minimum-cost flow model, known as the assignment model.
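The assignment model at the core of this approach can be illustrated in miniature: pick the workload-to-server mapping that minimizes total cost. The costs below are hypothetical, and the brute-force search is only a sketch of the objective; the paper's ILP handles the general (capacitated, many-to-one) case with a proper solver.

```python
import itertools
import numpy as np

# Hypothetical hourly cost of placing workload i on instance type j.
cost = np.array([
    [4.0, 2.5, 3.0],
    [3.5, 5.0, 2.0],
    [1.5, 3.0, 4.5],
])

def best_assignment(cost):
    """Exhaustive solution of the one-to-one assignment model: the
    permutation (workload -> server) with minimum total cost."""
    n = cost.shape[0]
    return min(itertools.permutations(range(n)),
               key=lambda p: sum(cost[i, p[i]] for i in range(n)))

p = best_assignment(cost)
total = sum(cost[i, p[i]] for i in range(3))
print(p, total)  # (1, 2, 0) 6.0
```

Here the cheapest server for each workload individually would collide on assignments; the global optimum trades off placements, which is exactly what the min-cost-flow formulation captures at scale.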
The success of online auctions has given buyers access to greater product diversity with potentially lower prices.
It has provided sellers with access to large numbers of potential buyers and reduced transaction costs by enabling auctions to take place without regard to time or place.
However, it is difficult for participants to spend long periods with the system, closely monitoring an auction until they win the bid or the auction closes.
Determining which items to bid on, what the recommended bid may be, and when to bid are difficult questions for online auction participants to answer.
The multi-agent auction advisor system, built with JADE and TRACE and connected to a decision support system, gives recommended bids to buyers for online auctions.
The auction advisor system relies on intelligent agents both for the retrieval of relevant auction data and for the processing of that data to enable meaningful recommendations, statistical reports and market prediction report to be made to auction participants.
Semi-supervised learning is attracting increasing attention due to the fact that datasets of many domains lack enough labeled data.
Variational Auto-Encoder (VAE), in particular, has demonstrated the benefits of semi-supervised learning.
The majority of existing semi-supervised VAEs utilize a classifier to exploit label information, where the parameters of the classifier are introduced to the VAE.
Given the limited labeled data, learning the parameters for the classifiers may not be an optimal solution for exploiting label information.
Therefore, in this paper, we develop a novel approach for semi-supervised VAE without a classifier.
Specifically, we propose a new model called Semi-supervised Disentangled VAE (SDVAE), which encodes the input data into a disentangled representation and a non-interpretable representation; the category information is then directly utilized to regularize the disentangled representation via an equality constraint.
To further enhance the feature learning ability of the proposed VAE, we incorporate reinforcement learning to relieve the lack of data.
The dynamic framework is capable of dealing with both image and text data with its corresponding encoder and decoder networks.
Extensive experiments on image and text datasets demonstrate the effectiveness of the proposed framework.
We generalize the class of split graphs to the directed case and show that these split digraphs can be identified from their degree sequences; we give two such characterizations.
The first degree sequence characterization is an extension of the concept of splittance to directed graphs, while the second characterization says a digraph is split if and only if its degree sequence satisfies one of the Fulkerson inequalities (which determine when an integer-pair sequence is digraphic) with equality.
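As an illustration of the second characterization, a minimal sketch of testing the Fulkerson inequalities on a sequence of (out-degree, in-degree) pairs; the function name, tie-breaking rule, and examples are our own, and the sorting convention may differ from the paper's exact statement:

```python
def is_digraphic(pairs):
    """Check whether a sequence of (out-degree, in-degree) pairs is digraphic
    using the Fulkerson inequalities (a sketch; ties are broken by sorting on
    both coordinates descending, which suffices for these small examples)."""
    n = len(pairs)
    seq = sorted(pairs, key=lambda p: (p[0], p[1]), reverse=True)
    a = [p[0] for p in seq]
    b = [p[1] for p in seq]
    if sum(a) != sum(b):          # out-degrees and in-degrees must balance
        return False
    for k in range(1, n + 1):
        lhs = sum(a[:k])
        rhs = sum(min(b[i], k - 1) for i in range(k)) + \
              sum(min(b[i], k) for i in range(k, n))
        if lhs > rhs:             # a Fulkerson inequality is violated
            return False
    return True
```

For instance, the complete digraph on three vertices gives the sequence [(2, 2), (2, 2), (2, 2)], which satisfies every inequality, while a single vertex with pair (1, 1) fails (it would require a self-loop).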
A salient dynamic property of social media is bursting behavior.
In this paper, we study bursting behavior in terms of the temporal relation between a preceding baseline fluctuation and the successive burst response using a frequency time series of 3,000 keywords on Twitter.
We found that there is a fluctuation threshold up to which the burst size increases as the fluctuation increases and that above the threshold, there appears a variety of burst sizes.
We call this threshold the critical threshold.
Relating this threshold to endogenous and exogenous bursts, based on peak ratio and burst size, reveals that bursts below the threshold are endogenously caused, while above it exogenous bursts emerge.
Analysis of the 3,000 keywords shows that all the nouns have both endogenous and exogenous origins of bursts and that each keyword has a critical threshold in the baseline fluctuation value to distinguish between the two.
Having a threshold for an input value for activating the system implies that Twitter is an excitable medium.
These findings are useful for characterizing how excitable a keyword is on Twitter and could be used, for example, to predict the response to particular information on social media.
This work presents a novel approach for the early recognition of the type of a laparoscopic surgery from its video.
Early recognition algorithms can be beneficial to the development of 'smart' OR systems that can provide automatic context-aware assistance, and also enable quick database indexing.
The task is however ridden with challenges specific to videos belonging to the domain of laparoscopy, such as high visual similarity across surgeries and large variations in video durations.
To capture the spatio-temporal dependencies in these videos, we choose as our model a combination of a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network.
We then propose two complementary approaches for improving early recognition performance.
The first approach is a CNN fine-tuning method that encourages surgeries to be distinguished based on the initial frames of laparoscopic videos.
The second approach, referred to as 'Future-State Predicting LSTM', trains an LSTM to predict information related to future frames, which helps in distinguishing between the different types of surgeries.
We evaluate our approaches on a large dataset of 425 laparoscopic videos containing 9 types of surgeries (Laparo425), and achieve on average an accuracy of 75% having observed only the first 10 minutes of a surgery.
These results are quite promising from a practical standpoint and also encouraging for other types of image-guided surgeries.
As wireless ad hoc and mobile networks are emerging and the transferred data become more sensitive, information security measures should make use of all the available contextual resources to secure information flows.
The physical layer security framework provides models, algorithms, and proofs of concept for generating pairwise symmetric keys over single links between two nodes within communication range.
In this study, we focus on cooperative group key generation over multiple Impulse Radio - Ultra Wideband (IR-UWB) channels according to the source model.
The main idea, proposed in previous work, consists in generating receiver-specific signals, also called s-signals, so that only the intended receiver has access to the non-observable channels corresponding to its non-adjacent links.
Herein, we complete the analysis of the proposed protocol and investigate several signal processing algorithms to generate the s-signal expressed as a solution to a deconvolution problem in the case of IR-UWB.
Our findings indicate that it is necessary to add a parameterizable constraint to the sought s-signal and that the Expectation-Maximization algorithm can provide a stable self-parameterizable solution.
Compared to physical layer key distribution methods, the proposed key generation protocol requires less traffic overhead for small cooperative groups while being robust at medium and high signal-to-noise ratios.
New ideas in distributed systems (algorithms or protocols) are commonly tested by simulation, because experimenting with a prototype deployed on a realistic platform is cumbersome.
However, a prototype not only measures performance but also verifies assumptions about the underlying system.
We developed dfuntest - a testing framework for distributed applications that defines abstractions and test structure, and automates experiments on distributed platforms.
Dfuntest aims to be JUnit's analogue for distributed applications: a framework that enables the programmer to write robust and flexible experiment scenarios.
Dfuntest requires minimal bindings that specify how to deploy and interact with the application.
Dfuntest's abstractions allow execution of a scenario on a single machine, a cluster, a cloud, or any other distributed infrastructure, e.g. on PlanetLab.
A scenario is a procedure; thus, our framework can be used both for functional tests and for performance measurements.
We show how to use dfuntest to deploy our DHT prototype on 60 PlanetLab nodes and verify whether the prototype maintains a correct topology.
Collaborative object transportation using multiple Micro Aerial Vehicles (MAVs) with limited communication is a challenging problem.
In this paper we address the problem of multiple MAVs mechanically coupled to a bulky object for transportation purposes without explicit communication between agents.
The apparent physical properties of each agent are reshaped to achieve robustly stable transportation.
Parametric uncertainties and unmodeled dynamics of each agent are quantified and techniques from robust control theory are employed to choose the physical parameters of each agent to guarantee stability.
Extensive simulation analysis and experimental results show that the proposed method guarantees stability in worst case scenarios.
Location-Based Services (LBSs) build upon geographic information to provide users with location-dependent functionalities.
In such a context, it is particularly important that geographic locations claimed by users are the actual ones.
Centralized verification approaches proposed in the last few years are not satisfactory, as they entail a high risk to the privacy of users.
In this paper, we present and evaluate a novel decentralized, infrastructure-independent proof-of-location scheme based on the blockchain technology.
Our scheme guarantees both location trustworthiness and user privacy preservation.
Composition and parameterization of multicomponent predictive systems (MCPSs) consisting of chains of data transformation steps are a challenging task.
Auto-WEKA is a tool to automate the combined algorithm selection and hyperparameter (CASH) optimization problem.
In this paper, we extend the CASH problem and Auto-WEKA to support the MCPS, including preprocessing steps for both classification and regression tasks.
We define the optimization problem in which the search space consists of suitably parameterized Petri nets forming the sought MCPS solutions.
In the experimental analysis, we focus on examining the impact of considerably extending the search space (from approximately 22,000 to 812 billion possible combinations of methods and categorical hyperparameters).
In a range of extensive experiments, three different optimization strategies are used to automatically compose MCPSs for 21 publicly available data sets.
The diversity of the composed MCPSs found is an indication that fully and automatically exploiting different combinations of data cleaning and preprocessing techniques is possible and highly beneficial for different predictive models.
We also present the results on seven data sets from real chemical production processes.
Our findings can have a major impact on the development of high-quality predictive models as well as their maintenance and scalability aspects needed in modern applications and deployment scenarios.
Linear Discriminant Analysis (LDA) is a widely-used supervised dimensionality reduction method in computer vision and pattern recognition.
In null space based LDA (NLDA), a well-known LDA extension, between-class distance is maximized in the null space of the within-class scatter matrix.
However, there are some limitations in NLDA.
Firstly, for many data sets, the null space of the within-class scatter matrix does not exist, so NLDA is not applicable to those data sets.
Secondly, NLDA uses the arithmetic mean of between-class distances and gives equal consideration to all of them, which allows larger between-class distances to dominate the result and thus limits the performance of NLDA.
In this paper, we propose a harmonic mean based Linear Discriminant Analysis, Multi-Class Discriminant Analysis (MCDA), for image classification, which minimizes the reciprocal of the weighted harmonic mean of pairwise between-class distances.
More importantly, MCDA gives higher priority to maximizing small between-class distances.
MCDA can be extended to multi-label dimension reduction.
Results on 7 single-label data sets and 4 multi-label data sets show that MCDA consistently outperforms 10 other single-label approaches and 4 other multi-label approaches in terms of classification accuracy and macro- and micro-averaged F1 score.
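A toy numerical illustration (the distance values are hypothetical) of why a harmonic-mean objective prioritizes small pairwise between-class distances, while the arithmetic mean used by NLDA is dominated by large ones:

```python
def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def harmonic_mean(xs):
    # Reciprocal of the mean of reciprocals; small values dominate.
    return len(xs) / sum(1.0 / x for x in xs)

# Pairwise between-class distances: one small distance among large ones.
dists = [0.5, 10.0, 10.0, 10.0]
am = arithmetic_mean(dists)   # 7.625 -- dominated by the large distances
hm = harmonic_mean(dists)     # ~1.739 -- dominated by the small distance
```

Increasing the single small distance raises the harmonic mean far more than the arithmetic mean, which is the intuition behind MCDA's objective.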
Human ability of both versatile grasping of given objects and grasping of novel (as of yet unseen) objects is truly remarkable.
This probably arises from the experience infants gather by actively playing around with diverse objects.
Moreover, knowledge acquired during this process is reused during learning of how to grasp novel objects.
We conjecture that this combined process of active and transfer learning boils down to a random search around an object, suitably biased by prior experience, to identify promising grasps.
In this paper we present an active learning method for learning of grasps for given objects, and a transfer learning method for learning of grasps for novel objects.
Our learning methods apply a kernel adaptive Metropolis-Hastings sampler that learns an approximation of the grasps' probability density of an object while drawing grasp proposals from it.
The sampler employs simulated annealing to search for globally-optimal grasps.
Our empirical results show promising applicability of our proposed learning schemes.
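A much-simplified stand-in for the sampler described above: plain random-walk Metropolis-Hastings with a geometric annealing schedule on a hypothetical one-dimensional "grasp quality" density (the paper's kernel-adaptive proposal is omitted):

```python
import math
import random

def annealed_mh(log_p, x0, n_iter=3000, step=0.5, t0=5.0, t_min=0.05, seed=0):
    """Random-walk Metropolis-Hastings with a geometric annealing schedule;
    returns the best point found (a simplified stand-in for the paper's
    kernel-adaptive sampler)."""
    rng = random.Random(seed)
    x, best = x0, x0
    t = t0
    for _ in range(n_iter):
        cand = x + rng.gauss(0.0, step)
        # Accept with probability min(1, (p(cand)/p(x))**(1/t)).
        log_alpha = (log_p(cand) - log_p(x)) / t
        if math.log(rng.random() + 1e-300) < log_alpha:
            x = cand
        if log_p(x) > log_p(best):
            best = x
        t = max(t_min, t * 0.999)   # cool the temperature down
    return best

# Toy 1-D "grasp quality" log-density with its mode at x = 3 (hypothetical).
best = annealed_mh(lambda x: -(x - 3.0) ** 2, x0=-5.0)
```

As the temperature cools, the chain concentrates near the global mode, mimicking the simulated-annealing search for globally optimal grasps.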
We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision.
We consider four syntax tasks at different depths of the parse tree; for each word, we predict its part of speech as well as the first (parent), second (grandparent) and third level (great-grandparent) constituent labels that appear above it.
These predictions are made from representations produced at different depths in networks that are pretrained with one of four objectives: dependency parsing, semantic role labeling, machine translation, or language modeling.
In every case, we find a correspondence between network depth and syntactic depth, suggesting that a soft syntactic hierarchy emerges.
This effect is robust across all conditions, indicating that the models encode significant amounts of syntax even in the absence of explicit syntactic supervision during training.
With the emergence of graph databases, the task of frequent subgraph discovery has been extensively addressed.
Although the proposed approaches in the literature have made this task feasible, the number of discovered frequent subgraphs is still too high to be used efficiently in any further exploration.
Feature selection for graph data is a way to reduce the high number of frequent subgraphs based on exact or approximate structural similarity.
However, current structural similarity strategies are not efficient enough in many real-world applications; moreover, the combinatorial nature of graphs makes similarity computation very costly.
In order to select a smaller yet structurally irredundant set of subgraphs, we propose a novel approach that mines the top-k topological representative subgraphs among the frequent ones.
Our approach allows detecting hidden structural similarities that existing approaches are unable to detect such as the density or the diameter of the subgraph.
In addition, it can be easily extended using any user defined structural or topological attributes depending on the sought properties.
Empirical studies on real and synthetic graph datasets show that our approach is fast and scalable.
QR decomposition is used prevalently in wireless communication.
In this paper, we express the Givens-rotation-based QR decomposition algorithm on a spatial architecture using T2S (Temporal To Spatial), a high-productivity spatial programming methodology for expressing high-performance spatial designs.
There are interesting challenges: the loop iteration space is not rectangular, and it is not obvious how the imperative algorithm can be expressed in a functional notation, the starting point of T2S.
Using QR decomposition as an example, this paper elucidates some general principles and demystifies high-performance spatial programming.
The paper also serves as a tutorial of spatial programming for programmers who are not mathematicians, not expert programmers, and not experts on spatial architectures, but still hope to intuitively identify a high-performance design and map to spatial architectures efficiently.
Although automated reasoning with diagrams has been possible for some years, tools for diagrammatic reasoning are generally much less sophisticated than their sentential cousins.
The tasks of exploring levels of automation and abstraction in the construction of proofs and of providing explanations of solutions expressed in the proofs remain to be addressed.
In this paper we take an interactive proof assistant for Euler diagrams, Speedith, and add tactics to its reasoning engine, providing a level of automation in the construction of proofs.
By adding tactics to Speedith's repertoire of inferences, we ease the interaction between the user and the system and capture a higher level explanation of the essence of the proof.
We analysed the design options for tactics by using metrics which relate to human readability, such as the number of inferences and the amount of clutter present in diagrams.
Thus, in contrast to the normal case with sentential tactics, our tactics are designed to not only prove the theorem, but also to support explanation.
In this paper, we analytically study the bit error rate (BER) performance of underwater visible light communication (UVLC) systems with binary pulse position modulation (BPPM).
We simulate the channel fading-free impulse response (FFIR) using a Monte Carlo numerical method to take into account the absorption and scattering effects.
Additionally, to characterize turbulence effects, we multiply the aforementioned FFIR by a fading coefficient which for weak oceanic turbulence can be modeled as a lognormal random variable (RV).
Moreover, to mitigate turbulence effects, we employ multiple transmitters and/or receivers, i.e., spatial diversity technique over UVLC links.
Closed-form expressions for the system BER are provided when an equal gain combiner (EGC) is employed at the receiver side, using the Gauss-Hermite quadrature formula and an approximation to the sum of lognormal RVs.
We further apply saddle-point approximation, an accurate photon-counting-based method, to evaluate the system BER in the presence of shot noise.
Both laser-based collimated and light emitting diode (LED)-based diffusive links are investigated.
Since multiple-scattering effect of UVLC channels on the propagating photons causes considerable inter-symbol interference (ISI), especially for diffusive channels, we also obtain the optimum multiple-symbol detection (MSD) algorithm to significantly alleviate ISI effects and improve the system performance.
Our numerical analysis indicates good matches between the analytical and photon-counting results implying the negligibility of signal-dependent shot noise, and also between analytical results and numerical simulations confirming the accuracy of our derived closed-form expressions for the system BER.
Besides, our results show that spatial diversity significantly mitigates fading impairments while MSD considerably alleviates ISI deteriorations.
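To illustrate the Gauss-Hermite step behind the closed-form BER, a minimal sketch averaging the Gaussian Q-function over a lognormal fading coefficient; the 3-point rule, the normalization E[h] = 1, and the SNR value are our own illustrative choices, not the paper's:

```python
import math

# 3-point Gauss-Hermite rule (physicists' convention):
# integral of f(x)*exp(-x^2) dx ~ sum(w_i * f(x_i)).
GH_NODES = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
GH_WEIGHTS = [math.sqrt(math.pi) / 6, 2 * math.sqrt(math.pi) / 3,
              math.sqrt(math.pi) / 6]

def lognormal_average(g, mu, sigma):
    """E[g(h)] for h = exp(mu + sigma*Z), Z ~ N(0,1), via Gauss-Hermite."""
    s = sum(w * g(math.exp(mu + math.sqrt(2) * sigma * x))
            for w, x in zip(GH_WEIGHTS, GH_NODES))
    return s / math.sqrt(math.pi)

def q_func(x):
    """Gaussian Q-function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Average BER of a fading link, E[Q(sqrt(snr)*h)], with the lognormal fading
# coefficient normalized to E[h] = 1 (mu = -sigma^2 / 2); snr is illustrative.
sigma = 0.3
ber = lognormal_average(lambda h: q_func(math.sqrt(4.0) * h),
                        mu=-sigma ** 2 / 2, sigma=sigma)
```

More quadrature points tighten the approximation; the 3-point rule already recovers the lognormal mean to within about 1e-5 for weak turbulence (sigma = 0.3).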
Publications in the life sciences are characterized by a large technical vocabulary, with many lexical and semantic variations for expressing the same concept.
Towards addressing the problem of relevance in biomedical literature search, we introduce a deep learning model for the relevance of a document's text to a keyword style query.
Limited by a relatively small amount of training data, the model uses pre-trained word embeddings.
With these, the model first computes a variable-length Delta matrix between the query and document, representing a difference between the two texts, which is then passed through a deep convolution stage followed by a deep feed-forward network to compute a relevance score.
This results in a fast model suitable for use in an online search engine.
The model is robust and outperforms comparable state-of-the-art deep learning approaches.
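A hedged guess at the Delta-matrix construction described above, as an element-wise difference between each document and query embedding pair (the paper's exact definition may differ; the toy 2-D vectors are hypothetical):

```python
def delta_matrix(query_vecs, doc_vecs):
    """Element-wise difference 'Delta' tensor between query and document
    word embeddings. Shape: len(doc) x len(query) x dim. This is our own
    reading of the construction, not necessarily the paper's exact one."""
    return [[[d_i - q_i for d_i, q_i in zip(d, q)] for q in query_vecs]
            for d in doc_vecs]

# Toy 2-D "embeddings" (hypothetical values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 1.0], [2.0, 0.0], [0.0, 0.0]]
delta = delta_matrix(query, doc)
```

The result is variable-length along the document and query axes, matching the abstract's description, and would then be fed to the convolutional stage.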
We consider the problem of learning underlying tree structure from noisy, mixed data obtained from a linear model.
To achieve this, we use the expectation maximization algorithm combined with Chow-Liu minimum spanning tree algorithm.
This algorithm is sub-optimal, but has low complexity and is applicable to model selection problems through any linear model.
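A minimal sketch of the Chow-Liu step on fully observed discrete data (the EM part that handles noisy, mixed observations is omitted); the variable names and the toy chain-structured data set are our own:

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(samples, i, j):
    """Empirical mutual information between columns i and j of the samples."""
    n = len(samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    pij = Counter((s[i], s[j]) for s in samples)
    mi = 0.0
    for (a, b), c in pij.items():
        mi += (c / n) * math.log(c * n / (pi[a] * pj[b]))
    return mi

def chow_liu_tree(samples, n_vars):
    """Maximum spanning tree over pairwise mutual information (Kruskal)."""
    edges = sorted(((mutual_information(samples, i, j), i, j)
                    for i, j in combinations(range(n_vars), 2)), reverse=True)
    parent = list(range(n_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # keep the edge if it joins two components
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Toy data from a chain X0 -> X1 -> X2 with occasional flips, so the
# learned tree should be the chain 0-1-2.
samples = [(0, 0, 0)] * 40 + [(1, 1, 1)] * 40 + [(0, 0, 1)] * 5 + \
          [(1, 1, 0)] * 5 + [(0, 1, 1)] * 5 + [(1, 0, 0)] * 5
tree = chow_liu_tree(samples, 3)
```

In the full algorithm, this tree fit would sit inside the EM loop, with expectations over the latent clean variables replacing the raw counts.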
A key limitation of current multi-robot systems is a lack of relative localization, particularly in environments without GPS or motion capture systems.
This article presents a centralized method for relatively localizing a 2D swarm using sensors and beacons on the robots themselves.
The UKF-based algorithm as well as the requisite novel and cost-effective sensing hardware are discussed.
Comparisons with a motion capture system show that the method is capable of localization with errors on the order of the size of the robots.
How does one verify that the output of a complicated program is correct?
One can formally prove that the program is correct, but this may be beyond the power of existing methods.
Alternatively one can check that the output produced for a particular input satisfies the desired input-output relation, by running a checker on the input-output pair.
Then one only needs to prove the correctness of the checker.
But for some problems even such a checker may be too complicated to formally verify.
There is a third alternative: augment the original program to produce not only an output but also a correctness certificate, with the property that a very simple program (whose correctness is easy to prove) can use the certificate to verify that the input-output pair satisfies the desired input-output relation.
We consider the following important instance of this general question: How does one verify that the dominator tree of a flow graph is correct?
Existing fast algorithms for finding dominators are complicated, and even verifying the correctness of a dominator tree in the absence of additional information seems complicated.
We define a correctness certificate for a dominator tree, show how to use it to easily verify the correctness of the tree, and show how to augment fast dominator-finding algorithms so that they produce a correctness certificate.
We also relate the dominator certificate problem to the problem of finding independent spanning trees in a flow graph, and we develop algorithms to find such trees.
All our algorithms run in linear time.
Previous algorithms apply just to the special case of only trivial dominators, and they take at least quadratic time.
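For contrast with the linear-time certificates developed in the paper, a naive quadratic checker that each claimed immediate dominator at least dominates its vertex (it does not verify immediacy); the graph representation and names are our own:

```python
from collections import deque

def verify_dominators(succ, root, idom):
    """Naive O(n*m) check: removing idom[v] must disconnect v from the root.
    Unlike the linear-time certificates in the paper, this is quadratic and
    does not verify that idom[v] is the *immediate* dominator."""
    for v in succ:
        if v == root:
            continue
        blocked = idom[v]
        if v == blocked:            # a vertex cannot dominate itself here
            return False
        # BFS from the root while avoiding the claimed dominator.
        seen = {root} if root != blocked else set()
        queue = deque(seen)
        while queue:
            u = queue.popleft()
            for w in succ[u]:
                if w != blocked and w not in seen:
                    seen.add(w)
                    queue.append(w)
        if v in seen:               # v reachable without idom[v]: not a dominator
            return False
    return True

# Diamond flow graph: r -> a, r -> b, a -> c, b -> c; idom of every vertex is r.
succ = {'r': ['a', 'b'], 'a': ['c'], 'b': ['c'], 'c': []}
idom = {'a': 'r', 'b': 'r', 'c': 'r'}
```

Claiming `idom['c'] = 'a'` would be rejected, since `c` stays reachable through `b` when `a` is removed.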
In this paper, we propose a novel 3D-RecGAN approach, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks.
Unlike the existing work which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN only takes the voxel grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid by filling in the occluded/missing regions.
The key idea is to combine the generative capabilities of autoencoders and the conditional Generative Adversarial Networks (GAN) framework, to infer accurate and fine-grained 3D structures of objects in high-dimensional voxel space.
Extensive experiments on large synthetic datasets show that the proposed 3D-RecGAN significantly outperforms the state of the art in single view 3D object reconstruction, and is able to reconstruct unseen types of objects.
Our code and data are available at: https://github.com/Yang7879/3D-RecGAN.
Segmentation of histological images is one of the most crucial tasks for many biomedical analyses including quantification of certain tissue type.
However, challenges are posed by high variability and complexity of structural features in such images, in addition to imaging artifacts.
Further, the conventional approach of manual thresholding is labor-intensive, and highly sensitive to inter- and intra-image intensity variations.
An accurate and robust automated segmentation method is of high interest.
We propose and evaluate an elegant convolutional neural network (CNN) designed for segmentation of histological images, particularly those with Masson's trichrome stain.
The network comprises 11 successive convolution - rectified linear unit - batch normalization layers, and outperformed state-of-the-art CNNs on a dataset of cardiac histological images (labeling fibrosis, myocytes, and background) with a Dice similarity coefficient of 0.947.
With 100 times fewer (only 300 thousand) trainable parameters, our CNN is less susceptible to overfitting, and is efficient.
Additionally, it retains image resolution from input to output, captures fine-grained details, and can be trained end-to-end smoothly.
To the best of our knowledge, this is the first deep CNN tailored for the problem of concern, and may be extended to solve similar segmentation tasks to facilitate investigations into pathology and clinical treatment.
In this paper, we theoretically address three fundamental problems involving deep convolutional networks regarding invariance, depth and hierarchy.
We introduce the paradigm of Transformation Networks (TN) which are a direct generalization of Convolutional Networks (ConvNets).
Theoretically, we show that TNs (and thereby ConvNets) can be invariant to non-linear transformations of the input despite pooling over mere local translations.
Our analysis provides clear insights into the increase in invariance with depth in these networks.
Deeper networks are able to model much richer classes of transformations.
We also find that a hierarchical architecture allows the network to generate invariance much more efficiently than a non-hierarchical network.
Our results provide useful insight into these three fundamental problems in deep learning using ConvNets.
High-throughput data acquisition in synthetic biology leads to an abundance of data that need to be processed and aggregated into useful biological models.
Building dynamical models based on this wealth of data is of paramount importance to understand and optimize designs of synthetic biology constructs.
However, building models manually for each data set is inconvenient and might become infeasible for highly complex synthetic systems.
In this paper, we present state-of-the-art system identification techniques and combine them with chemical reaction network theory (CRNT) to generate dynamic models automatically.
On the system identification side, Sparse Bayesian Learning offers methods to learn from data the sparsest set of dictionary functions necessary to capture the dynamics of the system into ODE models; on the CRNT side, building on such sparse ODE models, all possible network structures within a given parameter uncertainty region can be computed.
Additionally, the system identification process can be complemented with constraints on the parameters to, for example, enforce stability or non-negativity, thus offering relevant physical constraints over the possible network structures.
In this way, the wealth of data can be translated into biologically relevant network structures, which then steers the data acquisition, thereby providing a vital step for closed-loop system identification.
Bankruptcy prediction is very important for every organization, since bankruptcy affects the economy and raises many social problems with high costs.
A large number of techniques have been developed to predict bankruptcy, helping decision makers such as investors and financial analysts.
One such bankruptcy prediction model is a hybrid of Fuzzy C-means clustering and MARS, which uses static ratios taken from bank financial statements for prediction and has its own theoretical advantages.
The performance of the existing bankruptcy model can be improved by selecting the best features dynamically, depending on the nature of the firm.
This dynamic selection can be accomplished by a genetic algorithm, which improves the performance of the prediction model.
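A minimal sketch of genetic-algorithm feature selection over 0/1 masks; the operators, parameters, and the stand-in fitness function (a proxy for cross-validated accuracy of the bankruptcy model) are our own illustrative choices:

```python
import random

def ga_feature_selection(fitness, n_features, pop_size=20, generations=40,
                         mut_rate=0.1, seed=1):
    """Minimal genetic algorithm over 0/1 feature masks: truncation
    selection, one-point crossover, bit-flip mutation, and elitism."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        next_pop = scored[:2]                      # elitism: keep the best two
        while len(next_pop) < pop_size:
            p1, p2 = rng.sample(scored[:10], 2)    # select from the top half
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]            # one-point crossover
            child = [b ^ (rng.random() < mut_rate) for b in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Hypothetical fitness: features 0-2 are informative, the rest add noise
# (a stand-in for cross-validated accuracy of the bankruptcy model).
def fitness(mask):
    return sum(mask[:3]) - 0.2 * sum(mask[3:])

best = ga_feature_selection(fitness, n_features=8)
```

In the actual system, the fitness would be the predictive performance of the Fuzzy C-means/MARS model on the selected financial ratios.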
The discrete logarithm problem is one of the backbones in public key cryptography.
In this paper we study the discrete logarithm problem in the group of circulant matrices over a finite field.
This gives rise to secure and fast public key cryptosystems.
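To make the setting concrete: circulant matrices over F_p can be represented by their first row and multiplied via cyclic convolution, so exponentiation (the forward operation underlying the discrete logarithm problem) is fast by square-and-multiply. A sketch with our own toy parameters:

```python
def circ_mul(u, v, p):
    """Multiply two circulant matrices over F_p, each given by its first row,
    via cyclic convolution (circulants form a commutative ring isomorphic
    to F_p[x]/(x^n - 1))."""
    n = len(u)
    out = [0] * n
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[(i + j) % n] = (out[(i + j) % n] + ui * vj) % p
    return out

def circ_pow(c, e, p):
    """Square-and-multiply exponentiation of a circulant matrix."""
    n = len(c)
    result = [1] + [0] * (n - 1)          # the identity circulant
    base = c[:]
    while e:
        if e & 1:
            result = circ_mul(result, base, p)
        base = circ_mul(base, base, p)
        e >>= 1
    return result

# Toy instance over F_101 with n = 4 (real parameters would be far larger):
# the discrete logarithm problem asks to recover e from C and C^e.
C = [3, 1, 4, 1]
public = circ_pow(C, 27, 101)
```

Computing `C^e` takes O(log e) convolutions, while recovering `e` from `C` and `C^e` is the hard direction exploited by the cryptosystems.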
Despite its remarkable empirical success as a highly competitive branch of artificial intelligence, deep learning is often blamed for its widely known low interpretation and lack of firm and rigorous mathematical foundation.
However, most theoretical endeavor has been devoted to the discriminative deep learning case, whose complementary part is generative deep learning.
To the best of our knowledge, we are the first to characterize the landscape of the empirical error in the generative case, completing the full picture through a careful design of image super-resolution under norm-based capacity control.
Our theoretical advance in interpreting the training dynamics is achieved from both mathematical and biological sides.
Image registration between histology and magnetic resonance imaging (MRI) is a challenging task due to differences in structural content and contrast.
Specimens that are too thick or too wide cannot be processed all at once and must be cut into smaller pieces.
This dramatically increases the complexity of the problem, since each piece should be individually and manually pre-aligned.
To the best of our knowledge, no automatic method can reliably locate such a piece of tissue within its respective whole in the MRI slice, and align it without any prior information.
We propose here a novel automatic approach to the joint problem of multimodal registration between histology and MRI, when only a fraction of tissue is available from histology.
The approach relies on the representation of images using their level lines so as to reach contrast invariance.
Shape elements obtained via the extraction of bitangents are encoded in a projective-invariant manner, which permits the identification of common pieces of curves between two images.
We evaluated the approach on human brain histology and compared resulting alignments against manually annotated ground truths.
Considering the complexity of the brain folding patterns, preliminary results are promising and suggest the use of characteristic and meaningful shape elements for improved robustness and efficiency.
The exponential growth in data generation and large-scale data analysis creates an unprecedented need for inexpensive, low-latency, and high-density information storage.
This need has motivated significant research into multi-level memory systems that can store multiple bits of information per device.
Although both the memory state of these devices and much of the data they store are intrinsically analog-valued, both are quantized for use with digital systems and discrete error correcting codes.
Using phase change memory as a prototypical multi-level storage technology, we herein demonstrate that analog-valued devices can achieve higher capacities when paired with analog codes.
Further, we find that storing analog signals directly through joint-coding can achieve low distortion with reduced coding complexity.
By jointly optimizing for signal statistics, device statistics, and a distortion metric, finite-length analog encodings can perform comparably to digital systems with asymptotically long encodings.
These results show that end-to-end analog memory systems have not only the potential to reach higher storage capacities than discrete systems, but also to significantly lower coding complexity, leading to faster and more energy efficient storage.
In this work we propose a methodology for an automatic food classification system that recognizes the contents of a meal from images of the food.
We developed a multi-layered deep convolutional neural network (CNN) architecture that takes advantage of the features from other deep networks and improves the efficiency.
Numerous classical handcrafted features and approaches are explored, among which CNN features are found to perform best.
Networks are trained and fine-tuned using preprocessed images and the filter outputs are fused to achieve higher accuracy.
Experimental results on the largest real-world food recognition database ETH Food-101 and newly contributed Indian food image database demonstrate the effectiveness of the proposed methodology as compared to many other benchmark deep learned CNN frameworks.
A weak asynchronous system is a trace monoid with a partial action on a set.
A polygonal morphism between weak asynchronous systems commutes with the actions and preserves the independence of events.
We prove that the category of weak asynchronous systems and polygonal morphisms has all limits and colimits.
The complexity of deep neural network algorithms for hardware implementation can be much lowered by optimizing the word-length of weights and signals.
Direct quantization of floating-point weights, however, does not show good performance when the number of bits assigned is small.
Retraining of quantized networks has been developed to relieve this problem.
In this work, the effects of retraining are analyzed for a feedforward deep neural network (FFDNN) and a convolutional neural network (CNN).
The network complexity is varied to study its effect on how well quantized networks recover through retraining.
The complexity of the FFDNN is controlled by varying the unit size in each hidden layer and the number of layers, while that of the CNN is done by modifying the feature map configuration.
We find that the performance gap between the floating-point and the retrain-based ternary (+1, 0, -1) weight neural networks exists with a fair amount in 'complexity limited' networks, but the discrepancy almost vanishes in fully complex networks whose capability is limited by the training data, rather than by the number of connections.
This research shows that highly complex DNNs have the capability of absorbing the effects of severe weight quantization through retraining, but connection limited networks are less resilient.
This paper also presents the effective compression ratio to guide the trade-off between the network size and the precision when the hardware resource is limited.
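As an illustration of the kind of weight quantization discussed above, the following sketch ternarizes a weight array using a magnitude threshold and a least-squares scale factor. The threshold heuristic (`delta = delta_scale * mean|w|`) and the scale rule are our assumptions for illustration, not necessarily the exact scheme used in this work:

```python
import numpy as np

def ternarize(weights, delta_scale=0.7):
    """Quantize floating-point weights to {-1, 0, +1} times a scale factor.

    Assumed heuristic: entries with |w| below delta are zeroed; the scale
    alpha minimizes the L2 error over the surviving (nonzero) entries.
    """
    delta = delta_scale * np.mean(np.abs(weights))
    ternary = np.where(weights > delta, 1.0,
                       np.where(weights < -delta, -1.0, 0.0))
    mask = ternary != 0
    alpha = np.abs(weights[mask]).mean() if mask.any() else 0.0
    return alpha * ternary, ternary

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02])
q, codes = ternarize(w)
print(codes)  # entries in {-1, 0, +1}
```

Retraining would then proceed with the full-precision weights in the backward pass while the forward pass uses `q`.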
Mobile ad-hoc network (MANET) is a dynamic collection of mobile computers without the need for any existing infrastructure.
Nodes in a MANET act as hosts and routers.
Designing robust routing algorithms for MANETs is a challenging task.
Disjoint multipath routing protocols address this problem and increase the reliability, security and lifetime of the network.
However, selecting an optimal multipath is an NP-complete problem.
In this paper, a Hopfield neural network (HNN) whose parameters are optimized by a particle swarm optimization (PSO) algorithm is proposed as the multipath routing algorithm.
The link expiration time (LET) between each pair of nodes is used as the link reliability estimation metric.
This approach can find either node-disjoint or link-disjoint paths in a single route discovery phase.
Simulation results confirm that PSO-HNN routing algorithm has better performance as compared to backup path set selection algorithm (BPSA) in terms of the path set reliability and number of paths in the set.
Consumer trust is one of the key obstacles to online vendors seeking to extend their consumers across cultures.
This research identifies culture at the individual consumer level.
Based on the Stimulus-Organism-Response (SOR) model, this study focuses on the moderating role of uncertainty avoidance culture value on privacy and security as cognition influences, joy and fear as emotional influences (Stimuli), and individualism-collectivism on social networking services as social influence and subsequently on interpersonal trust (cognitive and affect-based trust) (Organism) towards purchase intention (Response).
Data were collected in Australia and the Partial least squares (PLS) approach was used to test the research model.
The findings confirmed the moderating role of individual-level culture on consumers' cognitive and affect-based trust in B2C e-commerce websites with diverse degrees of uncertainty avoidance and individualism.
Caching algorithms are usually described by the eviction method and analyzed using a metric of hit probability.
Since contents differ in importance (e.g., popularity), the utility of a high hit probability and the cost of transmission can vary across contents.
In this paper, we consider timer-based (TTL) policies across a cache network, where contents have differentiated timers over which we optimize.
Each content is associated with a utility measured in terms of the corresponding hit probability.
We start our analysis from a linear cache network: we propose a utility maximization problem where the objective is to maximize the sum of utilities and a cost minimization problem where the objective is to minimize the content transmission cost across the network.
These frameworks enable us to design online algorithms for cache management, which we prove achieve optimal performance.
Informed by the results of our analysis, we formulate a non-convex optimization problem for a general cache network.
We show that the duality gap is zero, hence we can develop a distributed iterative primal-dual algorithm for content management in the network.
Numerical evaluations show that our algorithm significantly outperforms path replication with traditional caching algorithms over some network topologies.
Finally, we consider a direct application of our cache network model to content distribution.
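To make the timer-based setting concrete, here is a minimal sketch under simplifying assumptions (non-reset TTL timers, Poisson requests, log utilities `w_i log h_i`; not necessarily the paper's exact formulation). For a non-reset TTL cache with request rate `lam` and timer `T`, both the hit probability and the expected occupancy equal `lam*T / (1 + lam*T)`, so maximizing the sum of log utilities under an expected-occupancy budget has a closed-form solution:

```python
import numpy as np

def optimal_timers(lam, w, budget):
    """Maximize sum_i w_i*log(h_i) s.t. sum_i h_i = budget.

    Stationarity gives w_i / h_i = const, so h_i is proportional to w_i;
    the timer follows by inverting h = lam*T / (1 + lam*T).
    """
    lam = np.asarray(lam, float)
    w = np.asarray(w, float)
    h = budget * w / w.sum()
    T = h / (lam * (1.0 - h))
    return h, T

lam = [5.0, 1.0, 0.2]   # per-content Poisson request rates (Zipf-like)
w = [3.0, 2.0, 1.0]     # per-content utility weights
h, T = optimal_timers(lam, w, budget=0.9)
print(h)  # hit probabilities proportional to the weights
```

Note that a popular content needs a much shorter timer than an unpopular one to reach the same hit probability.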
This letter investigates joint power control and user clustering for downlink non-orthogonal multiple access systems.
Our aim is to minimize the total power consumption by taking into account not only the conventional transmission power but also the decoding power of the users.
To solve this optimization problem, it is firstly transformed into an equivalent problem with tractable constraints.
Then, an efficient algorithm is proposed to tackle the equivalent problem by using the techniques of reweighted 1-norm minimization and majorization-minimization.
Numerical results validate the superiority of the proposed algorithm over the conventional algorithms including the popular matching-based algorithm.
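Reweighted 1-norm minimization, one of the techniques mentioned above, can be sketched in its generic form: repeatedly solve a weighted ℓ1-regularized problem (here by ISTA) and update the weights as `w_i = 1/(|x_i| + eps)`, which is a majorization-minimization step for a log-sum sparsity surrogate. This is an illustration on a sparse-recovery toy problem, not the letter's power-control algorithm:

```python
import numpy as np

def reweighted_l1(A, b, lam=0.05, outer=5, inner=300, eps=1e-3):
    """min_x 0.5*||Ax-b||^2 + lam*sum_i w_i*|x_i|, solved repeatedly by ISTA
    with weights updated between outer rounds (majorization-minimization)."""
    _, n = A.shape
    x = np.zeros(n)
    w = np.ones(n)
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    for _ in range(outer):
        for _ in range(inner):             # ISTA on the weighted problem
            z = x - A.T @ (A @ x - b) / L
            x = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
        w = 1.0 / (np.abs(x) + eps)        # reweighting step
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
b = A @ x_true
x_hat = reweighted_l1(A, b)
```

The reweighting drives small entries toward zero while reducing the shrinkage bias on large entries, which is why it is a common surrogate for sparsity in resource allocation problems such as user clustering.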
This research tested the following well known strategies to deal with binary imbalanced data on 82 different real life data sets (sampled to imbalance rates of 5%, 3%, 1%, and 0.1%): class weight, SMOTE, Underbagging, and a baseline (just the base classifier).
As base classifiers we used SVM with RBF kernel, random forests, and gradient boosting machines, and we measured the quality of the resulting classifier using 6 different metrics (area under the curve, accuracy, F-measure, G-mean, Matthews correlation coefficient and balanced accuracy).
The best strategy strongly depends on the metric used to measure the quality of the classifier.
For AUC and accuracy class weight and the baseline perform better; for F-measure and MCC, SMOTE performs better; and for G-mean and balanced accuracy, underbagging.
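The metrics above can all be computed directly from a binary confusion matrix; a small sketch (the counts below are made-up illustration values, not from the study):

```python
import numpy as np

def imbalance_metrics(tp, fp, fn, tn):
    """Classifier quality metrics commonly used for imbalanced data."""
    sens = tp / (tp + fn)                  # recall on the positive class
    spec = tn / (tn + fp)                  # recall on the negative class
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    gmean = np.sqrt(sens * spec)           # geometric mean of the recalls
    bal_acc = (sens + spec) / 2
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"F1": f1, "G-mean": gmean, "BalancedAcc": bal_acc, "MCC": mcc}

# A heavily imbalanced test set: 50 positives, 950 negatives.
m = imbalance_metrics(tp=30, fp=40, fn=20, tn=910)
```

Because G-mean and balanced accuracy weight the minority-class recall equally with the majority class, they reward strategies such as underbagging that trade overall accuracy for minority-class sensitivity.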
Full-duplex systems are expected to double the spectral efficiency compared to conventional half-duplex systems if the self-interference signal can be significantly mitigated.
Digital cancellation is one of the lowest complexity self-interference cancellation techniques in full-duplex systems.
However, its mitigation capability is very limited, mainly due to transmitter and receiver circuit impairments.
In this paper, we propose a novel digital self-interference cancellation technique for full-duplex systems.
The proposed technique is shown to significantly mitigate the self-interference signal as well as the associated transmitter and receiver impairments.
In the proposed technique, an auxiliary receiver chain is used to obtain a digital-domain copy of the transmitted Radio Frequency (RF) self-interference signal.
The self-interference copy is then used in the digital-domain to cancel out both the self-interference signal and the associated impairments.
Furthermore, to alleviate the receiver phase noise effect, a common oscillator is shared between the auxiliary and ordinary receiver chains.
A thorough analytical and numerical analysis for the effect of the transmitter and receiver impairments on the cancellation capability of the proposed technique is presented.
Finally, the overall performance is numerically investigated showing that using the proposed technique, the self-interference signal could be mitigated to ~3dB higher than the receiver noise floor, which results in up to 76% rate improvement compared to conventional half-duplex systems at 20dBm transmit power values.
When supervising an object detector with weakly labeled data, most existing approaches are prone to trapping in the discriminative object parts, e.g., finding the face of a cat instead of the full body, due to lacking the supervision on the extent of full objects.
To address this challenge, we incorporate object segmentation into the detector training, which guides the model to correctly localize the full objects.
We propose the multiple instance curriculum learning (MICL) method, which injects curriculum learning (CL) into the multiple instance learning (MIL) framework.
The MICL method starts by automatically picking the easy training examples, where the extent of the segmentation mask agrees with the detection bounding box.
The training set is gradually expanded to include harder examples to train strong detectors that handle complex images.
The proposed MICL method with segmentation in the loop outperforms the state-of-the-art weakly supervised object detectors by a substantial margin on the PASCAL VOC datasets.
Kernel alignment measures the degree of similarity between two kernels.
In this paper, inspired from kernel alignment, we propose a new Linear Discriminant Analysis (LDA) formulation, kernel alignment LDA (kaLDA).
We first define two kernels, data kernel and class indicator kernel.
The problem is to find a subspace to maximize the alignment between subspace-transformed data kernel and class indicator kernel.
Surprisingly, the kernel alignment induced kaLDA objective function is very similar to classical LDA and can be expressed using between-class and total scatter matrices.
This can be extended to multi-label data.
We use a Stiefel-manifold gradient descent algorithm to solve this problem.
We perform experiments on 8 single-label and 6 multi-label data sets.
Results show that kaLDA has very good performance on many single-label and multi-label problems.
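The kernel alignment that motivates kaLDA is the normalized Frobenius inner product of two kernel matrices. A toy sketch, assuming a linear data kernel and a class-indicator kernel of the form K = YY^T with one-hot Y (the standard choice; the paper's exact indicator kernel may be normalized differently):

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Alignment A(K1, K2) = <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

# Toy two-class data, roughly separable along the first coordinate.
X = np.array([[1.0, 0.1], [0.9, -0.1], [-1.0, 0.2], [-1.1, 0.0]])
y = np.array([0, 0, 1, 1])

K_data = X @ X.T            # linear data kernel
Y = np.eye(2)[y]            # one-hot class indicator matrix
K_class = Y @ Y.T           # 1 within class, 0 across classes
a = kernel_alignment(K_data, K_class)
```

kaLDA then seeks a subspace projection of the data that maximizes this alignment, which turns out to be expressible with the familiar between-class and total scatter matrices.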
In this note we shall introduce a simple, effective numerical method for solving partial differential equations for scalar and vector-valued data defined on surfaces.
Even though we shall follow the traditional way to approximate the regular surfaces under consideration by triangular meshes, the key idea of our algorithm is to develop an intrinsic and unified way to compute directly the partial derivatives of functions defined on triangular meshes.
We shall present examples in computer graphics and image processing applications.
Traditional pattern mining algorithms generally suffer from a lack of flexibility.
In this paper, we propose a SAT formulation of the problem to successfully mine frequent flexible sequences occurring in transactional datasets.
Our SAT-based approach can easily be extended with extra constraints to address a broad range of pattern mining applications.
To demonstrate this claim, we formulate and add several constraints, such as gap and span constraints, to our model in order to extract more specific patterns.
We also use interactive solving to perform important derived tasks, such as closed pattern mining or maximal pattern mining.
Finally, we prove the practical feasibility of our SAT model by running experiments on two real datasets.
While the smart surveillance system enhanced by the Internet of Things (IoT) technology becomes an essential part of Smart Cities, it also brings new concerns in security of the data.
Compared to traditional surveillance systems that are built following a monolithic architecture to carry out lower-level operations, such as monitoring and recording, modern surveillance systems are expected to support more scalable and decentralized solutions for advanced video stream analysis at large volumes of distributed edge devices.
In addition, the centralized architecture of conventional surveillance systems is vulnerable to single points of failure and privacy breaches owing to the lack of protection of the surveillance feed.
This position paper introduces a novel secure smart surveillance system based on microservices architecture and blockchain technology.
Encapsulating the video analysis algorithms as various independent microservices not only isolates the video feeds of different sectors, but also improves the system's availability and robustness by decentralizing the operations.
The blockchain technology securely synchronizes the video analysis databases among microservices across surveillance domains, and provides tamper proof of data in the trustless network environment.
A smart-contract-enabled access authorization strategy prevents any unauthorized user from accessing the microservices and offers a scalable, decentralized and fine-grained access control solution for smart surveillance systems.
We propose a semantics for permutation equivalence in higher-order rewriting.
This semantics takes place in cartesian closed 2-categories, and is proved sound and complete.
In recent years, neural network approaches have been widely adopted for machine learning tasks, with applications in computer vision.
More recently, unsupervised generative models based on neural networks have been successfully applied to model data distributions via low-dimensional latent spaces.
In this paper, we use Generative Adversarial Networks (GANs) to impose structure in compressed sensing problems, replacing the usual sparsity constraint.
We propose to train the GANs in a task-aware fashion, specifically for reconstruction tasks.
We also show that it is possible to train our model without using any (or much) non-compressed data.
Finally, we show that the latent space of the GAN carries discriminative information and can further be regularized to generate input features for general inference tasks.
We demonstrate the effectiveness of our method on a variety of reconstruction and classification problems.
Outlier detection is the identification of points in a dataset that do not conform to the norm.
Outlier detection is highly sensitive to the choice of the detection algorithm and the feature subspace used by the algorithm.
Extracting domain-relevant insights from outliers needs systematic exploration of these choices since diverse outlier sets could lead to complementary insights.
This challenge is especially acute in an interactive setting, where the choices must be explored in a time-constrained manner.
In this work, we present REMIX, the first system to address the problem of outlier detection in an interactive setting.
REMIX uses a novel mixed integer programming (MIP) formulation for automatically selecting and executing a diverse set of outlier detectors within a time limit.
This formulation incorporates multiple aspects such as (i) an upper limit on the total execution time of detectors (ii) diversity in the space of algorithms and features, and (iii) meta-learning for evaluating the cost and utility of detectors.
REMIX provides two distinct ways for the analyst to consume its results: (i) a partitioning of the detectors explored by REMIX into perspectives through low-rank non-negative matrix factorization; each perspective can be easily visualized as an intuitive heatmap of experiments versus outliers, and (ii) an ensembled set of outliers which combines outlier scores from all detectors.
We demonstrate the benefits of REMIX through extensive empirical validation on real-world data.
Phishing is a common online weapon used by phishers to acquire confidential information from users through deception.
Since the inception of the internet, nearly everything, ranging from money transactions to information sharing, is done online in most parts of the world.
This has also given rise to malicious activities such as phishing.
Detecting phishing is an intricate process due to the complexity, ambiguity and copious number of possible factors responsible for phishing.
Rough sets can be a powerful tool when working on such applications containing vague or imprecise data.
This paper proposes an approach to phishing detection using rough set theory.
Thirteen basic factors directly responsible for phishing are grouped into four strata.
A reliability factor is determined on the basis of the outcome of these strata, using rough set theory.
The reliability factor determines the possibility of a suspected site being valid or fake.
Using rough set theory, the most and least influential factors for phishing are also determined.
Real-time algorithms for automatically recognizing surgical phases are needed to develop systems that can provide assistance to surgeons, enable better management of operating room (OR) resources and consequently improve safety within the OR.
State-of-the-art surgical phase recognition algorithms using laparoscopic videos are based on fully supervised training.
This limits their potential for widespread application, since creation of manual annotations is an expensive process considering the numerous types of existing surgeries and the vast amount of laparoscopic videos available.
In this work, we propose a new self-supervised pre-training approach based on the prediction of remaining surgery duration (RSD) from laparoscopic videos.
The RSD prediction task is used to pre-train a convolutional neural network (CNN) and long short-term memory (LSTM) network in an end-to-end manner.
Our proposed approach utilizes all available data and reduces the reliance on annotated data, thereby facilitating the scaling up of surgical phase recognition algorithms to different kinds of surgeries.
Additionally, we present EndoN2N, an end-to-end trained CNN-LSTM model for surgical phase recognition and evaluate the performance of our approach on a dataset of 120 Cholecystectomy laparoscopic videos (Cholec120).
This work also presents the first systematic study of self-supervised pre-training approaches to understand the amount of annotations required for surgical phase recognition.
Interestingly, the proposed RSD pre-training approach leads to performance improvement even when all the training data is manually annotated and outperforms the single pre-training approach for surgical phase recognition presently published in the literature.
It is also observed that end-to-end training of CNN-LSTM networks boosts surgical phase recognition performance.
This paper presents a planning system for autonomous driving among many pedestrians.
A key ingredient of our approach is PORCA, a pedestrian motion prediction model that accounts for both a pedestrian's global navigation intention and local interactions with the vehicle and other pedestrians.
Unfortunately, the autonomous vehicle does not know the pedestrian's intention a priori and requires a planning algorithm that hedges against the uncertainty in pedestrian intentions.
Our planning system combines a POMDP algorithm with the pedestrian motion model and runs in near real time.
Experiments show that it enables a robot vehicle to drive safely, efficiently, and smoothly among a crowd with a density of nearly one person per square meter.
The semantic localization problem in robotics consists in determining the place where a robot is located by means of semantic categories.
The problem is usually addressed as a supervised classification process, where input data correspond to robot perceptions while classes to semantic categories, like kitchen or corridor.
In this paper we propose a framework, implemented in the PCL library, which provides a set of valuable tools to easily develop and evaluate semantic localization systems.
The implementation includes the generation of 3D global descriptors following a Bag-of-Words approach.
This allows the generation of dimensionality-fixed descriptors from any type of keypoint detector and feature extractor combinations.
The framework has been designed, structured and implemented in order to be easily extended with different keypoint detectors, feature extractors as well as classification models.
The proposed framework has also been used to evaluate the performance of a set of already implemented descriptors, when used as input for a specific semantic localization system.
The results obtained are discussed paying special attention to the internal parameters of the BoW descriptor generation process.
Moreover, we also review the combination of some keypoint detectors with different 3D descriptor generation techniques.
This paper presents a neural network-based end-to-end clustering framework.
We design a novel strategy to utilize the contrastive criteria for pushing data-forming clusters directly from raw data, in addition to learning a feature embedding suitable for such clustering.
The network is trained with weak labels, specifically partial pairwise relationships between data instances.
The cluster assignments and their probabilities are then obtained at the output layer by feed-forwarding the data.
The framework has the interesting characteristic that no cluster centers need to be explicitly specified, thus the resulting cluster distribution is purely data-driven and no distance metrics need to be predefined.
The experiments show that the proposed approach beats the conventional two-stage method (feature embedding with k-means) by a significant margin.
It also compares favorably to the performance of the standard cross entropy loss for classification.
Robustness analysis also shows that the method is largely insensitive to the number of clusters.
Specifically, we show that the number of dominant clusters is close to the true number of clusters even when a large k is used for clustering.
Proportionate-type normalized subband adaptive filter (PNSAF-type) algorithms are very attractive choices for echo cancellation.
To further obtain both a fast convergence rate and a low steady-state error, in this paper a variable step size (VSS) version of the improved PNSAF (IPNSAF) algorithm is proposed by minimizing the square of the noise-free a posteriori subband error signals.
A noniterative shrinkage method is used to recover the noise-free a priori subband error signals from the noisy subband error signals.
Significantly, the proposed VSS strategy can be applied to any other PNSAF-type algorithm, since it is independent of the proportionate principles.
Simulation results in the context of acoustic echo cancellation have demonstrated the effectiveness of the proposed method.
In recent times, massively multiplayer online games (MMOGs) have emerged as computer games that enable hundreds of players from all parts of the world to interact in a shared game world at the same time.
The current architecture used for MMOGs is based on the classic tightly coupled distributed system.
As MMOGs become more interactive and the number of interacting users grows, this classic implementation architecture may raise scalability and interdependence issues.
This calls for a loosely coupled service-oriented architecture to support evolution in MMOG applications.
Data-flow architecture, event-driven architecture and client-server architecture are the basic data orchestration approaches used by any service-oriented architecture.
Real-time service is a pressing issue for service-oriented architectures, and the basic requirement of any real-time service-oriented architecture is to ensure quality of service.
In this paper we propose a service-oriented architecture for massively multiplayer online games, together with a specific middleware (based on open-source DDS) for fulfilling real-time constraints in MMOGs.
A foundation for closing the gap between biometrics in the narrower and the broader perspective is presented through a conceptualization of biometric systems in both perspectives.
A clear distinction between verification, identification and classification systems is made as well as shown that there are additional classes of biometric systems.
In the end a Unified Modeling Language model is developed showing the connections between the two perspectives.
In this paper we present our winning entry at the 2018 ECCV PoseTrack Challenge on 3D human pose estimation.
Using a fully-convolutional backbone architecture, we obtain volumetric heatmaps per body joint, which we convert to coordinates using soft-argmax.
Absolute person center depth is estimated by a 1D heatmap prediction head.
The coordinates are back-projected to 3D camera space, where we minimize the L1 loss.
Key to our good results is the training data augmentation with randomly placed occluders from the Pascal VOC dataset.
In addition to reaching first place in the Challenge, our method also surpasses the state-of-the-art on the full Human3.6M benchmark among methods that use no additional pose datasets in training.
Code for applying synthetic occlusions is available at https://github.com/isarandi/synthetic-occlusion.
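The soft-argmax step mentioned above converts a heatmap into a coordinate differentiably by taking the expectation of the index under a softmax distribution; a 1D sketch (the temperature `beta` is an assumed parameter for illustration):

```python
import numpy as np

def soft_argmax_1d(heatmap, beta=1.0):
    """Differentiable coordinate estimate: softmax over the heatmap,
    then the expected index under that distribution."""
    z = np.exp(beta * (heatmap - heatmap.max()))  # shift for stability
    p = z / z.sum()
    coords = np.arange(len(heatmap))
    return float(np.sum(p * coords))

h = np.zeros(10)
h[6] = 8.0                 # sharp peak at index 6
print(soft_argmax_1d(h))   # close to 6.0
```

Unlike a hard argmax, this keeps the coordinate differentiable with respect to the heatmap values, so an L1 loss on the back-projected 3D coordinates can be trained end to end.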
An alternative pathway for the human brain to communicate with the outside world is by means of a brain computer interface (BCI).
A BCI can decode electroencephalogram (EEG) signals of brain activities, and then send a command or an intent to an external interactive device, such as a wheelchair.
The effectiveness of the BCI depends on the performance in decoding the EEG.
Usually, the EEG is contaminated by different kinds of artefacts (e.g., electromyogram (EMG), background activity), which leads to a low decoding performance.
A number of filtering methods can be utilized to remove or weaken the effects of artefacts, but they generally fail when the EEG contains extreme artefacts.
In such cases, the most common approach is to discard the whole data segment containing extreme artefacts.
This causes the fatal drawback that the BCI cannot output decoding results during that time.
In order to solve this problem, we employ the Lomb-Scargle periodogram to estimate the spectral power from incomplete EEG (after removing only parts contaminated by artefacts), and Denoising Autoencoder (DAE) for learning.
The proposed method is evaluated with motor imagery EEG data.
The results show that our method can successfully decode incomplete EEG to good effect.
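The Lomb-Scargle periodogram estimates spectral power directly from unevenly spaced samples, which is exactly what remains after artefact-contaminated segments are dropped. A small illustration using SciPy (synthetic data, not the paper's EEG; the 11 Hz component loosely mimics a motor-imagery mu rhythm):

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 10, 300))   # irregular sample times (s), as after
                                       # removing contaminated segments
f0 = 11.0                              # true oscillation frequency (Hz)
x = np.sin(2 * np.pi * f0 * t) + 0.3 * rng.standard_normal(300)

freqs = np.linspace(1, 30, 400)        # analysis grid in Hz
pgram = lombscargle(t, x, 2 * np.pi * freqs)  # expects angular frequencies
peak = freqs[np.argmax(pgram)]
```

The resulting band-power features from incomplete segments can then be fed to a downstream learner (a denoising autoencoder in this work).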
This article is an attempt to combine different ways of working with sets of objects and their classes for the design and development of artificial intelligence systems (AIS) for information analysis, using object-oriented programming (OOP).
The paper analyses the basic concepts of OOP and their relation to set theory and artificial intelligence (AI).
The process of creating sets and multisets is considered from different perspectives, in particular mathematical set theory, OOP and AI.
Definitions of an object and its properties, homogeneous and inhomogeneous classes of objects, sets of objects and multisets of objects, together with constructive methods for their creation and classification, are proposed.
In addition, the necessity of extending existing OOP tools for the practical implementation of such information-analysis AIS using the proposed approach is shown.
A recommender system's basic task is to estimate how users will respond to unseen items.
This is typically modeled in terms of how a user might rate a product, but here we aim to extend such approaches to model how a user would write about the product.
To do so, we design a character-level Recurrent Neural Network (RNN) that generates personalized product reviews.
The network convincingly learns styles and opinions of nearly 1000 distinct authors, using a large corpus of reviews from BeerAdvocate.com.
It also tailors reviews to describe specific items, categories, and star ratings.
Using a simple input replication strategy, the Generative Concatenative Network (GCN) preserves the signal of static auxiliary inputs across wide sequence intervals.
Without any additional training, the generative model can classify reviews, identifying the author of the review, the product category, and the sentiment (rating), with remarkable accuracy.
Our evaluation shows the GCN captures complex dynamics in text, such as the effect of negation, misspellings, slang, and large vocabularies gracefully absent any machinery explicitly dedicated to the purpose.
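The input replication strategy amounts to concatenating the static auxiliary vector (e.g., an item-plus-rating encoding) to the character embedding at every timestep, so the conditioning signal persists across long sequences. A shapes-only sketch with assumed dimensions:

```python
import numpy as np

# Assumed sizes for illustration.
seq_len, char_dim, aux_dim = 50, 32, 8

chars = np.random.randn(seq_len, char_dim)   # per-step character embeddings
aux = np.random.randn(aux_dim)               # static per-review context vector

# Replicate the static context at every timestep and concatenate.
rnn_input = np.concatenate([chars, np.tile(aux, (seq_len, 1))], axis=1)
print(rnn_input.shape)  # (50, 40)
```

Because the auxiliary signal is re-injected at each step rather than only at the start, the RNN does not have to carry it in its hidden state across hundreds of characters.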
The performance of prediction models is often based on "abstract metrics" that estimate the model's ability to limit residual errors between the observed and predicted values.
However, meaningful evaluation and selection of prediction models for end-user domains requires holistic and application-sensitive performance measures.
Inspired by energy consumption prediction models used in the emerging "big data" domain of Smart Power Grids, we propose a suite of performance measures to rationally compare models along the dimensions of scale independence, reliability, volatility and cost.
We include both application independent and dependent measures, the latter parameterized to allow customization by domain experts to fit their scenario.
While our measures are generalizable to other domains, we offer an empirical analysis using real energy use data for three Smart Grid applications: planning, customer education and demand response, which are relevant for energy sustainability.
Our results underscore the value of the proposed measures to offer a deeper insight into models' behavior and their impact on real applications, which benefit both data mining researchers and practitioners.
This paper studies the effects on user welfare of imposing network neutrality, using a game-theoretic model of provider interactions based on a two-sided market framework: we assume that the platform--the last-mile access providers (ISPs)--are monopolists, and consider content providers (CPs) entry decisions.
All decisions affect the choices made by users, who are sensitive both to CP and ISP investments (in content creation and quality-of-service, respectively).
In a non-neutral regime, CPs and ISPs can charge each other, while such charges are prohibited in the neutral regime.
We assume those charges (if any) are chosen by CPs, a direction rarely considered in the literature, where they are assumed fixed by ISPs.
Our analysis suggests that, unexpectedly, more CPs enter the market in a non-neutral regime where they pay ISPs, than without such payments.
Additionally, in this case ISPs tend to invest more than in the neutral regime.
From our results, the best regime in terms of user welfare is parameter dependent, calling for caution in designing neutrality regulations.
The accuracy of indoor wireless localization systems can be substantially enhanced by map-awareness, i.e., by the knowledge of the map of the environment in which localization signals are acquired.
In fact, this knowledge can be exploited to cancel out, at least to some extent, the signal degradation due to propagation through physical obstructions, i.e., to the so called non-line-of-sight bias.
This result can be achieved by developing novel localization techniques that rely on proper map-aware statistical modelling of the measurements they process.
In this manuscript a unified statistical model for the measurements acquired in map-aware localization systems based on time-of-arrival and received signal strength techniques is developed and its experimental validation is illustrated.
Finally, the accuracy of the proposed map-aware model is assessed and compared with that offered by its map-unaware counterparts.
Our numerical results show that, when the quality of acquired measurements is poor, map-aware modelling can enhance localization accuracy by up to 110% in certain scenarios.
Rate adaptation and transmission power control in 802.11 WLANs have received a lot of attention from the research community, with most of the proposals aiming at maximising throughput based on network conditions.
Considering energy consumption, an implicit assumption is that optimality in throughput implies optimality in energy efficiency, but this assumption has been recently put into question.
In this paper, we address via analysis, simulation and experimentation the relation between throughput performance and energy efficiency in multi-rate 802.11 scenarios.
We demonstrate the trade-off between these performance figures, confirming that they may not be simultaneously optimised, and analyse their sensitivity towards the energy consumption parameters of the device.
We analyse this trade-off in existing rate adaptation with transmission power control algorithms, and discuss how to design novel schemes taking energy consumption into account.
Spurred by the growth of transportation network companies and increasing data capabilities, vehicle routing and ride-matching algorithms can improve the efficiency of private transportation services.
However, existing routing solutions do not address where drivers should travel after dropping off a passenger and before receiving the next passenger ride request, i.e., during the between-ride period.
We address this problem by developing an efficient algorithm to find the optimal policy for drivers between rides in order to maximize driver profits.
We model the road network as a graph, and we show that the between-ride routing problem is equivalent to a stochastic shortest path problem, an infinite dynamic program with no discounting.
We prove under reasonable assumptions that an optimal routing policy exists that avoids cycles; policies of this type can be efficiently found.
We present an iterative approach to find an optimal routing policy.
Our approach can account for various factors, including the frequency of passenger ride requests at different locations, traffic conditions, and surge pricing.
We demonstrate the effectiveness of the approach by implementing it on road network data from Boston and New York City.
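The stochastic shortest path formulation above can be sketched as a standard value iteration over the road graph. Everything concrete below (node names, request probabilities, ride rewards, edge costs) is hypothetical, and this generic fixed-point iteration is a baseline illustration, not the paper's cycle-avoiding algorithm:

```python
# Value iteration for a between-ride routing sketch: at each node v a ride
# request arrives with probability p[v] (ending the between-ride period with
# reward r[v]); otherwise the driver moves along an edge, paying its cost.

def between_ride_values(neighbors, cost, p, r, iters=500):
    """Expected profit-to-go V[v] when idling at node v between rides."""
    V = {v: 0.0 for v in neighbors}
    for _ in range(iters):
        V = {
            v: p[v] * r[v]
               + (1 - p[v]) * max(V[u] - cost[(v, u)] for u in neighbors[v])
            for v in neighbors
        }
    return V

# Tiny two-node example: node 'a' rarely sees requests, 'b' often does.
neighbors = {'a': ['b'], 'b': ['a']}
cost = {('a', 'b'): 1.0, ('b', 'a'): 1.0}
p = {'a': 0.05, 'b': 0.5}
r = {'a': 10.0, 'b': 10.0}
V = between_ride_values(neighbors, cost, p, r)
# Idling near the high-demand node should be worth more.
assert V['b'] > V['a']
```

The contraction factor (1 - p[v]) guarantees convergence of the iteration whenever every node has a positive request probability.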
Large-scale distributed training of deep neural networks suffers from the generalization gap caused by the increase in the effective mini-batch size.
Previous approaches try to solve this problem by varying the learning rate and batch size over epochs and layers, or some ad hoc modification of the batch normalization.
We propose an alternative approach using a second-order optimization method that shows similar generalization capability to first-order methods, but converges faster and can handle larger mini-batches.
To test our method on a benchmark where highly optimized first-order methods are available as references, we train ResNet-50 on ImageNet.
We converged to 75% Top-1 validation accuracy in 35 epochs for mini-batch sizes under 16,384, and achieved 75% even with a mini-batch size of 131,072, which took 100 epochs.
This article presents a two-stage topological algorithm for recovering an estimate of a quasiperiodic function from a set of noisy measurements.
The first stage of the algorithm is a topological phase estimator, which detects the quasiperiodic structure of the function without placing additional restrictions on the function.
By respecting this phase estimate, the algorithm avoids creating distortion even when it uses a large number of samples for the estimate of the function.
WPaxos is a multileader Paxos protocol that provides low-latency and high-throughput consensus across wide-area network (WAN) deployments.
Unlike statically partitioned multiple Paxos deployments, WPaxos perpetually adapts to the changing access locality through object stealing.
Multiple concurrent leaders coinciding in different zones steal ownership of objects from each other using phase-1 of Paxos, and then use phase-2 to commit update-requests on these objects locally until they are stolen by other leaders.
To achieve fast phase-2 commits, WPaxos adopts the flexible quorums idea in a novel manner, and appoints phase-2 acceptors to be close to their respective leaders.
We implemented WPaxos and evaluated it on WAN deployments across 5 AWS regions.
The dynamic partitioning of the object-space and emphasis on zone-local commits allow WPaxos to significantly outperform both partitioned Paxos deployments and leaderless Paxos approaches, while providing the same consistency guarantees.
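The flexible-quorums idea that WPaxos adopts reduces, for simple counting quorums over n acceptors, to a single intersection inequality; a minimal sketch (WPaxos's actual grid-style quorums are richer than this):

```python
# Flexible quorums (FPaxos) relax Paxos's majority rule: any phase-1 quorum
# must intersect any phase-2 quorum. With counting quorums over n acceptors,
# that requirement reduces to q1 + q2 > n.

def quorums_are_safe(n, q1, q2):
    """True iff every phase-1 quorum of size q1 intersects every phase-2 quorum of size q2."""
    return q1 + q2 > n

# Classic Paxos over 5 acceptors uses majorities for both phases.
assert quorums_are_safe(5, 3, 3)
# Flexible quorums allow a small phase-2 quorum (fast, leader-local commits)
# at the price of a larger phase-1 quorum.
assert quorums_are_safe(5, 4, 2)
# Two disjoint 2-acceptor quorums over 5 acceptors can miss each other: unsafe.
assert not quorums_are_safe(5, 2, 2)
```

This is precisely why appointing phase-2 acceptors close to their leaders speeds up the common commit path without sacrificing safety.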
The increasing use of social networks generates enormous amounts of data that can be used for many types of analysis.
Some of these data have temporal and geographical information, which can be used for comprehensive examination.
In this paper, we propose a new method to analyze the massive volume of messages available in Twitter to identify places in the world where topics such as TV shows, climate change, disasters, and sports are emerging.
The proposed method is based on a neural network that is used to detect outliers from a time series, which is built upon statistical data from tweets located in different political divisions (e.g., countries, cities).
The outliers are used to identify topics within an abnormal behavior in Twitter.
The effectiveness of our method is evaluated in an online environment, yielding new findings on modeling the behavior of people in different places.
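As an illustration of time-series outlier detection over per-region tweet counts, here is a minimal sketch that substitutes a rolling z-score for the paper's neural network; the counts are invented:

```python
import statistics

# Flag indices whose value deviates more than `threshold` standard deviations
# from the mean of the trailing window — a crude stand-in for a learned
# outlier detector over a per-region tweet-count series.

def zscore_outliers(series, window=5, threshold=3.0):
    flagged = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mu = statistics.mean(past)
        sd = statistics.pstdev(past) or 1.0   # guard against a constant window
        if abs(series[i] - mu) / sd > threshold:
            flagged.append(i)
    return flagged

# Hourly tweet counts for one city; the burst at index 7 marks an emerging topic.
counts = [10, 12, 11, 9, 10, 11, 10, 95, 11, 10]
assert zscore_outliers(counts) == [7]
```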
Functional neuroimaging can measure the brain's response to an external stimulus.
It is used to perform brain mapping: identifying from these observations the brain regions involved.
This problem can be cast into a linear supervised learning task where the neuroimaging data are used as predictors for the stimulus.
Brain mapping is then seen as a support recovery problem.
On functional MRI (fMRI) data, this problem is particularly challenging as i) the number of samples is small due to limited acquisition time and ii) the variables are strongly correlated.
We propose to overcome these difficulties using sparse regression models over new variables obtained by clustering of the original variables.
The use of randomization techniques, e.g. bootstrap samples, and clustering of the variables improves the recovery properties of sparse methods.
We demonstrate the benefit of our approach on an extensive simulation study as well as two fMRI datasets.
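A minimal sketch of the clustering-before-sparse-regression idea, assuming a plain coordinate-descent Lasso and synthetic data (this is not the paper's randomized pipeline): averaging two nearly identical variables into one cluster feature lets the sparse model work on decorrelated inputs.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lasso_cd(cols, y, lam, iters=200):
    """Plain coordinate-descent Lasso over a list of feature columns (illustrative)."""
    w = [0.0] * len(cols)
    for _ in range(iters):
        for j, xj in enumerate(cols):
            # residual with feature j's current contribution added back in
            r = [yi - sum(w[k] * cols[k][i] for k in range(len(cols))) + w[j] * xj[i]
                 for i, yi in enumerate(y)]
            rho, z = dot(xj, r), dot(xj, xj)
            w[j] = (1 if rho > 0 else -1) * max(abs(rho) - lam, 0.0) / z
    return w

random.seed(0)
n = 100
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 0.01) for a in x1]   # x2 is nearly a copy of x1
x3 = [random.gauss(0, 1) for _ in range(n)]    # irrelevant variable
y = [a + b for a, b in zip(x1, x2)]            # signal lives on the correlated pair

# Cluster the two correlated variables into one averaged feature, then fit.
c1 = [(a + b) / 2 for a, b in zip(x1, x2)]
w = lasso_cd([c1, x3], y, lam=50.0)
assert w[0] > 1.0 and abs(w[1]) < 0.5          # cluster kept, noise variable dropped
```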
Evidence-based health care (EBHC) is an important practice of medicine which attempts to provide systematic scientific evidence to answer clinical questions.
In this context, Epistemonikos (www.epistemonikos.org) is one of the first and most important online systems in the field, providing an interface that supports users on searching and filtering scientific articles for practicing EBHC.
The system nowadays requires a large amount of expert human effort, where close to 500 physicians manually curate articles to be utilized in the platform.
In order to scale up the large and continuous amount of data to keep the system updated, we introduce EpistAid, an interactive intelligent interface which supports clinicians in the process of curating documents for Epistemonikos within lists of papers called evidence matrices.
We introduce the characteristics, design and algorithms of our solution, as well as a prototype implementation and a case study to show how our solution addresses the information overload problem in this area.
Input validation is the first line of defense against malformed or malicious inputs.
It is therefore critical that the validator (which is often part of the parser) is free of bugs.
To build dependable input validators, we propose using parser generators for context-free languages.
In the context of network protocols, various works have pointed at context-free languages as falling short to specify precisely or concisely common idioms found in protocols.
We review those assessments and perform a rigorous, language-theoretic analysis of several common protocol idioms.
We then demonstrate the practical value of our findings by developing a modular, robust, and efficient input validator for HTTP relying on context-free grammars and regular expressions.
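As a taste of grammar-driven validation, a hedged sketch of a request-line check: the token and HTTP-version shapes follow RFC 7230's grammar, but the request-target below is simplified to "any non-whitespace characters", and this is far from the paper's full validator:

```python
import re

# method = token ; HTTP-version = "HTTP/" DIGIT "." DIGIT (per RFC 7230).
# The request-target is deliberately loosened here for brevity.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
REQUEST_LINE = re.compile(rf"^({TOKEN}) (\S+) (HTTP/\d\.\d)\r?\n$")

def validate_request_line(line):
    return REQUEST_LINE.match(line) is not None

assert validate_request_line("GET /index.html HTTP/1.1\r\n")
assert not validate_request_line("GET  /index.html HTTP/1.1\r\n")   # double space
assert not validate_request_line("GET /index.html HTTP/11\r\n")     # bad version
```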
In the framework of computational complexity and in an effort to define a more natural reduction for problems of equivalence, we investigate the recently introduced kernel reduction, a reduction that operates on each element of a pair independently.
This paper details the limitations and uses of kernel reductions.
We show that kernel reductions are weaker than many-one reductions and provide conditions under which complete problems exist.
Ultimately, the number and size of equivalence classes can dictate the existence of a kernel reduction.
We leave unsolved the unconditional existence of a complete problem under polynomial-time kernel reductions for the standard complexity classes.
Improving patient care safety is an ultimate objective for medical cyber-physical systems.
A recent study shows that the patients' death rate can be significantly reduced by computerizing medical best practice guidelines.
To facilitate the development of computerized medical best practice guidelines, statecharts are often used as a modeling tool because of their high resemblances to disease and treatment models and their capabilities to provide rapid prototyping and simulation for clinical validations.
However, some implementations of statecharts, such as Yakindu statecharts, are priority-based and have synchronous execution semantics which makes it difficult to model certain functionalities that are essential in modeling medical guidelines, such as two-way communications and configurable execution orders.
Rather than introducing new statechart elements or changing the statechart implementation's underlying semantics, we use existing basic statechart elements to design model patterns for the commonly occurring issues.
In particular, we show the design of model patterns for two-way communications and configurable execution orders and formally prove the correctness of these model patterns.
We further use a simplified airway laser surgery scenario as a case study to demonstrate how the developed model patterns address the two-way communication and configurable execution order issues and their impact on validation and verification of medical safety properties.
Recommender systems take inputs from user history, use an internal ranking algorithm to generate results and possibly optimize this ranking based on feedback.
However, often the recommender system is unaware of the actual intent of the user and simply provides recommendations dynamically without properly understanding the thought process of the user.
An intelligent recommender system is not only useful for the user but also for businesses which want to learn the tendencies of their users.
Finding out tendencies or intents of a user is a difficult problem to solve.
Keeping this in mind, we set out to create an intelligent system that keeps track of the user's activity on a web application and determines the intent of the user in each session.
We devised a way to encode the user's activity through the sessions.
We then represent the information seen by the user in a high-dimensional format, which is reduced to lower dimensions using tensor factorization techniques.
The aspect of intent awareness (or scoring) is dealt with at this stage.
Finally, combining the user activity data with the contextual information gives the recommendation score.
The final recommendations are then ranked using filtering and collaborative recommendation techniques to show the top-k recommendations to the user.
A provision for feedback is also envisioned in the current system which informs the model to update the various weights in the recommender system.
Our overall model aims to combine both frequency-based and context-based recommendation systems and quantify the intent of a user to provide better recommendations.
We ran experiments on real-world timestamped user activity data, in the setting of recommending reports to the users of a business analytics tool and the results are better than the baselines.
We also tuned certain aspects of our model to arrive at optimized results.
Applications in many domains require processing moving object trajectories.
In this work, we focus on a trajectory similarity search that finds all trajectories within a given distance of a query trajectory over a time interval, which we call the distance threshold similarity search.
We develop three indexing strategies with spatial, temporal and spatiotemporal selectivity for the GPU that differ significantly from indexes suitable for the CPU, and show the conditions under which each index achieves good performance.
Furthermore, we show that the GPU implementations outperform multithreaded CPU implementations in a range of experimental scenarios, making the GPU an attractive technology for processing moving object trajectories.
We test our implementations on two synthetic and one real-world dataset of a galaxy merger.
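For reference, the brute-force CPU baseline that such indexes aim to beat can be sketched as follows; the trajectory format and toy data are illustrative assumptions:

```python
import math

# Distance-threshold similarity search, quadratic reference version: report
# every trajectory that comes within distance d of the query at some shared
# timestep. The GPU indexes in the paper exist to prune exactly this scan.

def within_threshold(query, trajectories, d):
    """Ids of trajectories whose point at any common timestep lies within d of the query."""
    hits = set()
    for tid, traj in trajectories.items():
        for t, point in traj.items():
            if t in query and math.dist(query[t], point) <= d:
                hits.add(tid)
                break
    return hits

query = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}
trajectories = {
    "near": {0: (0.0, 0.5), 1: (1.0, 0.5)},   # stays 0.5 away from the query
    "far":  {0: (9.0, 9.0), 1: (9.0, 8.0)},
}
assert within_threshold(query, trajectories, d=1.0) == {"near"}
```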
This paper describes a resolution based Description Logic reasoning system called DLog.
DLog transforms Description Logic axioms into a Prolog program and uses the standard Prolog execution for efficiently answering instance retrieval queries.
From the Description Logic point of view, DLog is an ABox reasoning engine for the full SHIQ language.
The DLog approach makes it possible to store the individuals in a database instead of in memory, which results in better scalability and helps in using description logic ontologies directly on top of existing information sources.
To appear in Theory and Practice of Logic Programming (TPLP).
Any non-trivial concurrent system warrants synchronisation, regardless of the concurrency model.
Actor-based concurrency serialises all computations in an actor through asynchronous message passing.
In contrast, lock-based concurrency serialises some computations by following a lock--unlock protocol for accessing certain data.
Both systems require sound reasoning about pointers and aliasing to exclude data-races.
If actor isolation is broken, so is the single-thread-of-control abstraction.
Similarly for locks, if a datum is accessible outside of the scope of the lock, the datum is not governed by the lock.
In this paper we discuss how to balance aliasing and synchronisation.
In previous work, we defined a type system that guarantees data-race freedom of actor-based concurrency and lock-based concurrency.
This paper extends this work by introducing two programming constructs: one for decoupling isolation and synchronisation, and one for constructing higher-level atomicity guarantees from lower-level synchronisation.
We focus predominantly on actors, and in particular the Encore programming language, but our ultimate goal is to define our constructs in such a way that they can be used both with locks and actors, given that combinations of both models occur frequently in actual systems.
We discuss the design space, provide several formalisations of different semantics and discuss their properties, and connect them to case studies showing how our proposed constructs can be useful.
We also report on an on-going implementation of our proposed constructs in Encore.
Multi-tenant cloud networks have various security and monitoring service functions (SFs) that constitute a service function chain (SFC) between two endpoints.
SF rule ordering overlaps and policy conflicts can cause increased latency, service disruption and security breaches in cloud networks.
Software Defined Network (SDN) based Network Function Virtualization (NFV) has emerged as a solution that allows dynamic SFC composition and traffic steering in a cloud network.
We propose an SDN-enabled Universal Policy Checking (SUPC) framework that provides: 1) flow composition and ordering, by translating various SF rules into the OpenFlow format, which eliminates redundant rules and ensures policy compliance in the SFC; and 2) flow conflict analysis, to identify conflicts in header space and actions between various SF rules.
Our results show a significant reduction in SF rules on composition.
Additionally, our conflict checking mechanism was able to identify several rule conflicts that pose security, efficiency, and service availability issues in the cloud network.
Sparse tensors appear in many large-scale applications with multidimensional and sparse data.
While multidimensional sparse data often need to be processed on manycore processors, attempts to develop highly-optimized GPU-based implementations of sparse tensor operations are rare.
The irregular computation patterns and sparsity structures as well as the large memory footprints of sparse tensor operations make such implementations challenging.
We leverage the fact that sparse tensor operations share similar computation patterns to propose a unified tensor representation called F-COO.
Combined with GPU-specific optimizations, F-COO provides highly-optimized implementations of sparse tensor computations on GPUs.
The performance of the proposed unified approach is demonstrated for tensor-based kernels such as the Sparse Matricized Tensor-Times-Khatri-Rao Product (SpMTTKRP) and the Sparse Tensor-Times-Matrix Multiply (SpTTM) and is used in tensor decomposition algorithms.
Compared to state-of-the-art work we improve the performance of SpTTM and SpMTTKRP up to 3.7 and 30.6 times respectively on NVIDIA Titan-X GPUs.
We implement a CANDECOMP/PARAFAC (CP) decomposition and achieve up to 14.9 times speedup using the unified method over state-of-the-art libraries on NVIDIA Titan-X GPUs.
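A plain-Python reference for the mode-1 MTTKRP kernel over a COO-format sparse tensor may help fix ideas; F-COO's flag arrays and GPU-specific optimizations are not modeled here:

```python
# Mode-1 MTTKRP over a sparse 3-way tensor in COO format:
#   M(i, :) += X(i, j, k) * (B(j, :) * C(k, :))   (elementwise product over rank)

def mttkrp_coo(entries, B, C, num_rows):
    """entries: list of (i, j, k, value); B, C: factor matrices as lists of rows."""
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(num_rows)]
    for i, j, k, v in entries:
        for r in range(rank):
            M[i][r] += v * B[j][r] * C[k][r]
    return M

# A 2x2x2 tensor with two nonzeros and rank-1 factors of all ones:
entries = [(0, 0, 0, 2.0), (1, 1, 1, 3.0)]
B = [[1.0], [1.0]]
C = [[1.0], [1.0]]
assert mttkrp_coo(entries, B, C, num_rows=2) == [[2.0], [3.0]]
```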
We study two mixed robust/average-case submodular partitioning problems that we collectively call Submodular Partitioning.
These problems generalize both purely robust instances of the problem (namely, max-min submodular fair allocation (SFA) and min-max submodular load balancing (SLB)) and average-case instances (that is, the submodular welfare problem (SWP) and submodular multiway partition (SMP)).
While the robust versions have been studied in the theory community, existing work has focused on tight approximation guarantees, and the resultant algorithms are not, in general, scalable to very large real-world applications.
This is in contrast to the average case, where most of the algorithms are scalable.
In the present paper, we bridge this gap, by proposing several new algorithms (including those based on greedy, majorization-minimization, minorization-maximization, and relaxation algorithms) that not only scale to large sizes but that also achieve theoretical approximation guarantees close to the state-of-the-art, and in some cases achieve new tight bounds.
We also provide new scalable algorithms that apply to additive combinations of the robust and average-case extreme objectives.
We show that these problems have many applications in machine learning (ML).
This includes: 1) data partitioning and load balancing for distributed machine learning algorithms on parallel machines; 2) data clustering; and 3) multi-label image segmentation with (only) Boolean submodular functions via pixel partitioning.
We empirically demonstrate the efficacy of our algorithms on real-world problems involving data partitioning for distributed optimization of standard machine learning objectives (including both convex and deep neural network objectives), and also on purely unsupervised (i.e., no supervised or semi-supervised learning, and no interactive segmentation) image segmentation.
We consider two classes of computations which admit taking linear combinations of execution runs: probabilistic sampling and generalized animation.
We argue that the task of program learning should be more tractable for these architectures than for conventional deterministic programs.
We look at the recent advances in the "sampling the samplers" paradigm in higher-order probabilistic programming.
We also discuss connections between partial inconsistency, non-monotonic inference, and vector semantics.
Mobile phone usage provides a wealth of information, which can be used to better understand the demographic structure of a population.
In this paper we focus on the population of Mexican mobile phone users.
Our first contribution is an observational study of mobile phone usage according to gender and age groups.
We were able to detect significant differences in phone usage among different subgroups of the population.
Our second contribution is to provide a novel methodology to predict demographic features (namely age and gender) of unlabeled users by leveraging individual calling patterns, as well as the structure of the communication graph.
We provide details of the methodology and show experimental results on a real world dataset that involves millions of users.
In recent work, we formalized the theory of optimal-size sorting networks with the goal of extracting a verified checker for the large-scale computer-generated proof that 25 comparisons are optimal when sorting 9 inputs, which required more than a decade of CPU time and produced 27 GB of proof witnesses.
The checker uses an untrusted oracle based on these witnesses and is able to verify the smaller case of 8 inputs within a couple of days, but it did not scale to the full proof for 9 inputs.
In this paper, we describe several non-trivial optimizations of the algorithm in the checker, obtained by appropriately changing the formalization and capitalizing on the symbiosis with an adequate implementation of the oracle.
We provide experimental evidence of orders of magnitude improvements to both runtime and memory footprint for 8 inputs, and actually manage to check the full proof for 9 inputs.
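The kind of exhaustive check involved can be illustrated with the 0-1 principle: a comparator network sorts all inputs iff it sorts every 0/1 input. A toy sketch follows (the verified checker of course works at a vastly larger scale, with an oracle over proof witnesses):

```python
from itertools import product

def apply_network(network, xs):
    """Run a comparator network; each (i, j) orders positions i and j."""
    xs = list(xs)
    for i, j in network:
        if xs[i] > xs[j]:
            xs[i], xs[j] = xs[j], xs[i]
    return xs

def sorts_everything(network, n):
    """0-1 principle: check the network on all 2^n Boolean inputs."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))

# The optimal 3-input network uses 3 comparisons; dropping one breaks it.
net3 = [(0, 1), (1, 2), (0, 1)]
assert sorts_everything(net3, 3)
assert not sorts_everything(net3[:2], 3)
```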
In this paper we present an adaptable fast matrix multiplication (AFMM) algorithm for two $n \times n$ dense matrices, which computes the product matrix with average complexity $T_{avg}(n) = d_1 d_2 n^3$; here the average count is taken with addition as the basic operation, rather than multiplication, which is the customary choice of basic operation in existing matrix multiplication algorithms.
Communication systems with low-resolution analog-to-digital-converters (ADCs) can exploit channel state information at the transmitter (CSIT) and receiver.
This paper presents initial results on codebook design and performance analysis for limited feedback systems with one-bit ADCs.
Different from the high-resolution case, the absolute phase at the receiver is important to align the phase of the received signals when the received signal is sliced by one-bit ADCs.
A new codebook design for the beamforming case is proposed that separately quantizes the channel direction and the residual phase.
Identifying the minimum number of local regions of a handwritten character image that contain well-defined discriminating features sufficient for a minimal but complete description of the character is a challenging task.
A new region selection technique based on the idea of an enhanced Harmony Search methodology has been proposed here.
The powerful framework of Harmony Search has been utilized to search the region space and detect only the most informative regions for correctly recognizing the handwritten character.
The proposed method has been tested on handwritten samples of Bangla Basic, Compound, and mixed (Basic and Compound) characters separately, with an SVM-based classifier using a longest-run-based feature set obtained from the image subregions formed by a CG-based quad-tree partitioning approach.
Applying this methodology on the above mentioned three types of datasets, respectively 43.75%, 12.5% and 37.5% gains have been achieved in terms of region reduction and 2.3%, 0.6% and 1.2% gains have been achieved in terms of recognition accuracy.
The results show a sizeable reduction in the minimal number of descriptive regions as well as a significant increase in recognition accuracy for all the datasets using the proposed technique.
Thus, the time and cost of feature extraction are decreased without compromising the corresponding recognition accuracy.
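A skeleton of binary Harmony Search for region selection may clarify the search loop; the fitness function below is a stand-in (the paper scores SVM recognition accuracy over the selected regions), and all parameter values are illustrative:

```python
import random

# Each "harmony" is a 0/1 mask over candidate image regions. New harmonies are
# composed from memory (rate hmcr) or at random, then optionally pitch-adjusted
# (rate par) by flipping one bit; better harmonies replace the worst in memory.

def harmony_search(n_regions, fitness, memory_size=10, hmcr=0.9, par=0.3, iters=200):
    random.seed(1)
    memory = [[random.randint(0, 1) for _ in range(n_regions)]
              for _ in range(memory_size)]
    for _ in range(iters):
        new = [random.choice(memory)[d] if random.random() < hmcr
               else random.randint(0, 1)
               for d in range(n_regions)]
        if random.random() < par:            # pitch adjustment: flip one bit
            d = random.randrange(n_regions)
            new[d] = 1 - new[d]
        worst = min(range(memory_size), key=lambda i: fitness(memory[i]))
        if fitness(new) > fitness(memory[worst]):
            memory[worst] = new
    return max(memory, key=fitness)

# Stand-in fitness: reward keeping region 0 and dropping everything else.
def toy_fitness(mask):
    return mask[0] * 10 - sum(mask[1:])

best = harmony_search(8, toy_fitness)
assert best[0] == 1
```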
Several researchers have argued that a machine learning system's interpretability should be defined in relation to a specific agent or task: we should not ask if the system is interpretable, but to whom is it interpretable.
We describe a model intended to help answer this question, by identifying different roles that agents can fulfill in relation to the machine learning system.
We illustrate the use of our model in a variety of scenarios, exploring how an agent's role influences its goals, and the implications for defining interpretability.
Finally, we make suggestions for how our model could be useful to interpretability researchers, system developers, and regulatory bodies auditing machine learning systems.
It is generally accepted as common wisdom that receiving social feedback is helpful to (i) keep an individual engaged with a community and to (ii) facilitate an individual's positive behavior change.
However, quantitative data on the effect of social feedback on continued engagement in an online health community is scarce.
In this work we apply Mahalanobis Distance Matching (MDM) to demonstrate the importance of receiving feedback in the "loseit" weight loss community on Reddit.
Concretely we show that (i) even when correcting for differences in word choice, users receiving more positive feedback on their initial post are more likely to return in the future, and that (ii) there are diminishing returns and social feedback on later posts is less important than for the first post.
We also give a description of the type of initial posts that are more likely to attract this valuable social feedback.
Though we cannot yet argue about ultimate weight loss success or failure, we believe that understanding the social dynamics underlying online health communities is an important step to devise more effective interventions.
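A minimal sketch of the matching step: pair each treated user with the nearest control under the Mahalanobis metric on covariates. The covariates and the identity inverse covariance below are illustrative assumptions, not the study's actual features:

```python
# Greedy 1:1 Mahalanobis Distance Matching in two dimensions, with the inverse
# covariance supplied explicitly (hand-invertible in the 2x2 case).

def mahalanobis2(u, v, inv_cov):
    """Squared Mahalanobis distance between 2-D points u and v."""
    dx, dy = u[0] - v[0], u[1] - v[1]
    (a, b), (c, d) = inv_cov
    return dx * (a * dx + b * dy) + dy * (c * dx + d * dy)

def match(treated, controls, inv_cov):
    """Greedy nearest-neighbor matching; returns treated-index -> control-index."""
    pairs, used = {}, set()
    for i, t in enumerate(treated):
        j = min((j for j in range(len(controls)) if j not in used),
                key=lambda j: mahalanobis2(t, controls[j], inv_cov))
        pairs[i] = j
        used.add(j)
    return pairs

inv_cov = ((1.0, 0.0), (0.0, 1.0))   # identity: reduces to Euclidean distance
treated = [(1.0, 1.0)]
controls = [(5.0, 5.0), (1.1, 0.9)]
assert match(treated, controls, inv_cov) == {0: 1}
```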
Tor is the most widely used anonymity network, currently serving millions of users each day.
However, there is no access control in place for all these users, leaving the network vulnerable to botnet abuse and attacks.
For example, criminals frequently use exit relays as stepping stones for attacks, causing service providers to serve CAPTCHAs to exit relay IP addresses or blacklisting them altogether, which leads to severe usability issues for legitimate Tor users.
To address this problem, we propose TorPolice, the first privacy-preserving access control framework for Tor.
TorPolice enables abuse-plagued service providers such as Yelp to enforce access rules to police and throttle malicious requests coming from Tor while still providing service to legitimate Tor users.
Further, TorPolice equips Tor with global access control for relays, enhancing Tor's resilience to botnet abuse.
We show that TorPolice preserves the privacy of Tor users, implement a prototype of TorPolice, and perform extensive evaluations to validate our design goals.
Emergency communications require reliability and flexibility for disaster recovery and relief operations.
Based upon existing commercial portable devices (e.g., smartphones, tablets, laptops), we propose a network architecture that uses cellular networks and WiFi connections to deliver large files in emergency scenarios under the impairments of wireless channel such as packet losses and intermittent connection issues.
Network coding (NC) is exploited to improve the delivery probability.
We first review the state-of-the-art of NC for emergency communications.
Then, we present the proposed network architecture which utilizes multiple radio interfaces of portable devices to support data delivery.
A random linear NC scheme is exploited at the source to enhance the reliability of content delivery against packet losses.
Besides, an analytical model for the successful decoding probability in linear NC is derived.
Finally, we evaluate the effectiveness of the proposed architecture with NC in terms of the delivery ratio of content for intermittent connectivity scenarios.
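The standard closed form for this kind of decoding probability, with k source packets, m >= k received coded packets, and coefficients drawn uniformly from GF(q), is P(m, k) = prod_{i=0}^{k-1} (1 - q^{-(m-i)}): decoding succeeds iff the m x k coefficient matrix has full rank. Whether this matches the paper's exact channel model is an assumption; a quick numeric sketch:

```python
# Probability that m random coded packets over GF(q) span k source packets.

def decode_probability(m, k, q=2):
    p = 1.0
    for i in range(k):
        p *= 1.0 - q ** -(m - i)
    return p

# Receiving a few extra coded packets drives the success probability toward 1.
assert decode_probability(10, 10) < decode_probability(12, 10)
# Large fields make even a small overhead nearly sufficient.
assert decode_probability(20, 10, q=256) > 0.99
```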
Photoacoustic spectral analysis is a novel tool for studying various parameters affecting signals in photoacoustic microscopy.
However, observing only the frequency components of photoacoustic signals does not provide enough data for a thorough analysis.
Thus, a hybrid time-domain and frequency-domain analysis scheme is proposed to investigate the effects of various parameters, such as depth of microscopy, laser focal spot size, and contrast agent concentration, on photoacoustic signals.
Liquids are an important part of many common manipulation tasks in human environments.
If we wish to have robots that can accomplish these types of tasks, they must be able to interact with liquids in an intelligent manner.
In this paper, we investigate ways for robots to perceive and reason about liquids.
That is, a robot asks the questions "What in the visual data stream is liquid?" and "How can I use that to infer all the potential places where liquid might be?"
We collected two datasets to evaluate these questions, one using a realistic liquid simulator and another on our robot.
We used fully convolutional neural networks to learn to detect and track liquids across pouring sequences.
Our results show that these networks are able to perceive and reason about liquids, and that integrating temporal information is important to performing such tasks well.
Recent incidents of data breaches call for organizations to proactively identify cyber attacks on their systems.
Darkweb/Deepweb (D2web) forums and marketplaces provide environments where hackers anonymously discuss existing vulnerabilities and commercialize malicious software to exploit those vulnerabilities.
These platforms offer security practitioners a threat intelligence environment that allows them to mine for patterns related to organization-targeted cyber attacks.
In this paper, we describe a system (called DARKMENTION) that learns association rules correlating indicators of attacks from D2web to real-world cyber incidents.
Using the learned rules, DARKMENTION generates and submits warnings to a Security Operations Center (SOC) prior to attacks.
Our goal was to design a system that automatically generates enterprise-targeted warnings that are timely, actionable, accurate, and transparent.
We show that DARKMENTION meets our goal.
In particular, we show that it outperforms baseline systems that attempt to generate warnings of cyber attacks related to two enterprises with an average increase in F1 score of about 45% and 57%.
Additionally, DARKMENTION was deployed as part of a larger system that is built under a contract with the IARPA Cyber-attack Automated Unconventional Sensor Environment (CAUSE) program.
It is actively producing warnings that precede attacks by an average of 3 days.
How many links can be cut before a network is bisected?
What is the maximal bandwidth that can be pushed between two nodes of a network?
These questions are closely related to network resilience, path choice for multipath routing or bisection bandwidth estimations in data centers.
The answer is quantified using metrics such as the number of edge-disjoint paths between two network nodes and the cumulative bandwidth that can flow over these paths.
In practice though, such calculations are far from simple due to the restrictive effect of network policies on path selection.
Policies are set by network administrators to conform to service level agreements, protect valuable resources or optimize network performance.
In this work, we introduce a general methodology for estimating lower and upper bounds for the policy-compliant path diversity and bisection bandwidth between two nodes of a network, effectively quantifying the effect of policies on these metrics.
Exact values can be obtained if certain conditions hold.
The approach is based on regular languages and can be applied in a variety of use cases.
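The policy-free baseline for these metrics, counting edge-disjoint paths, is a unit-capacity max-flow; a small Edmonds-Karp sketch on a toy graph (the policy-compliant bounds themselves are not reproduced here):

```python
from collections import deque

# Number of edge-disjoint directed paths from s to t = max-flow with every
# edge given capacity 1 (Menger's theorem).

def edge_disjoint_paths(edges, s, t):
    cap, adj = {}, {}
    for u, v in edges:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)          # residual edge
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        parent = {s: None}                 # BFS for a shortest augmenting path
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t                              # push one unit along the path found
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Two edge-disjoint routes from 'a' to 'd'; cutting both disconnects the pair.
edges = [('a', 'b'), ('b', 'd'), ('a', 'c'), ('c', 'd'), ('b', 'c')]
assert edge_disjoint_paths(edges, 'a', 'd') == 2
```

Policy compliance restricts which of these paths are actually usable, which is exactly the gap the bounds in the paper quantify.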
Shannon entropy was defined for probability distributions, and its use was later extended to measure the uncertainty of knowledge for systems with complete information.
In this article, we propose to extend the use of Shannon entropy to under-defined or over-defined information systems.
To be able to use Shannon entropy, the information is normalized by an affine transformation.
The construction of affine transformation is done in two stages: one for homothety and another for translation.
Moreover, the case of information with a certain degree of imprecision was included in this approach.
The article also illustrates the use of Shannon entropy for some particular cases, such as neutrosophic information (in both the trivalent and bivalent cases), bifuzzy information, intuitionistic fuzzy information, imprecise fuzzy information, and fuzzy partitions.
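A generic sketch of the two-stage idea, mapping the components onto the probability simplex by an affine transformation before applying Shannon entropy; the article's exact affine maps may differ from this simple shift-and-rescale:

```python
import math

# Translate so all components are non-negative, then rescale (homothety) so
# they sum to 1, yielding a proper distribution Shannon entropy applies to.

def normalize(mu):
    shift = min(min(mu), 0.0)
    shifted = [m - shift for m in mu]
    total = sum(shifted)
    return [m / total for m in shifted]

def shannon_entropy(p):
    return -sum(x * math.log2(x) for x in p if x > 0)

# Over-defined membership degrees (sum > 1) become a proper distribution:
p = normalize([0.9, 0.6])
assert abs(sum(p) - 1.0) < 1e-12
assert 0.0 <= shannon_entropy(p) <= 1.0   # at most log2(2) = 1 bit for two outcomes
```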
The increasing use of online channels for service delivery raises new challenges in service failure prevention.
This work-in-progress paper reports on the first phase of an action-design research project to develop a service failure prevention methodology.
In this paper we review the literature on online services, failure prevention and failure recovery and develop a theoretical framework for online service failure prevention.
This provides the theoretical grounding for the artefact (the methodology) to be developed.
We use this framework to develop an initial draft of our methodology.
We then outline the remaining phases of the research, and offer some initial conclusions gained from the project to date.
Recent work has shown that deep reinforcement-learning agents can learn to follow language-like instructions from infrequent environment rewards.
However, this places on environment designers the onus of designing language-conditional reward functions which may not be easily or tractably implemented as the complexity of the environment and the language scales.
To overcome this limitation, we present a framework within which instruction-conditional RL agents are trained using rewards obtained not from the environment, but from reward models which are jointly trained from expert examples.
As reward models improve, they learn to accurately reward agents for completing tasks for environment configurations---and for instructions---not present amongst the expert data.
This framework effectively separates the representation of what instructions require from how they can be executed.
In a simple grid world, it enables an agent to learn a range of commands requiring interaction with blocks and understanding of spatial relations and underspecified abstract arrangements.
We further show the method allows our agent to adapt to changes in the environment without requiring new expert examples.
We analyzed the relation of surgical service providers' network structure and surgical team size to patient outcomes during operations.
We performed a correlation analysis to evaluate the associations among the network structure measures in the intra-operative networks of surgical service providers.
We focused on intra-operative networks of surgical service providers, in a quaternary-care academic medical center, using retrospective Electronic Medical Record (EMR) data.
We used de-identified intra-operative data for adult patients who received nonambulatory/nonobstetric surgery in a main operating room at Shands at the University of Florida between June 1, 2011 and November 1, 2014.
The intra-operative dataset contained 30,211 unique surgical cases.
To perform the analysis, we created the networks of surgical service providers and calculated several network structure measures at both team and individual levels.
We considered the number of patient complications as the target variable and assessed its interrelations with the calculated network measures along with other influencing factors (e.g., surgical team size, type of surgery).
Our results confirm the significant role of interactions among surgical providers in patient outcomes.
In addition, we observed that providers who are highly central at the global network level are more likely to be associated with a lower number of surgical complications, while locally important providers might be associated with a higher number of complications.
We also found a positive relation between patient age and the number of complications.
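One of the network structure measures involved, degree centrality, can be illustrated on a toy provider network; the provider names and edges below are hypothetical.

```python
from collections import defaultdict

# Hypothetical intra-operative collaboration edges: pairs of providers
# who appeared together on the same surgical case.
edges = [("surgeon_A", "nurse_1"), ("surgeon_A", "anesth_1"),
         ("surgeon_B", "nurse_1"), ("nurse_1", "anesth_1")]

adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

n = len(adj)
# Degree centrality: the fraction of other providers a provider worked with.
degree_centrality = {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}
print(sorted(degree_centrality.items()))
```

Team-level measures (e.g., density) and other individual-level measures (e.g., betweenness) follow the same construction on these adjacency sets.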
In many practical cases, the engineer has access to prior knowledge like rough values of the DC-gain or the main time constant of the system.
In order to improve the accuracy of subspace-based identification techniques using the model Markov parameters, we derive in this short paper the direct links between these impulse response coefficients and this prior information.
The next step will consist in introducing this prior knowledge explicitly in Kung's algorithm thanks to dedicated equality and inequality constraints.
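One such link can be made concrete: for a stable discrete-time system, the DC-gain equals the sum of its Markov (impulse-response) coefficients, so a rough prior on the DC-gain directly constrains their sum. A toy first-order example (all values hypothetical):

```python
# Toy first-order system: x[k+1] = a x[k] + b u[k], y[k] = c x[k].
# Its Markov parameters are h(k) = c * a**(k-1) * b for k >= 1, and
# the DC-gain G(1) equals their sum (= c*b / (1 - a) analytically).
a, b, c = 0.5, 1.0, 1.0
markov = [c * a**k * b for k in range(200)]  # truncated impulse response
dc_gain = sum(markov)
print(round(dc_gain, 6))  # analytically c*b/(1-a) = 2.0
```

A prior on the DC-gain thus translates into a linear equality (or inequality) constraint on the impulse-response coefficients.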
A virtual machine is built on a group of real servers that are scattered globally and connected through telecommunication systems; it plays an increasingly important role in operations by providing the ability to exploit virtual resources.
This technique helps to use computing resources more effectively and has many benefits, such as reducing the cost of power and cooling, and hence contributes to Green Computing.
To ensure that these resources are supplied to demanding processes correctly and promptly, avoiding any duplication or conflict, especially for remote resources, it is necessary to study and propose a reliable solution suitable as a foundation for internal control systems in the cloud.
Within the scope of this paper, we seek a way to allocate distributed resources efficiently, emphasizing solutions that prevent deadlock and proposing methods to avoid resource shortages.
With this approach, the outcome is a checklist of resource states with potential deadlock or resource shortage; by sending messages to the servers, each server learns the situation and can react accordingly.
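As a minimal illustration of deadlock detection in this setting, the sketch below runs a cycle check on a wait-for graph, a classic technique; the paper's message-based checklist mechanism is not reproduced here, and the process names are hypothetical.

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph (process -> processes it waits
    on). A cycle means none of the involved processes can ever proceed:
    a deadlock."""
    def visit(node, stack):
        if node in stack:
            return True
        return any(visit(nxt, stack | {node})
                   for nxt in wait_for.get(node, ()))
    return any(visit(p, set()) for p in wait_for)

print(has_deadlock({"P1": ["P2"], "P2": ["P1"]}),  # circular wait
      has_deadlock({"P1": ["P2"], "P2": []}))      # no cycle
```

In a distributed deployment, each server would contribute its local edges of the wait-for graph via the messages described above.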
A novel semantic approach to data selection and compression is presented for the dynamic adaptation of IoT data processing and transmission within "wireless islands", where a set of sensing devices (sensors) are interconnected through one-hop wireless links to a computational resource via a local access point.
The core of the proposed technique is a cooperative framework where local classifiers at the mobile nodes are dynamically crafted and updated based on the current state of the observed system, the global processing objective and the characteristics of the sensors and data streams.
The edge processor plays a key role by establishing a link between content and operations within the distributed system.
The local classifiers are designed to filter the data streams and provide only the needed information to the global classifier at the edge processor, thus minimizing bandwidth usage.
However, the better the accuracy of these local classifiers, the larger the energy necessary to run them at the individual sensors.
A formulation of the optimization problem for the dynamic construction of the classifiers under bandwidth and energy constraints is proposed and demonstrated on a synthetic example.
The training complexity of deep learning-based channel decoders scales exponentially with the codebook size and therefore with the number of information bits.
Thus, neural network decoding (NND) is currently only feasible for very short block lengths.
In this work, we show that the conventional iterative decoding algorithm for polar codes can be enhanced when sub-blocks of the decoder are replaced by neural network (NN) based components.
Thus, we partition the encoding graph into smaller sub-blocks and train them individually, closely approaching maximum a posteriori (MAP) performance per sub-block.
These blocks are then connected via the remaining conventional belief propagation decoding stage(s).
The resulting decoding algorithm is non-iterative and inherently enables a high-level of parallelization, while showing a competitive bit error rate (BER) performance.
We examine the degradation through partitioning and compare the resulting decoder to state-of-the-art polar decoders such as successive cancellation list and belief propagation decoding.
The handwriting of Chinese characters has long been an important skill in East Asia.
However, automatic generation of handwritten Chinese characters poses a great challenge due to the large number of characters.
Various machine learning techniques have been used to recognize Chinese characters, but few works have studied the handwritten Chinese character generation problem, especially with unpaired training data.
In this work, we formulate the Chinese handwritten character generation as a problem that learns a mapping from an existing printed font to a personalized handwritten style.
We further propose DenseNet CycleGAN to generate Chinese handwritten characters.
Our method is applied not only to commonly used Chinese characters but also to calligraphy work with aesthetic values.
Furthermore, we propose content accuracy and style discrepancy as the evaluation metrics to assess the quality of the handwritten characters generated.
We then use our proposed metrics to evaluate the generated characters from CASIA dataset as well as our newly introduced Lanting calligraphy dataset.
Single document summarization is the task of producing a shorter version of a document while preserving its principal information content.
In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective.
We use our algorithm to train a neural summarization model on the CNN and DailyMail datasets and demonstrate experimentally that it outperforms state-of-the-art extractive and abstractive systems when evaluated automatically and by humans.
Computer-aided diagnosis (CAD) systems are crucial for modern medical imaging.
However, almost all CAD systems operate on reconstructed images, which are optimized for radiologists.
Computer vision can capture features that are subtle to human observers, so it is desirable to design a CAD system operating on the raw data.
In this paper, we proposed a deep-neural-network-based detection system for lung nodule detection in computed tomography (CT).
A primal-dual-type deep reconstruction network was applied first to convert the raw data to the image space, followed by a 3-dimensional convolutional neural network (3D-CNN) for the nodule detection.
For efficient network training, the deep reconstruction network and the CNN detector were first trained sequentially, followed by one epoch of end-to-end fine-tuning.
The method was evaluated on the Lung Image Database Consortium image collection (LIDC-IDRI) with simulated forward projections.
With 144 multi-slice fanbeam projections, the proposed end-to-end detector achieves comparable sensitivity to the reference detector, which was trained and applied on the fully-sampled image data.
It also demonstrated superior detection performance compared to detectors trained on the reconstructed images.
The proposed method is general and could be expanded to most detection tasks in medical imaging.
Deep learning has seen tremendous success over the past decade in computer vision, machine translation, and gameplay.
This success rests in crucial ways on gradient-descent optimization and the ability to learn parameters of a neural network by backpropagating observed errors.
However, neural network architectures are growing increasingly sophisticated and diverse, which motivates an emerging quest for even more general forms of differentiable programming, where arbitrary parameterized computations can be trained by gradient descent.
In this paper, we take a fresh look at automatic differentiation (AD) techniques, and especially aim to demystify the reverse-mode form of AD that generalizes backpropagation in neural networks.
We uncover a tight connection between reverse-mode AD and delimited continuations, which permits implementing reverse-mode AD purely via operator overloading and without any auxiliary data structures.
We further show how this formulation of AD can be fruitfully combined with multi-stage programming (staging), leading to a highly efficient implementation that combines the performance benefits of deep learning frameworks based on explicit reified computation graphs (e.g., TensorFlow) with the expressiveness of pure library approaches (e.g., PyTorch).
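A stripped-down flavor of reverse-mode AD via operator overloading can be sketched in a few lines of Python. This recursive version is only an illustration of the general idea: it uses no delimited continuations or staging, and real implementations propagate adjoints in topological order rather than by recursion.

```python
class Var:
    """Each overloaded operation records its inputs and local partial
    derivatives; backward() propagates adjoints through the graph."""
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0
    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])
    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])
    def backward(self, seed=1.0):
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(3.0), Var(2.0)
z = x * y + x            # z = x*y + x, so dz/dx = y + 1, dz/dy = x
z.backward()
print(x.grad, y.grad)
```

Note that no auxiliary tape is kept: the graph lives entirely in the `parents` references created by the overloaded operators.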
We briefly report on a successful linear program reconstruction attack performed on a production statistical queries system and using a real dataset.
The attack was deployed in a test environment in the course of the Aircloak Challenge bug bounty program and is based on the reconstruction algorithm of Dwork, McSherry, and Talwar.
We empirically evaluate the effectiveness of the algorithm and a related algorithm by Dinur and Nissim with various dataset sizes, error rates, and numbers of queries in a Gaussian noise setting.
This paper deals with the issue of the perceptual quality evaluation of user-generated videos shared online, which is an important step toward designing video-sharing services that maximize users' satisfaction in terms of quality.
We first analyze viewers' quality perception patterns by applying graph analysis techniques to subjective rating data.
We then examine the performance of existing state-of-the-art objective metrics for the quality estimation of user-generated videos.
In addition, we investigate the feasibility of metadata accompanied with videos in online video-sharing services for quality estimation.
Finally, various issues in the quality assessment of online user-generated videos are discussed, including difficulties and opportunities.
Disentangled distributed representations of data are desirable for machine learning, since they are more expressive and can generalize from fewer examples.
However, for complex data, the distributed representations of multiple objects present in the same input can interfere and lead to ambiguities, which is commonly referred to as the binding problem.
We argue for the importance of the binding problem to the field of representation learning, and develop a probabilistic framework that explicitly models inputs as a composition of multiple objects.
We propose an unsupervised algorithm that uses denoising autoencoders to dynamically bind features together in multi-object inputs through an Expectation-Maximization-like clustering process.
The effectiveness of this method is demonstrated on artificially generated datasets of binary images, showing that it can even generalize to bind together new objects never seen by the autoencoder during training.
Information processing has reached the era of big data.
Big data challenges are difficult to address with the traditional Von Neumann or Turing approach.
Hence, the implementation of new computational techniques is essential.
Nanophotonics with its remarkable speed and multiplexing capability is a promising candidate for such implementations.
This paper proposes a novel photonic computing system made up of a Mach-Zehnder interferometer and an optical fiber spool to emulate a powerful machine learning technique called reservoir computing.
The proposed system is also integrated with a time-division-multiplexing circuit to facilitate parallel computation of multiple tasks, which is the first of its kind.
The proposed design performs large-scale tasks like spoken digit recognition, channel equalization, and time-series prediction.
Experimental results with a standard photonic simulator demonstrate significant performance in terms of speed and accuracy compared to state-of-the-art digital and software implementations.
Actor-critic methods can achieve incredible performance on difficult reinforcement learning problems, but they are also prone to instability.
This is partly due to the interaction between the actor and critic during learning, e.g., an inaccurate step taken by one of them might adversely affect the other and destabilize the learning.
To avoid such issues, we propose to regularize the learning objective of the actor by penalizing the temporal difference (TD) error of the critic.
This improves stability by avoiding large steps in the actor update whenever the critic is highly inaccurate.
The resulting method, which we call the TD-regularized actor-critic method, is a simple plug-and-play approach to improve stability and overall performance of the actor-critic methods.
Evaluations on standard benchmarks confirm this.
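The regularized objective can be sketched as follows. This is an assumed form of the penalty (actor loss plus a multiple of the squared TD error); the paper's exact formulation may differ, and the coefficient `eta` and all numeric values are hypothetical.

```python
def td_error(reward, gamma, v_s, v_next):
    """One-step temporal-difference error of the critic."""
    return reward + gamma * v_next - v_s

def regularized_actor_loss(log_prob, advantage, delta, eta=0.1):
    """Policy-gradient actor loss plus a penalty on the squared TD
    error: a large critic error shrinks the effective actor step."""
    return -log_prob * advantage + eta * delta ** 2

delta = td_error(reward=1.0, gamma=0.99, v_s=0.5, v_next=0.6)
loss = regularized_actor_loss(log_prob=-0.2, advantage=delta, delta=delta)
print(round(loss, 4))
```

The plug-and-play nature of the method comes from the fact that only the actor's loss changes; the critic update is left untouched.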
Ezhil is a Tamil language based interpreted procedural programming language.
Tamil keywords and grammar are chosen to enable native Tamil speakers to write programs in the Ezhil system.
Ezhil allows computer programs to be expressed using logical constructs close to the Tamil language, equivalent to the conditional, branch and loop statements in modern English-based programming languages.
Ezhil is a compact programming language aimed at Tamil-speaking novice computer users.
Grammar for Ezhil and a few example programs are reported here, from the initial proof-of-concept implementation using the Python programming language.
To the best of our knowledge, Ezhil language is the first freely available Tamil programming language.
Recent hardware developments have made unprecedented amounts of data parallelism available for accelerating neural network training.
Among the simplest ways to harness next-generation accelerators is to increase the batch size in standard mini-batch neural network training algorithms.
In this work, we aim to experimentally characterize the effects of increasing the batch size on training time, as measured in the number of steps necessary to reach a goal out-of-sample error.
Eventually, increasing the batch size will no longer reduce the number of training steps required, but the exact relationship between the batch size and how many training steps are necessary is of critical importance to practitioners, researchers, and hardware designers alike.
We study how this relationship varies with the training algorithm, model, and data set and find extremely large variation between workloads.
Along the way, we reconcile disagreements in the literature on whether batch size affects model quality.
Finally, we discuss the implications of our results for efforts to train neural networks much faster in the future.
In this article we show how power transformations can be used as a common framework for the derivation of local term weights.
We found that under some parametric conditions, BM25 and inverse regression produce equivalent results.
As a special case of inverse regression, we show that the largest increment in term weight occurs when a term is mentioned for the second time.
A model based on inverse regression (BM25IR) is presented.
Simulations suggest that BM25IR works fairly well for different BM25 parametric conditions and document lengths.
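The saturation behavior underlying such local term weights can be illustrated with the standard BM25 term-frequency component; the BM25IR variant itself is not reproduced here, and the parameter values below are the common defaults rather than those of the paper.

```python
def bm25_tf(tf, k1=1.2, b=0.75, dl=100, avgdl=100):
    """Standard BM25 local (term-frequency) weight with saturation
    parameter k1 and length normalization parameter b."""
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))

weights = [bm25_tf(tf) for tf in range(5)]
increments = [round(w2 - w1, 3) for w1, w2 in zip(weights, weights[1:])]
print(increments)  # each repeated mention adds less weight than the last
```

Plotting such increments against term frequency is a direct way to compare the growth profiles of BM25 and an inverse-regression model.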
The identity of a user is permanently lost if biometric data gets compromised since the biometric information is irreplaceable and irrevocable.
To revoke and reissue a new template in place of the compromised biometric template, the idea of cancelable biometrics has been introduced.
The concept behind cancelable biometrics is to irreversibly transform the original biometric template and perform the comparison in the protected domain.
In this paper, a coprime transformation scheme has been proposed to derive a protected fingerprint template.
The method divides the fingerprint region into a number of sectors with respect to each minutiae point and identifies the nearest-neighbor minutiae in each sector.
Then, ridge features for all neighboring minutiae points are computed and mapped onto co-prime positions of a random matrix to generate the cancelable template.
The proposed approach achieves an EER of 1.82, 1.39, 4.02 and 5.77 on DB1, DB2, DB3 and DB4 datasets of the FVC2002 and an EER of 8.70, 7.95, 5.23 and 4.87 on DB1, DB2, DB3 and DB4 datasets of FVC2004 databases, respectively.
Experimental evaluations indicate that the method outperforms the current state-of-the-art.
Moreover, it has been confirmed from the security analysis that the proposed method fulfills the desired characteristics of diversity, revocability, and non-invertibility with a minor performance degradation caused by the transformation.
Cloze-style reading comprehension has been a popular task for measuring the progress of natural language understanding in recent years.
In this paper, we design a novel multi-perspective framework, which can be seen as the joint training of heterogeneous experts and aggregate context information from different perspectives.
Each perspective is modeled by a simple aggregation module.
The outputs of multiple aggregation modules are fed into a one-timestep pointer network to get the final answer.
At the same time, to tackle the problem of insufficient labeled data, we propose an efficient sampling mechanism to automatically generate more training examples by matching the distribution of candidates between labeled and unlabeled data.
We conduct our experiments on a recently released cloze-test dataset CLOTH (Xie et al., 2017), which consists of nearly 100k questions designed by professional teachers.
Results show that our method achieves new state-of-the-art performance over previous strong baselines.
In this paper, a contrastive evaluation of massively parallel implementations of the suffix tree and suffix array for accelerating genome sequence matching is presented, based on an Intel Core i7 3770K quad-core CPU and an NVIDIA GeForce GTX680 GPU.
Besides the suffix array holding only approximately 20%-30% of the space of the suffix tree, the coalesced binary search and tile optimization make the suffix array clearly outperform the suffix tree on the GPU.
Consequently, the experimental results show that multiple genome sequence matching based on the suffix array achieves a speedup of more than 99x over the serial CPU implementation.
Massively parallel matching based on the suffix array is thus an efficient approach to high-performance bioinformatics applications.
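The pattern search at the heart of this comparison can be sketched sequentially in Python; this is a naive O(n^2 log n) construction and a plain binary search, not the coalesced, tile-optimized GPU version, and the genome string is a toy example.

```python
def suffix_array(s):
    """Naive suffix array: indices of suffixes in sorted order."""
    return sorted(range(len(s)), key=lambda i: s[i:])

def find(s, sa, pattern):
    """Binary search over the suffix array: locate the leftmost suffix
    whose prefix is >= pattern, then check for an exact prefix match."""
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and s[sa[lo]:sa[lo] + len(pattern)] == pattern

genome = "ACGTACGA"
sa = suffix_array(genome)
print(find(genome, sa, "GTA"), find(genome, sa, "TTT"))
```

The GPU variant parallelizes exactly this binary search across many query patterns at once.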
Facility location queries identify the best locations to set up new facilities for providing service to its users.
The majority of existing works in this space assume that user locations are static.
Such an assumption is too restrictive for planning many modern real-life services, such as fuel stations, ATMs, convenience stores, and cellphone base-stations, that are widely accessed by mobile users.
The placement of such services should, therefore, factor in the mobility patterns or trajectories of the users rather than simply their static locations.
In this work, we introduce the TOPS (Trajectory-Aware Optimal Placement of Services) query that locates the best k sites on a road network.
The aim is to optimize a wide class of objective functions defined over the user trajectories.
We show that the problem is NP-hard and even the greedy heuristic with an approximation bound of (1-1/e) fails to scale on urban-scale datasets.
To overcome this challenge, we develop a multi-resolution clustering based indexing framework called NetClus.
Empirical studies on real road network trajectory datasets show that NetClus offers solutions that are comparable in terms of quality with those of the greedy heuristic, while having practical response times and low memory footprints.
Additionally, the NetClus framework can absorb dynamic updates in mobility patterns, handle constraints such as site-costs and capacity, and existing services, thereby providing an effective solution for modern urban-scale scenarios.
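The greedy heuristic referenced above can be sketched for a coverage-style objective: pick k sites so that the most trajectories pass near at least one chosen site. The (1 - 1/e) guarantee holds because such coverage functions are monotone submodular. The trajectory and site data below are hypothetical, and the paper's objective class is broader.

```python
# Toy data: each trajectory maps to the candidate sites it passes near.
trajectories = {
    "t1": {"s1", "s2"}, "t2": {"s2"}, "t3": {"s3"}, "t4": {"s2", "s3"},
}

def greedy_placement(trajectories, sites, k):
    """At each step pick the site covering the most yet-uncovered
    trajectories (marginal-gain greedy)."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(sites - set(chosen),
                   key=lambda s: sum(1 for t, near in trajectories.items()
                                     if t not in covered and s in near))
        chosen.append(best)
        covered |= {t for t, near in trajectories.items() if best in near}
    return chosen

print(greedy_placement(trajectories, {"s1", "s2", "s3"}, k=2))
```

Each greedy step scans every candidate site against every uncovered trajectory, which is exactly the cost that fails to scale on urban-scale datasets and motivates the NetClus index.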
Content-Centric Networking (CCN) is an internetworking paradigm that offers an alternative to today's IP-based Internet Architecture.
Instead of focusing on hosts and their locations, CCN emphasizes addressable named content.
By decoupling content from its location, CCN allows opportunistic in-network content caching, thus enabling better network utilization, at least for scalable content distribution.
However, in order to be considered seriously, CCN must support basic security services, including content authenticity, integrity, confidentiality, authorization and access control.
Current approaches rely on content producers to perform authorization and access control.
This general approach has several disadvantages.
First, consumer privacy vis-a-vis producers is not preserved.
Second, identity management and access control impose high computational overhead on producers.
Also, unnecessary repeated authentication and access control decisions must be made for each content request.
These issues motivate our design of KRB-CCN - a complete authorization and access control system for private CCNs.
Inspired by Kerberos in IP-based networks, KRB-CCN involves distinct authentication and authorization authorities.
By doing so, KRB-CCN obviates the need for producers to make consumer authentication and access control decisions.
KRB-CCN preserves consumer privacy since producers are unaware of consumer identities.
Producers are also not required to keep any hard state and only need to perform two symmetric key operations to guarantee that sensitive content is confidentially delivered only to authenticated and authorized consumers.
Most importantly, unlike prior designs, KRB-CCN leaves the network (i.e., CCN routers) out of any authorization, access control or confidentiality issues.
We describe KRB-CCN design and implementation, analyze its security, and report on its performance.
In order to extract the best possible performance from asynchronous stochastic gradient descent one must increase the mini-batch size and scale the learning rate accordingly.
In order to achieve further speedup we introduce a technique that delays gradient updates effectively increasing the mini-batch size.
Unfortunately, increasing the mini-batch size worsens the stale gradient problem in asynchronous stochastic gradient descent (SGD), which degrades model convergence.
We introduce local optimizers which mitigate the stale gradient problem and together with fine tuning our momentum we are able to train a shallow machine translation system 27% faster than an optimized baseline with negligible penalty in BLEU.
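The delayed-update idea can be sketched as plain gradient accumulation; scalars stand in for gradient tensors, the delay of 4 is an arbitrary illustrative value, and the local-optimizer component is not reproduced here.

```python
def accumulate_gradients(grads, delay=4):
    """Delay gradient updates by buffering `delay` mini-batch gradients
    and applying their average -- effectively multiplying the
    mini-batch size by `delay`."""
    buffered, applied = [], []
    for g in grads:
        buffered.append(g)
        if len(buffered) == delay:
            applied.append(sum(buffered) / len(buffered))
            buffered.clear()
    return applied

print(accumulate_gradients([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
```

Fewer, larger updates reduce communication, but each applied gradient is computed against older parameters, which is the staleness trade-off the local optimizers are meant to mitigate.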
Entity Linking aims to link entity mentions in texts to knowledge bases, and neural models have achieved recent success in this task.
However, most existing methods rely on local contexts to resolve entities independently, which may usually fail due to the data sparsity of local information.
To address this issue, we propose a novel neural model for collective entity linking, named as NCEL.
NCEL applies Graph Convolutional Network to integrate both local contextual features and global coherence information for entity linking.
To improve the computation efficiency, we approximately perform graph convolution on a subgraph of adjacent entity mentions instead of those in the entire text.
We further introduce an attention scheme to improve the robustness of NCEL to data noise and train the model on Wikipedia hyperlinks to avoid overfitting and domain bias.
In experiments, we evaluate NCEL on five publicly available datasets to verify the linking performance as well as generalization ability.
We also conduct an extensive analysis of time complexity, the impact of key modules, and qualitative results, which demonstrate the effectiveness and efficiency of our proposed method.
In this paper, we propose a fully convolutional network for 3D human pose estimation from monocular images.
We use limb orientations as a new way to represent 3D poses and bind the orientation together with the bounding box of each limb region to better associate images and predictions.
The 3D orientations are modeled jointly with 2D keypoint detections.
Without additional constraints, this simple method can achieve good results on several large-scale benchmarks.
Further experiments show that our method can generalize well to novel scenes and is robust to inaccurate bounding boxes.
This paper first describes an `obfuscating' compiler technology developed for encrypted computing, then examines if the trivial case without encryption produces much-sought indistinguishability obfuscation.
Learning to rank has recently emerged as an attractive technique to train deep convolutional neural networks for various computer vision tasks.
Pairwise ranking, in particular, has been successful in multi-label image classification, achieving state-of-the-art results on various benchmarks.
However, most existing approaches use the hinge loss to train their models, which is non-smooth and thus is difficult to optimize especially with deep networks.
Furthermore, they employ simple heuristics, such as top-k or thresholding, to determine which labels to include in the output from a ranked list of labels, which limits their use in the real-world setting.
In this work, we propose two techniques to improve pairwise ranking based multi-label image classification: (1) we propose a novel loss function for pairwise ranking, which is smooth everywhere and thus is easier to optimize; and (2) we incorporate a label decision module into the model, estimating the optimal confidence thresholds for each visual concept.
We provide theoretical analyses of our loss function in the Bayes consistency and risk minimization framework, and show its benefit over existing pairwise ranking formulations.
We demonstrate the effectiveness of our approach on three large-scale datasets, VOC2007, NUS-WIDE and MS-COCO, achieving the best reported results in the literature.
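A common smooth surrogate for the pairwise hinge is the softplus (log-sum-exp) form sketched below; this illustrates the smoothing idea and is not necessarily the paper's exact loss function.

```python
import math

def hinge_pair(s_pos, s_neg, margin=1.0):
    """Pairwise hinge: non-smooth at the kink, hard to optimize."""
    return max(0.0, margin - (s_pos - s_neg))

def smooth_pair(s_pos, s_neg):
    """Softplus upper bound of the (margin-free) hinge: smooth
    everywhere, hence friendlier to gradient-based training."""
    return math.log1p(math.exp(-(s_pos - s_neg)))

# A positive label scored above a negative one: hinge is already zero,
# while the smooth loss still provides a (small) gradient.
print(round(hinge_pair(2.0, 0.5), 3), round(smooth_pair(2.0, 0.5), 3))
```

The label decision module would then threshold the per-label scores instead of applying a fixed top-k cutoff.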
This paper deals with the problem of enforcing modular diagnosability for discrete-event systems that do not satisfy this property by their natural modularity.
We introduce an approach to achieve this property combining existing modules into new virtual modules.
An underlying mathematical problem is to find a partition of a set such that the partition satisfies the required property.
The time complexity of this problem is very high.
To overcome it, the paper introduces a structural analysis of the system's modules.
In the analysis, we focus on the case where the modules participate in diagnosis with their observations, rather than the case where indistinguishable observations are blocked due to concurrency.
We report findings related to a two dimensional viscous fingering problem solved with a timespace method and anisotropic elements.
Timespace methods have attracted interest for solution of time dependent partial differential equations due to the implications of parallelism in the temporal dimension, but there are also attractive features in the context of anisotropic mesh adaptation; not only are heuristics and interpolation errors avoided, but slanted elements in timespace also correspond to long and accurate timesteps, i.e. the anisotropy in timespace can be exploited.
We show that our timespace method is restricted by a minimum timestep size, which is due to the growth of numerical perturbations.
The lower bound on the timestep is, however, quite high, which indicates that the number of timesteps can be reduced by several orders of magnitude for practical applications.
Android, the #1 mobile app framework, enforces the single-GUI-thread model, in which a single UI thread manages GUI rendering and event dispatching.
Due to this model, it is vital to avoid blocking the UI thread for responsiveness.
One common practice is to offload long-running tasks into async threads.
To achieve this, Android provides various async programming constructs, and leaves developers themselves to obey the rules implied by the model.
However, as our study reveals, more than 25% of apps violate these rules and introduce hard-to-detect, fail-stop errors, which we term async programming errors (APEs).
To this end, this paper introduces APEChecker, a technique to automatically and efficiently manifest APEs.
The key idea is to characterize APEs as specific fault patterns, and synergistically combine static analysis and dynamic UI exploration to detect and verify such errors.
Among 40 real-world Android apps, APEChecker detects and reports 61 APEs, of which 51 are confirmed (an 83.6% hit rate).
Specifically, APEChecker detects 3X more APEs than state-of-the-art testing tools (Monkey, Sapienz and Stoat), and reduces testing time from half an hour to a few minutes.
On a specific type of APEs, APEChecker confirms 5X more errors than the data race detection tool, EventRacer, with very few false alarms.
The selection of the best classification algorithm for a given dataset is a very widespread problem.
It is also a complex one, in the sense that it requires making several important methodological choices.
Among them, in this work we focus on the measure used to assess the classification performance and rank the algorithms.
We present the most popular measures and discuss their properties.
Despite the numerous measures proposed over the years, many of them turn out to be equivalent in this specific case, to have interpretation problems, or to be unsuitable for our purpose.
Consequently, the classic overall success rate or marginal rates should be preferred for this specific task.
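For reference, both the overall success rate and the per-class marginal rates can be read directly off a confusion matrix; the counts below are hypothetical.

```python
# Confusion matrix for a two-class problem:
# rows = true class, columns = predicted class.
confusion = [[50, 10],
             [5, 35]]

total = sum(sum(row) for row in confusion)
# Overall success rate: fraction of all instances classified correctly.
accuracy = sum(confusion[i][i] for i in range(len(confusion))) / total
# Marginal rates: per-class recall (diagonal over the row sum).
recalls = [row[i] / sum(row) for i, row in enumerate(confusion)]
print(round(accuracy, 3), [round(r, 3) for r in recalls])
```

Both quantities are invariant to relabeling the classes, which is one of the properties that makes them suitable for ranking algorithms.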
Federated Learning is a machine learning setting where the goal is to train a high-quality centralized model while training data remains distributed over a large number of clients each with unreliable and relatively slow network connections.
We consider learning algorithms for this setting where on each round, each client independently computes an update to the current model based on its local data, and communicates this update to a central server, where the client-side updates are aggregated to compute a new global model.
The typical clients in this setting are mobile phones, and communication efficiency is of the utmost importance.
In this paper, we propose two ways to reduce the uplink communication costs: structured updates, where we directly learn an update from a restricted space parametrized using a smaller number of variables, e.g. either low-rank or a random mask; and sketched updates, where we learn a full model update and then compress it using a combination of quantization, random rotations, and subsampling before sending it to the server.
Experiments on both convolutional and recurrent networks show that the proposed methods can reduce the communication cost by two orders of magnitude.
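To make the sketched-update idea above concrete, here is a toy sketch (not the paper's implementation; function names are ours) that compresses a client update by random subsampling plus uniform quantization, omitting the random-rotation step:

```python
import numpy as np

def sketch_update(update, keep_frac=0.1, num_levels=16, seed=0):
    """Compress a client update: keep a random subset of coordinates,
    then uniformly quantize the survivors to a few levels."""
    rng = np.random.default_rng(seed)
    flat = update.ravel()
    k = max(1, int(keep_frac * flat.size))
    idx = rng.choice(flat.size, size=k, replace=False)
    vals = flat[idx]
    lo, hi = vals.min(), vals.max()
    # Uniform quantization to num_levels levels in [lo, hi].
    scale = (hi - lo) / (num_levels - 1) if hi > lo else 1.0
    q = np.round((vals - lo) / scale).astype(np.uint8)
    return idx, q, lo, scale, update.shape

def unsketch_update(idx, q, lo, scale, shape):
    """Server-side reconstruction: unseen coordinates default to zero."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = lo + q.astype(float) * scale
    return flat.reshape(shape)
```

In a real federated round the server would average many such reconstructed updates, so zeros at unsampled coordinates wash out across clients.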
Graphs are an essential data structure that can represent the structure of social networks.
Many online companies, in order to provide intelligent and personalized services for their users, aim to comprehensively analyze a significant amount of graph data with different features.
One example is k-core decomposition which captures the degree of connectedness in social graphs.
The main purpose of this report is to explore a distributed algorithm for k-core decomposition on Apache Giraph.
Namely, we would like to determine whether a cluster-based, Giraph implementation of k-core decomposition that we provide is more efficient than a single-machine, disk-based implementation on GraphChi for large networks.
In this report, we describe (a) the programming model of Giraph and GraphChi, (b) the specific implementation of k-core decomposition with Giraph, and (c) a comparison of results between Giraph and GraphChi.
By analyzing the results, we conclude that Giraph is faster than GraphChi when dealing with large data.
However, since worker nodes need time to communicate with each other, Giraph is not very efficient for small data.
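As a single-machine reference for what the distributed implementation computes, a standard peeling algorithm for core numbers can be sketched as follows (illustrative code, not the Giraph implementation):

```python
from collections import defaultdict

def core_numbers(edges):
    """Core number of every node by repeated peeling: for k = 1, 2, ...,
    remove all nodes of degree < k; a node peeled at level k has core
    number k - 1. The Giraph version is the vertex-centric analogue,
    where each vertex iteratively lowers its estimate from its neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    core, remaining, k = {}, set(adj), 0
    while remaining:
        k += 1
        queue = [u for u in remaining if deg[u] < k]
        while queue:
            u = queue.pop()
            if u not in remaining:
                continue
            remaining.discard(u)
            core[u] = k - 1
            for v in adj[u]:
                if v in remaining:
                    deg[v] -= 1
                    if deg[v] < k:
                        queue.append(v)
    return core
```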
In an uncoordinated network, the link performance between the devices might degrade significantly due to the interference from other links in the network sharing the same spectrum.
As a solution, in this study, the concept of partially overlapping tones (POT) is introduced.
The interference energy observed at the victim receiver is mitigated by partially overlapping the individual subcarriers via an intentional carrier frequency offset between the links.
Also, it is shown that while orthogonal transformations at the receiver cannot mitigate the other-user interference without losing spectral efficiency, non-orthogonal transformations are able to mitigate the other-user interference without any spectral efficiency loss at the expense of self-interference.
Using a spatial Poisson point process, a tractable bit error rate analysis is provided to demonstrate potential benefits emerging from POT.
We study the problem of conditional generative modeling based on designated semantics or structures.
Existing models that build conditional generators either require massive labeled instances as supervision or are unable to accurately control the semantics of generated samples.
We propose structured generative adversarial networks (SGANs) for semi-supervised conditional generative modeling.
SGAN assumes the data x is generated conditioned on two independent latent variables: y that encodes the designated semantics, and z that contains other factors of variation.
To ensure disentangled semantics in y and z, SGAN builds two collaborative games in the hidden space to minimize the reconstruction error of y and z, respectively.
Training SGAN also involves solving two adversarial games that have their equilibrium concentrating at the true joint data distributions p(x, z) and p(x, y), avoiding the diffuse spreading of probability mass over the data space that MLE-based methods may suffer from.
We assess SGAN by evaluating its trained networks, and its performance on downstream tasks.
We show that SGAN delivers a highly controllable generator, and disentangled representations; it also establishes state-of-the-art results across multiple datasets when applied for semi-supervised image classification (1.27%, 5.73%, 17.26% error rates on MNIST, SVHN and CIFAR-10 using 50, 1000 and 4000 labels, respectively).
Benefiting from the separate modeling of y and z, SGAN can generate images of high visual quality that strictly follow the designated semantics, and can be extended to a wide spectrum of applications, such as style transfer.
The growing popularity of location-based systems, allowing unknown/untrusted servers to easily collect huge amounts of information regarding users' location, has recently started raising serious privacy concerns.
In this paper we study geo-indistinguishability, a formal notion of privacy for location-based systems that protects the user's exact location, while allowing approximate information - typically needed to obtain a certain desired service - to be released.
Our privacy definition formalizes the intuitive notion of protecting the user's location within a radius r with a level of privacy that depends on r, and corresponds to a generalized version of the well-known concept of differential privacy.
Furthermore, we present a perturbation technique for achieving geo-indistinguishability by adding controlled random noise to the user's location.
We demonstrate the applicability of our technique on a LBS application.
Finally, we compare our mechanism with other ones in the literature.
It turns out that our mechanism offers the best privacy guarantees, for the same utility, among all those which do not depend on the prior.
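A common mechanism for achieving eps-geo-indistinguishability is planar Laplace noise. A minimal sketch (our illustration; the radius CDF C(r) = 1 - (1 + eps*r)exp(-eps*r) is inverted numerically by bisection rather than via the Lambert W function):

```python
import math, random

def planar_laplace(x, y, eps, rng=random):
    """Perturb a location (x, y) with planar Laplace noise: uniform
    angle, radius drawn by inverting its CDF with bisection."""
    p = rng.random()
    cdf = lambda r: 1 - (1 + eps * r) * math.exp(-eps * r)
    lo, hi = 0.0, 1.0 / eps
    while cdf(hi) < p:          # grow the bracket until it contains p
        hi *= 2
    for _ in range(60):          # bisect to high precision
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    r = (lo + hi) / 2
    theta = rng.random() * 2 * math.pi
    return x + r * math.cos(theta), y + r * math.sin(theta)
```

The radius distribution has mean 2/eps, so smaller eps (stronger privacy) spreads reported locations over a wider area.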
Adversarial examples are known to have a negative effect on the performance of classifiers which have otherwise good performance on undisturbed images.
These examples are generated by adding non-random noise to the testing samples in order to make the classifier misclassify the given data.
Adversarial attacks use these intentionally generated examples and they pose a security risk to the machine learning based systems.
To be immune to such attacks, it is desirable to have a pre-processing mechanism which removes these effects causing misclassification while keeping the content of the image.
JPEG and JPEG2000 are well-known image compression techniques which suppress the high-frequency content taking the human visual system into account.
JPEG has also been shown to be an effective method for reducing adversarial noise.
In this paper, we propose applying JPEG2000 compression as an alternative and systematically compare the classification performance of adversarial images compressed using JPEG and JPEG2000 at different target PSNR values and maximum compression levels.
Our experiments show that JPEG2000 is more effective in reducing adversarial noise as it allows higher compression rates with less distortion and it does not introduce blocking artifacts.
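The JPEG variant of this pre-processing defense can be sketched in a few lines (illustrative only; assumes Pillow is available, and a real evaluation would sweep `quality` against target PSNR as described above):

```python
import io
import numpy as np
from PIL import Image  # assumes Pillow is installed

def jpeg_defend(img_array, quality=75):
    """Round-trip an image through lossy JPEG compression as an
    adversarial-noise filter: the compression suppresses high-frequency
    (adversarial) content while keeping the visible image content."""
    img = Image.fromarray(img_array.astype(np.uint8))
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf))
```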
Clifford algebras have broad applications in science and engineering.
The use of Clifford algebras can be further promoted in these fields by availability of computational tools that automate tedious routine calculations.
We offer an extensive demonstration of the applications of Clifford algebras in electromagnetism using the geometric algebra G3 = Cl(3,0) as a computational model in the Maxima computer algebra system.
We compare the geometric algebra-based approach with conventional symbolic tensor calculations supported by Maxima, based on the itensor package.
The Clifford algebra functionality of Maxima is distributed as two new packages called clifford - for basic simplification of Clifford products, outer products, scalar products and inverses; and cliffordan - for applications of geometric calculus.
Photography usually requires optics in conjunction with a recording device (an image sensor).
Eliminating the optics could lead to new form factors for cameras.
Here, we report a simple demonstration of imaging using a bare CMOS sensor that utilizes computation.
The technique relies on the space variant point-spread functions resulting from the interaction of a point source in the field of view with the image sensor.
These space-variant point-spread functions are combined with a reconstruction algorithm in order to image simple objects displayed on a discrete LED array as well as on an LCD screen.
We extended the approach to video imaging at the native frame rate of the sensor.
Finally, we performed experiments to analyze the parametric impact of the object distance.
Improving the sensor designs and reconstruction algorithms can lead to useful cameras without optics.
In many computer vision applications, obtaining images of high resolution in both the spatial and spectral domains is equally important.
However, due to hardware limitations, one can only expect to acquire images of high resolution in either the spatial or spectral domains.
This paper focuses on hyperspectral image super-resolution (HSI-SR), where a hyperspectral image (HSI) with low spatial resolution (LR) but high spectral resolution is fused with a multispectral image (MSI) with high spatial resolution (HR) but low spectral resolution to obtain HR HSI.
Existing deep learning-based solutions are all supervised, requiring a large training set and the availability of HR HSI, which is unrealistic.
Here, we make the first attempt at solving the HSI-SR problem using an unsupervised encoder-decoder architecture with the following unique features.
First, it is composed of two encoder-decoder networks, coupled through a shared decoder, in order to preserve the rich spectral information from the HSI network.
Second, the network encourages the representations from both modalities to follow a sparse Dirichlet distribution which naturally incorporates the two physical constraints of HSI and MSI.
Third, the angular difference between representations is minimized in order to reduce the spectral distortion.
We refer to the proposed architecture as unsupervised Sparse Dirichlet-Net, or uSDN.
Extensive experimental results demonstrate the superior performance of uSDN as compared to the state-of-the-art.
When simulating trajectories by integrating time-continuous car-following models, standard integration schemes such as the fourth-order Runge-Kutta method (RK4) are rarely used, while the simple Euler's method is popular among researchers.
We compare four explicit methods: Euler's method, ballistic update, Heun's method (trapezoidal rule), and the standard fourth-order RK4.
As performance metrics, we plot the global discretization error as a function of the numerical complexity.
We tested the methods on several time-continuous car-following models in several multi-vehicle simulation scenarios with and without discontinuities such as stops or a discontinuous behavior of an external leader.
We find that the theoretical advantage of RK4 (consistency order 4) only plays a role if both the acceleration function of the model and the external data of the simulation scenario are sufficiently often differentiable.
Otherwise, we obtain lower (and often fractional) consistency orders.
Although, to our knowledge, Heun's method has never been used for integrating car-following models, it turns out to be the best scheme for many practical situations.
The ballistic update consistently outperforms Euler's method, although both are of first order.
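The gap in consistency order between Euler's method and Heun's method is easy to reproduce on a smooth toy relaxation model (our illustration, not the paper's simulation code; a relaxation of speed toward a desired speed stands in for a car-following acceleration function):

```python
import math

def simulate(f, x0, t_end, dt, method):
    """Integrate dx/dt = f(x) from x0 to t_end with fixed step dt."""
    x = x0
    for _ in range(round(t_end / dt)):
        k1 = f(x)
        if method == "euler":
            x = x + dt * k1
        else:  # Heun's method (trapezoidal rule)
            x_pred = x + dt * k1                 # Euler predictor
            x = x + dt * (k1 + f(x_pred)) / 2.0  # trapezoidal corrector
    return x

# Toy stand-in for a car-following model: speed v relaxes toward a
# desired speed V0 with time constant TAU (exact solution known).
V0, TAU = 30.0, 2.0
accel = lambda v: (V0 - v) / TAU
v_exact = lambda t: V0 * (1.0 - math.exp(-t / TAU))  # with v(0) = 0
```

On a smooth model like this, halving the step size roughly quarters Heun's global error (order 2) but only halves Euler's (order 1), matching the consistency-order argument above.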
The agent program, called Samu, is an experiment to build a disembodied DevRob (Developmental Robotics) chatter bot that can talk in a natural language like humans do.
One of the main design features is that Samu can be interacted with using only a character terminal.
This is important not only for practical aspects of Turing test or Loebner prize, but also for the study of basic principles of Developmental Robotics.
Our purpose is to create a rapid prototype of Q-learning with neural network approximators for Samu.
We sketch out the early stages of the development process of this prototype, where Samu's task is to predict the next sentence of tales or conversations.
The basic objective of this paper is to reach, using reinforcement learning with general function approximators, the same results that can be achieved with the classical Q lookup table on small input samples.
The paper is closed by an experiment that shows a significant improvement in Samu's learning when using LZW tree to narrow the number of possible Q-actions.
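The "classical Q lookup table" baseline can be illustrated on a toy chain MDP (our sketch; Samu itself predicts sentences, not chain states, and the optimistic initialization here is one standard way to drive exploration):

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, rng=random):
    """Tabular Q-learning on a chain: states 0..n_states-1, action 0
    moves left, action 1 moves right; reaching the last state gives
    reward 1 and ends the episode. Q is initialized optimistically."""
    goal = n_states - 1
    Q = [[1.0, 1.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap per episode
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda act: Q[s][act])
            s2 = max(0, s - 1) if a == 0 else s + 1
            if s2 == goal:
                Q[s][a] += alpha * (1.0 - Q[s][a])
                break
            Q[s][a] += alpha * (gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```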
Nowadays, the need for system interoperability within and across enterprises has become increasingly ubiquitous.
A great deal of research has been carried out on information exchange, transformation, discovery and reuse.
One of the main challenges in this research is to overcome the semantic heterogeneity between enterprise applications along the lifecycle of a product.
As a possible solution to assist semantic interoperability, semantic annotation has gained increasing attention and is widely used in different domains.
In this paper, based on the investigation of the context and the related works, we identify some existing drawbacks and propose a formal semantic annotation approach to support the semantics enrichment of models in a PLM environment.
With an increasing number of web services, providing an end-to-end Quality of Service (QoS) guarantee in responding to user queries is becoming an important concern.
Multiple QoS parameters (e.g., response time, latency, throughput, reliability, availability, success rate) are associated with a service; therefore, service composition with a large number of candidate services is a challenging multi-objective optimization problem.
In this paper, we study the multi-constrained multi-objective QoS aware web service composition problem and propose three different approaches to solve it: one optimal, based on Pareto front construction, and two others based on heuristically traversing the solution space.
We compare the performance of the heuristics against the optimal, and show the effectiveness of our proposals over other classical approaches for the same problem setting, with experiments on WSC-2009 and ICEBE-2005 datasets.
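The Pareto front construction underlying the optimal approach can be illustrated on a small set of QoS vectors (our sketch; objectives are oriented so that smaller is better, e.g. response time and 1 - availability):

```python
def pareto_front(candidates):
    """Return the Pareto-optimal subset of QoS vectors. A candidate is
    dominated if another is no worse in every objective and strictly
    better in at least one."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]
```

This O(n^2) scan is only meant to define the concept; a scalable composition solver would prune dominated partial compositions incrementally.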
We present a novel proof-of-concept attack named Trojan of Things (ToT), which aims to attack NFC-enabled mobile devices such as smartphones.
The key idea of ToT attacks is to covertly embed maliciously programmed NFC tags into common objects routinely encountered in daily life such as banknotes, clothing, or furniture, which are not considered as NFC touchpoints.
To fully explore the threat of ToT, we develop two striking techniques named ToT device and Phantom touch generator.
These techniques enable an attacker to carry out various severe and sophisticated attacks unbeknownst to the device owner who unintentionally puts the device close to a ToT.
We discuss the feasibility of the attack as well as the possible countermeasures against the threats of ToT attacks.
In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of the semantic-components and the color-texture distributions to address the problem of person re-identification.
In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid person matching network (PPMN) to obtain correspondence representations.
These correspondence representations are fused to perform the re-identification task.
Further, the proposed framework is optimized via a unified end-to-end deep learning scheme.
Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach against the state-of-the-art literature, especially on the rank-1 recognition rate.
Steganography is the technique of hiding confidential information within any media.
Steganography is often confused with cryptography because the two are similar in the way that they both are used to protect confidential information.
The difference between the two lies in the appearance of the processed output: the output of a steganography operation is not apparently visible, whereas in cryptography the output is scrambled and can therefore draw attention.
Steganalysis is the process of detecting the presence of steganography.
In this article we have tried to elucidate the different approaches towards implementation of steganography using 'multimedia' files (text, static image, audio and video) and Network IP datagrams as cover.
Also some methods of steganalysis will be discussed.
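A minimal illustration of the static-image case is LSB embedding, where each cover byte (e.g. a pixel channel value) carries one message bit in its least significant bit, keeping the change visually imperceptible (our sketch, not a specific method from the article):

```python
def embed_lsb(cover, message):
    """Hide a byte string in the LSBs of cover bytes. The cover must be
    at least 8x the message length (one cover byte per message bit)."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    if len(bits) > len(cover):
        raise ValueError("cover too small")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the LSB
    return bytes(stego)

def extract_lsb(stego, n_bytes):
    """Recover an n_bytes message from the LSBs of a stego byte string."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for i in range(8):
            byte |= (stego[b * 8 + i] & 1) << i
        out.append(byte)
    return bytes(out)
```

A simple steganalysis counterpart would test the LSB plane for the statistical bias such embedding introduces.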
This paper describes an alignment-based model for interpreting natural language instructions in context.
We approach instruction following as a search over plans, scoring sequences of actions conditioned on structured observations of text and the environment.
By explicitly modeling both the low-level compositional structure of individual actions and the high-level structure of full plans, we are able to learn both grounded representations of sentence meaning and pragmatic constraints on interpretation.
To demonstrate the model's flexibility, we apply it to a diverse set of benchmark tasks.
On every task, we outperform strong task-specific baselines, and achieve several new state-of-the-art results.
In stream-based programming, data sources are abstracted as a stream of values that can be manipulated via callback functions.
Stream-based programming is exploding in popularity, as it provides a powerful and expressive paradigm for handling asynchronous data sources in interactive software.
However, high-level stream abstractions can also make it difficult for developers to reason about control- and data-flow relationships in their programs.
This is particularly impactful when asynchronous stream-based code interacts with thread-limited features such as UI frameworks that restrict UI access to a single thread, since the threading behavior of streaming constructs is often non-intuitive and insufficiently documented.
In this paper, we present a type-based approach that can statically prove the thread-safety of UI accesses in stream-based software.
Our key insight is that the fluent APIs of stream-processing frameworks enable the tracking of threads via type-refinement, making it possible to reason automatically about what thread a piece of code runs on -- a difficult problem in general.
We implement the system as an annotation-based Java typechecker for Android programs built upon the popular ReactiveX framework and evaluate its efficacy by annotating and analyzing 8 open-source apps, where we find 33 instances of unsafe UI access while incurring an annotation burden of only one annotation per 186 source lines of code.
We also report on our experience applying the typechecker to two much larger apps from the Uber Technologies Inc. codebase, where it currently runs on every code change and blocks changes that introduce potential threading bugs.
In the online packet buffering problem (also known as the unweighted FIFO variant of buffer management), we focus on a single network packet switching device with several input ports and one output port.
This device forwards unit-size, unit-value packets from input ports to the output port.
Buffers attached to input ports may accumulate incoming packets for later transmission; if they cannot accommodate all incoming packets, their excess is lost.
A packet buffering algorithm has to choose from which buffers to transmit packets in order to minimize the number of lost packets and thus maximize the throughput.
We present a tight lower bound of e/(e-1) ≈ 1.582 on the competitive ratio of the throughput maximization, which holds even for fractional or randomized algorithms.
This improves the previously best known lower bound of 1.4659 and matches the performance of the algorithm Random Schedule.
Our result contradicts the claimed performance of the algorithm Random Permutation; we point out a flaw in its original analysis.
The increasing accuracy of automatic chord estimation systems, the availability of vast amounts of heterogeneous reference annotations, and insights from annotator subjectivity research make chord label personalization increasingly important.
Nevertheless, automatic chord estimation systems are historically exclusively trained and evaluated on a single reference annotation.
We introduce a first approach to automatic chord label personalization by modeling subjectivity through deep learning of a harmonic interval-based chord label representation.
After integrating these representations from multiple annotators, we can accurately personalize chord labels for individual annotators from a single model and the annotators' chord label vocabulary.
Furthermore, we show that chord personalization using multiple reference annotations outperforms using a single reference annotation.
Heterogeneous cellular networks (HCNs) usually exhibit spatial separation amongst base stations (BSs) of different types (termed tiers in this paper).
For instance, operators will usually not deploy a picocell in close proximity to a macrocell, thus inducing separation amongst the locations of pico and macrocells.
This separation has recently been captured by modeling the small cell locations by a Poisson Hole Process (PHP) with the hole centers being the locations of the macrocells.
Due to the presence of exclusion zones, the analysis of the resulting model is significantly more complex compared to the more popular Poisson Point Process (PPP) based models.
In this paper, we derive a tight bound on the distribution of the distance of a typical user to the closest point of a PHP.
Since the exact distribution of this distance is not known, it is often approximated in the literature.
For this model, we then provide tight characterization of the downlink coverage probability for a typical user in a two-tier closed-access HCN under two cases: (i) typical user is served by the closest macrocell, and (ii) typical user is served by its closest small cell.
The proposed approach can be extended to analyze other relevant cases of interest, e.g., coverage in a PHP-based open access HCN.
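A PHP realization can be sampled by thinning a baseline PPP with the exclusion zones of an independent hole PPP (our illustrative sketch on a square window, using Knuth's Poisson sampler; parameter names are ours):

```python
import math, random

def _poisson(mean, rng):
    """Knuth's Poisson sampler; fine for moderate means."""
    L, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_php(lam_baseline, lam_holes, radius, side, rng=random):
    """Poisson Hole Process on [0, side]^2: baseline points (candidate
    small cells) from a PPP of intensity lam_baseline, hole centers
    (macrocells) from an independent PPP of intensity lam_holes; every
    baseline point within `radius` of a hole center is deleted."""
    def ppp(lam):
        n = _poisson(lam * side * side, rng)
        return [(rng.random() * side, rng.random() * side) for _ in range(n)]

    holes = ppp(lam_holes)
    kept = [pt for pt in ppp(lam_baseline)
            if all(math.dist(pt, h) >= radius for h in holes)]
    return kept, holes
```

Averaging the nearest-point distance over many such realizations is one way to sanity-check analytical distance-distribution bounds like the one above.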
How can we analyze enormous networks including the Web and social networks which have hundreds of billions of nodes and edges?
Network analyses have been conducted by various graph mining methods including shortest path computation, PageRank, connected component computation, random walk with restart, etc.
These graph mining methods can be expressed as generalized matrix-vector multiplication, which consists of a few operations inspired by typical matrix-vector multiplication.
Recently, several graph processing systems based on matrix-vector multiplication or their own primitives have been proposed to deal with large graphs; however, they all have failed on Web-scale graphs due to insufficient memory space or the lack of consideration for I/O costs.
In this paper, we propose PMV (Pre-partitioned generalized Matrix-Vector multiplication), a scalable distributed graph mining method based on generalized matrix-vector multiplication on distributed systems.
PMV significantly decreases the communication cost, which is the main bottleneck of distributed systems, by partitioning the input graph in advance and judiciously applying execution strategies based on the density of the pre-partitioned sub-matrices.
Experiments show that PMV succeeds in processing up to 16x larger graphs than existing distributed memory-based graph mining methods, and requires 9x less time than previous disk-based graph mining methods by reducing I/O costs significantly.
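The generalized matrix-vector multiplication pattern can be sketched as follows (our single-machine illustration, not PMV's distributed implementation); swapping (+, *) for (min, +) turns one multiplication into a Bellman-Ford shortest-path step:

```python
def generalized_matvec(edges, vec, combine, aggregate, init):
    """One step of generalized matrix-vector multiplication on a graph.

    `edges` is a list of (src, dst, weight) triples (the sparse matrix),
    `vec` maps nodes to values, and `init(node)` gives each node's
    starting accumulator. Incoming messages combine(weight, vec[src])
    are folded into the destination with `aggregate`. Plain (+, *)
    recovers ordinary matvec; (min, +) gives a Bellman-Ford step."""
    out = {u: init(u) for u in vec}
    for src, dst, w in edges:
        out[dst] = aggregate(out[dst], combine(w, vec[src]))
    return out
```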
Given a collection of strings, each with an associated probability of occurrence, the guesswork of each string is its position in a list ordered from most likely to least likely, breaking ties arbitrarily.
Guesswork is central to several applications in information theory: Average guesswork provides a lower bound on the expected computational cost of a sequential decoder to decode successfully the transmitted message; the complementary cumulative distribution function of guesswork gives the error probability in list decoding; the logarithm of guesswork is the number of bits needed in optimal lossless one-to-one source coding; and guesswork is the number of trials required of an adversary to breach a password protected system in a brute-force attack.
In this paper, we consider memoryless string-sources that generate strings consisting of i.i.d. characters drawn from a finite alphabet, and characterize their corresponding guesswork.
Our main tool is the tilt operation.
We show that the tilt operation on a memoryless string-source parametrizes an exponential family of memoryless string-sources, which we refer to as the tilted family.
We provide an operational meaning to the tilted families by proving that two memoryless string-sources result in the same guesswork on all strings of all lengths if and only if their respective categorical distributions belong to the same tilted family.
Establishing some general properties of the tilt operation, we generalize the notions of weakly typical set and asymptotic equipartition property to tilted weakly typical sets of different orders.
We use this new definition to characterize the large deviations for all atypical strings and characterize the volume of weakly typical sets of different orders.
We subsequently build on this characterization to prove large deviation bounds on guesswork and provide an accurate approximation of its PMF.
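For small alphabets and short strings, guesswork can be computed by brute force, which makes the definition above concrete (our illustrative sketch; the paper's contribution is the asymptotic characterization, not this enumeration):

```python
import itertools, math

def guesswork(probs, length):
    """Guesswork of every string from a memoryless source. `probs` maps
    characters to probabilities; all strings of the given length are
    ranked from most to least likely (ties broken by enumeration order),
    and each string is mapped to its 1-based position in that ranking."""
    strings = [''.join(s)
               for s in itertools.product(sorted(probs), repeat=length)]
    likelihood = lambda s: math.prod(probs[c] for c in s)
    ranked = sorted(strings, key=likelihood, reverse=True)
    return {s: i + 1 for i, s in enumerate(ranked)}
```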
For security and privacy management and enforcement purposes, various policy languages have been presented.
We give an overview of 27 security and privacy policy languages and present a categorization framework for policy languages.
We show how the current policy languages are represented in the framework and summarize our interpretation.
We point out identified gaps and motivate the adoption of policy languages for the specification of privacy-utility trade-off policies.
Previous machine comprehension (MC) datasets are either too small to train end-to-end deep learning models, or not difficult enough to evaluate the ability of current MC techniques.
The newly released SQuAD dataset alleviates these limitations, and gives us a chance to develop more realistic MC models.
Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM) model, which is an end-to-end system that directly predicts the answer beginning and ending points in a passage.
Our model first adjusts each word-embedding vector in the passage by multiplying a relevancy weight computed against the question.
Then, we encode the question and weighted passage by using bi-directional LSTMs.
For each point in the passage, our model matches the context of this point against the encoded question from multiple perspectives and produces a matching vector.
Given those matched vectors, we employ another bi-directional LSTM to aggregate all the information and predict the beginning and ending points.
Experimental results on the test set of SQuAD show that our model achieves a competitive result on the leaderboard.
Energy-efficiency, high data rates and secure communications are essential requirements of the future wireless networks.
In this paper, optimizing the secrecy energy efficiency is considered.
The optimal beamformer is designed for a MISO system with and without considering the minimum required secrecy rate.
Further, the optimal power control in a SISO system is carried out using an efficient iterative method, followed by an analysis of the trade-off between the secrecy energy efficiency and the secrecy rate for both MISO and SISO systems.
In this paper, we propose a framework for generating 3D point cloud of an object from a single-view RGB image.
Most previous works predict the 3D point coordinates from single RGB images directly.
We decompose this problem into depth estimation from single images and point completion from partial point clouds.
Our method sequentially predicts the depth maps and then infers the complete 3D object point clouds based on the predicted partial point clouds.
We explicitly impose the camera model geometrical constraint in our pipeline and enforce the alignment of the generated point clouds and estimated depth maps.
Experimental results for the single image 3D object reconstruction task show that the proposed method outperforms state-of-the-art methods.
Both the qualitative and quantitative results demonstrate the generality and suitability of our method.
Electroluminescence (EL) imaging is a useful modality for the inspection of photovoltaic (PV) modules.
EL images provide high spatial resolution, which makes it possible to detect even finest defects on the surface of PV modules.
However, the analysis of EL images is typically a manual process that is expensive, time-consuming, and requires expert knowledge of many different types of defects.
In this work, we investigate two approaches for automatic detection of such defects in a single image of a PV cell.
The approaches differ in their hardware requirements, which are dictated by their respective application scenarios.
The more hardware-efficient approach is based on hand-crafted features that are classified in a Support Vector Machine (SVM).
To obtain a strong performance, we investigate and compare various processing variants.
The more hardware-demanding approach uses an end-to-end deep Convolutional Neural Network (CNN) that runs on a Graphics Processing Unit (GPU).
Both approaches are trained on 1,968 cells extracted from high resolution EL intensity images of mono- and polycrystalline PV modules.
The CNN is more accurate, and reaches an average accuracy of 88.42%.
The SVM achieves a slightly lower average accuracy of 82.44%, but can run on arbitrary hardware.
Both automated approaches make continuous, highly accurate monitoring of PV cells feasible.
The school timetabling problem can be described as scheduling a set of lessons (combination of classes, teachers, subjects and rooms) in a weekly timetable.
This paper presents a novel way to generate timetables for high schools.
The algorithm has three phases: pre-scheduling, an initial phase, and optimization through tabu search.
In the first phase, a graph-based algorithm is used to create groups of lessons to be scheduled simultaneously; then an initial solution is built by a sequential greedy heuristic.
Finally, the solution is optimized using a tabu search algorithm based on frequency-based diversification.
The algorithm has been tested on a set of real problems gathered from Iranian high schools.
Experiments show that the proposed algorithm can effectively build acceptable timetables.
We show how to extend traditional intrinsic image decompositions to incorporate further layers above albedo and shading.
It is hard to obtain data to learn a multi-layer decomposition.
Instead, we can learn to decompose an image into layers that are "like this" by authoring generative models for each layer using proxy examples that capture the Platonic ideal (Mondrian images for albedo; rendered 3D primitives for shading; material swatches for shading detail).
Our method then generates image layers, one from each model, that explain the image.
Our approach rests on innovation in generative models for images.
We introduce a Convolutional Variational Auto Encoder (conv-VAE), a novel VAE architecture that can reconstruct high fidelity images.
The approach is general, and does not require that layers admit a physical interpretation.
This study covers an analytical approach to calculate positively invariant sets of dynamical systems.
Using Lyapunov techniques and quantifier elimination methods, an automatic procedure for determining bounds in the state space as an enclosure of attractors is proposed.
The available software tools permit an algorithmizable process, which normally requires a good insight into the system's dynamics and experience.
As a result we get an estimation of the attractor, whose conservatism only results from the initial choice of the Lyapunov candidate function.
The proposed approach is illustrated on the well-known Lorenz system.
This short paper presents the video browsing tool of VIREO team which has been used in the Video Browser Showdown 2018.
All added functions in the final version are introduced and experiences gained from the benchmark are also shared.
Online class imbalance learning constitutes a new problem and an emerging research topic that focuses on the challenges of online learning under class imbalance and concept drift.
Class imbalance deals with data streams that have very skewed distributions while concept drift deals with changes in the class imbalance status.
Little work exists that addresses these challenges and in this paper we introduce queue-based resampling, a novel algorithm that successfully addresses the co-existence of class imbalance and concept drift.
The central idea of the proposed resampling algorithm is to selectively include in the training set a subset of the examples that appeared in the past.
Results on two popular benchmark datasets demonstrate the effectiveness of queue-based resampling over state-of-the-art methods in terms of learning speed and quality.
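As a minimal sketch of the central idea (selectively replaying past examples alongside the current one), the class below keeps a bounded queue of past examples per class; the class name, queue size, and selection policy are illustrative, not the paper's exact algorithm:

```python
from collections import deque

class QueueResampler:
    """Illustrative sketch: keep a bounded queue of past examples per class
    and form each training set from the current example plus queued ones,
    so minority classes stay represented under class imbalance and drift."""

    def __init__(self, queue_size=100):
        self.queue_size = queue_size
        self.queues = {}  # class label -> deque of past examples

    def update(self, x, y):
        # Remember the example for future resampling.
        self.queues.setdefault(y, deque(maxlen=self.queue_size)).append(x)

    def training_set(self, x, y):
        # Current example plus the queued past examples of every class.
        batch = [(x, y)]
        for label, q in self.queues.items():
            batch.extend((xi, label) for xi in q)
        return batch
```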
Legal probabilism (LP) claims that degrees of conviction in juridical fact-finding are to be modeled exactly the way degrees of belief are modeled in standard Bayesian epistemology.
Classical legal probabilism (CLP) adds that the conviction is justified if the credence in guilt given the evidence is above an appropriate guilt probability threshold.
The views are challenged on various counts, especially by the proponents of the so-called narrative approach, on which the fact-finders' decision is the result of a dynamic interplay between competing narratives of what happened.
I develop a way a Bayesian epistemologist can make sense of the narrative approach.
I do so by formulating a probabilistic framework for evaluating competing narrations in terms of formal explications of the informal evaluation criteria used in the narrative approach.
As the Industrial Internet of Things (IIoT) grows, systems are increasingly being monitored by arrays of sensors returning time-series data at ever-increasing 'volume, velocity and variety' (i.e., Industrial Big Data).
An obvious use for these data is real-time condition monitoring of systems and prognostic time-to-failure analysis (remaining useful life, RUL); see, e.g., white papers by Senseye.io and the output of the NASA Prognostics Center of Excellence (PCoE).
However, as noted by Agrawal and Choudhary 'Our ability to collect "big data" has greatly surpassed our capability to analyze it, underscoring the emergence of the fourth paradigm of science, which is data-driven discovery.'
In order to fully utilize the potential of Industrial Big Data we need data-driven techniques that operate at scales that process models cannot.
Here we present a prototype technique for data-driven anomaly detection to operate at industrial scale.
The method generalizes to application with almost any multivariate dataset based on independent ordinations of repeated (bootstrapped) partitions of the dataset and inspection of the joint distribution of ordinal distances.
Using machine learning algorithms, including deep learning, we studied the prediction of personal attributes from the text of tweets, such as gender, occupation, and age groups.
We applied word2vec to construct word vectors, which were then used to vectorize tweet blocks.
The resulting tweet vectors were used as inputs for training models, and the prediction accuracy of those models was examined as a function of the dimension of the tweet vectors and the size of the tweet blocks.
The results showed that the machine learning algorithms could predict the three personal attributes of interest with 60-70% accuracy.
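As an illustration of the vectorization step, a common way to turn word2vec embeddings into a fixed-size vector for a tweet block is to average them; the function below is a generic sketch under that assumption, not the authors' exact pipeline:

```python
import numpy as np

def tweet_vector(tokens, word_vectors, dim=100):
    """Average the word vectors of a tweet block's tokens to obtain a
    fixed-size input for a downstream classifier. Out-of-vocabulary
    tokens are skipped; an all-OOV block maps to the zero vector."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)
```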
Background subtraction is the primary task of the majority of video inspection systems.
The most important part of background subtraction, common among different algorithms, is background modeling.
In this regard, our paper addresses the problem of background modeling in a computationally efficient way, which is important for the current surge of "big data" processing coming from high-resolution multi-channel videos.
Our model is based on the assumption that background in natural images lies on a low-dimensional subspace.
We formulated and solved this problem in a low-rank matrix completion framework.
In modeling the background, we benefited from the in-face extended Frank-Wolfe algorithm for solving the resulting convex optimization problem.
We evaluated our fast robust matrix completion (fRMC) method on both background models challenge (BMC) and Stuttgart artificial background subtraction (SABS) datasets.
The results were compared with the robust principal component analysis (RPCA) and low-rank robust matrix completion (RMC) methods, both solved by the inexact augmented Lagrangian multiplier (IALM) method.
The results showed at least twice faster computation than with the IALM solver, with comparable and in some challenges even better accuracy, when subtracting backgrounds to detect moving objects in the scene.
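To convey the low-rank background assumption, here is a toy sketch that models the background as a truncated SVD of the stacked frames; the paper instead solves a matrix-completion problem with the in-face extended Frank-Wolfe algorithm, so this is a conceptual stand-in only:

```python
import numpy as np

def background_foreground(frames, rank=1):
    """Illustrative low-rank background model: stack vectorized frames as
    columns, take a rank-`rank` truncated SVD as the background, and treat
    the residual as the (sparse) foreground."""
    D = np.stack([f.ravel() for f in frames], axis=1)  # pixels x frames
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    B = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]        # low-rank background
    F = D - B                                          # foreground residual
    return B, F
```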
In order for autonomous robots to be able to support people's well-being in homes and everyday environments, new interactive capabilities will be required, as exemplified by the soft design used for Disney's recent robot character Baymax in popular fiction.
Home robots will be required to be easy to interact with and intelligent--adaptive, fun, unobtrusive and involving little effort to power and maintain--and capable of carrying out useful tasks both on an everyday level and during emergencies.
The current article adopts an exploratory medium fidelity prototyping approach for testing some new robotic capabilities in regard to recognizing people's activities and intentions and behaving in a way which is transparent to people.
Results are discussed with the aim of informing next designs.
For future traffic scenarios, we envision interconnected traffic participants who exchange information about their current state (e.g., position) and their predicted intentions, allowing them to act in a cooperative manner.
Vulnerable road users (VRUs), e.g., pedestrians and cyclists, will be equipped with smart devices that can be used to detect their intentions and transmit these detected intentions to approaching cars so that their drivers can be warned.
In this article, we focus on detecting the initial movement of cyclists using smart devices.
Smart devices provide the necessary sensors, namely accelerometer and gyroscope, and are therefore an excellent instrument for quickly detecting movement transitions (e.g., from waiting to moving).
Convolutional Neural Networks prove to be the state-of-the-art solution for many problems with an ever-increasing range of applications.
Therefore, we model the initial movement detection as a classification problem.
In terms of Organic Computing (OC), it can be seen as a step towards self-awareness and self-adaptation.
We apply residual network architectures to the task of detecting the initial starting movement of cyclists.
We develop a multiexposure image fusion method based on texture features, which exploits the edge preserving and intraregion smoothing property of nonlinear diffusion filters based on partial differential equations (PDE).
With the captured multiexposure image series, we first decompose images into base layers and detail layers to extract sharp details and fine details, respectively.
The magnitude of the gradient of the image intensity is utilized to encourage smoothness at homogeneous regions in preference to inhomogeneous regions.
Then, texture features of the base layer are considered to generate a mask (i.e., a decision mask) that guides the fusion of the base layers in a multiresolution fashion.
Finally, a well-exposed fused image is obtained by combining the fused base layer with the detail layers at each scale across all input exposures.
The proposed algorithm skips the complex High Dynamic Range Image (HDRI) generation and tone-mapping steps to produce a detail-preserving image for display on standard dynamic range devices.
Moreover, our technique is effective for blending flash/no-flash image pair and multifocus images, that is, images focused on different targets.
Recently, Image-to-Image Translation (IIT) has achieved great progress in image style transfer and semantic context manipulation for images.
However, existing approaches require exhaustively labelled training data, which is labor-intensive, difficult to scale up, and hard to adapt to a new domain.
To overcome such a key limitation, we propose Sparsely Grouped Generative Adversarial Networks (SG-GAN) as a novel approach that can translate images in sparsely grouped datasets where only a few train samples are labelled.
Using a one-input multi-output architecture, SG-GAN is well-suited for tackling multi-task learning and sparsely grouped learning tasks.
The new model is able to translate images among multiple groups using only a single trained model.
To experimentally validate the advantages of the new model, we apply the proposed method to tackle a series of attribute manipulation tasks for facial images as a case study.
Experimental results show that SG-GAN can achieve comparable results with state-of-the-art methods on adequately labelled datasets while attaining a superior image translation quality on sparsely grouped datasets.
This study investigates the role of both cultural and technological factors in determining audience formation on a global scale.
It integrates theories of media choice with theories of global cultural consumption and tests them by analyzing shared audience traffic between the world's 1000 most popular Websites.
We find that language and geographic similarities are more powerful predictors of audience overlap than hyperlinks and genre similarity, highlighting the role of cultural structures in shaping global media use.
Accurate state estimation is a fundamental module for various intelligent applications, such as robot navigation, autonomous driving, virtual and augmented reality.
Visual and inertial fusion is a popular technology for 6-DOF state estimation in recent years.
Time instants at which different sensors' measurements are recorded are of crucial importance to the system's robustness and accuracy.
In practice, timestamps of each sensor typically suffer from triggering and transmission delays, leading to temporal misalignment (time offsets) among different sensors.
Such temporal offset dramatically influences the performance of sensor fusion.
To this end, we propose an online approach for calibrating temporal offset between visual and inertial measurements.
Our approach achieves temporal offset calibration by jointly optimizing time offset, camera and IMU states, as well as feature locations in a SLAM system.
Furthermore, the approach is a general model, which can be easily employed in several feature-based optimization frameworks.
Simulation and experimental results demonstrate the high accuracy of our calibration approach even compared with other state-of-art offline tools.
The VIO comparison against other methods proves that the online temporal calibration significantly benefits visual-inertial systems.
The source code of temporal calibration is integrated into our public project, VINS-Mono.
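As a rough illustration of what temporal calibration means, a simple cross-correlation can recover a constant offset between two equally sampled signals (e.g., rotation rate inferred by the camera versus the gyroscope). The paper's method instead optimizes the offset jointly inside the estimator, so this sketch is only a coarse conceptual stand-in:

```python
import numpy as np

def estimate_offset(sig_a, sig_b, dt):
    """Estimate the time offset between two equally sampled signals as
    the lag maximizing their cross-correlation. A positive result means
    sig_a lags sig_b. `dt` is the sampling period."""
    a = sig_a - np.mean(sig_a)
    b = sig_b - np.mean(sig_b)
    corr = np.correlate(a, b, mode="full")
    lag = np.argmax(corr) - (len(b) - 1)
    return lag * dt
```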
We provide a detailed overview of the various approaches that were proposed to date to solve the task of Open Information Extraction.
We present the major challenges that such systems face, show the evolution of the suggested approaches over time and depict the specific issues they address.
In addition, we provide a critique of the commonly applied evaluation procedures for assessing the performance of Open IE systems and highlight some directions for future work.
One-class support vector machine (OC-SVM) for a long time has been one of the most effective anomaly detection methods and extensively adopted in both research as well as industrial applications.
The biggest remaining issue for OC-SVM is its limited capability to operate with large and high-dimensional datasets due to optimization complexity.
Those problems might be mitigated via dimensionality reduction techniques such as manifold learning or autoencoder.
However, previous work often treats representation learning and anomaly prediction separately.
In this paper, we propose the autoencoder-based one-class support vector machine (AE-1SVM), which brings OC-SVM into the deep learning context: with the aid of random Fourier features to approximate the radial basis kernel, it combines OC-SVM with a representation learning architecture and jointly exploits stochastic gradient descent to obtain end-to-end training.
Interestingly, this also opens up the possible use of gradient-based attribution methods to explain the decision making for anomaly detection, which has long been challenging as a result of the implicit mappings between the input space and the kernel space.
To the best of our knowledge, this is the first work to study the interpretability of deep learning in anomaly detection.
We evaluate our method on a wide range of unsupervised anomaly detection tasks in which our end-to-end training architecture achieves a performance significantly better than the previous work using separate training.
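For concreteness, the random Fourier feature approximation of the RBF kernel mentioned above can be sketched as follows (the standard Rahimi-Recht construction; the dimensions and parameters here are illustrative, not those of the paper):

```python
import numpy as np

def rff_features(X, n_features=500, gamma=1.0, seed=0):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2), so that z(x).z(y) ~= k(x, y).
    Frequencies are drawn from the kernel's spectral density N(0, 2*gamma)."""
    rng = np.random.RandomState(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```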
We consider the complexity of the firefighter problem where b>=1 firefighters are available at each time step.
This problem is proved NP-complete even on trees of degree at most three and budget one (Finbow et al.,2007) and on trees of bounded degree b+3 for any fixed budget b>=2 (Bazgan et al.,2012).
In this paper, we provide further insight into the complexity landscape of the problem by showing that the pathwidth and the maximum degree of the input graph govern its complexity.
More precisely, we first prove that the problem is NP-complete even on trees of pathwidth at most three for any fixed budget b>=1.
We then show that the problem turns out to be fixed parameter-tractable with respect to the combined parameter "pathwidth" and "maximum degree" of the input graph.
Scientists, journalists, and photographers have used advanced camera technology to capture extremely high-resolution timelapse and developed information visualization tools for data exploration and analysis.
However, it takes a great deal of effort for professionals to form and tell stories after exploring data, since these tools usually provide little aid in creating visual elements.
We present a web-based timelapse editor to support the creation of guided video tours and interactive slideshows from a collection of large-scale spatial and temporal images.
Professionals can embed these two visual elements into web pages in conjunction with various forms of digital media to tell multimodal and interactive stories.
Methods from computational topology are becoming more and more popular in computer vision and have shown to improve the state-of-the-art in several tasks.
In this paper, we investigate the applicability of topological descriptors in the context of 3D surface analysis for the classification of different surface textures.
We present a comprehensive study on topological descriptors, investigate their robustness and expressiveness and compare them with state-of-the-art methods including Convolutional Neural Networks (CNNs).
Results show that class-specific information is reflected well in topological descriptors.
The investigated descriptors can directly compete with non-topological descriptors and capture complementary information.
As a consequence they improve the state-of-the-art when combined with non-topological descriptors.
The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time.
Very few previous studies have examined this crucial and challenging weather forecasting problem from the machine learning perspective.
In this paper, we formulate precipitation nowcasting as a spatiotemporal sequence forecasting problem in which both the input and the prediction target are spatiotemporal sequences.
By extending the fully connected LSTM (FC-LSTM) to have convolutional structures in both the input-to-state and state-to-state transitions, we propose the convolutional LSTM (ConvLSTM) and use it to build an end-to-end trainable model for the precipitation nowcasting problem.
Experiments show that our ConvLSTM network captures spatiotemporal correlations better and consistently outperforms FC-LSTM and the state-of-the-art operational ROVER algorithm for precipitation nowcasting.
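For readers unfamiliar with the model, the ConvLSTM cell replaces the matrix multiplications of FC-LSTM with convolutions; with $*$ denoting convolution and $\circ$ the Hadamard product, its gates follow the standard formulation:

```latex
\begin{aligned}
i_t &= \sigma\!\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right)\\
f_t &= \sigma\!\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right)\\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\!\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right)\\
o_t &= \sigma\!\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right)\\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
```

Here $X_t$, $C_t$, and $H_t$ are 3D tensors over a spatial grid, which is what lets the model capture spatiotemporal correlations.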
State-of-the-art Natural Language Processing algorithms rely heavily on efficient word segmentation.
Urdu is among the languages for which word segmentation is a complex task, as it exhibits space-omission as well as space-insertion issues.
This is partly due to the Arabic script, which, although cursive in nature, consists of characters that have inherent joining and non-joining attributes regardless of word boundaries.
This paper presents a word segmentation system for Urdu which uses a Conditional Random Field sequence modeler with orthographic, linguistic and morphological features.
Our proposed model automatically learns to predict white space as word boundary as well as Zero Width Non-Joiner (ZWNJ) as sub-word boundary.
Using a manually annotated corpus, our model achieves F1 score of 0.97 for word boundary identification and 0.85 for sub-word boundary identification tasks.
We have made our code and corpus publicly available to make our results reproducible.
With the ubiquity of large-scale graph data in a variety of application domains, querying them effectively is a challenge.
In particular, reachability queries are becoming increasingly important, especially for containment, subsumption, and connectivity checks.
Whereas many methods have been proposed for static graph reachability, many real-world graphs are constantly evolving, which calls for dynamic indexing.
In this paper, we present a fully dynamic reachability index over dynamic graphs.
Our method, called DAGGER, is a light-weight index based on interval labeling, that scales to million node graphs and beyond.
Our extensive experimental evaluation on real-world and synthetic graphs confirms its effectiveness over baseline methods.
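As a sketch of the interval-labeling idea such indexes build on, the static tree case can be implemented with a single DFS; the dynamic, DAG-capable index in the paper is considerably more involved, so this only conveys the containment test:

```python
def interval_labels(tree, root):
    """Assign each node a [start, end) interval from one DFS over a tree
    (given as a dict: node -> list of children). A node u reaches v iff
    u's interval contains v's."""
    labels, t = {}, [0]
    def dfs(u):
        start = t[0]; t[0] += 1
        for c in tree.get(u, []):
            dfs(c)
        labels[u] = (start, t[0]); t[0] += 1
    dfs(root)
    return labels

def reaches(labels, u, v):
    # Reachability reduces to an O(1) interval containment check.
    su, eu = labels[u]
    sv, ev = labels[v]
    return su <= sv and ev <= eu
```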
Synthesizing programs using example input/outputs is a classic problem in artificial intelligence.
We present a method for solving Programming By Example (PBE) problems by using a neural model to guide the search of a constraint logic programming system called miniKanren.
Crucially, the neural model uses miniKanren's internal representation as input; miniKanren represents a PBE problem as recursive constraints imposed by the provided examples.
We explore Recurrent Neural Network and Graph Neural Network models.
We contribute a modified miniKanren, drivable by an external agent, available at https://github.com/xuexue/neuralkanren.
We show that our neural-guided approach using constraints can synthesize programs faster in many cases, and importantly, can generalize to larger problems.
While foreground extraction is fundamental to virtual reality systems and has been studied for decades, the majority of professional software tools today still rely substantially on human interventions, e.g., providing trimaps or labeling key frames.
This is not only time consuming, but is also sensitive to human error.
In this paper, we present a fully automatic foreground extraction algorithm which does not require any trimap or scribble.
Our solution is based on a newly developed concept called the Multi-Agent Consensus Equilibrium (MACE), a framework which allows us to integrate multiple sources of expertise to produce an overall superior result.
The MACE framework consists of three agents: (1) A new dual layer closed-form matting agent to estimate the foreground mask using the color image and a background image; (2) A background probability estimator using color difference and object segmentation; (3) A total variation minimization agent to control the smoothness of the foreground masks.
We show how these agents are constructed, and how their interactions lead to better performance.
We evaluate the performance of the proposed algorithm by comparing to several state-of-the-art methods.
On the real datasets we tested, our results show less error compared to the other methods.
We present a transition-based dependency parser that uses a convolutional neural network to compose word representations from characters.
The character composition model shows great improvement over the word-lookup model, especially for parsing agglutinative languages.
These improvements are even better than using pre-trained word embeddings from extra data.
On the SPMRL data sets, our system outperforms the previous best greedy parser (Ballesteros et al., 2015) by a margin of 3% on average.
We propose Range and Roots, two common patterns useful for specifying a wide range of counting and occurrence constraints.
We design specialised propagation algorithms for these two patterns.
Counting and occurrence constraints specified using these patterns thus directly inherit a propagation algorithm.
To illustrate the capabilities of the Range and Roots constraints, we specify a number of global constraints taken from the literature.
Preliminary experiments demonstrate that propagating counting and occurrence constraints using these two patterns leads to a small loss in performance when compared to specialised global constraints and is competitive with alternative decompositions using elementary constraints.
Community structure is an important area of research that has received considerable attention from the scientific community.
Despite its importance, one of the key problems in locating information about community detection is the diverse spread of related articles across various disciplines.
To the best of our knowledge, there is no current comprehensive review of recent literature that applies a scientometric analysis based on complex network techniques, covering all relevant articles from the Web of Science (WoS).
Here we present a visual survey of key literature using CiteSpace.
The idea is to identify emerging trends besides using network techniques to examine the evolution of the domain.
Towards that end, we identify the most influential, central, as well as active nodes using scientometric analyses.
We examine authors, key articles, cited references, core subject categories, key journals, institutions, as well as countries.
The exploration of the scientometric literature of the domain reveals that Yong Wang is a pivot node with the highest centrality.
Additionally, we have observed that Mark Newman is the most highly cited author in the network.
We have also identified that the journal, "Reviews of Modern Physics" has the strongest citation burst.
In terms of cited documents, an article by Andrea Lancichinetti has the highest centrality score.
We have also discovered that the key publications in this domain originate from the United States, whereas Scotland has the strongest and longest citation burst.
Additionally, we have found that the categories of "Computer Science" and "Engineering" lead other categories based on frequency and centrality respectively.
A container is a group of processes isolated from other groups via distinct kernel namespaces and resource allocation quota.
Attacks against containers often leverage kernel exploits through system call interface.
In this paper, we present an approach that mines sandboxes for containers.
We first explore the behaviors of a container by leveraging automatic testing, and extract the set of system calls accessed during testing.
This set of system calls then serves as the sandbox of the container.
The mined sandbox restricts the container's access to system calls which are not seen during testing and thus reduces the attack surface.
In the experiment, our approach requires less than eleven minutes to mine a sandbox for each of the containers.
The enforcement of mined sandboxes does not impact the regular functionality of a container and incurs low performance overhead.
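To make the sandbox idea concrete, a mined set of system calls can be turned into a seccomp allowlist. The snippet below emits a Docker-style profile and is an illustrative sketch, not the paper's tooling; the observed set passed in would come from the automated testing phase:

```python
import json

def seccomp_profile(observed_syscalls):
    """Build a Docker-style seccomp allowlist from the system calls seen
    during automated testing: anything not observed is denied, shrinking
    the container's attack surface."""
    return json.dumps({
        "defaultAction": "SCMP_ACT_ERRNO",  # deny syscalls not seen in testing
        "syscalls": [{
            "names": sorted(observed_syscalls),
            "action": "SCMP_ACT_ALLOW",     # allow only the mined set
        }],
    }, indent=2)
```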
Layout hotspot detection is one of the main steps in modern VLSI design.
A typical hotspot detection flow is extremely time consuming due to the computationally expensive mask optimization and lithographic simulation.
Recent research tries to facilitate the procedure with a reduced flow including feature extraction, training set generation and hotspot detection, where feature extraction methods and hotspot detection engines are deeply studied.
However, the performance of hotspot detectors relies highly on the quality of reference layout libraries which are costly to obtain and usually predetermined or randomly sampled in previous works.
In this paper, we propose an active-learning-based layout pattern sampling and hotspot detection flow, which simultaneously optimizes the machine learning model and the training set, aiming to achieve similar or better hotspot detection performance with a much smaller number of training instances.
Experimental results show that our proposed method can significantly reduce lithography simulation overhead while attaining satisfactory detection accuracy on designs under both DUV and EUV lithography technologies.
The advancements in wireless mesh networks (WMN), and the surge in multi-radio multi-channel (MRMC) WMN deployments have spawned a multitude of network performance issues.
These issues are intricately linked to the adverse impact of endemic interference.
Thus, interference mitigation is a primary design objective in WMNs.
Interference alleviation is often effected through efficient channel allocation (CA) schemes which fully utilize the potential of MRMC environment and also restrain the detrimental impact of interference.
However, numerous CA schemes have been proposed in the research literature, and there is a lack of CA performance prediction techniques that could assist in choosing a suitable CA for a given WMN.
In this work, we propose a reliable interference estimation and CA performance prediction approach.
We demonstrate its efficacy by substantiating the CA performance predictions for a given WMN with experimental data obtained through rigorous simulations on an ns-3 802.11g environment.
Target encoding plays a central role when learning Convolutional Neural Networks.
In this realm, One-hot encoding is the most prevalent strategy due to its simplicity.
However, this widespread encoding scheme assumes a flat label space, thus ignoring rich relationships existing among labels that can be exploited during training.
In large-scale datasets, data does not span the full label space, but instead lies in a low-dimensional output manifold.
Following this observation, we embed the targets into a low-dimensional space, drastically improving convergence speed while preserving accuracy.
Our contribution is twofold: (i) we show that random projections of the label space are a valid tool to find such lower-dimensional embeddings, dramatically boosting convergence rates at zero computational cost; and (ii) we propose a normalized eigenrepresentation of the class manifold that encodes the targets with minimal information loss, improving the accuracy of random projections encoding while enjoying the same convergence rates.
Experiments on CIFAR-100, CUB200-2011, Imagenet, and MIT Places demonstrate that the proposed approach drastically improves convergence speed while reaching very competitive accuracy rates.
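As a sketch of contribution (i), a random label embedding replaces each one-hot target with a row of a random Gaussian matrix; the class count and dimension below are illustrative, and the normalization choice is an assumption:

```python
import numpy as np

def random_label_embedding(n_classes, dim, seed=0):
    """Random projection of the one-hot label space into `dim` dimensions:
    row c of the returned matrix is the training target for class c.
    By a Johnson-Lindenstrauss-style argument, random Gaussian directions
    keep distinct classes well separated while shrinking the output layer."""
    rng = np.random.RandomState(seed)
    return rng.normal(size=(n_classes, dim)) / np.sqrt(dim)

# A one-hot target for class y simply picks out row y: one_hot(y) @ R == R[y].
```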
Text segmentation (TS) aims at dividing long text into coherent segments which reflect the subtopic structure of the text.
It is beneficial to many natural language processing tasks, such as Information Retrieval (IR) and document summarisation.
Current approaches to text segmentation are similar in that they all use word-frequency metrics to measure the similarity between two regions of text, so that a document is segmented based on the lexical cohesion between its words.
Various NLP tasks are now moving towards the semantic web and ontologies, such as ontology-based IR systems, to capture the conceptualizations associated with user needs and contents.
Text segmentation based on lexical cohesion between words is hence not sufficient anymore for such tasks.
This paper proposes OntoSeg, a novel approach to text segmentation based on the ontological similarity between text blocks.
The proposed method uses ontological similarity to explore conceptual relations between text segments and a Hierarchical Agglomerative Clustering (HAC) algorithm to represent the text as a tree-like hierarchy that is conceptually structured.
The rich structure of the created tree further allows the segmentation of text in a linear fashion at various levels of granularity.
The proposed method was evaluated on a well-known dataset, and the results show that using ontological similarity in text segmentation is very promising.
We also enhance the proposed method by combining ontological similarity with lexical similarity, and the results show an improvement in segmentation quality.
Medical image analysis and computer-assisted intervention problems are increasingly being addressed with deep-learning-based solutions.
Established deep-learning platforms are flexible but do not provide specific functionality for medical image analysis and adapting them for this application requires substantial implementation effort.
Thus, there has been substantial duplication of effort and incompatible infrastructure developed across many research groups.
This work presents the open-source NiftyNet platform for deep learning in medical imaging.
The ambition of NiftyNet is to accelerate and simplify the development of these solutions, and to provide a common mechanism for disseminating research outputs for the community to use, adapt and build upon.
NiftyNet provides a modular deep-learning pipeline for a range of medical imaging applications including segmentation, regression, image generation and representation learning applications.
Components of the NiftyNet pipeline including data loading, data augmentation, network architectures, loss functions and evaluation metrics are tailored to, and take advantage of, the idiosyncrasies of medical image analysis and computer-assisted intervention.
NiftyNet is built on TensorFlow and supports TensorBoard visualization of 2D and 3D images and computational graphs by default.
We present 3 illustrative medical image analysis applications built using NiftyNet: (1) segmentation of multiple abdominal organs from computed tomography; (2) image regression to predict computed tomography attenuation maps from brain magnetic resonance images; and (3) generation of simulated ultrasound images for specified anatomical poses.
NiftyNet enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or extend the platform to new applications.
Event-based cameras offer much potential to the fields of robotics and computer vision, in part due to their large dynamic range and extremely high "frame rates".
These attributes make them, at least in theory, particularly suitable for enabling tasks like navigation and mapping on high speed robotic platforms under challenging lighting conditions, a task which has been particularly challenging for traditional algorithms and camera sensors.
Before these tasks become feasible however, progress must be made towards adapting and innovating current RGB-camera-based algorithms to work with event-based cameras.
In this paper we present ongoing research investigating two distinct approaches to incorporating event-based cameras for robotic navigation: the investigation of suitable place recognition / loop closure techniques, and the development of efficient neural implementations of place recognition techniques that enable the possibility of place recognition using event-based cameras at very high frame rates using neuromorphic computing hardware.
This work considers multiple-input multiple-output (MIMO) communication systems using hierarchical modulation.
A disadvantage of the maximum-likelihood (ML) MIMO detector is that computational complexity increases exponentially with the number of transmit antennas.
To reduce complexity, we propose a hierarchical modulation scheme to be used in MIMO transmission where base and enhancement layers are incorporated.
In the proposed receiver, the base layer is detected first with a minimum mean square error (MMSE) detector which is followed by ML detection of the enhancement layer.
Our results indicate that the proposed low complexity scheme does not compromise performance when design parameters such as code rates and constellation ratio are chosen carefully.
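The two-stage receiver described above can be sketched as follows. The channel matrix, constellation choice, and scaling factor `alpha` are illustrative stand-ins, not the paper's exact design: the base layer is equalized with an MMSE filter and sliced, then cancelled, and the enhancement layer is detected by exhaustive ML search.

```python
import numpy as np
from itertools import product

qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def detect(y, H, alpha, reg=1e-3):
    """MMSE detection of the base layer, then ML detection of the enhancement layer."""
    nt = H.shape[1]
    # Stage 1: MMSE equalization, slice each stream to the base constellation.
    W = np.linalg.inv(H.conj().T @ H + reg * np.eye(nt)) @ H.conj().T
    z = W @ y
    xb = qpsk[[np.argmin(np.abs(zi - qpsk)) for zi in z]]
    # Stage 2: cancel the base layer, exhaustive ML over enhancement candidates.
    r = y - H @ xb
    cands = [np.array(c) for c in product(qpsk, repeat=nt)]
    xe = min(cands, key=lambda c: np.linalg.norm(r - alpha * (H @ c)))
    return xb, xe

# Noise-free sanity check on a fixed, well-conditioned 2x2 channel.
H = np.array([[1.0, 0.2], [0.1, 0.9]], dtype=complex)
alpha = 0.4
xb_true, xe_true = qpsk[[0, 3]], qpsk[[2, 1]]
y = H @ (xb_true + alpha * xe_true)
xb_hat, xe_hat = detect(y, H, alpha)
```

Note that the ML search over the enhancement layer alone enumerates only 4^nt candidates here, instead of 16^nt for joint ML over both layers, which is the source of the complexity reduction.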
Recent work has shown that convolutional neural networks (CNNs) can be applied successfully in disparity estimation, but these methods still suffer from errors in regions of low-texture, occlusions and reflections.
Concurrently, deep learning for semantic segmentation has shown great progress in recent years.
In this paper, we design a CNN architecture that combines these two tasks to improve the quality and accuracy of disparity estimation with the help of semantic segmentation.
Specifically, we propose a network structure in which these two tasks are highly coupled.
One key novelty of this approach is the two-stage refinement process.
Initial disparity estimates are refined with an embedding learned from the semantic segmentation branch of the network.
The proposed model is trained using an unsupervised approach, in which images from one half of the stereo pair are warped and compared against images from the other camera.
Another key advantage of the proposed approach is that a single network is capable of outputting disparity estimates and semantic labels.
These outputs are of great use in autonomous vehicle operation; with real-time constraints being key, such performance improvements increase the viability of driving applications.
Experiments on KITTI and Cityscapes datasets show that our model can achieve state-of-the-art results and that leveraging embedding learned from semantic segmentation improves the performance of disparity estimation.
We propose a context-dependent model to map utterances within an interaction to executable formal queries.
To incorporate interaction history, the model maintains an interaction-level encoder that updates after each turn, and can copy sub-sequences of previously predicted queries during generation.
Our approach combines implicit and explicit modeling of references between utterances.
We evaluate our model on the ATIS flight planning interactions, and demonstrate the benefits of modeling context and explicit references.
The practical realization of beam steering mechanisms in millimeter wave communications has a large impact on performance.
The key challenge is to find a pragmatic trade-off between throughput performance and the overhead of periodic beam sweeping required to improve link quality in case of transient link blockage.
This is particularly critical in commercial off-the-shelf devices, which require simple yet efficient solutions.
First, we analyze the operation of such a commercial device to understand the impact of link blockage in practice.
To this end, we measure TCP throughput for different traffic loads while blocking the link at regular intervals.
Second, we derive a Markov model based on our practical insights to compute throughput for the case of transient blockage.
We use this model to evaluate the trade-off between throughput and periodic beam sweeping.
Finally, we validate our results using throughput traces collected using the aforementioned commercial device.
Both our model and our practical measurements show that transient blockage causes significant signal fluctuation due to suboptimal beam realignment.
In particular, fluctuations increase with traffic load, limiting the achievable throughput.
We show that choosing lower traffic loads allows us to reduce fluctuations by 41% while achieving the same net throughput as with higher traffic loads.
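A two-state (clear/blocked) Markov chain of the kind used in such analyses can be sketched as follows. The transition probabilities and per-state rates below are illustrative assumptions, not measured values from this work.

```python
# Two-state Markov model of transient link blockage.
# Transition probabilities per slot (illustrative values):
p_block = 0.1    # clear -> blocked
p_clear = 0.5    # blocked -> clear

# Stationary distribution of the two-state chain.
pi_blocked = p_block / (p_block + p_clear)
pi_clear = 1.0 - pi_blocked

# Long-run throughput: full rate when the link is clear, a residual rate
# during blockage and beam realignment (illustrative rates in Mbit/s).
rate_clear, rate_blocked = 1000.0, 50.0
throughput = pi_clear * rate_clear + pi_blocked * rate_blocked
```

With these numbers the link is blocked one sixth of the time, and the long-run throughput drops accordingly; sweeping `p_clear` (faster realignment) shows how beam-sweeping overhead trades against recovery speed.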
The human action classification task is a widely researched topic and is still an open problem.
Many state-of-the-art approaches use bag-of-video-words with spatio-temporal local features to construct characterizations for human actions.
To improve beyond this standard approach, we investigate the use of co-occurrences between local features and propose co-occurrence information to characterize human actions.
A trade-off factor is used to define an optimal trade-off between vocabulary size and classification rate.
Next, a spatio-temporal co-occurrence technique is applied to extract co-occurrence information between labeled local features.
Novel characterizations for human actions are then constructed.
These include a vector quantized correlogram-elements vector, a highly discriminative PCA (Principal Components Analysis) co-occurrence vector and a Haralick texture vector.
Multi-channel kernel SVM (support vector machine) is utilized for classification.
For evaluation, the well known KTH as well as the challenging UCF-Sports action datasets are used.
We obtained state-of-the-art classification performance.
We also demonstrated that we are able to fully utilize co-occurrence information, and improve the standard bag-of-video-words approach.
Lowpass envelope approximations of smooth continuous-variable signals are introduced in this work.
Envelope approximations are necessary when a given signal must always be approximated from above, to a larger value (such as in TV white space protection regions).
In this work, a near-optimal approximate algorithm for finding a signal's envelope, while minimizing a mean-squared cost function, is detailed.
The sparse (lowpass) signal approximation is obtained in the linear Fourier series basis.
This approximate algorithm works by discretizing the envelope property from an infinite number of points to a large (but finite) number of points.
It is shown that this approximate algorithm is near-optimal and can be solved by using efficient convex optimization programs available in the literature.
Simulation results are provided towards the end to gain more insights into the analytical results presented.
On-demand video accounts for the majority of wireless data traffic.
Video distribution schemes based on caching combined with device-to-device (D2D) communications promise order-of-magnitude greater spectral efficiency for video delivery, but hinge on the principle of "concentrated demand distributions."
This letter presents, for the first time, evaluations of the spectral efficiency of such schemes based on measured cellular demand distributions.
In particular, we use a database with more than 100 million requests (689,461 for cellular users) from the BBC iPlayer, a popular video streaming service in the U.K., to evaluate the throughput-outage tradeoff of a random caching D2D based scheme, and find that also for this realistic case, order-of-magnitude improvements can be achieved.
The gains depend on the size of the local cache in the devices; e.g., with a cache size of 32 GB, a throughput increase of two orders of magnitude at an outage probability between 0.01 and 0.1 can be obtained.
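The "concentrated demand distributions" that such caching schemes rely on are commonly modeled with a Zipf popularity law. The sketch below uses an assumed Zipf exponent and catalog size, not the BBC iPlayer data, to show how the cache-hit probability grows when each device caches the most popular files.

```python
import numpy as np

n_files = 10000
gamma = 0.8                                   # assumed Zipf exponent
pop = 1.0 / np.arange(1, n_files + 1) ** gamma
pop /= pop.sum()                              # request probabilities, most popular first

def hit_prob(cache_size):
    # Probability a request is served from a local cache holding
    # the `cache_size` most popular files.
    return pop[:cache_size].sum()
```

Because the popularity mass is concentrated on a small head of the catalog, caching even 1% of the files captures a disproportionate share of requests, which is what makes D2D-assisted delivery attractive.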
Prevailing computational tools available to and used by architecture and engineering professionals purport to gather and present thorough and accurate perspectives of the environmental impacts associated with their contributions to the built environment.
The presented research of building modeling and analysis software used by the Architecture, Engineering, Construction, and Operations (AECO) industry reveals that many of the most heavily relied-upon industry tools are isolated in functionality, utilize incomplete models and data, and are disruptive to normative design and building optimization workflows.
This paper describes the current models and tools, their primary functions and limitations, and presents our concurrent research to develop more advanced models to assess lifetime building energy consumption alongside operating energy use.
A series of case studies describes the current state-of-the-art in tools and building energy analysis followed by the research models and novel design and analysis Tool that the Green Scale Research Group has developed in response.
A fundamental goal of this effort is to increase the use and efficacy of building impact studies conducted by architects, engineers, and building owners and operators during the building design process.
We present an approach to automatically classify clinical text at a sentence level.
We are using deep convolutional neural networks to represent complex features.
We train the network on a dataset providing a broad categorization of health information.
Through a detailed evaluation, we demonstrate that our method outperforms several approaches widely used in natural language processing tasks by about 15%.
This paper proposes a secure surveillance framework for Internet of things (IoT) systems by intelligent integration of video summarization and image encryption.
First, an efficient video summarization method is used to extract the informative frames using the processing capabilities of visual sensors.
When an event is detected from keyframes, an alert is sent to the concerned authority autonomously.
As the final decision about an event mainly depends on the extracted keyframes, their modification during transmission by attackers can result in severe losses.
To tackle this issue, we propose a fast probabilistic and lightweight algorithm for the encryption of keyframes prior to transmission, designed around the memory and processing constraints of resource-limited devices, which increases its suitability for IoT systems.
Our experimental results verify the effectiveness of the proposed method in terms of robustness, execution time, and security compared to other image encryption algorithms.
Furthermore, our framework can reduce the bandwidth, storage, transmission cost, and the time required for analysts to browse large volumes of surveillance data and make decisions about abnormal events, such as suspicious activity detection and fire detection in surveillance applications.
Using the machine learning approach known as reservoir computing, it is possible to train one dynamical system to emulate another.
We show that such trained reservoir computers reproduce the properties of the attractor of the chaotic system sufficiently well to exhibit chaos synchronisation.
That is, the trained reservoir computer, weakly driven by the chaotic system, will synchronise with the chaotic system.
Conversely, the chaotic system, weakly driven by a trained reservoir computer, will synchronise with the reservoir computer.
We illustrate this behaviour on the Mackey-Glass and Lorenz systems.
We then show that trained reservoir computers can be used to crack chaos-based cryptography, and illustrate this on a chaos cryptosystem based on the Mackey-Glass system.
We conclude by discussing why reservoir computers are so good at emulating chaotic systems.
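The kind of training described here can be sketched with a minimal echo state network: a fixed random reservoir whose linear readout is fit by ridge regression to predict the Lorenz system one step ahead. The reservoir size, spectral radius, and regularization below are assumptions for illustration, not this work's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate the Lorenz system with a simple Euler scheme.
def lorenz(n_steps, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x = np.array([1.0, 1.0, 1.0])
    out = np.empty((n_steps, 3))
    for t in range(n_steps):
        dx = np.array([sigma * (x[1] - x[0]),
                       x[0] * (rho - x[2]) - x[1],
                       x[0] * x[1] - beta * x[2]])
        x = x + dt * dx
        out[t] = x
    return out

data = lorenz(3000)
data = (data - data.mean(axis=0)) / data.std(axis=0)   # normalize
u, y = data[:-1], data[1:]                             # one-step-ahead targets

# Fixed random reservoir; only the linear readout is trained.
N = 300
W_in = rng.uniform(-0.5, 0.5, (N, 3))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))        # spectral radius 0.9

states = np.zeros((len(u), N))
s = np.zeros(N)
for t in range(len(u)):
    s = np.tanh(W_in @ u[t] + W @ s)
    states[t] = s

# Ridge-regression readout, discarding an initial washout period.
wash = 200
X, Y = states[wash:], y[wash:]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(N), X.T @ Y)
mse = np.mean((X @ W_out - Y) ** 2)
```

Once trained, such a readout can be run in closed loop (feeding predictions back as inputs) to emulate the chaotic system autonomously, which is the regime in which the synchronisation behaviour described above appears.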
A single-letter lower bound on the sum rate of multiple description coding with tree-structured distortion constraints is established by generalizing Ozarow's celebrated converse argument through the introduction of auxiliary random variables that form a Markov tree.
For the quadratic vector Gaussian case, this lower bound is shown to be achievable by an extended version of the El Gamal-Cover scheme, yielding a complete sum-rate characterization.
Synchronized Random Access Channel (RACH) attempts by Internet of Things (IoT) devices could result in Radio Access Network (RAN) overload in LTE-A.
3GPP adopted Barring Bitmap Enabled-Extended Access Barring (EAB-BB) mechanism that announces the EAB information (i.e., a list of barred Access Classes) through a barring bitmap as the baseline solution to mitigate the RAN overload.
EAB-BB was analyzed for its optimal performance in a recent work.
However, there has been no work that analyzes Barring Factor Enabled-Extended Access Barring (EAB-BF), an alternative mechanism that was considered during the standardization process.
Due to the modeling complexity involved, not only has it been difficult to analyze EAB-BF, but far-reaching issues, such as the effect of these schemes on key network performance parameters like eNodeB energy consumption, have also been overlooked.
In this regard, for the first time, we develop a novel analytical model for EAB-BF to obtain its performance metrics.
Results obtained from our analysis and simulation are seen to match very well.
Furthermore, we also build an eNodeB energy consumption model to serve the IoT RACH requests.
We then show that our analytical and energy consumption models can be combined to obtain EAB-BF settings that can minimize eNodeB energy consumption, while simultaneously providing optimal Quality of Service (QoS) performance.
Results obtained reveal that the optimal performance of EAB-BF is better than that of EAB-BB.
Furthermore, we show that all three 3GPP-proposed EAB-BF settings considered during standardization not only provide sub-optimal QoS to devices but also result in excessive eNodeB energy consumption, thereby acutely penalizing the network.
Finally, we provide corrections to these 3GPP-settings that can lead to significant gains in EAB-BF performance.
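The role of a barring factor can be illustrated with a textbook slotted-access model (a generic illustration, not the paper's analytical model): each backlogged device passes the barring check with probability p, and a slot succeeds when exactly one device transmits. The throughput-maximizing factor is then roughly 1/n for n backlogged devices.

```python
import numpy as np

def success_prob(n, p):
    # Probability that exactly one of n backlogged devices transmits
    # when each passes the barring check with probability p.
    return n * p * (1 - p) ** (n - 1)

n = 50
ps = np.linspace(0.001, 0.2, 2000)
best_p = ps[np.argmax(success_prob(n, ps))]   # close to 1/n
```

This is why a barring factor tuned to the (estimated) backlog can dramatically outperform fixed standardized settings: a factor far from 1/n either starves the channel or causes excessive collisions.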
Storage networking technology has enjoyed strong growth in recent years, but security concerns and threats facing networked data have grown equally fast.
Today, there are many potential threats that are targeted at storage networks, including data modification, destruction and theft, DoS attacks, malware, hardware theft and unauthorized access, among others.
In order for a Storage Area Network (SAN) to be secure, each of these threats must be individually addressed.
In this paper, we present a comparative study by implementing different security methods in an IP storage network.
LSTMs and other RNN variants have shown strong performance on character-level language modeling.
These models are typically trained using truncated backpropagation through time, and it is common to assume that their success stems from their ability to remember long-term contexts.
In this paper, we show that a deep (64-layer) transformer model with fixed context outperforms RNN variants by a large margin, achieving state of the art on two popular benchmarks: 1.13 bits per character on text8 and 1.06 on enwik8.
To get good results at this depth, we show that it is important to add auxiliary losses, both at intermediate network layers and intermediate sequence positions.
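Bits per character, the metric quoted above, is the average negative base-2 log-likelihood the model assigns to each true next character; lower is better. A toy computation (illustrative probabilities, not actual model outputs on text8 or enwik8):

```python
import math

# Probabilities a character-level model assigns to the true next character.
probs = [0.5, 0.25, 0.9, 0.1]
bpc = sum(-math.log2(p) for p in probs) / len(probs)   # lower is better
```

A model that assigned probability 0.5 to every character would score exactly 1.0 bpc, so results like 1.06 on enwik8 correspond to slightly worse than one bit of uncertainty per character on average.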
Computational Social Choice is an interdisciplinary research area involving Economics, Political Science, and Social Science on the one side, and Mathematics and Computer Science (including Artificial Intelligence and Multiagent Systems) on the other side.
Typical computational problems studied in this field include the vulnerability of voting procedures against attacks, or preference aggregation in multi-agent systems.
Parameterized Algorithmics is a subfield of Theoretical Computer Science seeking to exploit meaningful problem-specific parameters in order to identify tractable special cases of in general computationally hard problems.
In this paper, we propose nine of our favorite research challenges concerning the parameterized complexity of problems appearing in this context.
We present and evaluate an approach for human-in-the-loop specification of shape reconstruction with annotations for basic robot-object interactions.
Our method is based on the idea of model annotation: the addition of simple cues to an underlying object model to specify shape and delineate a simple task.
The goal is to explore reducing the complexity of CAD-like interfaces so that novice users can quickly recover an object's shape and describe a manipulation task that is then carried out by a robot.
The object modeling and interaction annotation capabilities are tested with a user study and compared against results obtained using existing approaches.
The approach has been analyzed using a variety of shape comparison, grasping, and manipulation metrics, and tested with the PR2 robot platform, where it was shown to be successful.
In the last decade, deep learning algorithms have become very popular thanks to the achieved performance in many machine learning and computer vision tasks.
However, most of the deep learning architectures are vulnerable to so called adversarial examples.
This questions the security of deep neural networks (DNN) for many security- and trust-sensitive domains.
The majority of existing adversarial attacks are based on the differentiability of the DNN cost function. Defence strategies are mostly based on machine learning and signal processing principles that either try to detect and reject or filter out the adversarial perturbations, and they completely neglect the classical cryptographic component in the defence.
In this work, we propose a new defence mechanism based on Kerckhoffs's second cryptographic principle, which states that the defence and classification algorithms are supposed to be known, but not the key.
To be compliant with the assumption that the attacker does not have access to the secret key, we will primarily focus on a gray-box scenario and do not address a white-box one.
More particularly, we assume that the attacker does not have direct access to the secret block, but (a) he completely knows the system architecture, (b) he has access to the data used for training and testing and (c) he can observe the output of the classifier for each given input.
We show empirically that our system is efficient against most famous state-of-the-art attacks in black-box and gray-box scenarios.
Current approaches to cross-lingual sentiment analysis try to leverage the wealth of labeled English data using bilingual lexicons, bilingual vector space embeddings, or machine translation systems.
Here we show that it is possible to use a single linear transformation, with as few as 2000 word pairs, to capture fine-grained sentiment relationships between words in a cross-lingual setting.
We apply these cross-lingual sentiment models to a diverse set of tasks to demonstrate their functionality in a non-English context.
By effectively leveraging English sentiment knowledge without the need for accurate translation, we can analyze and extract features from other languages with scarce data at a very low cost, thus making sentiment and related analyses for many languages inexpensive.
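The single linear transformation described above can be fit by ordinary least squares on the bilingual word pairs. The sketch below uses synthetic stand-ins for word vectors (a hidden linear map plus noise) rather than real embeddings, but the fitting step is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_pairs = 50, 2000
# Toy stand-ins for word vectors: target-language vectors are a hidden
# linear map of the source-language vectors plus noise.
A_true = rng.standard_normal((d, d))
X = rng.standard_normal((n_pairs, d))               # source-language vectors
Y = X @ A_true.T + 0.01 * rng.standard_normal((n_pairs, d))

# Fit a single linear transformation from the 2000 bilingual word pairs.
A_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Map a new source-language vector into the target space.
x_new = rng.standard_normal(d)
y_pred = x_new @ A_hat
```

Because only one d-by-d matrix is estimated, a few thousand word pairs suffice, which is what keeps the approach cheap for low-resource languages.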
We study an uplink multi secondary user (SU) cognitive radio system having average delay constraints as well as an instantaneous interference constraint to the primary user (PU).
If the interference channels from the SUs to the PU have independent but not identically distributed fading coefficients, then the SUs will experience heterogeneous delay performances.
This is because SUs causing low interference to the PU will be scheduled more frequently, and/or allocated more transmission power than those causing high interference.
We propose a dynamic scheduling-and-power-control algorithm that can provide the required average delay guarantees to all SUs while protecting the PU from interference.
Using the Lyapunov technique, we show that our algorithm is asymptotically delay optimal while satisfying the delay and interference constraints.
We support our findings by extensive system simulations and show the robustness of the proposed algorithm against channel estimation errors.
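A Lyapunov-style queue-aware scheduler of this general flavor can be sketched as follows. The arrival rates, channel model, and interference threshold are illustrative assumptions, not this paper's system parameters: each slot, the scheduler serves the eligible SU (one whose interference to the PU is below the cap) with the largest queue-weighted rate.

```python
import numpy as np

rng = np.random.default_rng(0)

n_su, T = 3, 5000
i_max = 0.7                      # instantaneous interference cap at the PU
lam = 0.2                        # arrival probability per SU per slot
Q = np.zeros(n_su)               # per-SU packet queues (delay surrogate)

for _ in range(T):
    h = rng.random(n_su)         # SU -> PU interference channel gains
    g = rng.random(n_su)         # SU -> base-station channel gains
    eligible = h <= i_max        # transmitting would otherwise violate the cap
    # Serve the eligible SU with the largest queue-weighted rate.
    score = np.where(eligible, Q * g, -np.inf)
    if np.isfinite(score.max()) and score.max() > 0:
        k = int(np.argmax(score))
        Q[k] = max(Q[k] - 1.0, 0.0)          # serve one packet
    Q += rng.random(n_su) < lam              # Bernoulli arrivals
```

Keeping the queues bounded is the discrete analogue of the average-delay guarantee: SUs that are rarely eligible accumulate larger queues and are therefore prioritized when they do become eligible, counteracting the heterogeneity described above.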
We propose a new MDS paradigm called reader-aware multi-document summarization (RA-MDS).
Specifically, a set of reader comments associated with the news reports is also collected.
The generated summaries from the reports for the event should be salient according to not only the reports but also the reader comments.
To tackle this RA-MDS problem, we propose a sparse-coding-based method that is able to calculate the salience of the text units by jointly considering news reports and reader comments.
Another reader-aware characteristic of our framework is to improve linguistic quality via entity rewriting.
The rewriting consideration is jointly assessed together with other summarization requirements under a unified optimization model.
To support the generation of compressive summaries via optimization, we explore a finer syntactic unit, namely, noun/verb phrase.
In this work, we also generate a data set for conducting RA-MDS.
Extensive experiments on this data set and some classical data sets demonstrate the effectiveness of our proposed approach.
Industrial Control Systems are under increased scrutiny.
Their security is historically sub-par, and although measures are being taken by the manufacturers to remedy this, the large installed base of legacy systems cannot easily be updated with state-of-the-art security measures.
We propose a system that uses electromagnetic side-channel measurements to detect behavioural changes of the software running on industrial control systems.
To demonstrate the feasibility of this method, we show it is possible to profile and distinguish between even small changes in programs on Siemens S7-317 PLCs, using methods from cryptographic side-channel analysis.
Objective: Radiomics-driven Computer Aided Diagnosis (CAD) has shown considerable promise in recent years as a potential tool for improving clinical decision support in medical oncology, particularly those based around the concept of Discovery Radiomics, where radiomic sequencers are discovered through the analysis of medical imaging data.
One of the main limitations with current CAD approaches is that it is very difficult to gain insight or rationale as to how decisions are made, thus limiting their utility to clinicians.
Methods: In this study, we propose CLEAR-DR, a novel interpretable CAD system based on the notion of CLass-Enhanced Attentive Response Discovery Radiomics for the purpose of clinical decision support for diabetic retinopathy.
Results: In addition to disease grading via the discovered deep radiomic sequencer, the CLEAR-DR system also produces a visual interpretation of the decision-making process to provide better insight and understanding into the decision-making process of the system.
Conclusion: We demonstrate the effectiveness and utility of the proposed CLEAR-DR system in enhancing the interpretability of diagnostic grading results for the application of diabetic retinopathy grading.
Significance: CLEAR-DR can act as a potential powerful tool to address the uninterpretability issue of current CAD systems, thus improving their utility to clinicians.
We propose a new approach for solving a class of discrete decision making problems under uncertainty with positive cost.
This issue concerns multiple and diverse fields such as engineering, economics, artificial intelligence, cognitive science and many others.
Basically, an agent has to choose a single or series of actions from a set of options, without knowing for sure their consequences.
Schematically, two main approaches have been followed: either the agent learns which option is the correct one to choose in a given situation by trial and error, or the agent already has some knowledge of the possible consequences of its decisions, this knowledge generally being expressed as a conditional probability distribution.
In the latter case, several optimal or suboptimal methods have been proposed to exploit this uncertain knowledge in various contexts.
In this work, we propose following a different approach, based on the geometric intuition of distance.
More precisely, we define a goal independent quasimetric structure on the state space, taking into account both cost function and transition probability.
We then compare precision and computation time with classical approaches.
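One way to realize such a structure is to take d(s, g) as the minimal expected cost-to-go from s to g, computed for every goal by value iteration over the cost function and transition probabilities. The toy ring MDP below is an illustration of this idea, not the paper's exact construction.

```python
import numpy as np

# Toy MDP: 4 states on a ring, actions move clockwise/counter-clockwise;
# the chosen move succeeds with probability 0.9, otherwise the agent stays.
nS, nA = 4, 2
cost = np.ones((nS, nA))                     # unit cost per step
P = np.zeros((nS, nA, nS))
for s in range(nS):
    P[s, 0, (s + 1) % nS] = 0.9; P[s, 0, s] = 0.1
    P[s, 1, (s - 1) % nS] = 0.9; P[s, 1, s] = 0.1

def cost_to_go(goal, iters=200):
    # Minimal expected cost-to-go to `goal` via value iteration,
    # treating the goal as absorbing.
    V = np.zeros(nS)
    for _ in range(iters):
        Q = cost + P @ V                     # shape (nS, nA)
        Q[goal] = 0.0
        V = Q.min(axis=1)
        V[goal] = 0.0
    return V

# Goal-independent structure: all-pairs quasimetric D[s, g].
D = np.stack([cost_to_go(g) for g in range(nS)], axis=1)
```

The result is a quasimetric rather than a metric: D is zero on the diagonal and satisfies the triangle inequality, but need not be symmetric when costs or transitions are direction-dependent.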
Modular optical switch architectures combining wavelength routing based on arrayed waveguide grating (AWG) devices and multicasting based on star couplers hold promise for flexibly addressing the exponentially growing traffic demands in a cost- and power-efficient fashion.
In a default switching scenario, an input port of the AWG is connected to an output port via a single wavelength.
This can severely limit the capacity between broadcast domains, resulting in interdomain traffic switching bottlenecks.
In this paper, we examine the possibility of resolving capacity bottlenecks by exploiting multiple AWG free spectral ranges (FSRs), i.e., setting up multiple parallel connections between each pair of broadcast domains.
To this end, we introduce a multi-FSR scheduling algorithm for interconnecting broadcast domains by fairly distributing the wavelength resources among them.
We develop a general-purpose analytical framework to study the blocking probabilities in a multistage switching scenario and compare our results with Monte Carlo simulations.
Our study points to significant improvements with a moderate increase in the number of FSRs.
We show that an FSR count beyond four results in diminishing returns.
Furthermore, to investigate the trade-offs between the network- and physical-layer effects, we conduct a cross-layer analysis, taking into account pulse amplitude modulation (PAM) and rate-adaptive forward error correction (FEC).
We illustrate how the effective bit rate per port increases with an increase in the number of FSRs.
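Blocking analyses of this kind often build on the classic Erlang-B recursion; the sketch below is a generic illustration (not this paper's analytical framework) of how blocking probability falls as the number of parallel wavelength resources, here playing the role of servers, increases.

```python
def erlang_b(load, servers):
    """Erlang-B blocking probability via the standard recursion
    B(0) = 1, B(k) = A*B(k-1) / (k + A*B(k-1))."""
    b = 1.0
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)
    return b

# Doubling the parallel resources (e.g., via extra FSRs) cuts blocking sharply.
base = erlang_b(2.0, 2)       # blocking with 2 parallel channels
more = erlang_b(2.0, 4)       # blocking with 4 parallel channels
```

The steep initial drop followed by rapidly flattening gains mirrors the diminishing returns beyond four FSRs reported above.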
This paper presents an adaptive fault-tolerant control (FTC) scheme for a class of nonlinear uncertain multi-agent systems.
A local FTC scheme is designed for each agent using local measurements and suitable information exchanged between neighboring agents.
Each local FTC scheme consists of a fault diagnosis module and a reconfigurable controller module comprised of a baseline controller and two adaptive fault-tolerant controllers activated after fault detection and after fault isolation, respectively.
Under certain assumptions, the closed-loop system's stability and leader-follower consensus properties are rigorously established under different modes of the FTC system, including the time-period before possible fault detection, between fault detection and possible isolation, and after fault isolation.
Ubiquitous computing helps make data and services available to users anytime and anywhere.
This makes the cooperation of devices a crucial need.
In return, such cooperation causes an overload of the devices and/or networks, resulting in network malfunction and suspension of its activities.
Our goal in this paper is to propose an approach for device reconfiguration in order to help reduce energy consumption in ubiquitous environments.
The idea is that when high-energy consumption is detected, we proceed to a change in component distribution on the devices to reduce and/or balance the energy consumption.
We also investigate the possibility of detecting high-energy consumption of devices or the network based on device capabilities.
As a result, our idea realizes a reconfiguration of devices aimed at reducing the consumption of energy and/or load balancing in ubiquitous environments.
To synthesize Maxwell optics systems, the mathematical apparatus of tensor and vector analysis is generally employed.
This mathematical apparatus implies executing a great number of simple stereotyped operations, which are adequately supported by computer algebra systems.
In this paper, we distinguish between two stages of working with a mathematical model: model development and model usage.
Each of these stages implies its own computer algebra system.
As a model problem, we consider the problem of geometrization of Maxwell's equations.
Two computer algebra systems---Cadabra and FORM---are selected for use at different stages of investigation.
This paper is motivated by the automation of neuropsychological tests involving discourse analysis in the retellings of narratives by patients with potential cognitive impairment.
In this scenario the task of sentence boundary detection in speech transcripts is important as discourse analysis involves the application of Natural Language Processing tools, such as taggers and parsers, which depend on the sentence as a processing unit.
Our aim in this paper is to verify which embedding induction method works best for the sentence boundary detection task, specifically whether methods proposed to capture semantic, syntactic, or morphological similarities perform best.
Musical counterpoint, a musical technique in which two or more independent melodies are played simultaneously with the goal of creating harmony, has been around since the baroque era.
However, to our knowledge computational generation of aesthetically pleasing linear counterpoint based on subjective fitness assessment has not been explored by the evolutionary computation community (although generation using objective fitness has been attempted in quite a few cases).
The independence of contrapuntal melodies and the subjective nature of musical aesthetics provide an excellent platform for the application of genetic algorithms.
In this paper, a genetic algorithm approach to generating contrapuntal melodies is explained, with a description of the various musical heuristics used and of how variable-length chromosome strings are used to avoid generating "jerky" rhythms and melodic phrases, as well as how subjectivity is incorporated into the algorithm's fitness measures.
Next, results from empirical testing of the algorithm are presented, with a focus on how a user's musical sophistication influences their experience.
Lastly, further musical and compositional applications of the algorithm are discussed along with planned future work on the algorithm.
Distributed word representations (word embeddings) have recently contributed to competitive performance in language modeling and several NLP tasks.
In this work, we train word embeddings for more than 100 languages using their corresponding Wikipedias.
We quantitatively demonstrate the utility of our word embeddings by using them as the sole features for training a part of speech tagger for a subset of these languages.
We find their performance to be competitive with near-state-of-the-art methods for English, Danish, and Swedish.
Moreover, we investigate the semantic features captured by these embeddings through the proximity of word groupings.
We will release these embeddings publicly to help researchers in the development and enhancement of multilingual applications.
The distribution semantics is one of the most prominent approaches for the combination of logic programming and probability theory.
Many languages follow this semantics, such as Independent Choice Logic, PRISM, pD, Logic Programs with Annotated Disjunctions (LPADs) and ProbLog.
When a program contains function symbols, the distribution semantics is well-defined only if the set of explanations for a query is finite and so is each explanation.
Well-definedness is usually either explicitly imposed or is achieved by severely limiting the class of allowed programs.
In this paper we identify a larger class of programs for which the semantics is well-defined together with an efficient procedure for computing the probability of queries.
Since LPADs offer the most general syntax, we present our results for them, but our results are applicable to all languages under the distribution semantics.
We present the algorithm "Probabilistic Inference with Tabling and Answer subsumption" (PITA) that computes the probability of queries by transforming a probabilistic program into a normal program and then applying SLG resolution with answer subsumption.
PITA has been implemented in XSB and tested on six domains: two with function symbols and four without.
The execution times are compared with those of ProbLog, cplint and CVE; PITA was almost always able to solve larger problems in a shorter time, on domains with and without function symbols.
Dropout Variational Inference, or Dropout Sampling, has been recently proposed as an approximation technique for Bayesian Deep Learning and evaluated for image classification and regression tasks.
This paper investigates the utility of Dropout Sampling for object detection for the first time.
We demonstrate how label uncertainty can be extracted from a state-of-the-art object detection system via Dropout Sampling.
We evaluate this approach on a large synthetic dataset of 30,000 images, and a real-world dataset captured by a mobile robot in a versatile campus environment.
We show that this uncertainty can be utilized to increase object detection performance under the open-set conditions that are typically encountered in robotic vision.
A Dropout Sampling network is shown to achieve a 12.3% increase in recall (for the same precision score as a standard network) and a 15.1% increase in precision (for the same recall score as the standard network).
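As an illustrative sketch (not the paper's detector), Dropout Sampling keeps dropout active at test time and treats the spread of repeated stochastic forward passes as an uncertainty estimate. The toy linear "network" below is hypothetical; only the sampling scheme reflects the technique.

```python
import random
import statistics

def predict_with_dropout(x, weights, p_drop=0.5, rng=random):
    """One stochastic forward pass of a toy linear model:
    each weight is dropped with probability p_drop (inverted dropout)."""
    total = 0.0
    for w in weights:
        if rng.random() >= p_drop:
            total += w * x / (1.0 - p_drop)
    return total / len(weights)

def mc_dropout_predict(x, weights, n_samples=200, seed=0):
    """Sample repeatedly with dropout on; the mean is the prediction and
    the standard deviation serves as the (epistemic) uncertainty."""
    rng = random.Random(seed)
    samples = [predict_with_dropout(x, weights, rng=rng)
               for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.pstdev(samples)
```

In the object-detection setting of the paper, the same idea is applied to class label distributions rather than to a scalar output.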
We establish an equivalence between two seemingly different theories: one is the traditional axiomatisation of incomplete preferences on horse lotteries based on the mixture independence axiom; the other is the theory of desirable gambles developed in the context of imprecise probability.
The equivalence allows us to revisit incomplete preferences from the viewpoint of desirability and through the derived notion of coherent lower previsions.
Perhaps most importantly, we argue throughout that desirability is a powerful and natural setting to model, and work with, incomplete preferences, even in case of non-Archimedean problems.
This leads us to suggest that desirability, rather than preference, should be the primitive notion at the basis of decision-theoretic axiomatisations.
For the task of subdecimeter aerial imagery segmentation, fine-grained semantic segmentation results are usually difficult to obtain because of complex remote sensing content and optical conditions.
Recently, convolutional neural networks (CNNs) have shown outstanding performance on this task.
Although many deep neural network structures and techniques have been applied to improve the accuracy, few have paid attention to better differentiating the easily confused classes.
In this paper, we propose TreeSegNet which adopts an adaptive network to increase the classification rate at the pixelwise level.
Specifically, based on the infrastructure of DeepUNet, a Tree-CNN block in which each node represents a ResNeXt unit is constructed adaptively according to the confusion matrix and the proposed TreeCutting algorithm.
By transporting feature maps through concatenating connections, the Tree-CNN block fuses multiscale features and learns best weights for the model.
In experiments on the ISPRS 2D semantic labeling Potsdam dataset, the results obtained by TreeSegNet are better than those of other published state-of-the-art methods.
Detailed comparison and analysis show that the improvement brought by the adaptive Tree-CNN block is significant.
We give a new simple and short ("one-line") analysis for the runtime of the well-known Euclidean Algorithm.
While very short and simple, the analysis yields a near-optimal upper bound.
Driving support systems, such as car navigation systems, are becoming common, and they assist drivers in several aspects.
A non-intrusive method of detecting fatigue and drowsiness, based on eye-blink count and eye-directed instruction control, helps the driver avoid collisions caused by drowsy driving.
Eye detection and tracking under varying conditions of illumination, background, face alignment and facial expression make the problem complex. A neural-network-based algorithm is proposed in this paper to detect the eyes efficiently.
In the proposed algorithm, the neural network is first trained to reject non-eye regions, using images with eye features and images with non-eye features; Gabor filters and Support Vector Machines are used to reduce the dimensionality and to classify efficiently.
In the algorithm, the face is first segmented using the L*a*b color space, and the eyes are then detected using HSV and a neural network approach.
The neural network is trained with 50 non-eye images and 50 eye images at different angles using Gabor filters. The algorithm was tested on nearly 100 images of different persons under different conditions, and the results are satisfactory, with a success rate of 98%.
This paper is a part of research work on "Development of Non-Intrusive system for real-time Monitoring and Prediction of Driver Fatigue and drowsiness" project sponsored by Department of Science & Technology, Govt. of India, New Delhi at Vignan Institute of Technology and Sciences, Vignan Hills, Hyderabad.
Recent developments in image quality, data storage, and computational capacity have heightened the need for texture analysis in image processing.
To date various methods have been developed and introduced for assessing textures in images.
One of the most popular texture analysis methods is the Texture Energy Measure (TEM) and it has been used for detecting edges, levels, waves, spots and ripples by employing predefined TEM masks to images.
Despite several successful studies, TEM has a number of serious weaknesses in use.
The major drawback is that the masks are predefined and therefore cannot be adapted to the image.
A new method, the Adaptive Texture Energy Measure (aTEM), is proposed to overcome this disadvantage of TEM by using adaptive masks that adjust the contrast, sharpening and orientation angle of the mask.
To assess the applicability of aTEM, it is compared with TEM.
The classification accuracies on the butterfly, flower seed and Brodatz datasets are 0.08, 0.3292 and 0.3343, respectively, with TEM, and 0.0053, 0.2417 and 0.3153, respectively, with aTEM.
The results of this study indicate that aTEM is a successful method for texture analysis.
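To make the TEM idea concrete, the sketch below applies a classic Laws-style texture energy mask (the standard L5/E5 vectors, whose outer product gives a 2-D mask) and sums the absolute filter responses over an image. The adaptive mask construction of aTEM is not reproduced; this only shows the fixed-mask baseline the paper improves upon.

```python
def texture_energy(image, mask):
    """Convolve (valid mode) with a TEM mask and sum absolute responses."""
    mh, mw = len(mask), len(mask[0])
    h, w = len(image), len(image[0])
    energy = 0.0
    for i in range(h - mh + 1):
        for j in range(w - mw + 1):
            resp = sum(image[i + di][j + dj] * mask[di][dj]
                       for di in range(mh) for dj in range(mw))
            energy += abs(resp)
    return energy

def outer(a, b):
    """Outer product of two 1-D Laws vectors, giving a 2-D mask."""
    return [[x * y for y in b] for x in a]

# Classic Laws 1-D vectors.
L5 = [1, 4, 6, 4, 1]    # level
E5 = [-1, -2, 0, 2, 1]  # edge
```

Because E5 sums to zero, the E5⊗L5 mask gives zero energy on a constant image and responds only where intensity changes.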
We introduce a hierarchy of fast-growing complexity classes and show its suitability for completeness statements of many non-elementary problems.
This hierarchy allows the classification of many decision problems with a non-elementary complexity, which occur naturally in logic, combinatorics, formal languages, verification, etc., with complexities ranging from simple towers of exponentials to Ackermannian and beyond.
Automatic voice-controlled systems have changed the way humans interact with a computer.
Voice or speech recognition systems allow a user to make a hands-free request to the computer, which in turn processes the request and serves the user with appropriate responses.
After years of research and developments in machine learning and artificial intelligence, today voice-controlled technologies have become more efficient and are widely applied in many domains to enable and improve human-to-human and human-to-computer interactions.
The state-of-the-art e-commerce applications with the help of web technologies offer interactive and user-friendly interfaces.
However, there are some instances where people, especially with visual disabilities, are not able to fully experience the serviceability of such applications.
A voice-controlled system embedded in a web application can enhance user experience and can provide voice as a means to control the functionality of e-commerce websites.
In this paper, we propose a taxonomy of speech recognition systems (SRS) and present a voice-controlled commodity purchase e-commerce application using IBM Watson speech-to-text to demonstrate its usability.
The prototype can be extended to other application scenarios such as government service kiosks and enable analytics of the converted text data for scenarios such as medical diagnosis at the clinics.
Network analysis defines a number of centrality measures to identify the most central nodes in a network.
Fast computation of those measures is a major challenge in algorithmic network analysis.
Aside from closeness and betweenness, Katz centrality is one of the established centrality measures.
In this paper, we consider the problem of computing rankings for Katz centrality.
In particular, we propose upper and lower bounds on the Katz score of a given node.
While previous approaches relied on numerical approximation or heuristics to compute Katz centrality rankings, we construct an algorithm that iteratively improves those upper and lower bounds until a correct Katz ranking is obtained.
We extend our algorithm to dynamic graphs while maintaining its correctness guarantees.
Experiments demonstrate that our static graph algorithm outperforms both numerical approaches and heuristics with speedups between 1.5x and 3.5x, depending on the desired quality guarantees.
Our dynamic graph algorithm improves upon the static algorithm for update batches of less than 10000 edges.
We provide efficient parallel CPU and GPU implementations of our algorithms that enable near real-time Katz centrality computation for graphs with hundreds of millions of nodes in fractions of seconds.
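As background for readers unfamiliar with the measure, the sketch below computes Katz centrality itself (not the paper's bound-tightening ranking algorithm) by accumulating walk counts weighted by powers of the damping factor alpha, stopping once longer walks contribute negligibly.

```python
def katz_centrality(adj, alpha=0.1, tol=1e-12, max_iter=1000):
    """Katz centrality c_i = sum_k alpha^k * (# walks of length k ending
    at i), accumulated iteratively until the contribution vanishes.
    adj[u] lists the out-neighbors of node u."""
    n = len(adj)
    c = [0.0] * n
    walks = [1.0] * n  # walks of length 0 ending at each node
    factor = alpha
    for _ in range(max_iter):
        nxt = [0.0] * n
        for u, nbrs in enumerate(adj):
            for v in nbrs:
                nxt[v] += walks[u]
        for i in range(n):
            c[i] += factor * nxt[i]
        if factor * max(nxt) < tol:
            break
        walks, factor = nxt, factor * alpha
    return c
```

The series converges when alpha is below the reciprocal of the adjacency matrix's spectral radius; the paper's contribution is to bound the truncated tail so that rankings can be certified without full convergence.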
Surrogate-based optimization and nature-inspired metaheuristics have become the state-of-the-art in solving real-world optimization problems.
Still, it is difficult for beginners and even experts to get an overview that explains their advantages in comparison to the large number of available methods in the scope of continuous optimization.
Available taxonomies lack the integration of surrogate-based approaches and thus their embedding in the larger context of this broad field.
This article presents a taxonomy of the field, which further matches the idea of nature-inspired algorithms, as it is based on the human behavior in path finding.
Intuitive analogies make it easy to conceive the most basic principles of the search algorithms, even for beginners and non-experts in this area of research.
However, this scheme does not oversimplify the high complexity of the different algorithms, as the class identifier only defines a descriptive meta-level of the algorithm search strategies.
The taxonomy was established by exploring and matching algorithm schemes, extracting similarities and differences, and creating a set of classification indicators to distinguish between five distinct classes.
In practice, this taxonomy allows recommendations for the applicability of the corresponding algorithms and helps developers trying to create or improve their own algorithms.
Distributed stateful stream processing enables the deployment and execution of large scale continuous computations in the cloud, targeting both low latency and high throughput.
One of the most fundamental challenges of this paradigm is providing processing guarantees under potential failures.
Existing approaches rely on periodic global state snapshots that can be used for failure recovery.
Those approaches suffer from two main drawbacks.
First, they often stall the overall computation which impacts ingestion.
Second, they eagerly persist all records in transit along with the operation states which results in larger snapshots than required.
In this work we propose Asynchronous Barrier Snapshotting (ABS), a lightweight algorithm suited for modern dataflow execution engines that minimises space requirements.
ABS persists only operator states on acyclic execution topologies while keeping a minimal record log on cyclic dataflows.
We implemented ABS on Apache Flink, a distributed analytics engine that supports stateful stream processing.
Our evaluation shows that our algorithm does not have a heavy impact on the execution, maintaining linear scalability and performing well with frequent snapshots.
While the incipient internet was largely text-based, the modern digital world is becoming increasingly multi-modal.
Here, we examine multi-modal classification where one modality is discrete, e.g. text, and the other is continuous, e.g. visual representations transferred from a convolutional neural network.
In particular, we focus on scenarios where we have to be able to classify large quantities of data quickly.
We investigate various methods for performing multi-modal fusion and analyze their trade-offs in terms of classification accuracy and computational efficiency.
Our findings indicate that the inclusion of continuous information improves performance over text-only on a range of multi-modal classification tasks, even with simple fusion methods.
In addition, we experiment with discretizing the continuous features in order to speed up and simplify the fusion process even further.
Our results show that fusion with discretized features outperforms text-only classification, at a fraction of the computational cost of full multi-modal fusion, with the additional benefit of improved interpretability.
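A minimal sketch of the discretized-fusion idea, assuming a bag-of-words text classifier downstream: each continuous feature is binned and mapped to a synthetic token (the `f{i}_b{b}` naming is hypothetical), so the fused input is just a longer token list.

```python
def discretize(vec, n_bins=8, lo=-1.0, hi=1.0):
    """Map each continuous feature to a token like 'f3_b5' so it can be
    fed to a text classifier alongside the real word tokens."""
    width = (hi - lo) / n_bins
    tokens = []
    for i, v in enumerate(vec):
        b = min(n_bins - 1, max(0, int((v - lo) / width)))
        tokens.append(f"f{i}_b{b}")
    return tokens

def fuse(text_tokens, continuous_vec):
    """Fusion by simple concatenation of word and discretized-feature tokens."""
    return text_tokens + discretize(continuous_vec)
```

This keeps the fusion cost at that of text-only classification and makes the contribution of each visual feature bin directly inspectable.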
Given a social network with diffusion probabilities as edge weights and an integer k, which k nodes should be chosen for initial injection of information to maximize influence in the network?
This problem is known as Target Set Selection in a social network (TSS Problem) and more popularly, Social Influence Maximization Problem (SIM Problem).
This has been an active area of research in the computational social network analysis domain for around a decade and a half.
Due to its practical importance in various domains, such as viral marketing, target advertisement, personalized recommendation, the problem has been studied in different variants, and different solution methodologies have been proposed over the years.
Hence, there is a need for an organized and comprehensive review on this topic.
This paper presents a survey on the progress in and around TSS Problem.
Finally, it discusses current research trends and future research directions.
Many applications in different domains produce large amount of time series data.
Making accurate forecasting is critical for many decision makers.
Various time series forecasting methods exist, which use linear and nonlinear models separately or a combination of both.
Studies show that combining of linear and nonlinear models can be effective to improve forecasting performance.
However, some assumptions that those existing methods make, might restrict their performance in certain situations.
We provide a new Autoregressive Integrated Moving Average (ARIMA)-Artificial Neural Network (ANN) hybrid method that works in a more general framework.
Experimental results show that strategies for decomposing the original data and for combining linear and nonlinear models throughout the hybridization process are key factors in the forecasting performance of the methods.
By using appropriate strategies, our hybrid method can be an effective way to improve forecasting accuracy obtained by traditional hybrid methods and also either of the individual methods used separately.
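The decompose-then-combine structure of such hybrids can be sketched as follows. This is not the paper's method: the linear stage is reduced to a least-squares AR(1) fit and the ANN is replaced by a toy k-nearest-neighbor predictor on the residuals, purely to show how the two stages are chained.

```python
def fit_ar1(series):
    """Least-squares AR(1) coefficient: y_t ~ phi * y_{t-1} (linear stage)."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(y * y for y in series[:-1])
    return num / den

def hybrid_forecast(series, k=3):
    """Linear stage forecasts the series; a toy nonlinear stage (k-NN on
    residuals, standing in for the ANN) forecasts what the linear part
    missed. The final forecast is the sum of the two."""
    phi = fit_ar1(series)
    residuals = [series[t] - phi * series[t - 1]
                 for t in range(1, len(series))]
    last = residuals[-1]
    pairs = sorted(zip(residuals[:-1], residuals[1:]),
                   key=lambda p: abs(p[0] - last))
    resid_forecast = sum(nxt for _, nxt in pairs[:k]) / min(k, len(pairs))
    return phi * series[-1] + resid_forecast
```

The abstract's point is precisely that choices made here (how the data is decomposed, how the stages are combined) dominate forecasting performance.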
Dynamically typed programming languages such as JavaScript and Python defer type checking to run time.
In order to maximize performance, dynamic language VM implementations must attempt to eliminate redundant dynamic type checks.
However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision.
This has led to the creation of increasingly complex multi-tiered VM architectures.
This paper introduces lazy basic block versioning, a simple JIT compilation technique which effectively removes redundant type checks from critical code paths.
This novel approach lazily generates type-specialized versions of basic blocks on-the-fly while propagating context-dependent type information.
This does not require the use of costly program analyses, is not restricted by the precision limitations of traditional type analyses and avoids the implementation complexity of speculative optimization techniques.
We have implemented intraprocedural lazy basic block versioning in a JavaScript JIT compiler.
This approach is compared with a classical flow-based type analysis.
Lazy basic block versioning performs as well or better on all benchmarks.
On average, 71% of type tests are eliminated, yielding speedups of up to 50%.
We also show that our implementation generates more efficient machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on several benchmarks.
The combination of implementation simplicity, low algorithmic complexity and good run time performance makes basic block versioning attractive for baseline JIT compilers.
We give an example of a three-person deterministic graphical game that has no Nash equilibrium in pure stationary strategies.
The game has seven positions, four outcomes (a unique cycle and three terminal positions), and its normal form is of size 2 x 2 x 4 only.
Thus, our example strengthens significantly the one obtained in 2014 by Gurvich and Oudalov; the latter has four players, five terminals, and a 2 x 4 x 6 x 8 normal form.
Furthermore, our example is minimal with respect to the number of players.
Both examples are tight but not Nash-solvable.
Such examples were known since 1975, but they were not related to deterministic graphical games.
Moreover, due to the small size of our example, we can strengthen it further by showing that it has no Nash equilibrium not only in pure but also in independently mixed strategies, for both Markovian and a priori evaluations.
Principal component analysis (PCA) has well-documented merits for data extraction and dimensionality reduction.
PCA deals with a single dataset at a time, and it is challenged when it comes to analyzing multiple datasets.
Yet in certain setups, one wishes to extract the most significant information of one dataset relative to other datasets.
Specifically, the interest may be in identifying, namely extracting, features that are specific to a single target dataset but not to the others.
This paper develops a novel approach for such so-termed discriminative data analysis, and establishes its optimality in the least-squares (LS) sense under suitable data modeling assumptions.
The criterion reveals linear combinations of variables by maximizing the ratio of the variance of the target data to that of the remainders.
The novel approach solves a generalized eigenvalue problem by performing SVD just once.
Numerical tests using synthetic and real datasets showcase the merits of the proposed approach relative to its competing alternatives.
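The variance-ratio criterion can be sketched directly as a generalized eigenvalue problem between the target and background covariance matrices. The function below is an illustrative simplification (it solves the generalized eigenproblem via a matrix solve and `numpy.linalg.eig`, not the single-SVD route the paper describes).

```python
import numpy as np

def discriminative_directions(X_target, X_background, n_components=1,
                              eps=1e-6):
    """Directions maximizing var(target) / var(background), i.e. the top
    generalized eigenvectors of C_t v = lambda * C_b v."""
    Ct = np.cov(X_target, rowvar=False)
    Cb = np.cov(X_background, rowvar=False) + eps * np.eye(X_target.shape[1])
    vals, vecs = np.linalg.eig(np.linalg.solve(Cb, Ct))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:n_components]]
```

If the target data has extra variance along one axis that the background lacks, the leading direction aligns with that axis, which ordinary PCA on the target alone would not guarantee.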
We analyze the time evolution of citations acquired by articles from journals of the American Physical Society (PRA, PRB, PRC, PRD, PRE and PRL).
The observed change over time in the number of papers published in each journal is considered an exogenously caused variation in citability that is accounted for by a normalization.
The appropriately inflation-adjusted citation rates are found to be separable into a preferential-attachment-type growth kernel and a purely obsolescence-related (i.e., monotonically decreasing as a function of time since publication) aging function.
Variations in the empirically extracted parameters of the growth kernels and aging functions associated with different journals point to research-field-specific characteristics of citation intensity and knowledge flow.
Comparison with analogous results for the citation dynamics of technology-disaggregated cohorts of patents provides deeper insight into the basic principles of information propagation as indicated by citing behavior.
This study concerns the diagnosis of aerospace structure defects by applying an HPC parallel implementation of a novel learning algorithm named U-BRAIN.
The Soft Computing approach allows advanced multi-parameter data processing in composite materials testing.
The HPC parallel implementation overcomes the limits due to the great amount of data and the complexity of data processing.
Our experimental results illustrate the effectiveness of the U-BRAIN parallel implementation as defect classifier in aerospace structures.
The resulting system is implemented on a Linux-based cluster with multi-core architecture.
The Wang tiling is a classical problem in combinatorics.
A major theoretical question is to find a (small) set of tiles which tiles the plane only aperiodically.
In this case, resulting tilings are rather restrictive.
On the other hand, Wang tiles are used as a tool to generate textures and patterns in computer graphics.
In these applications, a set of tiles is normally chosen so that it tiles the plane or its sub-regions easily in many different ways.
With computer graphics applications in mind, we introduce a class of such tilesets, which we call sequentially permissive tilesets, and consider tiling problems with constrained boundaries.
We apply our methodology to a special set of Wang tiles, called Brick Wang tiles, introduced by Derouet-Jourdan et al. in 2015 to model wall patterns.
We generalise their result by providing a linear algorithm to decide and solve the tiling problem for arbitrary planar regions with holes.
Many real world applications can be framed as multi-objective optimization problems, where we wish to simultaneously optimize for multiple criteria.
Bayesian optimization techniques for the multi-objective setting are pertinent when the evaluation of the functions in question are expensive.
Traditional methods for multi-objective optimization, both Bayesian and otherwise, are aimed at recovering the Pareto front of these objectives.
However, in certain cases a practitioner might desire to identify Pareto optimal points only in a particular region of the Pareto front due to external considerations.
In this work, we propose a strategy based on random scalarizations of the objectives that addresses this problem.
While being computationally similar or cheaper than other approaches, our approach is flexible enough to sample from specified subsets of the Pareto front or the whole of it.
We also introduce a novel notion of regret in the multi-objective setting and show that our strategy achieves sublinear regret.
We experiment with both synthetic and real-life problems, and demonstrate superior performance of our proposed algorithm in terms of flexibility, scalability and regret.
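One acquisition step of the random-scalarization idea can be sketched as below for two objectives: drawing the weight vector from a restricted range steers sampling toward a chosen part of the Pareto front, while the full range recovers it entirely. The candidate-set formulation is a simplification of Bayesian-optimization acquisition.

```python
import random

def scalarized_pick(candidates, objectives, w1_range=(0.0, 1.0), rng=None):
    """Draw a random weight vector (two objectives; the first weight is
    restricted to w1_range to target a Pareto sub-region) and return the
    candidate maximizing the weighted sum of objectives."""
    rng = rng or random.Random()
    w1 = rng.uniform(*w1_range)
    weights = (w1, 1.0 - w1)
    return max(candidates,
               key=lambda x: sum(w * f(x) for w, f in zip(weights, objectives)))
```

Repeating this step with fresh random weights spreads the evaluated points across the targeted region of the front.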
Early diagnosis of pulmonary nodules (PNs) can improve the survival rate of patients and yet is a challenging task for radiologists due to the image noise and artifacts in computed tomography (CT) images.
In this paper, we propose a novel and effective abnormality detector implementing an attention mechanism and group convolution on a 3D single-shot detector (SSD), called group-attention SSD (GA-SSD).
We find that group convolution is effective in extracting rich context information between continuous slices, and attention network can learn the target features automatically.
We collected a large-scale dataset that contained 4146 CT scans with annotations of varying types and sizes of PNs (even PNs smaller than 3mm were annotated).
To the best of our knowledge, this dataset is the largest cohort with relatively complete annotations for PNs detection.
Our experimental results show that the proposed group-attention SSD outperforms the classic SSD framework as well as the state-of-the-art 3DCNN, especially on some challenging lesion types.
To understand a node's centrality in a multiplex network, its centrality values in all the layers of the network can be aggregated.
This requires a normalization of the values, to allow their meaningful comparison and aggregation over networks with different sizes and orders.
The concrete choices of such preprocessing steps like normalization and aggregation are almost never discussed in network analytic papers.
In this paper, we show that even sticking to the most simple centrality index (the degree) but using different, classic choices of normalization and aggregation strategies, can turn a node from being among the most central to being among the least central.
We present our results by using an aggregation operator which scales between different, classic aggregation strategies based on three multiplex networks.
We also introduce a new visualization and characterization of a node's sensitivity to the choice of a normalization and aggregation strategy in multiplex networks.
The observed high sensitivity of single nodes to the specific choice of aggregation and normalization strategies is of great importance, especially for all kinds of intelligence-analytic software, as it calls the interpretation of the findings into question.
Semantic segmentation and object detection research have recently achieved rapid progress.
However, the former task has no notion of different instances of the same object, and the latter operates at a coarse, bounding-box level.
We propose an Instance Segmentation system that produces a segmentation map where each pixel is assigned an object class and instance identity label.
Most approaches adapt object detectors to produce segments instead of boxes.
In contrast, our method is based on an initial semantic segmentation module, which feeds into an instance subnetwork.
This subnetwork uses the initial category-level segmentation, along with cues from the output of an object detector, within an end-to-end CRF to predict instances.
This part of our model is dynamically instantiated to produce a variable number of instances per image.
Our end-to-end approach requires no post-processing and considers the image holistically, instead of processing independent proposals.
Therefore, unlike some related work, a pixel cannot belong to multiple instances.
Furthermore, far more precise segmentations are achieved, as shown by our state-of-the-art results (particularly at high IoU thresholds) on the Pascal VOC and Cityscapes datasets.
This paper deals with the problem of control of partially known nonlinear systems, which have an open-loop stable equilibrium, but we would like to add a PI controller to regulate its behavior around another operating point.
Our main contribution is the identification of a class of systems for which a globally stable PI can be designed knowing only the systems input matrix and measuring only the actuated coordinates.
The construction of the PI is done invoking passivity theory.
The difficulties encountered in the design of adaptive PI controllers with the existing theoretical tools are also discussed.
As an illustration of the theory, we consider port-Hamiltonian systems and a class of thermal processes.
Kernel-based nonlinear mixing models have been applied to unmix spectral information of hyperspectral images when the type of mixing occurring in the scene is too complex or unknown.
Such methods, however, usually require the inversion of matrices of sizes equal to the number of spectral bands.
Reducing the computational load of these methods remains a challenge in large scale applications.
This paper proposes a centralized method for band selection (BS) in the reproducing kernel Hilbert space (RKHS).
It is based upon the coherence criterion, which sets the largest value allowed for correlations between the basis kernel functions characterizing the unmixing model.
We show that the proposed BS approach is equivalent to solving a maximum clique problem (MCP), that is, searching for the biggest complete subgraph in a graph.
Furthermore, we devise a strategy for selecting the coherence threshold and the Gaussian kernel bandwidth using coherence bounds for linearly independent bases.
Simulation results illustrate the efficiency of the proposed method.
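The coherence criterion itself can be sketched with a simple greedy filter: a candidate is kept only if its normalized inner product with every already-selected element stays below the threshold mu0. Note this greedy pass is only an illustration of the criterion; the paper shows that finding the selection is equivalent to a maximum clique problem, which the sketch does not solve.

```python
import math

def coherence_select(vectors, mu0):
    """Greedily keep a candidate only if its coherence (absolute normalized
    inner product) with each already-selected vector is at most mu0."""
    def coherence(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return abs(dot) / (math.hypot(*a) * math.hypot(*b))
    selected = []
    for v in vectors:
        if all(coherence(v, s) <= mu0 for s in selected):
            selected.append(v)
    return selected
```

Lowering mu0 forces the retained basis functions to be closer to mutually orthogonal, shrinking the kernel matrices that must be inverted.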
Learning with recurrent neural networks (RNNs) on long sequences is a notoriously difficult task.
There are three major challenges: 1) complex dependencies, 2) vanishing and exploding gradients, and 3) efficient parallelization.
In this paper, we introduce a simple yet effective RNN connection structure, the DilatedRNN, which simultaneously tackles all of these challenges.
The proposed architecture is characterized by multi-resolution dilated recurrent skip connections and can be combined flexibly with diverse RNN cells.
Moreover, the DilatedRNN reduces the number of parameters needed and enhances training efficiency significantly, while matching state-of-the-art performance (even with standard RNN cells) in tasks involving very long-term dependencies.
To provide a theory-based quantification of the architecture's advantages, we introduce a memory capacity measure, the mean recurrent length, which is more suitable for RNNs with long skip connections than existing measures.
We rigorously prove the advantages of the DilatedRNN over other recurrent neural architectures.
The code for our method is publicly available at https://github.com/code-terminator/DilatedRNN
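The defining recurrence of a dilated layer is small enough to sketch: with dilation d, the state at step t is computed from the state d steps back rather than the previous step, so the layer decomposes into d independent chains. The toy `cell` in the test is a stand-in for any RNN cell.

```python
def dilated_layer(inputs, cell, dilation, h0=0.0):
    """One dilated recurrent layer: h_t = cell(x_t, h_{t-dilation}),
    with h_{t} = h0 for t < 0. A dilation of 1 recovers a standard RNN."""
    hist = [h0] * dilation  # padded initial states
    outputs = []
    for x in inputs:
        h = cell(x, hist[-dilation])
        hist.append(h)
        outputs.append(h)
    return outputs
```

Stacking such layers with exponentially increasing dilations gives the multi-resolution skip connections the abstract describes.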
We propose CAVIA, a meta-learning method for fast adaptation that is scalable, flexible, and easy to implement.
CAVIA partitions the model parameters into two parts: context parameters that serve as additional input to the model and are adapted on individual tasks, and shared parameters that are meta-trained and shared across tasks.
At test time, the context parameters are updated with one or several gradient steps on a task-specific loss that is backpropagated through the shared part of the network.
Compared to approaches that adjust all parameters on a new task (e.g., MAML), CAVIA can be scaled up to larger networks without overfitting on a single task, is easier to implement, and is more robust to the inner-loop learning rate.
We show empirically that CAVIA outperforms MAML on regression, classification, and reinforcement learning problems.
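The test-time adaptation step can be sketched in a few lines, assuming a caller-supplied gradient function (a hypothetical stand-in for backpropagation through the shared network): only the low-dimensional context parameters move, while the shared parameters stay frozen.

```python
def adapt_context(grad_fn, context, shared, lr=0.25, steps=1):
    """CAVIA-style inner loop (sketch): gradient steps on the task loss
    update only the context parameters; the shared parameters are frozen
    and change only in the (not shown) outer meta-training loop."""
    for _ in range(steps):
        g = grad_fn(context, shared)
        context = [c - lr * gi for c, gi in zip(context, g)]
    return context
```

Because the per-task update touches so few parameters, scaling up the shared network does not enlarge the inner-loop optimization problem.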
We introduce a neural reading comprehension model that integrates external commonsense knowledge, encoded as a key-value memory, in a cloze-style setting.
Instead of relying only on document-to-question interaction or discrete features as in prior work, our model attends to relevant external knowledge and combines this knowledge with the context representation before inferring the answer.
This allows the model to draw on knowledge from an external source that is not explicitly stated in the text but is relevant for inferring the answer.
Our model improves results over a very strong baseline on a hard Common Nouns dataset, making it a strong competitor of much more complex models.
By including knowledge explicitly, our model can also provide evidence about the background knowledge used in the RC process.
The web graph is a commonly-used network representation of the hyperlink structure of a website.
A network of similar structure to the web graph, which we call the session graph, has properties that reflect the browsing habits of the agents in the web server logs.
In this paper, we apply session graphs to compare the activity of humans against web robots or crawlers.
Understanding these properties will enable us to improve models of HTTP traffic, which can be used to predict and generate realistic traffic for testing and improving web server efficiency, as well as devising new caching algorithms.
We apply large-scale network properties, such as the connectivity and degree distribution of human and Web robot session graphs in order to identify characteristics of the traffic which would be useful for modeling web traffic and improving cache performance.
We find that the empirical degree distributions of session graphs for human and robot requests on one Web server are best fit by different theoretical distributions, indicating a difference in the processes that generate the traffic.
This paper investigates and bounds the expected solution quality of combinatorial optimization problems when feasible solutions are chosen at random.
Loose general bounds are discovered, as well as families of combinatorial optimization problems for which random feasible solutions are expected to be a constant factor of optimal.
One implication of this result is that, for graphical problems, if the average edge weight in a feasible solution is sufficiently small, then any randomly chosen feasible solution to the problem will be a constant factor of optimal.
For example, under certain well-defined circumstances, the expected constant of approximation of a randomly chosen feasible solution to the Steiner network problem is bounded above by 3.
Empirical analysis supports these bounds and actually suggests that they might be tightened.
Investment in the stock market is increasingly affected by the Internet.
For the purpose of improving the prediction accuracy, we propose a multi-task stock prediction model that not only considers the stock correlations but also supports multi-source data fusion.
Our proposed model first utilizes tensor to integrate the multi-sourced data, including financial Web news, investors' sentiments extracted from the social network and some quantitative data on stocks.
In this way, the intrinsic relationships among different information sources can be captured, and meanwhile, multi-sourced information can be complemented to solve the data sparsity problem.
Secondly, we propose an improved sub-mode coordinate algorithm (SMC).
SMC is based on stock similarity and aims to reduce the variance of the stocks' subspaces in each dimension produced by the tensor decomposition.
The algorithm improves the quality of the input features and thus the prediction accuracy.
The paper then utilizes the Long Short-Term Memory (LSTM) neural network model to predict stock fluctuation trends.
Finally, experiments are conducted on 78 A-share stocks in CSI 100 and thirteen popular HK stocks over the years 2015 and 2016.
The results demonstrate the improvement on the prediction accuracy and the effectiveness of the proposed model.
Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming shared with other sequence generation tasks: errors made early in the generation process are fed back as inputs to the model and can be quickly amplified, harming subsequent sequence generation.
To address this issue, we propose a novel model regularization method for NMT training, which aims to improve the agreement between translations generated by left-to-right (L2R) and right-to-left (R2L) NMT decoders.
This goal is achieved by introducing two Kullback-Leibler divergence regularization terms into the NMT training objective to reduce the mismatch between output probabilities of L2R and R2L models.
In addition, we also employ a joint training strategy to allow L2R and R2L models to improve each other in an interactive update process.
Experimental results show that our proposed method significantly outperforms state-of-the-art baselines on Chinese-English and English-German translation tasks.
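The agreement objective above can be sketched as a symmetric Kullback-Leibler regularizer between the L2R and R2L output distributions; this is a minimal numpy sketch, where the `lam` weight and the per-token distribution shapes are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def agreement_regularizer(p_l2r, p_r2l, lam=0.5):
    # symmetric KL penalty added to the training objective so that the
    # left-to-right and right-to-left decoders agree on output probabilities
    return lam * (kl(p_l2r, p_r2l) + kl(p_r2l, p_l2r))
```

In training, such a term would be computed per target token and added to the cross-entropy losses of both decoders.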
The modeling of speech can be used for speech synthesis and speech recognition.
We present a speech analysis method based on pole-zero modeling of speech with mixed block sparse and Gaussian excitation.
By using a pole-zero model, instead of the all-pole model, a better spectral fitting can be expected.
Moreover, motivated by the block sparse glottal flow excitation during voiced speech and the white noise excitation for unvoiced speech, we model the excitation sequence as a combination of block sparse signals and white noise.
A variational EM (VEM) method is proposed for estimating the posterior PDFs of the block sparse residuals and point estimates of modelling parameters within a sparse Bayesian learning framework.
Experimental results show that, compared to conventional pole-zero and all-pole based methods, the proposed method has lower spectral distortion and good performance in reconstructing the block sparse excitation.
Cloud computing is recognized as one of the most promising solutions in information technology, e.g., for storing and sharing data in a web service sustained by a company or third party instead of storing it on a hard drive or other local devices.
It is essentially a physical storage system which provides large storage of data and faster computing to users over the Internet.
In this cloud system, the third party is allowed to preserve clients' or users' data only for business purposes and for a limited period of time.
Users share data confidentially among themselves and store data virtually to save the cost of physical devices as well as time.
In this paper, we propose a discrete dynamical system for cloud computing and data management of the storage service between a third party and users.
A framework comprising different techniques and procedures for the distribution of storage, and their implementation with users and the third party, is given.
For illustration purposes, the model is considered for two users and a third party, and its dynamical properties are briefly analyzed and discussed.
It is shown that the discrete system exhibits periodic, quasiperiodic and chaotic states.
The latter suggests that a cloud computing system that distributes data and storage between users and the third party may be secured.
Some issues of data security are discussed, and a random replication scheme is proposed to ensure that data loss is greatly reduced compared to existing schemes in the literature.
Leveraging human grasping skills to teach a robot to perform a manipulation task is appealing, but there are several limitations to this approach: time-inefficient data capture procedures, limited generalization of the data to other grasps and objects, and inability to use that data to learn more about how humans perform and evaluate grasps.
This paper presents a data capture protocol that partially addresses these deficiencies by asking participants to specify ranges over which a grasp is valid.
The protocol is verified both qualitatively through online survey questions (where 95.38% of within-range grasps are identified correctly with the nearest extreme grasp) and quantitatively by showing that there is small variation in grasp ranges from different participants as measured by joint angles, contact points, and position.
We demonstrate that these grasp ranges are valid through testing on a physical robot (93.75% of grasps interpolated from grasp ranges are successful).
This paper studies convolutional neural networks (CNN) to learn unsupervised feature representations for 44 different plant species, collected at the Royal Botanic Gardens, Kew, England.
To gain intuition on the chosen features from the CNN model (as opposed to a 'black box' solution), a visualisation technique based on the deconvolutional networks (DN) is utilized.
It is found that venations of different order have been chosen to uniquely represent each of the plant species.
Experimental results using these CNN features with different classifiers show consistency and superiority compared to the state-of-the-art solutions which rely on hand-crafted features.
Today there are many universal compression algorithms, but for specific data it is usually better to use a specialized algorithm: JPEG for images, MPEG for movies, etc.
For textual documents there are special methods based on the PPM algorithm, or methods with non-character access, e.g., word-based compression.
In the past, several papers describing variants of word-based compression using Huffman encoding or LZW method were published.
The subject of this paper is the description of a word-based compression variant based on the LZ77 algorithm.
The LZ77 algorithm and its modifications are described in this paper.
Moreover, various ways of sliding window implementation and various possibilities of output encoding are described, as well.
This paper also includes the implementation of an experimental application, testing of its efficiency and finding the best combination of all parts of the LZ77 coder.
This is done to achieve the best compression ratio.
In conclusion, the implemented application is compared with other word-based compression programs and with commonly used general-purpose compression programs.
Selecting the most appropriate data examples to present a deep neural network (DNN) at different stages of training is an unsolved challenge.
Though practitioners typically ignore this problem, a non-trivial data scheduling method may result in a significant improvement in both convergence and generalization performance.
In this paper, we introduce Self-Paced Learning with Adaptive Deep Visual Embeddings (SPL-ADVisE), a novel end-to-end training protocol that unites self-paced learning (SPL) and deep metric learning (DML).
We leverage the Magnet Loss to train an embedding convolutional neural network (CNN) to learn a salient representation space.
The student CNN classifier dynamically selects similar instance-level training examples to form a mini-batch, where the easiness from the cross-entropy loss and the true diverseness of examples from the learned metric space serve as sample importance priors.
To demonstrate the effectiveness of SPL-ADVisE, we use deep CNN architectures for the task of supervised image classification on several coarse- and fine-grained visual recognition datasets.
Results show that, across all datasets, the proposed method converges faster and reaches a higher final accuracy than other SPL variants, particularly on fine-grained classes.
In this paper, we review notes on using Web map images provided by Web map services, from the viewpoint of the copyright act.
The copyright act aims to contribute to creation of culture by protecting the rights of authors and others, and promoting fair exploitation of cultural products.
Therefore, everyone can use copyrighted materials to the extent of the copyright limitation based on copyright act.
Web map images, including maps, aerial photos and satellite images, are copyrighted materials, so they can be used within the limits of copyright.
However, the range of permitted uses of Web map images under the copyright act is not wide.
In addition, it is pointed out that the copyright act has not been able to follow the progress of digitalization of copyrighted materials.
The copyright act is expected to be revised to correspond to the digitalization of copyrighted works.
We investigate the problem of learning discrete, undirected graphical models in a differentially private way.
We show that the approach of releasing noisy sufficient statistics using the Laplace mechanism achieves a good trade-off between privacy, utility, and practicality.
A naive learning algorithm that uses the noisy sufficient statistics "as is" outperforms general-purpose differentially private learning algorithms.
However, it has three limitations: it ignores knowledge about the data generating process, rests on uncertain theoretical foundations, and exhibits certain pathologies.
We develop a more principled approach that applies the formalism of collective graphical models to perform inference over the true sufficient statistics within an expectation-maximization framework.
We show that this learns better models than competing approaches on both synthetic data and on real human mobility data used as a case study.
Industry-grade database systems are expected to produce the same result if the same query is repeatedly run on the same input.
However, the numerous sources of non-determinism in modern systems make reproducible results difficult to achieve.
This is particularly true if floating-point numbers are involved, where the order of the operations affects the final result.
As part of a larger effort to extend database engines with data representations more suitable for machine learning and scientific applications, in this paper we explore the problem of making relational GroupBy over floating-point formats bit-reproducible, i.e., ensuring any execution of the operator produces the same result up to every single bit.
To that aim, we first propose a numeric data type that can be used as a drop-in replacement for other number formats and is---unlike standard floating-point formats---associative.
We use this data type to make state-of-the-art GroupBy operators reproducible, but this approach incurs a slowdown between 4x and 12x compared to the same operator using conventional database number formats.
We thus explore how to modify existing GroupBy algorithms to make them bit-reproducible and efficient.
By using vectorized summation on batches and carefully balancing batch size, cache footprint, and preprocessing costs, we are able to reduce the slowdown due to reproducibility to a factor between 1.9x and 2.4x of aggregation in isolation and to a mere 2.7% of end-to-end query performance even on aggregation-intensive queries in MonetDB.
We thereby provide a solid basis for supporting more reproducible operations directly in relational engines.
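The role of associativity can be illustrated with a toy fixed-point accumulator in Python; the scale factor is an arbitrary assumption and this is far simpler than the drop-in number format proposed above, but it shows why exact integer (associative) addition makes the result independent of summation order:

```python
SCALE = 2 ** 30  # fixed-point scale; an illustrative choice, not the paper's format

def to_fixed(x: float) -> int:
    # quantize a float onto the fixed-point grid
    return int(round(x * SCALE))

def reproducible_sum(values) -> float:
    # Python integers are exact, so integer addition is associative:
    # any ordering or grouping (e.g., vectorized batches) yields the
    # same bits, unlike IEEE floating-point addition.
    acc = 0
    for v in values:
        acc += to_fixed(v)
    return acc / SCALE
```

Any batching of the inner loop (as in the vectorized summation discussed above) produces the identical accumulator, which is exactly the property that conventional floating-point aggregation lacks.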
This document is an extended version of an article currently in print for the proceedings of ICDE'18 with the same title and by the same authors.
The main additions are more implementation details and experiments.
In this paper we study the long-standing open question regarding the computational complexity of one of the core problems in supply chains management, the periodic joint replenishment problem.
This problem has received a lot of attention over the years and many heuristic and approximation algorithms were suggested.
However, in spite of the vast effort, the complexity of the problem remained unresolved.
In this paper, we provide a proof that the problem is indeed strongly NP-hard.
Gaze behavior is an important non-verbal cue in social signal processing and human-computer interaction.
In this paper, we tackle the problem of person- and head pose-independent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN).
We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images.
Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame.
Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on EYEDIAP dataset, further improved by 4% when the temporal modality is included.
A canonical scenario in Machine-Type Communications (MTC) is the one featuring a large number of devices, each of them with sporadic traffic.
Hence, the number of served devices in a single LTE cell is not determined by the available aggregate rate, but rather by the limitations of the LTE access reservation protocol.
Specifically, the limited number of contention preambles and the limited amount of uplink grants per random access response are crucial to consider when dimensioning LTE networks for MTC.
We propose a low-complexity model of LTE's access reservation protocol that encompasses these two limitations and allows us to evaluate the outage probability at click-speed.
The model is based chiefly on closed-form expressions, except for the part with the feedback impact of retransmissions, which is determined by solving a fixed point equation.
Our model overcomes the incompleteness of the existing models that are focusing solely on the preamble collisions.
A comparison with the simulated LTE access reservation procedure that follows the 3GPP specifications confirms that our model provides an accurate estimation of the system outage event and the number of supported MTC devices.
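The retransmission feedback mentioned above is resolved by iterating a fixed-point equation; a generic sketch follows, where the load map `f` is a hypothetical stand-in (collision probability rises with retransmission-amplified load over `M` preambles), not the paper's derived expression:

```python
import math

def solve_fixed_point(f, x0=0.0, tol=1e-9, max_iter=10000):
    """Iterate x <- f(x) until successive values agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = f(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# hypothetical example: collision probability p when arrivals of rate lam,
# amplified by a retransmission factor (1 + 2p), contend for M preambles
lam, M = 10.0, 54
p = solve_fixed_point(lambda p: 1.0 - math.exp(-lam * (1.0 + 2.0 * p) / M))
```

Because the map is a contraction for moderate loads, plain iteration converges in a handful of steps, which is what allows such models to be evaluated at click-speed.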
We present a new dataset and models for comprehending paragraphs about processes (e.g., photosynthesis), an important genre of text describing a dynamic world.
The new dataset, ProPara, is the first to contain natural (rather than machine-generated) text about a changing world along with a full annotation of entity states (location and existence) during those changes (81k datapoints).
The end-task, tracking the location and existence of entities through the text, is challenging because the causal effects of actions are often implicit and need to be inferred.
We find that previous models that have worked well on synthetic data achieve only mediocre performance on ProPara, and introduce two new neural models that exploit alternative mechanisms for state prediction, in particular using LSTM input encoding and span prediction.
The new models improve accuracy by up to 19%.
The dataset and models are available to the community at http://data.allenai.org/propara.
Satirical news is considered to be entertainment, but it is potentially deceptive and harmful.
Although the genre is embedded in the article, not everyone can recognize the satirical cues, and some readers may therefore believe the news to be true.
We observe that satirical cues are often reflected in certain paragraphs rather than the whole document.
Existing works only consider document-level features to detect the satire, which could be limited.
We consider paragraph-level linguistic features to unveil the satire by incorporating a neural network and an attention mechanism.
We investigate the difference between paragraph-level features and document-level features, and analyze them on a large satirical news dataset.
The evaluation shows that the proposed model detects satirical news effectively and reveals what features are important at which level.
Image similarity involves fetching similar looking images given a reference image.
Our solution, called SimNet, is a deep Siamese network which is trained on pairs of positive and negative images using a novel online pair mining strategy inspired by curriculum learning.
We also created a multi-scale CNN, where the final image embedding is a joint representation of top as well as lower layer embeddings.
We go on to show that this multi-scale Siamese network is better at capturing fine-grained image similarities than traditional CNNs.
Internet Threat Monitoring (ITM) is a globally scoped Internet monitoring system whose goal is to measure, detect, characterize, and track threats such as distributed denial of service (DDoS) attacks and worms.
To blind such monitoring, attackers target the ITM system itself.
In this paper we address flooding attacks against the ITM system, in which the attacker attempts to exhaust the network's and the ITM's resources, such as network bandwidth, computing power, or operating system data structures, by sending malicious traffic.
We propose an information-theoretic framework that models flooding attacks launched by a botnet against the ITM.
Based on this model, we generalize the flooding attacks and propose effective attack detection using honeypots.
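To give a flavor of the information-theoretic view: flooding traffic from a botnet typically skews the distribution of observed traffic features, which an entropy measure can expose. This is a generic sketch with an illustrative threshold, not the specific model of the paper:

```python
import math
from collections import Counter

def entropy(samples):
    """Shannon entropy (bits) of an observed feature, e.g. source addresses."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Hypothetical detector: flag a traffic window whose entropy deviates
# strongly from a learned baseline (threshold chosen for illustration only).
def is_anomalous(window, baseline_bits, threshold=2.0):
    return abs(entropy(window) - baseline_bits) > threshold
```

A flood dominated by a few bot-controlled sources concentrates the distribution and drops the entropy well below the baseline of benign traffic.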
This paper introduces TakeFive, a new semantic role labeling method that transforms a text into a frame-oriented knowledge graph.
It performs dependency parsing, identifies the words that evoke lexical frames, locates the roles and fillers for each frame, runs coercion techniques, and formalises the results as a knowledge graph.
This formal representation complies with the frame semantics used in Framester, a factual-linguistic linked data resource.
The obtained precision, recall and F1 values indicate that TakeFive is competitive with other existing methods such as SEMAFOR, Pikes, PathLSTM and FRED.
We finally discuss how to combine TakeFive and FRED, obtaining higher values of precision, recall and F1.
The ability of intelligent agents to play games in human-like fashion is popularly considered a benchmark of progress in Artificial Intelligence.
Similarly, performance on multi-disciplinary tasks such as Visual Question Answering (VQA) is considered a marker for gauging progress in Computer Vision.
In our work, we bring games and VQA together.
Specifically, we introduce the first computational model aimed at Pictionary, the popular word-guessing social game.
We first introduce Sketch-QA, an elementary version of Visual Question Answering task.
Styled after Pictionary, Sketch-QA uses incrementally accumulated sketch stroke sequences as visual data.
Notably, Sketch-QA involves asking a fixed question ("What object is being drawn?") and gathering open-ended guess-words from human guessers.
We analyze the resulting dataset and present many interesting findings therein.
To mimic Pictionary-style guessing, we subsequently propose a deep neural model which generates guess-words in response to temporally evolving human-drawn sketches.
Our model even makes human-like mistakes while guessing, thus amplifying the human mimicry factor.
We evaluate our model on the large-scale guess-word dataset generated via Sketch-QA task and compare with various baselines.
We also conduct a Visual Turing Test to obtain human impressions of the guess-words generated by humans and our model.
Experimental results demonstrate the promise of our approach for Pictionary and similarly themed games.
Deep reinforcement learning (DRL) has shown incredible performance in learning various tasks to the human level.
However, unlike human perception, current DRL models connect the entire low-level sensory input to the state-action values rather than exploiting the relationship between and among entities that constitute the sensory input.
Because of this difference, DRL needs a vast amount of experience samples to learn.
In this paper, we propose a Multi-focus Attention Network (MANet) which mimics human ability to spatially abstract the low-level sensory input into multiple entities and attend to them simultaneously.
The proposed method first divides the low-level input into several segments which we refer to as partial states.
After this segmentation, parallel attention layers attend to the partial states relevant to solving the task.
Our model estimates state-action values using these attended partial states.
In our experiments, MANet attains the highest scores with significantly fewer experience samples.
Additionally, the model shows higher performance compared to the Deep Q-network and the single attention model as benchmarks.
Furthermore, we extend our model to attentive communication model for performing multi-agent cooperative tasks.
In multi-agent cooperative task experiments, our model shows 20% faster learning than the existing state-of-the-art model.
Understanding when and how computational complexity can be used to protect elections against different manipulative actions has been a highly active research area over the past two decades.
A recent body of work, however, has shown that many of the NP-hardness shields, previously obtained, vanish when the electorate has single-peaked or nearly single-peaked preferences.
In light of these results, we investigate whether it is possible to reimpose NP-hardness shields for such electorates by allowing the voters to specify partial preferences instead of insisting they cast complete ballots.
In particular, we show that in single-peaked and nearly single-peaked electorates, if voters are allowed to submit top-truncated ballots, then the complexity of manipulation and bribery for many voting rules increases from being in P to being NP-complete.
In this paper a Metaheuristic approach for solving the N-Queens Problem is introduced to find the best possible solution in a reasonable amount of time.
A Genetic Algorithm with a novel fitness function is used as the metaheuristic.
The aim of N-Queens Problem is to place N queens on an N x N chessboard, in a way so that no queen is in conflict with the others.
Chromosome representation and genetic operations like Mutation and Crossover are described in detail.
Results show that this approach yields promising and satisfactory solutions in less time than previous approaches for several large values of N.
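A minimal permutation-encoded GA for N-Queens can be sketched as follows; the fitness (number of attacking pairs), elitism size, and operators are common textbook choices for illustration, not necessarily the novel fitness function of the paper:

```python
import random

def conflicts(perm):
    # number of attacking queen pairs; the permutation encoding already
    # rules out row/column conflicts, so only diagonals remain
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if abs(perm[i] - perm[j]) == j - i)

def solve_nqueens(n, pop_size=100, generations=2000, seed=0):
    rng = random.Random(seed)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=conflicts)
        if conflicts(pop[0]) == 0:
            return pop[0]
        nxt = pop[:10]  # elitism: carry the best individuals forward
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[:50], 2)                  # select fit parents
            cut = rng.randrange(1, n)
            head = a[:cut]
            child = head + [g for g in b if g not in head]  # order crossover
            i, j = rng.sample(range(n), 2)                  # swap mutation keeps
            child[i], child[j] = child[j], child[i]         # it a permutation
            nxt.append(child)
        pop = nxt
    return min(pop, key=conflicts)
```

Encoding each chromosome as a permutation of columns eliminates row and column conflicts by construction, so fitness only has to count diagonal attacks.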
Word sense induction (WSI), which addresses polysemy by unsupervised discovery of multiple word senses, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable.
This paper proposes an accurate and efficient graph-based method for WSI that builds a global non-negative vector embedding basis (which are interpretable like topics) and clusters the basis indexes in the ego network of each polysemous word.
By adopting distributional inclusion vector embeddings as our basis formation model, we avoid the expensive step of nearest neighbor search that plagues other graph-based methods without sacrificing the quality of sense clusters.
Experiments on three datasets show that our proposed method produces similar or better sense clusters and embeddings compared with previous state-of-the-art methods while being significantly more efficient.
Statistical shape models (SSMs) represent a class of shapes as a normal distribution of point variations, whose parameters are estimated from example shapes.
Principal component analysis (PCA) is applied to obtain a low-dimensional representation of the shape variation in terms of the leading principal components.
In this paper, we propose a generalization of SSMs, called Gaussian Process Morphable Models (GPMMs).
We model the shape variations with a Gaussian process, which we represent using the leading components of its Karhunen-Loeve expansion.
To compute the expansion, we make use of an approximation scheme based on the Nystrom method.
The resulting model can be seen as a continuous analogue of an SSM.
However, while for SSMs the shape variation is restricted to the span of the example data, with GPMMs we can define the shape variation using any Gaussian process.
For example, we can build shape models that correspond to classical spline models, and thus do not require any example data.
Furthermore, Gaussian processes make it possible to combine different models.
For example, an SSM can be extended with a spline model, to obtain a model that incorporates learned shape characteristics, but is flexible enough to explain shapes that cannot be represented by the SSM.
We introduce a simple algorithm for fitting a GPMM to a surface or image.
This results in a non-rigid registration approach, whose regularization properties are defined by a GPMM.
We show how we can obtain different registration schemes, including methods for multi-scale, spatially-varying or hybrid registration, by constructing an appropriate GPMM.
As our approach strictly separates modelling from the fitting process, this is all achieved without changes to the fitting algorithm.
We show the applicability and versatility of GPMMs on a clinical use case, where the goal is the model-based segmentation of 3D forearm images.
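The low-rank construction described above can be sketched in a few lines: truncate the eigendecomposition of a kernel (covariance) matrix over a set of points and draw random deformations from the resulting finite-rank Gaussian process. Here a dense eigendecomposition stands in for the Nystrom-approximated Karhunen-Loeve expansion, and the squared-exponential kernel is one illustrative choice:

```python
import numpy as np

def gpmm_samples(points, kernel, rank, n_samples=3, seed=0):
    rng = np.random.default_rng(seed)
    # covariance matrix of the Gaussian process at the model points
    K = np.array([[kernel(p, q) for q in points] for p in points])
    vals, vecs = np.linalg.eigh(K)                 # ascending eigenvalues
    vals, vecs = vals[::-1][:rank], vecs[:, ::-1][:, :rank]
    coeffs = rng.standard_normal((rank, n_samples))
    # each column is one random deformation from the rank-limited model
    return vecs @ (np.sqrt(np.clip(vals, 0.0, None))[:, None] * coeffs)

# e.g. a smooth 1-D deformation model requiring no example shapes at all
pts = np.linspace(0.0, 1.0, 10)
deformations = gpmm_samples(pts, lambda p, q: np.exp(-(p - q) ** 2), rank=3)
```

Swapping the kernel for an empirical covariance of example shapes recovers an SSM, while sums of kernels combine learned and spline-like variation, as described above.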
Construction frequently appears at the bottom of productivity charts with decreasing indexes of productivity over the years.
Lack of innovation and delayed adoption, informal processes or insufficient rigor and consistency in process execution, insufficient knowledge transfer from project to project, weak project monitoring, little cross-functional cooperation, little collaboration with suppliers, conservative company culture, and a shortage of young talent and people development are usual issues.
Although work has been carried out on information technology and automation in construction, their application remains isolated, without an interconnected information flow.
This paper suggests a framework to address production issues in construction by implementing integrated automatic supervisory control and data acquisition for management and operations.
The system is divided into planning, monitoring, controlling, and executing groups, clustering technologies to track both the project product and its production.
This research stands on the four pillars of manufacturing knowledge and lean production (production processes, production management, equipment/tool design, and automated systems and control).
The framework offers benefits such as increased information flow, detection and prevention of overburdening equipment or labor (Muri) and production unevenness (Mura), reduction of waste (Muda), evidential and continuous process standardization and improvement, reuse and abstraction of project information across endeavors.
In the C-V2X sidelink Mode 4 communication, the sensing-based semi-persistent scheduling (SPS) implements a message collision avoidance algorithm to cope with the undesirable effects of wireless channel congestion.
Still, the current standard mechanism produces a high number of packet collisions, which may hinder the high-reliability communications required in future C-V2X applications such as autonomous driving.
In this paper, we show that by drastically reducing the uncertainties in the choice of the resource to use for SPS, we can significantly reduce the message collisions in the C-V2X sidelink Mode 4.
Specifically, we propose the use of the "lookahead," which contains the next starting resource location in the time-frequency plane.
By exchanging the lookahead information piggybacked on the periodic safety message, vehicular user equipments (UEs) can eliminate most message collisions arising from the ignorance of other UEs' internal decisions.
Although the proposed scheme would require the inclusion of the lookahead in the control part of the packet, the benefit may outweigh the bandwidth cost, considering the stringent reliability requirement in future C-V2X applications.
Overlapping of cervical cells and poor contrast of cell cytoplasm are the major issues in accurate detection and segmentation of cervical cells.
An unsupervised cell segmentation approach is presented here.
Cell clump segmentation is carried out using the extended depth of field (EDF) image created from the images of different focal planes.
A modified Otsu method with prior class weights is proposed for accurate segmentation of nuclei from the cell clumps.
The cell cytoplasm is then segmented from the cell clump depending upon the number of nuclei detected in that clump.
A level set model is used for cytoplasm segmentation.
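A sketch of Otsu's method extended with class priors, in the spirit of the modified method above; the specific way the prior weights `w_bg`, `w_fg` reweight the class probabilities is an assumption for illustration, not necessarily the paper's exact formulation:

```python
import numpy as np

def otsu_with_priors(hist, w_bg=1.0, w_fg=1.0):
    """Return the threshold bin maximizing prior-weighted between-class variance."""
    hist = hist.astype(float) / hist.sum()
    bins = np.arange(len(hist))
    best_t, best_score = 0, -1.0
    for t in range(1, len(hist)):
        p0, p1 = hist[:t].sum(), hist[t:].sum()
        if p0 == 0.0 or p1 == 0.0:
            continue
        m0 = (bins[:t] * hist[:t]).sum() / p0   # background (cytoplasm) mean
        m1 = (bins[t:] * hist[t:]).sum() / p1   # foreground (nuclei) mean
        # reweight the class probabilities by the priors and renormalize
        q0 = w_bg * p0 / (w_bg * p0 + w_fg * p1)
        score = q0 * (1.0 - q0) * (m0 - m1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

With `w_bg = w_fg = 1` this reduces to standard Otsu thresholding; unequal priors shift the threshold when one class, such as nuclei inside a clump, is expected to occupy far fewer pixels.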
A cellular automaton is a parallel synchronous computing model, which consists in a juxtaposition of finite automata whose state evolves according to that of their neighbors.
It induces a dynamical system on the set of configurations, i.e. the infinite sequences of cell states.
The limit set of the cellular automaton is the set of configurations which can be reached arbitrarily late in the evolution.
In this paper, we prove that all properties of limit sets of cellular automata with binary-state cells are undecidable, except surjectivity.
This is a refinement of the classical "Rice Theorem" that Kari proved on cellular automata with arbitrary state sets.
In ontology-based data access (OBDA), users are provided with a conceptual view of a (relational) data source that abstracts away details about data storage.
This conceptual view is realized through an ontology that is connected to the data source through declarative mappings, and query answering is carried out by translating the user queries over the conceptual view into SQL queries over the data source.
Standard translation techniques in OBDA try to transform the user query into a union of conjunctive queries (UCQ), following the heuristic argument that UCQs can be efficiently evaluated by modern relational database engines.
In this work, we show that translating to UCQs is not always the best choice, and that, under certain conditions on the interplay between the ontology, the mappings, and the statistics of the data, alternative translations can be evaluated much more efficiently.
To find the best translation, we devise a cost model together with a novel cardinality estimation that takes into account all such OBDA components.
Our experiments confirm that (i) alternatives to the UCQ translation might produce queries that are orders of magnitude more efficient, and (ii) the cost model we propose is faithful to the actual query evaluation cost, and hence is well suited to select the best translation.
While state-of-the-art kernels for graphs with discrete labels scale well to graphs with thousands of nodes, the few existing kernels for graphs with continuous attributes, unfortunately, do not scale well.
To overcome this limitation, we present hash graph kernels, a general framework to derive kernels for graphs with continuous attributes from discrete ones.
The idea is to iteratively turn continuous attributes into discrete labels using randomized hash functions.
We illustrate hash graph kernels for the Weisfeiler-Lehman subtree kernel and for the shortest-path kernel.
The resulting novel graph kernels are shown to be, both, able to handle graphs with continuous attributes and scalable to large graphs and data sets.
This is supported by our theoretical analysis and demonstrated by an extensive experimental evaluation.
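The core idea above can be sketched with one simple randomized hash family: discretize continuous attributes on a randomly shifted grid and average a discrete kernel over several hash draws. The grid width and the plain label-histogram base kernel are illustrative assumptions; the framework instead feeds such labels into discrete kernels like Weisfeiler-Lehman subtree or shortest-path.

```python
import random
from collections import Counter

def grid_hash(attrs, width, shift):
    # discretize continuous node attributes on a randomly shifted grid
    return [int((a + shift) // width) for a in attrs]

def hashed_label_kernel(attrs_g, attrs_h, width=0.5, iters=20, seed=0):
    # average a simple label-histogram kernel over random hash draws
    rng = random.Random(seed)
    total = 0.0
    for _ in range(iters):
        shift = rng.uniform(0.0, width)
        cg = Counter(grid_hash(attrs_g, width, shift))
        ch = Counter(grid_hash(attrs_h, width, shift))
        total += sum(cg[label] * ch[label] for label in cg)
    return total / iters
```

The random shift is what makes the discretization well-behaved in expectation: attributes that are close land in the same bucket in most draws, while distant attributes almost never do.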
Moving Object Segmentation is a challenging task for jittery/wobbly videos.
For jittery videos, the non-smooth camera motion makes discrimination between foreground objects and background layers hard to solve.
While most recent works for moving video object segmentation fail in this scenario, our method generates an accurate segmentation of a single moving object.
The proposed method performs a sparse segmentation, where frame-wise labels are assigned only to trajectory coordinates, followed by the pixel-wise labeling of frames.
The sparse segmentation involves stabilization and clustering of trajectories in a 3-stage iterative process.
At the 1st stage, the trajectories are clustered using pairwise Procrustes distance as a cue for creating an affinity matrix.
The 2nd stage performs a block-wise Procrustes analysis of the trajectories and estimates Frechet means (in Kendall's shape space) of the clusters.
The Frechet means represent the average trajectories of the motion clusters.
An optimization function has been formulated to stabilize the Frechet means, yielding stabilized trajectories at the 3rd stage.
The accuracy of the motion clusters is iteratively refined, producing distinct groups of stabilized trajectories.
Next, the labels obtained from the sparse segmentation are propagated for pixel-wise labeling of the frames, using a GraphCut based energy formulation.
Use of Procrustes analysis and energy minimization in Kendall's shape space for moving object segmentation in jittery videos, is the novelty of this work.
The second contribution comes from experiments performed on a dataset of 20 real-world natural jittery videos, with manually annotated ground truth.
Experiments are done with controlled levels of artificial jitter on videos of SegTrack2 dataset.
Qualitative and quantitative results indicate the superiority of the proposed method.
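The pairwise Procrustes distance used as the clustering cue above has a closed form for planar trajectories when points are treated as complex numbers; a minimal sketch, assuming equal-length 2D trajectories (function name illustrative):

```python
import math

def procrustes_distance(traj_a, traj_b):
    """Full Procrustes distance between two planar trajectories of equal
    length, treating each (x, y) point as a complex number.  Translation,
    scale and rotation are factored out in closed form."""
    za = [complex(x, y) for x, y in traj_a]
    zb = [complex(x, y) for x, y in traj_b]
    # remove translation: centre both configurations
    ca = sum(za) / len(za)
    cb = sum(zb) / len(zb)
    za = [z - ca for z in za]
    zb = [z - cb for z in zb]
    # closed-form optimal rotation/scale via the complex inner product
    inner = abs(sum(a * b.conjugate() for a, b in zip(za, zb)))
    norm = math.sqrt(sum(abs(a) ** 2 for a in za) *
                     sum(abs(b) ** 2 for b in zb))
    # distance in [0, 1]: 0 for identical shapes up to a similarity transform
    return math.sqrt(max(0.0, 1.0 - (inner / norm) ** 2))
```

Because the distance ignores translation, rotation and scale, trajectories of the same motion filmed under camera jitter still come out close, which is what makes it usable as an affinity cue.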
This paper introduces a new method to solve the cross-domain recognition problem.
Different from traditional domain adaptation methods, which rely on a global domain shift for all classes between the source and target domains, the proposed method is more flexible in capturing individual class variations across domains.
By adopting a natural and widely used assumption -- "the data samples from the same class should lie on a low-dimensional subspace, even if they come from different domains" -- the proposed method circumvents the limitation of the global domain shift, and solves cross-domain recognition by finding compact joint subspaces of the source and target domains.
Specifically, given labeled samples in source domain, we construct subspaces for each of the classes.
Then we construct subspaces in the target domain, called anchor subspaces, by collecting unlabeled samples that are close to each other and highly likely all fall into the same class.
The corresponding class label is then assigned by minimizing a cost function which reflects the overlap and topological structure consistency between subspaces across the source and target domains, and within anchor subspaces, respectively. We further combine the anchor subspaces with the corresponding source subspaces to construct the compact joint subspaces.
Subsequently, one-vs-rest SVM classifiers are trained in the compact joint subspaces and applied to unlabeled data in the target domain.
We evaluate the proposed method on two widely used datasets: object recognition dataset for computer vision tasks, and sentiment classification dataset for natural language processing tasks.
Comparison results demonstrate that the proposed method outperforms the comparison methods on both datasets.
Energy management of plug-in Hybrid Electric Vehicles (HEVs) poses different challenges from that of non-plug-in HEVs, due to larger batteries and grid recharging.
Instead of tackling it by pursuing energy efficiency, we propose an approach that minimizes the driving cost incurred by the user - the combined cost of fuel, grid energy and battery degradation.
A real-time approximation of the resulting optimal policy is then provided, as well as some analytic insight into its dependence on the system parameters.
The advantages of the proposed formulation and the effectiveness of the real-time strategy are shown by means of a thorough simulation campaign.
Theory of Mind is the ability to attribute mental states (beliefs, intents, knowledge, perspectives, etc.) to others and recognize that these mental states may differ from one's own.
Theory of Mind is critical to effective communication and to teams demonstrating higher collective performance.
To effectively leverage the progress in Artificial Intelligence (AI) to make our lives more productive, it is important for humans and AI to work well together in a team.
Traditionally, there has been much emphasis on research to make AI more accurate, and (to a lesser extent) on having it better understand human intentions, tendencies, beliefs, and contexts.
The latter involves making AI more human-like and having it develop a theory of our minds.
In this work, we argue that for human-AI teams to be effective, humans must also develop a theory of AI's mind (ToAIM) - get to know its strengths, weaknesses, beliefs, and quirks.
We instantiate these ideas within the domain of Visual Question Answering (VQA).
We find that using just a few examples (50), lay people can be trained to better predict responses and oncoming failures of a complex VQA model.
We further evaluate the role existing explanation (or interpretability) modalities play in helping humans build ToAIM.
Explainable AI has received considerable scientific and popular attention in recent times.
Surprisingly, we find that having access to the model's internal states - its confidence in its top-k predictions, explicit or implicit attention maps which highlight regions in the image (and words in the question) the model is looking at (and listening to) while answering a question about an image - do not help people better predict its behavior.
Clinical measurements that can be represented as time series constitute an important fraction of the electronic health records and are often both uncertain and incomplete.
Recurrent neural networks are a special class of neural networks that are particularly suitable to process time series data but, in their original formulation, cannot explicitly deal with missing data.
In this paper, we explore imputation strategies for handling missing values in classifiers based on recurrent neural network (RNN) and apply a recently proposed recurrent architecture, the Gated Recurrent Unit with Decay, specifically designed to handle missing data.
We focus on the problem of detecting surgical site infection in patients by analyzing time series of their blood sample measurements and we compare the results obtained with different RNN-based classifiers.
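The decay idea behind the Gated Recurrent Unit with Decay can be sketched for a single variable: a missing value is replaced by a mixture of the last observation and the empirical mean, with trust in the last observation decaying over time. All names and the scalar decay parameter below are illustrative assumptions, not the cited architecture:

```python
import math

def decay_impute(values, mask, mean, w=0.5):
    """Impute a univariate series.  values[t] is the measurement (ignored
    where mask[t] is 0), mean is the empirical mean of the variable, and
    w controls how fast trust in the last observation decays."""
    out = []
    last, delta = mean, 0.0  # before any observation, fall back to the mean
    for x, m in zip(values, mask):
        if m:
            out.append(x)
            last, delta = x, 0.0
        else:
            delta += 1.0
            gamma = math.exp(-max(0.0, w * delta))  # decay weight in (0, 1]
            out.append(gamma * last + (1.0 - gamma) * mean)
    return out
```

In the full architecture the decay rate is learned per variable rather than fixed, so the network itself decides how quickly stale measurements stop being informative.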
Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research.
In recent years, deep neural networks have achieved significant success in named entity recognition and many other Natural Language Processing (NLP) tasks.
Most of these algorithms are trained end to end, and can automatically learn features from large scale labeled datasets.
However, these data-driven methods typically lack the capability of processing rare or unseen entities.
Previous statistical methods and feature engineering practice have demonstrated that human knowledge can provide valuable information for handling rare and unseen cases.
In this paper, we address the problem by incorporating dictionaries into deep neural networks for the Chinese CNER task.
Two different architectures that extend the Bi-directional Long Short-Term Memory (Bi-LSTM) neural network and five different feature representation schemes are proposed to handle the task.
Computational results on the CCKS-2017 Task 2 benchmark dataset show that the proposed method achieves highly competitive performance compared with state-of-the-art deep learning methods.
Increasingly stringent performance requirements for motion control necessitate the use of increasingly detailed models of the system behavior.
Motion systems inherently move, therefore, spatio-temporal models of the flexible dynamics are essential.
In this paper, a two-step approach for the identification of the spatio-temporal behavior of mechanical systems is developed and applied to a prototype industrial wafer stage with a lightweight design for fast and highly accurate positioning.
The proposed approach exploits a modal modeling framework and combines recently developed powerful linear time invariant (LTI) identification tools with a spline-based mode-shape interpolation approach to estimate the spatial system behavior.
The experimental results for the wafer stage application confirm the suitability of the proposed approach for the identification of complex position-dependent mechanical systems, and show the pivotal role of the obtained models for improved motion control performance.
As more and more personal photos are shared and tagged in social media, avoiding privacy risks such as unintended recognition becomes increasingly challenging.
We propose a new hybrid approach to obfuscate identities in photos by head replacement.
Our approach combines state-of-the-art parametric face synthesis with the latest advances in Generative Adversarial Networks (GANs) for data-driven image synthesis.
On the one hand, the parametric part of our method gives us control over the facial parameters and allows for explicit manipulation of the identity.
On the other hand, the data-driven aspects allow for adding fine details and overall realism as well as seamless blending into the scene context.
In our experiments, we show highly realistic output of our system that improves over the previous state of the art in obfuscation rate while preserving a higher similarity to the original image content.
With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research.
They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial analysis.
Formally speaking, given a set of data instances, a clustering algorithm is expected to divide the set of data instances into the subsets which maximize the intra-subset similarity and inter-subset dissimilarity, where a similarity measure is defined beforehand.
In this work, state-of-the-art clustering algorithms are reviewed from design concept to methodology, and different clustering paradigms are discussed.
Advanced clustering algorithms are also discussed.
After that, the existing clustering evaluation metrics are reviewed.
A summary with future insights is provided at the end.
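The intra-similarity / inter-dissimilarity goal stated above can be made concrete with a small scoring sketch; using negative Euclidean distance as the similarity measure is one illustrative choice among many:

```python
import itertools
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def clustering_score(clusters):
    """Mean inter-cluster distance minus mean intra-cluster distance;
    higher is better under the intra-similarity / inter-dissimilarity goal."""
    intra = [euclid(a, b)
             for c in clusters for a, b in itertools.combinations(c, 2)]
    inter = [euclid(a, b)
             for c1, c2 in itertools.combinations(clusters, 2)
             for a in c1 for b in c2]
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(inter) - mean(intra)
```

A clustering that groups nearby points together scores higher than one that splits them across clusters, which is exactly the objective the surveyed algorithms approximate in different ways.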
Amidst growing concern over media manipulation, NLP attention has focused on overt strategies like censorship and "fake news".
Here, we draw on two concepts from the political science literature to explore subtler strategies for government media manipulation: agenda-setting (selecting what topics to cover) and framing (deciding how topics are covered).
We analyze 13 years (100K articles) of the Russian newspaper Izvestia and identify a strategy of distraction: articles mention the U.S. more frequently in the month directly following an economic downturn in Russia.
We introduce embedding-based methods for cross-lingually projecting English frames to Russian, and discover that these articles emphasize U.S. moral failings and threats to the U.S. Our work offers new ways to identify subtle media manipulation strategies at the intersection of agenda-setting and framing.
Ventricular Fibrillation (VF), one of the most dangerous arrhythmias, is responsible for sudden cardiac arrests.
Thus, various algorithms have been developed to predict VF from Electrocardiogram (ECG), which is a binary classification problem.
In the literature, we find a number of algorithms based on signal processing, where, after some robust mathematical operations, the decision is made based on a predefined threshold over a single value.
On the other hand, some machine learning based algorithms are also reported in the literature; however, these algorithms merely combine some parameters and make a prediction using those as features.
Both the approaches have their perks and pitfalls; thus our motivation was to coalesce them to get the best out of the both worlds.
Hence, we have developed VFPred, which, in addition to employing a signal processing pipeline (namely, Empirical Mode Decomposition and the Discrete Time Fourier Transform) for useful feature extraction, uses a Support Vector Machine for efficient classification.
VFPred turns out to be a robust algorithm, as it is able to successfully segregate the two classes with equal confidence (Sensitivity = 99.99%, Specificity = 98.40%) even from a short signal of 5 seconds, whereas existing works, though requiring longer signals, flourish in one metric but fail in the other.
A mobile ad hoc network (MANET) is a non-centralised, multihop, wireless network that lacks a common infrastructure and hence it needs self-organisation.
The biggest challenge in MANETs is to find a path between communicating nodes, which is the MANET routing problem.
Biology-inspired techniques such as ant colony optimisation (ACO), which have proven to be very adaptable in other problem domains, have been applied to the MANET routing problem, to which they form a good fit.
The general characteristics of these biological systems, which include their capability for self-organisation, self-healing and local decision making, make them suitable for routing in MANETs.
In this paper, we discuss a few ACO based protocols, namely AntNet, hybrid ACO (AntHocNet), ACO based routing algorithm (ARA), imProved ant colony optimisation routing algorithm for mobile ad hoc NETworks (PACONET), ACO based on demand distance vector (Ant-AODV) and ACO based dynamic source routing (Ant-DSR), and determine their performance in terms of quality of service (QoS) parameters, such as end-to-end delay and packet delivery ratio, using Network Simulator 2 (NS2).
We also compare them with well known protocols, ad hoc on demand distance vector (AODV) and dynamic source routing (DSR), based on the random waypoint mobility model.
The simulation results show how this biology-inspired approach helps in improving QoS parameters.
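The mechanism these ACO protocols share, probabilistic next-hop selection biased by pheromone together with evaporation and reinforcement, can be sketched generically (the parameters and names below are illustrative, not those of any specific protocol):

```python
import random

def ant_next_hop(pheromone, neighbors, rng, alpha=1.0):
    """Pick a next hop with probability proportional to pheromone^alpha."""
    weights = [pheromone[n] ** alpha for n in neighbors]
    total = sum(weights)
    r, acc = rng.uniform(0.0, total), 0.0
    for n, w in zip(neighbors, weights):
        acc += w
        if r <= acc:
            return n
    return neighbors[-1]

def update_pheromone(pheromone, path, deposit=1.0, evaporation=0.1):
    """Evaporate everywhere, then reinforce the links used by the ant."""
    for n in pheromone:
        pheromone[n] *= (1.0 - evaporation)
    for n in path:
        pheromone[n] += deposit
```

Evaporation is what lets the protocols adapt when nodes move: pheromone on stale routes fades, and forward ants rediscover paths, which is why these schemes suit MANETs.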
Big Data concerns large-volume, growing data sets that are complex and have multiple autonomous sources.
The Big Data concept emerged because earlier technologies could not handle the storage and processing of such huge data.
Much of this data is unstructured, which makes it tedious for users to work with.
Hence, there should be some mechanism that classifies unstructured data into an organized form, helping users easily access the required data.
Classification techniques over big transactional databases provide users with the required data from large datasets in a simpler way.
There are two main classification techniques: supervised and unsupervised.
In this paper, we focus on the study of different supervised classification techniques.
Further, the paper discusses their advantages and limitations.
In this work, we investigate the structure and evolution of a peer-to-peer (P2P) payment application.
A unique aspect of the network under consideration is that the edges among nodes represent financial transactions among individuals who shared an offline social interaction.
Our dataset comes from Venmo, the most popular P2P mobile payment service.
We present a series of static and dynamic measurements that summarize the key aspects of any social network, namely the degree distribution, density and connectivity.
We find that the degree distributions do not follow a power-law distribution, confirming previous studies that real-world social networks are rarely scale-free.
The giant component of Venmo is eventually composed of 99.9% of all nodes, and its clustering coefficient reaches 0.2.
Last, we examine the "topological" version of the small-world hypothesis and find that Venmo users are separated by a mean of 5.9 steps and a median of 6 steps.
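The clustering coefficient reported above is, per node, the fraction of its neighbour pairs that are themselves connected; a minimal sketch over an adjacency dict (node -> set of neighbours):

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are linked to each other."""
    nbrs = adj[v]
    if len(nbrs) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return linked / (len(nbrs) * (len(nbrs) - 1) / 2)

def average_clustering(adj):
    """Network clustering coefficient: mean of the local coefficients."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)
```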
Reading comprehension has seen a boom in recent NLP research.
Several institutes have released the Cloze-style reading comprehension data, and these have greatly accelerated the research of machine comprehension.
In this work, we firstly present Chinese reading comprehension datasets, which consist of People Daily news dataset and Children's Fairy Tale (CFT) dataset.
Also, we propose a consensus attention-based neural network architecture to tackle the Cloze-style reading comprehension problem, which aims to induce a consensus attention over every word in the query.
Experimental results show that the proposed neural network significantly outperforms the state-of-the-art baselines in several public datasets.
Furthermore, we set up a baseline for the Chinese reading comprehension task, and we hope this will speed up future research.
Identifying the stance of a news article body with respect to a certain headline is the first step to automated fake news detection.
In this paper, we introduce a 2-stage ensemble model to solve the stance detection task.
By using only hand-crafted features as input to a gradient boosting classifier, we are able to achieve a score of 9161.5 out of 11651.25 (78.63%) on the official Fake News Challenge (Stage 1) dataset.
We identify the most useful features for detecting fake news and discuss how sampling techniques can be used to improve recall accuracy on a highly imbalanced dataset.
The challenge of describing model drift is an open question in unsupervised learning.
It can be difficult to evaluate at what point an unsupervised model has deviated beyond what would be expected from a different sample from the same population.
This is particularly true for models without a probabilistic interpretation.
One such family of techniques, Topological Data Analysis, and the Mapper algorithm in particular, has found use in a variety of fields, but describing model drift for Mapper graphs is an understudied area: even existing techniques for measuring distances between related constructs, such as graphs or simplicial complexes, fail to account for the fact that Mapper graphs represent a combination of topological, metric, and density information.
In this paper, we develop an optimal transport based metric which we call the Network Augmented Wasserstein Distance for evaluating distances between Mapper graphs and demonstrate the value of the metric for model drift analysis by using the metric to transform the model drift problem into an anomaly detection problem over dynamic graphs.
This paper proposes three measures to quantify the characteristics of online signature templates in terms of distinctiveness, complexity and repeatability.
First, a distinctiveness measure of a signature template is computed from a set of enrolled signature samples and a statistical assumption about random signatures.
Secondly, a complexity measure of the template is derived from a set of enrolled signature samples.
Finally, given a signature template, a measure to quantify the repeatability of the online signature is derived from a validation set of samples.
These three measures can then be used as an indicator for the performance of the system in rejecting random forgery samples and skilled forgery samples and the performance of users in providing accepted genuine samples, respectively.
The effectiveness of these three measures and their applications are demonstrated through experiments performed on three online signature datasets and one keystroke dynamics dataset using different verification algorithms.
This paper reports on work performed in the context of the COMPASS SESAR-JU WP-E project, on developing an approach for identifying and filtering inaccurate trajectories (ghost flights) in historical data originating from the EUROCONTROL-operated Demand Data Repository (DDR).
Neural Machine Translation (NMT), like many other deep learning domains, typically suffers from over-parameterization, resulting in large storage sizes.
This paper examines three simple magnitude-based pruning schemes to compress NMT models, namely class-blind, class-uniform, and class-distribution, which differ in terms of how pruning thresholds are computed for the different classes of weights in the NMT architecture.
We demonstrate the efficacy of weight pruning as a compression technique for a state-of-the-art NMT system.
We show that an NMT model with over 200 million parameters can be pruned by 40% with very little performance loss as measured on the WMT'14 English-German translation task.
This sheds light on the distribution of redundancy in the NMT architecture.
Our main result is that with retraining, we can recover and even surpass the original performance with an 80%-pruned model.
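The class-blind and class-uniform schemes can be sketched directly from their definitions: both zero out the smallest-magnitude weights, differing only in whether the threshold is computed over all weights or per weight class (the class-distribution scheme is omitted; function names are illustrative):

```python
def threshold_for(weights, frac):
    """Magnitude below which roughly the smallest `frac` of weights fall."""
    mags = sorted(abs(w) for w in weights)
    k = int(frac * len(mags))
    return mags[k] if k < len(mags) else float("inf")

def prune(classes, frac, blind=True):
    """classes: dict name -> list of weights.  Class-blind uses one global
    threshold; class-uniform prunes `frac` of each class separately."""
    if blind:
        t = threshold_for([w for ws in classes.values() for w in ws], frac)
        return {n: [0.0 if abs(w) < t else w for w in ws]
                for n, ws in classes.items()}
    return {n: [0.0 if abs(w) < threshold_for(ws, frac) else w for w in ws]
            for n, ws in classes.items()}
```

The practical difference shows up when weight classes have very different magnitude scales: class-blind pruning can wipe out an entire small-magnitude class, while class-uniform spreads the pruning evenly.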
Many malware families utilize domain generation algorithms (DGAs) to establish command and control (C&C) connections.
While there are many methods to pseudorandomly generate domains, we focus in this paper on detecting (and generating) domains on a per-domain basis which provides a simple and flexible means to detect known DGA families.
Recent machine learning approaches to DGA detection have been successful on fairly simplistic DGAs, many of which produce names of fixed length.
However, models trained on limited datasets are somewhat blind to new DGA variants.
In this paper, we leverage the concept of generative adversarial networks to construct a deep learning based DGA that is designed to intentionally bypass a deep learning based detector.
In a series of adversarial rounds, the generator learns to generate domain names that are increasingly more difficult to detect.
In turn, a detector model updates its parameters to compensate for the adversarially generated domains.
We test the hypothesis of whether adversarially generated domains may be used to augment training sets in order to harden other machine learning models against yet-to-be-observed DGAs.
We detail solutions to several challenges in training this character-based generative adversarial network (GAN).
In particular, our deep learning architecture begins as a domain name auto-encoder (encoder + decoder) trained on domains in the Alexa one million.
Then the encoder and decoder are reassembled competitively in a generative adversarial network (detector + generator), with novel neural architectures and training strategies to improve convergence.
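For illustration of the per-domain setting, a toy seed-driven DGA (not modelled on any real malware family) shows how a bot and its C&C server can independently derive the same rendezvous domains:

```python
import random

def toy_dga(seed, n=5, length=12, tld=".com"):
    """Generate n pseudorandom domains from a shared seed; bot and server
    run the same code with the same seed and meet at one of the results."""
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return ["".join(rng.choice(alphabet) for _ in range(length)) + tld
            for _ in range(n)]
```

A per-domain detector must flag each such string in isolation, which is exactly the classification task the adversarial generator above is trained to defeat.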
Upcoming many-core processors are expected to employ a distributed memory architecture similar to currently available supercomputers, but parallel pattern mining algorithms amenable to this architecture have not been comprehensively studied.
We present a novel closed pattern mining algorithm with a well-engineered communication protocol, and generalize it to find statistically significant patterns from personal genome data.
For distributing communication evenly, it employs global load balancing with multiple stacks distributed on a set of cores organized as a hypercube with random edges.
Our algorithm achieved up to 1175-fold speedup by using 1200 cores for solving a problem with 11,914 items and 697 transactions, while the naive approach of separating the search space failed completely.
Genetic algorithms are considered an original way to solve problems, probably because of their generality and their "blind" nature.
But GAs are also unusual since the features of many implementations (among all that could be thought of) are principally led by the biological metaphor, while efficiency measurements intervene only afterwards.
We propose here to examine the relevance of these biomimetic aspects, by pointing out some fundamental similarities and divergences between GAs and the genome of living beings shaped by natural selection.
One of the main differences comes from the fact that GAs rely principally on the so-called implicit parallelism, while giving to the mutation/selection mechanism the second role.
Such differences could suggest new ways of employing GAs on complex problems, using complex codings and starting from nearly homogeneous populations.
To overcome the travelling difficulties of the visually impaired, this paper presents a novel ETA (Electronic Travel Aid) smart guiding device in the shape of a pair of eyeglasses for guiding these people efficiently and safely.
Different from existing works, a novel multi-sensor fusion based obstacle avoidance algorithm is proposed, which utilizes both a depth sensor and an ultrasonic sensor to solve the problems of detecting small obstacles and transparent obstacles, e.g., French doors.
For totally blind people, three kinds of auditory cues were developed to indicate the directions in which they can proceed.
For weak-sighted people, a visual enhancement which leverages the AR (Augmented Reality) technique and integrates the traversable direction is adopted.
The prototype consisting of a pair of display glasses and several low cost sensors is developed, and its efficiency and accuracy were tested by a number of users.
The experimental results show that the smart guiding glasses can effectively improve the user's travelling experience in complicated indoor environment.
Thus, it serves as a consumer device for helping visually impaired people travel safely.
The value 1 problem is a decision problem for probabilistic automata over finite words: given a probabilistic automaton, are there words accepted with probability arbitrarily close to 1?
This problem was proved undecidable recently; to overcome this, several classes of probabilistic automata of different nature were proposed, for which the value 1 problem has been shown decidable.
In this paper, we introduce yet another class of probabilistic automata, called leaktight automata, which strictly subsumes all classes of probabilistic automata whose value 1 problem is known to be decidable.
We prove that for leaktight automata, the value 1 problem is decidable (in fact, PSPACE-complete) by constructing a saturation algorithm based on the computation of a monoid abstracting the behaviours of the automaton.
We rely on algebraic techniques developed by Simon to prove that this abstraction is complete.
Furthermore, we adapt this saturation algorithm to decide whether an automaton is leaktight.
Finally, we show a reduction allowing us to extend our decidability results from finite words to infinite ones, implying that the value 1 problem for probabilistic leaktight parity automata is decidable.
We show that the matching problem that underlies optical flow requires multiple strategies, depending on the amount of image motion and other factors.
We then study the implications of this observation on training a deep neural network for representing image patches in the context of descriptor based optical flow.
We propose a metric learning method, which selects suitable negative samples based on the nature of the true match.
This type of training produces a network that displays multiple strategies depending on the input and leads to state of the art results on the KITTI 2012 and KITTI 2015 optical flow benchmarks.
In the present paper we describe the technology for translating algorithmic descriptions of discrete functions to SAT.
The proposed methods and algorithms of translation are aimed at application to the problems of SAT-based cryptanalysis.
In the theoretical part of the paper we justify the main principles of general reduction to SAT for discrete functions from a class containing the majority of functions employed in cryptography.
Based on these principles we describe the Transalg software system, developed with SAT-based cryptanalysis specifics in mind.
We present the results of applying Transalg to construct a number of attacks on various cryptographic functions.
Some of the corresponding attacks are state of the art.
We also present extensive experimental data obtained using the SAT solvers that took first place in the SAT competitions of recent years.
The main goal in many fields in empirical sciences is to discover causal relationships among a set of variables from observational data.
The PC algorithm is one of the promising solutions for learning the underlying causal structure by performing a number of conditional independence tests.
In this paper, we propose a novel GPU-based parallel algorithm, called cuPC, to accelerate an order-independent version of PC.
The cuPC algorithm has two variants, cuPC-E and cuPC-S, which parallelize conditional independence tests over the pairs of variables under the tests, and over the conditional sets, respectively.
In particular, cuPC-E offers two degrees of parallelization by performing tests of multiple pairs of variables and also the tests of each pair in parallel.
On the other hand, cuPC-S reuses the results of computations of a test for a given conditional set in other tests on the same conditional set.
Experiment results on GTX 1080 GPU show two to three orders of magnitude speedup.
For instance, in one of the most challenging benchmarks, cuPC-S reduces the runtime from about 73 hours to about one minute, a speedup factor of about 4000X.
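The conditional independence tests that cuPC parallelizes are, for Gaussian data, typically Fisher-z tests on (partial) correlations; a sketch for conditioning sets of size at most one, using the classical first-order recursion (function names illustrative, not the cuPC kernels):

```python
import math

def corr(xs, ys):
    """Pearson correlation of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def fisher_z(xs, ys, zs=None):
    """Fisher-z statistic for independence of x and y, optionally given a
    single conditioning variable z (first-order partial correlation)."""
    r = corr(xs, ys)
    k = 0
    if zs is not None:
        rxz, ryz = corr(xs, zs), corr(ys, zs)
        r = (r - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
        k = 1
    z = 0.5 * math.log((1 + r) / (1 - r))
    return math.sqrt(len(xs) - k - 3) * abs(z)  # compare to a N(0,1) quantile
```

cuPC-E runs many such tests for different variable pairs in parallel, while cuPC-S shares the conditioning-set computations across tests; the test itself is unchanged.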
In previous work, we developed a closed-loop speech chain model based on deep learning, in which the architecture enabled the automatic speech recognition (ASR) and text-to-speech synthesis (TTS) components to mutually improve their performance.
This was accomplished by the two parts teaching each other using both labeled and unlabeled data.
This approach could significantly improve model performance within a single-speaker speech dataset, but only a slight increase could be gained in multi-speaker tasks.
Furthermore, the model is still unable to handle unseen speakers.
In this paper, we present a new speech chain mechanism by integrating a speaker recognition model inside the loop.
We also propose extending the capability of TTS to handle unseen speakers by implementing one-shot speaker adaptation.
This enables TTS to mimic voice characteristics from one speaker to another with only a one-shot speaker sample, even from a text without any speaker information.
In the speech chain loop mechanism, ASR also benefits from the ability to further learn an arbitrary speaker's characteristics from the generated speech waveform, resulting in a significant improvement in the recognition rate.
We represent planning as a set of loosely coupled network flow problems, where each network corresponds to one of the state variables in the planning domain.
The network nodes correspond to the state variable values and the network arcs correspond to the value transitions.
The planning problem is to find a path (a sequence of actions) in each network such that, when merged, they constitute a feasible plan.
In this paper we present a number of integer programming formulations that model these loosely coupled networks with varying degrees of flexibility.
Since merging may introduce exponentially many ordering constraints we implement a so-called branch-and-cut algorithm, in which these constraints are dynamically generated and added to the formulation when needed.
Our results are very promising: they improve upon previous planning-as-integer-programming approaches and lay the foundation for integer programming approaches to cost-optimal planning.
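Within a single state-variable network, the path-finding step is plain shortest path over a domain transition graph; a minimal sketch of that step (the hard part, merging the per-variable paths under ordering constraints, is what the branch-and-cut handles):

```python
from collections import deque

def transition_path(arcs, start, goal):
    """Shortest action sequence in one state-variable network, where
    arcs maps a value to a list of (action, next_value) transitions."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        value, plan = frontier.popleft()
        if value == goal:
            return plan
        for action, nxt in arcs.get(value, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # goal value unreachable in this network
```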
This study deals with the missing link prediction problem: the problem of predicting the existence of missing connections between entities of interest.
We address link prediction using coupled analysis of relational datasets represented as heterogeneous data, i.e., datasets in the form of matrices and higher-order tensors.
We propose to use an approach based on probabilistic interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor Factorisation, which can simultaneously fit a large class of tensor models to higher-order tensors/matrices with common latent factors using different loss functions.
Numerical experiments demonstrate that joint analysis of data from multiple sources via coupled factorisation improves link prediction performance, and that the selection of the right loss function and tensor model is crucial for accurately predicting missing links.
Boolean network models have gained popularity in computational systems biology over the last dozen years.
Many of these networks use canalizing Boolean functions, which has led to increased interest in the study of these functions.
The canalizing depth of a function describes how many canalizing variables can be recursively picked off, until a non-canalizing function remains.
In this paper, we show how every Boolean function has a unique algebraic form involving extended monomial layers and a well-defined core polynomial.
This generalizes recent work on the algebraic structure of nested canalizing functions, and it yields a stratification of all Boolean functions by their canalizing depth.
As a result, we obtain closed formulas for the number of n-variable Boolean functions with depth k, which simultaneously generalizes enumeration formulas for canalizing, and nested canalizing functions.
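The canalizing-depth notion used above can be made concrete with a small sketch: a function is canalizing if fixing some variable to some input forces the output, and the depth counts how many such variables can be peeled off in turn. This is an illustrative truth-table implementation under the common convention that a constant remainder stops the recursion; it is not the paper's algebraic machinery.

```python
from itertools import product

def canalizing_depth(f, n):
    """Recursively peel off canalizing variables of a Boolean function.

    `f` is a dict mapping each n-tuple of 0/1 inputs to 0/1.  Returns
    how many canalizing variables can be removed before a constant or
    non-canalizing core remains.
    """
    if n == 0:
        return 0
    outputs = {f[x] for x in product((0, 1), repeat=n)}
    if len(outputs) == 1:          # constant core: recursion stops
        return 0
    for i in range(n):
        for a in (0, 1):
            fixed = {f[x] for x in product((0, 1), repeat=n) if x[i] == a}
            if len(fixed) == 1:    # x_i = a forces the output: canalizing
                # restrict to x_i = 1 - a and recurse on n - 1 variables
                g = {x[:i] + x[i + 1:]: f[x]
                     for x in product((0, 1), repeat=n) if x[i] == 1 - a}
                return 1 + canalizing_depth(g, n - 1)
    return 0

# AND(x, y) is nested canalizing: depth 2.  XOR(x, y) is not canalizing: depth 0.
AND = {x: x[0] & x[1] for x in product((0, 1), repeat=2)}
XOR = {x: x[0] ^ x[1] for x in product((0, 1), repeat=2)}
print(canalizing_depth(AND, 2), canalizing_depth(XOR, 2))  # 2 0
```

A function of n variables with depth n is exactly a nested canalizing function, which is the special case the stratification generalizes.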
The optimization of algorithm (hyper-)parameters is crucial for achieving peak performance across a wide range of domains, ranging from deep neural networks to solvers for hard combinatorial problems.
The resulting algorithm configuration (AC) problem has attracted much attention from the machine learning community.
However, the proper evaluation of new AC procedures is hindered by two key hurdles.
First, AC benchmarks are hard to set up.
Second and even more significantly, they are computationally expensive: a single run of an AC procedure involves many costly runs of the target algorithm whose performance is to be optimized in a given AC benchmark scenario.
One common workaround is to optimize cheap-to-evaluate artificial benchmark functions (e.g., Branin) instead of actual algorithms; however, these have different properties than realistic AC problems.
Here, we propose an alternative benchmarking approach that is similarly cheap to evaluate but much closer to the original AC problem: replacing expensive benchmarks by surrogate benchmarks constructed from AC benchmarks.
These surrogate benchmarks approximate the response surface corresponding to true target algorithm performance using a regression model, and the original and surrogate benchmark share the same (hyper-)parameter space.
In our experiments, we construct and evaluate surrogate benchmarks for hyperparameter optimization as well as for AC problems that involve performance optimization of solvers for hard combinatorial problems, drawing training data from the runs of existing AC procedures.
We show that our surrogate benchmarks capture overall important characteristics of the AC scenarios, such as high- and low-performing regions, from which they were derived, while being much easier to use and orders of magnitude cheaper to evaluate.
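The surrogate idea can be sketched in a few lines: fit a cheap predictor to logged (configuration, performance) pairs and query it instead of running the target algorithm. The 1-NN regressor and the hypothetical runtime data below are placeholders for the much stronger regression models the paper evaluates.

```python
def build_surrogate(observed):
    """Build a cheap surrogate from (configuration, performance) pairs,
    e.g. logged runs of an AC procedure.  Predicts with the nearest
    observed configuration (1-NN regression); a sketch only."""
    def predict(config):
        def dist(entry):
            c, _ = entry
            return sum((a - b) ** 2 for a, b in zip(c, config))
        _, perf = min(observed, key=dist)
        return perf
    return predict

# Hypothetical logged runs: (two hyperparameter values, measured runtime).
runs = [((0.1, 1.0), 12.0), ((0.5, 2.0), 3.5), ((0.9, 4.0), 8.0)]
surrogate = build_surrogate(runs)
print(surrogate((0.45, 2.1)))  # 3.5: nearest logged configuration
```

Evaluating `surrogate` costs microseconds where a real target-algorithm run may cost hours, while preserving the high- and low-performing regions present in the training data.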
Information extraction (IE) from text has largely focused on relations between individual entities, such as who has won which award.
However, some facts are never fully mentioned, and no IE method has perfect recall.
Thus, it is beneficial to also tap contents about the cardinalities of these relations, for example, how many awards someone has won.
We introduce the novel problem of extracting cardinalities and discuss the specific challenges that set it apart from standard IE.
We present a distant supervision method using conditional random fields.
A preliminary evaluation results in precision between 3% and 55%, depending on the difficulty of relations.
We address the problem of learning hierarchical deep neural network policies for reinforcement learning.
In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective.
Each layer is also augmented with latent random variables, which are sampled from a prior distribution during the training of that layer.
The maximum entropy objective causes these latent variables to be incorporated into the layer's policy, and the higher level layer can directly control the behavior of the lower layer through this latent space.
Furthermore, by constraining the mapping from latent variables to actions to be invertible, higher layers retain full expressivity: neither the higher layers nor the lower layers are constrained in their behavior.
Our experimental evaluation demonstrates that we can improve on the performance of single-layer policies on standard benchmark tasks simply by adding additional layers, and that our method can solve more complex sparse-reward tasks by learning higher-level policies on top of high-entropy skills optimized for simple low-level objectives.
A super point is a special kind of host in the network that contacts a huge number of other hosts.
Estimating its cardinality, the number of other hosts contacting it, plays an important role in network management.
However, existing works focus on super point cardinality estimation under a discrete time window, which incurs high latency and ignores many measurement periods.
A sliding time window measures super point cardinality at a finer granularity than a discrete time window, but is also more complex.
This paper introduces the first algorithm to estimate super point cardinality under a sliding time window from distributed edge routers.
The algorithm's ability to estimate sliding super point cardinality comes from a novel method, proposed in this paper, that records the time at which a host appears.
Based on this method, two sliding cardinality estimators, the sliding rough estimator and the sliding linear estimator, are devised for super point detection and cardinality estimation, respectively.
When these two estimators are used together, the algorithm consumes the least memory while achieving the highest accuracy.
This sliding super point cardinality algorithm can be deployed in a distributed environment and acquires the global super points' cardinalities by merging the estimators of distributed nodes.
Both estimators can process packets in parallel, which makes it possible to handle high-speed networks in real time on a GPU.
Experiments on real-world traffic show that this algorithm achieves the highest accuracy with the smallest memory compared with other methods when running under a discrete time window.
Under a sliding time window, the algorithm delivers the same performance as under a discrete time window.
Analogy-based effort estimation (ABE) is one of the most efficient methods for software effort estimation because of its outstanding performance and its capability to handle noisy datasets.
Conventional ABE models usually use the same number of analogies for all projects in a dataset in order to make good estimates.
Our claim is that using the same number of analogies may produce the overall best performance for the whole dataset, but not necessarily the best performance for each individual project.
Therefore, there is a need to better understand the dataset characteristics in order to discover the optimum set of analogies for each project, rather than using a static k nearest projects.
Method: We propose a new technique based on the Bisecting k-medoids clustering algorithm to find the best set of analogies for each individual project before making the prediction.
Results & Conclusions: With Bisecting k-medoids it is possible to better understand the dataset characteristics and automatically find the best set of analogies for each test project.
Performance figures of the proposed estimation method are promising and better than those of other regular ABE models.
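The baseline being improved upon can be sketched as follows: a conventional ABE estimate is the mean effort of the k most similar past projects, with one fixed k for every project. The feature vectors and efforts below are hypothetical; the paper's contribution is choosing the analogy set per project via clustering rather than fixing `k`.

```python
def abe_estimate(history, features, k):
    """Analogy-based effort estimate: mean effort of the k most
    similar past projects (Euclidean distance on feature vectors).
    A fixed-k baseline; the proposed method selects analogies
    per project instead."""
    def dist(p):
        f, _ = p
        return sum((a - b) ** 2 for a, b in zip(f, features)) ** 0.5
    nearest = sorted(history, key=dist)[:k]
    return sum(effort for _, effort in nearest) / len(nearest)

# Hypothetical past projects: (feature vector, actual effort in person-hours).
past = [((1.0, 2.0), 100.0), ((1.1, 2.1), 120.0), ((5.0, 9.0), 800.0)]
print(abe_estimate(past, (1.05, 2.05), k=2))  # 110.0
```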
In the context of resource allocation in cloud-radio access networks, recent studies assume either signal-level or scheduling-level coordination.
This paper, instead, considers a hybrid level of coordination for the scheduling problem in the downlink of a multi-cloud radio-access network, as a means to benefit from both scheduling policies.
Consider a multi-cloud radio access network, where each cloud is connected to several base-stations (BSs) via high capacity links, and therefore allows joint signal processing between them.
Across the multiple clouds, however, only scheduling-level coordination is permitted, as it requires a lower level of backhaul communication.
The frame structure of every BS is composed of various time/frequency blocks, called power-zones (PZs), each kept at a fixed power level.
The paper addresses the problem of maximizing a network-wide utility by associating users to clouds and scheduling them to the PZs, under the practical constraints that each user is scheduled, at most, to a single cloud, but possibly to many BSs within the cloud, and can be served by one or more distinct PZs within the BSs' frame.
The paper solves the problem using graph theory techniques by constructing the conflict graph.
The scheduling problem is, then, shown to be equivalent to a maximum-weight independent set problem in the constructed graph, in which each vertex symbolizes an association of cloud, user, BS and PZ, with a weight representing the utility of that association.
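To make the reduction concrete, here is a toy greedy pass over such a conflict graph; the vertex labels and weights are invented, and greedy selection is only a heuristic sketch of the maximum-weight independent set step, not the paper's solution method.

```python
def greedy_mwis(weights, edges):
    """Greedy maximum-weight independent set on a conflict graph.

    `weights` maps vertex -> utility; `edges` is a set of frozenset
    conflict pairs.  Vertices are considered in decreasing weight
    order, skipping any vertex conflicting with one already chosen.
    (Exact MWIS is NP-hard; this is an illustrative heuristic.)
    """
    chosen = set()
    for v in sorted(weights, key=weights.get, reverse=True):
        if all(frozenset((v, u)) not in edges for u in chosen):
            chosen.add(v)
    return chosen

# Hypothetical vertices encoding (cloud, user, BS, PZ) associations.
w = {"c1-u1-b1-z1": 5.0, "c1-u2-b1-z1": 4.0, "c2-u1-b2-z2": 3.0}
conflicts = {frozenset(("c1-u1-b1-z1", "c1-u2-b1-z1")),  # same PZ
             frozenset(("c1-u1-b1-z1", "c2-u1-b2-z2"))}  # same user, two clouds
print(greedy_mwis(w, conflicts))  # {'c1-u1-b1-z1'}
```

Each edge encodes one practical constraint (a PZ serving one user, a user served by at most one cloud), so an independent set is exactly a feasible schedule.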
Simulation results suggest that the proposed hybrid scheduling strategy provides appreciable gain as compared to the scheduling-level coordinated networks, with a negligible degradation to signal-level coordination.
Early detection of breast cancer can increase treatment efficiency.
Architectural Distortion (AD) is a very subtle contraction of the breast tissue and may represent the earliest sign of cancer.
Since it is very likely to go unnoticed by radiologists, several approaches have been proposed over the years, but none using deep learning techniques.
Training a Convolutional Neural Network (CNN), which is a deep neural architecture, requires a huge amount of data.
To overcome this problem, this paper proposes a data augmentation approach applied to a clinical image dataset to properly train a CNN.
Results using receiver operating characteristic analysis showed that even with a very limited dataset we could train a CNN to detect AD in digital mammography with an area under the curve (AUC) of 0.74.
Deep reinforcement learning (deep RL) research has grown significantly in recent years.
A number of software offerings now exist that provide stable, comprehensive implementations for benchmarking.
At the same time, recent deep RL research has become more diverse in its goals.
In this paper we introduce Dopamine, a new research framework for deep RL that aims to support some of that diversity.
Dopamine is open-source, TensorFlow-based, and provides compact and reliable implementations of some state-of-the-art deep RL agents.
We complement this offering with a taxonomy of the different research objectives in deep RL research.
While by no means exhaustive, our analysis highlights the heterogeneity of research in the field, and the value of frameworks such as ours.
Irregular low-density parity check (LDPC) codes are particularly well-suited for transmission schemes that require unequal error protection (UEP) of the transmitted data due to the different connection degrees of their variable nodes.
However, this UEP capability is strongly dependent on the connection profile among the protection classes.
This paper applies a multi-edge type analysis of LDPC codes for optimizing such connection profile according to the performance requirements of each protection class.
This allows the construction of UEP-LDPC codes where the difference between the performance of the protection classes can be adjusted, and whose UEP capability does not vanish as the number of decoding iterations grows.
This paper presents a formal approach to specify and verify object-oriented programs written in the `programming to interfaces' paradigm.
Besides the methods to be invoked by its clients, an interface also declares a set of abstract function/predicate symbols, together with a set of constraints on these symbols.
For each method declared in this interface, a specification template is given using these abstract symbols.
A class implementing this interface can give its own definitions to the abstract symbols, as long as all the constraints are satisfied.
The class must implement all the methods declared in the interface such that the method specification templates declared in the interface are satisfied w.r.t. the definitions of the abstract function symbols in this class.
Based on the constraints on the abstract symbols, the client code using interfaces can be specified and verified precisely without knowing what classes implement these interfaces.
Given more information about the implementing classes, the specifications of the client code can be specialized into more precise ones without re-verifying the client code.
Several commonly used interfaces and their implementations (including Iterator, Observer, Comparable, and Comparator) are used to demonstrate that the approach in this paper is both precise and flexible.
The number of word forms in agglutinative languages is theoretically infinite and this variety in word forms introduces sparsity in many natural language processing tasks.
Part-of-speech tagging (PoS tagging) is one of these tasks that often suffers from sparsity.
In this paper, we present an unsupervised Bayesian model using Hidden Markov Models (HMMs) for joint PoS tagging and stemming for agglutinative languages.
We use stemming to reduce sparsity in PoS tagging.
Two tasks are jointly performed to provide a mutual benefit in both tasks.
Our results show that joint PoS tagging and stemming improves PoS tagging scores.
We present results for Turkish and Finnish as agglutinative languages and English as a morphologically poor language.
MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision based on multiresolution analysis and separated representations.
Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability.
This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
Catering to the incentives of people with limited rationality is a challenging research direction that requires novel paradigms to design mechanisms and approximation algorithms.
Obviously strategyproof (OSP) mechanisms have recently emerged as the concept of interest to this research agenda.
However, the majority of the literature in the area has either highlighted the shortcomings of OSP or focused on the "right" definition rather than on the construction of these mechanisms.
We here give the first set of tight results on the approximation guarantee of OSP mechanisms for scheduling related machines and a characterization of optimal OSP mechanisms for set system problems.
By extending the well-known cycle monotonicity technique, we are able to concentrate on the algorithmic component of OSP mechanisms and provide some novel paradigms for their design.
One major goal of vision is to infer physical models of objects, surfaces, and their layout from sensors.
In this paper, we aim to interpret indoor scenes from one RGBD image.
Our representation encodes the layout of orthogonal walls and the extent of objects, modeled with CAD-like 3D shapes.
We parse both the visible and occluded portions of the scene and all observable objects, producing a complete 3D parse.
Such a scene interpretation is useful for robotics and visual reasoning, but difficult to produce due to the well-known challenge of segmentation, the high degree of occlusion, and the diversity of objects in indoor scenes.
We take a data-driven approach, generating sets of potential object regions, matching to regions in training images, and transferring and aligning associated 3D models while encouraging fit to observations and spatial consistency.
We use support inference to aid interpretation and propose a retrieval scheme that uses convolutional neural networks (CNNs) to classify regions and retrieve objects with similar shapes.
We demonstrate the performance of our method on our newly annotated NYUd v2 dataset with detailed 3D shapes.
As robots become increasingly prevalent in human environments, there will inevitably be times when a robot needs to interrupt a human to initiate an interaction.
Our work introduces the first interruptibility-aware mobile robot system, and evaluates the effects of interruptibility-awareness on human task performance, robot task performance, and on human interpretation of the robot's social aptitude.
Our results show that our robot is effective at predicting interruptibility at high accuracy, allowing it to interrupt at more appropriate times.
Results of a large-scale user study show that while participants are able to maintain task performance even in the presence of interruptions, interruptibility-awareness improves the robot's task performance and improves participant social perception of the robot.
Commercial detection in news broadcast videos involves judicious selection of meaningful audio-visual feature combinations and efficient classifiers.
This problem becomes much simpler if these combinations can be learned from the data.
To this end, we propose a Multiple Kernel Learning based method for boosting successful kernel functions while ignoring the irrelevant ones.
We adopt an intermediate fusion approach where an SVM is trained with a weighted linear combination of different kernel functions instead of a single kernel function.
Each kernel function is characterized by a feature set and a kernel type.
We identify the feature sub-space locations where a classifier trained only with a particular kernel function predicts successfully.
We propose to estimate, for each kernel function, a weighing function using support vector regression (with an RBF kernel) that takes high values (near 1.0) where the classifier learned on that kernel function succeeded and low values (near 0.0) otherwise.
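The fusion step amounts to evaluating a weighted sum of kernels, which an SVM can then be trained on. The kernels, feature vectors, and weights below are hypothetical stand-ins for the learned weighing-function values; this sketches only the combination, not the boosting or regression stages.

```python
import math

def rbf(x, y, gamma):
    """Gaussian RBF kernel on two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def combined_kernel(x, y, kernels, weights):
    """Weighted linear combination of kernel functions: the form of
    kernel an SVM is trained with under intermediate fusion.  Each
    (kernel, weight) pair corresponds to one feature set / kernel
    type, with the weight supplied by its weighing function."""
    return sum(w * k(x, y) for k, w in zip(kernels, weights))

x, y = (1.0, 0.0), (0.0, 1.0)
ks = [lambda a, b: rbf(a, b, 0.5),                    # e.g. audio feature set
      lambda a, b: sum(p * q for p, q in zip(a, b))]  # linear, visual feature set
print(combined_kernel(x, y, ks, [0.7, 0.3]))
```

Because the combination is itself a valid kernel (a nonnegative sum of kernels), any standard SVM solver accepts it unchanged.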
The second contribution of this work is a TV News Commercials Dataset of 150 hours of news videos.
A classifier trained with our proposed scheme outperformed the baseline methods on 6 of 8 benchmark datasets and on our own TV commercials dataset.
We formulate two estimation problems for pipeline systems in which measurements of compressible gas flow through a network of pipes are affected by time-varying injections, withdrawals, and compression.
We consider a state estimation problem that is then extended to a joint state and parameter estimation problem that can be used for data assimilation.
In both formulations, the flow dynamics are described on each pipe by space- and time-dependent density and mass flux that evolve according to a system of coupled partial differential equations, in which momentum dissipation is modelled using the Darcy-Weisbach friction approximation.
These dynamics are first spatially discretized to obtain a system of nonlinear ordinary differential equations on which state and parameter estimation formulations are given as nonlinear least squares problems.
A rapid, scalable computational method for performing a nonlinear least squares estimation is developed.
Extensive simulations and computational experiments on multiple pipeline test networks demonstrate the effectiveness of the formulations in obtaining state and parameter estimates in the presence of measurement and process noise.
Supporting IPv6/UDP/CoAP protocols over Low Power Wide Area Networks (LPWANs) can bring open networking, interconnection, and cooperation to this new type of Internet of Things networks.
However, accommodating these protocols over these very low bandwidth networks requires efficient header compression schemes to meet the limited frame size of these networks, where only one or two octets are available to transmit all headers.
Recently, the Internet Engineering Task Force (IETF) LPWAN working group drafted the Static Context Header Compression (SCHC), a new header compression scheme for LPWANs, which can provide a good compression factor without complex synchronization.
In this paper, we present an implementation and evaluation of SCHC.
We compare SCHC with IPHC, which also targets constrained networks.
Additionally, we propose an enhancement of SCHC, Layered SCHC (LSCHC).
LSCHC is a layered context that reduces memory consumption and processing complexity, and adds flexibility when compressing packets.
Finally, we perform calculations to show the impact of SCHC/LSCHC on an example LPWAN technology, LoRaWAN, from the point of view of transmission time and reliability.
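The static-context idea can be illustrated with a toy dict-based sketch: both ends share a rule table, so a matching packet transmits only a rule ID plus the uncovered residue. The rule layout, field names, and encoding below are invented for illustration and do not follow the actual SCHC specification's field descriptors or matching operators.

```python
def compress(header, rules):
    """Toy static-context compression: if every field of a rule
    matches the header, transmit only the rule ID plus the fields
    the rule does not cover (the residue).  Real SCHC rules carry
    per-field matching operators and compression actions; this
    only illustrates the shared-context principle."""
    for rule_id, ctx in rules.items():
        if all(header.get(f) == v for f, v in ctx.items()):
            residue = {f: v for f, v in header.items() if f not in ctx}
            return rule_id, residue
    return None, header  # no matching rule: send the header uncompressed

# Hypothetical shared context for an IPv6/UDP flow.
rules = {1: {"version": 6, "next_header": "udp", "dst": "2001:db8::1"}}
header = {"version": 6, "next_header": "udp",
          "dst": "2001:db8::1", "sport": 8080}
print(compress(header, rules))  # (1, {'sport': 8080})
```

Because the static fields never cross the air interface, a multi-octet header collapses to roughly the size of the rule ID plus residue, which is what makes one- or two-octet LPWAN header budgets workable.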
In this paper, we propose a novel continuous authentication system for smartphone users.
The proposed system entirely relies on unlabeled phone movement patterns collected through smartphone accelerometer.
The data was collected in a completely unconstrained environment over five to twelve days.
The contexts of phone usage were identified using k-means clustering.
Multiple profiles, one for each context, were created for every user.
Five machine learning algorithms were employed to classify genuine users and impostors.
The performance of the system was evaluated over a diverse population of 57 users.
The mean equal error rates achieved by Logistic Regression, Neural Network, kNN, SVM, and Random Forest were 13.7%, 13.5%, 12.1%, 10.7%, and 5.6% respectively.
A series of statistical tests were conducted to compare the performance of the classifiers.
The suitability of the proposed system for different types of users was also investigated using the failure to enroll policy.
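The equal error rate metric reported above can be sketched as a threshold sweep over match scores: the EER is the operating point where the false rejection and false acceptance rates coincide. The score values below are hypothetical; real systems compute this per user and per context profile.

```python
def equal_error_rate(genuine, impostor):
    """Sweep a decision threshold over match scores (higher means
    more genuine-like) and return the point where the false
    rejection rate (FRR) and false acceptance rate (FAR) are
    closest, approximating the EER."""
    best_gap, eer = 1.0, 1.0
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)
        far = sum(s >= t for s in impostor) / len(impostor)
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return eer

# Hypothetical scores from one user's context profile.
gen = [0.9, 0.8, 0.75, 0.6]
imp = [0.3, 0.4, 0.55, 0.7]
print(equal_error_rate(gen, imp))  # 0.25 (FRR = FAR = 25% at threshold 0.7)
```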
Energy storage has great potential in grid congestion relief.
By making large-scale energy storage portable through trucking, its capability to address grid congestion can be greatly enhanced.
This paper explores a business model of large-scale portable energy storage for spatiotemporal arbitrage over nodes with congestion.
We propose a spatiotemporal arbitrage model to determine the optimal operation and transportation schedules of portable storage.
To validate the business model, we simulate the schedules of a Tesla Semi full of Tesla Powerpacks doing arbitrage over two nodes in California with local transmission congestion.
The results indicate that the contributions of portable storage to congestion relief are much greater than those of stationary storage, and that trucking storage can bring net profit in energy arbitrage applications.
This work explores fundamental modeling and algorithmic issues arising in the well-established MapReduce framework.
First, we formally specify a computational model for MapReduce which captures the functional flavor of the paradigm by allowing for a flexible use of parallelism.
Indeed, the model diverges from a traditional processor-centric view by featuring parameters which embody only global and local memory constraints, thus favoring a more data-centric view.
Second, we apply the model to the fundamental computation task of matrix multiplication presenting upper and lower bounds for both dense and sparse matrix multiplication, which highlight interesting tradeoffs between space and round complexity.
Finally, building on the matrix multiplication results, we derive further space-round tradeoffs on matrix inversion and matching.
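The functional flavor of the model can be sketched with a sequential simulation of one map/reduce round for dense matrix multiplication; the code ignores the global and local memory parameters that drive the paper's space-round tradeoffs and is illustrative only.

```python
from collections import defaultdict

def mapreduce_matmul(A, B):
    """Dense matrix multiplication in map/reduce style.

    Map: each pair A[i][k], B[k][j] emits a partial product keyed
    by the output cell (i, j).  Reduce: sum the values per key.
    """
    emitted = defaultdict(list)
    n, m, p = len(A), len(B), len(B[0])
    for i in range(n):                      # map phase: emit keyed pairs
        for k in range(m):
            for j in range(p):
                emitted[(i, j)].append(A[i][k] * B[k][j])
    # reduce phase: one summation per output cell, independent per key
    return [[sum(emitted[(i, j)]) for j in range(p)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mapreduce_matmul(A, B))  # [[19, 22], [43, 50]]
```

The interesting regime is when the per-key value lists exceed local memory, forcing partial aggregation across extra rounds; that is the tradeoff the upper and lower bounds quantify.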
Morphological declension, which aims to inflect nouns to indicate number, case and gender, is an important task in natural language processing (NLP).
This research proposal seeks to address the degree to which Recurrent Neural Networks (RNNs) are efficient in learning to decline noun cases.
Given the challenge of data sparsity in processing morphologically rich languages and also, the flexibility of sentence structures in such languages, we believe that modeling morphological dependencies can improve the performance of neural network models.
We suggest carrying out various experiments to understand the interpretable features that may lead to better generalization of the learned models on cross-lingual tasks.
A typical problem in MOOCs is the missing opportunity for course conductors to individually support students in overcoming their problems and misconceptions.
This paper presents the results of automatically intervening on struggling students during programming exercises and offering peer feedback and tailored bonus exercises.
To improve learning success, we do not want to abolish instructionally desired trial and error but reduce extensive struggle and demotivation.
Therefore, we developed adaptive automatic just-in-time interventions to encourage students to ask for help if they require considerably more than average working time to solve an exercise.
Additionally, we offered students bonus exercises tailored for their individual weaknesses.
The approach was evaluated within a live course with over 5,000 active students via a survey and metrics gathered alongside.
Results show that we can increase the call-outs for help by up to 66% and lower the dwell time until action is taken.
Learnings from the experiments can further be used to pinpoint course material to be improved and tailor content to be audience specific.
In this paper, we propose an efficient coding scheme for the binary Chief Executive Officer (CEO) problem under logarithmic loss criterion.
Courtade and Weissman obtained the exact rate-distortion bound for a two-link binary CEO problem under this criterion.
We find the optimal test-channel model and its parameters for the encoder of each link by using the given bound.
Furthermore, an efficient encoding scheme based on compound LDGM-LDPC codes is presented to achieve the theoretical rates.
In the proposed encoding scheme, a binary quantizer using LDGM codes and a syndrome-decoding employing LDPC codes are applied.
An iterative decoding is also presented as a fusion center to reconstruct the observation bits.
The proposed decoder consists of a sum-product algorithm, with side information from the other decoder, and a soft estimator.
The output of the CEO decoder is the probability of the source bits conditioned on the received sequences of both links.
This method outperforms the majority-based estimation of the source bits utilized in the prior studies of the binary CEO problem.
Our numerical examples verify a close performance of the proposed coding scheme to the theoretical bound in several cases.
A database of objects discovered in houses in the Roman city of Pompeii provides a unique view of ordinary life in an ancient city.
Experts have used this collection to study the structure of Roman households, exploring the distribution and variability of tasks in architectural spaces, but such approaches are necessarily affected by modern cultural assumptions.
In this study we present a data-driven approach to household archeology, treating it as an unsupervised labeling problem.
This approach scales to large data sets and provides a more objective complement to human interpretation.
There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action.
We propose the novel problem of automatic advertisement understanding.
To enable research on this problem, we create two datasets: an image dataset of 64,832 image ads, and a video dataset of 3,477 ads.
Our data contains rich annotations encompassing the topic and sentiment of the ads, questions and answers describing what actions the viewer is prompted to take and the reasoning that the ad presents to persuade the viewer ("What should I do according to this ad, and why should I do it?"), and symbolic references ads make (e.g. a dove symbolizes peace).
We also analyze the most common persuasive strategies ads use, and the capabilities that computer vision systems should have to understand these strategies.
We present baseline classification results for several prediction tasks, including automatically answering questions about the messages of the ads.
In this paper, we present a resistive switching memristor cell for implementing universal logic gates.
The cell has a weighted control input whose resistance is set based on a control signal that generalizes the operational regime from NAND to NOR functionality.
We further show how threshold logic in the voltage-controlled resistive cell can be used to implement an XOR gate.
Building on the same principle we implement a half adder and a 4-bit CLA (Carry Look-ahead Adder) and show that in comparison with CMOS-only logic, the proposed system shows significant improvements in terms of device area, power dissipation and leakage power.
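The NAND-to-NOR generalization rests on plain threshold logic: a cell fires when its weighted input sum crosses a threshold, and shifting that threshold with a control signal switches the realized function. The sketch below uses invented numeric values, not device parameters, to show the principle.

```python
def threshold_gate(x1, x2, control):
    """Threshold-logic cell: output 1 when the weighted input sum
    crosses a control-set threshold.  With `control` at -1.5 the
    cell behaves as NAND, at -0.5 as NOR, mimicking how the
    weighted control input shifts the operating regime (values
    are illustrative, not memristor resistances)."""
    return int(-x1 - x2 >= control)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
NAND = [threshold_gate(a, b, -1.5) for a, b in inputs]
NOR  = [threshold_gate(a, b, -0.5) for a, b in inputs]
print(NAND, NOR)  # [1, 1, 1, 0] [1, 0, 0, 0]
```

XOR is not a threshold function of its two inputs, which is why realizing it requires composing such cells, as the paper does before building the half adder and CLA.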
We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to growing amounts of data as well as growing needs for accurate and confident predictions in critical applications.
In contrast to other parallelisation techniques, it can be applied to a broad class of learning algorithms without further mathematical derivations and without writing dedicated code, while at the same time maintaining theoretical performance guarantees.
Moreover, our parallelisation scheme is able to reduce the runtime of many learning algorithms to polylogarithmic time on quasi-polynomially many processing units.
This is a significant step towards a general answer to an open question on the efficient parallelisation of machine learning algorithms in the sense of Nick's Class (NC).
The cost of this parallelisation is in the form of a larger sample complexity.
Our empirical study confirms the potential of our parallelisation scheme with fixed numbers of processors and instances in realistic application scenarios.
Communication tools make the world like a small village, and as a consequence people can contact others who come from different societies or who speak different languages.
This communication cannot happen effectively without machine translation, because such communication can occur anytime and anywhere.
A number of studies have developed machine translation between English and many other languages, but Arabic has not yet been considered to the same extent.
Therefore, we aim to lay out a roadmap for our proposed translation machine, which provides enhanced Arabic-English translation based on semantics.
This article provides a quantitative analysis of privacy-compromising mechanisms on 1 million popular websites.
Findings indicate that nearly 9 in 10 websites leak user data to parties of which the user is likely unaware; more than 6 in 10 websites spawn third-party cookies; and more than 8 in 10 websites load Javascript code from external parties onto users' computers.
Sites that leak user data contact an average of nine external domains, indicating that users may be tracked by multiple entities in tandem.
By tracing the unintended disclosure of personal browsing histories on the Web, it is revealed that a handful of U.S. companies receive the vast bulk of user data.
Finally, roughly 1 in 5 websites are potentially vulnerable to known National Security Agency spying techniques at the time of analysis.
Thinking of todays web search scenario which is mainly keyword based, leads to the need of effective and meaningful search provided by Semantic Web.
Existing search engines are vulnerable to provide relevant answers to users query due to their dependency on simple data available in web pages.
On other hand, semantic search engines provide efficient and relevant results as the semantic web manages information with well defined meaning using ontology.
A Meta-Search engine is a search tool that forwards users query to several existing search engines and provides combined results by using their own page ranking algorithm.
SemanTelli is a meta semantic search engine that fetches results from different semantic search engines such as Hakia, DuckDuckGo, SenseBot through intelligent agents.
This paper proposes an enhancement of SemanTelli with an improved snippet-analysis-based page-ranking algorithm and support for image and news search.
Constraint Answer Set Programming (CASP) is an extension of Answer Set Programming (ASP) that allows numerical constraints to be added in the rules.
PDDL+ is an extension of the PDDL standard language of automated planning for modeling mixed discrete-continuous dynamics.
In this paper, we present CASP solutions for dealing with PDDL+ problems, i.e., encoding from PDDL+ to CASP, and extensions to the algorithm of the EZCSP CASP solver in order to solve CASP programs arising from PDDL+ domains.
An experimental analysis, performed on well-known linear and non-linear variants of PDDL+ domains, involving various configurations of the EZCSP solver, other CASP solvers, and PDDL+ planners, shows the viability of our solution.
Understanding procedural language requires anticipating the causal effects of actions, even when they are not explicitly stated.
In this work, we introduce Neural Process Networks to understand procedural text through (neural) simulation of action dynamics.
Our model complements existing memory architectures with dynamic entity tracking by explicitly modeling actions as state transformers.
The model updates the states of the entities by executing learned action operators.
Empirical results demonstrate that our proposed model can reason about the unstated causal effects of actions, allowing it to provide more accurate contextual information for understanding and generating procedural text, all while offering more interpretable internal representations than existing alternatives.
With the prevalence of video sharing, there are increasing demands for automatic video digestion such as highlight detection.
Recently, platforms with crowdsourced time-sync video comments have emerged worldwide, providing a good opportunity for highlight detection.
However, this task is non-trivial: (1) time-sync comments often lag behind their corresponding shot; (2) time-sync comments are semantically sparse and noisy; (3) to determine which shots are highlights is highly subjective.
The present paper aims to tackle these challenges by proposing a framework that (1) uses concept-mapped lexical-chains for lag calibration; (2) models video highlights based on comment intensity and a combination of the emotion and concept concentration of each shot; (3) summarizes each detected highlight using improved SumBasic with emotion and concept mapping.
Experiments on large real-world datasets show that our highlight detection method and summarization method both outperform other benchmarks with considerable margins.
In this paper, we present a new approach of distributed clustering for spatial datasets, based on an innovative and efficient aggregation technique.
This distributed approach consists of two phases: 1) local clustering phase, where each node performs a clustering on its local data, 2) aggregation phase, where the local clusters are aggregated to produce global clusters.
This approach is characterised by a simple and efficient representation of the local clusters, and by an aggregation phase designed so that the final clusters are compact and accurate while the overall process remains efficient in both response time and memory allocation.
We evaluated the approach with different datasets and compared it to well-known clustering techniques.
The experimental results show that our approach is very promising and outperforms the compared algorithms.
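The two-phase scheme can be sketched in pure Python. This is only an illustrative sketch, not the paper's actual representation or aggregation technique: each node runs plain k-means and ships only (centroid, size) summaries, and the merger fuses summaries whose centroids fall within a chosen radius.

```python
from math import dist

def local_kmeans(points, k, iters=20):
    """Local clustering phase: plain k-means on one node's data.
    Each local cluster is summarized by only (centroid, size)."""
    centroids = list(points[:k])          # naive seeding, fine for a sketch
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return [(c, len(g)) for c, g in zip(centroids, groups) if g]

def aggregate(summaries, merge_radius):
    """Aggregation phase: merge local summaries whose centroids lie
    within merge_radius of an existing global cluster, using
    size-weighted averaging of the centroids."""
    merged = []
    for c, n in summaries:
        for i, (mc, mn) in enumerate(merged):
            if dist(c, mc) <= merge_radius:
                total = mn + n
                merged[i] = (tuple((a * mn + b * n) / total
                                   for a, b in zip(mc, c)), total)
                break
        else:
            merged.append((c, n))
    return merged

# Two nodes cluster locally; only the tiny summaries travel to the merger.
node1 = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 9.0)]
node2 = [(1.0, 0.0), (9.0, 10.0), (0.0, 0.0), (10.0, 10.0)]
global_clusters = aggregate(local_kmeans(node1, 2) + local_kmeans(node2, 2), 2.0)
```

Only the summaries cross the network, which is what keeps the communication cost low in such schemes.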
General treebank analyses are graph structured, but parsers are typically restricted to tree structures for efficiency and modeling reasons.
We propose a new representation and algorithm for a class of graph structures that is flexible enough to cover almost all treebank structures, while still admitting efficient learning and inference.
In particular, we consider directed, acyclic, one-endpoint-crossing graph structures, which cover most long-distance dislocation, shared argumentation, and similar tree-violating linguistic phenomena.
We describe how to convert phrase structure parses, including traces, to our new representation in a reversible manner.
Our dynamic program uniquely decomposes structures, is sound and complete, and covers 97.3% of the Penn English Treebank.
We also implement a proof-of-concept parser that recovers a range of null elements and trace types.
Machine Learning models are vulnerable to adversarial attacks that rely on perturbing the input data.
This work proposes a novel strategy using autoencoder deep neural networks to defend a machine learning model against two gradient-based attacks: the Fast Gradient Sign attack and the Fast Gradient attack.
First, we denoise the test data using an autoencoder trained on both clean and corrupted data.
Then, we reduce the dimension of the denoised data using the hidden layer representation of another autoencoder.
We perform this experiment for multiple values of the bound of adversarial perturbations, and consider different numbers of reduced dimensions.
When the test data is preprocessed using this cascaded pipeline, the tested deep neural network classifier yields a much higher accuracy, thus mitigating the effect of the adversarial perturbation.
Ultra-reliable and low-latency communications (URLLC) is expected to be supported without compromising the resource usage efficiency.
In this paper, we study how to maximize energy efficiency (EE) for URLLC under the stringent quality of service (QoS) requirement imposed on the end-to-end (E2E) delay and overall packet loss, where the E2E delay includes queueing delay and transmission delay, and the overall packet loss consists of queueing delay violation, transmission error with finite blocklength channel codes, and proactive packet dropping in deep fading.
Transmit power, bandwidth and number of active antennas are jointly optimized to maximize the system EE under the QoS constraints.
Since the achievable rate with finite blocklength channel codes is not convex in radio resources, it is challenging to optimize resource allocation.
By analyzing the properties of the optimization problem, the global optimal solution is obtained.
Simulation and numerical results validate the analysis and show that the proposed policy can improve EE significantly compared with an existing policy.
In this paper, we define a distance for the HSL colour system.
Next, the proposed distance is used for a fuzzy colour clustering algorithm construction.
The presented algorithm is related to the well-known fuzzy c-means algorithm.
Finally, the clustering algorithm is used as a colour reduction method.
The obtained experimental results are presented to demonstrate the effectiveness of our approach.
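The key difficulty such a distance must handle is that hue is an angle: 350° and 10° are perceptually close, yet a naive Euclidean difference makes them far apart. The sketch below shows one plausible circular-hue distance with per-channel weights; it is illustrative and not necessarily the paper's definition.

```python
def hsl_distance(c1, c2, w=(1.0, 1.0, 1.0)):
    """Weighted distance between two HSL colours (h in degrees,
    s and l in [0, 1]).  Hue is treated as an angle, so 350 and 10
    degrees are 20 degrees apart, not 340."""
    h1, s1, l1 = c1
    h2, s2, l2 = c2
    dh = abs(h1 - h2) % 360.0
    dh = min(dh, 360.0 - dh) / 180.0      # circular hue difference in [0, 1]
    return (w[0] * dh ** 2 + w[1] * (s1 - s2) ** 2
            + w[2] * (l1 - l2) ** 2) ** 0.5
```

A distance of this form can then replace the Euclidean distance inside the fuzzy c-means membership and centre updates.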
Hierarchical temporal memory (HTM) is a biomimetic sequence memory algorithm that holds promise for invariant representations of spatial and spatiotemporal inputs.
This paper presents a comprehensive neuromemristive crossbar architecture for the spatial pooler (SP) and the sparse distributed representation classifier, which are fundamental to the algorithm.
There are several unique features in the proposed architecture that tightly link with the HTM algorithm.
A memristor that is suitable for emulating the HTM synapses is identified and a new Z-window function is proposed.
The architecture exploits the concept of synthetic synapses to enable potential synapses in the HTM.
The crossbar for the SP avoids dark spots caused by unutilized crossbar regions and supports rapid on-chip training within 2 clock cycles.
This research also leverages plasticity mechanisms such as neurogenesis and homeostatic intrinsic plasticity to strengthen the robustness and performance of the SP.
The proposed design is benchmarked for image recognition tasks using MNIST and Yale faces datasets, and is evaluated using different metrics including entropy, sparseness, and noise robustness.
Detailed power analysis at different stages of the SP operations is performed to demonstrate the suitability for mobile platforms.
Results of image stitching can be perceptually divided into single-perspective and multiple-perspective.
Compared to the multiple-perspective result, the single-perspective result excels in perspective consistency but suffers from projective distortion.
In this paper, we propose two single-perspective warps for natural image stitching.
The first one is a parametric warp, which is a combination of the as-projective-as-possible warp and the quasi-homography warp via dual-feature.
The second one is a mesh-based warp, which is determined by optimizing a total energy function that simultaneously emphasizes different characteristics of the single-perspective warp, including alignment, naturalness, distortion and saliency.
A comprehensive evaluation demonstrates that the proposed warps outperform several state-of-the-art warps, including homography, APAP, AutoStitch, SPHP and GSP.
The monocular visual-inertial system (VINS), which consists of one camera and one low-cost inertial measurement unit (IMU), is a popular approach to achieve accurate 6-DOF state estimation.
However, such locally accurate visual-inertial odometry is prone to drift and cannot provide absolute pose estimation.
Leveraging history information to relocalize and correct drift has become a hot topic.
In this paper, we propose a monocular visual-inertial SLAM system that can relocalize the camera and obtain the absolute pose in a previously built map.
Then, 4-DOF pose graph optimization is performed to correct drift and achieve global consistency.
The 4-DOF contains x, y, z, and yaw angle, which is the actual drifted direction in the visual-inertial system.
Furthermore, the proposed system can reuse a map by saving and loading it in an efficient way.
Current map and previous map can be merged together by the global pose graph optimization.
We validate the accuracy of our system on public datasets and compare against other state-of-the-art algorithms.
We also evaluate the map merging ability of our system in the large-scale outdoor environment.
The source code of map reuse is integrated into our public code, VINS-Mono.
The major sources of abundant data are constantly expanding with the available data collection methodologies in various applications - medical, insurance, scientific, bio-informatics and business.
These data sets may be distributed geographically and large in both size and dimensionality.
To analyze these data sets and find the hidden patterns, the data must be downloaded to a centralized site, which is challenging given the limited available bandwidth and is also computationally expensive.
The covariance matrix is one of the methods to estimate the relation between any two dimensions.
In this paper, we propose a communication efficient algorithm to estimate the covariance matrix in a distributed manner.
The global covariance matrix is computed by merging the local covariance matrices using a distributed approach.
The results show that it is exactly the same as the centralized method, with good speed-up in terms of computation.
The speed-up is due to the parallel construction of the local covariances and the distribution of the cross-covariances among the nodes so that the load is balanced.
The results are analyzed using the Mfeat data set under various partitionings, which also addresses scalability.
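The merge of per-node covariance information can be sketched with the standard pairwise update for counts, means and scatter matrices (the parallel update of Chan et al.); this pure-Python version is illustrative and not necessarily the paper's exact algorithm, but it reproduces the centralized result exactly, as the abstract claims.

```python
def local_stats(rows):
    """Compute (count, mean vector, scatter matrix) on one node's rows.
    The scatter matrix is the sum of outer products of deviations;
    dividing it by n (or n - 1) gives the covariance matrix."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    scatter = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows)
                for j in range(d)] for i in range(d)]
    return n, mean, scatter

def merge_stats(a, b):
    """Merge two (count, mean, scatter) summaries exactly:
    S = S_a + S_b + (n_a * n_b / n) * outer(delta, delta),
    where delta = mean_b - mean_a."""
    na, ma, sa = a
    nb, mb, sb = b
    n = na + nb
    delta = [y - x for x, y in zip(ma, mb)]
    mean = [(na * x + nb * y) / n for x, y in zip(ma, mb)]
    d = len(ma)
    scatter = [[sa[i][j] + sb[i][j] + na * nb / n * delta[i] * delta[j]
                for j in range(d)] for i in range(d)]
    return n, mean, scatter
```

Because the merge is exact and associative, nodes can compute local statistics in parallel and combine them in any order, which matches the reported speed-up without loss of accuracy.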
Surveys can be viewed as programs, complete with logic, control flow, and bugs.
Word choice or the order in which questions are asked can unintentionally bias responses.
Vague, confusing, or intrusive questions can cause respondents to abandon a survey.
Surveys can also have runtime errors: inattentive respondents can taint results.
This effect is especially problematic when deploying surveys in uncontrolled settings, such as on the web or via crowdsourcing platforms.
Because the results of surveys drive business decisions and inform scientific conclusions, it is crucial to make sure they are correct.
We present SurveyMan, a system for designing, deploying, and automatically debugging surveys.
Survey authors write their surveys in a lightweight domain-specific language aimed at end users.
SurveyMan statically analyzes the survey to provide feedback to survey authors before deployment.
It then compiles the survey into JavaScript and deploys it either to the web or a crowdsourcing platform.
SurveyMan's dynamic analyses automatically find survey bugs, and control for the quality of responses.
We evaluate SurveyMan's algorithms analytically and empirically, demonstrating its effectiveness with case studies of social science surveys conducted via Amazon's Mechanical Turk.
In this paper we consider two social organizations -- service-oriented communities and fractal organizations -- and discuss how their main characteristics provide an answer to several shortcomings of traditional organizations.
In particular, we highlight their ability to tap into the vast basins of "social energy" of our societies.
This is done by establishing mutualistic relationships among the organizational components.
The paper also introduces a mathematical model of said mutualistic processes as well as its translation in terms of semantic service description and matching.
Preliminary investigations of the resilience of fractal social organizations are reported.
Simulations show that fractal organizations outperform non-fractal organizations and are able to quickly recover from disruptions and changes characterizing dynamic environments.
A growing number of people are changing the way they consume news, replacing traditional physical newspapers and magazines with their online versions and/or weblogs.
The interactivity and immediacy of online news are changing the way news is produced and presented by media corporations.
News websites have to create effective strategies to catch people's attention and attract their clicks.
In this paper we investigate possible strategies used by online news corporations in the design of their news headlines.
We analyze the content of 69,907 headlines produced by four major global media corporations during a minimum of eight consecutive months in 2014.
In order to discover strategies that could be used to attract clicks, we extracted features from the text of the news headlines related to the sentiment polarity of the headline.
We discovered that the sentiment of the headline is strongly related to the popularity of the news and to the dynamics of the comments posted on that particular news item.
This paper proposes a new method to provide personalized tour recommendation for museum visits.
It combines an optimization of preference criteria of visitors with an automatic extraction of artwork importance from museum information based on Natural Language Processing using textual energy.
This project includes researchers from computer and social sciences.
Numerical experiments show that our model clearly improves the satisfaction of visitors who follow the proposed tour.
This work foreshadows some interesting outcomes and applications about on-demand personalized visit of museums in a very near future.
Interference is emerging as a fundamental bottleneck in many important wireless communication scenarios, including dense cellular networks and cognitive networks with spectrum sharing by multiple service providers.
Although multiple-antenna (MIMO) signal processing is known to offer useful degrees of freedom to cancel interference, extreme-value theoretic analysis recently showed that, even in the absence of MIMO processing, the scaling law of the capacity in the number of users for a multi-cell network with and without inter-cell interference was asymptotically identical provided a simple signal to noise and interference ratio (SINR) maximizing scheduler is exploited.
This suggests that scheduling can help reduce inter-cell interference substantially, thus possibly limiting the need for multiple-antenna processing.
However, the convergence limits of interference after scheduling in a multi-cell setting are not yet identified.
In this paper we analyze such limits theoretically.
We consider channel statistics under Rayleigh fading with equal path loss for all users or with unequal path loss.
We uncover two surprisingly different behaviors for such systems.
For the equal path loss case, we show that scheduling alone can cause the residual interference to converge to zero for a large number of users.
With unequal path loss, however, the interference is shown to converge on average to a nonzero constant.
Simulations back our findings.
One of the purposes of Big Data systems is to support analysis of data gathered from heterogeneous data sources.
Since data warehouses have been used for several decades to achieve the same goal, they could be leveraged also to provide analysis of data stored in Big Data systems.
The problem of adapting data warehouse data and schemata to changes in these requirements as well as data sources has been studied by many researchers worldwide.
However, innovative methods must be developed also to support evolution of data warehouses that are used to analyze data stored in Big Data systems.
In this paper, we propose a data warehouse architecture that supports different kinds of analytical tasks, including OLAP-like analysis, on big data loaded from multiple heterogeneous data sources with different latencies, and that is capable of processing changes in data sources as well as evolving analysis requirements.
The operation of the architecture is highly based on the metadata that are outlined in the paper.
Data security, which is concerned with the prevention of unauthorized access to computers, databases, and websites, helps protect digital privacy and ensure data integrity.
It is extremely difficult, however, to make security watertight, and security breaches are not uncommon.
The consequences of stolen credentials go well beyond the leakage of other types of information because they can further compromise other systems.
This paper criticizes the practice of using clear-text identity attributes, such as Social Security or driver's license numbers -- which are in principle not even secret -- as acceptable authentication tokens or assertions of ownership, and proposes a simple protocol that straightforwardly applies public-key cryptography to make identity claims verifiable, even when they are issued remotely via the Internet.
This protocol has the potential of elevating the business practices of credit providers, rental agencies, and other service companies that have hitherto exposed consumers to the risk of identity theft, to where identity theft becomes virtually impossible.
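The shape of such a protocol — sign an identity claim with a private key so that any relying party can verify it with the public key — can be sketched with textbook RSA. The parameters below are deliberately tiny toy values for illustration only; a real deployment requires full-size keys, proper padding, and a vetted cryptographic library, and this sketch is not the paper's exact protocol.

```python
import hashlib

# Toy RSA parameters for illustration ONLY.
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, kept by the claimant

def sign_claim(claim: str) -> int:
    """Only the holder of the private exponent D can sign a claim."""
    h = int.from_bytes(hashlib.sha256(claim.encode()).digest(), "big") % N
    return pow(h, D, N)

def verify_claim(claim: str, signature: int) -> bool:
    """Any relying party can verify the claim with the public key (N, E)."""
    h = int.from_bytes(hashlib.sha256(claim.encode()).digest(), "big") % N
    return pow(signature, E, N) == h
```

The point of the construction is exactly the one the abstract makes: knowing the clear-text attribute (the claim string) is no longer enough, since producing a valid signature requires the private key.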
We investigate nearest neighbor and generative models for transferring pose between persons.
We take in a video of one person performing a sequence of actions and attempt to generate a video of another person performing the same actions.
Our generative model (pix2pix) outperforms k-NN at both generating corresponding frames and generalizing outside the demonstrated action set.
Our most salient contribution is determining a pipeline (pose detection, face detection, k-NN based pairing) that is effective at performing the desired task.
We also detail several iterative improvements and failure modes.
In this paper, we describe a new Las Vegas algorithm to solve the elliptic curve discrete logarithm problem.
The algorithm depends on a property of the group of rational points of an elliptic curve and is thus not a generic algorithm.
The algorithm that we describe has some similarities with the most powerful index-calculus algorithm for the discrete logarithm problem over a finite field.
Virtual heart models have been proposed to enhance the safety of implantable cardiac devices through closed loop validation.
To communicate with a virtual heart, devices have been driven by cardiac signals at specific sites.
As a result, only the action potentials of these sites are sensed.
However, the real device implanted in the heart will sense a complex combination of near and far-field extracellular potential signals.
Therefore many device functions, such as blanking periods and refractory periods, are designed to handle these unexpected signals.
To represent these signals, we develop an intracardiac electrogram (IEGM) model as an interface between the virtual heart and the device.
The model can capture not only the local excitation but also far-field signals and pacing afterpotentials.
Moreover, the sensing controller can specify unipolar or bipolar electrogram (EGM) sensing configurations and introduce various oversensing and undersensing modes.
The simulation results show that the model is able to reproduce clinically observed sensing problems, which significantly extends the capabilities of the virtual heart model in the context of device validation.
N-continuous orthogonal frequency division multiplexing (NC-OFDM) is a promising technique to obtain significant sidelobe suppression for baseband OFDM signals in future 5G wireless communications.
However, the precoder of NC-OFDM usually causes severe interference and high complexity.
To reduce the interference and complexity, this paper proposes an improved time-domain N-continuous OFDM (TD-NC-OFDM) by shortening the smooth signal, which is linearly combined by rectangularly pulsed OFDM basis signals truncated by a smooth window.
Furthermore, we obtain an asymptotic spectrum analysis of the TD-NC-OFDM signals by a closed-form expression, calculate its low complexity in OFDM transceiver, and derive a closed-form expression of the received signal-to-interference-plus-noise ratio (SINR).
Simulation results show that the proposed low-interference TD-NC-OFDM can achieve similar suppression performance but introduce negligible bit error rate (BER) degradation and much lower computational complexity, compared to conventional NC-OFDM.
Software defect prediction is an important aspect of the preventive maintenance of software.
Many techniques have been employed to improve software quality through defect prediction.
This paper introduces an approach of defect prediction through a machine learning algorithm, support vector machines (SVM), by using the code smells as the factor.
Smell prediction model based on support vector machines was used to predict defects in the subsequent releases of the eclipse software.
The results signify the role of smells in predicting software defects.
They can further serve as a baseline for deeper investigation of the role of smells in defect prediction.
Network coding makes it possible to deploy distributed packet delivery algorithms that locally adapt to network availability in media streaming applications.
However, it may also increase delay and computational complexity if it is not implemented efficiently.
We address here the effective placement of nodes that implement randomized network coding in overlay networks, so that the goodput is kept high while the delay for decoding stays small in streaming applications.
We first estimate the decoding delay at each client, which depends on the innovative rate in the network.
This estimate allows us to identify the nodes that should perform coding in order to reduce the decoding delay.
We then propose two iterative algorithms for selecting the nodes that should perform network coding.
The first algorithm relies on the knowledge of the full network statistics.
The second algorithm uses only local network statistics at each node.
Simulation results show that large performance gains can be achieved with the selection of only a few network coding nodes.
Moreover, the second algorithm performs very closely to the central estimation strategy, which demonstrates that the network coding nodes can be selected efficiently in a distributed manner.
Our scheme shows large gains in terms of achieved throughput, delay and video quality in realistic overlay networks when compared to methods that employ traditional streaming strategies as well as random network nodes selection algorithms.
We model the coexistence of DSRC and WiFi networks as a strategic form game with the networks as the players.
Nodes in a DSRC network must support messaging of status updates that are time sensitive.
Such nodes would like to achieve a small age of information of status updates.
In contrast, nodes in a WiFi network would like to achieve large throughputs.
Each network chooses a medium access probability to be used by all its nodes.
We investigate Nash and Stackelberg equilibrium strategies.
In the first chapter of Shannon's "A Mathematical Theory of Communication," it is shown that the maximum entropy rate of an input process of a constrained system is limited by the combinatorial capacity of the system.
Shannon considers systems where the constraints define regular languages and uses results from matrix theory in his derivations.
In this work, the regularity constraint is dropped.
Using generating functions, it is shown that the maximum entropy rate of an input process is upper-bounded by the combinatorial capacity in general.
The presented results also allow for a new approach to systems with regular constraints.
As an example, the results are applied to binary sequences that fulfill the (j,k) run-length constraint: using the proposed framework, a simple formula for the combinatorial capacity is given and a maxentropic input process is defined.
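For the regular case, the combinatorial capacity of a (j,k) run-length constraint can be computed as log2 of the spectral radius of the constraint's state-transition matrix, in the spirit of Shannon's matrix-theoretic treatment. The sketch below estimates that spectral radius by power iteration; it assumes j < k so that the state graph is aperiodic and the iteration converges.

```python
from math import log2

def rll_capacity(j, k, iters=300):
    """Combinatorial capacity (bits/symbol) of the (j,k) run-length
    constraint: every run of 0s between consecutive 1s has length
    between j and k.  State = current zero-run length; the capacity
    is log2 of the spectral radius of the state-transition matrix,
    estimated by power iteration (assumes j < k for aperiodicity)."""
    n = k + 1
    A = [[0] * n for _ in range(n)]
    for s in range(n):
        if s < k:
            A[s][s + 1] = 1       # emit '0': the current run grows
        if s >= j:
            A[s][0] = 1           # emit '1': allowed once the run is >= j
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [sum(A[r][c] * v[c] for c in range(n)) for r in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return log2(lam)
```

For instance, the classic (1,3) constraint from Shannon's magnetic-recording example yields a capacity of about 0.5515 bits per symbol.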
T-Reqs is a text-based requirements management solution based on the git version control system.
It combines useful conventions, templates and helper scripts with powerful existing solutions from the git ecosystem and provides a working solution to address some known requirements engineering challenges in large-scale agile system development.
Specifically, it allows agile cross-functional teams to be aware of requirements at system level and enables them to efficiently propose updates to those requirements.
Based on our experience with T-Reqs, we i) relate known requirements challenges of large-scale agile system development to tool support; ii) list key requirements for tooling in such a context; and iii) propose concrete solutions for challenges.
The list segment predicate ls used in separation logic for verifying programs with pointers is well-suited to express properties on singly-linked lists.
We study the effects of adding ls to the full propositional separation logic with the separating conjunction and implication, which is motivated by the recent design of new fragments in which all these ingredients are used indifferently and verification tools start to handle the magic wand connective.
This is a very natural extension that has not been studied so far.
We show that the restriction without the separating implication can be solved in polynomial space by using an appropriate abstraction for memory states whereas the full extension is shown undecidable by reduction from first-order separation logic.
Many variants of the logic and fragments are also investigated from the computational point of view when ls is added, providing numerous results about adding reachability predicates to propositional separation logic.
In this paper, we propose a novel approach (SAPEO) to support the survival selection process in multi-objective evolutionary algorithms with surrogate models - it dynamically chooses individuals to evaluate exactly based on the model uncertainty and the distinctness of the population.
We introduce variants that differ in terms of the risk they allow when doing survival selection.
Here, the anytime performance of different SAPEO variants is evaluated in conjunction with an SMS-EMOA using the BBOB bi-objective benchmark.
We compare the obtained results with the performance of the regular SMS-EMOA, as well as another surrogate-assisted approach.
The results open up general questions about the applicability and required conditions for surrogate-assisted multi-objective evolutionary algorithms to be tackled in the future.
Motivated by applications in social network community analysis, we introduce a new clustering paradigm termed motif clustering.
Unlike classical clustering, motif clustering aims to minimize the number of clustering errors associated with both edges and certain higher order graph structures (motifs) that represent "atomic units" of social organizations.
Our contributions are two-fold: We first introduce motif correlation clustering, in which the goal is to agnostically partition the vertices of a weighted complete graph so that certain predetermined "important" social subgraphs mostly lie within the same cluster, while "less relevant" social subgraphs are allowed to lie across clusters.
We then proceed to introduce the notion of motif covers, in which the goal is to cover the vertices of motifs via the smallest number of (near) cliques in the graph.
Motif cover algorithms provide a natural solution for overlapping clustering and they also play an important role in latent feature inference of networks.
For both motif correlation clustering and its extension introduced via the covering problem, we provide hardness results, algorithmic solutions and community detection results for two well-studied social networks.
In this paper, Suprasegmental Hidden Markov Models (SPHMMs) have been used to enhance the recognition performance of text-dependent speaker identification in the shouted environment.
Our speech database consists of two databases: our collected database and the Speech Under Simulated and Actual Stress (SUSAS) database.
Our results show that SPHMMs significantly enhance speaker identification performance compared to Second-Order Circular Hidden Markov Models (CHMM2s) in the shouted environment.
Using our collected database, speaker identification performance in this environment is 68% and 75% based on CHMM2s and SPHMMs respectively.
Using the SUSAS database, speaker identification performance in the same environment is 71% and 79% based on CHMM2s and SPHMMs respectively.
When engineering complex and distributed software and hardware systems (increasingly used in many sectors, such as manufacturing, aerospace, transportation, communication, energy, and health-care), quality has become a big issue, since failures can have economic consequences and can also endanger human life.
Model-based specifications of a component-based system permit to explicitly model the structure and behaviour of components and their integration.
In particular, Software Architectures (SAs) have been advocated as an effective means to produce quality systems.
In this chapter by combining different technologies and tools for analysis and development, we propose an architecture-centric model-driven approach to validate required properties and to generate the system code.
Functional requirements are elicited and used for identifying expected properties the architecture shall express.
The architectural compliance to the properties is formally demonstrated, and the produced architectural model is used to automatically generate the Java code.
Suitable transformations assure that the code is conforming to both structural and behavioural SA constraints.
This chapter describes the process and discusses how some existing tools and languages can be exploited to support the approach.
Blind people can now use maps located at Mapy.cz, thanks to the long-standing joint efforts of the ELSA Center at the Czech Technical University in Prague, the Teiresias Center at Masaryk University, and the company Seznam.cz.
Conventional map underlays are automatically adjusted so that they can be read through touch after being printed on microcapsule paper, which opens a whole new perspective on the use of tactile maps.
Users may select an area of their choice in the Czech Republic (only within its boundaries, for the time being), and the production of tactile maps, including the preparation of the map underlays, takes no more than a few minutes.
Multimodal medical image fusion helps to increase efficiency in medical diagnosis.
This paper presents multimodal medical image fusion by selecting relevant features using Principal Component Analysis (PCA) and Particle Swarm Optimization (PSO).
The Dual-Tree Complex Wavelet Transform (DTCWT) is used to decompose the images into low- and high-frequency coefficients.
Fusion rules such as combination of minimum, maximum and simple averaging are applied to approximate and detailed coefficients.
The fused image is reconstructed by inverse DTCWT.
Performance metrics are evaluated, and the results show that DTCWT-PCA performs better than DTCWT-PSO in terms of Structural Similarity Index Measure (SSIM) and Cross Correlation (CC).
Computation time and feature vector size are reduced in DTCWT-PCA compared to DTCWT-PSO for feature selection, demonstrating its robustness and lower storage requirements.
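The fusion rules described above are short to state in code. The decomposition itself would come from a DTCWT implementation (e.g. a wavelet library), so plain NumPy arrays stand in for the coefficient bands here; this is an illustrative sketch of the rules, not the authors' exact pipeline:

```python
import numpy as np

def fuse_coefficients(approx_a, approx_b, detail_a, detail_b):
    """Fuse wavelet coefficients from two source images.

    Low-frequency (approximation) bands: simple averaging.
    High-frequency (detail) bands: pick the coefficient with the
    larger magnitude, which tends to preserve edges.
    """
    fused_approx = (approx_a + approx_b) / 2.0
    fused_detail = np.where(np.abs(detail_a) >= np.abs(detail_b),
                            detail_a, detail_b)
    return fused_approx, fused_detail
```

The fused bands would then be passed to the inverse DTCWT to reconstruct the fused image.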
We study the relationship between the sentiment levels of Twitter users and the evolving network structure that the users created by @-mentioning each other.
We use a large dataset of tweets to which we apply three sentiment scoring algorithms, including the open source SentiStrength program.
Specifically we make three contributions.
Firstly we find that people who have potentially the largest communication reach (according to a dynamic centrality measure) use sentiment differently than the average user: for example they use positive sentiment more often and negative sentiment less often.
Secondly we find that when we follow structurally stable Twitter communities over a period of months, their sentiment levels are also stable, and sudden changes in community sentiment from one day to the next can in most cases be traced to external events affecting the community.
Thirdly, based on our findings, we create and calibrate a simple agent-based model that is capable of reproducing measures of emotive response comparable to those obtained from our empirical dataset.
In this project, we combine AlphaGo algorithm with Curriculum Learning to crack the game of Gomoku.
Modifications such as the Double Networks Mechanism and Winning Value Decay are implemented to address the intrinsic asymmetry and short-sightedness of Gomoku.
Our final AI, AlphaGomoku, after two days' training on a single GPU, has reached human playing level.
This paper presents a solution based on dual quaternion algebra to the general problem of pose (i.e., position and orientation) consensus for systems composed of multiple rigid-bodies.
The dual quaternion algebra is used to model the agents' poses and also in the distributed control laws, making the proposed technique easily applicable to formation control of general robotic systems.
The proposed pose consensus protocol has guaranteed convergence when the interaction among the agents is represented by directed graphs with directed spanning trees, which is a more general result when compared to the literature on formation control.
In order to illustrate the proposed pose consensus protocol and its extension to the problem of formation control, we present a numerical simulation with a large number of free-flying agents and also an application of cooperative manipulation by using real mobile manipulators.
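The algebraic core of such a dual quaternion representation is short to state: a rigid-body pose is a pair (real, dual) of quaternions, and poses compose via the product below. This is a generic sketch of the algebra, not the paper's consensus control law:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def dqmul(a, b):
    """Compose two unit dual quaternions (real, dual) encoding rigid poses."""
    ar, ad = a
    br, bd = b
    return qmul(ar, br), qmul(ar, bd) + qmul(ad, br)
```

The real part carries the orientation and the dual part encodes the translation, which is why a single product composes both simultaneously.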
We present a new technique for learning visual-semantic embeddings for cross-modal retrieval.
Inspired by hard negative mining, the use of hard negatives in structured prediction, and ranking loss functions, we introduce a simple change to common loss functions used for multi-modal embeddings.
That, combined with fine-tuning and use of augmented data, yields significant gains in retrieval performance.
We showcase our approach, VSE++, on MS-COCO and Flickr30K datasets, using ablation studies and comparisons with existing methods.
On MS-COCO our approach outperforms state-of-the-art methods by 8.8% in caption retrieval and 11.3% in image retrieval (at R@1).
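The hard-negative change underlying VSE++ is a max-of-hinges ranking loss: instead of summing hinge costs over all negatives, only the hardest negative per positive pair is kept. A minimal NumPy sketch (function name and margin value are illustrative):

```python
import numpy as np

def vse_hardest_negative_loss(sim, margin=0.2):
    """Max-of-hinges ranking loss over a similarity matrix.

    sim[i, j] is the similarity between image i and caption j;
    diagonal entries correspond to the matching (positive) pairs.
    """
    n = sim.shape[0]
    pos = np.diag(sim)                  # similarities of true pairs
    mask = np.eye(n, dtype=bool)
    # hinge costs for all negatives
    cost_c = np.clip(margin + sim - pos[:, None], 0, None)  # caption retrieval
    cost_i = np.clip(margin + sim - pos[None, :], 0, None)  # image retrieval
    cost_c[mask] = 0
    cost_i[mask] = 0
    # keep only the hardest negative per positive pair
    return cost_c.max(axis=1).sum() + cost_i.max(axis=0).sum()
```

Gradients then flow only through the most violating negatives, which is the "simple change" that accounts for much of the reported gain.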
This paper investigates stochastic nondeterminism on continuous state spaces by relating nondeterministic kernels and stochastic effectivity functions to each other.
Nondeterministic kernels are functions assigning to each state a set of subprobability measures, and effectivity functions assign to each state an upper-closed set of subsets of measures.
Both concepts are generalizations of Markov kernels used for defining two different models: Nondeterministic labelled Markov processes and stochastic game models, respectively.
We show that an effectivity function that maps into principal filters is given by an image-countable nondeterministic kernel, and that image-finite kernels give rise to effectivity functions.
We define state bisimilarity for the latter, considering its connection to morphisms.
We provide a logical characterization of bisimilarity in the finitary case.
A generalization of congruences (event bisimulations) to effectivity functions and its relation to the categorical presentation of bisimulation are also studied.
This paper proposes a novel multimodal fusion approach, aiming to produce best possible decisions by integrating information coming from multiple media.
While most past multimodal approaches either project the features of different modalities into the same space, or coordinate the representations of each modality through the use of constraints, our approach borrows from both visions.
More specifically, assuming each modality can be processed by a separate deep convolutional network, allowing decisions to be taken independently from each modality, we introduce a central network linking the modality-specific networks.
This central network not only provides a common feature embedding but also regularizes the modality specific networks through the use of multi-task learning.
The proposed approach is validated on 4 different computer vision tasks on which it consistently improves the accuracy of existing multimodal fusion approaches.
Algebraic effects are computational effects that can be represented by an equational theory whose operations produce the effects at hand.
The free model of this theory induces the expected computational monad for the corresponding effect.
Algebraic effects include exceptions, state, nondeterminism, interactive input/output, and time, and their combinations.
Exception handling, however, has so far received no algebraic treatment.
We present such a treatment, in which each handler yields a model of the theory for exceptions, and each handling construct yields the homomorphism induced by the universal property of the free model.
We further generalise exception handlers to arbitrary algebraic effects.
The resulting programming construct includes many previously unrelated examples from both theory and practice, including relabelling and restriction in Milner's CCS, timeout, rollback, and stream redirection.
This paper proposes a novel framework to reconstruct the dynamic magnetic resonance images (DMRI) with motion compensation (MC).
Due to the inherent motion effects during DMRI acquisition, reconstruction of DMRI using motion estimation/compensation (ME/MC) has been studied under a compressed sensing (CS) scheme.
In this paper, by embedding the intensity-based optical flow (OF) constraint into the traditional CS scheme, we are able to couple the DMRI reconstruction with motion field estimation.
The formulated optimization problem is solved by a primal-dual algorithm with linesearch due to its efficiency when dealing with non-differentiable problems.
With the estimated motion field, the DMRI reconstruction is refined through MC.
By employing a multi-scale coarse-to-fine strategy, we are able to update the variables (temporal image sequences and motion vectors) and to refine the image reconstruction alternately.
Moreover, the proposed framework is capable of handling a wide class of prior information (regularizations) for DMRI reconstruction, such as sparsity, low rank and total variation.
Experiments on various DMRI data, ranging from in vivo lung to cardiac dataset, validate the reconstruction quality improvement using the proposed scheme in comparison to several state-of-the-art algorithms.
In recent years, there have been many works that use website fingerprinting techniques to enable a local adversary to determine which website a Tor user is visiting.
However, most of these works rely on manually extracted features, and thus are fragile: a small change in the protocol or a simple defense often renders these attacks useless.
In this work, we leverage deep learning techniques to create a more robust attack that does not require any manually extracted features.
Specifically, we propose Var-CNN, an attack that uses model variations on convolutional neural networks with both the packet sequence and packet timing data.
In open-world settings, Var-CNN attains a higher true positive rate (90.9%) and a lower false positive rate (0.3%) than any prior work.
Moreover, these improvements are observed even with low amounts of training data, where deep learning techniques often suffer.
Given the severity of our attacks, we also introduce a new countermeasure, DynaFlow, based on dynamically adjusting flows to protect against website fingerprinting attacks.
DynaFlow provides a similar level of security as current state-of-the-art and defeats all attacks, including our own, while being over 40% more efficient than existing defenses.
Moreover, unlike many prior defenses, DynaFlow can protect dynamically generated websites as well.
The discovery of influential entities in all kinds of networks (e.g. social, digital, or computer) has always been an important field of study.
In recent years, Online Social Networks (OSNs) have been established as a basic means of communication and often influencers and opinion makers promote politics, events, brands or products through viral content.
In this work, we present a systematic review of i) online social influence metrics, properties, and applications, and ii) the role of semantics in modeling OSN information.
We end up with the conclusion that both areas can jointly provide useful insights towards the qualitative assessment of viral user-generated content, as well as for modeling the dynamic properties of influential content and its flow dynamics.
Proper management of requirements is crucial to the successful development of software within limited time and cost.
Nonfunctional requirements (NFR) are one of the key criteria to derive a comparison among various software systems.
In most software development, NFRs have been specified as additional requirements of the software.
NFRs such as performance, reliability, maintainability, security, and accuracy have to be considered at the early stages of software development, alongside functional requirements (FRs).
However, identifying NFR is not an easy task.
Although there are well-developed techniques for eliciting functional requirements, there is a lack of elicitation mechanisms for NFRs and no proper consensus regarding NFR elicitation techniques.
Eliciting NFRs is considered one of the most challenging jobs in requirements analysis.
This paper proposes a UML use-case based questionnaire approach for identifying and classifying the NFRs of a system.
The proposed approach is illustrated using a Point of Sale (POS) case study.
In this study, a novel illuminant color estimation framework is proposed for color constancy, which incorporates the high representational capacity of deep-learning-based models and the great interpretability of assumption-based models.
The well-designed building block, the feature map reweight unit (ReWU), helps to achieve comparable accuracy on benchmark datasets with respect to prior state-of-the-art models while requiring only 1%-5% of the model size and 8%-20% of the computational cost.
In addition to local color estimation, a confidence estimation branch is also included such that the model is able to produce point estimate and its uncertainty estimate simultaneously, which provides useful clues for local estimates aggregation and multiple illumination estimation.
The source code and the dataset are available at https://github.com/QiuJueqin/Reweight-CC.
Many modern Artificial Intelligence (AI) systems make use of data embeddings, particularly in the domain of Natural Language Processing (NLP).
These embeddings are learnt from data that has been gathered "from the wild" and have been found to contain unwanted biases.
In this paper we make three contributions towards measuring, understanding and removing this problem.
We present a rigorous way to measure some of these biases, based on the use of word lists created for social psychology applications; we observe how gender bias in occupations reflects actual gender bias in the same occupations in the real world; and finally we demonstrate how a simple projection can significantly reduce the effects of embedding bias.
All this is part of an ongoing effort to understand how trust can be built into AI systems.
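The projection mentioned above is standard linear algebra: remove from each embedding its component along an estimated bias direction. A minimal sketch (the function name is ours; estimating the bias direction, e.g. from word lists, is a separate step not shown):

```python
import numpy as np

def debias(vectors, bias_direction):
    """Project out the component of each embedding along a bias direction.

    vectors: (n, d) array of embeddings; bias_direction: (d,) vector.
    """
    b = bias_direction / np.linalg.norm(bias_direction)
    # subtract, from each row, its projection onto the unit bias direction
    return vectors - np.outer(vectors @ b, b)
```

After this step every embedding is orthogonal to the bias direction, so similarity comparisons along that axis are neutralized.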
In a full-duplex (FD) multi-user network, the system performance is not only limited by the self-interference but also by the co-channel interference due to the simultaneous uplink and downlink transmissions.
Joint design of the uplink/downlink transmission direction of users and the power allocation is crucial for achieving high system performance in the FD multi-user network.
In this paper, we investigate the joint uplink/downlink transmission direction assignment (TDA), user pairing (UP), and power allocation problem for maximizing the system max-min fairness (MMF) rate in an FD multi-user orthogonal frequency division multiple access (OFDMA) system.
The problem is formulated with a two-time-scale structure where the TDA and the UP variables are for optimizing a long-term MMF rate while the power allocation is for optimizing an instantaneous MMF rate during each channel coherence interval.
We show that the studied joint MMF rate maximization problem is NP-hard in general.
To obtain high-quality suboptimal solutions, we propose efficient methods based on simple relaxation and greedy rounding techniques.
Simulation results are presented to show that the proposed algorithms are effective and achieve higher MMF rates than the existing heuristic methods.
A social network consists of a set of actors and a set of relationships between them which describe certain patterns of communication.
Most current networks are huge and difficult to analyze and visualize.
One frequently used method is to extract the most important features, that is, to create an abstraction: the transformation of a large network into a much smaller one that serves as a useful summary of the original while keeping its most important characteristics.
In the case of a social network it can be achieved in two ways.
One is to find groups of actors and present only them and relationships between them.
The other is to find actors who play similar roles and to construct a smaller network in which the connection between the actors would be replaced with connections between the roles.
Classifying actors by the roles they are playing in the network can help to understand 'who is who' in a social network.
This classification can be very useful, because it gives us a comprehensive view of the network and helps to understand how the network is organized, and to predict how it could behave in the case of certain events (internal or external).
Communicating and sharing intelligence among agents is an important facet of achieving Artificial General Intelligence.
As a first step towards this challenge, we introduce a novel framework for image generation: Message Passing Multi-Agent Generative Adversarial Networks (MPM GANs).
While GANs have recently been shown to be very effective for image generation and other tasks, these networks have been limited to mostly single generator-discriminator networks.
We show that we can obtain multi-agent GANs that communicate through message passing to achieve better image generation.
The objectives of the individual agents in this framework are twofold: a cooperation objective and a competition objective.
The cooperation objective ensures that the message-sharing mechanism guides the other generator to generate better than itself, while the competition objective encourages each generator to generate better than its counterpart.
We analyze and visualize the messages that these GANs share among themselves in various scenarios.
We quantitatively show that the message sharing formulation serves as a regularizer for the adversarial training.
Qualitatively, we show that the different generators capture different traits of the underlying data distribution.
Energy harvesting is a technology for enabling green, sustainable, and autonomous wireless networks.
In this paper, a large-scale wireless network with energy harvesting transmitters is considered, where a group of transmitters forms a cluster to cooperatively serve a desired receiver amid interference and noise.
To characterize the link-level performance, closed-form expressions are derived for the transmission success probability at a receiver in terms of key parameters such as node densities, energy harvesting parameters, channel parameters, and cluster size, for a given cluster geometry.
The analysis is further extended to characterize a network-level performance metric, capturing the tradeoff between link quality and the fraction of receivers served.
Numerical simulations validate the accuracy of the analytical model.
Several useful insights are provided.
For example, while more cooperation helps improve the link-level performance, the network-level performance might degrade with the cluster size.
Numerical results show that a small cluster size (typically 3 or smaller) optimizes the network-level performance.
Furthermore, substantial performance can be extracted with a relatively small energy buffer.
Moreover, the utility of having a large energy buffer increases with the energy harvesting rate as well as with the cluster size in sufficiently dense networks.
The majority of deep neural network (DNN) based speech enhancement algorithms rely on the mean-square error (MSE) criterion of short-time spectral amplitudes (STSA), which has no apparent link to human perception, e.g. speech intelligibility.
Short-Time Objective Intelligibility (STOI), a popular state-of-the-art speech intelligibility estimator, on the other hand, relies on linear correlation of speech temporal envelopes.
This raises the question if a DNN training criterion based on envelope linear correlation (ELC) can lead to improved speech intelligibility performance of DNN based speech enhancement algorithms compared to algorithms based on the STSA-MSE criterion.
In this paper we derive that, under certain general conditions, the STSA-MSE and ELC criteria are practically equivalent, and we provide empirical data to support our theoretical results.
Furthermore, our experimental findings suggest that the standard STSA minimum-MSE estimator is near optimal, if the objective is to enhance noisy speech in a manner which is optimal with respect to the STOI speech intelligibility estimator.
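The ELC quantity at the heart of this comparison is simply the Pearson (linear) correlation between clean and processed temporal envelopes. A minimal sketch (envelope extraction itself is omitted; the function name is ours):

```python
import numpy as np

def envelope_linear_correlation(x, y):
    """Pearson correlation between two temporal envelopes.

    Returns a value in [-1, 1]; 1 means the processed envelope
    is a perfect (positively scaled) match of the clean one.
    """
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
```

STOI aggregates correlations of this kind over short time-frequency segments, which is why a training criterion built on ELC is a natural candidate for intelligibility-oriented enhancement.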
The availability of large-scale annotated image datasets coupled with recent advances in supervised deep learning methods are enabling the derivation of representative image features that can potentially impact different image analysis problems.
However, such supervised approaches are not feasible in the medical domain where it is challenging to obtain a large volume of labelled data due to the complexity of manual annotation and inter- and intra-observer variability in label assignment.
Algorithms designed to work on small annotated datasets are useful but have limited applications.
In an effort to address the lack of annotated data in the medical image analysis domain, we propose an algorithm for hierarchical unsupervised feature learning.
Our algorithm introduces three new contributions: (i) we use kernel learning to identify and represent invariant characteristics across image sub-patches in an unsupervised manner; (ii) we leverage the sparsity inherent to medical image data and propose a new sparse convolutional kernel network (S-CKN) that can be pre-trained in a layer-wise fashion, thereby providing initial discriminative features for medical data; and (iii) we propose a spatial pyramid pooling framework to capture subtle geometric differences in medical image data.
Our experiments evaluate our algorithm in two common application areas of medical image retrieval and classification using two public datasets.
Our results demonstrate that the medical image feature representations extracted with our algorithm enable a higher accuracy in both application areas compared to features extracted from other conventional unsupervised methods.
Furthermore, our approach achieves an accuracy that is competitive with state-of-the-art supervised CNNs.
External or internal domain-specific languages (DSLs) or (fluent) APIs?
Whoever you are -- a developer or a user of a DSL -- you usually have to choose your side; you should not!
What about metamorphic DSLs that change their shape according to your needs?
We report on our 4-year journey of providing the "right" support (in the domain of feature modeling), which led us to develop an external DSL and different shapes of an internal API, and to maintain all these languages.
A key insight is that there is no one-size-fits-all solution, nor a clear superiority of one solution over another.
On the contrary, we found that it makes sense to continue maintaining both an external and an internal DSL.
The vision that we foresee for the future of software languages is their ability to be self-adaptable to the most appropriate shape (including the corresponding integrated development environment) according to a particular usage or task.
We call such a language, able to change from one shape to another, a metamorphic DSL.
Enterprise software systems make complex interactions with other services in their environment.
Developing and testing for production-like conditions is therefore a challenging task.
Prior approaches include emulations of the dependency services using either explicit modelling or record-and-replay approaches.
Models require deep knowledge of the target services while record-and-replay is limited in accuracy.
We present a new technique that improves the accuracy of record-and-replay approaches, without requiring prior knowledge of the services.
The approach uses multiple sequence alignment to derive message prototypes from recorded system interactions and a scheme to match incoming request messages against message prototypes to generate response messages.
We introduce a modified Needleman-Wunsch algorithm for distance calculation during message matching, wildcards in message prototypes for high variability sections, and entropy-based weightings in distance calculations for increased accuracy.
Combined, our new approach has shown greater than 99% accuracy for four evaluated enterprise system messaging protocols.
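A plain Needleman-Wunsch distance with free-matching wildcard symbols can be sketched as follows; the paper's full scheme additionally uses entropy-based position weights, which are omitted here for brevity (the wildcard symbol and the cost values are illustrative):

```python
def needleman_wunsch_distance(a, b, wildcard="?", match=0, mismatch=1, gap=1):
    """Global alignment distance between two message strings.

    A wildcard symbol in either message matches any symbol at no cost,
    modelling high-variability sections in message prototypes.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * gap
    for j in range(1, m + 1):
        d[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1] or wildcard in (a[i - 1], b[j - 1]):
                sub = match
            else:
                sub = mismatch
            d[i][j] = min(d[i - 1][j - 1] + sub,   # substitute / match
                          d[i - 1][j] + gap,       # gap in b
                          d[i][j - 1] + gap)       # gap in a
    return d[n][m]
```

An incoming request would be matched against the prototype with the smallest such distance, and the corresponding recorded response would be replayed.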
Deep Learning is one of the newest trends in Machine Learning and Artificial Intelligence research.
It is also one of the most popular scientific research trends nowadays.
Deep learning methods have brought revolutionary advances in computer vision and machine learning.
Every now and then, new deep learning techniques emerge, outperforming state-of-the-art machine learning and even existing deep learning techniques.
In recent years, the world has seen many major breakthroughs in this field.
Since deep learning is evolving at a tremendous pace, it is hard to keep track of the regular advances, especially for new researchers.
In this paper, we briefly discuss recent advances in deep learning over the past few years.
Structured prediction is ubiquitous in applications of machine learning such as knowledge extraction and natural language processing.
Structure often can be formulated in terms of logical constraints.
We consider the question of how to perform efficient active learning in the presence of logical constraints among variables inferred by different classifiers.
We propose several methods and provide theoretical results that demonstrate the inappropriateness of employing uncertainty guided sampling, a commonly used active learning method.
Furthermore, experiments on ten different datasets demonstrate that the methods significantly outperform alternatives in practice.
The results are of practical significance in situations where labeled data is scarce.
Accurate Traffic Sign Detection (TSD) can help intelligent systems make better decisions according to the traffic regulations.
TSD, which can be regarded as a typical small-object detection problem, is fundamental in Advanced Driver Assistance Systems (ADAS) and self-driving.
However, although deep neural networks have achieved human and even superhuman performance on several tasks, small-object detection remains an open question due to their inherent limitations.
In this paper, we propose a brain-inspired network, named KB-RANN, to handle this problem.
Since attention is an essential function of the human brain, we use a novel recurrent attentive neural network to improve detection accuracy in a fine-grained manner.
Further, we combine domain-specific knowledge and intuitive knowledge to improve efficiency.
Experimental results show that our method achieves better performance than several popular object detection methods.
More significantly, we ported our method to our custom-designed embedded system and successfully deployed it on our self-driving car.
Convolutional Neural Networks (CNNs) have recently emerged as the dominant model in computer vision.
If provided with enough training data, they predict almost any visual quantity.
In a discrete setting, such as classification, CNNs are not only able to predict a label but often predict a confidence in the form of a probability distribution over the output space.
In continuous regression tasks, such a probability estimate is often lacking.
We present a regression framework which models the output distribution of neural networks.
This output distribution allows us to infer the most likely labeling following a set of physical or modeling constraints.
These constraints capture the intricate interplay between different input and output variables, and complement the output of a CNN.
However, they may not hold everywhere.
Our setup further allows learning a confidence with which a constraint holds, in the form of a distribution of the constraint satisfaction.
We evaluate our approach on the problem of intrinsic image decomposition, and show that constrained structured regression significantly increases the state-of-the-art.
Several studies have been conducted on understanding third-party user tracking on the web.
However, web trackers can only track users on sites where they are embedded by the publisher, thus obtaining a fragmented view of a user's online footprint.
In this work, we investigate a different form of user tracking, where browser extensions are repurposed to capture the complete online activities of a user and communicate the collected sensitive information to a third-party domain.
We conduct an empirical study of spying browser extensions on the Chrome Web Store.
First, we present an in-depth analysis of the spying behavior of these extensions.
We observe that these extensions steal a variety of sensitive user information, such as the complete browsing history (e.g., the sequence of web traversals), online social network (OSN) access tokens, IP address, and user geolocation.
Second, we investigate the potential for automatically detecting spying extensions by applying machine learning schemes.
We show that using a Recurrent Neural Network (RNN), the sequences of browser API calls can be a robust feature, outperforming hand-crafted features (used in prior work on malicious extensions) to detect spying extensions.
Our RNN based detection scheme achieves a high precision (90.02%) and recall (93.31%) in detecting spying extensions.
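A sequence of browser API call IDs naturally feeds a recurrent model. The toy Elman-style RNN below scores a call sequence with a logistic output; it is untrained and randomly initialized, purely illustrative of the architecture, not the paper's trained detector:

```python
import numpy as np

rng = np.random.default_rng(0)

class TinyRNN:
    """Minimal Elman RNN scoring a sequence of API-call IDs (illustrative)."""

    def __init__(self, vocab, hidden):
        self.E = rng.normal(scale=0.1, size=(vocab, hidden))   # call embeddings
        self.W = rng.normal(scale=0.1, size=(hidden, hidden))  # recurrence
        self.v = rng.normal(scale=0.1, size=hidden)            # output weights

    def score(self, call_ids):
        h = np.zeros(self.W.shape[0])
        for c in call_ids:
            # fold each API call into the hidden state
            h = np.tanh(self.E[c] + self.W @ h)
        # logistic score: probability that the extension is "spying"
        return 1.0 / (1.0 + np.exp(-self.v @ h))
```

In practice the weights would be trained with a cross-entropy loss on labelled extension traces; the point here is only that the hidden state summarizes the whole call sequence, which is what lets learned features outperform hand-crafted ones.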
The new era of computing called Cloud Computing allows the user to access the cloud services dynamically over the Internet wherever and whenever needed.
The cloud consists of data and resources, and cloud services include the delivery of software, infrastructure, applications, and storage over the Internet based on user demand.
In short, cloud computing is a business and economic model allowing the users to utilize high-end computing and storage virtually with minimal infrastructure on their end.
Cloud has three service models namely, Cloud Software-as-a-Service (SaaS), Cloud Platform-as-a-Service (PaaS), and Cloud Infrastructure-as-a-Service (IaaS).
This paper discusses cloud infrastructure service management in depth.
The capability to operate cloud-native applications can generate enormous business growth and value.
But enterprise architects should be aware that cloud-native applications are vulnerable to vendor lock-in.
We investigated cloud-native application design principles, public cloud service providers, and industrial cloud standards.
All results indicate that most cloud service categories seem to foster vendor lock-in situations which might be especially problematic for enterprise architectures.
This might sound disillusioning at first.
However, we present a reference model for cloud-native applications that relies only on a small subset of well standardized IaaS services.
The reference model can be used for codifying cloud technologies.
It can guide technology identification, classification, adoption, research, and development processes for cloud-native applications and for vendor lock-in aware enterprise architecture engineering methodologies.
Variational auto-encoders (VAEs) provide an attractive solution to image generation problem.
However, they tend to produce blurred and over-smoothed images due to their dependence on pixel-wise reconstruction loss.
This paper introduces a new approach to alleviate this problem in the VAE based generative models.
Our model simultaneously learns to match the data, the reconstruction loss, and the latent distributions of real and fake images to improve the quality of generated samples.
To compute the loss distributions, we introduce an auto-encoder based discriminator model which allows an adversarial learning procedure.
The discriminator in our model also provides perceptual guidance to the VAE by matching the learned similarity metric of the real and fake samples in the latent space.
To stabilize the overall training process, our model uses an error feedback approach to maintain the equilibrium between competing networks in the model.
Our experiments show that the generated samples from our proposed model exhibit a diverse set of attributes and facial expressions and scale up to high-resolution images very well.
This paper applies the multibond graph approach for rigid multibody systems to model the dynamics of general spatial mechanisms.
The commonly used quick return mechanism, which comprises revolute as well as prismatic joints, has been chosen as a representative example to demonstrate the application of this technique and its resulting advantages.
In this work, the links of the quick return mechanism are modeled as rigid bodies.
The rigid links are then coupled at the joints based on the nature of constraint.
This alternative method of formulation of system dynamics, using Bond Graphs, offers a rich set of features that include pictorial representation of the dynamics of translation and rotation for each link of the mechanism in the inertial frame, representation and handling of constraints at the joints, depiction of causality, obtaining dynamic reaction forces and moments at various locations in the mechanism and so on.
Yet another advantage of this approach is that the coding for simulation can be carried out directly from the Bond Graph in an algorithmic manner, without deriving system equations.
In this work, the program code for simulation is written in MATLAB.
The vector and tensor operations are conveniently represented in MATLAB, resulting in a compact and optimized code.
The simulation results are plotted and discussed in detail.
In the field of robust geometric computation it is often necessary to make exact decisions based on inexact floating-point arithmetic.
One common approach is to store the computation history in an arithmetic expression dag and to re-evaluate the expression with increasing precision until an exact decision can be made.
We show that exact-decisions number types based on expression dags can be evaluated faster in practice through parallelization on multiple cores.
We compare the impact of several restructuring methods for the expression dag on its running time in a parallel environment.
This document describes a library for similarity searching.
Even though the library contains a variety of metric-space access methods, our main focus is on search methods for non-metric spaces.
Because there are fewer exact solutions for non-metric spaces, many of our methods give only approximate answers.
Thus, the methods are evaluated in terms of efficiency-effectiveness trade-offs rather than merely in terms of their efficiency.
Our goal is, therefore, to provide not only state-of-the-art approximate search methods for both non-metric and metric spaces, but also the tools to measure search quality.
We concentrate on technical details, i.e., how to compile the code, run the benchmarks, evaluate results, and use our code in other applications.
Additionally, we explain how to extend the code by adding new search methods and spaces.
One source of disturbance in a pulsed T-ray signal is attributed to ambient water vapor.
Water molecules in the gas phase selectively absorb T-rays at discrete frequencies corresponding to their molecular rotational transitions.
This results in prominent resonances spread over the T-ray spectrum, and in the time domain the T-ray signal is observed as fluctuations after the main pulse.
These effects are generally undesired, since they may mask critical spectroscopic data.
Consequently, ambient water vapor is commonly removed from the T-ray path by using a closed chamber during the measurement.
Yet, in some applications a closed chamber is not feasible.
This situation therefore motivates the need for another method to reduce these unwanted artifacts.
This paper presents a study on a computational means to address the problem.
Initially, a complex frequency response of water vapor is modeled from a spectroscopic catalog.
Using a deconvolution technique, together with fine tuning of the strength of each resonance, parts of the water-vapor response are removed from a measured T-ray signal, with minimal signal distortion.
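As a minimal sketch of the deconvolution step, the code below removes a modeled frequency response from a signal using Wiener-style regularization; the response taps, the regularization constant, and the toy pulse are hypothetical stand-ins for the catalog-derived water-vapor response and the resonance-strength tuning described above:

```python
import numpy as np

def regularized_deconvolve(y, h, eps=1e-3):
    """Remove a modeled frequency response h from signal y.

    Wiener-style regularized deconvolution: X = Y * conj(H) / (|H|^2 + eps).
    y is the time-domain signal; h is the time-domain impulse response,
    zero-padded internally to the same length.
    """
    Y = np.fft.rfft(y)
    H = np.fft.rfft(h, n=len(y))
    X = Y * np.conj(H) / (np.abs(H) ** 2 + eps)
    return np.fft.irfft(X, n=len(y))

# Toy check: convolving a pulse with a response and then deconvolving
# should approximately recover the original pulse.
pulse = np.zeros(256)
pulse[20] = 1.0
h = np.zeros(256)
h[0], h[5], h[11] = 1.0, 0.4, 0.2   # hypothetical response taps
measured = np.fft.irfft(np.fft.rfft(pulse) * np.fft.rfft(h), n=256)
recovered = regularized_deconvolve(measured, h, eps=1e-6)
```

The regularization term `eps` prevents noise blow-up at frequencies where the modeled response is small, at the cost of leaving a small residual of the response in place.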
Deep-neural-network (DNN) based noise suppression systems yield significant improvements over conventional approaches such as spectral subtraction and non-negative matrix factorization, but do not generalize well to noise conditions they were not trained for.
In comparison to DNNs, humans show remarkable noise suppression capabilities that yield successful speech intelligibility under various adverse listening conditions and negative signal-to-noise ratios (SNRs).
Motivated by the excellent human performance, this paper explores whether numerical models that simulate human cochlear signal processing can be combined with DNNs to improve the robustness of DNN based noise suppression systems.
Five cochlear models were coupled to fully-connected and recurrent NN-based noise suppression systems and were trained and evaluated for a variety of noise conditions using objective metrics: perceptual speech quality (PESQ), segmental SNR and cepstral distance.
The simulations show that biophysically-inspired cochlear models improve the generalizability of DNN-based noise suppression systems for unseen noise and negative SNRs.
This approach thus leads to robust noise suppression systems that are less sensitive to the noise type and noise level.
Because cochlear models capture the intrinsic nonlinearities and dynamics of peripheral auditory processing, it is shown here that accounting for their deterministic signal processing improves machine hearing and avoids overtraining of multi-layer DNNs.
We hence conclude that machines hear better when realistic cochlear models are used at the input of DNNs.
Computing layer similarities is an important way of characterizing multiplex networks because various static properties and dynamic processes depend on the relationships between layers.
We provide a taxonomy and experimental evaluation of approaches to compare layers in multiplex networks.
Our taxonomy includes, systematizes and extends existing approaches, and is complemented by a set of practical guidelines on how to apply them.
Recent state-of-the-art scene text recognition methods have primarily focused on horizontal text in images.
However, in several Asian countries, including China, large amounts of text in signs, books, and TV commercials are vertically directed.
Because the horizontal and vertical texts exhibit different characteristics, developing an algorithm that can simultaneously recognize both types of text in real environments is necessary.
To address this problem, we adopted the direction encoding mask (DEM) and selective attention network (SAN) methods based on supervised learning.
DEM contains directional information to compensate for cases that lack text direction; our network is therefore trained using this information to handle vertical text.
The SAN method is designed to work individually for both types of text.
To train the network to recognize both types of text and to evaluate the effectiveness of the designed model, we prepared a new synthetic vertical text dataset and collected an actual vertical text dataset (VTD142) from the Web.
Using these datasets, we proved that our proposed model can accurately recognize both vertical and horizontal text and can achieve state-of-the-art results in experiments using benchmark datasets, including the street view test (SVT), IIIT-5k, and ICDAR.
Although our model is relatively simple compared to its predecessors, it maintains accuracy and is trained in an end-to-end manner.
The area of Handwritten Signature Verification has been broadly researched in the last decades, but remains an open research problem.
The objective of signature verification systems is to discriminate if a given signature is genuine (produced by the claimed individual), or a forgery (produced by an impostor).
This has proven to be a challenging task, in particular in the offline (static) scenario, which uses images of scanned signatures and where dynamic information about the signing process is not available.
Many advancements have been proposed in the literature in the last 5-10 years, most notably the application of Deep Learning methods to learn feature representations from signature images.
In this paper, we present how the problem has been handled in the past few decades, analyze the recent advancements in the field, and the potential directions for future research.
One of the main aims of the so-called Web of Data is to be able to handle heterogeneous resources where data can be expressed in either XML or RDF.
The design of programming languages able to handle both XML and RDF data is a key target in this context.
In this paper we present a framework called XQOWL that makes it possible to handle XML and RDF/OWL data with XQuery.
XQOWL can be considered an extension of the XQuery language that connects XQuery with SPARQL and OWL reasoners.
XQOWL embeds SPARQL queries in XQuery (via the Jena SPARQL engine) and enables calls to OWL reasoners (HermiT, Pellet, and FaCT++) from XQuery.
It permits combining queries against XML and RDF/OWL resources as well as reasoning with RDF/OWL data.
Therefore input data can be either XML or RDF/OWL and output data can be formatted in XML (also using RDF/OWL XML serialization).
A major issue of locally repairable codes is their robustness.
If a local repair group is unable to perform the repair process, the repair cost increases.
Therefore, it is critical for a locally repairable code to have multiple repair groups.
In this paper we consider robust locally repairable coding schemes which guarantee that there exist multiple alternative local repair groups for any single failure such that the failed node can still be repaired locally even if some of the repair groups are not available.
We use linear programming techniques to establish upper bounds on the code size of these codes.
Furthermore, we address the update efficiency problem of the distributed data storage networks.
Any modification on the stored data will result in updating the content of the storage nodes.
Therefore, it is essential to minimise the number of nodes which need to be updated by any change in the stored data.
We characterise the properties of update-efficient storage codes and establish the necessary conditions that the weight enumerator of these codes needs to satisfy.
In this paper we present the performance of parallel text processing with MapReduce on a cloud platform.
Scientific papers in the Turkish language are processed using the Zemberek NLP library.
Experiments were run on a Hadoop cluster and compared with a single machine's performance.
We present a transition-based AMR parser that directly generates AMR parses from plain text.
We use Stack-LSTMs to represent our parser state and make decisions greedily.
In our experiments, we show that our parser achieves very competitive scores on English using only AMR training data.
Adding additional information, such as POS tags and dependency trees, improves the results further.
Vision algorithms capable of interpreting scenes from a real-time video stream are necessary for computer-assisted surgery systems to achieve context-aware behavior.
In laparoscopic procedures one particular algorithm needed for such systems is the identification of surgical phases, for which the current state of the art is a model based on a CNN-LSTM.
A number of previous works using models of this kind have trained them in a fully supervised manner, requiring a fully annotated dataset.
Instead, our work confronts the problem of learning surgical phase recognition in scenarios presenting scarce amounts of annotated data (under 25% of all available video recordings).
We propose a teacher/student type of approach, where a strong predictor called the teacher, trained beforehand on a small dataset of ground truth-annotated videos, generates synthetic annotations for a larger dataset, which another model - the student - learns from.
In our case, the teacher features a novel CNN-biLSTM-CRF architecture, designed for offline inference only.
The student, on the other hand, is a CNN-LSTM capable of making real-time predictions.
Results for various amounts of manually annotated videos demonstrate the superiority of the new CNN-biLSTM-CRF predictor as well as improved performance from the CNN-LSTM trained using synthetic labels generated for unannotated videos.
For both offline and online surgical phase recognition with very few annotated recordings available, this new teacher/student strategy provides a valuable performance improvement by efficiently leveraging the unannotated data.
Information extraction identifies useful and relevant text in a document and converts unstructured text into a form that can be loaded into a database table.
Named entity extraction is a main task in the process of information extraction and is a classification problem in which words are assigned to one or more semantic classes or to a default non-entity class.
A word that can belong to one or more classes and that carries a level of uncertainty is best handled by a self-learning fuzzy logic technique.
This paper proposes a method for detecting the presence of spatial uncertainty in text and dealing with spatial ambiguity using named entity extraction techniques coupled with self-learning fuzzy logic techniques.
Confluence denotes the property of a state transition system that states can be rewritten in more than one way yielding the same result.
Although it is a desirable property, confluence is often too strict in practical applications because it also considers states that can never be reached in practice.
Additionally, sometimes states that have the same semantics in the practical context are considered as different states due to different syntactic representations.
By introducing suitable invariants and equivalence relations on the states, programs may have the property to be confluent modulo the equivalence relation w.r.t. the invariant which often is desirable in practice.
In this paper, a sufficient and necessary criterion for confluence modulo equivalence w.r.t. an invariant for Constraint Handling Rules (CHR) is presented.
It is the first approach that covers invariant-based confluence modulo equivalence for the de facto standard semantics of CHR.
There is a trade-off between practical applicability and the simplicity of proving a confluence property.
Therefore, a more manageable subset of equivalence relations has been identified that allows for the proposed confluence criterion and simplifies the confluence proofs by using well-established CHR analysis methods.
It is widely acknowledged that function symbols are an important feature in answer set programming, as they make modeling easier, increase the expressive power, and allow us to deal with infinite domains.
The main issue with their introduction is that the evaluation of a program might not terminate and checking whether it terminates or not is undecidable.
To cope with this problem, several classes of logic programs have been proposed where the use of function symbols is restricted but the program evaluation termination is guaranteed.
Despite the significant body of work in this area, current approaches do not include many simple practical programs whose evaluation terminates.
In this paper, we present the novel classes of rule-bounded and cycle-bounded programs, which overcome different limitations of current approaches by performing a more global analysis of how terms are propagated from the body to the head of rules.
Results on the correctness, the complexity, and the expressivity of the proposed approach are provided.
We propose two robust methods for anomaly detection in dynamic networks in which the properties of normal traffic are time-varying.
We formulate the robust anomaly detection problem as a binary composite hypothesis testing problem and propose two methods: a model-free and a model-based one, leveraging techniques from the theory of large deviations.
Both methods require a family of Probability Laws (PLs) that represent normal properties of traffic.
We devise a two-step procedure to estimate this family of PLs.
We compare the performance of our robust methods and their vanilla counterparts, which assume that normal traffic is stationary, on a network with a diurnal normal pattern and a common anomaly related to data exfiltration.
Simulation results show that our robust methods perform better than their vanilla counterparts in dynamic networks.
This paper addresses the problem of distributed event localization using noisy range measurements with respect to sensors with known positions.
Event localization is fundamental in many wireless sensor network applications such as homeland security, law enforcement, and environmental studies.
However, most existing distributed algorithms require the target event to be within the convex hull of the deployed sensors.
Based on the alternating direction method of multipliers (ADMM), we propose two scalable distributed algorithms named GS-ADMM and J-ADMM which do not require the target event to be within the convex hull of the deployed sensors.
More specifically, the two algorithms can be implemented in a scenario in which the entire sensor network is divided into several clusters with cluster heads collecting measurements within each cluster and exchanging intermediate computation information to achieve localization consistency (consensus) across all clusters.
This scenario is important in many applications such as homeland security and law enforcement.
Simulation results confirm effectiveness of the proposed algorithms.
In recent years, an increasing amount of data is collected in different, and often non-cooperative, databases.
The problem of privacy-preserving distributed calculations over separated databases, and the related issue of private data release, have been intensively investigated.
However, despite a considerable progress, computational complexity, due to an increasing size of data, remains a limiting factor in real-world deployments, especially in case of privacy-preserving computations.
In this paper, we present a general method for trading off performance against accuracy in distributed calculations by performing data sampling.
Sampling was a topic of extensive research that recently received a boost of interest.
We provide a sampling method targeted at separate, non-collaborating, vertically partitioned datasets.
The method is exemplified and tested on approximation of intersection set both without and with privacy-preserving mechanism.
An analysis of the bound on the error as a function of the sample size is discussed, and a heuristic algorithm is suggested to further improve the performance.
The algorithms were implemented and experimental results confirm the validity of the approach.
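The core idea of trading accuracy for performance on intersection estimation can be sketched as follows (a minimal illustration without the privacy-preserving mechanism: sample from one set, check membership in the other, and scale the hit count):

```python
import random

def estimate_intersection(a, b, sample_size, seed=0):
    """Estimate |a & b| by sampling from set a without replacement,
    counting how many sampled elements appear in set b, and scaling
    by |a| / sample_size.
    """
    rng = random.Random(seed)
    sample = rng.sample(sorted(a), min(sample_size, len(a)))
    hits = sum(1 for x in sample if x in b)
    return hits * len(a) / len(sample)

# Sampling half of a 1000-element set against an overlapping set
# trades a bounded estimation error for a smaller amount of work.
a = set(range(1000))
b = set(range(500, 1500))
approx = estimate_intersection(a, b, 200)   # true value is 500
```

The standard error of the estimate shrinks with the square root of the sample size, which is the kind of error-versus-sample-size bound the paper analyzes.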
The set covering problem (SCP) is one of the representative combinatorial optimization problems, having many practical applications.
This paper investigates the development of an algorithm to solve SCP by employing chemical reaction optimization (CRO), a general-purpose metaheuristic.
It is tested on a wide range of benchmark instances of SCP.
The simulation results indicate that this algorithm gives outstanding performance compared with other heuristics and metaheuristics in solving SCP.
This study investigates the mean capacity of multiple-input multiple-output (MIMO) systems for spatially semi-correlated flat fading channels.
In reality, the capacity degrades dramatically due to channel covariance (CC) when correlations exist at the transmitter, the receiver, or both.
Most existing works have so far considered the traditional channel covariance matrices that have not been entirely constructed.
Thus, we propose an iterative channel covariance (ICC) matrix using a matrix splitting (MS) technique with a guaranteed zero correlation coefficient in the case of the downlink correlated MIMO channel, to maximize the mean capacity.
Our numerical results show that the proposed ICC method achieves the maximum channel gains in high signal-to-noise ratio (SNR) scenarios.
Today, the largest Lustre file systems store billions of entries.
On such systems, classic tools based on namespace scanning become unusable.
Operations such as managing file lifetime, scheduling data copies, and generating overall filesystem statistics become painful as they require collecting, sorting and aggregating information for billions of records.
Robinhood Policy Engine is an open source software developed to address these challenges.
It makes it possible to schedule automatic actions on huge numbers of filesystem entries.
It also gives a synthetic understanding of filesystem contents by providing overall statistics about data ownership, age, and size profiles.
Even if it can be used with any POSIX filesystem, Robinhood supports Lustre specific features like OSTs, pools, HSM, ChangeLogs, and DNE.
It implements specific support for these features, and takes advantage of them to manage Lustre file systems efficiently.
The importance of graph search algorithm choice to the directed relation graph with error propagation (DRGEP) method is studied by comparing basic and modified depth-first search, basic and R-value-based breadth-first search (RBFS), and Dijkstra's algorithm.
By using each algorithm with DRGEP to produce skeletal mechanisms from a detailed mechanism for n-heptane with randomly-shuffled species order, it is demonstrated that only Dijkstra's algorithm and RBFS produce results independent of species order.
In addition, each algorithm is used with DRGEP to generate skeletal mechanisms for n-heptane covering a comprehensive range of autoignition conditions for pressure, temperature, and equivalence ratio.
Dijkstra's algorithm combined with a coefficient scaling approach is demonstrated to produce the most compact skeletal mechanism with a similar performance compared to larger skeletal mechanisms resulting from the other algorithms.
The computational efficiency of each algorithm is also compared by applying the DRGEP method with each search algorithm on the large detailed mechanism for n-alkanes covering n-octane to n-hexadecane with 2115 species and 8157 reactions.
Dijkstra's algorithm implemented with a binary heap priority queue is demonstrated as the most efficient method, with a CPU cost two orders of magnitude less than the other search algorithms.
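For reference, a minimal binary-heap Dijkstra looks as follows. Note this is the standard additive shortest-path form; DRGEP's R-value variant instead propagates products of interaction coefficients along paths, but the heap-based structure that gives the efficiency advantage is the same:

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths with a binary-heap priority queue.

    adj maps each node to a list of (neighbor, weight) pairs.
    Returns a dict of shortest distances from source.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, already settled with a shorter path
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

With a binary heap the running time is O((V + E) log V), which is what makes the settled-node guarantee affordable even on mechanisms with thousands of species.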
Since the advent of deep learning, it has been used to solve various problems using many different architectures.
The application of such deep architectures to auditory data is also not uncommon.
However, these architectures do not always adequately consider the temporal dependencies in data.
We thus propose a new generic architecture called the Deep Belief Network - Bidirectional Long Short-Term Memory (DBN-BLSTM) network that models sequences by keeping track of the temporal information while enabling deep representations in the data.
We demonstrate this new architecture by applying it to the task of music generation and obtain state-of-the-art results.
This paper explores the problem of page migration in ring networks.
A ring network is a connected graph, in which each node is connected with exactly two other nodes.
In this problem, one of the nodes in a given network holds a page of size D. This node is called the server, and the page is non-duplicable data in the network.
Requests are issued by nodes to access the page one after another.
Every time a new request is issued, the server must serve the request and may migrate to another node before the next request arrives.
A service costs the distance between the server and the requesting node, and a migration costs the distance of the migration multiplied by D. The problem is to minimize the total cost of services and migrations.
We study this problem in the uniform model, in which the page has unit size, i.e., D=1.
A 3.326-competitive algorithm improving the current best upper bound is designed.
We show that this ratio is tight for our algorithm.
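The cost model can be made concrete with a toy accounting sketch (the `migrate` policy below is a hypothetical placeholder that always follows the requester, not the 3.326-competitive algorithm of the paper):

```python
def ring_distance(a, b, n):
    """Shortest hop distance between nodes a and b on an n-node ring."""
    d = abs(a - b) % n
    return min(d, n - d)

def total_cost(requests, n, server=0, D=1, migrate=lambda s, r: r):
    """Total service + migration cost of a request sequence on an n-node ring.

    Each request r costs dist(server, r) to serve; the policy `migrate`
    then chooses the next server position, paying D * migration distance.
    """
    cost = 0
    for r in requests:
        cost += ring_distance(server, r, n)          # service cost
        nxt = migrate(server, r)
        cost += D * ring_distance(server, nxt, n)    # migration cost
        server = nxt
    return cost
```

An online algorithm must choose `migrate` without seeing future requests; its competitive ratio compares its total cost against the offline optimum over all request sequences.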
Selecting a representative vector for a set of vectors is a very common requirement in many algorithmic tasks.
Traditionally, the mean or median vector is selected.
Ontology classes are sets of homogeneous instance objects that can be converted to a vector space by word vector embeddings.
This study proposes a methodology to derive a representative vector for ontology classes whose instances were converted to the vector space.
We start by deriving five candidate vectors which are then used to train a machine learning model that would calculate a representative vector for the class.
We show that our methodology out-performs the traditional mean and median vector representations.
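The traditional baselines can be sketched as follows (a minimal illustration; the proposed representative vector comes from a trained model combining such candidates, which is not shown here):

```python
import numpy as np

def mean_vector(vecs):
    """Component-wise mean of a (n, d) array of instance vectors."""
    return np.mean(vecs, axis=0)

def median_vector(vecs):
    """Component-wise median; more robust to outlier instances."""
    return np.median(vecs, axis=0)

def medoid_vector(vecs):
    """The actual member vector minimizing total distance to all others."""
    pairwise = np.linalg.norm(vecs[:, None, :] - vecs[None, :, :], axis=-1)
    return vecs[np.argmin(pairwise.sum(axis=1))]
```

Unlike the mean or component-wise median, the medoid is guaranteed to be one of the class instances, which matters when the representative must itself be a valid embedding.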
The present study proposes LitStoryTeller, an interactive system for visually exploring the semantic structure of a scientific article.
We demonstrate how LitStoryTeller could be used to answer some of the most fundamental research questions, such as how a new method was built on top of existing methods, and based on what theoretical proofs and experimental evidence.
More importantly, LitStoryTeller can help users understand the full and interesting story of a scientific paper, with a concise outline and important details.
The proposed system borrows a metaphor from screenplays and visualizes the storyline of a scientific paper by arranging its characters (scientific concepts or terminologies) and scenes (paragraphs/sentences) into a progressive and interactive storyline.
Such storylines help to preserve the semantic structure and logical thinking process of a scientific paper.
Semantic structures, such as scientific concepts and comparative sentences, are extracted automatically from a scientific paper using existing named entity recognition APIs and supervised classifiers.
Two supplementary views, ranked entity frequency view and entity co-occurrence network view, are provided to help users identify the "main plot" of such scientific storylines.
When collective documents are ready, LitStoryTeller also provides a temporal entity evolution view and entity community view for collection digestion.
Compatibility between items, such as clothes and shoes, is a major factor in customers' purchasing decisions.
However, learning "compatibility" is challenging because (1) the notion of compatibility is broader than that of similarity, (2) compatibility is asymmetric, and (3) only a small set of compatible and incompatible items is observed.
We propose an end-to-end trainable system to embed each item into a latent vector and project a query item into K compatible prototypes in the same space.
These prototypes reflect the broad notions of compatibility.
We refer to both the embedding and prototypes as "Compatibility Family".
In our learned space, we introduce a novel Projected Compatibility Distance (PCD) function which is differentiable and ensures diversity by aiming for at least one prototype to be close to a compatible item, whereas none of the prototypes are close to an incompatible item.
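A minimal sketch of the PCD idea, assuming the query has already been projected into its K compatible prototypes (the paper's differentiable form may use a soft minimum; a hard minimum is used here purely for illustration):

```python
import numpy as np

def pcd(prototypes, item):
    """Projected Compatibility Distance (sketch).

    prototypes: (K, d) array of the query's K compatible prototypes.
    item: (d,) embedding of a candidate item.
    A compatible item should be close to at least ONE prototype,
    so the distance is the minimum over all prototype distances.
    """
    dists = np.linalg.norm(prototypes - item, axis=1)
    return float(dists.min())
```

Taking the minimum is what allows diversity: each prototype can specialize in a different mode of compatibility, and a candidate only needs to match one of them.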
We evaluate our system on a toy dataset, two Amazon product datasets, and Polyvore outfit dataset.
Our method consistently achieves state-of-the-art performance.
Finally, we show that we can visualize the candidate compatible prototypes using a Metric-regularized Conditional Generative Adversarial Network (MrCGAN), where the input is a projected prototype and the output is a generated image of a compatible item.
We ask human evaluators to judge the relative compatibility between our generated images and images generated by CGANs conditioned directly on query items.
Our generated images are significantly preferred, with roughly twice the number of votes as others.
In generalized zero shot learning (GZSL), the set of classes are split into seen and unseen classes, where training relies on the semantic features of the seen and unseen classes and the visual representations of only the seen classes, while testing uses the visual representations of the seen and unseen classes.
Current methods address GZSL by learning a transformation from the visual to the semantic space, exploring the assumption that the distribution of classes in the semantic and visual spaces is relatively similar.
Such methods tend to transform unseen testing visual representations into one of the seen classes' semantic features instead of the semantic features of the correct unseen class, resulting in low accuracy GZSL classification.
Recently, generative adversarial networks (GAN) have been explored to synthesize visual representations of the unseen classes from their semantic features - the synthesized representations of the seen and unseen classes are then used to train the GZSL classifier.
This approach has been shown to boost GZSL classification accuracy; however, there is no guarantee that the synthesized visual representations can generate back their semantic features in a multi-modal cycle-consistent manner.
The absence of such a constraint can result in synthetic visual representations that do not represent their semantic features well.
In this paper, we propose such a constraint in the form of a new regularization for GAN training that forces the generated visual features to reconstruct their original semantic features.
Once our model is trained with this multi-modal cycle-consistent semantic compatibility, we can then synthesize more representative visual representations for the seen and, more importantly, for the unseen classes.
Our proposed approach shows the best GZSL classification results in the field in several publicly available datasets.
A graph-based classification method is proposed for semi-supervised learning in the case of Euclidean data and for classification in the case of graph data.
Our manifold learning technique is based on a convex optimization problem involving a convex quadratic regularization term and a concave quadratic loss function with a trade-off parameter carefully chosen so that the objective function remains convex.
As shown empirically, the advantage of considering a concave loss function is that the learning problem becomes more robust in the presence of noisy labels.
Furthermore, the loss function considered here is then more similar to a classification loss while several other methods treat graph-based classification problems as regression problems.
This paper studies the relation between activity on Twitter and sales.
While research exists into the relation between Tweets and movie and book sales, this paper shows that the same relations do not hold for products that receive less attention on social media.
For such products, classification of Tweets is far more important to determine a relation.
Also, for such products advanced statistical relations, in addition to correlation, are required to relate Twitter activity and sales.
In a case study that involves Tweets and sales from a company in four countries, the paper shows how, by classifying Tweets, such relations can be identified.
In particular, the paper shows evidence that positive Tweets by persons (as opposed to companies) can be used to forecast sales and that peaks in positive Tweets by persons are strongly related to an increase in sales.
These results can be used to improve sales forecasts and to increase sales in marketing campaigns.
Over the last years, scientific workflows have become mature enough to be used in a production style.
However, despite the increasing maturity, there is still a shortage of tools for searching, adapting, and reusing workflows that hinders a more generalized adoption by the scientific communities.
Indeed, due to the limited availability of machine-readable scientific metadata and the heterogeneity of workflow specification formats and representations, new ways to leverage alternative sources of information that complement existing approaches are needed.
In this paper we address such limitations by applying statistically enriched generalized trie structures to exploit workflow execution provenance information in order to assist the analysis, indexing and search of scientific workflows.
Our method bridges the gap between the description of what a workflow is supposed to do according to its specification and related metadata and what it actually does as recorded in its provenance execution trace.
In doing so, we also prove that the proposed method outperforms SPARQL 1.1 Property Paths for querying provenance graphs.
Most recent MaxSAT algorithms rely on a succession of calls to a SAT solver in order to find an optimal solution.
In particular, several algorithms take advantage of the ability of SAT solvers to identify unsatisfiable subformulas.
Usually, these MaxSAT algorithms perform better when small unsatisfiable subformulas are found early.
However, this is not the case in many problem instances, since the whole formula is given to the SAT solver in each call.
In this paper, we propose to partition the MaxSAT formula using a resolution-based graph representation.
Partitions are then iteratively joined by using a proximity measure extracted from the graph representation of the formula.
The algorithm ends when only one partition remains and the optimal solution is found.
Experimental results show that this new approach further enhances a state-of-the-art MaxSAT solver to optimally solve a larger set of industrial problem instances.
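A minimal sketch of the iterative partition-joining loop described above. The proximity measure here is hypothetical (the number of variables two partitions share); the actual measure is extracted from the resolution-based graph representation of the formula.

```python
# Sketch: iteratively join MaxSAT formula partitions by a proximity measure.
# Hypothetical proximity: number of shared variables between partitions
# (the paper derives proximity from a resolution-based graph instead).

def variables(partition):
    """Set of variables appearing in a partition (clauses are tuples of ints)."""
    return {abs(lit) for clause in partition for lit in clause}

def proximity(p, q):
    return len(variables(p) & variables(q))

def join_partitions(partitions):
    """Merge the two closest partitions until only one remains."""
    parts = [list(p) for p in partitions]
    while len(parts) > 1:
        # Find the pair with maximal proximity (most shared variables).
        i, j = max(
            ((a, b) for a in range(len(parts)) for b in range(a + 1, len(parts))),
            key=lambda ab: proximity(parts[ab[0]], parts[ab[1]]),
        )
        parts[i] = parts[i] + parts[j]
        del parts[j]
        # A real solver would run MaxSAT on parts[i] here, reusing the
        # unsatisfiable cores found in earlier iterations.
    return parts[0]

p1 = [(1, -2), (2, 3)]
p2 = [(-3, 4)]
p3 = [(-1,)]
merged = join_partitions([p1, p2, p3])
```

The algorithm terminates when a single partition remains, at which point the solution found is optimal for the whole formula.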
Machine learning (ML) is becoming a commodity.
Numerous ML frameworks and services are available to data holders who are not ML experts but want to train predictive models on their data.
It is important that ML models trained on sensitive inputs (e.g., personal images or documents) not leak too much information about the training data.
We consider a malicious ML provider who supplies model-training code to the data holder, does not observe the training, but then obtains white- or black-box access to the resulting model.
In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model, while the model remains as accurate and predictive as a conventionally trained one.
We then explain how the adversary can extract memorized information from the model.
We evaluate our techniques on standard ML tasks for image classification (CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20 Newsgroups and IMDB).
In all cases, we show how our algorithms create models that have high predictive power yet allow accurate extraction of subsets of their training data.
Data reconciliation (DR) and Principal Component Analysis (PCA) are two popular data analysis techniques in process industries.
Data reconciliation is used to obtain accurate and consistent estimates of variables and parameters from erroneous measurements.
PCA is primarily used as a method for reducing the dimensionality of high dimensional data and as a preprocessing technique for denoising measurements.
These techniques have been developed and deployed independently of each other.
The primary purpose of this article is to elucidate the close relationship between these two seemingly disparate techniques.
This leads to a unified framework for applying PCA and DR. Further, we show how the two techniques can be deployed together in a collaborative and consistent manner to process data.
The framework has been extended to deal with partially measured systems and to incorporate partial knowledge available about the process model.
For optimal placement and orchestration of network services, it is crucial that their structure and semantics are specified clearly and comprehensively and are available to an orchestrator.
Existing specification approaches are either ambiguous or miss important aspects regarding the behavior of virtual network functions (VNFs) forming a service.
We propose to formally and unambiguously specify the behavior of these functions and services using Queuing Petri Nets (QPNs).
QPNs are an established method that can express queuing, synchronization, stochastically distributed processing delays, and changing traffic volume and characteristics at each VNF.
With QPNs, multiple VNFs can be connected to complete network services in any structure, even specifying bidirectional network services containing loops.
We propose a tool-based workflow that supports the specification of network services and the automatic generation of corresponding simulation code to enable an in-depth analysis of their behavior and performance.
In a case study, we show how developers can benefit from analysis insights, e.g., to anticipate the impact of different service configurations.
We also discuss how management and orchestration systems can benefit from our clear and comprehensive specification approach and its extensive analysis possibilities, leading to better placement of VNFs and improved Quality of Service.
Although deep learning can provide promising results in medical image analysis, the lack of very large annotated datasets confines its full potential.
Furthermore, the scarcity of positive samples also creates unbalanced datasets, which limit the true positive rates of trained models.
As unbalanced datasets are mostly unavoidable, it is greatly beneficial if we can extract useful knowledge from negative samples to improve classification accuracy on limited positive samples.
To this end, we propose a new strategy for building medical image analysis pipelines that target disease detection.
We train a discriminative segmentation model only on normal images to provide a source of knowledge to be transferred to a disease detection classifier.
We show that using the feature maps of a trained segmentation network, deviations from normal anatomy can be learned by a two-class classification network on an extremely unbalanced training dataset with as few as one positive per 17 negative samples.
We demonstrate that even though the segmentation network is only trained on normal cardiac computed tomography images, the resulting feature maps can be used to detect pericardial effusion and cardiac septal defects with two-class convolutional classification networks.
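One generic way to counter a 1:17 positive-to-negative imbalance like the one described above is inverse-frequency class weighting in the training loss. This is a standard technique, not the paper's own remedy (which is the transfer of segmentation feature maps); the sketch only illustrates how such weights are computed.

```python
# Inverse-frequency class weights for an unbalanced two-class dataset.
# weight[c] = n_samples / (n_classes * count[c]), so the rare class
# contributes proportionally more to the loss.

from collections import Counter

def class_weights(labels):
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# 1 positive sample per 17 negatives, as in the dataset described above.
labels = [1] * 1 + [0] * 17
w = class_weights(labels)
```

With these weights, each positive example carries roughly 17 times the loss contribution of a negative one.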
Appearance based person re-identification in a real-world video surveillance system with non-overlapping camera views is a challenging problem for many reasons.
Current state-of-the-art methods often address the problem by relying on supervised learning of similarity metrics or ranking functions to implicitly model appearance transformation between cameras for each camera pair, or group, in the system.
This requires considerable human effort to annotate data.
Furthermore, the learned models are camera specific and not transferable from one set of cameras to another.
Therefore, the annotation process is required after every network expansion or camera replacement, which strongly limits their applicability.
Alternatively, we propose a novel modeling approach to harness complementary appearance information without supervised learning that significantly outperforms current state-of-the-art unsupervised methods on multiple benchmark datasets.
Machine comprehension (MC) style question answering is a representative problem in natural language processing.
Previous methods rarely improve the encoding layer, especially the embedding of syntactic information and named entities, which are crucial to encoding quality.
Moreover, existing attention methods represent each query word as a vector or use a single vector for the whole query sentence; neither can properly weight the key words in the query.
In this paper, we introduce a novel neural network architecture called Multi-layer Embedding with Memory Network (MEMEN) for the machine reading task.
In the encoding layer, we employ the classic skip-gram model to encode the syntactic and semantic information of the words, training a new kind of embedding layer.
We also propose a memory network of full-orientation matching of the query and passage to catch more pivotal information.
Experiments show that our model achieves competitive results in both precision and efficiency on the Stanford Question Answering Dataset (SQuAD) among all published results, and achieves state-of-the-art results on the TriviaQA dataset.
Energy consumption is a growing issue in data centers, impacting their economic viability and their public image.
In this work we empirically characterize the power and energy consumed by different types of servers.
In particular, in order to understand the behavior of their energy and power consumption, we perform measurements in different servers.
In each of them, we exhaustively measure the power consumed by the CPU, the disk, and the network interface under different configurations, identifying the optimal operational levels.
One interesting conclusion of our study is that the curve that defines the minimal CPU power as a function of the load is neither linear nor purely convex as has been previously assumed.
Moreover, we find that the efficiency of the various server components can be maximized by tuning the CPU frequency and the number of active cores as a function of the system and network load, while the block size of I/O operations should be always maximized by applications.
We also show how to estimate the energy consumed by an application as a function of some simple parameters, like the CPU load, and the disk and network activity.
We validate the proposed approach by accurately estimating the energy of a map-reduce computation in a Hadoop platform.
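A minimal sketch of an additive energy model in the spirit of the estimation approach above. Since the measured CPU power curve is neither linear nor purely convex, CPU power is interpolated from measured (load, watts) points; disk and network contributions are assumed proportional to activity. All numbers are illustrative, not measurements from the study.

```python
# Hypothetical measured CPU power curve: (load fraction, watts) samples.
CPU_CURVE = [(0.0, 50.0), (0.25, 80.0), (0.5, 95.0), (0.75, 120.0), (1.0, 130.0)]
DISK_J_PER_MB = 0.02   # illustrative energy cost per MB of disk I/O
NET_J_PER_MB = 0.01    # illustrative energy cost per MB of network traffic

def cpu_power(load):
    """Piecewise-linear interpolation of the measured CPU power curve."""
    for (l0, p0), (l1, p1) in zip(CPU_CURVE, CPU_CURVE[1:]):
        if l0 <= load <= l1:
            return p0 + (p1 - p0) * (load - l0) / (l1 - l0)
    raise ValueError("load must be in [0, 1]")

def energy_joules(duration_s, cpu_load, disk_mb, net_mb):
    """Energy = CPU power * time + per-MB costs for disk and network."""
    return cpu_power(cpu_load) * duration_s + DISK_J_PER_MB * disk_mb + NET_J_PER_MB * net_mb
```

For example, `energy_joules(10, 0.5, 100, 100)` sums ten seconds at the interpolated 50%-load CPU power with the disk and network activity terms.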
Accuracy-driven computation is a strategy widely used in exact-decisions number types for robust geometric algorithms.
This work provides an overview on the usage of error bounds in accuracy-driven computation, compares different approaches on the representation and computation of these error bounds and points out some caveats.
The stated claims are supported by experiments.
The field of property testing of probability distributions, or distribution testing, aims to provide fast and (most likely) correct answers to questions pertaining to specific aspects of very large datasets.
In this work, we consider a property of particular interest, monotonicity of distributions.
We focus on the complexity of monotonicity testing across different models of access to the distributions; and obtain results in these new settings that differ significantly from the known bounds in the standard sampling model.
In this paper we present a theoretical analysis of graph-based service composition in terms of its dependence on service discovery.
Driven by this analysis we define a composition framework by means of integration with fine-grained I/O service discovery that enables the generation of a graph-based composition which contains the set of services that are semantically relevant for an input-output request.
The proposed framework also includes an optimal composition search algorithm to extract the best composition from the graph minimising the length and the number of services, and different graph optimisations to improve the scalability of the system.
A practical implementation used for the empirical analysis is also provided.
This analysis proves the scalability and flexibility of our proposal and provides insights on how integrated composition systems can be designed in order to achieve good performance in real scenarios for the Web.
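A minimal sketch of forward-chaining I/O composition: starting from the request inputs, it repeatedly adds every service whose inputs are already available, layer by layer, until the requested outputs are reachable. Matching here is exact on parameter names; the framework described above uses fine-grained semantic I/O discovery instead, and the service names below are hypothetical.

```python
def composition_graph(services, request_in, request_out):
    """services: {name: (inputs, outputs)}; returns layers of usable services,
    or None when the request cannot be satisfied."""
    available = set(request_in)
    layers, remaining = [], dict(services)
    while not set(request_out) <= available:
        # All still-unused services whose inputs are already available.
        layer = [s for s, (ins, _) in remaining.items() if set(ins) <= available]
        if not layer:
            return None  # no progress possible: request not satisfiable
        for s in layer:
            available |= set(remaining.pop(s)[1])
        layers.append(sorted(layer))
    return layers

services = {
    "geocode": (["address"], ["lat", "lon"]),
    "weather": (["lat", "lon"], ["forecast"]),
}
layers = composition_graph(services, ["address"], ["forecast"])
```

An optimal composition search, as in the framework above, would then extract from such a graph the composition minimizing path length and the number of services.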
Structured prediction energy networks (SPENs; Belanger & McCallum 2016) use neural network architectures to define energy functions that can capture arbitrary dependencies among parts of structured outputs.
Prior work used gradient descent for inference, relaxing the structured output to a set of continuous variables and then optimizing the energy with respect to them.
We replace this use of gradient descent with a neural network trained to approximate structured argmax inference.
This "inference network" outputs continuous values that we treat as the output structure.
We develop large-margin training criteria for joint training of the structured energy function and inference network.
On multi-label classification we report speed-ups of 10-60x compared to Belanger et al. (2017) while also improving accuracy.
For sequence labeling with simple structured energies, our approach performs comparably to exact inference while being much faster at test time.
We then demonstrate improved accuracy by augmenting the energy with a "label language model" that scores entire output label sequences, showing it can improve handling of long-distance dependencies in part-of-speech tagging.
Finally, we show how inference networks can replace dynamic programming for test-time inference in conditional random fields, suggestive for their general use for fast inference in structured settings.
It has been shown recently that deep convolutional generative adversarial networks (GANs) can learn to generate music in the form of piano-rolls, which represent music by binary-valued time-pitch matrices.
However, existing models can only generate real-valued piano-rolls and require further post-processing, such as hard thresholding (HT) or Bernoulli sampling (BS), to obtain the final binary-valued results.
In this paper, we study whether we can have a convolutional GAN model that directly creates binary-valued piano-rolls by using binary neurons.
Specifically, we propose to append to the generator an additional refiner network, which uses binary neurons at the output layer.
The whole network is trained in two stages.
Firstly, the generator and the discriminator are pretrained.
Then, the refiner network is trained along with the discriminator to learn to binarize the real-valued piano-rolls the pretrained generator creates.
Experimental results show that using binary neurons instead of HT or BS indeed leads to better results in a number of objective measures.
Moreover, deterministic binary neurons perform better than stochastic ones in both objective measures and a subjective test.
The source code, training data and audio examples of the generated results can be found at https://salu133445.github.io/bmusegan/ .
Neural Machine Translation (NMT) has obtained state-of-the-art performance for several language pairs, while only using parallel data for training.
Target-side monolingual data plays an important role in boosting fluency for phrase-based statistical machine translation, and we investigate the use of monolingual data for NMT.
In contrast to previous work, which combines NMT models with separately trained language models, we note that encoder-decoder NMT architectures already have the capacity to learn the same information as a language model, and we explore strategies to train with monolingual data without changing the neural network architecture.
By pairing monolingual training data with an automatic back-translation, we can treat it as additional parallel training data, and we obtain substantial improvements on the WMT 15 task English<->German (+2.8-3.7 BLEU), and for the low-resourced IWSLT 14 task Turkish->English (+2.1-3.4 BLEU), obtaining new state-of-the-art results.
We also show that fine-tuning on in-domain monolingual and parallel data gives substantial improvements for the IWSLT 15 task English->German.
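A minimal sketch of the back-translation augmentation step described above: each monolingual target sentence is paired with an automatic back-translation and mixed into the genuine parallel data. The `back_translate` function stands in for a trained target-to-source NMT model.

```python
def back_translate(target_sentence):
    # Stand-in for a trained target->source NMT model (hypothetical).
    return "<synthetic source for: %s>" % target_sentence

def augment(parallel_data, monolingual_targets):
    """Pair each monolingual target sentence with its back-translation and
    treat the synthetic pairs as additional parallel training data."""
    synthetic = [(back_translate(t), t) for t in monolingual_targets]
    return parallel_data + synthetic

# (source, target) pairs; the example sentences are illustrative.
corpus = augment([("ein Haus", "a house")], ["a garden"])
```

The NMT model is then trained on the combined corpus without any change to the encoder-decoder architecture, exactly as with genuine parallel data.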
In this paper, we present a new method for detecting road users in an urban environment which leads to an improvement in multiple object tracking.
Our method takes as an input a foreground image and improves the object detection and segmentation.
This new image can be used as an input to trackers that use foreground blobs from background subtraction.
The first step is to create foreground images for all the frames in an urban video.
Then, starting from the original blobs of the foreground image, we merge the blobs that are close to one another and that have similar optical flow.
The next step is extracting the edges of the different objects to detect multiple objects that might be very close (and be merged in the same blob) and to adjust the size of the original blobs.
At the same time, we use the optical flow to detect occlusion of objects that are moving in opposite directions.
Finally, we make a decision on which information we keep in order to construct a new foreground image with blobs that can be used for tracking.
The system is validated on four videos of an urban traffic dataset.
Our method improves the recall and precision metrics for the object detection task compared to the vanilla background subtraction method and improves the CLEAR MOT metrics in the tracking tasks for most videos.
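A minimal sketch of the blob-merging step: blobs whose centers are close and whose mean optical-flow vectors point the same way are grouped, while nearby blobs moving in opposite directions stay separate. The thresholds are illustrative, not the paper's values.

```python
import math

DIST_THRESH = 20.0   # pixels (illustrative)
ANGLE_THRESH = 0.5   # radians (illustrative)

def should_merge(b1, b2):
    """A blob is ((cx, cy), (u, v)): center plus mean optical-flow vector."""
    (x1, y1), (u1, v1) = b1
    (x2, y2), (u2, v2) = b2
    close = math.hypot(x1 - x2, y1 - y2) < DIST_THRESH
    similar_flow = abs(math.atan2(v1, u1) - math.atan2(v2, u2)) < ANGLE_THRESH
    return close and similar_flow

def merge_blobs(blobs):
    """Greedy single-pass grouping; a real implementation would iterate."""
    merged, used = [], set()
    for i, b in enumerate(blobs):
        if i in used:
            continue
        group = [b]
        for j in range(i + 1, len(blobs)):
            if j not in used and should_merge(b, blobs[j]):
                group.append(blobs[j])
                used.add(j)
        merged.append(group)
    return merged

blobs = [((10, 10), (1.0, 0.0)),   # moving right
         ((18, 12), (0.9, 0.1)),   # nearby, similar flow -> merged with first
         ((15, 11), (-1.0, 0.0))]  # nearby but opposite flow -> kept separate
groups = merge_blobs(blobs)
```

The opposite-flow case corresponds to the occlusion detection mentioned above: objects moving in opposite directions must not be fused into one blob.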
With the advent of semantic web, various tools and techniques have been introduced for presenting and organizing knowledge.
Concept hierarchies are one such technique which gained significant attention due to its usefulness in creating domain ontologies that are considered as an integral part of semantic web.
Automated concept hierarchy learning algorithms focus on extracting relevant concepts from an unstructured text corpus and connecting them by identifying potential relations that exist between them.
In this paper, we propose a novel approach for identifying relevant concepts from plain text and then learning a hierarchy of concepts by exploiting the subsumption relation between them.
To start with, we model topics using a probabilistic topic model and then make use of some lightweight linguistic process to extract semantically rich concepts.
Then we connect concepts by identifying an "is-a" relationship between pairs of concepts.
The proposed method is completely unsupervised and there is no need for a domain specific training corpus for concept extraction and learning.
Experiments on large, real-world text corpora such as the BBC News dataset and the Reuters News corpus show that the proposed method outperforms some existing methods for concept extraction, and that efficient concept hierarchy learning is possible when the overall task is guided by a probabilistic topic modeling algorithm.
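A minimal sketch of a subsumption test for placing concepts in an "is-a" hierarchy. This follows the common document-co-occurrence formulation (Sanderson & Croft): concept x subsumes concept y when x appears in most documents containing y but not vice versa. The threshold and the example concepts are illustrative, and the paper's exact criterion may differ.

```python
THRESHOLD = 0.8  # illustrative

def subsumes(docs_x, docs_y, threshold=THRESHOLD):
    """docs_x, docs_y: sets of ids of documents containing each concept.
    x subsumes y when P(x|y) is high while P(y|x) stays low."""
    if not docs_x or not docs_y:
        return False
    both = len(docs_x & docs_y)
    p_x_given_y = both / len(docs_y)
    p_y_given_x = both / len(docs_x)
    return p_x_given_y >= threshold and p_y_given_x < threshold

# "sport" occurs in many documents; "football" occurs mostly within them,
# so "sport" should subsume "football" (football is-a sport) but not vice versa.
sport = {1, 2, 3, 4, 5, 6}
football = {1, 2, 3}
```

Running the test over all concept pairs yields the directed edges from which the concept hierarchy is assembled.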
We present a dual subspace ascent algorithm for support vector machine training that respects a budget constraint limiting the number of support vectors.
Budget methods are effective for reducing the training time of kernel SVM while retaining high accuracy.
To date, budget training is available only for primal (SGD-based) solvers.
Dual subspace ascent methods like sequential minimal optimization are attractive for their good adaptation to the problem structure, their fast convergence rate, and their practical speed.
By incorporating a budget constraint into a dual algorithm, our method enjoys the best of both worlds.
We demonstrate considerable speed-ups over primal budget training methods.
Multi-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously.
While sometimes the underlying task relationship structure is known, often the structure needs to be estimated from data at hand.
In this paper, we present a novel family of models for MTL, applicable to regression and classification problems, capable of learning the structure of task relationships.
In particular, we consider a joint estimation problem of the task relationship structure and the individual task parameters, which is solved using alternating minimization.
The task relationship structure learning component builds on recent advances in structure learning of Gaussian graphical models based on sparse estimators of the precision (inverse covariance) matrix.
We illustrate the effectiveness of the proposed model on a variety of synthetic and benchmark datasets for regression and classification.
We also consider the problem of combining climate model outputs for better projections of future climate, with focus on temperature in South America, and show that the proposed model outperforms several existing methods for the problem.
When setup/hold times of bistable elements are violated, they may become metastable, i.e., enter a transient state that is neither digital 0 nor 1.
In general, metastability cannot be avoided, a problem that manifests whenever taking discrete measurements of analog values.
Metastability of the output then reflects uncertainty as to whether a measurement should be rounded up or down to the next possible measurement outcome.
Surprisingly, Lenzen and Medina (ASYNC 2016) showed that metastability can be contained, i.e., measurement values can be correctly sorted without resolving metastability first.
However, both their work and the state of the art by Bund et al.
(DATE 2017) leave open whether such a solution can be as small and fast as standard sorting networks.
We show that this is indeed possible, by providing a circuit that sorts Gray code inputs (possibly containing a metastable bit) and has asymptotically optimal depth and size.
Concretely, for 10-channel sorting networks and 16-bit wide inputs, we improve by 48.46% in delay and by 71.58% in area over Bund et al.
Our simulations indicate that straightforward transistor-level optimization is likely to result in performance on par with standard (non-containing) solutions.
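As background for the circuit above, which sorts Gray-coded measurement values, the following sketch shows the standard reflected-binary (Gray) code conversion. It does not model metastable bits; it only illustrates the encoding's key property that adjacent values differ in exactly one bit, which is what makes containment of a single metastable bit possible.

```python
def binary_to_gray(n):
    """Standard reflected binary (Gray) encoding of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Invert the Gray encoding by folding in successively shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n
```

Because consecutive Gray codewords differ in a single bit, an uncertain (metastable) measurement can only blur the boundary between two adjacent values, which is what the sorting network exploits.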
Many problems in NLP require aggregating information from multiple mentions of the same entity which may be far apart in the text.
Existing Recurrent Neural Network (RNN) layers are biased towards short-term dependencies and hence not suited to such tasks.
We present a recurrent layer which is instead biased towards coreferent dependencies.
The layer uses coreference annotations extracted from an external system to connect entity mentions belonging to the same cluster.
Incorporating this layer into a state-of-the-art reading comprehension model improves performance on three datasets -- Wikihop, LAMBADA and the bAbi AI tasks -- with large gains when training data is scarce.
Cloud for Gaming refers to the use of cloud computing technologies to build large-scale gaming infrastructures, with the goals of improving scalability and responsiveness, improving the user's experience, and enabling new business models.
Hyperspectral image (HSI) classification is a hot topic in the remote sensing community.
This paper proposes a new framework of spectral-spatial feature extraction for HSI classification, in which for the first time the concept of deep learning is introduced.
Specifically, the model of autoencoder is exploited in our framework to extract various kinds of features.
First, we verify the eligibility of the autoencoder by following classical spectral-information-based classification and using autoencoders of different depths to classify hyperspectral images.
Further in the proposed framework, we combine PCA on spectral dimension and autoencoder on the other two spatial dimensions to extract spectral-spatial information for classification.
The experimental results show that this framework achieves the highest classification accuracy among all methods, and outperforms classical classifiers such as SVM and PCA-based SVM.
Software testing is an important and valuable part of the software development life cycle.
Due to time, cost, and other circumstances, exhaustive testing is not feasible, which is why the software testing process needs to be automated.
Testing effectiveness can be achieved with State Transition Testing (STT), which is commonly used in real-time, embedded, and web-based software systems.
The aim of this paper is to present an algorithm that applies an ant colony optimization technique to generate optimal and minimal test sequences for the behavior specification of software.
The proposed approach generates test sequences that achieve complete software coverage.
The paper also compares two metaheuristic techniques (Genetic Algorithm and Ant Colony Optimization) for transition-based testing.
Being able to automatically repair programs is an extremely challenging task.
In this paper, we present MintHint, a novel technique for program repair that is a departure from most of today's approaches.
Instead of trying to fully automate program repair, which is often an unachievable goal, MintHint performs statistical correlation analysis to identify expressions that are likely to occur in the repaired code and generates, using pattern-matching based synthesis, repair hints from these expressions.
Intuitively, these hints suggest how to rectify a faulty statement and help developers find a complete, actual repair.
MintHint can address a variety of common faults, including incorrect, spurious, and missing expressions.
We present a user study that shows that developers' productivity can improve manyfold with the use of repair hints generated by MintHint -- compared to having only traditional fault localization information.
We also apply MintHint to several faults of a widely used Unix utility program to further assess the effectiveness of the approach.
Our results show that MintHint performs well even in situations where (1) the repair space searched does not contain the exact repair, and (2) the operational specification obtained from the test cases for repair is incomplete or even imprecise.
HistCite™ is a large-scale computer tool for mapping science.
Its visualization power combines the production of historiographs, based on the analysis of document co-citations, with the use of specific bibliometric indicators.
The objectives of this article are to present the advantages of the new bibliometric configuration of HistCite (2004) for identifying articles; to analyze the historiographs that HistCite produces in terms of cumulative advantage and aging of citations; and to compare the results of HistCite on its amplitude and recognition indicators.
We also examine its treatment of sampling problems by formalizing Kendall's method of estimating the robust standard deviation.
The most successful parallel SAT and MaxSAT solvers follow a portfolio approach, where each thread applies a different algorithm (or the same algorithm configured differently) to solve a given problem instance.
The main goal of building a portfolio is to diversify the search process being carried out by each thread.
As soon as one thread finishes, the instance can be deemed solved.
In this paper we present a new open source distributed solver for MaxSAT solving that addresses two issues commonly found in multicore parallel solvers, namely memory contention and scalability.
Preliminary results show that our non-portfolio distributed MaxSAT solver outperforms its sequential version and is able to solve more instances as the number of processes increases.
Seam-cutting and seam-driven techniques have proven effective for handling imperfect image series in image stitching.
Generally, seam-driven methods use seam-cutting to find the best seam among one or finitely many alignment hypotheses based on a predefined seam quality metric.
However, the quality metrics in most methods measure the average performance of the pixels on the seam without considering the relevance and variance among them.
As a result, the seam with the minimal measure may not be optimal (i.e., perception-inconsistent) in human perception.
In this paper, we propose a novel coarse-to-fine seam estimation method which applies the evaluation in a different way.
For pixels on the seam, we develop a patch-point evaluation algorithm that concentrates on their correlation and variation.
The evaluations are then used to recalculate the difference map of the overlapping region and reestimate a stitching seam.
This evaluation-reestimation procedure iterates until the current seam changes negligibly compared with the previous seams.
Experiments show that our proposed method can finally find a nearly perception-consistent seam after several iterations, which outperforms the conventional seam-cutting and other seam-driven methods.
The aim of this work is to study the use of copulas and vines in optimization with Estimation of Distribution Algorithms (EDAs).
Two EDAs are built around the multivariate product and normal copulas, and two others are based on pair-copula decompositions of vine models.
Empirically we study the effect of both marginal distributions and dependence structure separately, and show that both aspects play a crucial role in the success of the optimization.
The results show that the use of copulas and vines opens new opportunities to a more appropriate modeling of search distributions in EDAs.
Recent research on deep neural networks has focused primarily on improving accuracy.
For a given accuracy level, it is typically possible to identify multiple DNN architectures that achieve that accuracy level.
With equivalent accuracy, smaller DNN architectures offer at least three advantages: (1) Smaller DNNs require less communication across servers during distributed training.
(2) Smaller DNNs require less bandwidth to export a new model from the cloud to an autonomous car.
(3) Smaller DNNs are more feasible to deploy on FPGAs and other hardware with limited memory.
To provide all of these advantages, we propose a small DNN architecture called SqueezeNet.
SqueezeNet achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters.
Additionally, with model compression techniques we are able to compress SqueezeNet to less than 0.5MB (510x smaller than AlexNet).
The SqueezeNet architecture is available for download here: https://github.com/DeepScale/SqueezeNet
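The parameter savings above come largely from SqueezeNet's "Fire" module: a 1x1 squeeze convolution followed by parallel 1x1 and 3x3 expand convolutions. The sketch below counts the weights in one such module; the channel numbers follow the fire2 configuration commonly reported for SqueezeNet v1.0, and biases are ignored, so treat the figures as illustrative.

```python
def fire_params(c_in, s1x1, e1x1, e3x3):
    """Weight count of a Fire module: squeeze 1x1 conv over c_in channels,
    then parallel expand 1x1 and 3x3 convs over the squeezed channels."""
    squeeze = c_in * s1x1 * 1 * 1
    expand1 = s1x1 * e1x1 * 1 * 1
    expand3 = s1x1 * e3x3 * 3 * 3
    return squeeze + expand1 + expand3

# fire2 in SqueezeNet v1.0: 96 input channels, 16 squeeze, 64 + 64 expand.
n = fire_params(96, 16, 64, 64)
```

The squeeze layer shrinks the channel count before the expensive 3x3 convolutions, which is the main mechanism behind the 50x parameter reduction relative to AlexNet.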
Most existing GANs architectures that generate images use transposed convolution or resize-convolution as their upsampling algorithm from lower to higher resolution feature maps in the generator.
We argue that this kind of fixed operation is problematic for GANs to model objects that have very different visual appearances.
We propose a novel adaptive convolution method that learns the upsampling algorithm based on the local context at each location to address this problem.
We modify a baseline GANs architecture by replacing normal convolutions with adaptive convolutions in the generator.
Experiments on CIFAR-10 dataset show that our modified models improve the baseline model by a large margin.
Furthermore, our models achieve state-of-the-art performance on CIFAR-10 and STL-10 datasets in the unsupervised setting.
We introduce MeSys, a meaning-based approach to solving English math word problems (MWPs) via understanding and reasoning.
It first analyzes the text, transforms both body and question parts into their corresponding logic forms, and then performs inference on them.
The associated context of each quantity is represented with proposed role-tags (e.g., nsubj, verb, etc.), which provides the flexibility for annotating an extracted math quantity with its associated context information (i.e., the physical meaning of this quantity).
Statistical models are proposed to select the operator and operands.
A noisy dataset is designed to assess if a solver solves MWPs mainly via understanding or mechanical pattern matching.
Experimental results show that our approach outperforms existing systems on both benchmark datasets and the noisy dataset, which demonstrates that the proposed approach better understands the meaning of each quantity in the text.
This paper addresses the problem of IDE interface complexity by introducing a single-window graphical user interface.
The approach consists in removing additional child windows from the IDE, allowing the user to keep only the text editor window open.
We describe an abstract model of an IDE GUI that is based on the most popular modern integrated environments and has generalized user interface parts.
This abstract model is then reorganized into a single-window interface model: access to common IDE functions is provided from the code editing window, while utility windows are removed without loss of IDE functionality.
After that, an implementation of the single-window GUI on KDevelop 4 is described.
Finally, the tool views and usability of several well-known IDEs are surveyed.
Recently, the dense binary pixel Gigavision camera has been introduced, emulating a digital version of photographic film.
While it seems to be a promising solution for HDR imaging, its output is not directly usable and requires an image reconstruction process.
In this work, we formulate this problem as the minimization of a convex objective combining a maximum-likelihood term with a sparse synthesis prior.
We present MLNet, a novel feed-forward neural network that produces acceptable output quality at a fixed complexity and is two orders of magnitude faster than iterative algorithms.
We present state-of-the-art reconstruction results.
The objective of Dem@Care is the development of a complete system providing personal health services to people with dementia, as well as medical professionals and caregivers, by using a multitude of sensors, for context-aware, multi-parametric monitoring of lifestyle, ambient environment, and health parameters.
Multi-sensor data analysis, combined with intelligent decision making mechanisms, will allow an accurate representation of the person's current status and will provide the appropriate feedback, both to the person and the associated caregivers, enhancing the standard clinical workflow.
Within the project framework, several data collection activities have taken place to assist technical development and evaluation tasks.
In all these activities, particular attention has been paid to adhere to ethical guidelines and preserve the participants' privacy.
This technical report briefly describes (a) the main objectives of the project, (b) the main ethical principles, and (c) the datasets that have already been created.
Containers are an emerging technology that hold promise for improving productivity and code portability in scientific computing.
We examine Linux container technology for the distribution of a non-trivial scientific computing software stack and its execution on a spectrum of platforms from laptop computers through to high performance computing (HPC) systems.
We show on a workstation and a leadership-class HPC system that when deployed appropriately there are no performance penalties running scientific programs inside containers.
For Python code run on large parallel computers, the run time is reduced inside a container due to faster library imports.
The software distribution approach and data that we present will help developers and users decide on whether container technology is appropriate for them.
We also provide guidance for the vendors of HPC systems that rely on proprietary libraries for performance on what they can do to make containers work seamlessly and without performance penalty.
Context: Software code reviews are an important part of the development process, leading to better software quality and reduced overall costs.
However, finding appropriate code reviewers is a complex and time-consuming task.
Goals: In this paper, we propose a large-scale study to compare the performance of two main source-code reviewer recommendation algorithms (RevFinder and a Naive Bayes-based approach) in identifying the best code reviewers for open pull requests.
Method: We mined data from Github and Gerrit repositories, building a large dataset of 51 projects, with more than 293K pull requests analyzed, 180K owners and 157K reviewers.
Results: Based on this large-scale analysis, we can state that i) no model can be generalized as best for all projects, ii) the usage of a different repository (Gerrit, GitHub) can have an impact on the recommendation results, and iii) exploiting the sub-project information available in Gerrit can improve the recommendation results.
We present a selective bibliography about efficient SAT solving, focused on optimizations for the CDCL-based algorithms.
Rate adaptation in 802.11 WLANs has received a lot of attention from the research community, with most of the proposals aiming at maximising throughput based on network conditions.
Considering energy consumption, an implicit assumption is that optimality in throughput implies optimality in energy efficiency, but this assumption has been recently put into question.
In this paper, we address via analysis and experimentation the relation between throughput performance and energy efficiency in multi-rate 802.11 scenarios.
We demonstrate the trade-off between these performance figures, confirming that they may not be simultaneously optimised, and analyse their sensitivity towards the energy consumption parameters of the device.
Our results provide the means to design novel rate adaptation schemes that take energy consumption into account.
Glaucoma is a chronic eye disease that leads to irreversible vision loss.
Most of the existing automatic screening methods firstly segment the main structure, and subsequently calculate the clinical measurement for detection and screening of glaucoma.
However, these measurement-based methods rely heavily on the segmentation accuracy, and ignore various visual features.
In this paper, we introduce a deep learning technique to gain additional image-relevant information, and screen glaucoma from the fundus image directly.
Specifically, a novel Disc-aware Ensemble Network (DENet) for automatic glaucoma screening is proposed, which integrates the deep hierarchical context of the global fundus image and the local optic disc region.
Four deep streams, operating at different levels and in different modules, are considered: a global image stream, a segmentation-guided network, a local disc-region stream, and a disc polar-transformation stream.
Finally, the output probabilities of different streams are fused as the final screening result.
The experiments on two glaucoma datasets (SCES and the new SINDI dataset) show that our method outperforms other state-of-the-art algorithms.
The purpose of the current study is to systematically review the crowdsourcing literature, extract the activities which have been cited, and synthesise these activities into a general process model.
For this to happen, we reviewed the related literature on crowdsourcing methods as well as relevant case studies and extracted the activities which they referred to as part of crowdsourcing projects.
The systematic review of the related literature and an in-depth analysis of the steps in those papers were followed by a synthesis of the extracted activities resulting in an eleven-phase process model.
This process model covers all of the activities suggested by the literature.
This paper then briefly discusses activities in each phase and concludes with a number of implications for both academics and practitioners.
Many computer vision algorithms depend on a variety of parameter choices and settings that are typically hand-tuned in the course of evaluating the algorithm.
While such parameter tuning is often presented as being incidental to the algorithm, correctly setting these parameter choices is frequently critical to evaluating a method's full potential.
Compounding matters, these parameters often must be re-tuned when the algorithm is applied to a new problem domain, and the tuning process itself often depends on personal experience and intuition in ways that are hard to describe.
Since the performance of a given technique depends on both the fundamental quality of the algorithm and the details of its tuning, it can be difficult to determine whether a given technique is genuinely better, or simply better tuned.
In this work, we propose a meta-modeling approach to support automated hyper parameter optimization, with the goal of providing practical tools to replace hand-tuning with a reproducible and unbiased optimization process.
Our approach is to expose the underlying expression graph of how a performance metric (e.g. classification accuracy on validation examples) is computed from parameters that govern not only how individual processing steps are applied, but even which processing steps are included.
A hyper parameter optimization algorithm transforms this graph into a program for optimizing that performance metric.
Our approach yields state of the art results on three disparate computer vision problems: a face-matching verification task (LFW), a face identification task (PubFig83) and an object recognition task (CIFAR-10), using a single algorithm.
More broadly, we argue that the formalization of a meta-model supports more objective, reproducible, and quantitative evaluation of computer vision algorithms, and that it can serve as a valuable tool for guiding algorithm development.
Modern cryptocurrencies exploit decentralised blockchains to record a public and unalterable history of transactions.
Besides transactions, further information is stored for different, and often undisclosed, purposes, making blockchains a rich and ever-growing source of valuable information, part of which is difficult to interpret.
Many data analytics tools have been developed, mostly based on purpose-built, ad-hoc engineered approaches.
We propose a general-purpose framework, seamlessly supporting data analytics on both Bitcoin and Ethereum - currently the two most prominent cryptocurrencies.
Such a framework allows us to integrate relevant blockchain data with data from other sources, and to organise them in a database, either SQL or NoSQL.
Our framework is released as an open-source Scala library.
We illustrate the distinguishing features of our approach on a set of significant use cases, which allow us to empirically compare ours to other competing proposals, and evaluate the impact of the database choice on scalability.
In this paper, the idea of a new artificial-intelligence-based optimization algorithm, inspired by the nature of vortices, is briefly introduced.
As a bio-inspired computational algorithm, the idea focuses on typical vortex flow and behavior in nature and draws inspiration from the dynamics that occur within a vortex.
Briefly, the algorithm is a swarm-oriented evolutionary problem-solving approach: it eliminates weak swarm members and improves the solution process by introducing new swarm members into the solution space.
To better assess the algorithm's performance, it has been tested on several benchmark functions.
The obtained results show that the algorithm can be an alternative to existing single-objective optimization methods.
The authors suggest the name Vortex Optimization Algorithm (VOA) for this new intelligent optimization approach.
The use of open-source software (OSS) is ever-increasing, and so is the number of open-source vulnerabilities being discovered and publicly disclosed.
The gains obtained from the reuse of community-developed libraries may be offset by the cost of detecting, assessing, and mitigating their vulnerabilities in a timely fashion.
In this paper we present a novel method to detect, assess and mitigate OSS vulnerabilities that improves on state-of-the-art approaches, which commonly depend on metadata to identify vulnerable OSS dependencies.
Our solution instead is code-centric and combines static and dynamic analysis to determine the reachability of the vulnerable portion of libraries used (directly or transitively) by an application.
Taking this usage into account, our approach then supports developers in choosing among the existing non-vulnerable library versions.
VULAS, the tool implementing our code-centric and usage-based approach, is officially recommended by SAP to scan its Java software, and has been successfully used to perform more than 250000 scans of about 500 applications since December 2016.
We report on our experience and on the lessons we learned when maturing the tool from a research prototype to an industrial-grade solution.
Recently, deep reinforcement learning (DRL) has achieved outstanding success in solving many difficult and large-scale RL problems.
However, the high sample cost required for effective learning often makes DRL unaffordable in resource-limited applications.
With the aim of improving sample efficiency and learning performance, we develop a new DRL algorithm that seamlessly integrates entropy-induced and bootstrap-induced techniques for efficient and deep exploration of the learning environment.
Specifically, a general form of Tsallis entropy regularizer will be utilized to drive entropy-induced exploration based on efficient approximation of optimal action-selection policies.
Different from many existing works that rely on action dithering strategies for exploration, our algorithm is efficient in exploring actions with clear exploration value.
Meanwhile, by employing an ensemble of Q-networks under varied Tsallis entropy regularization, the diversity of the ensemble can be further enhanced to enable effective bootstrap-induced exploration.
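To make the regularizer concrete, the general form of Tsallis entropy can be sketched as a policy bonus. The entropic index `q`, the weight `alpha`, and the way the bonus enters the objective below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def tsallis_entropy(p, q=2.0):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1).

    Recovers the Shannon entropy in the limit q -> 1; for q != 1 it
    penalizes deterministic policies less sharply near the simplex
    boundary, which is one motivation for using it in exploration.
    """
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def regularized_objective(q_values, policy, alpha=0.1, q=2.0):
    """Expected action value under the policy plus a Tsallis entropy bonus."""
    return float(np.dot(policy, q_values) + alpha * tsallis_entropy(policy, q))
```

For example, a uniform policy over four actions has Tsallis entropy 0.75 at q = 2, while a deterministic policy has entropy 0, so the bonus rewards stochastic, exploratory behavior.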
Experiments on Atari game playing tasks clearly demonstrate that our new algorithm can achieve more efficient and effective exploration for DRL, in comparison to recently proposed exploration methods including Bootstrapped Deep Q-Network and UCB Q-Ensemble.
Android apps should be designed to cope with stop-start events, which are the events that require stopping and restoring the execution of an app while leaving its state unaltered.
These events can be caused by run-time configuration changes, such as a screen rotation, and by context-switches, such as a switch from one app to another.
When a stop-start event occurs, Android saves the state of the app, handles the event, and finally restores the saved state.
To let Android save and restore the state correctly, apps must provide the appropriate support.
Unfortunately, Android developers often implement this support incorrectly, or do not implement it at all.
This bad practice makes apps react incorrectly to stop-start events, generating what we call data loss problems: Android apps that lose user data, behave unexpectedly, and crash because program variables have lost their values.
Data loss problems are difficult to detect because they might be observed only when apps are in specific states and with specific inputs.
Covering all the possible cases with testing may require a large number of test cases whose execution must be checked manually to discover whether the app under test has been correctly restored after each stop-start event.
It is thus important to complement traditional in-house testing activities with mechanisms that can protect apps as soon as a data loss problem occurs in the field.
In this paper we present DataLossHealer, a technique for automatically identifying and healing data loss problems in the field as soon as they occur.
DataLossHealer is a technique that checks at run-time whether states are recovered correctly, and heals the app when needed.
DataLossHealer can learn from experience, incrementally reducing the overhead it introduces by avoiding monitoring interactions that the app has managed correctly in the past.
With the rapid growth of online fashion market, demand for effective fashion recommendation systems has never been greater.
In fashion recommendation, the ability to find items that go well with a few other items based on style is more important than picking a single item based on the user's entire purchase history.
Since the same user may have purchased dress suits in one month and casual denims in another, it is impossible to learn the latent style features of those items using only the user ratings.
If we were able to represent the style features of fashion items in a reasonable way, we would be able to recommend new items that conform to some small subset of pre-purchased items making up a coherent style set.
We propose Style2Vec, a vector representation model for fashion items.
Based on the intuition of distributional semantics used in word embeddings, Style2Vec learns the representation of a fashion item using other items in matching outfits as context.
Two different convolutional neural networks are trained to maximize the probability of item co-occurrences.
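As a toy sketch of this co-occurrence objective, the loss below treats each item in an outfit as predicting every other item in the same outfit, skip-gram style. Plain embedding lookup tables stand in for the paper's two image CNNs, and the softmax runs over a small item vocabulary; both are simplifications for illustration.

```python
import numpy as np

def outfit_loss(item_vecs, context_vecs, outfit):
    """Negative log-likelihood of item co-occurrence within one outfit.

    item_vecs, context_vecs: (V, d) arrays, two embedding tables
    (stand-ins for the two CNNs in Style2Vec); outfit: list of item
    indices that appear together in a matching outfit.
    """
    loss = 0.0
    for target in outfit:
        for context in outfit:
            if context == target:
                continue
            scores = context_vecs @ item_vecs[target]        # scores vs. all items
            log_probs = scores - np.log(np.sum(np.exp(scores)))  # log-softmax
            loss -= log_probs[context]                       # NLL of the true context item
    return loss
```

Minimizing this loss over many outfits pushes the embeddings of items that co-occur in outfits closer together, which is what lets the representation capture shared style.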
For evaluation, a fashion analogy test is conducted to show that the resulting representation connotes diverse fashion related semantics like shapes, colors, patterns and even latent styles.
We also perform style classification using Style2Vec features and show that our method outperforms other baselines.
Traditional data mining algorithms are exceptional at seeing patterns in data that humans cannot, but are often confused by details that are obvious to the organic eye.
Algorithms that include humans "in-the-loop" have proved beneficial for accuracy by allowing a user to provide direction in these situations, but the slowness of human interactions causes execution times to increase exponentially.
Thus, we seek to formalize frameworks that include humans "over-the-loop", giving the user an option to intervene when they deem it necessary while not having user feedback be an execution requirement.
With this strategy, we hope to increase the accuracy of solutions with minimal losses in execution time.
This paper describes our vision of this strategy and associated problems.
Recent works have shown promise in using microarchitectural execution patterns to detect malware programs.
These detectors belong to a class of detectors known as signature-based detectors as they catch malware by comparing a program's execution pattern (signature) to execution patterns of known malware programs.
In this work, we propose a new class of detectors - anomaly-based hardware malware detectors - that do not require signatures for malware detection, and thus can catch a wider range of malware including potentially novel ones.
We use unsupervised machine learning to build profiles of normal program execution based on data from performance counters, and use these profiles to detect significant deviations in program behavior that occur as a result of malware exploitation.
We show that real-world exploitation of popular programs such as IE and Adobe PDF Reader on a Windows/x86 platform can be detected with nearly perfect certainty.
We also examine the limits and challenges in implementing this approach in face of a sophisticated adversary attempting to evade anomaly-based detection.
The proposed detector is complementary to previously proposed signature-based detectors and can be used together to improve security.
Advances in deep learning have led to substantial increases in prediction accuracy but have been accompanied by increases in the cost of rendering predictions.
We conjecture that for a majority of real-world inputs, the recent advances in deep learning have created models that effectively "overthink" on simple inputs.
In this paper, we revisit the classic question of building model cascades that primarily leverage class asymmetry to reduce cost.
We introduce "I Don't Know" (IDK) prediction cascades, a general framework for systematically composing a set of pre-trained models to accelerate inference without a loss in prediction accuracy.
We propose two search based methods for constructing cascades as well as a new cost-aware objective within this framework.
The proposed IDK cascade framework can be easily adopted in the existing model serving systems without additional model re-training.
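The basic cascade mechanics can be sketched as follows. The `(label, confidence)` model interface and the per-stage thresholds are hypothetical illustrations, not the paper's API.

```python
def idk_cascade(x, models, thresholds):
    """Run pre-trained models cheapest-first; return the first prediction
    whose confidence clears that stage's threshold, otherwise fall
    through ("I don't know") to the next, more expensive model.

    models: list of callables returning (label, confidence), ordered
    from cheapest to most accurate; thresholds: one value per
    non-final stage.
    """
    for model, tau in zip(models[:-1], thresholds):
        label, conf = model(x)
        if conf >= tau:          # confident enough: stop early, save compute
            return label
        # below threshold: escalate to the next model in the cascade
    label, _ = models[-1][0](x) if isinstance(models[-1], tuple) else models[-1](x)
    return label
```

Because the stages are pre-trained and only the thresholds are chosen, such a cascade can indeed be dropped into an existing model-serving system without re-training, as the abstract notes.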
We evaluate the proposed techniques on a range of benchmarks to demonstrate the effectiveness of the proposed framework.
Research on influence maximization often has to cope with marketing needs relating to the propagation of information towards specific users.
However, little attention has been paid to the fact that the success of an information diffusion campaign might depend not only on the number of the initial influencers to be detected but also on their diversity w.r.t. the target of the campaign.
Our main hypothesis is that if we learn seeds that are not only capable of influencing but also are linked to more diverse (groups of) users, then the influence triggers will be diversified as well, and hence the target users will get higher chance of being engaged.
Upon this intuition, we define a novel problem, named Diversity-sensitive Targeted Influence Maximization (DTIM), which assumes to model user diversity by exploiting only topological information within a social graph.
To the best of our knowledge, we are the first to bring the concept of topology-driven diversity into targeted IM problems, and we provide two alternative definitions of it.
Accordingly, we propose approximate solutions of DTIM, which detect a size-k set of users that maximizes the diversity-sensitive capital objective function, for a given selection of target users.
We evaluate our DTIM methods on a special case of user engagement in online social networks, which concerns users who are not actively involved in the community life.
Experimental evaluation on real networks has demonstrated the meaningfulness of our approach, also highlighting the opportunity of further development of solutions for DTIM applications.
Mobility and network traffic have been traditionally studied separately.
Their interaction is vital for generations of future mobile services and effective caching, but has not been studied in depth with real-world big data.
In this paper, we characterize mobility encounters and study the correlation between encounters and web traffic profiles using large-scale datasets (30TB in size) of WiFi and NetFlow traces.
The analysis quantifies these correlations for the first time, across spatio-temporal dimensions, for device types grouped into on-the-go Flutes and sit-to-use Cellos.
The results consistently show a clear relation between mobility encounters and traffic across different buildings over multiple days, with encountered pairs showing higher traffic similarity than non-encountered pairs, and long encounters being associated with the highest similarity.
We also investigate the feasibility of learning encounters through web traffic profiles, with implications for dissemination protocols, and contact tracing.
This provides a compelling case to integrate both mobility and web traffic dimensions in future models, not only at an individual level, but also at pairwise and collective levels.
We have released samples of code and data used in this study on GitHub, to support reproducibility and encourage further research (https://github.com/BabakAp/encounter-traffic).
This paper proposes an evolutionary Particle Filter with a memory guided proposal step size update and an improved, fully-connected Quantum-behaved Particle Swarm Optimization (QPSO) resampling scheme for visual tracking applications.
The proposal update step uses importance weights proportional to velocities encountered in recent memory to limit the swarm movement within probable regions of interest.
The QPSO resampling scheme uses a fitness-weighted mean-best update to bias the swarm towards the fittest particles, while also employing a simulated annealing operator to avoid subpar fine-tuning during the later iterations.
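A minimal sketch of the fitness-weighted mean-best inside a standard QPSO position update is given below. The attractor form, the weighting, and the contraction-expansion parameter `beta` are illustrative choices (assuming positive fitness values), not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def qpso_update(positions, pbest, fitness, beta=0.75):
    """One QPSO position update using a fitness-weighted mean-best.

    Instead of a plain average of personal bests, each particle's best
    contributes to mbest in proportion to its normalized fitness,
    biasing the swarm toward its fittest members.
    positions, pbest: (n, d) arrays; fitness: (n,) positive array.
    """
    w = fitness / fitness.sum()                       # normalized fitness weights
    mbest = (w[:, None] * pbest).sum(axis=0)          # fitness-weighted mean best
    gbest = pbest[np.argmax(fitness)]                 # global best particle
    phi = rng.random(positions.shape)
    p = phi * pbest + (1.0 - phi) * gbest             # per-particle local attractors
    u = 1.0 - rng.random(positions.shape)             # uniform in (0, 1]
    sign = np.where(rng.random(positions.shape) < 0.5, -1.0, 1.0)
    # quantum-behaved jump around the attractor, scaled by |mbest - x|
    return p + sign * beta * np.abs(mbest - positions) * np.log(1.0 / u)
```

Moving particles toward the fitness-weighted mean best is what concentrates the swarm in high-likelihood regions of the posterior, mitigating sample impoverishment as described above.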
By moving particles closer to high likelihood landscapes of the posterior distribution using such constructs, the sample impoverishment problem that plagues the Particle Filter is mitigated to a great extent.
Experimental results using benchmark sequences imply that the proposed method outperforms competitive candidate trackers such as the Particle Filter and the traditional Particle Swarm Optimization based Particle Filter on a suite of tracker performance indices.
In many sequential decision making tasks, it is challenging to design reward functions that help an RL agent efficiently learn behavior that is considered good by the agent designer.
A number of different formulations of the reward-design problem, or close variants thereof, have been proposed in the literature.
In this paper we build on the Optimal Rewards Framework of Singh et al., which defines the optimal intrinsic reward function as one that, when used by an RL agent, achieves behavior that optimizes the task-specifying or extrinsic reward function.
Previous work in this framework has shown how good intrinsic reward functions can be learned for lookahead search based planning agents.
Whether it is possible to learn intrinsic reward functions for learning agents remains an open problem.
In this paper we derive a novel algorithm for learning intrinsic rewards for policy-gradient based learning agents.
We compare the performance of an augmented agent that uses our algorithm to provide additive intrinsic rewards to an A2C-based policy learner (for Atari games) and a PPO-based policy learner (for Mujoco domains) with a baseline agent that uses the same policy learners but with only extrinsic rewards.
Our results show improved performance on most but not all of the domains.
Event-based collections are often started with a web search, but the search results you find on Day 1 may not be the same as those you find on Day 7.
In this paper, we consider collections that originate from extracting URIs (Uniform Resource Identifiers) from Search Engine Result Pages (SERPs).
Specifically, we seek to provide insight about the retrievability of URIs of news stories found on Google, and to answer two main questions: first, can one "refind" the same URI of a news story (for the same query) from Google after a given time?
Second, what is the probability of finding a story on Google over a given period of time?
To answer these questions, we issued seven queries to Google every day for over seven months (2017-05-25 to 2018-01-12) and collected links from the first five SERPs to generate seven collections for each query.
The queries represent public interest stories: "healthcare bill," "manchester bombing," "london terrorism," "trump russia," "travel ban," "hurricane harvey," and "hurricane irma."
We tracked each URI in all collections over time to estimate the discoverability of URIs from the first five SERPs.
Our results showed that the rate at which stories were replaced on the default Google SERP ranged from 0.21 to 0.54 daily and from 0.39 to 0.79 weekly, suggesting the fast replacement of older stories by newer ones.
The probability of finding the same URI of a news story after one day from the initial appearance on the SERP ranged from 0.34 - 0.44.
After a week, the probability of finding the same news stories diminishes rapidly to 0.01 - 0.11.
Our findings suggest that due to the difficulty in retrieving the URIs of news stories from Google, collection building that originates from search engines should begin as soon as possible in order to capture the first stages of events, and should persist in order to capture the evolution of the events...
Human Skin detection deals with the recognition of skin-colored pixels and regions in a given image.
Skin color is often used in human skin detection because it is invariant to orientation and size and is fast to process.
A new human skin detection algorithm is proposed in this paper.
A skin pixel is recognized using three color models: RGB (Red, Green, Blue), HSV (Hue, Saturation, Value), and YCbCr (Luminance, Chrominance).
The objective of the proposed algorithm is to improve the recognition of skin pixels in given images.
The algorithm not only considers individual ranges in the three color models but also takes into account combinational ranges, which provide greater accuracy in recognizing the skin area in a given image.
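A minimal sketch of such a combinational rule is shown below: a pixel counts as skin only if the RGB, HSV, and YCbCr tests all agree. The specific threshold values are common choices from the skin-detection literature, not necessarily the paper's exact ranges.

```python
import colorsys

def is_skin_pixel(r, g, b):
    """Combine RGB, HSV and YCbCr range tests for one pixel (r, g, b in 0..255)."""
    # RGB rule: bright, reddish pixel with sufficient color spread
    rgb_ok = (r > 95 and g > 40 and b > 20 and
              max(r, g, b) - min(r, g, b) > 15 and
              abs(r - g) > 15 and r > g and r > b)
    # HSV rule: hue near red/orange, moderate saturation
    h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hsv_ok = (h * 360.0 <= 50.0 or h * 360.0 >= 340.0) and 0.1 <= s <= 0.7
    # YCbCr rule: chrominance inside the typical skin cluster (BT.601 conversion)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    ycbcr_ok = 77 <= cb <= 127 and 133 <= cr <= 173
    return rgb_ok and hsv_ok and ycbcr_ok
```

Requiring agreement across all three models trades some recall for precision; loosening any one rule widens the detected skin region at the cost of more false positives.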
Anomaly detection plays an important role in modern data-driven security applications, such as detecting suspicious access to a socket from a process.
In many cases, such events can be described as a collection of categorical values that are considered as entities of different types, which we call heterogeneous categorical events.
Due to the lack of intrinsic distance measures among entities, and the exponentially large event space, most existing work relies heavily on heuristics to calculate abnormal scores for events.
Different from previous work, we propose a principled and unified probabilistic model APE (Anomaly detection via Probabilistic pairwise interaction and Entity embedding) that directly models the likelihood of events.
In this model, we embed entities into a common latent space using their observed co-occurrence in different events.
More specifically, we first model the compatibility of each pair of entities according to their embeddings.
Then we utilize the weighted pairwise interactions of different entity types to define the event probability.
Using Noise-Contrastive Estimation with "context-dependent" noise distribution, our model can be learned efficiently regardless of the large event space.
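The event-scoring idea can be sketched as follows: an event's score is the weighted sum of pairwise compatibilities (here dot products) between its entities' embeddings. The entity names, embeddings, and type-pair weights are toy stand-ins, and the explicit softmax below is only feasible for a small, enumerable event space; the paper uses Noise-Contrastive Estimation precisely to avoid this full normalization.

```python
import numpy as np

def event_score(entities, embeddings, weights):
    """Weighted sum of pairwise embedding compatibilities for one event."""
    score = 0.0
    for i in range(len(entities)):
        for j in range(i + 1, len(entities)):
            compat = np.dot(embeddings[entities[i]], embeddings[entities[j]])
            score += weights[(i, j)] * compat   # weight per pair of entity types
    return score

def event_probability(event, all_events, embeddings, weights):
    """Softmax-normalize scores over a small enumerable event space."""
    scores = np.array([event_score(e, embeddings, weights) for e in all_events])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return probs[all_events.index(event)]
```

Events whose entities have compatible embeddings get high probability; an anomaly score can then be derived from a low modeled likelihood.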
Experimental results on real enterprise surveillance data show that our methods can accurately detect abnormal events compared to other state-of-the-art anomaly detection techniques.
This paper presents a genetic stereo matching algorithm with fuzzy evaluation function.
The proposed algorithm presents a new encoding scheme in which a chromosome is represented by a disparity matrix.
Evolution is controlled by a fuzzy fitness function able to deal with noise and uncertain camera measurements, and uses classical evolutionary operators.
The algorithm produces accurate, dense disparity maps in a computational time suitable for real-time applications, as shown in the experimental results.
The last two decades have seen the emergence and steady development of tangible user interfaces.
While most of these interfaces are applied for input - with output still on traditional computer screens - the goal of programmable matter and actuated shape-changing materials is to directly use the physical objects for visual or tangible feedback.
Advances in material sciences and flexible display technologies are investigated to enable such reconfigurable physical objects.
While existing solutions aim for making physical objects more controllable via the digital world, we propose an approach where holograms (virtual objects) in a mixed reality environment are augmented with physical variables such as shape, texture or temperature.
As such, the support for mobility forms an important contribution of the proposed solution since it enables users to freely move within and across environments.
Furthermore, our augmented virtual objects can co-exist in a single environment with programmable matter and other actuated shape-changing solutions.
The future potential of the proposed approach is illustrated in two usage scenarios and we hope that the presentation of our work in progress on a novel way to realise tangible holograms will foster some lively discussions in the CHI community.
Analyzing signals arising from dynamical systems typically requires many modeling assumptions and parameter estimation.
In high dimensions, this modeling is particularly difficult due to the "curse of dimensionality".
In this paper, we propose a method for building an intrinsic representation of such signals in a purely data-driven manner.
First, we apply a manifold learning technique, diffusion maps, to learn the intrinsic model of the latent variables of the dynamical system, solely from the measurements.
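A minimal diffusion-maps embedding can be sketched in a few lines: Gaussian affinities, row normalization into a Markov matrix, then the top non-trivial eigenvectors as intrinsic coordinates. The kernel scale `eps` and the number of coordinates are illustrative choices, not the paper's settings.

```python
import numpy as np

def diffusion_maps(X, eps=1.0, n_coords=2):
    """Embed samples X (n, d) into n_coords intrinsic coordinates.

    Builds a Gaussian affinity kernel, normalizes it into a Markov
    transition matrix, and uses its leading non-trivial eigenvectors
    (scaled by their eigenvalues) as the diffusion coordinates.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    K = np.exp(-d2 / eps)                                  # affinity kernel
    P = K / K.sum(axis=1, keepdims=True)                   # row-stochastic Markov matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    idx = order[1:n_coords + 1]       # skip the trivial constant eigenvector (eigenvalue 1)
    return vecs[:, idx].real * vals[idx].real
```

These coordinates are the data-driven "latent variables" that the subsequent linear observer then tracks from new incoming measurements.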
Second, we use concepts and tools from control theory and build a linear contracting observer to estimate the latent variables in a sequential manner from new incoming measurements.
The effectiveness of the presented framework is demonstrated by applying it to a toy problem and to a music analysis application.
In these examples we show that our method reveals the intrinsic variables of the analyzed dynamical systems.
Artificial perception is traditionally handled by hand-designing task specific algorithms.
However, a truly autonomous robot should develop perceptive abilities on its own, by interacting with its environment, and adapting to new situations.
The sensorimotor contingencies theory proposes to ground the development of those perceptive abilities in the way the agent can actively transform its sensory inputs.
We propose a sensorimotor approach, inspired by this theory, in which the agent explores the world and discovers its properties by capturing the sensorimotor regularities they induce.
This work presents an application of this approach to the discovery of a so-called visual field as the set of regularities that a visual sensor imposes on a naive agent's experience.
A formalism is proposed to describe how those regularities can be captured in a sensorimotor predictive model.
Finally, the approach is evaluated on a simulated system coarsely inspired from the human retina.
This paper describes a dataset containing small images of text from everyday scenes.
The purpose of the dataset is to support the development of new automated systems that can detect and analyze text.
Although much research has been devoted to text detection and recognition in scanned documents, relatively little attention has been given to text detection in other types of images, such as photographs that are posted on social-media sites.
This new dataset, known as COCO-Text-Patch, contains approximately 354,000 small images that are each labeled as "text" or "non-text".
This dataset particularly addresses the problem of text verification, which is an essential stage in the end-to-end text detection and recognition pipeline.
In order to evaluate the utility of this dataset, it has been used to train two deep convolutional neural networks to distinguish text from non-text.
One network is inspired by the GoogLeNet architecture, and the second one is based on CaffeNet.
Accuracy levels of 90.2% and 90.9% were obtained using the two networks, respectively.
All of the images, source code, and trained deep-learning models described in this paper will be made publicly available.
Stochastic configuration networks (SCNs), as a class of randomized learner models, have been successfully employed in data analytics due to their universal approximation capability and fast modelling property.
The technical essence lies in stochastically configuring hidden nodes (or basis functions) based on a supervisory mechanism rather than data-independent randomization as usually adopted for building randomized neural networks.
Given image data modelling tasks, the use of one-dimensional SCNs potentially destroys the spatial information of images and may result in undesirable performance.
This paper extends the original SCNs to a two-dimensional version, termed 2DSCNs, for fast building of randomized learners with matrix inputs.
Some theoretical analyses on the goodness of 2DSCNs against SCNs, including the complexity of the random parameter space, and the superiority of generalization, are presented.
Empirical results over one regression, four benchmark handwritten digits classification, and two human face recognition datasets demonstrate that the proposed 2DSCNs perform favourably and show good potential for image data analytics.
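The structural difference from a flattened 1D node can be sketched with a single candidate hidden node. The bilinear form `u^T X v + b` below is a common way to parameterize such matrix-input nodes; the SCN supervisory inequality that accepts or rejects candidates is omitted here, so this is a sketch of the node shape only, not of the full 2DSCN construction.

```python
import numpy as np

def random_2d_node(X, rng):
    """One candidate 2D hidden node for a matrix input X (h x w).

    Instead of flattening X, the node uses two small weight vectors
    u (h,) and v (w,), so its activation u^T X v + b respects the
    row/column structure of the image and needs h + w parameters
    rather than h * w.
    """
    h, w = X.shape
    u = rng.uniform(-1, 1, h)
    v = rng.uniform(-1, 1, w)
    b = rng.uniform(-1, 1)
    z = u @ X @ v + b
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation

rng = np.random.default_rng(1)
img = rng.random((8, 8))
act = random_2d_node(img, rng)
print(0.0 < act < 1.0)
```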
Higher-level cognition depends on the ability to learn models of the world.
We can characterize this at the computational level as a structure-learning problem with the goal of best identifying the prevailing causal relationships among a set of relata.
However, the computational cost of performing exact Bayesian inference over causal models grows rapidly as the number of relata increases.
This implies that the cognitive processes underlying causal learning must be substantially approximate.
A powerful class of approximations that focuses on the sequential absorption of successive inputs is captured by the Neurath's ship metaphor in philosophy of science, where theory change is cast as a stochastic and gradual process shaped as much by people's limited willingness to abandon their current theory when considering alternatives as by the ground truth they hope to approach.
Inspired by this metaphor and by algorithms for approximating Bayesian inference in machine learning, we propose an algorithmic-level model of causal structure learning under which learners represent only a single global hypothesis that they update locally as they gather evidence.
We propose a related scheme for understanding how, under these limitations, learners choose informative interventions that manipulate the causal system to help elucidate its workings.
We find support for our approach in the analysis of four experiments.
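The single-global-hypothesis idea can be sketched as a local search over adjacency matrices. The greedy acceptance rule and the toy agreement score below are illustrative assumptions; the paper's model uses stochastic, resource-limited updates rather than this deterministic stand-in.

```python
import random

def local_edge_search(n, score, steps, rng):
    """Single-hypothesis local search over causal structures.

    The learner keeps one global hypothesis (a directed adjacency
    matrix over n variables) and updates it locally: propose flipping
    a single edge, and keep the flip only if the score does not drop.
    """
    g = [[0] * n for _ in range(n)]
    cur = score(g)
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if i == j:
            continue
        g[i][j] ^= 1                     # local proposal: flip one edge
        new = score(g)
        if new >= cur:
            cur = new                    # accept the flip
        else:
            g[i][j] ^= 1                 # reject: undo the flip
    return g

# toy score: agreement with a hidden ground-truth structure
truth = {(0, 1), (1, 2)}
def agreement(g):
    return sum((g[i][j] == 1) == ((i, j) in truth)
               for i in range(3) for j in range(3) if i != j)

rng = random.Random(0)
g = local_edge_search(3, agreement, 200, rng)
print(g)
```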
The use of explicit object detectors as an intermediate step to image captioning - which used to constitute an essential stage in early work - is often bypassed in the currently dominant end-to-end approaches, where the language model is conditioned directly on a mid-level image embedding.
We argue that explicit detections provide rich semantic information, and can thus be used as an interpretable representation to better understand why end-to-end image captioning systems work well.
We provide an in-depth analysis of end-to-end image captioning by exploring a variety of cues that can be derived from such object detections.
Our study reveals that end-to-end image captioning systems rely on matching image representations to generate captions, and that encoding the frequency, size and position of objects are complementary and all play a role in forming a good image representation.
It also reveals that different object categories contribute in different ways towards image captioning.
We aim to review available literature related to the telemonitoring of maternal health care for a comprehensive understanding of the roles of Medical Cyber-Physical-Systems (MCPS) as cutting edge technology in maternal risk factor management, and for understanding the possible research gap in the domain.
In this regard, we searched the literature through the Google Scholar and PubMed databases for published studies that focus on maternal telemonitoring systems using sensors, Cyber-Physical Systems (CPS), and information decision systems for addressing risk factors. We extracted 1340 articles relevant to maternal health care that address different risk factors as their managerial issues.
Of a large number of relevant articles, we included 26 prospective studies relating to sensors or Medical Cyber-Physical-Systems (MCPS) based maternal telemonitoring.
Of the 1340 primary articles, we have short-listed 26 articles (12 articles for risk factor analysis, 9 for synthesis matrices, and 5 for finding essential elements).
We have extracted 17 vital symptoms as maternal risk factors during pregnancy.
Moreover, we have identified a number of cyber-frameworks as the basis of information decision support system to cope with the different maternal complexities.
We have found the Medical Cyber-Physical System (MCPS) as a promising technology to manage the vital risk factors quickly and efficiently by the care provider from a distant place to reduce the fatal risks.
Despite communication issues, MCPS is a key-enabling technology to cope with the advancement of telemonitoring paradigm in the maternal health care system.
XML query can be modeled by twig pattern query (TPQ) specifying predicates on XML nodes and XPath relationships satisfied between them.
Many TPQ types have been proposed; this paper considers a TPQ model extended by a specification of output and non-output query nodes, since it complies with the XQuery semantics and, in many cases, leads to more efficient query processing.
In general, there are two approaches to process the TPQ: holistic joins and binary joins.
Whereas the binary join approach builds a query plan as a tree of interconnected binary operators, the holistic join approach evaluates a whole query using one operator (i.e., using one complex algorithm).
Surprisingly, a thorough analytical and experimental comparison is still missing despite an enormous research effort in this area.
In this paper, we try to fill this gap; we analytically and experimentally show that the binary joins used in a fully-pipelined plan (i.e., the plan where each join operation does not wait for the complete result of the previous operation and no explicit sorting is used) can often outperform the holistic joins, especially for TPQs with a higher ratio of non-output query nodes.
The main contributions of this paper can be summarized as follows: (i) we introduce several improvements of existing binary join approaches allowing to build a fully-pipelined plan for a TPQ considering non-output query nodes, (ii) we prove that for a certain class of TPQs such a plan has the linear time complexity with respect to the size of the input and output as well as the linear space complexity with respect to the XML document depth (i.e., the same complexity as the holistic join approaches), (iii) we show that our improved binary join approach outperforms the holistic join approaches in many situations, and (iv) we propose a simple combined approach that uses advantages of both types of approaches.
Sparse principal component analysis (sparse PCA) aims at finding a sparse basis to improve the interpretability over the dense basis of PCA, meanwhile the sparse basis should cover the data subspace as much as possible.
In contrast to most existing work, which deals with the problem by adding sparsity penalties to various objectives of PCA, in this paper we propose a new method, SPCArt, whose motivation is to find a rotation matrix and a sparse basis such that the sparse basis approximates the basis of PCA after the rotation.
The algorithm of SPCArt consists of three alternating steps: rotate PCA basis, truncate small entries, and update the rotation matrix.
Its performance bounds are also given.
SPCArt is efficient, with each iteration scaling linearly with the data dimension.
It is easy to choose parameters in SPCArt, due to its explicit physical explanations.
Besides, we give a unified view to several existing sparse PCA methods and discuss the connection with SPCArt.
Some ideas in SPCArt are extended to GPower, a popular sparse PCA algorithm, to overcome its drawback.
Experimental results demonstrate that SPCArt achieves the state-of-the-art performance.
It also achieves a good tradeoff among various criteria, including sparsity, explained variance, orthogonality, balance of sparsity among loadings, and computational speed.
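The three alternating steps can be sketched in a few lines of linear algebra. This is a hedged sketch: the truncation rule here is plain hard thresholding with an assumed level `lam`, whereas the paper discusses several truncation variants and gives the corresponding performance bounds.

```python
import numpy as np

def spcart(V, lam=0.3, iters=50):
    """Sketch of SPCArt's alternating steps for a PCA basis V (d x k).

    Repeats: (1) rotate the PCA basis, (2) truncate small entries to
    get a sparse basis, (3) update the rotation via an orthogonal
    Procrustes step (SVD of V^T B).
    """
    d, k = V.shape
    R = np.eye(k)
    for _ in range(iters):
        Z = V @ R                                  # step 1: rotate
        B = np.where(np.abs(Z) > lam, Z, 0.0)      # step 2: truncate
        norms = np.linalg.norm(B, axis=0)
        B = B / np.where(norms > 0, norms, 1.0)    # renormalize columns
        U, _, Wt = np.linalg.svd(V.T @ B)          # step 3: Procrustes
        R = U @ Wt
    return B

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((10, 3)))  # stand-in PCA basis
B = spcart(Q)
print(B.shape)
```

Each iteration costs only matrix products with the d x k basis plus a small k x k SVD, which is where the linear scaling in the data dimension comes from.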
Interest has been revived in the creation of a "bill of rights" for Internet users.
This paper analyzes users' rights into ten broad principles, as a basis for assessing what users regard as important and for comparing different multi-issue Internet policy proposals.
Stability of the principles is demonstrated in an experimental survey, which also shows that freedoms of users to participate in the design and coding of platforms appear to be viewed as inessential relative to other rights.
An analysis of users' rights frameworks that have emerged over the past twenty years similarly shows that such proposals tend to leave out freedoms related to software platforms, as opposed to user data or public networks.
Evaluating policy frameworks in a comparative analysis based on prior principles may help people to see what is missing and what is important as the future of the Internet continues to be debated.
A finite length analysis is introduced for irregular repetition slotted ALOHA (IRSA) that enables accurate estimation of its performance in the moderate-to-high packet loss probability regime, i.e., in the so-called waterfall region.
The analysis is tailored to the collision channel model, which enables mapping the description of the successive interference cancellation process onto the iterative erasure decoding of low-density parity-check codes.
The analysis provides accurate estimates of the packet loss probability of IRSA in the waterfall region as demonstrated by Monte Carlo simulations.
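The peeling process that the analysis maps onto iterative erasure decoding can be sketched as a Monte Carlo frame. The degree distribution below is illustrative (it resembles distributions used in the IRSA literature but is an assumption here), and the sketch models only the idealized collision channel.

```python
import random

def irsa_sic_frame(n_users, n_slots, degree_dist, rng):
    """One Monte Carlo frame of IRSA on the collision channel.

    Each user sends d replicas (d drawn from degree_dist, a list of
    (degree, prob) pairs) in distinct random slots.  Successive
    interference cancellation then repeatedly decodes any slot that
    holds a single replica and cancels that user's other copies --
    the peeling process that mirrors LDPC erasure decoding.
    Returns the packet loss rate for the frame.
    """
    slots = [set() for _ in range(n_slots)]
    for user in range(n_users):
        r, acc, d = rng.random(), 0.0, degree_dist[-1][0]
        for deg, p in degree_dist:
            acc += p
            if r < acc:
                d = deg
                break
        for s in rng.sample(range(n_slots), d):
            slots[s].add(user)
    decoded, progress = set(), True
    while progress:
        progress = False
        for s in slots:
            if len(s) == 1:                 # singleton slot: decode it
                u = next(iter(s))
                decoded.add(u)
                for t in slots:
                    t.discard(u)            # cancel all replicas
                progress = True
    return 1.0 - len(decoded) / n_users

rng = random.Random(0)
plr = irsa_sic_frame(60, 100, [(2, 0.5), (3, 0.28), (8, 0.22)], rng)
print(plr)
```

Averaging this frame-level loss over many random frames is exactly the Monte Carlo baseline against which the finite-length analysis is validated.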
Deep convolutional neural networks (CNN) are widely used in modern artificial intelligence (AI) and smart vision systems but also limited by computation latency, throughput, and energy efficiency on a resource-limited scenario, such as mobile devices, internet of things (IoT), unmanned aerial vehicles (UAV), and so on.
A hardware streaming architecture is proposed to accelerate convolution and pooling computations for state-of-the-art deep CNNs.
It is optimized for energy efficiency by maximizing local data reuse to reduce off-chip DRAM data access.
In addition, image and feature decomposition techniques are introduced to optimize memory access pattern for an arbitrary size of image and number of features within limited on-chip SRAM capacity.
A prototype accelerator was implemented in TSMC 65 nm CMOS technology with 2.3 mm x 0.8 mm core area, which achieves 144 GOPS peak throughput and 0.8 TOPS/W peak energy efficiency.
Over the period of 6 years and three phases, the SEE-GRID programme has established a strong regional human network in the area of distributed scientific computing and has set up a powerful regional Grid infrastructure.
It attracted a number of user communities and applications from diverse fields from countries throughout the South-Eastern Europe.
From the infrastructure point of view, the first project phase has established a pilot Grid infrastructure with more than 20 resource centers in 11 countries.
During the subsequent two phases of the project, the infrastructure has grown to currently 55 resource centers with more than 6600 CPUs and 750 TBs of disk storage, distributed in 16 participating countries.
The inclusion of new resource centers in the existing infrastructure, as well as support for new user communities, has demanded the setup of regionally distributed core services, the development of new monitoring and operational tools, and close collaboration of all partner institutions in managing such a complex infrastructure.
In this paper we give an overview of the development and current status of the SEE-GRID regional infrastructure and describe its transition to the NGI-based Grid model in EGI, with strong SEE regional collaboration.
A trusted electronic election system requires that all the involved information go public; that is, it focuses not only on transparency but also on privacy issues.
In other words, each ballot should be counted anonymously, correctly, and efficiently.
In this work, a lightweight E-voting system is proposed for voters to minimize their trust in the authority or government.
We ensure the transparency of the election by putting all messages on the Ethereum blockchain; in the meantime, the privacy of each individual voter is protected via an efficient and effective ring signature mechanism.
Besides, the attractive self-tallying feature is also built into our system, which guarantees that everyone who can access the blockchain network is able to tally the result on his own; no third party is required after the voting phase.
More importantly, we ensure the correctness of voting results and keep the Ethereum gas cost of individual participant as low as possible, at the same time.
Clearly, the aforementioned characteristics make our system more suitable for large-scale elections.
Text analytics based on supervised machine learning classifiers has shown great promise in a multitude of domains, but has yet to be applied to Seismology.
We test various standard models (Naive Bayes, k-Nearest Neighbors, Support Vector Machines, and Random Forests) on a seismological corpus of 100 articles related to the topic of precursory accelerating seismicity, spanning from 1988 to 2010.
This corpus was labelled in Mignan (2011) according to whether the precursor was explained by critical processes (i.e., cascade triggering) or by other processes (such as the signature of main fault loading).
We investigate whether the classification process can be automatized to help analyze larger corpora in order to better understand trends in earthquake predictability research.
We find that the Naive Bayes model performs best, in agreement with the machine learning literature for the case of small datasets, with cross-validation accuracies of 86% for binary classification.
For a refined multiclass classification ('non-critical process' < 'agnostic' < 'critical process assumed' < 'critical process demonstrated'), we obtain up to 78% accuracy.
Prediction on a dozen articles published since 2011, however, shows weak generalization, with an F1-score of 60%, only slightly better than a random classifier; this can be explained by a change of authorship and the use of different terminologies.
Yet, the model shows F1-scores greater than 80% for the two multiclass extremes ('non-critical process' versus 'critical process demonstrated') while it falls to random classifier results (around 25%) for papers labelled 'agnostic' or 'critical process assumed'.
Those results are encouraging in view of the small size of the corpus and of the high degree of abstraction of the labelling.
Domain knowledge engineering remains essential but can be made transparent by an investigation of Naive Bayes keyword posterior probabilities.
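The bag-of-words Naive Bayes classifier at the core of the study can be sketched from scratch. The documents and labels below are invented toy stand-ins for the corpus, and the paper's actual feature pipeline is richer; this only shows the multinomial model with Laplace smoothing and the keyword posterior view mentioned above.

```python
import math
from collections import Counter

class TinyMultinomialNB:
    """Minimal multinomial Naive Bayes with Laplace smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = {c: math.log(labels.count(c) / len(labels))
                      for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.split())
        self.vocab = set(w for c in self.classes for w in self.counts[c])
        return self

    def predict(self, doc):
        def log_posterior(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            s = self.prior[c]
            for w in doc.split():
                s += math.log((self.counts[c][w] + 1) / total)  # Laplace
            return s
        return max(self.classes, key=log_posterior)

# invented toy documents, not the actual corpus
docs = ["accelerating seismicity critical point",
        "fault loading stress transfer",
        "critical cascade triggering",
        "loading rate on main fault"]
labels = ["critical", "non-critical", "critical", "non-critical"]
clf = TinyMultinomialNB().fit(docs, labels)
print(clf.predict("cascade triggering near critical point"))
```

Inspecting the per-word terms inside `log_posterior` is the transparent view of keyword posterior probabilities that the abstract refers to.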
To improve system performance, modern operating systems (OSes) often undertake activities that require modification of virtual-to-physical page translation mappings.
For example, the OS may migrate data between physical frames to defragment memory and enable superpages.
The OS may migrate pages of data between heterogeneous memory devices.
We refer to all such activities as page remappings.
Unfortunately, page remappings are expensive.
We show that translation coherence is a major culprit and that systems employing virtualization are especially badly affected by their overheads.
In response, we propose hardware translation invalidation and coherence or HATRIC, a readily implementable hardware mechanism to piggyback translation coherence atop existing cache coherence protocols.
We perform detailed studies using KVM-based virtualization, showing that HATRIC achieves up to 30% performance and 10% energy benefits, for per-CPU area overheads of 2%.
We also quantify HATRIC's benefits on systems running Xen and find up to 33% performance improvements.
The Biham-Middleton-Levine (BML) traffic model is a simple two-dimensional, discrete Cellular Automaton (CA) that has been used to study self-organization and phase transitions arising in traffic flows.
From the computational point of view, the BML model exhibits the usual features of discrete CA, where the state of the automaton is updated according to simple rules that depend on the state of each cell and its neighbors.
In this paper we study the impact of various optimizations for speeding up CA computations by using the BML model as a case study.
In particular, we describe and analyze the impact of several parallel implementations that rely on CPU features, such as multiple cores or SIMD instructions, and on GPUs.
Experimental evaluation provides quantitative measures of the payoff of each technique in terms of speedup with respect to a plain serial implementation.
Our findings show that the performance gap between CPU and GPU implementations of the BML traffic model can be reduced by clever exploitation of all CPU features.
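The serial baseline that all the optimized variants are measured against amounts to the simple update rule below: a minimal sketch of one BML step on a toroidal grid, with the usual convention of eastbound cars moving on even steps and southbound cars on odd steps.

```python
def bml_step(grid, t):
    """One update of the BML traffic CA on a toroidal grid.

    grid[r][c] is 0 (empty), 1 (eastbound car) or 2 (southbound car).
    On even steps all eastbound cars try to move right; on odd steps
    all southbound cars try to move down.  A car moves only if its
    target cell is empty in the old state, and all moves are parallel.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    kind = 1 if t % 2 == 0 else 2
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != kind:
                continue
            nr, nc = (r, (c + 1) % cols) if kind == 1 else ((r + 1) % rows, c)
            if grid[nr][nc] == 0:       # target free in the *old* state
                new[r][c] = 0
                new[nr][nc] = kind
    return new

g = [[1, 0, 0],
     [2, 0, 0],
     [0, 0, 2]]
g = bml_step(g, 0)   # eastbound car at (0,0) moves to (0,1)
print(g)
```

Because each cell's next state depends only on its own and its neighbors' old states, the double loop parallelizes naturally, which is what the SIMD and GPU variants exploit.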
Learning with limited data is a key challenge for visual recognition.
Few-shot learning methods address this challenge by learning an instance embedding function from seen classes and applying the function to instances from unseen classes with limited labels.
This style of transfer learning is task-agnostic: the embedding function is not learned to be optimally discriminative with respect to the unseen classes, where discerning among them is the target task.
In this paper, we propose a novel approach to adapt the embedding model to the target classification task, yielding embeddings that are task-specific and are discriminative.
To this end, we employ a type of self-attention mechanism called the Transformer to transform the embeddings from task-agnostic to task-specific by relating the test instances to the training instances in both seen and unseen classes.
Our approach also extends to both transductive and generalized few-shot classification, two important settings that have essential use cases.
We verify the effectiveness of our model on two standard benchmark few-shot classification datasets, MiniImageNet and CUB, where our approach demonstrates state-of-the-art empirical performance.
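The adaptation step can be sketched as one self-attention pass over the task's embeddings. This single-head, no-projection pass is a stripped-down stand-in for the Transformer block in the paper (which learns query/key/value projections); it only illustrates how each embedding gets re-expressed relative to the other instances in the task.

```python
import numpy as np

def adapt_embeddings(E):
    """Adapt task-agnostic embeddings with one self-attention pass.

    E (n x d) holds the embeddings of all instances in the few-shot
    task.  Using E as queries, keys and values, each embedding is
    rewritten as an attention-weighted mixture of the task's own
    instances, making it task-specific.
    """
    d = E.shape[1]
    scores = E @ E.T / np.sqrt(d)                 # scaled dot products
    scores -= scores.max(axis=1, keepdims=True)   # softmax stabilization
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)             # row-wise softmax
    return A @ E

rng = np.random.default_rng(0)
E = rng.standard_normal((5, 8))   # 5 task instances, 8-dim embeddings
E_task = adapt_embeddings(E)
print(E_task.shape)
```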
We revisit a technique of S. Lehr on automata and use it to prove old and new results in a simple way.
We give a very simple proof of the 1986 theorem of Honkala that it is decidable whether a given k-automatic sequence is ultimately periodic.
We prove that it is decidable whether a given k-automatic sequence is overlap-free (or squarefree, or cubefree, etc.).
We prove that the lexicographically least sequence in the orbit closure of a k-automatic sequence is k-automatic, and use this last result to show that several related quantities, such as the critical exponent, irrationality measure, and recurrence quotient for Sturmian words with slope alpha, have automatic continued fraction expansions if alpha does.
This paper introduces a new constraint domain for reasoning about data with uncertainty.
It extends convex modeling with the notion of p-box to gain additional quantifiable information on the data whereabouts.
Unlike existing approaches, the p-box envelops an unknown probability instead of approximating its representation.
The p-box bounds are uniform cumulative distribution functions (cdf) in order to employ linear computations in the probabilistic domain.
The reasoning by means of p-box cdf-intervals is an interval computation that is exerted on the real domain and then projected onto the cdf domain.
This operation conveys additional knowledge represented by the obtained probabilistic bounds.
The empirical evaluation of our implementation shows that, with minimal overhead, the output solution set realizes a full enclosure of the data along with tighter bounds on its probabilistic distributions.
In many real-world machine learning applications, unlabeled data are abundant whereas class labels are expensive and scarce.
An active learner aims to obtain a model of high accuracy with as few labeled instances as possible by effectively selecting useful examples for labeling.
We propose a new selection criterion that is based on statistical leverage scores and present two novel active learning methods based on this criterion: ALEVS for querying single example at each iteration and DBALEVS for querying a batch of examples.
To assess the representativeness of the examples in the pool, ALEVS and DBALEVS use the statistical leverage scores of the kernel matrices computed on the examples of each class.
Additionally, DBALEVS selects a diverse set of examples that are highly representative but dissimilar to already labeled examples by maximizing a submodular set function defined with the statistical leverage scores and the kernel matrix computed on the pool of examples.
The submodularity of the set scoring function lets us efficiently identify batches that are within a constant factor of the optimal batch.
Our experiments on diverse datasets show that querying based on leverage scores is a powerful strategy for active learning.
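The selection criterion rests on a standard quantity that can be sketched directly: the rank-k statistical leverage scores of a kernel matrix. The linear kernel and rank below are illustrative choices, not the paper's settings.

```python
import numpy as np

def leverage_scores(K, k):
    """Rank-k statistical leverage scores of a kernel matrix K (n x n).

    The leverage score of example i is the squared Euclidean norm of
    the i-th row of the top-k eigenvector matrix of K; high-leverage
    examples are the most representative candidates to query.
    """
    w, V = np.linalg.eigh(K)          # eigenvalues in ascending order
    U = V[:, -k:]                     # top-k eigenvectors
    return (U ** 2).sum(axis=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
K = X @ X.T                           # linear kernel (illustrative)
scores = leverage_scores(K, 3)
print(scores.sum())                   # leverage scores always sum to k
```

Since the top-k eigenvector matrix has orthonormal columns, the scores sum to k, which makes them easy to normalize into a sampling or ranking distribution over the pool.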
Many projects rely on cognitive science, neuroscience, computer science, and robotics.
Today they concern the building of autonomous artificial beings able to think.
This paper presents a model to compare human thinking with a hypothetical numerical way of thinking, based on four hierarchies: the information system classification, the cognitive pyramid, the linguistic pyramid, and the digital information hierarchy.
After a state of the art on the nature of human thinking, the feasibility of autonomous multi-agent systems endowed with artificial consciousness and able to think is discussed.
The ethical aspects and consequences for humanity of such systems are evaluated.
These systems call on the scientific community to react.
In recent years, heatmap regression based models have shown their effectiveness in face alignment and pose estimation.
However, Conventional Heatmap Regression (CHR) is neither accurate nor stable when dealing with high-resolution facial videos, since it finds the maximum activated location in heatmaps that are generated from rounded coordinates, which leads to quantization errors when scaling back to the original high-resolution space.
In this paper, we propose a Fractional Heatmap Regression (FHR) for high-resolution video-based face alignment.
The proposed FHR can accurately estimate the fractional part according to the 2D Gaussian function by sampling three points in heatmaps.
To further stabilize the landmarks across continuous video frames while maintaining precision, we propose a novel stabilization loss that contains two terms to address time delay and non-smooth issues, respectively.
Experiments on 300W, 300-VW and Talking Face datasets clearly demonstrate that the proposed method is more accurate and stable than the state-of-the-art models.
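The idea of recovering the fractional part from three sampled heatmap points can be illustrated in one dimension. The log-quadratic fit below is one concrete realization consistent with a Gaussian-shaped heatmap (for which it is exact); it is an illustrative sketch, not the paper's exact estimator.

```python
import math

def fractional_peak_1d(h, x0):
    """Sub-pixel peak location from three heatmap samples.

    For a Gaussian-shaped heatmap, the log-values at x0-1, x0, x0+1
    lie on a parabola whose vertex gives the exact fractional peak,
    avoiding the quantization error of taking the integer argmax.
    """
    lm, l0, lp = math.log(h[x0 - 1]), math.log(h[x0]), math.log(h[x0 + 1])
    return x0 + 0.5 * (lm - lp) / (lm - 2 * l0 + lp)

# synthetic Gaussian heatmap with true peak at 5.3
mu, sigma = 5.3, 1.2
h = [math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) for x in range(11)]
print(fractional_peak_1d(h, 5))   # recovers 5.3 up to float error
```

Taking the plain argmax would return 5, i.e. a 0.3-pixel quantization error that grows in absolute terms when the heatmap is scaled back to high resolution.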
Recently, many graph matching methods that incorporate pairwise constraint and that can be formulated as a quadratic assignment problem (QAP) have been proposed.
Although these methods demonstrate promising results for the graph matching problem, they have high complexity in space or time.
In this paper, we introduce an adaptively transforming graph matching (ATGM) method from the perspective of functional representation.
More precisely, under a transformation formulation, we aim to match two graphs by minimizing the discrepancy between the original graph and the transformed graph.
With a linear representation map of the transformation, the pairwise edge attributes of graphs are explicitly represented by unary node attributes, which enables us to reduce the space and time complexity significantly.
Due to an efficient Frank-Wolfe method-based optimization strategy, we can handle graphs with hundreds or thousands of nodes within an acceptable amount of time.
Meanwhile, because the transformation map can preserve graph structures, a domain adaptation-based strategy is proposed to remove the outliers.
The experimental results demonstrate that our proposed method outperforms the state-of-the-art graph matching algorithms.
Spear phishing is a complex targeted attack in which an attacker harvests information about the victim prior to the attack.
This information is then used to create sophisticated, genuine-looking attack vectors, drawing the victim to compromise confidential information.
What makes spear phishing different, and more powerful than normal phishing, is this contextual information about the victim.
Online social media services can be one such source for gathering vital information about an individual.
In this paper, we characterize and examine a true positive dataset of spear phishing, spam, and normal phishing emails from Symantec's enterprise email scanning service.
We then present a model to detect spear phishing emails sent to employees of 14 international organizations, by using social features extracted from LinkedIn.
Our dataset consists of 4,742 targeted attack emails sent to 2,434 victims, and 9,353 non-targeted attack emails sent to 5,912 non-victims, along with publicly available information from their LinkedIn profiles.
We applied various machine learning algorithms to this labeled data, and achieved an overall maximum accuracy of 97.76% in identifying spear phishing emails.
We used a combination of social features from LinkedIn profiles, and stylometric features extracted from email subjects, bodies, and attachments.
However, we achieved a slightly better accuracy of 98.28% without the social features.
Our analysis revealed that social features extracted from LinkedIn do not help in identifying spear phishing emails.
To the best of our knowledge, this is one of the first attempts to make use of a combination of stylometric features extracted from emails, and social features extracted from an online social network to detect targeted spear phishing emails.
This paper summarizes the work done by the authors for the Zero Resource Speech Challenge organized in the technical program of Interspeech 2015.
The goal of the challenge is to discover linguistic units directly from unlabeled speech data.
The Multi-layered Acoustic Tokenizer (MAT) proposed in this work automatically discovers multiple sets of acoustic tokens from the given corpus.
Each acoustic token set is specified by a set of hyperparameters that describe the model configuration.
These sets of acoustic tokens carry different characteristics of the given corpus and the language behind it, and thus can be mutually reinforced.
The multiple sets of token labels are then used as the targets of a Multi-target DNN (MDNN) trained on low-level acoustic features.
Bottleneck features extracted from the MDNN are used as feedback for the MAT and the MDNN itself.
We call this iterative system the Multi-layered Acoustic Tokenizing Deep Neural Network (MAT-DNN) which generates high quality features for track 1 of the challenge and acoustic tokens for track 2 of the challenge.
As the realization of vehicular communication such as vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I) is imperative for autonomous driving cars, an understanding of realistic vehicle-to-everything (V2X) models is needed.
While previous research has mostly targeted vehicular models in which vehicles are randomly distributed and the variable of carrier frequency was not considered, a more realistic analysis of the V2X model is proposed in this paper.
We use a one-dimensional (1D) Poisson cluster process (PCP) to model a realistic scenario of vehicle distribution in a perpendicular cross line road urban area and compare the coverage results with the previous research that distributed vehicles randomly by Poisson Point Process (PPP).
Moreover, we incorporate the effect of different carrier frequencies, mmWave and sub-6 GHz, to our analysis by altering the antenna radiation pattern accordingly.
Results indicated that while the effect of clustering led to lower outage, using mmWave was even more significant in lowering the outage.
Moreover, line-of-sight (LoS) interference links are shown to be more dominant in lowering the outage than the non-line-of-sight (NLoS) links even though they are less in number.
The analytical results give insight into designing and analyzing the urban V2X channels, and are verified by actual urban area three-dimensional (3D) ray-tracing simulation.
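The 1D PCP vehicle model can be sketched directly. All parameter names and values below are illustrative assumptions (the paper fits its model to an urban cross-road scenario): cluster heads form a Poisson point process on the road, and each head spawns a Poisson number of vehicles around it.

```python
import math
import random

def sample_1d_pcp(road_len, lam_parent, mean_daughters, spread, rng):
    """Sample vehicle positions from a 1D Poisson cluster process.

    Cluster heads (e.g. junctions holding platoons) form a Poisson
    point process of intensity lam_parent on [0, road_len]; each head
    spawns a Poisson-distributed number of vehicles placed uniformly
    within +/- spread of it.
    """
    def poisson(mean):
        # Knuth's method, adequate for small means
        limit, k, p = math.exp(-mean), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                return k
            k += 1

    vehicles = []
    for _ in range(poisson(lam_parent * road_len)):
        head = rng.uniform(0, road_len)
        for _ in range(poisson(mean_daughters)):
            vehicles.append(head + rng.uniform(-spread, spread))
    return sorted(vehicles)

rng = random.Random(42)
positions = sample_1d_pcp(1000.0, 0.01, 4.0, 15.0, rng)
print(len(positions))
```

Replacing the clustered daughters with independent uniform points recovers the PPP baseline that the paper compares against.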
In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations.
The proposed model obtains (near) state-of-the art performance for both part-of-speech tagging and named entity recognition for a variety of languages.
Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks.
The proposed model has favorable generalization properties, as it retains over 89.8% of its average POS tagging accuracy when trained on 1.2% of the total available training data, i.e., 150 sentences per language.
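The feature extraction can be sketched as follows. This is a simplified stand-in: the paper learns the dictionary and solves an l1-regularized sparse-coding problem, whereas the sketch below assumes a fixed dictionary and keeps only the top-k activations, with signed indicator names that are invented for illustration.

```python
def sparse_indicator_features(vec, dictionary, k=3):
    """Turn a dense word vector into sparse indicator features.

    Projects the embedding onto a fixed dictionary of basis vectors
    and keeps the indices (and signs) of the k strongest activations
    as symbolic features usable by a linear sequence labeler.
    """
    acts = [sum(a * b for a, b in zip(vec, atom)) for atom in dictionary]
    top = sorted(range(len(acts)), key=lambda i: -abs(acts[i]))[:k]
    return sorted("F%d%s" % (i, "+" if acts[i] >= 0 else "-") for i in top)

dictionary = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
print(sparse_indicator_features([0.9, -0.2, 0.05], dictionary, k=2))
```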
Wireless systems are getting deployed in many new environments with different antenna heights, frequency bands and multipath conditions.
This has led to an increasing demand for more channel measurements to understand wireless propagation in specific environments and assist deployment engineering.
We design and implement a rapid wireless channel sounding system, using the Universal Software Radio Peripheral (USRP) and GNU Radio software, to address these demands.
Our design measures channel propagation characteristics simultaneously from multiple transmitter locations.
The system consists of multiple battery-powered transmitters and receivers.
Therefore, we can set up the channel sounder rapidly at a field location and measure expeditiously by analyzing the different transmitters' signals during a single walk or drive through the environment.
Our design can be used for both indoor and outdoor channel measurements in the frequency range of 1 MHz to 6 GHz.
We expect that the proposed approach, with a few further refinements, can turn propagation measurement into a routine part of day-to-day wireless network engineering.
The past decade has seen the development of many shared-memory graph processing frameworks intended to reduce the effort of developing high-performance parallel applications.
However, many of these frameworks, based on vertex-centric or edge-centric paradigms, suffer from several issues, such as poor cache utilization, irregular memory accesses, heavy use of synchronization primitives, or theoretical inefficiency, that deteriorate overall performance and scalability.
In this paper, we generalize a recent partition-centric paradigm for PageRank computation to a novel Graph Processing Over Partitions (GPOP) framework that exploits the locality of partitioning to dramatically improve the cache performance of a variety of graph algorithms.
It achieves high scalability by enabling completely lock-free and atomic-free computation.
Its built-in analytical performance model enables it to use a hybrid of source-centric and partition-centric communication modes in a way that ensures work efficiency in each iteration while favoring high-bandwidth sequential memory accesses.
Finally, the GPOP framework is designed with programmability in mind.
It completely abstracts away underlying programming model details from the user and provides an easy-to-program set of APIs with the ability to selectively continue the active vertex set across iterations.
We extensively evaluate the performance of GPOP for a variety of graph algorithms, using several large datasets.
We observe that GPOP incurs up to 8.6x and 5.2x fewer L2 cache misses compared to Ligra and GraphMat, respectively.
In terms of execution time, GPOP is up to 19x and 6.1x faster than Ligra and GraphMat, respectively.
In 2016, 2017, and 2018 at the IEEE Conference on Computational Intelligence in Games, the authors of this paper ran a competition for agents that can play classic text-based adventure games.
This competition fills a gap in existing game AI competitions that have typically focussed on traditional card/board games or modern video games with graphical interfaces.
By providing a platform for evaluating agents in text-based adventures, the competition provides a novel benchmark for game AI with unique challenges for natural language understanding and generation.
This paper summarises the three competitions run in 2016, 2017, and 2018 (including details of open-source implementations of both the competition framework and our competitors) and presents the results of an improved evaluation of these competitors across 20 games.
We address the problem of attack detection and isolation for a class of discrete-time nonlinear systems under (potentially unbounded) sensor attacks and measurement noise.
We consider the case when a subset of sensors is subject to additive false data injection attacks.
Using a bank of observers, each of which leads to an Input-to-State Stable (ISS) estimation error, we propose two algorithms for detecting and isolating sensor attacks.
These algorithms make use of the ISS property of the observers to check whether the trajectories of observers are `consistent' with the attack-free trajectories of the system.
Simulation results are presented to illustrate the performance of the proposed algorithms.
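The consistency check can be made concrete with a minimal numerical sketch. The system matrices, observer gains, and threshold below are hypothetical; in particular, the fixed threshold stands in for the ISS error bound, which the paper derives rather than assumes:

```python
import numpy as np

def detect_attacked_sensors(A, C_rows, L_gains, y_seq, x0_hat, threshold):
    """Flag sensors whose observer residual exceeds an assumed ISS-derived bound.

    Each observer i uses only sensor i. Under an attack-free trajectory the
    residual |y_i - C_i x_hat| stays below `threshold`; a violation marks
    sensor i as inconsistent, i.e. potentially attacked.
    """
    attacked = set()
    for i in range(len(C_rows)):
        x_hat = np.array(x0_hat, dtype=float)
        for y in y_seq:
            r = y[i] - C_rows[i] @ x_hat          # residual for sensor i
            if abs(r) > threshold:
                attacked.add(i)
            x_hat = A @ x_hat + L_gains[i] * r     # Luenberger-style update
    return attacked

# Toy example: stable 2-state system, sensor 1 hit by a constant bias attack.
A = np.array([[0.5, 0.1], [0.0, 0.4]])
C_rows = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
L_gains = [np.zeros(2), np.zeros(2)]   # open-loop observers (A is stable)
x = np.array([1.0, 1.0])
y_seq = []
for _ in range(10):
    y = np.array([C_rows[0] @ x, C_rows[1] @ x])
    y[1] += 1.0                        # false data injection on sensor 1
    y_seq.append(y)
    x = A @ x
```

Running `detect_attacked_sensors(A, C_rows, L_gains, y_seq, [1.0, 1.0], 0.5)` flags only the biased sensor.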
Most state-of-the-art systems today produce morphological analysis based only on orthographic patterns.
In contrast, we propose a model for unsupervised morphological analysis that integrates orthographic and semantic views of words.
We model word formation in terms of morphological chains, from base words to the observed words, breaking the chains into parent-child relations.
We use log-linear models with morpheme and word-level features to predict possible parents, including their modifications, for each word.
The limited set of candidate parents for each word renders contrastive estimation feasible.
Our model consistently matches or outperforms five state-of-the-art systems on Arabic, English and Turkish.
The objective of this research was to design a 2.4 GHz class AB Power Amplifier, with 0.18 um SMIC CMOS technology by using Cadence software, for health care applications.
The ultimate goal for such application is to minimize the trade-offs between performance and cost, and between performance and low power consumption design.
The performance of the power amplifier meets the desired specification requirements.
Recent advances in AI and robotics have claimed many incredible results with deep learning, yet no work to date has applied deep learning to the problem of liquid perception and reasoning.
In this paper, we apply fully-convolutional deep neural networks to the tasks of detecting and tracking liquids.
We evaluate three models: a single-frame network, multi-frame network, and a LSTM recurrent network.
Our results show that the best liquid detection results are achieved when aggregating data over multiple frames, in contrast to standard image segmentation.
They also show that the LSTM network outperforms the other two in both tasks.
This suggests that LSTM-based neural networks have the potential to be a key component for enabling robots to handle liquids using robust, closed-loop controllers.
In this paper, we consider the patient similarity matching problem over a cancer cohort of more than 220,000 patients.
Our approach first leverages the Word2Vec framework to embed ICD codes into a vector-valued representation.
We then propose a sequential algorithm for case-control matching on this representation space of diagnosis codes.
The novel practice of applying sequential matching in this vector representation space of diagnosis codes improved matching accuracy, as measured through multiple clinical outcomes.
We report results on a large-scale dataset to demonstrate the effectiveness of our method.
For such a large dataset where most clinical information has been codified, the new method is particularly relevant.
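As an illustration of the matching step, here is a toy sketch with made-up code embeddings (the paper learns these with Word2Vec) that greedily pairs each case with its most similar unused control by cosine similarity over mean code vectors:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def patient_vector(codes, emb):
    # Represent a patient as the mean of their ICD-code embeddings.
    return np.mean([emb[c] for c in codes], axis=0)

def sequential_match(cases, controls, emb):
    """Greedily match each case to the most similar unused control."""
    pairs, used = [], set()
    for case_id, case_codes in cases.items():
        cv = patient_vector(case_codes, emb)
        best, best_sim = None, -2.0
        for ctrl_id, ctrl_codes in controls.items():
            if ctrl_id in used:
                continue
            s = cosine(cv, patient_vector(ctrl_codes, emb))
            if s > best_sim:
                best, best_sim = ctrl_id, s
        used.add(best)
        pairs.append((case_id, best))
    return pairs

# Hypothetical 2-D embeddings: hypertension and diabetes codes are close.
emb = {'I10': np.array([1.0, 0.0]),
       'E11': np.array([0.9, 0.1]),
       'J45': np.array([0.0, 1.0])}
cases = {'A': ['I10']}
controls = {'X': ['E11'], 'Y': ['J45']}
```

Here `sequential_match(cases, controls, emb)` pairs case A with control X, whose codes lie closer in the embedding space.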
Stigmergy has proven greatly superior in terms of distributed control, robustness, and adaptability, and is thus regarded as an ideal solution for large-scale swarm control problems.
Based on new discoveries about the role of astrocytes in regulating synaptic transmission in the brain, this paper maps the stigmergy mechanism onto the interaction between synapses and investigates its characteristics and advantages.
In particular, we divide the interaction between synapses that are not directly connected into three phases and propose a stigmergic learning model.
In this model, the state change of a stigmergy agent will expand its influence to affect the states of others.
The strength of the interaction is determined by the level of neural activity as well as the distance between stigmergy agents.
Inspired by the morphological and functional changes in astrocytes during environmental enrichment, we conjecture that the regulation of distance between stigmergy agents plays a critical role in the stigmergy learning process.
Simulation results verify its importance and indicate that a well-regulated distance between stigmergy agents helps obtain a stigmergy learning gain.
Automatic note-level transcription is considered one of the most challenging tasks in music information retrieval.
The specific case of flamenco singing transcription poses a particular challenge due to its complex melodic progressions, intonation inaccuracies, the use of a high degree of ornamentation and the presence of guitar accompaniment.
In this study, we explore the limitations of existing state-of-the-art transcription systems for the case of flamenco singing and propose a solution specific to this genre: we first extract the predominant melody and apply a novel contour filtering process to eliminate segments of the pitch contour that originate from the guitar accompaniment.
We formulate a set of onset detection functions based on volume and pitch characteristics to segment the resulting vocal pitch contour into discrete note events.
A quantised pitch label is assigned to each note event by combining global pitch class probabilities with local pitch contour statistics.
The proposed system outperforms state-of-the-art singing transcription systems with respect to voicing accuracy, onset detection, and overall performance when evaluated on flamenco singing datasets.
This paper reports on the 2018 PIRM challenge on perceptual super-resolution (SR), held in conjunction with the Perceptual Image Restoration and Manipulation (PIRM) workshop at ECCV 2018.
In contrast to previous SR challenges, our evaluation methodology jointly quantifies accuracy and perceptual quality, thereby enabling perception-driven methods to compete alongside algorithms that target PSNR maximization.
Twenty-one participating teams introduced algorithms that substantially improved upon the existing state-of-the-art methods in perceptual SR, as confirmed by a human opinion study.
We also analyze popular image quality measures and draw conclusions regarding which of them correlates best with human opinion scores.
We conclude with an analysis of the current trends in perceptual SR, as reflected from the leading submissions.
Independent component analysis (ICA) is a statistical method for transforming an observable multidimensional random vector into components that are as statistically independent from each other as possible. Usually, the ICA framework assumes a model according to which the observations are generated (such as a linear transformation with additive noise).
ICA over finite fields is a special case of ICA in which both the observations and the independent components are over a finite alphabet.
In this work we consider a generalization of this framework in which an observation vector is decomposed to its independent components (as much as possible) with no prior assumption on the way it was generated.
This generalization is also known as Barlow's minimal redundancy representation problem and is considered an open problem.
We propose several theorems and show that this NP-hard problem can be accurately solved with a branch-and-bound search tree algorithm, or tightly approximated with a series of linear problems.
Our contribution provides the first efficient and constructive set of solutions to Barlow's problem. The minimal redundancy representation (also known as a factorial code) has many applications, mainly in the fields of neural networks and deep learning.
The Binary ICA (BICA) is also shown to have applications in several domains including medical diagnosis, multi-cluster assignment, network tomography and internet resource management.
In this work we show this formulation further applies to multiple disciplines in source coding such as predictive coding, distributed source coding and coding of large alphabet sources.
This paper proposes the first user-independent inter-keystroke timing attacks on PINs.
Our attack method is based on an inter-keystroke timing dictionary built from a human cognitive model whose parameters can be determined by a small amount of training data on any users (not necessarily the target victims).
Our attacks can thus be potentially launched on a large scale in real-world settings.
We investigate inter-keystroke timing attacks in different online attack settings and evaluate their performance on PINs at different strength levels.
Our experimental results show that the proposed attack performs significantly better than random guessing attacks.
We further demonstrate that our attacks pose a serious threat to real-world applications and propose various ways to mitigate the threat.
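The timing-dictionary idea can be sketched in a few lines. The keypad timing model below is made up purely for illustration (the actual attack fits a human cognitive model to training data): it assumes that moving a finger farther across a 3-column keypad takes proportionally longer, and ranks all candidate PINs by the Gaussian likelihood of the observed inter-keystroke intervals:

```python
import math
from itertools import product

def interval_mean(a, b):
    # Hypothetical timing model: expected inter-keystroke interval (ms)
    # grows with the grid distance between digits a and b on a keypad.
    ra, ca = divmod(a, 3)
    rb, cb = divmod(b, 3)
    return 100.0 + 40.0 * (abs(ra - rb) + abs(ca - cb))

def rank_pins(observed, sigma=25.0, length=3):
    """Rank all PINs of a given length by the Gaussian log-likelihood of
    the observed inter-keystroke intervals under the timing dictionary."""
    scored = []
    for pin in product(range(10), repeat=length):
        ll = 0.0
        for (a, b), t in zip(zip(pin, pin[1:]), observed):
            mu = interval_mean(a, b)
            ll += -((t - mu) ** 2) / (2 * sigma ** 2)
        scored.append((ll, pin))
    scored.sort(reverse=True)
    return [p for _, p in scored]

# Intervals consistent with typing a PIN like 1-5-9 under this model.
observed = [180.0, 260.0]
ranking = rank_pins(observed)
```

The attacker then tries candidates in ranked order, which is the sense in which the attack "performs significantly better than random guessing."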
Pedestrian attribute inference is a demanding problem in visual surveillance that can facilitate person retrieval, search and indexing.
To exploit semantic relations between attributes, recent research treats it as a multi-label image classification task.
The visual cues hinting at attributes can be strongly localized, and inference of person attributes such as hair, backpack, and shorts is highly dependent on the acquired view of the pedestrian.
In this paper we model this dependence in an end-to-end learning framework and show that view-sensitive attribute inference is able to learn better attribute predictions.
Our proposed model jointly predicts the coarse pose (view) of the pedestrian and learns specialized view-specific multi-label attribute predictions.
We show in an extensive evaluation on three challenging datasets (PETA, RAP and WIDER) that our proposed end-to-end view-aware attribute prediction model provides competitive performance and improves on the published state-of-the-art on these datasets.
Non-uniform and multi-illuminant color constancy are important tasks whose solution will make it possible to discard information about lighting conditions in an image.
Non-uniform illumination and shadows distort colors of real-world objects and mostly do not contain valuable information.
Thus, many computer vision and image processing techniques would benefit from automatic discarding of this information at the pre-processing step.
In this work we propose a novel view on this classical problem via a generative end-to-end algorithm, namely an image-conditioned Generative Adversarial Network.
We also demonstrate the potential of the given approach for joint shadow detection and removal.
Owing to the lack of training data, we render the largest existing shadow removal dataset and make it publicly available.
It consists of approximately 6,000 pairs of wide field of view synthetic images with and without shadows.
Image style transfer models based on convolutional neural networks usually suffer from high temporal inconsistency when applied to videos.
Some video style transfer models have been proposed to improve temporal consistency, yet they fail to guarantee fast processing speed, nice perceptual style quality and high temporal consistency at the same time.
In this paper, we propose a novel real-time video style transfer model, ReCoNet, which can generate temporally coherent style transfer videos while maintaining favorable perceptual styles.
A novel luminance warping constraint is added to the temporal loss at the output level to capture luminance changes between consecutive frames and increase stylization stability under illumination effects.
We also propose a novel feature-map-level temporal loss to further enhance temporal consistency on traceable objects.
Experimental results indicate that our model exhibits outstanding performance both qualitatively and quantitatively.
Light clients, also known as Simple Payment Verification (SPV) clients, are nodes which only download a small portion of the data in a blockchain, and use indirect means to verify that a given chain is valid.
Typically, instead of validating block data, they assume that the chain favoured by the blockchain's consensus algorithm only contains valid blocks, and that the majority of block producers are honest.
By allowing such clients to receive fraud proofs generated by fully validating nodes that show that a block violates the protocol rules, and combining this with probabilistic sampling techniques to verify that all of the data in a block actually is available to be downloaded, we can eliminate the honest-majority assumption, and instead make much weaker assumptions about a minimum number of honest nodes that rebroadcast data.
Fraud and data availability proofs are key to enabling on-chain scaling of blockchains (e.g. via sharding or bigger blocks) while maintaining a strong assurance that on-chain data is available and valid.
We present, implement, and evaluate a novel fraud and data availability proof system.
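The availability-sampling idea can be made concrete with a back-of-the-envelope model (a simplification for illustration, not the paper's exact scheme): if an adversary withholds a fraction f of a block's chunks, a light client sampling s chunks uniformly at random (with replacement, for simplicity) fails to notice with probability (1 - f)^s:

```python
def prob_undetected(withheld_fraction, samples):
    """Probability that `samples` independent uniform chunk queries all land
    on available chunks, i.e. a withholding attack goes unnoticed by one
    client (sampling with replacement, an assumed simplification)."""
    return (1.0 - withheld_fraction) ** samples

# Withholding half the chunks is caught with overwhelming probability
# after only a handful of samples per client:
p = prob_undetected(0.5, 10)   # = 0.5**10, under one in a thousand
```

With many clients sampling independently, the miss probability shrinks multiplicatively, which is why a small per-client sample suffices in aggregate.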
Nonlinear electromagnetic (EM) inverse scattering is a quantitative and super-resolution imaging technique in which more realistic interactions between the internal structure of the scene and the EM wavefield are taken into account in the imaging procedure, in contrast to conventional tomography.
However, it poses important challenges arising from its intrinsic strong nonlinearity, ill-posedness, and expensive computation costs.
To tackle these difficulties, we exploit, for the first time to the best of our knowledge, a connection between the deep neural network (DNN) architecture and the iterative method of nonlinear EM inverse scattering.
This enables the development of a novel DNN-based methodology for nonlinear EM inverse problems (termed here DeepNIS).
The proposed DeepNIS consists of a cascade of multi-layer complex-valued residual convolutional neural network (CNN) modules.
We numerically and experimentally demonstrate that DeepNIS remarkably outperforms conventional nonlinear inverse scattering methods in terms of both image quality and computational time.
We show that DeepNIS can learn a general model approximating the underlying EM inverse scattering system.
It is expected that DeepNIS will serve as a powerful tool for treating highly nonlinear EM inverse scattering problems over different frequency bands, involving large-scale and high-contrast objects, which are extremely hard or impractical to solve using conventional inverse scattering methods.
Learning a high-dimensional dense representation for vocabulary terms, also known as a word embedding, has recently attracted much attention in natural language processing and information retrieval tasks.
The embedding vectors are typically learned based on term proximity in a large corpus.
This means that the objective in well-known word embedding algorithms, e.g., word2vec, is to accurately predict adjacent word(s) for a given word or context.
However, this objective is not necessarily equivalent to the goal of many information retrieval (IR) tasks.
The primary objective in various IR tasks is to capture relevance instead of term proximity, syntactic, or even semantic similarity.
This is the motivation for developing unsupervised relevance-based word embedding models that learn word representations based on query-document relevance information.
In this paper, we propose two learning models with different objective functions; one learns a relevance distribution over the vocabulary set for each query, and the other classifies each term as belonging to the relevant or non-relevant class for each query.
To train our models, we used over six million unique queries and the top ranked documents retrieved in response to each query, which are assumed to be relevant to the query.
We extrinsically evaluate our learned word representation models using two IR tasks: query expansion and query classification.
Both query expansion experiments on four TREC collections and query classification experiments on the KDD Cup 2005 dataset suggest that the relevance-based word embedding models significantly outperform state-of-the-art proximity-based embedding models, such as word2vec and GloVe.
The use of a key-dependent ShiftRows can be considered one method of strengthening a cryptographic algorithm.
This article describes one approach for changing the ShiftRows transformation employed in the AES algorithm.
The approach employs key-dependent methods inspired by DNA processes and structure; the parameters of the new ShiftRows have characteristics identical to those of the original AES algorithm while increasing its resistance against attacks.
The proposed new ShiftRows was tested for correlation coefficients to assess the dynamic and static independence between input and output.
The NIST Statistical Test Suite was used to test the randomness of the block cipher using the new transformation.
This paper presents an infrastructure to test the functionality of the specific architectures output by a high-level compiler targeting dynamically reconfigurable hardware.
It provides a suitable scheme to verify the architectures generated by the compiler each time new optimization techniques are included or changes are made to the compiler.
We believe this kind of infrastructure is important to verify, by functional simulation, further research techniques, as far as compilation to Field-Programmable Gate Array (FPGA) platforms is concerned.
This paper presents a methodology for temporal logic verification of discrete-time stochastic systems.
Our goal is to find a lower bound on the probability that a complex temporal property is satisfied by finite traces of the system.
Desired temporal properties of the system are expressed using a fragment of linear temporal logic, called safe LTL over finite traces.
We propose to use barrier certificates for the computation of such lower bounds, which is computationally much more efficient than existing discretization-based approaches.
The new approach is discretization-free and does not suffer from the curse of dimensionality caused by discretizing state sets.
The proposed approach relies on decomposing the negation of the specification into a union of sequential reachabilities and then using barrier certificates to compute upper bounds for these reachability probabilities.
We demonstrate the effectiveness of the proposed approach on case studies with linear and polynomial dynamics.
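For readers unfamiliar with barrier certificates in the stochastic setting, a standard supermartingale-style condition of the kind used to bound reachability probabilities can be stated as follows (a generic textbook form, not necessarily the exact conditions used in this paper):

```latex
% B : X \to \mathbb{R}_{\ge 0} is a (stochastic) barrier certificate if
B(x) \ge 1 \;\; \forall x \in X_u, \qquad
\mathbb{E}\!\left[\, B(f(x,w)) \mid x \,\right] \le B(x) + c \;\; \forall x \in X,
% where X_u is the unsafe set and f the stochastic transition map. Then,
% for finite traces of length T starting at x_0,
\Pr\!\left\{\, \exists\, k \le T : x_k \in X_u \,\right\} \;\le\; B(x_0) + cT,
% so 1 - (B(x_0) + cT) lower-bounds the probability of avoiding X_u.
```

Upper-bounding each reachability probability in the decomposed specification in this way yields the desired lower bound on the satisfaction probability.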
In today's dynamic ICT environments, the ability to control users' access to resources becomes ever important.
On the one hand, it should adapt to the users' changing needs; on the other hand, it should not be compromised.
Therefore, it is essential to have a flexible access control model, incorporating dynamically changing context information.
Towards this end, this paper introduces a policy framework for context-aware access control (CAAC) applications that extends the role-based access control model with both dynamic associations of user-role and role-permission capabilities.
We first present a formal model of CAAC policies for our framework.
Using this model, we then introduce an ontology-based approach and a software prototype for modelling and enforcing CAAC policies.
In addition, we evaluate our policy ontology model and framework by considering (i) the completeness of the ontology concepts, specifying different context-aware user-role and role-permission assignment policies from the healthcare scenarios; (ii) the correctness and consistency of the ontology semantics, assessing the core and domain-specific ontologies through the healthcare case study; and (iii) the performance of the framework by means of response time.
The evaluation results demonstrate the feasibility of our framework and quantify the performance overhead of achieving context-aware access control to information resources.
This work addresses challenges arising from extracting entities from textual data, including the high cost of data annotation, model accuracy, selecting appropriate evaluation criteria, and the overall quality of annotation.
We present a framework that integrates Entity Set Expansion (ESE) and Active Learning (AL) to reduce the annotation cost of sparse data and provide an online evaluation method as feedback.
This incremental and interactive learning framework allows for rapid annotation and subsequent extraction of sparse data while maintaining high accuracy.
We evaluate our framework on three publicly available datasets and show that it drastically reduces the cost of sparse entity annotation, by an average of 85% and 45% to reach F-scores of 0.9 and 1.0, respectively.
Moreover, the method exhibited robust performance across all datasets.
Linear dynamical relations that may exist in continuous-time, or at some natural sampling rate, are not directly discernable at reduced observational sampling rates.
Indeed, at reduced rates, matricial spectral densities of vectorial time series have maximal rank and thereby cannot be used to ascertain potential dynamic relations between their entries.
This hitherto undeclared source of inaccuracy appears to plague off-the-shelf identification techniques, which seek a remedy in hypothetical observational noise.
In this paper we explain the exact relation between stochastic models at different sampling rates and show how to construct stochastic models at the finest time scale that data allows.
We then point out that the correct number of dynamical dependences can only be ascertained by considering stochastic models at this finest time scale, which in general is faster than the observational sampling rate.
The main goal of this work is to establish a bijection between Dyck words and a family of Eulerian digraphs.
We do so by providing two algorithms implementing such bijection in both directions.
The connection between Dyck words and Eulerian digraphs exploits a novel combinatorial structure: a binary matrix, we call Dyck matrix, representing the cycles of an Eulerian digraph.
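The paper's Dyck-matrix construction is not reproduced here, but the word side of the bijection is easy to sketch: a word over {U, D} is Dyck iff every prefix has at least as many opens as closes and the totals balance, and the Dyck words with n pairs can be enumerated recursively (their count is the n-th Catalan number):

```python
def is_dyck(word, open_ch="U", close_ch="D"):
    """Check the prefix-balance condition defining Dyck words."""
    depth = 0
    for ch in word:
        depth += 1 if ch == open_ch else -1
        if depth < 0:          # a prefix closed more than it opened
            return False
    return depth == 0          # totals must balance

def dyck_words(n):
    """Enumerate all Dyck words with n open/close pairs."""
    out = []
    def rec(prefix, opens, closes):
        if opens == n and closes == n:
            out.append(prefix)
            return
        if opens < n:
            rec(prefix + "U", opens + 1, closes)
        if closes < opens:     # may only close an already-open pair
            rec(prefix + "D", opens, closes + 1)
    rec("", 0, 0)
    return out
```

For example, `dyck_words(3)` yields the 5 Dyck words of semilength 3, matching the Catalan number C(3) = 5.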
We present Solrex, an automated solver for the game of Reverse Hex. Reverse Hex, also known as Rex or Misère Hex, is the variant of the game of Hex in which the player who joins her two sides loses the game.
Solrex performs a mini-max search of the state space using Scalable Parallel Depth First Proof Number Search, enhanced by the pruning of inferior moves and the early detection of certain winning strategies.
Solrex is implemented on the same code base as the Hex program Solver, and can solve arbitrary positions on board sizes up to 6x6, with the hardest position taking less than four hours on four threads.
We show how a simple convolutional neural network (CNN) can be trained to accurately and robustly regress 6 degrees of freedom (6DoF) 3D head pose, directly from image intensities.
We further explain how this FacePoseNet (FPN) can be used to align faces in 2D and 3D as an alternative to explicit facial landmark detection for these tasks.
We claim that in many cases the standard means of measuring landmark detector accuracy can be misleading when comparing different face alignments.
Instead, we compare our FPN with existing methods by evaluating how they affect face recognition accuracy on the IJB-A and IJB-B benchmarks: using the same recognition pipeline, but varying the face alignment method.
Our results show that (a) better landmark detection accuracy measured on the 300W benchmark does not necessarily imply better face recognition accuracy.
(b) Our FPN provides superior 2D and 3D face alignment on both benchmarks.
Finally, (c) FPN aligns faces at a small fraction of the computational cost of comparably accurate landmark detectors.
For many purposes, FPN is thus a far faster and far more accurate face alignment method than using facial landmark detectors.
High transmission rate and secure communication have been identified as the key targets that need to be effectively addressed by fifth generation (5G) wireless systems.
In this context, the concept of physical-layer security becomes attractive, as it can establish perfect security using only the characteristics of the wireless medium.
Nonetheless, an emerging concept termed physical-layer service integration (PHY-SI) has been recognized as an effective means to further increase spectral efficiency.
Its basic idea is to combine multiple coexisting services, i.e., multicast/broadcast service and confidential service, into one integral service for one-time transmission at the transmitter side.
This article first provides a tutorial on typical PHY-SI models.
Furthermore, we propose some state-of-the-art solutions to improve the overall performance of PHY-SI in certain important communication scenarios.
In particular, we highlight the extension of several concepts borrowed from conventional single-service communications, such as artificial noise (AN) and eigenmode transmission, to the scenario of PHY-SI.
These techniques are shown to be effective in the design of reliable and robust PHY-SI schemes.
Finally, several potential research directions are identified for future work.
Efficient and accurate path-sensitive analyses pose the challenges of: (a) analyzing an exponentially-increasing number of paths in a control-flow graph (CFG), and (b) checking feasibility of paths in a CFG.
We address these challenges by introducing an equivalence relation on the CFG paths to partition them into equivalence classes.
It is then sufficient to perform analysis on these equivalence classes rather than on the individual paths in a CFG.
This technique has two major advantages: (a) although the number of paths in a CFG can be exponentially large, the essential information to be analyzed is captured by a small number of equivalence classes, and (b) checking path feasibility becomes simpler.
The key challenge is how to efficiently compute equivalence classes of paths in a CFG without examining each path.
In this paper, we present a linear-time algorithm to form equivalence classes without the need for examination of each path in a CFG.
The key to this algorithm is construction of an event-flow graph (EFG), a compact derivative of the CFG, in which each path represents an equivalence class of paths in the corresponding CFG.
EFGs are defined with respect to the set of events that are in turn defined by the analyzed property.
The equivalence classes are thus guaranteed to preserve all the event traces in the original CFG.
We present an empirical evaluation on the Linux kernel (v3.12).
The EFGs in our evaluation are defined with respect to events of the spin safe-synchronization property.
Evaluation results show that there are many fewer EFG-based equivalence classes compared to the corresponding number of paths in a CFG.
This reduction is close to 99% for CFGs with a large number of paths.
Moreover, our controlled experiment results show that EFGs are human comprehensible and compact compared to their corresponding CFGs.
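Purely to illustrate the equivalence relation (the EFG construction exists precisely to avoid this explicit path enumeration), a brute-force grouping of acyclic-CFG paths by their event traces might look like the following sketch, with a made-up diamond-shaped CFG and event set:

```python
def all_paths(cfg, entry, exit_):
    """Enumerate all entry-to-exit paths of an acyclic CFG (adjacency dict)."""
    if entry == exit_:
        return [[entry]]
    paths = []
    for succ in cfg.get(entry, []):
        for tail in all_paths(cfg, succ, exit_):
            paths.append([entry] + tail)
    return paths

def event_classes(cfg, entry, exit_, events):
    """Group CFG paths by their event trace: two paths are equivalent
    iff they visit the same sequence of event nodes."""
    classes = {}
    for p in all_paths(cfg, entry, exit_):
        trace = tuple(n for n in p if n in events)
        classes.setdefault(trace, []).append(p)
    return classes

# Hypothetical CFG: two back-to-back diamonds ('a' branches to 'b'/'c',
# both rejoin at 'd', which branches to 'e'/'f', both rejoin at 'g').
cfg = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'],
       'd': ['e', 'f'], 'e': ['g'], 'f': ['g']}
# Suppose only 'a', 'd', 'g' touch the analyzed property (e.g. lock events).
classes = event_classes(cfg, 'a', 'g', {'a', 'd', 'g'})
```

Here 4 distinct CFG paths collapse into a single equivalence class, since every path visits the same event sequence ('a', 'd', 'g'); the EFG captures exactly this collapse without enumerating paths.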
Age estimation from a single face image has been an essential task in the field of human-computer interaction and computer vision which has a wide range of practical application value.
Existing methods achieve relatively low accuracy for age estimation on face images in the wild because they take into account only the holistic features of the face image while neglecting the fine-grained features of age-sensitive areas. Based on the ideas of fine-grained categorization and visual attention mechanisms, we propose a method based on an attention LSTM network for fine-grained age estimation in the wild.
This method combines ResNets or RoR models with an LSTM unit to construct AL-ResNets or AL-RoR networks that extract age-sensitive local regions, which effectively improves age estimation accuracy.
Firstly, ResNets or RoR model pre-trained on ImageNet dataset is selected as the basic model, which is then fine-tuned on the IMDB-WIKI-101 dataset for age estimation.
Then, we fine-tune ResNets or RoR on the target age datasets to extract the global features of face images.
To extract the local characteristics of age-sensitive areas, the LSTM unit is then presented to obtain the coordinates of the age-sensitive region automatically.
Finally, the age group classification experiment is conducted directly on the Adience dataset, and age-regression experiments are performed by the Deep EXpectation algorithm (DEX) on MORPH Album 2, FG-NET and LAP datasets.
By combining the global and local features, we got our final prediction results.
Our experiments illustrate the effectiveness of AL-ResNets and AL-RoR for age estimation in the wild, achieving new state-of-the-art performance compared with all other CNN methods on the Adience, MORPH Album 2, FG-NET and LAP datasets.
Deep neural network architectures designed for application domains other than sound, especially image recognition, may not optimally harness the time-frequency representation when adapted to the sound recognition problem.
In this work, we explore the ConditionaL Neural Network (CLNN) and the Masked ConditionaL Neural Network (MCLNN) for multi-dimensional temporal signal recognition.
The CLNN considers the inter-frame relationship, and the MCLNN enforces a systematic sparseness over the network's links to enable learning across frequency bands rather than bins, allowing the network to be frequency-shift invariant, mimicking a filterbank.
The mask also allows considering several combinations of features concurrently, which is usually handcrafted through exhaustive manual search.
We applied the MCLNN to the environmental sound recognition problem using the ESC-10 and ESC-50 datasets.
MCLNN achieved competitive performance compared to state-of-the-art convolutional neural networks, using 12% of the parameters and without augmentation.
We study the stable marriage problem in the partial information setting where the agents, although they have an underlying true strict linear order, are allowed to specify partial orders.
Specifically, we focus on the case where the agents are allowed to submit strict weak orders and we try to address the following questions from the perspective of a market-designer: i) How can a designer generate matchings that are robust? ii) What is the trade-off between the amount of missing information and the "quality" of solution one can get?
With the goal of resolving these questions through a simple and prior-free approach, we suggest looking at matchings that minimize the maximum number of blocking pairs with respect to all the possible underlying true orders as a measure of "quality", and subsequently provide results on finding such matchings.
In particular, we first restrict our attention to matchings that have to be stable with respect to at least one of the completions and show that in this case arbitrarily filling-in the missing information and computing the resulting stable matching can give a non-trivial approximation factor for our problem in certain cases.
We complement this result by showing that, even under severe restrictions on the preferences of the agents, the factor obtained is asymptotically tight in many cases.
We then investigate a special case, where only agents on one side provide strict weak orders and all the missing information is at the bottom of their preference orders, and show that here the negative result mentioned above can be circumvented in order to get a much better approximation factor; this result, too, is tight in many cases.
Finally, we move away from the restriction mentioned above and show a general hardness of approximation result and also discuss one possible approach that can lead us to a near-tight approximation bound.
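The "maximum number of blocking pairs" measure can be made concrete with a small brute-force counter over complete strict preference orders. This is an illustration of the quality measure only, not the paper's algorithms; the dictionary-based representation is an assumption:

```python
def blocking_pairs(matching, men_pref, women_pref):
    """Count pairs (m, w), not matched to each other, where both strictly
    prefer each other to their assigned partners (complete strict orders)."""
    # men_pref / women_pref: dict person -> preference list, best first
    # matching: dict man -> woman (perfect matching)
    partner_of_w = {w: m for m, w in matching.items()}

    def prefers(prefs, p, a, b):  # does p strictly prefer a over b?
        return prefs[p].index(a) < prefs[p].index(b)

    count = 0
    for m in men_pref:
        for w in women_pref:
            if matching[m] != w and prefers(men_pref, m, w, matching[m]) \
                    and prefers(women_pref, w, m, partner_of_w[w]):
                count += 1
    return count
```

A stable matching is then exactly one with zero blocking pairs, and the robustness objective minimizes the maximum of this count over all completions of the partial orders.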
Swarming peer-to-peer systems play an increasingly instrumental role in Internet content distribution.
It is therefore important to better understand how these systems behave in practice.
Recent research efforts have looked at various protocol parameters and have measured how they affect system performance and robustness.
However, the importance of the strategy based on which peers establish connections has been largely overlooked.
This work utilizes extensive simulations to examine the default overlay construction strategy in BitTorrent systems.
Based on the results, we identify a critical parameter, the maximum allowable number of outgoing connections at each peer, and evaluate its impact on the robustness of the generated overlay.
We find that there is no single optimal value for this parameter using the default strategy.
We then propose an alternative strategy that allows certain new peer connection requests to replace existing connections.
Further experiments with the new strategy demonstrate that it outperforms the default one for all considered metrics by creating an overlay more robust to churn.
Additionally, our proposed strategy exhibits optimal behavior for a well-defined value of the maximum number of outgoing connections, thereby removing the need to set this parameter in an ad-hoc manner.
OpenCL is an open standard for parallel programming of heterogeneous compute devices, such as GPUs, CPUs, DSPs or FPGAs.
However, the verbosity of its C host API can hinder application development.
In this paper we present cf4ocl, a software library for rapid development of OpenCL programs in pure C. It aims to reduce the verbosity of the OpenCL API, offering straightforward memory management, integrated profiling of events (e.g., kernel execution and data transfers), a simple but extensible device selection mechanism, and user-friendly error management.
We compare two versions of a conceptual application example, one based on cf4ocl, the other developed directly with the OpenCL host API.
Results show that the former is simpler to implement and offers more features, at the cost of an effectively negligible computational overhead.
Additionally, the tools provided with cf4ocl allowed for a quick analysis on how to optimize the application.
Variance reduction techniques have been shown to be a useful tool for reducing variance in simulation studies.
However, their application and success have so far been mainly domain specific, with relatively few guidelines as to their general applicability, in particular for novices in this area.
To facilitate their use, this study aims to investigate the robustness of individual techniques across a set of scenarios from different domains.
Experimental results show that Control Variates is the only technique which achieves a reduction in variance across all domains.
Furthermore, applied individually, Antithetic Variates and Control Variates perform particularly well in the Cross-docking scenarios, which was previously unknown.
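One of the techniques studied, Antithetic Variates, can be sketched in a few lines. This is a generic textbook illustration, not the study's experimental setup; the uniform-input estimator form is an assumption:

```python
import random

def antithetic_estimate(f, n, seed=0):
    """Estimate E[f(U)], U ~ Uniform(0,1), by averaging antithetic pairs
    f(u) and f(1-u); the negative correlation between the paired draws
    reduces variance when f is monotone."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u = rng.random()
        total += 0.5 * (f(u) + f(1.0 - u))
    return total / n
```

For a monotone integrand such as f(u) = u, each antithetic pair contributes exactly the true mean, so the variance of the estimator collapses.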
This paper explores the problem of ranking short social media posts with respect to user queries using neural networks.
Instead of starting with a complex architecture, we proceed from the bottom up and examine the effectiveness of a simple, word-level Siamese architecture augmented with attention-based mechanisms for capturing semantic soft matches between query and post terms.
Extensive experiments on datasets from the TREC Microblog Tracks show that our simple models not only demonstrate better effectiveness than existing approaches that are far more complex or exploit a more diverse set of relevance signals, but also achieve a 4x speedup in model training and inference.
Network Function Virtualization (NFV) aims to simplify deployment of network services by running Virtual Network Functions (VNFs) on commercial off-the-shelf servers.
Service deployment involves placement of VNFs and in-sequence routing of traffic flows through VNFs comprising a Service Chain (SC).
The joint VNF placement and traffic routing is usually referred to as SC mapping.
In a Wide Area Network (WAN), several traffic flows generated by many distributed node pairs may require the same SC; in such a situation, a single instance (or occurrence) of that SC might not be enough.
SC mapping with multiple SC instances for the same SC turns out to be a very complex problem, since the sequential traversal of VNFs has to be maintained while accounting for traffic flows in various directions.
Our study is the first to deal with SC mapping with multiple SC instances to minimize network resource consumption.
Exact mathematical modeling of this problem results in a quadratic formulation.
We propose a two-phase column-generation-based model and solution in order to get results over large network topologies within reasonable computational times.
Using such an approach, we observe that an appropriate choice of only a small set of SC instances can lead to solutions very close to the minimum bandwidth consumption.
This paper presents an approach to dynamic component composition that facilitates creating new composed components using existing ones at runtime and without any code generation.
The dynamic abilities are supported by extended type notion and implementation based on additional superstructure provided with its Java API and corresponding JavaBeans components.
The new component composition is performed by building the composed prototype object that can be dynamically transformed into the new instantiable type (component).
That approach demonstrates interrelations between prototype-based and class-based component-oriented programming.
The component model proposed can be used when implementing user-defined types in declarative languages for event-driven applications programming.
Graphics Processing Units (GPUs) are becoming popular accelerators in modern High-Performance Computing (HPC) clusters.
Installing GPUs on each node of the cluster is not efficient, resulting in high costs and power consumption as well as underutilisation of the accelerator.
The research reported in this paper is motivated towards the use of few physical GPUs by providing cluster nodes access to remote GPUs on-demand for a financial risk application.
We hypothesise that sharing GPUs between several nodes, referred to as multi-tenancy, reduces the execution time and energy consumed by an application.
Two data transfer modes between the CPU and the GPUs, namely concurrent and sequential, are explored.
The key result from the experiments is that multi-tenancy with few physical GPUs using sequential data transfers lowers the execution time and the energy consumed, thereby improving the overall performance of the application.
No-regret learning has emerged as a powerful tool for solving extensive-form games.
This was facilitated by the counterfactual-regret minimization (CFR) framework, which relies on the instantiation of regret minimizers for simplexes at each information set of the game.
We use an instantiation of the CFR framework to develop algorithms for solving behaviorally-constrained (and, as a special case, perturbed in the Selten sense) extensive-form games, which allows us to compute approximate Nash equilibrium refinements.
Nash equilibrium refinements are motivated by a major deficiency in Nash equilibrium: it provides virtually no guarantees on how it will play in parts of the game tree that are reached with zero probability.
Refinements can mend this issue, but have not been adopted in practice, mostly due to a lack of scalable algorithms.
We show that, compared to standard algorithms, our method finds solutions that have substantially better refinement properties, while enjoying a convergence rate that is comparable to that of state-of-the-art algorithms for Nash equilibrium computation both in theory and practice.
Off-line Chinese character recognition is still a challenging problem, especially in historical documents: not only is the number of classes extremely large in comparison to contemporary image retrieval tasks, but new, unseen classes can also be expected under open learning conditions (even for CNNs).
Chinese character recognition with zero or a few training samples is a difficult problem and has not been studied yet.
In this paper, we propose a new Chinese character recognition method by multi-type attributes, which are based on pronunciation, structure and radicals of Chinese characters, applied to character recognition in historical books.
This intermediate attribute code has a strong advantage over the common `one-hot' class representation because it allows for understanding complex and unseen patterns symbolically using attributes.
First, each character is represented by four groups of attribute types to cover a wide range of character possibilities: Pinyin label, layout structure, number of strokes, codes from three input methods (Cangjie, Zhengma and Wubi), as well as a four-corner encoding method.
A convolutional neural network (CNN) is trained to learn these attributes.
Subsequently, characters can be easily recognized by these attributes using a distance metric and a complete lexicon that is encoded in attribute space.
We evaluate the proposed method on two open datasets, printed Chinese characters for zero-shot learning and historical characters for few-shot learning, and on a closed set of handwritten Chinese characters.
Experimental results show a good general classification of seen classes but also a very promising generalization ability to unseen characters.
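The attribute-space recognition step described above reduces to a nearest-neighbour lookup against an attribute-encoded lexicon. A minimal sketch, assuming numeric attribute vectors and Euclidean distance (the paper's actual distance metric and encoding may differ):

```python
import numpy as np

def recognize(attr_pred, lexicon):
    """Return the lexicon character whose attribute code is nearest
    (Euclidean distance) to the CNN-predicted attribute vector."""
    chars = list(lexicon)
    codes = np.array([lexicon[c] for c in chars], dtype=float)
    d = np.linalg.norm(codes - np.asarray(attr_pred, dtype=float), axis=1)
    return chars[int(np.argmin(d))]
```

Because the lexicon is encoded purely from attributes, characters with zero training images can still be recognized as long as their attribute codes are known.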
In the medical domain, identifying and expanding abbreviations in clinical texts is a vital task for both better human and machine understanding.
It is a challenging task because many abbreviations are ambiguous especially for intensive care medicine texts, in which phrase abbreviations are frequently used.
Besides the fact that there is no universal dictionary of clinical abbreviations and no universal rules for abbreviation writing, such texts are difficult to acquire, expensive to annotate, and sometimes even confusing to domain experts.
This paper proposes a novel and effective approach - exploiting task-oriented resources to learn word embeddings for expanding abbreviations in clinical notes.
We achieved 82.27% accuracy, close to expert human performance.
It is well known that the reserves/redundancies built into the transmission grid in order to address a variety of contingencies over a long planning horizon may, in the short run, cause economic dispatch inefficiency.
Accordingly, power grid optimization by means of short term line switching has been proposed and is typically formulated as a mixed integer programming problem by treating the state of the transmission lines as a binary decision variable, i.e. in-service or out-of-service, in the optimal power flow problem.
To handle the combinatorial explosion, a number of heuristic approaches to grid topology reconfiguration have been proposed in the literature.
This paper extends our recent results on the iterative heuristics and proposes a fast grid decomposition algorithm based on vertex cut sets with the purpose of further reducing the computational cost.
The paper concludes with a discussion of the possible relationship between vertex cut sets in transmission networks and power trading.
In this paper, the average successful throughput, i.e., goodput, of a coded 3-node cooperative network is studied in a Rayleigh fading environment.
It is assumed that a simple automatic repeat request (ARQ) technique is employed in the network, so that an erroneously received codeword is retransmitted until successful delivery.
The relay is assumed to operate in either amplify-and-forward (AF) or decode-and-forward (DF) mode.
Under these assumptions, retransmission mechanisms and protocols are described, and the average time required to send information successfully is determined.
Subsequently, the goodput for both AF and DF relaying is formulated.
The tradeoffs and interactions between the goodput, transmission rates, and relay location are investigated and optimal strategies are identified.
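Under the simplest ARQ abstraction, the number of transmission rounds is geometric, which makes the goodput computation easy to illustrate. This is a simplified sketch, not the paper's exact formulation (which accounts for relaying mode and fading); the i.i.d. per-round error probability is an assumption:

```python
def expected_rounds(p_err):
    """Expected ARQ transmissions when each attempt fails independently
    with probability p_err (geometric number of rounds)."""
    assert 0.0 <= p_err < 1.0
    return 1.0 / (1.0 - p_err)

def goodput(rate, p_err):
    """Simplified goodput: information rate divided by the average number
    of rounds needed for successful delivery (equals rate * (1 - p_err))."""
    return rate / expected_rounds(p_err)
```

The rate/reliability trade-off the abstract refers to is visible here: raising the transmission rate typically raises p_err, so goodput is maximized at an interior rate rather than at the highest one.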
Two genres of heuristics that are frequently reported to perform much better on "real-world" instances than in the worst case are greedy algorithms and local search algorithms.
In this paper, we systematically study these two types of algorithms for the problem of maximizing a monotone submodular set function subject to downward-closed feasibility constraints.
We consider perturbation-stable instances, in the sense of Bilu and Linial, and precisely identify the stability threshold beyond which these algorithms are guaranteed to recover the optimal solution.
Byproducts of our work include the first definition of perturbation-stability for non-additive objective functions, and a resolution of the worst-case approximation guarantee of local search in p-extendible systems.
Audio Event Detection (AED) aims to recognize sounds within audio and video recordings.
AED employs machine learning algorithms commonly trained and tested on annotated datasets.
However, available datasets are limited in number of samples and hence it is difficult to model acoustic diversity.
Therefore, we propose combining labeled audio from a dataset and unlabeled audio from the web to improve the sound models.
The audio event detectors are trained on the labeled audio and run on the unlabeled audio downloaded from YouTube.
Whenever the detectors recognize any of the known sounds with high confidence, the unlabeled audio is used to re-train the detectors.
The performance of the re-trained detectors is compared to the one from the original detectors using the annotated test set.
Results showed an improvement of the AED, and uncovered challenges of using web audio from videos.
Automated negotiation is a rising topic in Artificial Intelligence research.
Monte Carlo methods have attracted increasing interest, in particular since they have been used with success on games with a high branching factor such as Go. In this paper, we describe a Monte Carlo Negotiating Agent (MoCaNA) whose bidding strategy relies on Monte Carlo Tree Search.
We provide our agent with opponent modeling techniques for bidding strategy and utility.
MoCaNA can negotiate on continuous negotiating domains and in a context where no bound has been specified.
We confront MoCaNA and the finalists of ANAC 2014 and a RandomWalker on different negotiation domains.
Our agent outperforms the RandomWalker in a domain without a bound and the majority of the ANAC finalists in a domain with a bound.
Independent Component Analysis (ICA) is a popular model for blind signal separation.
The ICA model assumes that a number of independent source signals are linearly mixed to form the observed signals.
We propose a new algorithm, PEGI (for pseudo-Euclidean Gradient Iteration), for provable model recovery for ICA with Gaussian noise.
The main technical innovation of the algorithm is to use a fixed point iteration in a pseudo-Euclidean (indefinite "inner product") space.
The use of this indefinite "inner product" resolves technical issues common to several existing algorithms for noisy ICA.
This leads to an algorithm which is conceptually simple, efficient and accurate in testing.
Our second contribution is combining PEGI with the analysis of objectives for optimal recovery in the noisy ICA model.
It has been observed that the direct approach of demixing with the inverse of the mixing matrix is suboptimal for signal recovery in terms of the natural Signal to Interference plus Noise Ratio (SINR) criterion.
There have been several partial solutions proposed in the ICA literature.
It turns out that any solution to the mixing matrix reconstruction problem can be used to construct an SINR-optimal ICA demixing, despite the fact that SINR itself cannot be computed from data.
That allows us to obtain a practical and provably SINR-optimal recovery method for ICA with arbitrary Gaussian noise.
Currently, Markov-Gibbs random field (MGRF) image models which include high-order interactions are almost always built by modelling responses of a stack of local linear filters.
Actual interaction structure is specified implicitly by the filter coefficients.
In contrast, we learn an explicit high-order MGRF structure by considering the learning process in terms of general exponential family distributions nested over base models, so that potentials added later can build on previous ones.
We add new features relatively rapidly by skipping the costly optimisation of parameters.
We introduce the use of local binary patterns as features in MGRF texture models, and generalise them by learning offsets to the surrounding pixels.
These prove effective as high-order features, and are fast to compute.
Several schemes for selecting high-order features by composition or search of a small subclass are compared.
Additionally we present a simple modification of the maximum likelihood as a texture modelling-specific objective function which aims to improve generalisation by local windowing of statistics.
The proposed method was experimentally evaluated by learning high-order MGRF models for a broad selection of complex textures and then performing texture synthesis, and succeeded on much of the continuum from stochastic through irregularly structured to near-regular textures.
Learning interaction structure is very beneficial for textures with large-scale structure, although those with complex irregular structure still provide difficulties.
The texture models were also quantitatively evaluated on two tasks and found to be competitive with other works: grading of synthesised textures by a panel of observers; and comparison against several recent MGRF models by evaluation on a constrained inpainting task.
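The local binary pattern features used above are concrete enough to sketch. This shows the standard 8-neighbour LBP code only; the paper generalises it by learning the neighbour offsets, which is not reproduced here:

```python
import numpy as np

def lbp_code(img, r, c, offsets):
    """Local binary pattern: set bit i when the neighbour at offsets[i]
    is at least as bright as the centre pixel (r, c)."""
    centre = img[r, c]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr, c + dc] >= centre:
            code |= 1 << bit
    return code

# standard 8-neighbour ring; the paper instead *learns* these offsets
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]
```

Histograms of such codes over an image give fast, high-order statistics suitable as MGRF potentials.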
Nowadays, the usefulness of a formal language for ensuring the consistency of requirements is well established.
The work presented here is part of the definition of a formally-grounded, model-based requirements engineering method for critical and complex systems.
Requirements are captured through the SysML/KAOS method and the targeted formal specification is written using the Event-B method.
Firstly, an Event-B skeleton is produced from the goal hierarchy provided by the SysML/KAOS goal model.
This skeleton is then completed, in a second step, with the Event-B specification obtained from the system application domain properties, which gives rise to the system structure.
Considering that the domain is represented using ontologies through the SysML/KAOS Domain Model method, is it possible to automatically produce the structural part of system Event-B models?
This paper proposes a set of generic rules that translate SysML/KAOS domain ontologies into an Event-B specification.
The rules have been expressed, verified and validated through the Rodin tool using the Event-B method.
They are illustrated through a case study dealing with a landing gear system.
Our proposition makes it possible to automatically obtain, from a representation of the system application domain in the form of ontologies, the structural part of the Event-B specification which will be used to formally validate the consistency of system requirements.
In this paper, we propose the 3DFeat-Net which learns both 3D feature detector and descriptor for point cloud matching using weak supervision.
Unlike many existing works, we do not require manual annotation of matching point clusters.
Instead, we leverage alignment and attention mechanisms to learn feature correspondences from GPS/INS tagged 3D point clouds without explicitly specifying them.
We create training and benchmark outdoor Lidar datasets, and experiments show that 3DFeat-Net obtains state-of-the-art performance on these gravity-aligned datasets.
In this paper we define the overflow problem of a network coding storage system in which the encoding parameter and the storage parameter are mismatched.
Through analyses and experiments, we first show the impact of the overflow problem in a network coding scheme, which not only wastes storage space but also degrades coding efficiency.
To avoid the overflow problem, we then develop the network coding based secure storage (NCSS) scheme.
By considering both security and storage requirements in its encoding procedures and distributed architecture, NCSS can improve the performance of a cloud storage system in terms of both storage cost and coding processing time.
We analyze the maximum allowable stored encoded data under the perfect secrecy criterion, and provide the design guidelines for the secure cloud storage system to enhance coding efficiency and achieve the minimal storage cost.
The useful life of electrochemical energy storage (EES) is a critical factor to EES planning, operation, and economic assessment.
Today, systems commonly assume a physical end-of-life criterion, retiring EES when the remaining capacity reaches a threshold below which the EES is of little use because of functionality degradation.
Here, we propose an economic end-of-life criterion, where EES is retired when it cannot earn a positive net economic benefit in its intended application.
This criterion depends on the use case and degradation characteristics of the EES, but is independent of initial capital cost.
Using an intertemporal operational framework to consider functionality and profitability degradation, our case study shows that the economic end of life could occur significantly faster than the physical end of life.
We argue that both criteria should be applied in EES system planning and assessment.
We also analyze how R&D efforts should consider cycling capability and calendar degradation rate when considering the economic end-of-life of EES.
While the Internet of things (IoT) promises to improve areas such as energy efficiency, health care, and transportation, it is highly vulnerable to cyberattacks.
In particular, distributed denial-of-service (DDoS) attacks overload the bandwidth of a server.
But many IoT devices form part of cyber-physical systems (CPS).
Therefore, they can be used to launch "physical" denial-of-service attacks (PDoS) in which IoT devices overflow the "physical bandwidth" of a CPS.
In this paper, we quantify the population-based risk to a group of IoT devices targeted by malware for a PDoS attack.
In order to model the recruitment of bots, we develop a "Poisson signaling game," a signaling game with an unknown number of receivers, which have varying abilities to detect deception.
Then we use a version of this game to analyze two mechanisms (legal and economic) to deter botnet recruitment.
Equilibrium results indicate that 1) defenders can bound botnet activity, and 2) legislating a minimum level of security has only a limited effect, while incentivizing active defense can decrease botnet activity arbitrarily.
This work provides a quantitative foundation for proactive PDoS defense.
Future wireless systems are expected to provide a wide range of services to more and more users.
Advanced scheduling strategies thus arise not only to perform efficient radio resource management, but also to provide fairness among the users.
On the other hand, the users' perceived quality, i.e., Quality of Experience (QoE), is becoming one of the main drivers within the schedulers design.
In this context, this paper starts by explaining what QoE is and providing an overview of the evolution of wireless scheduling techniques.
Afterwards, a survey on the most recent QoE-based scheduling strategies for wireless systems is presented, highlighting the application/service of the different approaches reported in the literature, as well as the parameters that were taken into account for QoE optimization.
Therefore, this paper aims at helping readers interested in learning the basic concepts of QoE-oriented wireless resource scheduling, as well as getting in touch with the current research frontier.
Supporting programmable states in the data plane of a forwarding element, e.g., a switch or a NIC, has recently attracted the interest of the research community, which is now looking for the right abstraction to enable the programming of stateful network functions in hardware at line rate.
We challenge the conservative assumptions of state-of-the-art abstractions in this field, e.g. always assuming minimum size packets arriving back-to-back.
Using trace-based simulations we show that by making more realistic assumptions on the traffic characteristics, e.g. larger average packet size, we can relax the design constraints that currently limit the set of functions that can be implemented at line rate, allowing for more complex functions, with no harm for performance.
Personalized driver models play a key role in the development of advanced driver assistance systems and automated driving systems.
Traditionally, physical-based driver models with fixed structures usually lack the flexibility to describe the uncertainties and high non-linearity of driver behaviors.
In this paper, two kinds of learning-based car-following personalized driver models were developed using naturalistic driving data collected from the University of Michigan Safety Pilot Model Deployment program.
One model is developed by combining the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM), and the other one is developed by combining the Gaussian Mixture Model (GMM) and Probability Density Functions (PDF).
Fitting results between the two approaches were analyzed with different model inputs and numbers of GMM components.
Statistical analyses show that both models fit well, while the GMM-PDF approach shows a higher potential to increase model accuracy given a higher dimension of training data.
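The GMM-based prediction step can be illustrated with GMM regression: given a joint Gaussian mixture over an input (e.g., gap) and an output (e.g., following speed), the prediction is the conditional mean. The two-component parameters below are purely illustrative, not fitted to the naturalistic data, and the paper's actual model inputs differ:

```python
import numpy as np

# Hypothetical 2-component 2-D GMM over (x = gap, y = speed).
weights = np.array([0.5, 0.5])
means = np.array([[10.0, 4.0], [30.0, 12.0]])
covs = np.array([[[4.0, 1.0], [1.0, 1.0]],
                 [[9.0, 2.0], [2.0, 2.0]]])

def conditional_mean(x0):
    """E[y | x = x0]: mixture of per-component linear conditional means,
    weighted by each component's responsibility for x0."""
    num = den = 0.0
    for w, (mx, my), c in zip(weights, means, covs):
        sxx, sxy = c[0, 0], c[0, 1]
        # marginal density of x0 under component (up to shared constants)
        px = w * np.exp(-0.5 * (x0 - mx) ** 2 / sxx) / np.sqrt(2 * np.pi * sxx)
        num += px * (my + sxy / sxx * (x0 - mx))
        den += px
    return num / den
```

Near a component's centre the prediction follows that component's local linear regression, which is what lets the mixture capture the non-linearity of driver behavior.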
The traction force of a kite can be used to drive a cyclic motion for extracting wind energy from the atmosphere.
This paper presents a novel quasi-steady modelling framework for predicting the power generated over a full pumping cycle.
The cycle is divided into traction, retraction and transition phases, each described by an individual set of analytic equations.
The effect of gravity on the airborne system components is included in the framework.
A trade-off is made between modelling accuracy and computation speed such that the model is specifically useful for system optimisation and scaling in economic feasibility studies.
Simulation results are compared to experimental measurements of a 20 kW kite power system operated up to a tether length of 720 m. Simulation and experiment agree reasonably well, both for moderate and for strong wind conditions, indicating that the effect of gravity has to be taken into account for a predictive performance simulation.
Recursive query processing has experienced a recent resurgence, as a result of its use in many modern application domains, including data integration, graph analytics, security, program analysis, networking and decision making.
Due to the large volumes of data being processed, several research efforts, across multiple communities, have explored how to scale up recursive queries, typically expressed in Datalog.
Our experience with these tools indicated that their performance does not translate across domains (e.g., a tool designed for large-scale graph analytics does not exhibit the same performance on program-analysis tasks, and vice versa).
As a result, we designed and implemented a general-purpose Datalog engine, called RecStep, on top of a parallel single-node relational system.
In this paper, we outline the different techniques we use in RecStep, and the contribution of each technique to overall performance.
We also present results from a detailed set of experiments comparing RecStep with a number of other Datalog systems using both graph analytics and program-analysis tasks, summarizing pros and cons of existing techniques based on the analysis of our observations.
We show that RecStep generally outperforms state-of-the-art parallel Datalog engines on complex and large-scale Datalog program evaluation, by a 4-6X margin.
An additional insight from our work is that it is possible to build a high-performance Datalog system on top of a relational engine, an idea that has been dismissed in past work in this area.
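The core of recursive Datalog evaluation can be sketched with semi-naive transitive closure, the canonical example. This is a generic illustration of semi-naive evaluation, not RecStep's parallel relational implementation:

```python
def transitive_closure(edges):
    """Semi-naive evaluation of
        path(x, y) :- edge(x, y).
        path(x, y) :- path(x, z), edge(z, y).
    Each round joins only the newly derived facts (the delta) with edge,
    rather than re-deriving everything from scratch."""
    edges = set(edges)
    path = set(edges)
    delta = set(edges)
    while delta:
        new = {(x, y2) for (x, y) in delta for (z, y2) in edges if y == z}
        new -= path  # keep only genuinely new facts
        path |= new
        delta = new
    return path
```

Engines like the one described implement this loop as relational joins and set differences, which is why a parallel relational system is a plausible substrate.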
With the resurgence of chat-based dialog systems in consumer and enterprise applications, there has been much success in developing data-driven and rule-based natural language models to understand human intent.
Since these models require large amounts of data and in-domain knowledge, expanding an equivalent service into new markets is disrupted by language barriers that inhibit dialog automation.
This paper presents a user study to evaluate the utility of out-of-the-box machine translation technology to (1) rapidly bootstrap multilingual spoken dialog systems and (2) enable existing human analysts to understand foreign language utterances.
We additionally evaluate the utility of machine translation in human assisted environments, where a portion of the traffic is processed by analysts.
In English->Spanish experiments, we observe a high potential for dialog automation, as well as the potential for human analysts to process foreign language utterances with high accuracy.
With the advancement of software engineering in recent years, the model checking techniques are widely applied in various areas to do the verification for the system model.
However, it is difficult to apply model checking to verify requirements because the details of the design are lacking.
Unlike other model checking tools, LTSA provides the structure diagram, which can bridge the gap between the requirements and the design.
In this paper, we demonstrate the abilities of LTSA shipped with the classic case study of the steam boiler system.
The structure diagram of LTSA can specify the interactions between the controller and the steam boiler, which can be derived from UML requirements model such as system sequence diagram of the steam boiler system.
The start-up design model of LTSA can be generated from the structure diagram.
Furthermore, we provide a variation law of the steam rate to avoid the issue of state space explosion, and show how to model time both explicitly and implicitly, reflecting the difference between system modeling and the physical world.
Finally, the derived model is verified against the required properties.
Our work demonstrates the potential power of integrating UML with model checking tools in requirement elicitation, system design, and verification.
This paper addresses the problem of reassembling images from disjointed fragments.
More specifically, given an unordered set of fragments, we aim at reassembling one or several possibly incomplete images.
The main contributions of this work are: 1) several deep neural architectures to predict the relative position of image fragments that outperform the previous state of the art; 2) casting the reassembly problem into the shortest path in a graph problem for which we provide several construction algorithms depending on available information; 3) a new dataset of images taken from the Metropolitan Museum of Art (MET) dedicated to image reassembly for which we provide a clear setup and a strong baseline.
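The shortest-path formulation in contribution (2) can be illustrated with a standard Dijkstra search; the fragment graph and edge costs below are hypothetical placeholders for the learned pairwise placement scores, not the paper's actual construction:

```python
import heapq

def dijkstra(graph, source, target):
    """Shortest path in a weighted directed graph.

    graph: dict mapping node -> list of (neighbor, cost) pairs.
    Returns (total_cost, path), or (inf, []) if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float('inf'), []
    path, node = [target], target
    while node != source:
        node = prev[node]
        path.append(node)
    return dist[target], path[::-1]

# Toy fragment graph: edge weights stand in for (negated) placement scores.
g = {'A': [('B', 1.0), ('C', 4.0)], 'B': [('C', 1.5)], 'C': []}
cost, path = dijkstra(g, 'A', 'C')
```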
This paper studies the subspace clustering problem.
Given some data points approximately drawn from a union of subspaces, the goal is to group these data points into their underlying subspaces.
Many subspace clustering methods have been proposed; among them, sparse subspace clustering and low-rank representation are two representative ones.
Despite their different motivations, we observe that many existing methods share a common block diagonal property, which possibly leads to correct clustering, yet with their proofs given case by case.
In this work, we consider a general formulation and provide a unified theoretical guarantee of the block diagonal property.
The block diagonal property of many existing methods follows as a special case of our result.
Second, we observe that many existing methods approximate the block diagonal representation matrix by using different structure priors, e.g., sparsity and low-rankness, which are indirect.
We propose the first block diagonal matrix induced regularizer for directly pursuing the block diagonal matrix.
With this regularizer, we solve the subspace clustering problem by Block Diagonal Representation (BDR), which uses the block diagonal structure prior.
The BDR model is nonconvex and we propose an alternating minimization solver and prove its convergence.
Experiments on real datasets demonstrate the effectiveness of BDR.
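To see why a block diagonal affinity "possibly leads to correct clustering": if the learned affinity matrix is (a permutation of) a block diagonal matrix, the clusters are exactly the connected components of the graph defined by its nonzero entries. A minimal sketch with an illustrative affinity matrix:

```python
def components_from_affinity(W, tol=1e-12):
    """Cluster labels from an affinity matrix: when W is (a permutation of)
    a block diagonal matrix, the blocks are the connected components of the
    graph whose edges are the nonzero entries of W."""
    n = len(W)
    labels = [-1] * n
    label = 0
    for s in range(n):
        if labels[s] != -1:
            continue
        stack = [s]
        labels[s] = label
        while stack:                      # depth-first traversal of one block
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and abs(W[u][v]) > tol:
                    labels[v] = label
                    stack.append(v)
        label += 1
    return labels

# Two blocks interleaved by a permutation: points {0, 2} vs. {1, 3}.
W = [[1.0, 0.0, 0.8, 0.0],
     [0.0, 1.0, 0.0, 0.5],
     [0.8, 0.0, 1.0, 0.0],
     [0.0, 0.5, 0.0, 1.0]]
labels = components_from_affinity(W)
```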
This paper is a reply to the comments on 'Integer SEC-DED codes for low power communications'.
In settings where only unlabelled speech data is available, speech technology needs to be developed without transcriptions, pronunciation dictionaries, or language modelling text.
A similar problem is faced when modelling infant language acquisition.
In these cases, categorical linguistic structure needs to be discovered directly from speech audio.
We present a novel unsupervised Bayesian model that segments unlabelled speech and clusters the segments into hypothesized word groupings.
The result is a complete unsupervised tokenization of the input speech in terms of discovered word types.
In our approach, a potential word segment (of arbitrary length) is embedded in a fixed-dimensional acoustic vector space.
The model, implemented as a Gibbs sampler, then builds a whole-word acoustic model in this space while jointly performing segmentation.
We report word error rates in a small-vocabulary connected digit recognition task by mapping the unsupervised decoded output to ground truth transcriptions.
The model achieves around 20% error rate, outperforming a previous HMM-based system by about 10% absolute.
Moreover, in contrast to the baseline, our model does not require a pre-specified vocabulary size.
This note provides a description of a procedure that is designed to efficiently optimize expensive black-box functions.
It uses the response surface methodology by incorporating radial basis functions as the response model.
A simple method based on a Latin hypercube is used for initial sampling.
A modified version of the CORS algorithm with space rescaling is used for the subsequent sampling.
The procedure is able to scale on multicore processors by performing multiple function evaluations in parallel.
The source code of the procedure is written in Python.
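The initial design step can be sketched as follows; this is a generic Latin hypercube sampler written for illustration, not the procedure's actual code:

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Latin hypercube sample: each dimension's range is split into
    n_samples equal strata, and each stratum is used exactly once."""
    rng = random.Random(seed)
    dim = len(bounds)
    samples = [[0.0] * dim for _ in range(n_samples)]
    for d, (lo, hi) in enumerate(bounds):
        strata = list(range(n_samples))
        rng.shuffle(strata)               # random pairing of strata to samples
        for i in range(n_samples):
            u = (strata[i] + rng.random()) / n_samples  # point within stratum
            samples[i][d] = lo + u * (hi - lo)
    return samples

# Five stratified points over a hypothetical 2-D design space.
pts = latin_hypercube(5, [(-2.0, 2.0), (0.0, 10.0)])
```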
The software development process for embedded systems is getting faster and faster, which generally incurs an increase in the associated complexity.
As a consequence, consumer electronics companies usually invest a lot of resources in fast and automatic verification processes, in order to create robust systems and reduce product recall rates.
To this end, the present paper proposes a simplified version of the Qt framework, which is integrated into the Efficient SMT-Based Bounded Model Checking tool to verify actual applications that use the framework.
The proposed method achieves a success rate of 94.45% on the developed test suite.
Analysing and explaining relationships between entities in a graph is a fundamental problem associated with many practical applications.
For example, a graph of biological pathways can be used for discovering a previously unknown relationship between two proteins.
Domain experts, however, may be reluctant to trust such a discovery without a detailed explanation as to why exactly the two proteins are deemed related in the graph.
This paper provides an overview of the types of solutions, their associated methods and strategies, that have been proposed for finding entity relatedness explanations in graphs.
The first type of solution relies on information inherent to the paths connecting the entities.
This type of solution provides entity relatedness explanations in the form of a list of ranked paths.
The rank of a path is measured in terms of importance, uniqueness, novelty and informativeness.
The second type of solution relies on measures of node relevance.
In this case, the relevance of nodes is measured w.r.t. the entities of interest, and relatedness explanations are provided in the form of a subgraph that maximises node relevance scores.
This paper uses this classification of approaches to discuss and contrast some of the key concepts that guide different solutions to the problem of entity relatedness explanation in graphs.
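As a sketch of the first, path-based type of solution, the following enumerates simple paths between two entities and ranks them; the scoring rule (shorter paths through lower-degree intermediate nodes rank higher) is only an illustrative proxy for the importance, uniqueness, novelty, and informativeness measures discussed above:

```python
def simple_paths(graph, src, dst, max_len=4):
    """Enumerate simple paths with at most max_len nodes (iterative DFS)."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            yield path
            continue
        if len(path) > max_len:
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:           # keep paths simple (no repeats)
                stack.append((nxt, path + [nxt]))

def rank_paths(graph, src, dst, max_len=4):
    """Rank explanation paths: prefer short paths whose intermediate
    nodes have low degree (a rarity/informativeness proxy)."""
    def score(path):
        deg = sum(len(graph.get(v, [])) for v in path[1:-1])
        return (len(path), deg)
    return sorted(simple_paths(graph, src, dst, max_len), key=score)

# Hypothetical pathway graph: 'hub' is a high-degree, uninformative node.
g = {'p1': ['complexA', 'hub'], 'complexA': ['p2'],
     'hub': ['p2', 'x', 'y', 'z'], 'p2': []}
ranked = rank_paths(g, 'p1', 'p2')
```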
Defects4J is a large, peer-reviewed, structured dataset of real-world Java bugs.
Each bug in Defects4J comes with a test suite and at least one failing test case that triggers the bug.
In this paper, we report on an experiment to explore the effectiveness of automatic test-suite based repair on Defects4J.
The result of our experiment shows that the considered state-of-the-art repair methods can generate patches for 47 out of 224 bugs.
However, those patches are only test-suite adequate, which means that they pass the test suite and may potentially be incorrect beyond the test-suite satisfaction correctness criterion.
We have manually analyzed 84 different patches to assess their real correctness.
In total, 9 real Java bugs can be correctly repaired with test-suite based repair.
This analysis shows that test-suite based repair suffers from under-specified bugs, for which trivial or incorrect patches still pass the test suite.
With respect to practical applicability, it takes on average 14.8 minutes to find a patch.
The experiment was done on a scientific grid, totaling 17.6 days of computation time.
All the repair systems and experimental results are publicly available on Github in order to facilitate future research on automatic repair.
Most machine learning tools work with a single table where each row is an instance and each column is an attribute.
Each cell of the table contains an attribute value for an instance.
This representation prevents one important form of learning: classification based on groups of correlated records, such as multiple exams of a single patient, internet customer preferences, or weather and sea-condition forecasts for a given day.
To some extent, relational learning methods, such as inductive logic programming, can capture this correlation through the use of intensional predicates added to the background knowledge.
In this work, we propose SPPAM, an algorithm that aggregates past observations in one single record.
We show that applying SPPAM to the original correlated data, before the learning task, can produce classifiers that are better than the ones trained using all records.
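SPPAM's exact aggregation scheme is not detailed here; as a generic sketch of the idea, the following collapses each group of correlated records into a single record of per-feature summary statistics:

```python
from collections import defaultdict
from statistics import mean

def aggregate_records(records, key, features):
    """Collapse all records sharing `key` into one record holding
    per-feature summary statistics (mean/min/max) plus a group count."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    out = []
    for k, rows in groups.items():
        agg = {key: k, 'count': len(rows)}
        for f in features:
            vals = [r[f] for r in rows]
            agg[f + '_mean'] = mean(vals)
            agg[f + '_min'] = min(vals)
            agg[f + '_max'] = max(vals)
        out.append(agg)
    return out

# Hypothetical example: several exams of the same patient become one record.
exams = [{'patient': 1, 'glucose': 90}, {'patient': 1, 'glucose': 110},
         {'patient': 2, 'glucose': 100}]
summary = aggregate_records(exams, 'patient', ['glucose'])
```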
We consider a general regularised interpolation problem for learning a parameter vector from data.
The well known representer theorem says that under certain conditions on the regulariser there exists a solution in the linear span of the data points.
This is the core of kernel methods in machine learning as it makes the problem computationally tractable.
Necessary and sufficient conditions for differentiable regularisers on Hilbert spaces to admit a representer theorem have been proved.
We extend those results to nondifferentiable regularisers on uniformly convex and uniformly smooth Banach spaces.
This gives a (more) complete answer to the question when there is a representer theorem.
We then note that for regularised interpolation in fact the solution is determined by the function space alone and independent of the regulariser, making the extension to Banach spaces even more valuable.
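For reference, the classical Hilbert-space statement that the paper generalizes can be sketched as follows (this is the standard formulation, not the paper's Banach-space result): the regularised interpolation problem

```latex
\min_{w \in \mathcal{H}} \; R(w)
\quad \text{s.t.} \quad \langle w, x_i \rangle = y_i,\; i = 1,\dots,m,
\qquad \Longrightarrow \qquad
w^{\star} = \sum_{i=1}^{m} c_i\, x_i ,
```

i.e., under suitable conditions on the regulariser $R$, a solution exists in the linear span of the data points, which is what makes kernel methods computationally tractable.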
Kron reduction is used to simplify the analysis of multi-machine power systems under certain steady state assumptions that underly the usage of phasors.
In this paper we show how to perform Kron reduction for a class of electrical networks without steady state assumptions.
The reduced models can thus be used to analyze the transient as well as the steady state behavior of these electrical networks.
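In its classical (steady-state, phasor) form, Kron reduction eliminates an interior node $m$ of an admittance matrix $Y$ by taking a Schur complement, elementwise $Y'_{jk} = Y_{jk} - Y_{jm}Y_{mk}/Y_{mm}$. A minimal sketch of that textbook case:

```python
def kron_reduce(Y, m):
    """Eliminate node m from admittance matrix Y (list of lists) via the
    Schur complement: Y'[j][k] = Y[j][k] - Y[j][m] * Y[m][k] / Y[m][m]."""
    n = len(Y)
    keep = [i for i in range(n) if i != m]
    return [[Y[j][k] - Y[j][m] * Y[m][k] / Y[m][m] for k in keep]
            for j in keep]

# Star network: nodes 0 and 1 each connected to interior node 2 by unit
# admittance; eliminating node 2 yields a direct line of admittance 0.5.
Y = [[1.0, 0.0, -1.0],
     [0.0, 1.0, -1.0],
     [-1.0, -1.0, 2.0]]
Yr = kron_reduce(Y, 2)
```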
The L1 adaptive controller has been recognized for having a structure that decouples robustness from adaptation, owing to the introduction of a low-pass filter with adjustable gain in the feedback loop.
The trade-off between performance, fast adaptation, and robustness is the main criterion when selecting the structure or the coefficients of this filter.
Several off-line methods with varying levels of complexity exist to help finding bounds or initial values for these coefficients.
Such values may require further refinement through trial-and-error procedures upon implementation.
Moreover, these approaches assume that, once implemented, the values are kept fixed, leading to sub-optimal performance in both speed of adaptation and robustness.
In this paper, a new practical approach based on fuzzy rules for online continuous tuning of these coefficients is proposed.
The fuzzy controller is optimally tuned using Particle Swarm Optimization (PSO), taking into account both the tracking error and the range of the controller output signal.
Simulations of several systems with moderate to severe nonlinearities demonstrate that the proposed approach offers improved control performance.
Keywords: fuzzy logic control, L1 adaptive control, fuzzy L1 adaptive control, single- and multi-objective particle swarm optimization (PSO), filter tuning, fuzzy membership function optimization, optimal tuning, robustness, adaptation, online and off-line tuning, SISO, MIMO, uncertain nonlinear systems, disturbance rejection, stability.
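The PSO-based tuning described above can be sketched as a minimal swarm loop; the cost function here is a hypothetical quadratic stand-in for the paper's combined tracking-error and output-range objective, and all parameter values are illustrative assumptions:

```python
import random

def pso(objective, bounds, n_particles=20, iters=60,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer (minimization)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # per-particle best
    pbest_f = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]          # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = objective(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

# Hypothetical stand-in cost with optimum at (2.0, 0.5); in the paper's
# setting this would be the tracking-error-plus-output-range objective.
cost = lambda x: (x[0] - 2.0) ** 2 + (x[1] - 0.5) ** 2
best, best_f = pso(cost, [(0.0, 5.0), (0.0, 1.0)])
```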
Gradually typed languages allow statically typed and dynamically typed code to interact while maintaining benefits of both styles.
The key to reasoning about these mixed programs is Siek-Vitousek-Cimini-Boyland's (dynamic) gradual guarantee, which says that giving components of a program more precise types only adds runtime type checking, and does not otherwise change behavior.
In this paper, we give a semantic reformulation of the gradual guarantee called graduality.
We change the name to promote the analogy that graduality is to gradual typing what parametricity is to polymorphism.
Each gives a local-to-global, syntactic-to-semantic reasoning principle that is formulated in terms of a kind of observational approximation.
Utilizing the analogy, we develop a novel logical relation for proving graduality.
We show that embedding-projection pairs (ep pairs) are to graduality what relations are to parametricity.
We argue that casts between two types where one is "more dynamic" (less precise) than the other necessarily form an ep pair, and we use this to cleanly prove the graduality cases for casts from the ep-pair property.
To construct ep pairs, we give an analysis of the type dynamism relation (also known as type precision or naive subtyping) that interprets the rules for type dynamism as compositional constructions on ep pairs, analogous to the coercion interpretation of subtyping.
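The ep-pair structure can be made concrete: an embedding $e$ from a more precise type into the dynamic type and a projection $p$ back, with $p \circ e = \mathrm{id}$ (projecting after embedding loses nothing), while projecting a value outside the image of $e$ raises a runtime type error. A small sketch with hypothetical tagged values, not the paper's formal semantics:

```python
class Dyn:
    """A dynamically typed value: a type tag plus a payload."""
    def __init__(self, tag, payload):
        self.tag, self.payload = tag, payload

def embed_int(n):
    """e : Int -> Dyn, the cast from the more precise to the less
    precise (more dynamic) type."""
    return Dyn('int', n)

def project_int(d):
    """p : Dyn -> Int; fails with a runtime type error off the image
    of the embedding."""
    if d.tag != 'int':
        raise TypeError('expected int, got ' + d.tag)
    return d.payload

# The retraction law of an ep pair: projection after embedding is identity.
assert all(project_int(embed_int(n)) == n for n in range(-3, 4))
```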
The recently proposed Minimal Complexity Machine (MCM) finds a hyperplane classifier by minimizing an exact bound on the Vapnik-Chervonenkis (VC) dimension.
The VC dimension measures the capacity of a learning machine, and a smaller VC dimension leads to improved generalization.
On many benchmark datasets, the MCM generalizes better than SVMs and uses far fewer support vectors than the number used by SVMs.
In this paper, we describe a neural network based on a linear dynamical system, that converges to the MCM solution.
The proposed MCM dynamical system is conducive to an analogue circuit implementation on a chip or simulation using Ordinary Differential Equation (ODE) solvers.
Numerical experiments on benchmark datasets from the UCI repository show that the proposed approach is scalable and accurate: we obtain improved accuracies and fewer support vectors (up to 74.3% reduction) with the MCM dynamical system.
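The paper's specific dynamical system is not reproduced here; as a generic sketch of how an ODE solver can "compute" an optimizer, forward-Euler simulation of the gradient flow $\dot{x} = -\nabla f(x)$ settles at a minimizer (the objective below is a hypothetical quadratic, not the MCM objective):

```python
def euler_gradient_flow(grad, x0, step=0.05, iters=400):
    """Forward-Euler simulation of dx/dt = -grad f(x); trajectories of
    such systems settle at minimizers, which is how an ODE solver (or an
    analogue circuit) can realize an optimization procedure."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# Hypothetical stand-in objective f(x) = (x0 - 1)^2 + (x1 + 2)^2.
grad = lambda x: [2 * (x[0] - 1.0), 2 * (x[1] + 2.0)]
x_star = euler_gradient_flow(grad, [0.0, 0.0])
```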
We determine lower and upper bounds on the capacity of bandlimited optical intensity channels (BLOIC) with white Gaussian noise.
Three types of input power constraints are considered: 1) only an average power constraint, 2) only a peak power constraint, and 3) an average and a peak power constraint.
Capacity lower bounds are derived by a two-step process including 1) for each type of constraint, designing admissible pulse amplitude modulated input waveform ensembles, and 2) lower bounding the maximum achievable information rates of the designed input ensembles.
Capacity upper bounds are derived by exercising constraint relaxations and utilizing known results on discrete-time optical intensity channels.
We obtain degrees-of-freedom-optimal (DOF-optimal) lower bounds which have the same pre-log factor as the upper bounds, thereby characterizing the high SNR capacity of BLOIC to within a finite gap.
We further derive intersymbol-interference-free (ISI-free) signaling based lower bounds, which perform well for all practical SNR values.
In particular, the ISI-free signaling based lower bounds outperform the DOF-optimal lower bound when the SNR is below 10 dB.
We introduce new diversification methods for zero-one optimization that significantly extend strategies previously introduced in the setting of metaheuristic search.
Our methods incorporate easily implemented strategies for partitioning assignments of values to variables, accompanied by processes called augmentation and shifting which create greater flexibility and generality.
We then show how the resulting collection of diversified solutions can be further diversified by means of permutation mappings, which equally can be used to generate diversified collections of permutations for applications such as scheduling and routing.
These methods can be applied to non-binary vectors by the use of binarization procedures and by Diversification-Based Learning (DBL) procedures which also provide connections to applications in clustering and machine learning.
Detailed pseudocode and numerical illustrations are provided to show the operation of our methods and the collections of solutions they create.
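The paper's augmentation and shifting operators are not reproduced here; as a simplified sketch of partition-based diversification, the following complements one block of variables at a time and then applies a permutation mapping (the interleaved partition and the reversal permutation are illustrative choices):

```python
def partition_diversify(x, n_blocks):
    """Generate diversified 0-1 vectors by splitting the index set into
    blocks and complementing one block at a time (a simplified form of
    partition-based diversification)."""
    n = len(x)
    out = []
    for b in range(n_blocks):
        block = set(range(b, n, n_blocks))   # interleaved partition block
        out.append([1 - v if i in block else v for i, v in enumerate(x)])
    return out

def apply_permutation(x, perm):
    """Permutation mapping: reorder coordinates to diversify further."""
    return [x[p] for p in perm]

seed = [0, 1, 0, 0, 1, 1]
variants = partition_diversify(seed, 3)
shifted = [apply_permutation(v, [5, 4, 3, 2, 1, 0]) for v in variants]
```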
This paper presents a method to improve the localization accuracy of robots operating in a range-based localization network.
The method is favorable especially when the robots operate in harsh environments where the access to a robust and reliable localization system is limited.
A state estimator is used for a six degree of freedom object using inertial sensors as well as an Ultra-wideband (UWB) range measurement sensor.
The estimator is incorporated into an adaptive algorithm in which a mobile UWB anchor repositions itself to improve the localization quality of an agent.
The algorithm reconfigures the localization network in real time to minimize the determinant of the covariance matrix in the least-squares sense.
Finally, the proposed algorithm is experimentally validated in a network consisting of one mobile and four fixed anchors.
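The covariance-determinant criterion can be sketched via the standard D-optimality argument for range-only localization: each anchor contributes a row to the measurement Jacobian $H$ (the unit vector toward the target), and minimizing the determinant of the covariance $\propto (H^\top H)^{-1}$ amounts to maximizing $\det(H^\top H)$. The candidate search below is an illustrative simplification, not the paper's adaptive algorithm:

```python
import math

def det_information(anchors, target):
    """det(H^T H) for 2-D range-only localization, where each row of H
    is the unit vector from the target to an anchor; larger means a
    smaller covariance determinant (D-optimality)."""
    a = b = c = 0.0                      # entries of H^T H = [[a, b], [b, c]]
    for ax, ay in anchors:
        dx, dy = ax - target[0], ay - target[1]
        r = math.hypot(dx, dy)
        ux, uy = dx / r, dy / r
        a += ux * ux
        b += ux * uy
        c += uy * uy
    return a * c - b * b

def best_mobile_position(fixed, target, candidates):
    """Place the mobile anchor at the candidate maximizing det(H^T H)."""
    return max(candidates, key=lambda p: det_information(fixed + [p], target))

fixed = [(0.0, 0.0), (10.0, 0.0)]        # collinear with the target: singular
target = (5.0, 0.0)
candidates = [(20.0, 0.0), (5.0, 8.0)]   # still collinear vs. orthogonal
pos = best_mobile_position(fixed, target, candidates)
```

The orthogonal candidate wins because adding a third collinear anchor leaves the information matrix singular.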
Spending time abroad during their studies is increasingly popular among students.
However, there are various challenges for both students and universities.
One important question for students is whether or not achievements performed at different universities can be taken into account for either enrolling at a foreign university or for completing the studies at their home university.
In addition to university achievements, an increasing proportion of the 195 million students worldwide receive certificates from MOOCs or other social media services.
The integration of such services into university teaching is still in the initial stages and presents some challenges.
In this paper we describe the idea to manage all these study achievements worldwide in a blockchain, which might solve the national and international challenges regarding the recognition of student achievements.
The aim of this paper is to encourage discussion in the global community instead of presenting a finished concept.
Some of the open research questions are: How to ensure student data protection, how to deal with fraud and how to deal with the possibility that students can analytically calculate the easiest way through their studies?
The paper is devoted to a mathematical model of concurrency, a special case of which is the asynchronous system.
Distributed asynchronous automata are introduced here.
It is proved that Petri nets and transition systems with independence can be regarded as distributed asynchronous automata.
Time distributed asynchronous automata are defined in the standard way, via a map that assigns time intervals to events.
It is proved that time distributed asynchronous automata generalize both time Petri nets and asynchronous systems.
Pagination, the process of determining where to break an article across pages in a multi-article layout, is a common challenge for most commercially printed newspapers and magazines.
To date, no one has created an algorithm that determines a minimal pagination break point based on the content of the article.
Existing approaches for automatic multi-article layout focus exclusively on maximizing content (number of articles) and optimizing aesthetic presentation (e.g., spacing between articles).
However, disregarding the semantic information within the article can lead to overly aggressive cutting, thereby eliminating key content and potentially confusing the reader, or setting too generous of a break point, thereby leaving in superfluous content and making automatic layout more difficult.
This is one of the remaining challenges on the path from manual layouts to fully automated processes that still ensure article content quality.
In this work, we present a new approach to calculating a document's minimal break point for the task of pagination.
Our approach uses a statistical language model to predict minimal break points based on the semantic content of an article.
We then compare 4 novel candidate approaches, and 4 baselines (currently in use by layout algorithms).
Results from this experiment show that one of our approaches strongly outperforms the baselines and alternatives.
Results from a second study suggest that humans are not able to agree on a single "best" break point.
Therefore, this work shows that a semantic-based lower bound break point prediction is necessary for ideal automated document synthesis within a real-world context.
This paper presents a new type of evolutionary algorithm (EA) based on the concept of "meme", where the individuals forming the population are represented by semantic networks and the fitness measure is defined as a function of the represented knowledge.
Our work can be classified as a novel memetic algorithm (MA), given that (1) it is the units of culture, or information, that are undergoing variation, transmission, and selection, very close to the original sense of memetics as it was introduced by Dawkins; and (2) this is different from existing MA, where the idea of memetics has been utilized as a means of local refinement by individual learning after classical global sampling of EA.
The individual pieces of information are represented as simple semantic networks that are directed graphs of concepts and binary relations, going through variation by memetic versions of operators such as crossover and mutation, which utilize knowledge from commonsense knowledge bases.
In evaluating this introductory work, as an interesting fitness measure, we focus on using the structure mapping theory of analogical reasoning from psychology to evolve pieces of information that are analogous to a given base information.
Considering other possible fitness measures, the proposed representation and algorithm can serve as a computational tool for modeling memetic theories of knowledge, such as evolutionary epistemology and cultural selection theory.
Prepositions are among the most frequent words in English and play complex roles in the syntax and semantics of sentences.
Not surprisingly, they pose well-known difficulties in automatic processing of sentences (prepositional attachment ambiguities and idiosyncratic uses in phrases).
Existing methods for preposition representation treat prepositions no differently from content words (e.g., word2vec and GloVe).
In addition, recent studies aiming at solving prepositional attachment and preposition selection problems depend heavily on external linguistic resources and use dataset-specific word representations.
In this paper we use word-triple counts (one of the triples being a preposition) to capture a preposition's interaction with its attachment and complement.
We then derive preposition embeddings via tensor decomposition on a large unlabeled corpus.
We reveal a new geometry involving Hadamard products and empirically demonstrate its utility in paraphrasing phrasal verbs.
Furthermore, our preposition embeddings are used as simple features in two challenging downstream tasks: preposition selection and prepositional attachment disambiguation.
We achieve results comparable to or better than the state-of-the-art on multiple standardized datasets.
Estimation of facial shapes plays a central role for face transfer and animation.
Accurate 3D face reconstruction, however, often deploys iterative and costly methods preventing real-time applications.
In this work we design a compact and fast CNN model enabling real-time face reconstruction on mobile devices.
For this purpose, we first study more traditional but slow morphable face models and use them to automatically annotate a large set of images for CNN training.
We then investigate a class of efficient MobileNet CNNs and adapt such models for the task of shape regression.
Our evaluation on three datasets demonstrates significant improvements in the speed and the size of our model while maintaining state-of-the-art reconstruction accuracy.
The application of deep learning techniques using convolutional neural networks to the classification of particle collisions in High Energy Physics is explored.
An intuitive approach to transform physical variables, like momenta of particles and jets, into a single image that captures the relevant information, is proposed.
The idea is tested using a well known deep learning framework on a simulation dataset, including leptonic ttbar events and the corresponding background at 7 TeV from the CMS experiment at LHC, available as Open Data.
This initial test shows competitive results when compared to more classical approaches, like those using feedforward neural networks.
We present an attention based visual analysis framework to compute grasp-relevant information in order to guide grasp planning using a multi-fingered robotic hand.
Our approach uses a computational visual attention model to locate regions of interest in a scene, and uses a deep convolutional neural network to detect grasp type and point for a sub-region of the object presented in a region of interest.
We demonstrate the proposed framework in object grasping tasks, in which the information generated from the proposed framework is used as prior information to guide the grasp planning.
Results show that the proposed framework can not only speed up grasp planning with more stable configurations, but also is able to handle unknown objects.
Furthermore, our framework can handle cluttered scenarios.
A new Grasp Type Dataset (GTD) that considers 6 commonly used grasp types and covers 12 household objects is also presented.
Media publisher platforms often face an effectiveness-nuisance tradeoff: more annoying ads can be more effective for some advertisers because of their ability to attract attention, but after attracting viewers' attention, their nuisance to viewers can decrease engagement with the platform over time.
With the rise of mobile technology and ad blockers, many platforms are becoming increasingly concerned about how to improve monetization through digital ads while improving viewer experience.
We study an online ad auction mechanism that incorporates a charge for ad impact on user experience as a criterion for ad selection and pricing.
Like a Pigovian tax, the charge causes advertisers to internalize the hidden cost of foregone future platform revenue due to ad impact on user experience.
Over time, the mechanism provides an incentive for advertisers to develop ads that are effective while offering viewers a more pleasant experience.
We show that adopting the mechanism can simultaneously benefit the publisher, advertisers, and viewers, even in the short term.
Incorporating a charge for ad impact can increase expected advertiser profits if enough advertisers compete.
A stronger effectiveness-nuisance tradeoff, meaning that ad effectiveness is more strongly associated with negative impact on user experience, increases the amount of competition required for the mechanism to benefit advertisers.
The findings suggest that the mechanism can benefit the marketplace for ad slots that consistently attract many advertisers.
This paper proposes a novel framework to regularize the highly ill-posed and non-linear Fourier ptychography problem using generative models.
We demonstrate experimentally that our proposed algorithm, Deep Ptych, outperforms the existing Fourier ptychography techniques, in terms of quality of reconstruction and robustness against noise, using far fewer samples.
We further modify the proposed approach to allow the generative model to explore solutions outside the range, leading to improved performance.
This paper considers an energy-efficient packet scheduling problem over quasi-static block fading channels.
The goal is to minimize the total energy for transmitting a sequence of data packets under the first-in-first-out rule and strict delay constraints.
Conventionally, such a design problem is studied under the assumption that the packet transmission rate can be characterized by the classical Shannon capacity formula, which, however, may provide inaccurate energy consumption estimates, especially when the code blocklength is finite.
In this paper, we formulate a new energy-efficient packet scheduling problem by adopting a recently developed channel capacity formula for finite blocklength codes.
The newly formulated problem is fundamentally more challenging to solve than the traditional one because the transmission energy function under the new channel capacity formula neither admits a closed-form expression nor, in general, possesses the desirable monotonicity and convexity properties.
We analyze conditions on the code blocklength for which the transmission energy function is monotonic and convex.
Based on these properties, we develop efficient offline packet scheduling algorithms as well as a rolling-window based online algorithm for real-time packet scheduling.
Simulation results demonstrate not only the efficacy of the proposed algorithms but also the fact that the traditional design using the Shannon capacity formula can considerably underestimate the transmission energy for reliable communications.
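The finite-blocklength capacity formula referred to above is presumably the normal approximation of Polyanskiy, Poor, and Verdú; as a sketch, the maximal rate at blocklength $n$ and error probability $\epsilon$ behaves as

```latex
R(n, \epsilon) \;\approx\; C - \sqrt{\frac{V}{n}}\, Q^{-1}(\epsilon) + \frac{\log n}{2n},
```

where $C$ is the Shannon capacity, $V$ the channel dispersion, and $Q^{-1}$ the inverse Gaussian tail function. The $\sqrt{V/n}$ penalty is what makes the required transmission energy depend non-trivially on the blocklength, unlike under the classical Shannon formula.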
Computational synthesis planning approaches have achieved recent success in organic chemistry, where tabulated synthesis procedures are readily available for supervised learning.
The syntheses of inorganic materials, however, exist primarily as natural language narratives contained within scientific journal articles.
This synthesis information must first be extracted from the text in order to enable analogous synthesis planning methods for inorganic materials.
In this work, we present a system for automatically extracting structured representations of synthesis procedures from the texts of materials science journal articles that describe explicit, experimental syntheses of inorganic compounds.
We define the structured representation as a set of linked events made up of extracted scientific entities and evaluate two unsupervised approaches for extracting these structures on expert-annotated articles: a strong heuristic baseline and a generative model of procedural text.
We also evaluate a variety of supervised models for extracting scientific entities.
Our results provide insight into the nature of the data and directions for further work in this exciting new area of research.
We consider the problem of fusing an arbitrary number of multiband, i.e., panchromatic, multispectral, or hyperspectral, images belonging to the same scene.
We use the well-known forward observation and linear mixture models with Gaussian perturbations to formulate the maximum-likelihood estimator of the endmember abundance matrix of the fused image.
We calculate the Fisher information matrix for this estimator and examine the conditions for the uniqueness of the estimator.
We use a vector total-variation penalty term together with nonnegativity and sum-to-one constraints on the endmember abundances to regularize the derived maximum-likelihood estimation problem.
The regularization facilitates exploiting the prior knowledge that natural images are mostly composed of piecewise smooth regions with limited abrupt changes, i.e., edges, as well as coping with potential ill-posedness of the fusion problem.
We solve the resultant convex optimization problem using the alternating direction method of multipliers.
We utilize the circular convolution theorem in conjunction with the fast Fourier transform to alleviate the computational complexity of the proposed algorithm.
Experiments with multiband images constructed from real hyperspectral datasets reveal the superior performance of the proposed algorithm in comparison with the state-of-the-art algorithms, which need to be used in tandem to fuse more than two multiband images.
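The FFT shortcut mentioned above rests on the circular convolution theorem: circular convolution in the signal domain is elementwise multiplication in the Fourier domain. A small sketch (with a naive $O(n^2)$ DFT standing in for the FFT, for self-containment):

```python
import cmath

def dft(x, inverse=False):
    """Naive DFT/IDFT (O(n^2)); an FFT computes the same transform faster."""
    n = len(x)
    s = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(s * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolve_spectral(x, h):
    """Circular convolution via the convolution theorem:
    x (*) h = IDFT( DFT(x) * DFT(h) )."""
    X, H = dft(x), dft(h)
    return [v.real for v in dft([a * b for a, b in zip(X, H)], inverse=True)]

def circular_convolve_direct(x, h):
    """O(n^2) time-domain reference implementation."""
    n = len(x)
    return [sum(x[k] * h[(m - k) % n] for k in range(n)) for m in range(n)]

x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.25, 0.0, 0.0]
```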
An output-polynomial algorithm for the listing of minimal dominating sets in graphs is a challenging open problem and is known to be equivalent to the well-known Transversal problem which asks for an output-polynomial algorithm for listing the set of minimal hitting sets in hypergraphs.
We give a polynomial delay algorithm to list the set of minimal dominating sets in chordal graphs, an important and well-studied graph class where such an algorithm was open for a while.
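For concreteness, the objects being enumerated can be checked as follows; this is a minimal sketch of the standard domination and minimality tests, not the polynomial-delay listing algorithm itself:

```python
def is_dominating(graph, S):
    """Every vertex is in S or adjacent to a vertex of S."""
    return all(v in S or any(u in S for u in graph[v]) for v in graph)

def is_minimal_dominating(graph, S):
    """Dominating, and removing any single vertex of S destroys
    domination (the standard minimality criterion)."""
    return is_dominating(graph, S) and all(
        not is_dominating(graph, S - {v}) for v in S)

# Path graph 0-1-2-3 (a chordal graph).
P4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```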
In two-view geometry, the essential matrix describes the relative position and orientation of two calibrated images.
In three views, a similar role is assigned to the calibrated trifocal tensor.
It is a particular case of the (uncalibrated) trifocal tensor and thus it inherits all its properties but, due to the smaller degrees of freedom, satisfies a number of additional algebraic constraints.
Some of them are described in this paper.
More specifically, we define a new notion --- the trifocal essential matrix.
On the one hand, it is a generalization of the ordinary (bifocal) essential matrix, and, on the other hand, it is closely related to the calibrated trifocal tensor.
We prove the two necessary and sufficient conditions that characterize the set of trifocal essential matrices.
Based on these characterizations, we propose three necessary conditions on a calibrated trifocal tensor.
They have a form of 15 quartic and 99 quintic polynomial equations.
We show that in the practically significant real case the 15 quartic constraints are also sufficient.
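The bifocal case that the trifocal essential matrix generalizes can be checked numerically. The sketch below (with an assumed rotation and translation) builds E = [t]_x R and verifies the classical characterization det(E) = 0 and 2 E E^T E - tr(E E^T) E = 0:

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# An assumed relative pose: rotation about z by 30 degrees, translation t.
a = np.deg2rad(30.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 0.5])

E = skew(t) @ R  # ordinary (bifocal) essential matrix

# Classical necessary and sufficient constraints on an essential matrix.
assert abs(np.linalg.det(E)) < 1e-9
residual = 2 * E @ E.T @ E - np.trace(E @ E.T) * E
assert np.allclose(residual, 0)
```

The quartic and quintic constraints described in the paper play the analogous characterizing role for the trifocal object.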
Wireless networking allows users to access information and services regardless of location and physical infrastructure.
It is a fast-growing technology owing to the availability of wireless devices and their flexibility and ease of installation and configuration.
With this rapid expansion of Information and Communication Technology (ICT), energy consumption is also increasing.
In the early days of wireless technology, computing infrastructure focused on ubiquitous access, capacity, and speed.
Now, however, computing infrastructure should also be energy efficient: in wireless networking, devices are mostly battery powered, and the battery is a limited energy source, which poses a challenge for researchers.
Energy saving and environmental protection in computing infrastructure have become a global demand.
This paper proposes a computing infrastructure based on green computing for energy-efficient wireless networking.
Further, challenges and techniques such as power consumption in network architectures, algorithm efficiency, virtualization, and dynamic power saving are discussed as means toward an energy-efficient computing infrastructure.
Exogenous state variables and rewards can slow down reinforcement learning by injecting uncontrolled variation into the reward signal.
We formalize exogenous state variables and rewards and identify conditions under which an MDP with exogenous state can be decomposed into an exogenous Markov Reward Process involving only the exogenous state+reward and an endogenous Markov Decision Process defined with respect to only the endogenous rewards.
We also derive a variance-covariance condition under which Monte Carlo policy evaluation on the endogenous MDP is accelerated compared to using the full MDP.
Similar speedups are likely to carry over to all RL algorithms.
We develop two algorithms for discovering the exogenous variables and test them on several MDPs.
Results show that the algorithms are practical and can significantly speed up reinforcement learning.
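The variance argument can be illustrated with a toy simulation (all reward statistics below are assumptions): when the exogenous reward is independent noise, removing it leaves the mean of the Monte Carlo return essentially unchanged while sharply reducing its variance:

```python
import numpy as np

rng = np.random.default_rng(0)
n_episodes, horizon = 2000, 10

# Endogenous reward depends on the controllable part of the state; the
# exogenous reward is independent noise the agent cannot influence.
endo = rng.normal(loc=1.0, scale=0.5, size=(n_episodes, horizon))
exo = rng.normal(loc=0.0, scale=2.0, size=(n_episodes, horizon))

returns_full = (endo + exo).sum(axis=1)   # Monte Carlo returns, full reward
returns_endo = endo.sum(axis=1)           # returns with exogenous part removed

# Same target up to sampling error, but far lower variance without the
# exogenous component -- the speedup condition in miniature.
assert returns_endo.var() < returns_full.var()
```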
This notebook paper presents our system in the ActivityNet Dense Captioning in Video task (task 3).
Temporal proposal generation and caption generation are both important to the dense captioning task.
Therefore, we propose a proposal ranking model to employ a set of effective feature representations for proposal generation, and ensemble a series of caption models enhanced with context information to generate captions robustly on predicted proposals.
Our approach achieves the state-of-the-art performance on the dense video captioning task with 8.529 METEOR score on the challenge testing set.
We present a Bayesian object observation model for complete probabilistic semantic SLAM.
Recent studies on object detection and feature extraction have become important for scene understanding and 3D mapping.
However, the 3D shape of an object is too complex to formulate a probabilistic observation model directly; therefore, Bayesian inference over object-oriented features and their poses has received little attention.
Besides, when the robot equipped with an RGB mono camera only observes the projected single view of an object, a significant amount of the 3D shape information is abandoned.
Due to these limitations, semantic SLAM and viewpoint-independent loop closure using volumetric 3D object shape is challenging.
In order to enable the complete formulation of probabilistic semantic SLAM, we approximate the observation model of a 3D object with a tractable distribution.
We also estimate the variational likelihood from the 2D image of the object to exploit its observed single view.
In order to evaluate the proposed method, we perform pose and feature estimation, and demonstrate that the automatic loop closure works seamlessly without additional loop detector in various environments.
We define and study error detection and correction tasks that are useful for 3D reconstruction of neurons from electron microscopic imagery, and for image segmentation more generally.
Both tasks take as input the raw image and a binary mask representing a candidate object.
For the error detection task, the desired output is a map of split and merge errors in the object.
For the error correction task, the desired output is the true object.
We call this object mask pruning, because the candidate object mask is assumed to be a superset of the true object.
We train multiscale 3D convolutional networks to perform both tasks.
We find that the error-detecting net can achieve high accuracy.
The accuracy of the error-correcting net is enhanced if its input object mask is "advice" (union of erroneous objects) from the error-detecting net.
Machine learning is used to compute achievable information rates (AIRs) for a simplified fiber channel.
The approach jointly optimizes the input distribution (constellation shaping) and the auxiliary channel distribution to compute AIRs without explicit channel knowledge in an end-to-end fashion.
Answer Set Programming (ASP) is a well-established declarative problem solving paradigm which became widely used in AI and recognized as a powerful tool for knowledge representation and reasoning (KRR), especially for its high expressiveness and the ability to deal also with incomplete knowledge.
Recently, thanks to the availability of a number of robust and efficient implementations, ASP has been increasingly employed in a number of different domains, and used for the development of industrial-level and enterprise applications.
This made clear the need for proper development tools and interoperability mechanisms for easing interaction and integration with external systems in the widest range of real-world scenarios, including mobile applications and educational contexts.
In this work we present a framework for integrating the KRR capabilities of ASP into generic applications.
We show the use of the framework by illustrating proper specializations for some relevant ASP systems over different platforms, including the mobile setting; furthermore, the potential of the framework for educational purposes is illustrated by means of the development of several ASP-based applications.
Today, with the continued growth in the use of information and communication technologies (ICT) for business purposes, business organizations are becoming increasingly dependent on their information systems.
Thus, they need to protect them from the different attacks exploiting their vulnerabilities.
To do so, the organization has to use security technologies, which may be proactive or reactive ones.
Each security technology has a relative cost and addresses specific vulnerabilities.
Therefore, the organization has to put in place the appropriate set of security technologies that minimizes the information system's vulnerabilities at minimal cost.
This bi-objective problem will be considered as a resource allocation problem (RAP) where security technologies represent the resources to be allocated.
However, the set of vulnerabilities may change, periodically, with the continual appearance of new ones.
Therefore, the security technologies set should be flexible to face these changes, in real time, and the problem becomes a dynamic one.
In this paper, we propose a harmony search based algorithm to solve the bi-objective dynamic resource allocation decision model.
This approach was compared to a genetic algorithm and provided good results.
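A minimal harmony search for a toy, weighted-sum version of the security resource allocation problem might look as follows (the costs, coverage sets, and weights are illustrative assumptions, not data from the paper):

```python
import random

random.seed(7)

# Toy instance (assumed data): each technology has a cost and covers a set of
# vulnerabilities; minimize a weighted sum of cost and uncovered vulnerabilities.
costs = [4, 3, 5, 2, 6]
covers = [{0, 1}, {1, 2}, {2, 3, 4}, {0, 4}, {1, 3}]
n_vuln, w_cost, w_vuln = 5, 1.0, 10.0

def objective(sol):
    covered = set().union(*(covers[i] for i, on in enumerate(sol) if on)) \
        if any(sol) else set()
    cost = sum(c for c, on in zip(costs, sol) if on)
    return w_cost * cost + w_vuln * (n_vuln - len(covered))

def harmony_search(iters=500, hms=10, hmcr=0.9, par=0.3):
    memory = [[random.randint(0, 1) for _ in costs] for _ in range(hms)]
    for _ in range(iters):
        # improvise a new harmony: memory consideration, else random choice
        new = [random.choice(memory)[d] if random.random() < hmcr
               else random.randint(0, 1) for d in range(len(costs))]
        # pitch adjustment: flip a bit with probability `par`
        new = [1 - b if random.random() < par else b for b in new]
        worst = max(range(hms), key=lambda i: objective(memory[i]))
        if objective(new) < objective(memory[worst]):
            memory[worst] = new
    return min(memory, key=objective)

best = harmony_search()
# Should do at least as well as naively selecting all technologies.
assert objective(best) <= objective([1] * 5)
```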
Online recommender systems often deal with continuous, potentially fast and unbounded flows of data.
Ensemble methods for recommender systems have been used in the past in batch algorithms; however, they have never been studied with incremental algorithms that learn from data streams.
We evaluate online bagging with an incremental matrix factorization algorithm for top-N recommendation with positive-only -- binary -- ratings.
Our results show that online bagging is able to improve accuracy up to 35% over the baseline, with small computational overhead.
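The online bagging scheme of Oza and Russell, which this evaluation builds on, can be sketched with a trivial incremental base learner (the per-item running mean below is an illustrative stand-in for the incremental matrix factorization model): each incoming example is presented to each ensemble member k times, with k drawn from Poisson(1):

```python
import random
from collections import defaultdict

random.seed(42)

class ItemMeanModel:
    """Trivial incremental base learner: per-item running mean rating."""
    def __init__(self):
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)
    def update(self, item, rating):
        self.sums[item] += rating
        self.counts[item] += 1
    def predict(self, item):
        c = self.counts[item]
        return self.sums[item] / c if c else 0.0

def poisson1():
    """Knuth's method for sampling Poisson(lambda = 1)."""
    k, p = 0, random.random()
    while p >= 2.718281828459045 ** -1:
        k += 1
        p *= random.random()
    return k

ensemble = [ItemMeanModel() for _ in range(8)]
stream = [("a", 1.0), ("b", 0.0), ("a", 1.0), ("c", 1.0), ("b", 0.0)]
for item, rating in stream:
    for model in ensemble:            # online bagging (Oza & Russell):
        for _ in range(poisson1()):   # weight each example by k ~ Poisson(1)
            model.update(item, rating)

prediction = sum(m.predict("a") for m in ensemble) / len(ensemble)
assert 0.0 <= prediction <= 1.0
```

The Poisson(1) resampling mimics bootstrap replication on an unbounded stream, which is what makes bagging applicable in the incremental setting.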
This paper proposes an end-to-end approach for single-channel speaker-independent multi-speaker speech separation, where time-frequency (T-F) masking, the short-time Fourier transform (STFT), and its inverse are represented as layers within a deep network.
Previous approaches, rather than computing a loss on the reconstructed signal, used a surrogate loss based on the target STFT magnitudes.
This ignores reconstruction error introduced by phase inconsistency.
In our approach, the loss function is directly defined on the reconstructed signals, which are optimized for best separation.
In addition, we train through unfolded iterations of a phase reconstruction algorithm, represented as a series of STFT and inverse STFT layers.
While mask values are typically limited to lie between zero and one for approaches using the mixture phase for reconstruction, this limitation is less relevant if the estimated magnitudes are to be used together with phase reconstruction.
We thus propose several novel activation functions for the output layer of the T-F masking, to allow mask values beyond one.
On the publicly-available wsj0-2mix dataset, our approach achieves state-of-the-art 12.6 dB scale-invariant signal-to-distortion ratio (SI-SDR) and 13.1 dB SDR, revealing new possibilities for deep learning based phase reconstruction and representing a fundamental progress towards solving the notoriously-hard cocktail party problem.
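The STFT/iSTFT consistency that the unfolded layers rely on can be sketched in numpy (a toy sketch, not the paper's network; the window and hop sizes are assumptions). With windowed overlap-add and normalization by the accumulated squared window, masking followed by the inverse STFT reconstructs the interior of the signal exactly when the mask is the identity:

```python
import numpy as np

def stft(x, win, hop):
    w = np.hanning(win)
    frames = [x[i:i + win] * w for i in range(0, len(x) - win + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames]), w

def istft(spec, w, hop, length):
    win = len(w)
    y = np.zeros(length)
    norm = np.zeros(length)
    for k, frame in enumerate(spec):
        i = k * hop
        y[i:i + win] += np.fft.irfft(frame, n=win) * w  # synthesis window
        norm[i:i + win] += w ** 2                        # overlap normalizer
    return y / np.maximum(norm, 1e-8)

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
win, hop = 512, 128

spec, w = stft(x, win, hop)
mask = np.ones_like(spec)   # identity T-F mask (values may exceed one when
                            # paired with phase reconstruction, as in the paper)
y = istft(spec * mask, w, hop, len(x))

# Interior samples are reconstructed almost exactly (edges lack full overlap).
assert np.allclose(x[win:-win], y[win:-win], atol=1e-8)
```

Defining the loss on `y` rather than on the masked magnitudes is what lets training account for phase inconsistency.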
One of the most challenging fields in vehicular communications has been the experimental assessment of protocols and novel technologies.
Researchers usually tend to simulate vehicular scenarios and/or partially validate new contributions in the area by using constrained testbeds and carrying out minor tests.
In this line, the present work reviews the issues that pioneers in the area of vehicular communications and, in general, in telematics, have to deal with if they want to perform a good evaluation campaign by real testing.
The key needs for a good experimental evaluation are the use of proper software tools for gathering testing data, post-processing and generating relevant figures of merit, and, finally, properly presenting the most important results.
For this reason, a key contribution of this paper is the presentation of an evaluation environment called AnaVANET, which covers the previous needs.
By using this tool and presenting a reference case study, a generic testing methodology is described and applied.
This way, the usage of the IPv6 protocol over a vehicle-to-vehicle routing protocol, and supporting IETF-based network mobility, is tested at the same time the main features of the AnaVANET system are presented.
This work contributes in laying the foundations for a proper experimental evaluation of vehicular networks and will be useful for many researchers in the area.
Diversification-Based Learning (DBL) derives from a collection of principles and methods introduced in the field of metaheuristics that have broad applications in computing and optimization.
We show that the DBL framework goes significantly beyond that of the more recent Opposition-based learning (OBL) framework introduced in Tizhoosh (2005), which has become the focus of numerous research initiatives in machine learning and metaheuristic optimization.
We unify and extend earlier proposals in metaheuristic search (Glover, 1997, Glover and Laguna, 1997) to give a collection of approaches that are more flexible and comprehensive than OBL for creating intensification and diversification strategies in metaheuristic search.
We also describe potential applications of DBL to various subfields of machine learning and optimization.
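The basic OBL move that DBL generalizes can be sketched in a few lines (the objective and bounds below are assumptions): for each sampled candidate x in [a, b], also evaluate its opposite a + b - x and keep the better of the pair:

```python
import random

random.seed(1)

def f(x):
    """Toy objective to minimize (assumed): shifted sphere function."""
    return sum((xi - 3.0) ** 2 for xi in x)

lo, hi, dim = -10.0, 10.0, 4

def opposite(x):
    # OBL: the opposite of xi in [lo, hi] is lo + hi - xi, per dimension.
    return [lo + hi - xi for xi in x]

candidates = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(50)]
paired = [min(c, opposite(c), key=f) for c in candidates]

# Evaluating each point together with its opposite can only improve
# (or match) the best objective found from the same random draws.
assert min(map(f, paired)) <= min(map(f, candidates))
```

DBL replaces this single fixed "opposite" mapping with a family of diversification operators drawn from metaheuristic search.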
In this paper we present our work on a case study on Statistical Machine Translation (SMT) and Rule based machine translation (RBMT) for translation from English to Malayalam and Malayalam to English.
One of the motivations of our study is to make a three way performance comparison, such as, a) SMT and RBMT b) English to Malayalam SMT and Malayalam to English SMT c) English to Malayalam RBMT and Malayalam to English RBMT.
We describe the development of English to Malayalam and Malayalam to English baseline phrase based SMT system and the evaluation of its performance compared against the RBMT system.
Based on our study, the observations are: a) SMT systems outperform RBMT systems, b) in the case of SMT, English-Malayalam systems perform better than Malayalam-English systems, and c) in the case of RBMT, Malayalam-English systems perform better than English-Malayalam systems.
Based on our evaluations and detailed error analysis, we describe the requirements of incorporating morphological processing into the SMT to improve the accuracy of translation.
Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length.
In this work, we propose a structured method for finding a good balance between these components by searching for the optimal reward component weighting.
To render this search feasible, we use multi-objective reinforcement learning to significantly reduce the number of training dialogues required.
We apply our proposed method to find optimized component weights for six domains and compare them to a default baseline.
This article considers the task of automatically inducing role-semantic annotations in the FrameNet paradigm for new languages.
We propose a general framework that is based on annotation projection, phrased as a graph optimization problem.
It is relatively inexpensive and has the potential to reduce the human effort involved in creating role-semantic resources.
Within this framework, we present projection models that exploit lexical and syntactic information.
We provide an experimental evaluation on an English-German parallel corpus which demonstrates the feasibility of inducing high-precision German semantic role annotation both for manually and automatically annotated English data.
Online forums enable users to discuss together around various topics.
One of the serious problems in these environments is the high volume of discussions and the resulting information overload.
Unfortunately, without taking users' interests into account, traditional Information Retrieval (IR) techniques are unable to solve this problem.
Therefore, employing a Recommender System (RS) that suggests topics to users according to their tastes can increase the dynamism of a forum and prevent duplicate posts.
In addition, taking semantics into consideration can increase the performance of an IR-based RS.
Our goal is to study the impact of ontologies and data mining techniques on improving content-based RSs.
For this purpose, three types of ontologies are first constructed from the domain corpus using text mining, Natural Language Processing (NLP), and WordNet; they are then used as input to two kinds of RS: one fully ontology-based, and one that enriches the user profile vector with the ontology in a vector space model (VSM) (the proposed method).
Afterward, the results are compared with a simple VSM-based RS.
The results show that the proposed RS achieves the highest performance.
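The VSM side of the comparison can be sketched as follows (the toy documents and ontology terms are assumptions): a user profile vector, optionally enriched with ontology terms, is matched against topic vectors by cosine similarity:

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Assumed toy data: a profile built from posts the user liked, enriched
# with related ontology terms (here just extra keywords).
profile = tf_vector("camera lens aperture photography")
for term in ["optics", "exposure"]:   # ontology enrichment (assumption)
    profile[term] += 1

topics = {
    "t1": tf_vector("lens aperture exposure tips"),
    "t2": tf_vector("football league results"),
}
ranked = sorted(topics, key=lambda t: cosine(profile, topics[t]), reverse=True)
assert ranked[0] == "t1"
```

Enrichment lets the profile match topics that share ontology concepts rather than only surface terms.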
In this paper we present the state of advancement of the French ANR WebStand project.
The objective of this project is to construct a customizable XML-based warehouse platform to acquire, transform, analyze, store, query, and export data from the web, in particular mailing lists, with the final intention of using this data to perform sociological studies focused on social groups on the World Wide Web, with a specific emphasis on the temporal aspects of this data.
We are currently using this system to analyze the standardization process of the W3C, through its social network of standard setters.
Researchers spend a great deal of time reading research papers.
Keshav (2012) provides a three-pass method to researchers to improve their reading skills.
This article extends Keshav's method for reading a research compendium.
Research compendia are an increasingly used form of publication, which packages not only the research paper's text and figures, but also all data and software for better reproducibility.
We introduce the existing conventions for research compendia and suggest how to utilise their shared properties in a structured reading process.
Unlike the original, this article is not built upon a long history but intends to provide guidance at the outset of an emerging practice.
Fault tolerance is essential for building reliable services; however, it comes at the price of redundancy, mainly the "replication factor" and "diversity".
With the increasing reliance on Internet-based services, more machines (mainly servers) are needed to scale out, multiplied with the extra expense of replication.
This paper revisits the very fundamentals of fault tolerance and presents "artificial redundancy": a formal generalization of "exact copy" redundancy in which new sources of redundancy are exploited to build fault tolerant systems.
On this concept, we show how to build "artificial replication" and design "artificial fault tolerance" (AFT).
We discuss the properties of these new techniques showing that AFT extends current fault tolerant approaches to use other forms of redundancy aiming at reduced cost and high diversity.
In this paper, we propose to use a set of simple, uniform in architecture LSTM-based models to recover different kinds of temporal relations from text.
Using the shortest dependency path between entities as input, the same architecture is used to extract intra-sentence, cross-sentence, and document creation time relations.
A "double-checking" technique reverses entity pairs in classification, boosting the recall of positive cases and reducing misclassifications between opposite classes.
An efficient pruning algorithm resolves conflicts globally.
Evaluated on QA-TempEval (SemEval2015 Task 5), our proposed technique outperforms state-of-the-art methods by a large margin.
This paper has dual aims.
The first is to develop practical universal coding methods for unlabeled graphs; the second is to use these methods for graph anomaly detection.
The paper develops two coding methods for unlabeled graphs: one based on the degree distribution, the second based on the triangle distribution.
It is shown that these are efficient for different types of random graphs, and on real-world graphs.
These coding methods are then used for detecting anomalous graphs, based on structure alone.
It is shown that anomalous graphs can be detected with high probability.
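The structure-only detection idea can be sketched with the degree-distribution code (the toy graphs and reference probabilities are assumptions): a graph whose degree sequence is cheap to encode under a reference degree distribution is typical, while one that needs many more bits is flagged as anomalous:

```python
import math

def degree_codelength(degrees, ref_prob, eps=1e-3):
    """Bits to encode a degree sequence under a reference degree
    distribution; unseen degrees get a small smoothed probability."""
    return sum(-math.log2(ref_prob.get(d, eps)) for d in degrees)

# Reference distribution estimated from "normal" graphs (assumed values).
ref = {1: 0.25, 2: 0.5, 3: 0.25}

cycle_degrees = [2] * 10        # a typical graph: every vertex has degree 2
star_degrees = [9] + [1] * 9    # anomalous hub: degree 9 is unseen

normal_bits = degree_codelength(cycle_degrees, ref)
anomal_bits = degree_codelength(star_degrees, ref)

# The anomalous graph needs many more bits under the reference code,
# which is the basis for structure-only anomaly detection.
assert anomal_bits > normal_bits
```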
The present study proposes a new structure selection approach for non-linear system identification based on Two-Dimensional particle swarms (2D-UPSO).
The 2D learning framework essentially extends the learning dimension of the conventional particle swarms and explicitly incorporates the information about the cardinality, i.e., number of terms, into the search process.
This property of the 2D-UPSO has been exploited to determine the correct structure of the non-linear systems.
The efficacy of the proposed approach is demonstrated by considering several simulated benchmark nonlinear systems in discrete and in continuous domain.
In addition, the proposed approach is applied to identify a parsimonious structure from practical non-linear wave-force data.
The results of the comparative investigation with Genetic Algorithm (GA), Binary Particle Swarm Optimization (BPSO) and the classical Orthogonal Forward Regression (OFR) methods illustrate that the proposed 2D-UPSO could successfully detect the correct structure of the non-linear systems.
Substitution Boxes (S-Boxes) were generated using 4-bit Boolean Functions (BFs) for the encryption and decryption algorithms of Lucifer and the Data Encryption Standard (DES) in the late sixties and late seventies, respectively.
The S-Box of the Advanced Encryption Standard was likewise generated using an irreducible polynomial over the Galois field GF(2^8), with an additive constant, in the early twenty-first century.
In this paper, Substitution Boxes are generated from irreducible or reducible polynomials over Galois fields GF(p^q).
Binary Galois fields have been used to generate the Substitution Boxes.
Since the number formed from the coefficients of a polynomial over a particular binary Galois field GF(2^q) corresponds to a (log2 q + 1)-bit BF, the generation of (log2 q + 1)-bit S-Boxes is possible.
Now, if p is a prime or non-prime number, the generation of S-Boxes is possible using the Galois field GF(p^q), where q = p-1.
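As an illustration of the general idea (not the paper's exact construction), a small S-Box can be generated from an irreducible polynomial over the binary Galois field GF(2^3) by mapping each element to its multiplicative inverse:

```python
IRRED = 0b1011  # x^3 + x + 1, irreducible over GF(2) (assumed choice)

def gf8_mul(a, b):
    """Carry-less multiplication in GF(2^3) with reduction by IRRED."""
    r = 0
    for i in range(3):
        if (b >> i) & 1:
            r ^= a << i
    for shift in (2, 1, 0):               # reduce degree-5..3 terms
        if r & (1 << (3 + shift)):
            r ^= IRRED << shift
    return r

def gf8_inv(a):
    if a == 0:
        return 0                          # map 0 to 0, as in AES-style S-Boxes
    return next(b for b in range(1, 8) if gf8_mul(a, b) == 1)

sbox = [gf8_inv(a) for a in range(8)]
assert sorted(sbox) == list(range(8))     # the S-Box is a bijection
assert gf8_mul(2, 5) == 1                 # x * (x^2 + 1) = 1 in GF(2^3)
```

Because inversion in a field is a bijection, the resulting table is a valid substitution; larger fields GF(p^q) follow the same pattern with more elaborate arithmetic.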
An upper-limb prosthetic can be viewed as an independent cognitive system for which a conceptual space can be developed.
In this paper, we provide a detailed analogical reasoning of prosthetic arm to build the conceptual spaces with the help of the theory called geometric framework of conceptual spaces proposed by Gardenfors.
Terminologies of conceptual spaces such as concepts, similarities, properties, quality dimensions and prototype are applied for a specific prosthetic system and conceptual space is built for prosthetic arm.
Concept lattice traversals are used on the lattice represented conceptual spaces.
Cognitive functionalities such as generalization (Similarities) and specialization (Differences) are achieved in the lattice represented conceptual space.
This might well help in designing intelligent prosthetics to assist challenged humans.
Geometric framework of conceptual spaces holds similar concepts closer in geometric structures in a way similar to concept lattices.
Hence, we also propose to use concept lattice to represent concepts of geometric framework of conceptual spaces.
Also, we extend our discussion with our insights on conceptual spaces of bidirectional hand prosthetics.
In this paper, we analyze the throughput performance of two co-existing downlink multiuser underlay secondary networks that use fixed-rate transmissions.
We assume that the interference temperature limit (ITL) is apportioned to accommodate two concurrent transmissions using an interference temperature apportioning parameter so as to ensure that the overall interference to the primary receiver does not exceed the ITL.
Using the derived analytical expressions for throughput, we show that when there is only one secondary user in each network, or when the secondary networks do not employ opportunistic user selection (using round-robin scheduling, for example), there exists a critical fixed rate below which the sum throughput with co-existing secondary networks is higher than the throughput with a single secondary network.
We derive an expression for this critical fixed-rate.
Below this critical rate, we show that careful apportioning of the ITL is essential to maximizing the sum throughput of the co-existing networks.
We derive an expression for this apportioning parameter.
Throughput is seen to increase with the number of users in each of the secondary networks.
Computer simulations demonstrate the accuracy of the derived expressions.
The visual representation of concepts or ideas through the use of simple shapes has always been explored in the history of Humanity, and it is believed to be the origin of writing.
We focus on computational generation of visual symbols to represent concepts.
We aim to develop a system that uses background knowledge about the world to find connections among concepts, with the goal of generating symbols for a given concept.
We are also interested in exploring the system as an approach to visual dissociation and visual conceptual blending.
This has a great potential in the area of Graphic Design as a tool to both stimulate creativity and aid in brainstorming in projects such as logo, pictogram or signage design.
Existing approaches to protect the privacy of Electronic Health Records are either insufficient for existing medical laws or they are too restrictive in their usage.
For example, smart card-based encryption systems require the patient to be always present to authorize access to medical records.
Questionnaires were administered to 50 medical practitioners to identify and categorize different Electronic Health Record attributes.
The system was implemented using multiple biometrics of patients to access patient records in pre-hospital care. The software development tools employed were Java and a MySQL database.
The system provides applicable security when patients' records are shared with other practitioners, employers, organizations, or research institutes.
The system evaluation shows average response times of 6 seconds and 11.1 seconds for fingerprint and iris, respectively, over ten different simulations.
The system protects privacy and confidentiality by limiting the amount of data exposed to users. It also enables emergency medical technicians to gain easy and reliable access to the necessary attributes of patients' Electronic Health Records while still maintaining the privacy and confidentiality of the data using the patient's fingerprint and iris.
Content Delivery Networks (CDNs) deliver a majority of the user-requested content on the Internet, including web pages, videos, and software downloads.
A CDN server caches and serves the content requested by users.
Designing caching algorithms that automatically adapt to the heterogeneity, burstiness, and non-stationary nature of real-world content requests is a major challenge and is the focus of our work.
While there is much work on caching algorithms for stationary request traffic, the work on non-stationary request traffic is very limited.
Consequently, most prior models are inaccurate for production CDN traffic that is non-stationary.
We propose two TTL-based caching algorithms and provide provable guarantees for content request traffic that is bursty and non-stationary.
The first algorithm called d-TTL dynamically adapts a TTL parameter using a stochastic approximation approach.
Given a feasible target hit rate, we show that the hit rate of d-TTL converges to its target value for a general class of bursty traffic that allows Markov dependence over time and non-stationary arrivals.
The second algorithm called f-TTL uses two caches, each with its own TTL.
The first-level cache adaptively filters out non-stationary traffic, while the second-level cache stores frequently-accessed stationary traffic.
Given feasible targets for both the hit rate and the expected cache size, f-TTL asymptotically achieves both targets.
We implement d-TTL and f-TTL and evaluate both algorithms using an extensive nine-day trace consisting of 500 million requests from a production CDN server.
We show that both d-TTL and f-TTL converge to their hit rate targets with an error of about 1.3%.
But, f-TTL requires a significantly smaller cache size than d-TTL to achieve the same hit rate, since it effectively filters out the non-stationary traffic for rarely-accessed objects.
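The adaptation idea behind d-TTL can be sketched as follows (a simplified single-TTL cache under i.i.d. traffic; the update form, step size, and traffic model are assumptions, not the paper's exact algorithm): the TTL is nudged up on misses and down on hits, so the empirical hit rate drifts toward the target:

```python
import random

random.seed(3)

def simulate_dttl(n_items=20, n_requests=20000, target=0.5, eta=0.05):
    """One TTL shared by all items, adapted by stochastic approximation
    toward a target hit rate (illustrative form)."""
    ttl, expiry, hits = 1.0, {}, []
    for t in range(n_requests):
        item = random.randrange(n_items)      # i.i.d. traffic (assumption)
        hit = expiry.get(item, -1.0) >= t
        expiry[item] = t + ttl                # (re)cache with current TTL
        ttl = max(0.0, ttl + eta * (target - hit))
        hits.append(hit)
    half = n_requests // 2
    return sum(hits[half:]) / half, ttl

hit_rate, ttl = simulate_dttl()
assert 0.3 < hit_rate < 0.7   # adapted TTL drives hit rate near the target
assert ttl > 0.0
```

The real d-TTL operates per-object on bursty, non-stationary traffic with provable convergence; this sketch only shows the direction of the feedback loop.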
This paper considers the task of thorax disease classification on chest X-ray images.
Existing methods generally use the global image as input for network learning.
Such a strategy is limited in two aspects.
1) A thorax disease usually happens in (small) localized areas which are disease specific.
Training CNNs using global image may be affected by the (excessive) irrelevant noisy areas.
2) Due to the poor alignment of some CXR images, the existence of irregular borders hinders the network performance.
In this paper, we address the above problems by proposing a three-branch attention guided convolution neural network (AG-CNN).
AG-CNN 1) learns from disease-specific regions to avoid noise and improve alignment, and 2) integrates a global branch to compensate for the discriminative cues lost by the local branch.
Specifically, we first learn a global CNN branch using global images.
Then, guided by the attention heat map generated from the global branch, we infer a mask to crop a discriminative region from the global image.
The local region is used for training a local CNN branch.
Lastly, we concatenate the last pooling layers of both the global and local branches for fine-tuning the fusion branch.
Comprehensive experiments are conducted on the ChestX-ray14 dataset.
We first report a strong global baseline producing an average AUC of 0.841 with ResNet-50 as backbone.
After combining the local cues with the global information, AG-CNN improves the average AUC to 0.868.
When DenseNet-121 is used, the average AUC reaches 0.871, a new state of the art in the community.
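The mask-and-crop step guided by the attention heat map can be sketched as follows (the threshold fraction and synthetic data are assumptions): threshold the heat map at a fraction of its maximum and crop the bounding box of the resulting mask:

```python
import numpy as np

def crop_from_heatmap(image, heatmap, tau=0.7):
    """Crop the region where the attention heat map exceeds a fraction
    tau of its maximum (sketch of the mask-and-crop step)."""
    mask = heatmap >= tau * heatmap.max()
    ys, xs = np.where(mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    return image[y0:y1, x0:x1]

image = np.arange(100.0).reshape(10, 10)
heatmap = np.zeros((10, 10))
heatmap[3:6, 4:8] = 1.0            # synthetic "disease-specific" hot region

local = crop_from_heatmap(image, heatmap)
assert local.shape == (3, 4)       # the crop covers rows 3..5, cols 4..7
```

The cropped region then feeds the local branch, while the full image continues through the global branch.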
Data diversity is critical to success when training deep learning models.
Medical imaging data sets are often imbalanced as pathologic findings are generally rare, which introduces significant challenges when training deep learning models.
In this work, we propose a method to generate synthetic abnormal MRI images with brain tumors by training a generative adversarial network using two publicly available data sets of brain MRI.
We demonstrate two unique benefits that the synthetic images provide.
First, we illustrate improved performance on tumor segmentation by leveraging the synthetic images as a form of data augmentation.
Second, we demonstrate the value of generative models as an anonymization tool, achieving comparable tumor segmentation results when trained on the synthetic data versus when trained on real subject data.
Together, these results offer a potential solution to two of the largest challenges facing machine learning in medical imaging, namely the small incidence of pathological findings, and the restrictions around sharing of patient data.
Recent work in data mining and related areas has highlighted the importance of the statistical assessment of data mining results.
Crucial to this endeavour is the choice of a non-trivial null model for the data, to which the found patterns can be contrasted.
The most influential null models proposed so far are defined in terms of invariants of the null distribution.
Such null models can be used by computation intensive randomization approaches in estimating the statistical significance of data mining results.
Here, we introduce a methodology to construct non-trivial probabilistic models based on the maximum entropy (MaxEnt) principle.
We show how MaxEnt models allow for the natural incorporation of prior information.
Furthermore, they satisfy a number of desirable properties of previously introduced randomization approaches.
Lastly, they also have the benefit that they can be represented explicitly.
We argue that our approach can be used for a variety of data types.
However, for concreteness, we have chosen to demonstrate it in particular for databases and networks.
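A one-dimensional analogue of the MaxEnt construction can be sketched directly (the support and moment constraint below are assumptions): the maximum-entropy distribution on a finite support with a fixed mean has exponential-family form, and its Lagrange multiplier can be found by bisection:

```python
import math

def maxent_with_mean(support, target_mean, tol=1e-10):
    """Maximum-entropy distribution on a finite support with a fixed mean:
    p_i proportional to exp(lam * x_i); solve for lam by bisection."""
    def mean(lam):
        ws = [math.exp(lam * x) for x in support]
        z = sum(ws)
        return sum(w * x for w, x in zip(ws, support)) / z
    lo, hi = -50.0, 50.0
    while hi - lo > tol:          # mean(lam) is monotone increasing in lam
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    ws = [math.exp(lam * x) for x in support]
    z = sum(ws)
    return [w / z for w in ws]

p = maxent_with_mean([0, 1, 2, 3, 4], target_mean=1.5)
assert abs(sum(p) - 1.0) < 1e-9
assert abs(sum(pi * x for pi, x in zip(p, [0, 1, 2, 3, 4])) - 1.5) < 1e-6
```

The database and network models in the paper generalize this picture to many simultaneous expectation constraints.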
We present a new environment for computations in particle physics phenomenology employing recent developments in cloud computing.
On this environment users can create and manage "virtual" machines on which the phenomenology codes/tools can be deployed easily in an automated way.
We analyze the performance of this environment based on "virtual" machines versus the utilization of "real" physical hardware.
In this way we provide a qualitative result for the influence of the host operating system on the performance of a representative set of applications for phenomenology calculations.
A resource-bounded version of the statement "no algorithm recognizes all non-halting Turing machines" is equivalent to an infinitely often (i.o.) superpolynomial speedup for the time required to accept any coNP-complete language and also equivalent to a superpolynomial speedup in proof length in propositional proof systems for tautologies, each of which implies P!=NP.
This suggests a correspondence between the properties 'has no algorithm at all' and 'has no best algorithm' which seems relevant to open problems in computational and proof complexity.
To address this challenge, this paper proposes an Adaptive Window Positioning technique that focuses not just on the meaning of the handwritten signature but also on the individuality of the writer.
This technique divides the handwritten signature into 13 small windows of size n x n (13x13); this size should be large enough to contain ample information about the style of the author and small enough to ensure good identification performance. The process was tested with a GPDS data set containing 4870 signature samples from 90 different writers by comparing the robust features of the test signature with those of the user signature using an appropriate classifier.
Experimental results reveal that the adaptive window positioning technique is an efficient and reliable method for accurate signature feature extraction in the identification of offline handwritten signatures. The technique can also be used to detect signatures signed under emotional duress.
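As a rough illustration of the windowing step (our sketch, not the authors' implementation; the window-selection rule and the ink-density feature are simplifying assumptions):

```python
import numpy as np

def extract_windows(image, win=13, n_windows=13):
    """Illustrative sketch: tile a binarized signature image into fixed-size
    windows and keep up to `n_windows` that contain ink, using pixel
    density as a toy per-window feature."""
    h, w = image.shape
    feats = []
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            patch = image[r:r + win, c:c + win]
            if patch.any():                      # skip empty background windows
                feats.append(patch.mean())       # ink density as the feature
            if len(feats) == n_windows:
                return np.array(feats)
    return np.array(feats)

img = np.zeros((52, 52))
img[10:30, 5:40] = 1.0   # synthetic "stroke"
features = extract_windows(img)
```

A real system would replace the ink-density feature with the paper's robust per-window features before classification.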
Everything in the world is being connected, and things are becoming interactive.
The future of the interactive world depends on the future Internet of Things (IoT).
Software-defined networking (SDN) technology, a new paradigm in the networking area, can be useful in creating an IoT because it can handle interactivity by controlling physical devices, transmission of data among them, and data acquisition.
However, digital signage can be one of the promising technologies in this era of technology that is progressing toward the interactive world, connecting users to the IoT network through device-to-device communication technology.
This article illustrates a novel prototype that is mainly focused on a smart digital signage system comprised of software-defined IoT (SD-IoT) and invisible image sensor communication technology.
We propose an SDN scheme aimed at bringing flexibility and compatibility to an IoT network-based smart digital signage system.
Invisible communication can make the technology more appealing to its users, and it ensures the usage of otherwise unused resources such as images and videos.
In addition, this communication has paved the way for interactivity between the user and digital signage, where the digital signage and the camera of a smartphone can be operated as a transmitter and a receiver, respectively.
The proposed scheme might be applicable to real-world applications because SDN has the flexibility to adapt to changes in network status without any hardware modifications, while displays and smartphones are available everywhere.
A performance analysis of this system showed the advantages of an SD-IoT network over an Internet protocol-based IoT network considering a queuing analysis for a dynamic link allocation process in the case of user access to the IoT network.
Scientific computation is a discipline that combines numerical analysis, physical understanding, algorithm development, and structured programming.
Several yottacycles per year on the world's largest computers are spent simulating problems as diverse as weather prediction, the properties of material composites, the behavior of biomolecules in solution, and the quantum nature of chemical compounds.
This article is intended to review specific language features and their use in computational science.
We will review the strengths and weaknesses of different programming styles, with examples taken from widely used scientific codes.
The lack of realistic and open benchmarking datasets for pedestrian visual-inertial odometry has made it hard to pinpoint differences in published methods.
Existing datasets either lack a full six degree-of-freedom ground-truth or are limited to small spaces with optical tracking systems.
We take advantage of advances in pure inertial navigation, and develop a set of versatile and challenging real-world computer vision benchmark sets for visual-inertial odometry.
For this purpose, we have built a test rig equipped with an iPhone, a Google Pixel Android phone, and a Google Tango device.
We provide a wide range of raw sensor data that is accessible on almost any modern-day smartphone together with a high-quality ground-truth track.
We also compare resulting visual-inertial tracks from Google Tango, ARCore, and Apple ARKit with two recent methods published in academic forums.
The data sets cover both indoor and outdoor cases, with stairs, escalators, elevators, office environments, a shopping mall, and a metro station.
Recent studies have shown that adaptively regulating the sampling rate results in significant reduction in computational resources in embedded software based control.
Selecting a uniform sampling rate for a control loop is robust, but overly pessimistic for sharing processors among multiple control loops.
Fine grained regulation of periodicity achieves better resource utilization, but is hard to implement online in a robust way.
In this paper we propose multi-mode sampling period selection, derived from an offline control theoretic analysis of the system.
We report significant gains in computational efficiency without trading off control performance.
Recently, a chaotic image encryption algorithm based on perceptron model was proposed.
The present paper analyzes security of the algorithm and finds that the equivalent secret key can be reconstructed with only one pair of known-plaintext/ciphertext, which is supported by both mathematical proof and experiment results.
In addition, some other security defects are also reported.
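The generic principle behind such known-plaintext attacks is easy to state: if encryption is equivalent to masking each position with a fixed, plaintext-independent keystream, a single known pair reveals that keystream. A minimal sketch of this principle (an illustration, not the attacked perceptron-based cipher, which is more involved):

```python
import numpy as np

def equivalent_keystream(plain, cipher):
    """If encryption reduces to c = p XOR k with a position-dependent
    keystream k, one known plaintext/ciphertext pair reveals k directly."""
    return np.bitwise_xor(plain, cipher)

rng = np.random.default_rng(0)
key = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)   # secret keystream
p1 = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)    # known plaintext
c1 = np.bitwise_xor(p1, key)
k_eq = equivalent_keystream(p1, c1)                       # recovered equivalent key

# Decrypt a fresh ciphertext with the recovered equivalent key.
p2 = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)
c2 = np.bitwise_xor(p2, key)
recovered = np.bitwise_xor(c2, k_eq)
```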
Several state-of-the-art video deblurring methods are based on a strong assumption that the captured scenes are static.
These methods fail to deblur blurry videos in dynamic scenes.
We propose a video deblurring method to deal with general blurs inherent in dynamic scenes, contrary to other methods.
To handle locally varying and general blurs caused by various sources, such as camera shake, moving objects, and depth variation in a scene, we approximate pixel-wise kernel with bidirectional optical flows.
Therefore, we propose a single energy model that simultaneously estimates optical flows and latent frames to solve our deblurring problem.
We also provide a framework and efficient solvers to optimize the energy model.
By minimizing the proposed energy function, we achieve significant improvements in removing blurs and estimating accurate optical flows in blurry frames.
Extensive experimental results demonstrate the superiority of the proposed method in real and challenging videos that state-of-the-art methods fail in either deblurring or optical flow estimation.
The authors propose a conceptual model of participation in a viral diffusion process composed of four stages: awareness, infection, engagement, and action.
To verify the model, it was applied and studied in a virtual social chat environment.
The study investigates the behavioral paths of actions that reflect the stages of participation in the diffusion and presents shortcuts that lead to the final action, i.e., attendance at a virtual event.
The results show that the participation in each stage of the process increases the probability of reaching the final action.
Nevertheless, the majority of users involved in the virtual event did not go through each stage of the process but followed the shortcuts.
That suggests that the viral diffusion process is not necessarily a linear sequence of human actions but rather a dynamic system.
What is here called controlled natural language (CNL) has traditionally been given many different names.
Especially during the last four decades, a wide variety of such languages have been designed.
They are applied to improve communication among humans, to improve translation, or to provide natural and intuitive representations for formal notations.
Despite the apparent differences, it seems sensible to put all these languages under the same umbrella.
To bring order to the variety of languages, a general classification scheme is presented here.
A comprehensive survey of existing English-based CNLs is given, listing and describing 100 languages from 1930 until today.
Classification of these languages reveals that they form a single scattered cloud filling the conceptual space between natural languages such as English on the one end and formal languages such as propositional logic on the other.
The goal of this article is to provide a common terminology and a common model for CNL, to contribute to the understanding of their general nature, to provide a starting point for researchers interested in the area, and to help developers to make design decisions.
This paper concerns model reduction of dynamical systems using the nuclear norm of the Hankel matrix to make a trade-off between model fit and model complexity.
This results in a convex optimization problem where this trade-off is determined by one crucial design parameter.
The main contribution is a methodology to approximately calculate all solutions up to a certain tolerance to the model reduction problem as a function of the design parameter.
This is called the regularization path in sparse estimation and is a very important tool in order to find the appropriate balance between fit and complexity.
We extend this to the more complicated nuclear norm case.
The key idea is to determine when to exactly calculate the optimal solution using an upper bound based on the so-called duality gap.
Hence, by solving a fixed number of optimization problems the whole regularization path up to a given tolerance can be efficiently computed.
We illustrate this approach on some numerical examples.
Creative telescoping algorithms compute linear differential equations satisfied by multiple integrals with parameters.
We describe a precise and elementary algorithmic version of the Griffiths-Dwork method for the creative telescoping of rational functions.
This leads to bounds on the order and degree of the coefficients of the differential equation, and to the first complexity result which is simply exponential in the number of variables.
One of the important features of the algorithm is that it does not need to compute certificates.
The approach is vindicated by a prototype implementation.
How would you search for a unique, fashionable shoe that a friend wore and you want to buy, but you didn't take a picture?
Existing approaches propose interactive image search as a promising venue.
However, they either entrust the user with taking the initiative to provide informative feedback, or give all control to the system which determines informative questions to ask.
Instead, we propose a mixed-initiative framework where both the user and system can be active participants, depending on whose initiative will be more beneficial for obtaining high-quality search results.
We develop a reinforcement learning approach which dynamically decides which of three interaction opportunities to give to the user: drawing a sketch, providing free-form attribute feedback, or answering attribute-based questions.
By allowing these three options, our system optimizes both the informativeness and exploration capabilities allowing faster image retrieval.
We outperform three baselines on three datasets and extensive experimental settings.
A novel control design approach for general nonlinear systems is presented in this paper.
The approach is based on the identification of a polynomial model of the system to control and on the on-line inversion of this model.
An efficient technique is developed to perform the inversion, which allows an effective control implementation on real-time processors.
This large-scale study, consisting of 24.5 million hand hygiene opportunities spanning 19 distinct facilities in 10 different states, uses linear predictive models to expose factors that may affect hand hygiene compliance.
We examine the use of features such as temperature, relative humidity, influenza severity, day/night shift, federal holidays and the presence of new residents in predicting daily hand hygiene compliance.
The results suggest that colder temperatures and federal holidays have an adverse effect on hand hygiene compliance rates, and that individual cultures and attitudes regarding hand hygiene seem to exist among facilities.
Scissor lifts, a staple of mechanical design, especially in competitive robotics, are a type of linkage that can be used to raise a load to some height, when acted upon by some force, usually exerted by an actuator.
The position of this actuator, however, can affect the mechanical advantage and velocity ratio of the system.
Hence, there needs to be a concrete way to analytically compare different actuator positions.
However, all current research into the analysis of scissor lifts either focuses only on the screw jack configuration, or derives separate force expressions for different actuator positions.
This, once again, leaves the decision between different actuator positions to trial and error, since the expression to test the potency of the position can only be derived once the position is chosen.
This paper proposes a derivation for a general force expression, in terms of a few carefully chosen position variables, which can be used to generate the force expression for any actuator position.
Hence, this expression illustrates exactly how each of the position variables (called a, b and i in this paper, as defined later) affect the force output, and hence can be used to pick an appropriate actuator position, by choosing values for the position variables that give the desired result.
Authoring documents in MKM formats like OMDoc is a very tedious task.
After years of working on a semantically annotated corpus of sTeX documents (GenCS), we identified a set of common, time-consuming subtasks, which can be supported in an integrated authoring environment.
We have adapted the modular Eclipse IDE into sTeXIDE, an authoring solution for enhancing productivity in contributing to sTeX-based corpora. sTeXIDE supports context-aware command completion, module management, semantic macro retrieval, and theory graph navigation.
Group communication implies a many-to-many communication and it goes beyond both one-to-one communication (i.e., unicast) and one-to-many communication (i.e., multicast).
Unlike most user authentication protocols that authenticate a single user each time, we propose a new type of authentication, called group authentication, that authenticates all users in a group at once.
The group authentication protocol is specially designed to support group communications.
There is a group manager who is responsible for managing the group communication.
During registration, each user of a group obtains a unique token from the group manager.
Users present their tokens to determine whether they all belong to the same group or not.
The group authentication protocol allows users to reuse their tokens without compromising the security of tokens.
In addition, the group authentication can protect the identity of each user.
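A minimal sketch of how token-based group authentication can work, assuming a Shamir-style secret-sharing construction (which may differ in detail from the paper's protocol): the group manager embeds a secret in a random polynomial, tokens are polynomial shares, and interpolation succeeds only if all released tokens are genuine.

```python
import random

P = 2_147_483_647  # a Mersenne prime as the working field

def make_tokens(secret, t, xs):
    """Group manager: hide `secret` in a degree-(t-1) polynomial and hand
    each user the token (x_i, f(x_i)) -- a Shamir-style sketch."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    f = lambda x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in xs]

def group_authenticate(tokens, secret):
    """All users release tokens; Lagrange interpolation at x=0 recovers the
    secret iff every token is genuine, authenticating the group at once."""
    total = 0
    for i, (xi, yi) in enumerate(tokens):
        num, den = 1, 1
        for j, (xj, _) in enumerate(tokens):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total == secret

random.seed(1)
tokens = make_tokens(secret=123456789, t=3, xs=[1, 2, 3])
ok = group_authenticate(tokens, 123456789)
bad = group_authenticate([(1, 42)] + tokens[1:], 123456789)  # forged token
```

In such a construction, a token reveals nothing about the holder's identity by itself, and fresh polynomials allow token reuse across sessions.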
The number of bandwidth-hungry applications and services is constantly growing.
HTTP adaptive streaming of audio-visual content accounts for the majority of today's internet traffic.
Although internet bandwidth also increases constantly, audio-visual compression remains indispensable, and we currently face the challenge of dealing with multiple video codecs.
This paper proposes a multi-codec DASH dataset comprising AVC, HEVC, VP9, and AV1 in order to enable interoperability testing and streaming experiments for the efficient usage of these codecs under various conditions.
We adopt state of the art encoding and packaging options and also provide basic quality metrics along with the DASH segments.
Additionally, we briefly introduce a multi-codec DASH scheme and possible usage scenarios.
Finally, we provide a preliminary evaluation of the encoding efficiency in the context of HTTP adaptive streaming services and applications.
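For illustration, a multi-codec presentation can expose one adaptation set per codec family within a single MPD. The fragment below is a hand-written sketch, not the dataset's actual manifest; the codec strings and bitrates are typical example values:

```xml
<!-- Illustrative only: one AdaptationSet per codec family. -->
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     mediaPresentationDuration="PT60S" minBufferTime="PT2S"
     profiles="urn:mpeg:dash:profile:isoff-on-demand:2011">
  <Period>
    <AdaptationSet mimeType="video/mp4" codecs="avc1.640028">
      <Representation id="avc-1080p" bandwidth="4300000" width="1920" height="1080"/>
    </AdaptationSet>
    <AdaptationSet mimeType="video/mp4" codecs="hvc1.1.6.L120.90">
      <Representation id="hevc-1080p" bandwidth="2600000" width="1920" height="1080"/>
    </AdaptationSet>
    <AdaptationSet mimeType="video/mp4" codecs="vp09.00.40.08">
      <Representation id="vp9-1080p" bandwidth="2700000" width="1920" height="1080"/>
    </AdaptationSet>
    <AdaptationSet mimeType="video/mp4" codecs="av01.0.08M.08">
      <Representation id="av1-1080p" bandwidth="2200000" width="1920" height="1080"/>
    </AdaptationSet>
  </Period>
</MPD>
```

A capability-aware client would select the adaptation set whose codecs string it can decode, then adapt bitrate within it as usual.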
We propose a novel GAN-based framework for detecting shadows in images, in which a shadow detection network (D-Net) is trained together with a shadow attenuation network (A-Net) that generates adversarial training examples.
The A-Net modifies the original training images constrained by a simplified physical shadow model and is focused on fooling the D-Net's shadow predictions.
Hence, it is effectively augmenting the training data for D-Net with hard-to-predict cases.
The D-Net is trained to predict shadows in both original images and generated images from the A-Net.
Our experimental results show that the additional training data from A-Net significantly improves the shadow detection accuracy of D-Net.
Our method outperforms the state-of-the-art methods on the most challenging shadow detection benchmark (SBU) and also obtains state-of-the-art results on a cross-dataset task, testing on UCF.
Furthermore, the proposed method achieves accurate real-time shadow detection at 45 frames per second.
Individual neurons in convolutional neural networks supervised for image-level classification tasks have been shown to implicitly learn semantically meaningful concepts ranging from simple textures and shapes to whole or partial objects - forming a "dictionary" of concepts acquired through the learning process.
In this work we introduce a simple, efficient zero-shot learning approach based on this observation.
Our approach, which we call Neuron Importance-Aware Weight Transfer (NIWT), learns to map domain knowledge about novel "unseen" classes onto this dictionary of learned concepts and then optimizes for network parameters that can effectively combine these concepts - essentially learning classifiers by discovering and composing learned semantic concepts in deep networks.
Our approach shows improvements over previous approaches on the CUBirds and AWA2 generalized zero-shot learning benchmarks.
We demonstrate our approach on a diverse set of semantic inputs as external domain knowledge including attributes and natural language captions.
Moreover by learning inverse mappings, NIWT can provide visual and textual explanations for the predictions made by the newly learned classifiers and provide neuron names.
Our code is available at https://github.com/ramprs/neuron-importance-zsl.
The Universal Turing Machine (TM) is a model for von Neumann computers --- general-purpose computers.
A human brain can inside-skull-automatically learn a universal TM, so that its owner acts as a general-purpose computer and writes computer programs for any practical purpose.
It is unknown whether a machine can accomplish the same.
This theoretical work shows how the Developmental Network (DN) can accomplish this.
Unlike a traditional TM, the TM learned by DN is a super TM --- Grounded, Emergent, Natural, Incremental, Skulled, Attentive, Motivated, and Abstractive (GENISAMA).
A DN is free of any central controller (e.g., Master Map, convolution, or error back-propagation).
Its learning from a teacher TM is one transition observation at a time, immediate, and error-free until all its neurons have been initialized by early observed teacher transitions.
From that point on, the DN is no longer error-free but is always optimal at every time instance in the sense of maximal likelihood, conditioned on its limited computational resources and the learning experience.
This letter also extends the Church-Turing thesis to automatic programming for general purposes and sketches a proof of this extension.
This paper presents a method for imaging of moving targets using multi-static SAR by treating the problem as one of spatial reflectivity signal inversion over an overcomplete dictionary of target velocities.
Since SAR sensor returns can be related to the spatial frequency domain projections of the scattering field, we exploit insights from compressed sensing theory to show that moving targets can be effectively imaged with transmitters and receivers randomly dispersed in a multi-static geometry within a narrow forward cone around the scene of interest.
Existing approaches to dealing with moving targets in SAR solve a coupled non-linear problem of target scattering and motion estimation typically through matched filtering.
In contrast, by using an overcomplete dictionary approach we effectively linearize the forward model and solve the moving target problem as a larger, unified regularized inversion problem subject to sparsity constraints.
An energy management scheme is presented for a grid-connected hybrid power system comprising of a photovoltaic generator as the primary power source and fuel-cell stacks as backup generation.
Power production is managed between the two sources such that a flexible operation is achieved, allowing the hybrid power system to supply a desired power demand by the grid operator.
In addition, the energy management algorithm and the control system are designed such that the hybrid power system supports the grid in case of both symmetrical and asymmetrical voltage sags, thus, adding low voltage ride-through capability, a requirement imposed by a number of modern grid codes on distributed generation.
During asymmetrical voltage sags, the injected active power is kept constant and grid currents are maintained sinusoidal with low harmonic content without requiring a phase locked loop or positive-negative sequence extraction, hence, lowering the computational complexity and design requirements of the control system.
Several test case scenarios are simulated with detailed component models in the SimPowerSystems toolbox of the MATLAB/Simulink computing environment to demonstrate the effectiveness of the proposed energy management control system under normal operating conditions and voltage sags.
Deep Neural Networks have been shown to be beneficial for a variety of tasks, in particular allowing for end-to-end learning and reducing the requirement for manual design decisions.
However, still many parameters have to be chosen in advance, also raising the need to optimize them.
One important, but often ignored system parameter is the selection of a proper activation function.
Thus, in this paper we aim to demonstrate the importance of activation functions in general and show that for different tasks different activation functions might be meaningful.
To avoid the manual design or selection of activation functions, we build on the idea of genetic algorithms to learn the best activation function for a given task.
In addition, we introduce two new activation functions, ELiSH and HardELiSH, which can easily be incorporated in our framework.
In this way, we demonstrate for three different image classification benchmarks that different activation functions are learned, also showing improved results compared to typically used baselines.
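For reference, the two functions split at zero, combining a linear or exponential part with a (hard) sigmoid gate. The sketch below restates the definitions as they are commonly given; treat the exact formulas as our reading rather than an authoritative restatement of the paper:

```python
import numpy as np

def elish(x):
    """ELiSH: x*sigmoid(x) for x >= 0, (exp(x)-1)*sigmoid(x) for x < 0."""
    sig = 1.0 / (1.0 + np.exp(-x))
    return np.where(x >= 0, x * sig, (np.exp(x) - 1.0) * sig)

def hard_elish(x):
    """HardELiSH: same split, with the sigmoid replaced by the hard
    sigmoid max(0, min(1, (x + 1) / 2))."""
    hard_sig = np.clip((x + 1.0) / 2.0, 0.0, 1.0)
    return np.where(x >= 0, x * hard_sig, (np.exp(x) - 1.0) * hard_sig)

x = np.linspace(-5, 5, 11)
y = elish(x)
```

Both functions are smooth enough near zero to train with standard gradient-based optimizers, which is what makes them drop-in candidates for the genetic search.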
Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states' long-term value under a given policy.
In this paper, we focus on policy evaluation with linear function approximation over a fixed dataset.
We first transform the empirical policy evaluation problem into a (quadratic) convex-concave saddle point problem, and then present a primal-dual batch gradient method, as well as two stochastic variance reduction methods for solving the problem.
These algorithms scale linearly in both sample size and feature dimension.
Moreover, they achieve linear convergence even when the saddle-point problem has only strong concavity in the dual variables but no strong convexity in the primal variables.
Numerical experiments on benchmark problems demonstrate the effectiveness of our methods.
The development of cyber-physical systems (CPS) is a big challenge because of their complexity and intricate requirements.
Especially in Requirements Engineering (RE), many redundant and conflicting requirements exist.
Eliminating conflicting requirements and merging redundant or common ones is a challenging task at the elicitation phase of the requirements engineering process for CPS.
Collecting and optimizing requirements through an appropriate process reduces both development time and cost, as every functional requirement is refined and optimized at the very first stage (the requirements elicitation phase) of the whole development process.
Existing research has focused on requirements that have already been collected.
However, none of it has addressed how the requirements are collected and refined.
This paper provides a requirements model for CPS that gives direction on how requirements should be gathered, refined, and clustered in order to develop the CPS independently.
The paper also presents a case study applying the proposed model to a transport system.
Bibliometric methods are used in multiple fields for a variety of purposes, namely for research evaluation.
Most bibliometric analyses have in common their data sources: Thomson Reuters' Web of Science (WoS) and Elsevier's Scopus.
This research compares the journal coverage of both databases in terms of fields, countries and languages, using Ulrich's extensive periodical directory as a base for comparison.
Results indicate that the use of either WoS or Scopus for research evaluation may introduce biases that favor Natural Sciences and Engineering as well as Biomedical Research to the detriment of Social Sciences and Arts and Humanities.
Similarly, English-language journals are overrepresented to the detriment of other languages.
While both databases share these biases, their coverage differs substantially.
As a consequence, the results of bibliometric analyses may vary depending on the database used.
For data integration in information ecosystems, semantic heterogeneity is a known difficulty.
In this paper, we propose Shadow Theory as the philosophical foundation to address this issue.
It is based on the notion of shadows in Plato's Allegory of the Cave.
What we can observe are just shadows, and meanings of shadows are mental entities that only exist in viewers' cognitive structures.
Using an enterprise customer data integration example, we propose six design principles and an algebra to support the required operations.
To enhance the performance of affective models and reduce the cost of acquiring physiological signals for real-world applications, we adopt multimodal deep learning approach to construct affective models from multiple physiological signals.
For the unimodal enhancement task, we show that the best recognition accuracy of 82.11% on the SEED dataset is achieved with shared representations generated by the Deep AutoEncoder (DAE) model.
For the multimodal facilitation task, we demonstrate that the Bimodal Deep AutoEncoder (BDAE) achieves mean accuracies of 91.01% and 83.25% on the SEED and DEAP datasets, respectively, which are much superior to the state-of-the-art approaches.
For the cross-modal learning task, our experimental results demonstrate that a mean accuracy of 66.34% is achieved on the SEED dataset using shared representations generated by the EEG-based DAE as training samples and shared representations generated by the eye-based DAE as testing samples, and vice versa.
Privacy problems are critical and receive more attention than any other issue associated with the Internet of Things (IoT).
IoT has many application areas, including smart homes, smart grids, smart healthcare systems, smart and intelligent transportation, and many more.
Most of these applications are fueled by resource-constrained sensor networks; for example, smart healthcare systems are powered by Wireless Body Area Networks (WBAN), while smart home and weather monitoring systems are fueled by Wireless Sensor Networks (WSN).
In these application areas, sensor node lifetime is a very important aspect, as it directly affects network lifetime and performance.
Data aggregation techniques are used to increase sensor node life by decreasing communication overhead.
However, when data is aggregated at intermediate nodes to reduce communication overhead, data privacy becomes more vulnerable.
Different Privacy-Preserving Data Aggregation (PPDA) techniques have been proposed to ensure data privacy during data aggregation in resource-constrained sensor nodes.
We provide a review and comparative analysis of the state of the art PPDA techniques in this paper.
The comparative analysis is based on Computation Cost, Communication overhead, Privacy Level, resistance against malicious aggregator, sensor node life and energy consumption by the sensor node.
We have studied the most recent techniques and provide in-depth analysis of the minute steps involved in these techniques.
To the best of our knowledge, this survey is the most recent and comprehensive study of PPDA techniques.
This paper presents an approach for transforming data granularity in hierarchical databases for binary decision problems by applying regression to categorical attributes at the lower grain levels.
Attributes from a lower hierarchy entity in the relational database have their information content optimized through regression on the categories histogram trained on a small exclusive labelled sample, instead of the usual mode category of the distribution.
The paper validates the approach on a binary decision task for assessing the quality of secondary schools, focusing on how logistic regression transforms the students' and teachers' attributes into school attributes.
Experiments were carried out on Brazilian schools public datasets via 10-fold cross-validation comparison of the ranking score produced also by logistic regression.
The proposed approach achieved higher performance than the usual distribution mode transformation and equal to the expert weighing approach measured by the maximum Kolmogorov-Smirnov distance and the area under the ROC curve at 0.01 significance level.
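The two evaluation metrics mentioned above can be computed directly from classifier scores. A small sketch with toy scores (our code, not the paper's experimental pipeline):

```python
import numpy as np

def ks_and_auc(scores_pos, scores_neg):
    """Maximum Kolmogorov-Smirnov distance between the two classes' score
    CDFs, and the AUC via the rank-sum (Mann-Whitney) identity."""
    thresholds = np.sort(np.concatenate([scores_pos, scores_neg]))
    cdf_pos = np.searchsorted(np.sort(scores_pos), thresholds, side="right") / len(scores_pos)
    cdf_neg = np.searchsorted(np.sort(scores_neg), thresholds, side="right") / len(scores_neg)
    ks = np.max(np.abs(cdf_pos - cdf_neg))
    # AUC = P(score_pos > score_neg), counting ties as 1/2.
    diff = scores_pos[:, None] - scores_neg[None, :]
    auc = np.mean(diff > 0) + 0.5 * np.mean(diff == 0)
    return ks, auc

pos = np.array([0.9, 0.8, 0.7, 0.6])   # scores of positive-class schools
neg = np.array([0.4, 0.3, 0.5, 0.2])   # scores of negative-class schools
ks, auc = ks_and_auc(pos, neg)
```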
Phonemic segmentation of speech is a critical step of speech recognition systems.
We propose a novel unsupervised algorithm based on sequence prediction models such as Markov chains and recurrent neural networks.
Our approach consists in analyzing the error profile of a model trained to predict speech features frame-by-frame.
Specifically, we try to learn the dynamics of speech in the MFCC space and hypothesize boundaries from local maxima in the prediction error.
We evaluate our system on the TIMIT dataset, with improvements over similar methods.
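The boundary rule itself is simple to sketch: hypothesize a boundary at each local maximum of the frame-wise prediction error. The minimum-peak-distance constraint below is our simplifying assumption, not necessarily the paper's exact peak-picking:

```python
import numpy as np

def boundaries_from_error(err, min_gap=3):
    """Hypothesize a phoneme boundary at each local maximum of the
    frame-wise prediction error, keeping peaks at least `min_gap`
    frames apart (a simplifying choice)."""
    peaks = []
    for t in range(1, len(err) - 1):
        if err[t] > err[t - 1] and err[t] >= err[t + 1]:
            if not peaks or t - peaks[-1] >= min_gap:
                peaks.append(t)
    return peaks

# Synthetic error profile with peaks at frames 5 and 12.
err = np.array([1, 1, 2, 3, 4, 9, 4, 2, 1, 2, 3, 5, 8, 5, 2], dtype=float)
bnds = boundaries_from_error(err)
```

In the full system, `err` would be the per-frame MFCC prediction error of the trained Markov-chain or recurrent model.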
As a collection of 3D points sampled from surfaces of objects, a 3D point cloud is widely used in robotics, autonomous driving and augmented reality.
Due to the physical limitations of 3D sensing devices, 3D point clouds are usually noisy, which influences subsequent computations, such as surface reconstruction, recognition and many others.
To denoise a 3D point cloud, we present a novel algorithm, called weighted multi-projection.
Compared to many previous works on denoising, instead of directly smoothing the coordinates of 3D points, we use a two-fold smoothing: We first estimate a local tangent plane at each 3D point and then reconstruct each 3D point by weighted averaging of its projections on multiple tangent planes.
We also provide the theoretical analysis for the surface normal estimation and achieve a tighter bound than in a previous work.
We validate the empirical performance on the dataset of ShapeNetCore and show that weighted multi-projection outperforms its competitors in all nine classes.
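The two-fold smoothing can be sketched as follows. Neighborhood size, Gaussian weights and other details here are assumptions for illustration, not the paper's exact choices.

```python
import numpy as np

def denoise_wmp(points, k=8, sigma=1.0):
    """Sketch of weighted multi-projection denoising:
    1) estimate a tangent plane at every point via local PCA;
    2) move each point to a weighted average of its projections onto the
       tangent planes of its k nearest neighbors."""
    n = len(points)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, :k]          # includes the point itself
    centers, normals = np.empty((n, 3)), np.empty((n, 3))
    for i in range(n):
        nb = points[knn[i]]
        c = nb.mean(0)
        # normal = direction of smallest variance of the neighborhood
        _, _, vt = np.linalg.svd(nb - c)
        centers[i], normals[i] = c, vt[-1]
    out = np.empty_like(points)
    for i in range(n):
        w = np.exp(-d2[i, knn[i]] / (2 * sigma ** 2))
        # project point i onto each neighbor's tangent plane
        diffs = points[i] - centers[knn[i]]
        dots = (diffs * normals[knn[i]]).sum(1)
        proj = points[i] - dots[:, None] * normals[knn[i]]
        out[i] = (w[:, None] * proj).sum(0) / w.sum()
    return out
```

On points sampled from a plane with additive normal-direction noise, the reconstruction pulls each point back toward the underlying surface.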
In a multi-user millimeter (mm) wave communication system, we consider the problem of estimating the channel response between the central node (base station) and each of the user equipments (UE).
We propose three different strategies: 1) each UE estimates its channel separately; 2) the base station estimates all the UEs' channels jointly; and 3) a two-stage process with estimation performed at both the UE and the base station.
Exploiting the low rank nature of the mm wave channels, we propose a generalized block orthogonal matching pursuit (G.BOMP) framework for channel estimation in all the three strategies.
Our simulation results show that the average beamforming gain of the G.BOMP algorithm is higher than that of the conventional OMP algorithm and of other existing approaches for multi-user mm wave systems.
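For context, the classical OMP baseline that G.BOMP generalizes greedily selects dictionary columns and refits on the growing support; the sketch below is that textbook baseline, not the paper's block variant, and its parameters are illustrative.

```python
import numpy as np

def omp(A, y, sparsity):
    """Classical orthogonal matching pursuit: recover a sparse x from y = A x."""
    residual, support = y.astype(float), []
    for _ in range(sparsity):
        corr = np.abs(A.T @ residual)
        corr[support] = 0                      # do not reselect chosen columns
        support.append(int(np.argmax(corr)))
        # least-squares refit on the current support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x
```

With a random Gaussian dictionary and a 3-sparse signal, the support and coefficients are recovered exactly in the noiseless case.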
In this paper, we present an open data set extracted from the transaction log of the social sciences academic search engine sowiport.
The data set includes a filtered set of 484,449 retrieval sessions which have been carried out by sowiport users in the period from April 2014 to April 2015.
We propose a description of the interactions performed by the academic search engine's users that can be used in different applications such as result ranking improvement, user modeling, query reformulation analysis, and search pattern recognition.
Each year, the treatment decisions for more than 230,000 breast cancer patients in the U.S. hinge on whether the cancer has metastasized away from the breast.
Metastasis detection is currently performed by pathologists reviewing large expanses of biological tissues.
This process is labor intensive and error-prone.
We present a framework to automatically detect and localize tumors as small as 100 x 100 pixels in gigapixel microscopy images sized 100,000 x 100,000 pixels.
Our method leverages a convolutional neural network (CNN) architecture and obtains state-of-the-art results on the Camelyon16 dataset in the challenging lesion-level tumor detection task.
At 8 false positives per image, we detect 92.4% of the tumors, relative to 82.7% by the previous best automated approach.
For comparison, a human pathologist attempting exhaustive search achieved 73.2% sensitivity.
We achieve image-level AUC scores above 97% on both the Camelyon16 test set and an independent set of 110 slides.
In addition, we discover that two slides in the Camelyon16 training set were erroneously labeled normal.
Our approach could considerably reduce false negative rates in metastasis detection.
Increasing interest in securing Android ecosystem has spawned numerous efforts to assist app developers in building secure apps.
These efforts have resulted in tools and techniques capable of detecting vulnerabilities (and malicious behaviors) in apps.
However, there has been no evaluation of the effectiveness of these tools and techniques in detecting known vulnerabilities.
Absence of such evaluations puts app developers at a disadvantage when choosing security analysis tools to secure their apps.
In this regard, we evaluated the effectiveness of vulnerability detection tools for Android apps.
We considered 64 tools and empirically evaluated 14 vulnerability detection tools (incidentally along with 5 malicious behavior detection tools) against 42 known unique vulnerabilities captured by Ghera benchmarks, which are composed of both vulnerable and secure apps.
Of the 24 observations from the evaluation, the key observation is that existing vulnerability detection tools for Android apps are very limited in their ability to detect known vulnerabilities: all of the evaluated tools together could detect only 30 of the 42 known unique vulnerabilities.
More effort is required if security analysis tools are to help developers build secure apps.
We hope the observations from this evaluation will help app developers choose appropriate security analysis tools and persuade tool developers and researchers to identify and address limitations in their tools and techniques.
We also hope this evaluation will spark a conversation in the software engineering and security communities about requiring more rigorous and explicit evaluation of security analysis tools and techniques.
With the rapidly changing technological realm, there is an urgent need to protect the confidentiality of sensitive images stored in a cloud environment.
To overcome the security risks associated with a single cloud, multiple clouds offered by unrelated cloud providers should be used.
This paper outlines an integrated encryption scheme for the secure storage of confidential images on multiple clouds based on DNA sequences.
The current work proposes an application of DEA (Data Envelopment Analysis) methodology for the measurement of technical and allocative efficiency of university research activity.
The analysis is based on bibliometric data from the Italian university system for the five-year period 2004-2008.
Technical and allocative efficiency is measured with the input taken as a university's research staff, classified according to academic rank, and the output as the field-standardized impact of the research product realized by these staff.
The analysis is applied to all scientific disciplines of the so-called hard sciences, and conducted at subfield level, thus at a greater level of detail than ever before achieved in national-scale research assessments.
To design trustworthy robots, we need to understand the impact factors of trust: people's attitudes, experience, and characteristics; the robot's physical design, reliability, and performance; a task's specification and the circumstances under which it is to be performed, e.g. at leisure or under time pressure.
As robots are used for a wide variety of tasks and applications, robot designers ought to be provided with evidence and guidance, to inform their decisions to achieve safe, trustworthy and efficient human-robot interactions.
In this work, the impact factors of trust in a collaborative manufacturing scenario are studied by conducting an experiment with a real robot and participants where a physical object was assembled and then disassembled.
Objective and subjective measures were employed to evaluate the development of trust, under faulty and non-faulty robot conditions, and the effect of previous experience with robots, and personality traits.
Our findings highlight differences when compared to other, more social, scenarios with robotic assistants (such as a home care assistant), in that the condition (faulty or not) does not have a significant impact on the human's perception of the robot in terms of human-likeness, likeability, trustworthiness, and even competence.
However, personality and previous experience do affect how the robot is perceived by participants, even though the effect is relatively small.
Various studies have empirically shown that the majority of Java and Android apps misuse cryptographic libraries, causing devastating breaches of data security.
Therefore, it is crucial to detect such misuses early in the development process.
The fact that insecure usages are not the exception but the norm precludes approaches based on property inference and anomaly detection.
In this paper, we present CrySL, a definition language that enables cryptography experts to specify the secure usage of the cryptographic libraries that they provide.
CrySL combines the generic concepts of method-call sequences and data-flow constraints with domain-specific constraints related to cryptographic algorithms and their parameters.
We have implemented a compiler that translates a CrySL ruleset into a context- and flow-sensitive demand-driven static analysis.
The analysis automatically checks a given Java or Android app for violations of the CrySL-encoded rules.
We empirically evaluated our ruleset through analyzing 10,001 Android apps.
Our results show that misuse of cryptographic APIs is still widespread, with 96% of apps containing at least one misuse.
However, we observed fewer of the misuses that were reported in previous work.
Mobile ad-hoc networks (MANETs) are sets of self-organized wireless mobile nodes that work without any predefined infrastructure.
For routing data in MANETs, the routing protocols rely on mobile wireless nodes.
In general, the performance of any routing protocol suffers i) from resource constraints and ii) due to the mobility of the nodes.
Owing to these routing challenges, clustering-based protocols in MANETs frequently suffer from the cluster-head failure problem, which degrades cluster stability.
This paper proposes Enhanced CBRP, a scheme that improves cluster stability, and in turn the performance of the traditional cluster-based routing protocol (CBRP), by electing a better cluster head using a weighted clustering algorithm and by considering some crucial routing challenges.
Moreover, the proposed protocol introduces a secondary cluster head for each cluster, to increase the stability of the cluster, and implicitly of the network infrastructure, in case of sudden failure of the cluster head.
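A weighted cluster-head election with a standby runner-up can be sketched as below. The choice of factors (degree deviation, mobility, battery drain) and their coefficients are assumptions for illustration in the spirit of weighted clustering algorithms, not the paper's exact formula.

```python
def elect_heads(nodes, w=(0.7, 0.2, 0.1)):
    """nodes: dict id -> (degree, mobility, battery_drain).
    Lower combined weight = better cluster head; the runner-up is kept
    as the secondary head that takes over on primary failure."""
    def weight(stats):
        degree, mobility, drain = stats
        # deviating from an ideal degree, high mobility, and high battery
        # drain all penalize a candidate
        return w[0] * abs(degree - 4) + w[1] * mobility + w[2] * drain
    ranked = sorted(nodes, key=lambda i: weight(nodes[i]))
    primary, secondary = ranked[0], ranked[1]
    return primary, secondary
```

Keeping the runner-up pre-elected means a head failure triggers a local handover instead of a full re-election, which is the stability gain the abstract describes.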
Computer Vision, either alone or combined with other technologies such as radar or Lidar, is one of the key technologies used in Advanced Driver Assistance Systems (ADAS).
Its role in understanding and analysing the driving scene is of great importance, as can be seen from the number of ADAS applications that use this technology.
However, porting a vision algorithm to an embedded automotive system is still very challenging, as there must be a trade-off between several design requisites.
Furthermore, there is not a standard implementation platform, so different alternatives have been proposed by both the scientific community and the industry.
This paper aims to review the requisites and the different embedded implementation platforms that can be used for Computer Vision-based ADAS, with a critical analysis and an outlook to future trends.
Regret theory describes human decision-making under risk.
The key to obtaining a quantitative model of regret theory is to measure the preference in humans' minds when they choose among a set of options.
Unlike physical quantities, measuring psychological preference is not procedure invariant, i.e. the readings alter when the methods change.
In this work, we alleviate this influence by choosing the procedure compatible with the way that an individual makes a choice.
We believe the resulting model is closer to the nature of human decision-making.
The preference elicitation process is decomposed into a series of short surveys to reduce cognitive workload and increase response accuracy.
To make the questions natural and familiar to the subjects, we follow the insight that humans generate, quantify and communicate preference in natural language.
The fuzzy-set theory is hence utilized to model responses from subjects.
Based on these ideas, a graphical human-computer interface (HCI) is designed to articulate the information as well as to efficiently collect human responses.
The design also accounts for human heuristics and biases, e.g. range effect and anchoring effect, to enhance its reliability.
The overall performance of the survey is satisfactory because the measured model shows prediction accuracy equivalent to the revisit-performance of the subjects.
With the fast-growing economy of the past ten years, cities in China have experienced great changes; meanwhile, a huge volume of urban grid management data has been recorded.
Studies on urban grid management are so far uncommon.
Such studies are important, however, because urban grid data describe individual behaviors and detailed problems in communities, and reveal the dynamics of changing policies and social relations.
In this article, we present a preliminary study of the urban grid management data of Shanghai, and investigate the key characteristics of the interactions between local government and citizens in such a fast-growing metropolis.
Our investigation illustrates the dynamics of coevolution between economy and living environments.
We also developed a mathematical model to quantitatively discover the spatial and temporal intra-relations among events found in the data, providing insights that help the local government fine-tune its resource-allocation policy and give proper incentives to drive the coevolution to the optimal state, thereby achieving good governance.
The problem of Learning from Demonstration is targeted at learning to perform tasks based on observed examples.
One approach to Learning from Demonstration is Inverse Reinforcement Learning, in which actions are observed to infer rewards.
This work combines a feature based state evaluation approach to Inverse Reinforcement Learning with neuroevolution, a paradigm for modifying neural networks based on their performance on a given task.
Neural networks are used to learn from a demonstrated expert policy and are evolved to generate a policy similar to the demonstration.
The algorithm is discussed and evaluated against competitive feature-based Inverse Reinforcement Learning approaches.
At the cost of execution time, neural networks allow for non-linear combinations of features in state evaluations.
These valuations may correspond to state value or state reward.
This results in better correspondence to observed examples as opposed to using linear combinations.
This work also extends existing work on Bayesian Non-Parametric Feature Construction for Inverse Reinforcement Learning by using non-linear combinations of intermediate data to improve performance.
The algorithm is observed to be specifically suitable for linearly solvable non-deterministic Markov Decision Processes in which multiple rewards are sparsely scattered in state space.
A conclusive performance hierarchy between evaluated algorithms is presented.
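As a toy illustration of combining demonstration matching with neuroevolution, the sketch below evolves a (deliberately simplified, linear) state evaluator so that greedy choices reproduce demonstrated ones. The population scheme, mutation scale, and fitness function are assumptions for the sketch, not the paper's algorithm, which evolves full neural networks for non-linear feature combinations.

```python
import numpy as np

def evolve_evaluator(demos, n_features, pop=30, gens=40, seed=0):
    """demos: list of (chosen_state_features, rejected_state_features) pairs.
    Evolves an evaluator w so that the chosen state always scores higher."""
    rng = np.random.default_rng(seed)
    population = rng.standard_normal((pop, n_features))
    def fitness(w):
        # fraction of demonstrations where the chosen state scores higher
        return np.mean([(a @ w) > (b @ w) for a, b in demos])
    for _ in range(gens):
        scores = np.array([fitness(w) for w in population])
        elite = population[np.argsort(scores)[-pop // 4:]]
        # next generation: mutated copies of the elite
        population = (elite[rng.integers(len(elite), size=pop)]
                      + 0.1 * rng.standard_normal((pop, n_features)))
    return max(population, key=fitness)
```

The same loop extends to non-linear evaluators by evolving the weights of a small network instead of a vector, at the execution-time cost the abstract mentions.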
Botnets continue to be an active threat against companies and individuals worldwide.
Previous research on botnets has unveiled information on how these systems and their stakeholders operate, but insight into the economic structure that supports those stakeholders is lacking.
The objective of this research is to analyse the business model and determine the revenue stream of a botnet owner.
We also study the botnet life-cycle and determine the costs associated with it on the basis of four case studies.
We conclude that building a full-scale cyber army from scratch is very expensive, whereas acquiring a previously developed botnet costs comparatively little.
We find that initial setup and monthly costs were minimal compared to total revenue.
While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from the problems of context confusion and edge color bleeding.
To address context confusion, we propose to incorporate the pixel-level object semantics to guide the image colorization.
The rationale is that human beings perceive and distinguish colors based on the object's semantic categories.
We propose a hierarchical neural network with two branches.
One branch learns what the object is while the other branch learns the object's colors.
The network jointly optimizes a semantic segmentation loss and a colorization loss.
To attack edge color bleeding, we generate more continuous color maps with sharp edges by adopting a joint bilateral upsampling layer at inference time.
Our network is trained on PASCAL VOC2012 and COCO-stuff with semantic segmentation labels and it produces more realistic and finer results compared to the colorization state-of-the-art.
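The joint bilateral upsampling idea can be sketched as follows: a low-resolution color map is upsampled under the guidance of the full-resolution grayscale image, so color edges snap to intensity edges. Window size, sigma values and the nearest-sample mapping below are assumptions for illustration, not the paper's layer.

```python
import numpy as np

def joint_bilateral_upsample(low_color, guide, sigma_s=1.0, sigma_r=0.1):
    """low_color: (h, w, C) low-res color map; guide: (H, W) hi-res grayscale."""
    H, W = guide.shape
    h, w, C = low_color.shape
    out = np.empty((H, W, C))
    ys = np.linspace(0, h - 1, H)   # low-res coordinates of each hi-res pixel
    xs = np.linspace(0, w - 1, W)
    for i in range(H):
        for j in range(W):
            cy, cx = int(round(ys[i])), int(round(xs[j]))
            y0, y1 = max(cy - 1, 0), min(cy + 2, h)
            x0, x1 = max(cx - 1, 0), min(cx + 2, w)
            acc, norm = np.zeros(C), 0.0
            for y in range(y0, y1):
                for x in range(x0, x1):
                    # spatial weight in the low-res grid
                    ds = (y - ys[i]) ** 2 + (x - xs[j]) ** 2
                    # range weight from the guide at the matching hi-res spot
                    gy = int(round(y * (H - 1) / max(h - 1, 1)))
                    gx = int(round(x * (W - 1) / max(w - 1, 1)))
                    dr = (guide[i, j] - guide[gy, gx]) ** 2
                    wgt = np.exp(-ds / (2 * sigma_s ** 2)
                                 - dr / (2 * sigma_r ** 2))
                    acc += wgt * low_color[y, x]
                    norm += wgt
            out[i, j] = acc / norm
    return out
```

Because the range weight suppresses low-res samples whose guide intensity differs, the upsampled colors stay constant within regions and change sharply at the guide's edges, which is exactly the anti-bleeding behavior described above.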
This article investigates emergence and complexity in complex systems that can share information on a network.
To this end, we use a theoretical approach from information theory, computability theory, and complex networks.
One key studied question is how much emergent complexity (or information) arises when a population of computable systems is networked compared with when this population is isolated.
First, we define a general model for networked theoretical machines, which we call algorithmic networks.
Then, we narrow our scope to investigate algorithmic networks that optimize the average fitnesses of nodes in a scenario in which each node imitates the fittest neighbor and the randomly generated population is networked by a time-varying graph.
We show that there are graph-topological conditions that cause these algorithmic networks to have the property of expected emergent open-endedness for large enough populations.
In other words, the expected emergent algorithmic complexity of a node tends to infinity as the population size tends to infinity.
Given a dynamic network, we show that these conditions imply the existence of a central time to trigger expected emergent open-endedness.
Moreover, we show that networks with small diameter compared to the network size meet these conditions.
We also discuss future research based on how our results are related to some problems in network science, information theory, computability theory, distributed computing, game theory, evolutionary biology, and synergy in complex systems.
Learning a deep neural network requires solving a challenging optimization problem: it is a high-dimensional, non-convex and non-smooth minimization problem with a large number of terms.
The current practice in neural network optimization is to rely on the stochastic gradient descent (SGD) algorithm or its adaptive variants.
However, SGD requires a hand-designed schedule for the learning rate.
In addition, its adaptive variants tend to produce solutions that generalize less well on unseen data than SGD with a hand-designed schedule.
We present an optimization method that offers empirically the best of both worlds: our algorithm yields good generalization performance while requiring only one hyper-parameter.
Our approach is based on a composite proximal framework, which exploits the compositional nature of deep neural networks and can leverage powerful convex optimization algorithms by design.
Specifically, we employ the Frank-Wolfe (FW) algorithm for SVM, which computes an optimal step-size in closed-form at each time-step.
We further show that the descent direction is given by a simple backward pass in the network, yielding the same computational cost per iteration as SGD.
We present experiments on the CIFAR and SNLI data sets, where we demonstrate the significant superiority of our method over Adam, Adagrad, as well as the recently proposed BPGrad and AMSGrad.
Furthermore, we compare our algorithm to SGD with a hand-designed learning rate schedule, and show that it provides similar generalization while converging faster.
The code is publicly available at https://github.com/oval-group/dfw.
Virtual network services that span multiple data centers are important to support emerging data-intensive applications in fields such as bioinformatics and retail analytics.
Successful virtual network service composition and maintenance requires flexible and scalable 'constrained shortest path management' both in the management plane for virtual network embedding (VNE) or network function virtualization service chaining (NFV-SC), as well as in the data plane for traffic engineering (TE).
In this paper, we show analytically and empirically that leveraging constrained shortest paths within recent VNE, NFV-SC and TE algorithms can lead to network utilization gains (of up to 50%) and higher energy efficiency.
The management of complex VNE, NFV-SC and TE algorithms can, however, be intractable for large-scale substrate networks due to the NP-hardness of the constrained shortest path problem.
To address such scalability challenges, we propose a novel exact constrained shortest path algorithm, the 'Neighborhoods Method' (NM).
Our NM uses novel search-space reduction techniques and has a theoretical quadratic speed-up, making it practically faster (by an order of magnitude) than recent branch-and-bound exhaustive search solutions.
Finally, we detail our NM-based SDN controller implementation in a real-world testbed to further validate practical NM benefits for virtual network services.
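The underlying primitive, a delay-constrained least-cost path, can be stated as a small label-setting search. The sketch below is the textbook baseline formulation of the problem, not the paper's Neighborhoods Method; the pruning rule and data layout are illustrative.

```python
import heapq

def constrained_shortest_path(adj, src, dst, max_delay):
    """Least-cost path from src to dst whose total delay stays within
    max_delay. adj: {u: [(v, cost, delay), ...]}."""
    # labels ordered by cost; the first dst label popped is optimal
    heap = [(0, 0, src, [src])]
    best = {}                       # (node, delay) -> best cost seen
    while heap:
        cost, delay, u, path = heapq.heappop(heap)
        if u == dst:
            return cost, path
        for v, c, d in adj.get(u, []):
            nc, nd = cost + c, delay + d
            # prune infeasible or dominated labels
            if nd > max_delay or best.get((v, nd), float('inf')) <= nc:
                continue
            best[(v, nd)] = nc
            heapq.heappush(heap, (nc, nd, v, [*path, v]))
    return None
```

Tightening the delay budget forces the search off the cheapest path onto a feasible but costlier one, which is the trade-off that makes the problem NP-hard in general.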
Regular languages (RL) are the simplest family in Chomsky's hierarchy.
Thanks to their simplicity they enjoy various nice algebraic and logic properties that have been successfully exploited in many application fields.
Practically all of their related problems are decidable, so that they support automatic verification algorithms.
Also, they can be recognized in real-time.
Context-free languages (CFL) are another major family well-suited to formalize programming, natural, and many other classes of languages; their increased generative power w.r.t.
RL, however, causes the loss of several closure properties and of the decidability of important problems; furthermore they need complex parsing algorithms.
Thus, various subclasses thereof have been defined with different goals, spanning from efficient, deterministic parsing to closure properties, logic characterization and automatic verification techniques.
Among CFL subclasses, so-called structured ones, i.e., those where the typical tree-structure is visible in the sentences, exhibit many of the algebraic and logic properties of RL, whereas deterministic CFL have been thoroughly exploited in compiler construction and other application fields.
After surveying and comparing the main properties of those various language families, we go back to operator precedence languages (OPL), an old family through which R. Floyd pioneered deterministic parsing, and we show that they offer unexpected properties in two fields so far investigated in totally independent ways: they enable parsing parallelization in a more effective way than traditional sequential parsers, and exhibit the same algebraic and logic properties so far obtained only for less expressive language families.
Restoring face images from distortions is important in face recognition applications and is challenged by multiple-scale issues, which remain not well solved.
In this paper, we present a Sequential Gating Ensemble Network (SGEN) for multi-scale face restoration issue.
We first employ the principle of ensemble learning into SGEN architecture design to reinforce predictive performance of the network.
The SGEN aggregates multi-level base-encoders and base-decoders into the network, which enables the network to contain multiple scales of receptive field.
Instead of combining these base-en/decoders directly with non-sequential operations, the SGEN takes base-en/decoders from different levels as sequential data.
Specifically, the SGEN learns to sequentially extract high-level information from base-encoders in a bottom-up manner and restore low-level information from base-decoders in a top-down manner.
Besides, we propose to realize bottom-up and top-down information combination and selection with Sequential Gating Unit (SGU).
The SGU sequentially takes two inputs from different levels and decides the output based on one active input.
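One possible reading of this gating mechanism is sketched below; the sigmoid gate, the weight matrix `W`, the bias `b` and the convex mixing are all assumptions of this sketch, not the paper's definition of the SGU.

```python
import numpy as np

def sgu(active, passive, W, b):
    """Gate computed from the active input alone decides how much of each
    input passes through."""
    g = 1.0 / (1.0 + np.exp(-(W @ active + b)))   # gate from active input
    return g * active + (1.0 - g) * passive
```

With a strongly positive gate pre-activation, the unit passes the active input through almost unchanged, which matches the description of the output being decided by one active input.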
Experiment results demonstrate that our SGEN is more effective at multi-scale human face restoration with more image details and less noise than state-of-the-art image restoration models.
By using adversarial training, SGEN also produces more visually preferred results than other models through subjective evaluation.
Typically, an ontology matching technique is a combination of many different types of matchers operating at various abstraction levels, such as structure, semantics, syntax, and instance.
An ontology matching technique that employs matchers at all possible abstraction levels is expected, in general, to give the best results in terms of precision, recall and F-measure, owing to improved matching opportunities, if we discount efficiency issues, which may improve with better computing resources such as parallel processing.
A gold standard ontology matching model is derived from a model classification of ontology matching techniques.
A suitable metric is also defined based on the gold standard ontology matching model.
A review of ontology matching techniques described in recent research papers was undertaken to categorize each technique according to the newly proposed gold standard model, and a metric value for the whole group was computed.
The results of this study support the proposed gold standard ontology matching model.
One of the defining features of a cryptocurrency is that its ledger, containing all transactions that have ever taken place, is globally visible.
As one consequence of this degree of transparency, a long line of recent research has demonstrated that - even in cryptocurrencies that are specifically designed to improve anonymity - it is often possible to track flows of money as it changes hands, and in some cases to de-anonymize users entirely.
With the recent proliferation of alternative cryptocurrencies, however, it becomes relevant to ask not only whether or not money can be traced as it moves within the ledger of a single cryptocurrency, but if it can in fact be traced as it moves across ledgers.
This is especially pertinent given the rise in popularity of automated trading platforms such as ShapeShift, which make it effortless to carry out such cross-currency trades.
In this paper, we use data scraped from ShapeShift over a six-month period and the data from eight different blockchains in order to explore this question.
Beyond developing new heuristics and demonstrating the ability to create new types of links across cryptocurrency ledgers, we also identify various patterns of cross-currency trades and of the general usage of these platforms, with the ultimate goal of understanding whether they serve either a criminal or a profit-driven agenda.
Advanced Encryption Standard (AES) is a symmetric key encryption algorithm which is extensively used in secure electronic data transmission.
Although AES was tested and declared secure when introduced, in 2005 a researcher named Bernstein claimed that it is vulnerable to side-channel attacks.
The cache-based timing attack is the type of side-channel attack demonstrated by Bernstein, which exploits the timing variation between cache hits and misses.
This kind of attack can be prevented by masking the actual timing information from the attacker.
Such masking can be performed by altering the original AES software implementation while preserving its semantics.
This paper presents possible software implementation level countermeasures against Bernstein's cache timing attack.
Two simple software-based countermeasures built on the concept of "constant-encryption-time" were demonstrated against the remote cache timing attack with positive outcomes, establishing a secured environment for AES encryption.
In this work, we explore the outage probability (OP) analysis of the selective decode-and-forward (SDF) cooperation protocol employing multiple-input multiple-output (MIMO) orthogonal space-time block codes (OSTBC) over time-varying Rayleigh fading channels with imperfect channel state information (CSI) and mobile nodes.
Closed-form expressions for the per-block average OP, the probability distribution function (PDF) of a sum of independent and identically distributed (i.i.d.) Gamma random variables (RVs), and the cumulative distribution function (CDF) are derived and used to investigate the performance of the relaying network.
A mathematical framework is developed to derive the optimal source-relay power allocation factors.
It is shown that source node mobility affects the per-block average OP performance more significantly than the destination node mobility.
Nevertheless, in other node mobility situations, cooperative systems are constrained by an error floor at higher signal-to-noise ratio (SNR) regimes.
Simulation results show that equal power allocation is the only possible optimal solution when the source-to-relay link is stronger than the relay-to-destination link.
Conversely, almost all the power is allocated to the source node when the source-to-relay link is weaker than the relay-to-destination link.
Simulation results also show that the simulated OP plots closely agree with the analytic OP plots at high SNR regimes.
Digital predistortion (DPD) is a widely adopted baseband processing technique in current radio transmitters.
While DPD can effectively suppress unwanted spurious spectrum emissions stemming from imperfections of analog RF and baseband electronics, it also introduces extra processing complexity and poses challenges for efficient and flexible implementation, especially in mobile cellular transmitters, given their limited computing power compared to base stations.
In this paper, we present high data rate implementations of broadband DPD on modern embedded processors, such as mobile GPU and multicore CPU, by taking advantage of emerging parallel computing techniques for exploiting their computing resources.
We further verify the suppression effect of DPD experimentally on real radio hardware platforms.
Performance evaluation results of our DPD design demonstrate the high efficacy of modern general purpose mobile processors on accelerating DPD processing for a mobile transmitter.
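Broadband DPD implementations typically evaluate a nonlinear model with memory at every sample; the widely used memory-polynomial model is sketched below as an illustration. The abstract does not specify the paper's exact model, so the model choice and coefficient layout here are assumptions.

```python
import numpy as np

def memory_polynomial(x, coeffs, K, M):
    """Memory-polynomial model common in DPD:
    y(n) = sum_{k=1..K} sum_{m=0..M-1} a_{k,m} x(n-m) |x(n-m)|^(k-1).
    coeffs is a flat list in (k outer, m inner) order."""
    N = len(x)
    y = np.zeros(N, dtype=complex)
    idx = 0
    for k in range(1, K + 1):
        for m in range(M):
            # delayed copy of the input, zero-padded at the start
            xd = np.concatenate([np.zeros(m, dtype=complex), x[:N - m]])
            y += coeffs[idx] * xd * np.abs(xd) ** (k - 1)
            idx += 1
    return y
```

Each branch is an independent multiply-accumulate over the sample stream, which is why this structure parallelizes well on the GPU and multicore targets the paper considers.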
Multiplayer Online Battle Arena (MOBA) is currently one of the most popular genres of digital games around the world.
The domain of knowledge contained in these complicated games is large.
It is hard for humans and algorithms to evaluate the real-time game situation or predict the game result.
In this paper, we introduce MOBA-Slice, a time slice based evaluation framework of relative advantage between teams in MOBA games.
MOBA-Slice is a quantitative evaluation method based on learning, similar to the value network of AlphaGo.
It establishes a foundation for further MOBA related research including AI development.
In MOBA-Slice, with an analysis of the deciding factors of MOBA game results, we design a neural network model to fit our discounted evaluation function.
Then we apply MOBA-Slice to Defense of the Ancients 2 (DotA2), a typical and popular MOBA game.
Experiments on a large number of match replays show that our model works well on arbitrary matches.
MOBA-Slice not only achieves an accuracy 3.7% higher than DotA Plus Assistant at result prediction, but also supports prediction of the remaining game time, thereby realizing the evaluation of relative advantage between teams.
We study the problem of stochastic optimization for deep learning in the parallel computing environment under communication constraints.
A new algorithm is proposed in this setting where the communication and coordination of work among concurrent processes (local workers), is based on an elastic force which links the parameters they compute with a center variable stored by the parameter server (master).
The algorithm enables the local workers to perform more exploration, i.e. the algorithm allows the local variables to fluctuate further from the center variable by reducing the amount of communication between local workers and the master.
We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to improved performance.
We propose synchronous and asynchronous variants of the new algorithm.
We provide the stability analysis of the asynchronous variant in the round-robin scheme and compare it with the more common parallelized method ADMM.
We show that the stability of EASGD is guaranteed when a simple stability condition is satisfied, which is not the case for ADMM.
We additionally propose the momentum-based version of our algorithm that can be applied in both synchronous and asynchronous settings.
Asynchronous variant of the algorithm is applied to train convolutional neural networks for image classification on the CIFAR and ImageNet datasets.
Experiments demonstrate that the new algorithm accelerates the training of deep architectures compared to DOWNPOUR and other common baseline approaches and furthermore is very communication efficient.
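The elastic-force coupling described above can be illustrated on a toy one-dimensional problem. The following is a minimal sketch of synchronous EASGD, assuming the standard local/center update with elastic coupling strength alpha = eta * rho; the quadratic objective, step sizes, and worker count are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

def easgd_quadratic(workers=4, steps=500, eta=0.05, rho=0.5):
    """Synchronous EASGD sketch minimizing f(x) = 0.5 * (x - 3)^2."""
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 5.0, workers)   # local variables, one per worker
    center = 0.0                        # center variable held by the master
    alpha = eta * rho                   # elastic coupling strength
    for _ in range(steps):
        grad = x - 3.0                  # gradient of the local objective
        diff = x - center
        x = x - eta * grad - alpha * diff     # local step plus elastic pull
        center = center + alpha * diff.sum()  # center pulled toward workers
    return center
```

A smaller coupling alpha (less communication) lets local variables fluctuate further from the center, which is exactly the exploration effect described above.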
This article presents a novel intruder model for automated reasoning about anonymity (vote-privacy) and secrecy properties of voting systems.
We adapt the lazy spy for this purpose, as it avoids the eagerness of pre-computation of unnecessary deductions, reducing the required state space for the analysis.
This powerful intruder behaves as a Dolev-Yao intruder: he not only observes a protocol run but also interacts with the protocol participants, overhears communication channels, intercepts messages, and spoofs any messages that he has learned or can generate from prior knowledge.
We make several important modifications in relation to existing channel types and the deductive system.
For the former, we define various channel types for different threat models.
For the latter, we construct a large deductive system over the space of messages transmitted in the voting system model.
The model represents the first formal treatment of the vVote system, which was used in the November 2014 state election in Victoria, Australia.
This paper presents the kinematic analysis of the 3-PPPS parallel robot with an equilateral mobile platform and a U-shape base.
The proposed design and appropriate selection of parameters allow the formulation of simpler direct and inverse kinematics for the manipulator under study.
The parallel singularities associated with the manipulator depend only on the orientation of the end-effector.
The quaternion parameters are used to represent the aspects, i.e. the singularity-free regions of the workspace.
A cylindrical algebraic decomposition is used to characterize the workspace and joint space with a low number of cells.
The discriminant variety is obtained to describe the boundaries of each cell.
With these simplifications, the 3-PPPS parallel robot with the proposed design can be claimed to be the simplest 6-DOF robot, which further makes it useful for industrial applications.
A large semantic gap between the high-level synthesis (HLS) design and the low-level (on-board or RTL) simulation environment often creates a barrier for those who are not FPGA experts.
Moreover, such low-level simulation takes a long time to complete.
Software-based HLS simulators can help bridge this gap and accelerate the simulation process; however, we found that the current FPGA HLS commercial software simulators sometimes produce incorrect results.
In order to solve this correctness issue while maintaining the high speed of a software-based simulator, this paper proposes a new HLS simulation flow named FLASH.
The main idea behind the proposed flow is to extract the scheduling information from the HLS tool and automatically construct an equivalent cycle-accurate simulation model while preserving C semantics.
Experimental results show that FLASH runs three orders of magnitude faster than the RTL simulation.
Video description is the automatic generation of natural language sentences that describe the contents of a given video.
It has applications in assisting the visually impaired, video subtitling, and robotics.
The past few years have seen a surge of research in this area due to the unprecedented success of deep learning in computer vision and natural language processing.
Numerous methods, datasets and evaluation metrics have been proposed in the literature, creating the need for a comprehensive survey to focus research efforts in this flourishing new direction.
This paper fills the gap by surveying the state of the art approaches with a focus on deep learning models; comparing benchmark datasets in terms of their domain, number of classes, and repository size; and identifying the pros and cons of various evaluation metrics like SPICE, CIDEr, ROUGE, BLEU, METEOR, and WMD.
Classical approaches combined subject, object and verb detection with template based language models to generate sentences.
However, the release of large datasets revealed that these methods cannot cope with the diversity in open-domain videos.
Classical approaches were followed by a very short era of statistical methods which were soon replaced with deep learning, the current state of the art in video description.
Our survey shows that despite the fast-paced developments, video description research is still in its infancy due to the following reasons.
Firstly, existing datasets neither contain adequate visual diversity nor complexity of linguistic structures.
Secondly, current evaluation metrics fall short of measuring the agreement between machine-generated descriptions and those of humans.
From an algorithmic point of view, diagnosis of new models is challenging because it is difficult to ascertain the contributions of the visual features and the adopted language model to the final description.
We conclude...
As actor of both his presentation and his online representation, the diarist traces out his diegetic existence by setting up a strategy of automediation.
Self-representation is a personal creation determined by the interface and the functionalities of the software.
A pragmatic approach to self-representation in the Livejournal blog and the Touchgraph Livejournal browser provides a way to observe the interplay between intimacy and intersubjectivity.
The software leads the user from the lonely space of writing to the community space of publication.
We propose an approach to address two issues that commonly occur during training of unsupervised GANs.
First, since GANs use only a continuous latent distribution to embed multiple classes or clusters of data, they often do not correctly handle the structural discontinuity between disparate classes in a latent space.
Second, discriminators of GANs easily forget about past generated samples by generators, incurring instability during adversarial training.
We argue that these two infamous problems of unsupervised GAN training can be largely alleviated by a learnable memory network that both generators and discriminators can access.
Generators can effectively learn representations of training samples to understand the underlying cluster distributions of data, which eases the structural discontinuity problem.
At the same time, discriminators can better memorize clusters of previously generated samples, which mitigates the forgetting problem.
We propose a novel end-to-end GAN model named memoryGAN, which involves a memory network that can be trained without supervision and integrated into many existing GAN models.
With evaluations on multiple datasets such as Fashion-MNIST, CelebA, CIFAR10, and Chairs, we show that our model is probabilistically interpretable, and generates realistic image samples of high visual fidelity.
The memoryGAN also achieves state-of-the-art inception scores among unsupervised GAN models on the CIFAR10 dataset, without any optimization tricks or weaker divergences.
We introduce a new generative model for human planning under the Bayesian Inverse Reinforcement Learning (BIRL) framework which takes into account the fact that humans often plan using hierarchical strategies.
We describe the Bayesian Inverse Hierarchical RL (BIHRL) algorithm for inferring the values of hierarchical planners, and use an illustrative toy model to show that BIHRL retains accuracy where standard BIRL fails.
Furthermore, BIHRL is able to accurately predict the goals of `Wikispeedia' game players, with inclusion of hierarchical structure in the model resulting in a large boost in accuracy.
We show that BIHRL is able to significantly outperform BIRL even when we only have a weak prior on the hierarchical structure of the plans available to the agent, and discuss the significant challenges that remain for scaling up this framework to more realistic settings.
Automated detection of abnormalities in data has been an active research area in recent years because of its diverse applications in practice, including video surveillance, industrial damage detection and network intrusion detection.
However, building an effective anomaly detection system is a non-trivial task, since it requires tackling several challenging issues: the shortage of annotated data, the inability to define anomalous objects explicitly, and the expensive cost of the feature engineering procedure.
Unlike existing approaches, which only partially solve these problems, we develop a unique framework to cope with all of them simultaneously.
Instead of handling the ambiguous definition of anomalous objects, we propose to work with regular patterns, whose unlabeled data is abundant and usually easy to collect in practice.
This allows our system to be trained completely in an unsupervised procedure and liberates us from the need for costly data annotation.
By learning a generative model that captures the normality distribution in data, we can isolate abnormal data points that yield low normality scores (high abnormality scores).
Moreover, by leveraging the power of generative networks, i.e. energy-based models, we are also able to learn the feature representation automatically rather than relying on hand-crafted features, which have dominated anomaly detection research for many decades.
We demonstrate our proposal on the specific application of video anomaly detection; the experimental results indicate that our method performs better than baselines and is comparable with state-of-the-art methods on many benchmark video anomaly detection datasets.
Multiview representation learning is very popular for latent factor analysis.
It naturally arises in many data analysis, machine learning, and information retrieval applications to model dependent structures among multiple data sources.
For computational convenience, existing approaches usually formulate the multiview representation learning as convex optimization problems, where global optima can be obtained by certain algorithms in polynomial time.
However, many pieces of evidence have corroborated that heuristic nonconvex approaches also have good empirical computational performance and convergence to the global optima, although there is a lack of theoretical justification.
Such a gap between theory and practice motivates us to study a nonconvex formulation for multiview representation learning, which can be efficiently solved by a simple stochastic gradient descent (SGD) algorithm.
We first illustrate the geometry of the nonconvex formulation; then, we establish asymptotic global rates of convergence to the global optima by diffusion approximations.
Numerical experiments are provided to support our theory.
We propose an efficient solution to peer-to-peer localization in a wireless sensor network which works in two stages.
At the first stage the optimization problem is relaxed into a convex problem, given in the form recently proposed by Soares, Xavier, and Gomes.
The convex problem is efficiently solved in a distributed way by an ADMM approach, which provides a significant improvement in speed with respect to the original solution.
In the second stage, a soft transition to the original non-convex, non-relaxed formulation is applied so as to force the solution towards a local minimum.
The algorithm is fully distributed by construction, and it is tested in meaningful scenarios, showing its effectiveness in localization accuracy and speed of convergence, as well as its inherent robustness.
Broadcasting systems have to deal with channel variability in order to offer the best rate to the users.
Hierarchical modulation is a practical solution to provide different rates to the receivers as a function of the channel quality.
Unfortunately, the performance evaluation of such modulations requires time-consuming simulations.
We propose in this paper a novel approach based on the channel capacity to avoid these simulations.
The method allows studying the performance of hierarchical as well as classical modulations combined with error-correcting codes.
We also compare hierarchical modulation with a time-sharing strategy in terms of achievable rates and unavailability.
Our work will be applied to the DVB-SH and DVB-S2 standards, which both consider hierarchical modulation as an optional feature.
We describe here a library aimed at automating the solution of partial differential equations using the finite element method.
By employing novel techniques for automated code generation, the library combines a high level of expressiveness with efficient computation.
Finite element variational forms may be expressed in near mathematical notation, from which low-level code is automatically generated, compiled and seamlessly integrated with efficient implementations of computational meshes and high-performance linear algebra.
Easy-to-use object-oriented interfaces to the library are provided in the form of a C++ library and a Python module.
This paper discusses the mathematical abstractions and methods used in the design of the library and its implementation.
A number of examples are presented to demonstrate the use of the library in application code.
The aim of this paper is to propose an application of mutual information-based ensemble methods to the analysis and classification of heart beats associated with different types of Arrhythmia.
Models of multilayer perceptrons, support vector machines, and radial basis function neural networks were trained and tested using the MIT-BIH arrhythmia database.
This research brings a focus to an ensemble method that, to our knowledge, is a novel application in the area of ECG Arrhythmia detection.
The proposed classifier ensemble method showed improved performance, relative to either majority voting classifier integration or to individual classifier performance.
The overall ensemble accuracy was 98.25%.
This paper presents our contribution to the ChaLearn Challenge 2015 on Cultural Event Classification.
The challenge in this task is to automatically classify images from 50 different cultural events.
Our solution is based on the combination of visual features extracted from convolutional neural networks with temporal information using a hierarchical classifier scheme.
We extract visual features from the last three fully connected layers of both CaffeNet (pretrained with ImageNet) and our fine-tuned version for the ChaLearn challenge.
We propose a late fusion strategy that trains a separate low-level SVM on each of the extracted neural codes.
The class predictions of the low-level SVMs form the input to a higher level SVM, which gives the final event scores.
We achieve our best result by adding a temporal refinement step into our classification scheme, which is applied directly to the output of each low-level SVM.
Our approach penalizes high classification scores based on visual features when their time stamp does not match well an event-specific temporal distribution learned from the training and validation data.
Our system achieved the second best result in the ChaLearn Challenge 2015 on Cultural Event Classification with a mean average precision of 0.767 on the test set.
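The temporal refinement step described above can be sketched as down-weighting a visual score by an event-specific temporal prior. The Gaussian form and the parameters below are simplifying assumptions for illustration, not the authors' exact temporal model.

```python
import math

def refine_score(visual_score, timestamp, event_mu, event_sigma):
    """Down-weight a visual classification score when the photo's timestamp
    is unlikely under the event's temporal distribution (Gaussian assumed)."""
    prior = math.exp(-0.5 * ((timestamp - event_mu) / event_sigma) ** 2)
    return visual_score * prior
```

A photo taken near the learned event date keeps its visual score, while one far from it is penalized, matching the behavior described above.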
The Cloud Computing paradigm is providing system architects with a new powerful tool for building scalable applications.
Clouds allow allocation of resources on a "pay-as-you-go" model, so that additional resources can be requested during peak loads and released after that.
However, this flexibility asks for appropriate dynamic reconfiguration strategies.
In this paper we describe SAVER (qoS-Aware workflows oVER the Cloud), a QoS-aware algorithm for executing workflows involving Web Services hosted in a Cloud environment.
SAVER allows execution of arbitrary workflows subject to response time constraints.
SAVER uses a passive monitor to identify workload fluctuations based on the observed system response time.
The information collected by the monitor is used by a planner component to identify the minimum number of instances of each Web Service which should be allocated in order to satisfy the response time constraint.
SAVER uses a simple Queueing Network (QN) model to identify the optimal resource allocation.
Specifically, the QN model is used to identify bottlenecks, and predict the system performance as Cloud resources are allocated or released.
The parameters used to evaluate the model are those collected by the monitor, which means that SAVER does not require any particular knowledge of the Web Services and workflows being executed.
Our approach has been validated through numerical simulations, whose results are reported in this paper.
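The planner's QN-based allocation can be illustrated with a toy open queueing model. The simple response-time approximation and the greedy grow-the-bottleneck rule below are illustrative assumptions, not SAVER's actual model; `service_demands` are per-request service demands and `arrival_rate` is the workflow arrival rate.

```python
def min_instances(service_demands, arrival_rate, max_response_time):
    """Toy planner: add instances at the current bottleneck until the
    predicted total response time satisfies the constraint."""
    # start with just enough instances per service to keep each station stable
    m = [int(arrival_rate * d) + 1 for d in service_demands]

    def predicted_response(m):
        # simple queueing approximation: R_i = D_i / (1 - utilization_i)
        return sum(d / (1 - arrival_rate * d / mi)
                   for d, mi in zip(service_demands, m))

    while predicted_response(m) > max_response_time:
        # bottleneck = station with the highest utilization
        i = max(range(len(m)),
                key=lambda j: arrival_rate * service_demands[j] / m[j])
        m[i] += 1
    return m
```

The loop mirrors the described use of the QN model: identify the bottleneck, allocate a resource there, and re-predict performance.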
Ethics in the emerging world of data science are often discussed through cautionary tales about the dire consequences of missteps taken by high profile companies or organizations.
We take a different approach by foregrounding the ways that ethics are implicated in the day-to-day work of data science, focusing on instances in which data scientists recognize, grapple with, and conscientiously respond to ethical challenges.
This paper presents a case study of ethical dilemmas that arose in a "data science for social good" (DSSG) project focused on improving navigation for people with limited mobility.
We describe how this particular DSSG team responded to those dilemmas, and how those responses gave rise to still more dilemmas.
While the details of the case discussed here are unique, the ethical dilemmas they illuminate can commonly be found across many DSSG projects.
These include: the risk of exacerbating disparities; the thorniness of algorithmic accountability; the evolving opportunities for mischief presented by new technologies; the subjective and value-laden interpretations at the heart of any data-intensive project; the potential for data to amplify or mute particular voices; the possibility of privacy violations; and the folly of technological solutionism.
Based on our tracing of the team's responses to these dilemmas, we distill lessons for an ethical data science practice that can be more generally applied across DSSG projects.
Specifically, this case experience highlights the importance of: 1) setting the scene early on for ethical thinking; 2) recognizing ethical decision-making as an emergent phenomenon intertwined with the quotidian work of data science for social good; and 3) approaching ethical thinking as a thoughtful and intentional balancing of priorities rather than a binary differentiation between right and wrong.
In today's world, huge amounts of data are widely available, and there is thus a need to turn this data into useful information, referred to as knowledge.
This demand for knowledge discovery process has led to the development of many algorithms used to determine the association rules.
One of the major problems faced by these algorithms is the generation of candidate sets.
The FP Tree algorithm is one of the most preferred algorithms for association rule mining because it gives association rules without generating candidate sets.
But in the process of doing so, it generates many CP trees, which decreases its efficiency.
In this research paper, we propose an improved FP-tree algorithm with a modified header table, along with a spare table and the MFI algorithm for association rule mining.
This algorithm generates frequent item sets without using candidate sets and CP trees.
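For context, the data structure being modified can be sketched as follows. This is a minimal generic FP-tree with a header table, the standard construction that the paper's modified header-table and spare-table variant builds on; it is not the proposed algorithm itself.

```python
from collections import Counter

class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 1, {}

def build_fp_tree(transactions, min_support):
    """Build a basic FP-tree: keep frequent items only, insert each
    transaction with items ordered by descending support."""
    freq = Counter(i for t in transactions for i in t)
    freq = {i: c for i, c in freq.items() if c >= min_support}
    root, header = FPNode(None, None), {}   # header: item -> nodes holding it
    for t in transactions:
        items = sorted((i for i in t if i in freq),
                       key=lambda i: (-freq[i], i))
        node = root
        for i in items:
            if i in node.children:
                node.children[i].count += 1
            else:
                node.children[i] = FPNode(i, node)
                header.setdefault(i, []).append(node.children[i])
            node = node.children[i]
    return root, header
```

Because shared prefixes are merged into single paths, frequent itemsets can be read off the tree without generating candidate sets.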
The nonnegative matrix factorization (NMF) is widely used in signal and image processing, including bio-informatics, blind source separation and hyperspectral image analysis in remote sensing.
A great challenge arises when dealing with a nonlinear formulation of the NMF.
Within the framework of kernel machines, the models suggested in the literature do not allow the representation of the factorization matrices, which is a fallout of the curse of the pre-image.
In this paper, we propose a novel kernel-based model for the NMF that does not suffer from the pre-image problem, by investigating the estimation of the factorization matrices directly in the input space.
For different kernel functions, we describe two schemes for iterative algorithms: an additive update rule based on a gradient descent scheme and a multiplicative update rule in the same spirit as in the Lee and Seung algorithm.
Within the proposed framework, we develop several extensions to incorporate constraints, including sparseness, smoothness, and spatial regularization with a total-variation-like penalty.
The effectiveness of the proposed method is demonstrated with the problem of unmixing hyperspectral images, using well-known real images and results with state-of-the-art techniques.
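For reference, the multiplicative update rule "in the same spirit as in the Lee and Seung algorithm" looks as follows in the standard linear NMF setting. This is the classical algorithm the kernel scheme generalizes, not the paper's kernel variant; the rank, iteration count, and epsilon are illustrative.

```python
import numpy as np

def nmf_multiplicative(V, rank, iters=500, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for V ≈ W @ H with V, W, H >= 0.
    Each update multiplies by a nonnegative ratio, so nonnegativity
    of W and H is preserved automatically."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The proposed kernel framework replaces the inner products above with kernel evaluations while estimating the factors directly in the input space.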
The synchronization problem is investigated for the class of locally strongly transitive automata introduced in a previous work of the authors.
Some extensions of this problem related to the notions of stable set and word of minimal rank of an automaton are studied.
An application to synchronizing colorings of aperiodic graphs with a Hamiltonian path is also considered.
Using Deep Reinforcement Learning (DRL) can be a promising approach to handle various tasks in the field of (simulated) autonomous driving.
However, recent publications mainly consider learning in unusual driving environments.
This paper presents Driving School for Autonomous Agents (DSA^2), a software for validating DRL algorithms in more usual driving environments based on artificial and realistic road networks.
We also present the results of applying DSA^2 for handling the task of driving on a straight road while regulating the velocity of one vehicle according to different speed limits.
Arabic word segmentation is essential for a variety of NLP applications such as machine translation and information retrieval.
Segmentation entails breaking words into their constituent stems, affixes and clitics.
In this paper, we compare two approaches for segmenting four major Arabic dialects using only several thousand training examples for each dialect.
The two approaches involve posing the problem as a ranking problem, where an SVM ranker picks the best segmentation, and as a sequence labeling problem, where a bi-LSTM RNN coupled with CRF determines where best to segment words.
We are able to achieve solid segmentation results for all dialects using rather limited training data.
We also show that employing Modern Standard Arabic data for domain adaptation and assuming context independence improve overall results.
Genome-to-genome comparisons require designating anchor points, which are given by Maximum Exact Matches (MEMs) between their sequences.
For large genomes this is a challenging problem and the performance of existing solutions, even in parallel regimes, is not quite satisfactory.
We present a new algorithm, copMEM, that sparsely samples both input genomes, with coprime sampling steps.
Despite being a single-threaded implementation, copMEM computes all MEMs of minimum length 100 between the human and mouse genomes in less than 2 minutes, using less than 10 GB of RAM.
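The coprime-sampling idea rests on the Chinese remainder theorem: with coprime sampling steps p and q, any pair of aligned positions in the two genomes is co-sampled within at most p*q offsets, so no sufficiently long match can be missed. The following small check of that property is an illustration of the principle, not copMEM's implementation.

```python
from math import gcd

def first_cosampled_offset(pos_a, pos_b, p, q):
    """Smallest t with (pos_a + t) % p == 0 and (pos_b + t) % q == 0.
    By the Chinese remainder theorem, when gcd(p, q) == 1 such a t
    always exists with t < p * q."""
    assert gcd(p, q) == 1
    for t in range(p * q):
        if (pos_a + t) % p == 0 and (pos_b + t) % q == 0:
            return t
    return None  # unreachable when p and q are coprime
```

This is why both genomes can be sampled sparsely without losing any MEM longer than a bound determined by p and q.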
Moldable tasks allow schedulers to determine the number of processors assigned to a task, enabling efficient use of large-scale parallel processing systems.
A generic assumption is that every task is monotonic, i.e., its workload increases but its execution time decreases as the number of assigned processors increases.
In this paper, we study the problem of scheduling moldable tasks on processors.
Motivated by many benchmark studies, we introduce a new speedup model: it is linear when the number of assigned processors is small, up to some threshold; beyond the threshold, it possibly declines and even becomes negative as the number increases.
For any achievable threshold value, we propose a generic approximation algorithm to minimize the makespan; it is simpler and achieves a better performance guarantee than the existing algorithms under the monotonic assumption.
As a by-product, we also propose an approximation algorithm to maximize the sum of values of tasks completed by a deadline; this scheduling objective is considered for moldable tasks for the first time while similar works have been done for other types of parallel tasks.
Most of the existing work on automatic facial expression analysis focuses on discrete emotion recognition, or facial action unit detection.
However, facial expressions do not always fall neatly into pre-defined semantic categories.
Also, the similarity between expressions measured in the action unit space need not correspond to how humans perceive expression similarity.
Different from previous work, our goal is to describe facial expressions in a continuous fashion using a compact embedding space that mimics human visual preferences.
To achieve this goal, we collect a large-scale faces-in-the-wild dataset with human annotations in the form: Expressions A and B are visually more similar when compared to expression C, and use this dataset to train a neural network that produces a compact (16-dimensional) expression embedding.
We experimentally demonstrate that the learned embedding can be successfully used for various applications such as expression retrieval, photo album summarization, and emotion recognition.
We also show that the embedding learned using the proposed dataset performs better than several other embeddings learned using existing emotion or action unit datasets.
We consider Markov models of large-scale networks where nodes are characterized by their local behavior and by a mobility model over a two-dimensional lattice.
By assuming random walk, we prove convergence to a system of partial differential equations (PDEs) whose size depends neither on the lattice size nor on the population of nodes.
This provides a macroscopic view of the model which approximates discrete stochastic movements with continuous deterministic diffusions.
We illustrate the practical applicability of this result by modeling a network of mobile nodes with on/off behavior performing file transfers with connectivity to 802.11 access points.
By means of an empirical validation against discrete-event simulation we show high quality of the PDE approximation even for low populations and coarse lattices.
In addition, we confirm the computational advantage in using the PDE limit over a traditional ordinary differential equation limit where the lattice is modeled discretely, yielding speed-ups of up to two orders of magnitude.
Analyzing multivariate time series data is important for many applications such as automated control, fault diagnosis and anomaly detection.
One of the key challenges is to learn latent features automatically from dynamically changing multivariate input.
In visual recognition tasks, convolutional neural networks (CNNs) have been successful to learn generalized feature extractors with shared parameters over the spatial domain.
However, when a high-dimensional multivariate time series is given, designing an appropriate CNN model structure becomes challenging because the kernels may need to be extended through the full dimension of the input volume.
To address this issue, we present two structure learning algorithms for deep CNN models.
Our algorithms exploit the covariance structure over multiple time series to partition input volume into groups.
The first algorithm learns the group CNN structures explicitly by clustering individual input sequences.
The second algorithm learns the group CNN structures implicitly from the error backpropagation.
In experiments with two real-world datasets, we demonstrate that our group CNNs outperform existing CNN based regression methods.
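The covariance-based grouping in the first algorithm can be sketched as clustering channels by pairwise correlation. The greedy threshold rule below is an illustrative stand-in for the paper's clustering step, and the threshold value is an assumption.

```python
import numpy as np

def correlation_groups(X, threshold=0.8):
    """Greedily group rows (channels) of X whose absolute pairwise
    correlation exceeds `threshold`. X has shape (channels, time)."""
    C = np.abs(np.corrcoef(X))
    groups, assigned = [], set()
    for i in range(X.shape[0]):
        if i in assigned:
            continue
        group = [i] + [j for j in range(i + 1, X.shape[0])
                       if j not in assigned and C[i, j] > threshold]
        assigned.update(group)
        groups.append(group)
    return groups
```

Each resulting group of strongly correlated series would then feed its own convolutional kernels instead of one kernel spanning the full input volume.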
In this article, we propose a new implementation of John von Neumann's middle square random number generator (RNG).
A Weyl sequence is utilized to keep the generator running through a long period.
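A minimal sketch of a middle-square generator driven by a Weyl sequence (following Widynski's msws construction) is shown below. The odd 64-bit Weyl constant is one commonly used in that construction, not necessarily the one used in this article.

```python
MASK64 = (1 << 64) - 1

def middle_square_weyl(s=0xB5AD4ECEDA1CE2A9):
    """Middle-square RNG stabilized by a Weyl sequence; yields 32-bit values.
    The Weyl sequence w (incremented by the odd constant s each step)
    prevents the short cycles and zero-collapse of the plain
    middle-square method, keeping the generator running for a long period."""
    x = w = 0
    while True:
        x = (x * x) & MASK64
        w = (w + s) & MASK64                    # Weyl sequence: w += s (mod 2^64)
        x = (x + w) & MASK64
        x = ((x >> 32) | (x << 32)) & MASK64    # take the "middle" by swapping halves
        yield x & 0xFFFFFFFF
```

Usage: `g = middle_square_weyl()` then `next(g)` for each 32-bit output.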
Many machine intelligence techniques are developed in E-commerce and one of the most essential components is the representation of IDs, including user ID, item ID, product ID, store ID, brand ID, category ID etc.
Classical encoding-based methods (like one-hot encoding) are inefficient in that they suffer from sparsity problems due to their high dimensionality, and they cannot reflect the relationships among IDs, whether homogeneous or heterogeneous.
In this paper, we propose an embedding based framework to learn and transfer the representation of IDs.
As implicit feedback from users, a tremendous number of item ID sequences can be easily collected from interactive sessions.
By jointly using these informative sequences and the structural connections among IDs, all types of IDs can be embedded into one low-dimensional semantic space.
Subsequently, the learned representations are utilized and transferred in four scenarios: (i) measuring the similarity between items, (ii) transferring from seen items to unseen items, (iii) transferring across different domains, (iv) transferring across different tasks.
We deploy and evaluate the proposed approach in Hema App and the results validate its effectiveness.
We create and release the first publicly available commercial customer service corpus with annotated relational segments.
Human-computer data from three live customer service Intelligent Virtual Agents (IVAs) in the domains of travel and telecommunications were collected, and reviewers marked all text that was deemed unnecessary to the determination of user intention.
After merging the selections of multiple reviewers to create highlighted texts, a second round of annotation was done to determine the classes of language present in the highlighted sections such as the presence of Greetings, Backstory, Justification, Gratitude, Rants, or Emotions.
This resulting corpus is a valuable resource for improving the quality and relational abilities of IVAs.
As well as discussing the corpus itself, we compare it with the usage of such language in human-human interactions on TripAdvisor forums.
We show that removal of this language from task-based inputs has a positive effect on IVA understanding by both an increase in confidence and improvement in responses, demonstrating the need for automated methods of its discovery.
We derive an upper bound on the number of models for exact satisfiability (XSAT) of arbitrary CNF formulas F. The bound can be calculated solely from the distribution of positive and negated literals in the formula.
For certain subsets of CNF instances the new bound can be computed in sub-exponential time, namely in at most O(exp(sqrt(n))) , where n is the number of variables of F. A wider class of SAT problems beyond XSAT is defined to which the method can be extended.
Prior social contagion models consider the spread of either one contagion at a time on interdependent networks or multiple contagions on single layer networks or under assumptions of competition.
We propose a new threshold model for the diffusion of multiple contagions.
Individuals are placed on a multiplex network with a periodic lattice and a random-regular-graph layer.
On these population structures, we study the interface between two key aspects of the diffusion process: the level of synergy between two contagions, and the rate at which individuals become dormant after adoption.
Dormancy is defined as a looser form of immunity that models the ability to spread without resistance.
Monte Carlo simulations reveal lower synergy makes contagions more susceptible to percolation, especially those that diffuse on lattices.
Faster diffusion of one contagion with dormancy probabilistically blocks the diffusion of the other, in a way similar to ring vaccination.
We show that within a band of synergy, contagions on the lattices undergo bimodal or trimodal branching if they are the slower diffusing contagion.
Item-item collaborative filtering (CF) models are a well known and studied family of recommender systems, however current literature does not provide any theoretical explanation of the conditions under which item-based recommendations will succeed or fail.
We investigate the existence of an ideal item-based CF method able to make perfect recommendations.
This CF model is formalized as an eigenvalue problem, where estimated ratings are equivalent to the true (unknown) ratings multiplied by a user-specific eigenvalue of the similarity matrix.
Preliminary experiments show that the magnitude of the eigenvalue is proportional to the accuracy of recommendations for that user and can therefore provide a reliable measure of confidence.
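This formalization can be sketched with a toy example (the similarity values below are assumed for illustration, not taken from the paper): if a user's true rating vector is an eigenvector of the item-item similarity matrix S, the standard item-based prediction r_u S reproduces the true ratings scaled by that user's eigenvalue.

```python
import numpy as np

# Toy illustration: in the ideal item-based CF model, a user's rating vector
# r_u is an eigenvector of the item-item similarity matrix S, so the
# prediction r_u @ S equals the true ratings scaled by the user-specific
# eigenvalue lambda_u.
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # assumed symmetric item-item similarity
eigvals, eigvecs = np.linalg.eigh(S)

r_u = eigvecs[:, 1]   # a user whose true ratings align with an eigenvector
lam_u = eigvals[1]
r_hat = r_u @ S       # item-based CF prediction step
```

In this ideal case r_hat equals lam_u * r_u exactly, so recommendations (which only need the ranking of items) are perfect up to scale.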
The use of millimeter wave (mmWave) frequencies for communication will be one of the innovations of the next generation of cellular mobile networks (5G).
It will provide unprecedented data rates, but is highly susceptible to rapid channel variations and suffers from severe isotropic pathloss.
Highly directional antennas at the transmitter and the receiver will be used to compensate for these shortcomings and achieve sufficient link budget in wide area networks.
However, directionality demands precise alignment of the transmitter and the receiver beams, an operation which has important implications for control plane procedures, such as initial access, and may increase the delay of the data transmission.
This paper provides a comparison of measurement frameworks for initial access in mmWave cellular networks in terms of detection accuracy, reactiveness and overhead, using parameters recently standardized by the 3GPP and a channel model based on real-world measurements.
We show that the best strategy depends on the specific environment in which the nodes are deployed, and provide guidelines to characterize the optimal choice as a function of the system parameters.
For survival, a living agent must have the ability to assess risk (1) by temporally anticipating accidents before they occur, and (2) by spatially localizing risky regions in the environment to move away from threats.
In this paper, we take an agent-centric approach to study the accident anticipation and risky region localization tasks.
We propose a novel soft-attention Recurrent Neural Network (RNN) which explicitly models both spatial and appearance-wise non-linear interaction between the agent triggering the event and another agent or static-region involved.
In order to test our proposed method, we introduce the Epic Fail (EF) dataset consisting of 3000 viral videos capturing various accidents.
In the experiments, we evaluate the risk assessment accuracy both in the temporal domain (accident anticipation) and spatial domain (risky region localization) on our EF dataset and the Street Accident (SA) dataset.
Our method consistently outperforms other baselines on both datasets.
One of the fundamental elements impacting the performance of a wireless system is interference, which has been a long-term issue in wireless networks.
In the case of cognitive radio (CR) networks, the problem of interference is particularly critical.
In other words, CR keeps the important promise of not producing any harmful interference to the primary user (PU) system.
Thus, it is essential to investigate the impact of interference caused to the PUs so that its detrimental effect on the performance of the PU system is reduced.
Study of cognitive interference generally includes developing a model to statistically demonstrate the power of cognitive interference at the PUs, which then can be utilized to examine different performance measures.
Inspecting the different models for channel interference present in the literature, one can see that interference models have gradually evolved in complexity and sophistication.
Although numerous papers can be found in the literature that have investigated different models for interference, to the best of our knowledge, very few publications are available that provide a review of all models and their comparisons.
This paper is a collection of the state of the art in interference modeling; it overviews and compares the different models in the literature to provide valuable insights for researchers modeling interference in a specific scenario.
The estimation of inertial parameters of a robotic system is crucial for better trajectory tracking performance, especially when model-based controllers are used for carrying out precise tasks.
In this paper, we consider the scenario of grasping an object of unknown properties by a free-flyer space robot with limited actuation.
The problem is to find the inertial parameters of the complete system after grasping has been performed.
Excitation is provided in inertial space, and the excitation trajectories are found by optimization.
Truncated Fourier series are used to represent the reference as well as tracked trajectory.
An approach based on the energy balance between the actuation work and the rate of change of kinetic energy is introduced to calculate the number of harmonics in the Fourier series used to represent the executed trajectory, while trying to find a balance between accounting for saturation effects and keeping out noise.
The effect of input saturation on parameter estimation is also studied.
Simulation results using the Space CoBot free-flyer robot are presented to show the feasibility of the approach.
Air traffic control increasingly depends on information and communication technology (ICT) to manage traffic flow through highly congested and increasingly interdependent airspace regions.
While these systems are critical to ensuring the efficiency and safety of our airspace, they are also increasingly vulnerable to cyber threats that could potentially lead to reduction in capacity and/or reorganization of traffic flows.
In this paper, we model various cyber threats to air traffic control systems, and analyze how these attacks could impact the flow of aircraft through the airspace.
To perform this analysis, we consider a model for wide-area air traffic based on a dynamic queuing network model.
Then we introduce three different attacks (Route Denial of Service, Route Selection Tampering, and Sector Denial of Service) to the air traffic control system, and explore how these attacks manipulate the sector flows by evaluating the queue backlogs for each sector's outflows.
Furthermore, we then explore graph-level vulnerability metrics to identify the sectors that are most vulnerable to various flow manipulations, and compare them to case-study simulations of the various attacks.
The results suggest that Route Denial of Service attacks have a significant impact on the target sector and lead to the largest degradation to the overall air traffic flows.
Furthermore, the impact of the Sector Denial of Service attack is primarily confined to the target sector, while the Route Selection Tampering impacts are mostly confined to certain aircraft.
Generative adversarial nets (GANs) are widely used to learn the data sampling process and their performance may heavily depend on the loss functions, given a limited computational budget.
This study revisits MMD-GAN that uses the maximum mean discrepancy (MMD) as the loss function for GAN and makes two contributions.
First, we argue that the existing MMD loss function may discourage the learning of fine details in data as it attempts to contract the discriminator outputs of real data.
To address this issue, we propose a repulsive loss function to actively learn the difference among the real data by simply rearranging the terms in MMD.
Second, inspired by the hinge loss, we propose a bounded Gaussian kernel to stabilize the training of MMD-GAN with the repulsive loss function.
The proposed methods are applied to the unsupervised image generation tasks on CIFAR-10, STL-10, CelebA, and LSUN bedroom datasets.
Results show that the repulsive loss function significantly improves over the MMD loss at no additional computational cost and outperforms other representative loss functions.
The proposed methods achieve an FID score of 16.21 on the CIFAR-10 dataset using a single DCGAN network and spectral normalization.
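For reference, the (biased) squared-MMD estimator whose terms the attractive and repulsive losses both rearrange can be sketched as follows; the kernel bandwidth and sample shapes are illustrative, and the exact repulsive rearrangement is given in the paper itself.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix k(x, y) = exp(-||x-y||^2 / (2 sigma^2)).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    # Biased estimator of squared MMD between real samples X and fake samples Y:
    # E[k(x,x')] - 2 E[k(x,y)] + E[k(y,y')].
    return (gaussian_kernel(X, X, sigma).mean()
            - 2 * gaussian_kernel(X, Y, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean())
```

The repulsive loss proposed in the abstract rearranges these same three kernel terms so that the discriminator actively spreads apart its outputs on real data rather than contracting them.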
It is known that Boosting can be interpreted as a gradient descent technique to minimize an underlying loss function.
Specifically, the underlying loss being minimized by the traditional AdaBoost is the exponential loss, which is proved to be very sensitive to random noise/outliers.
Therefore, several Boosting algorithms, e.g., LogitBoost and SavageBoost, have been proposed to improve the robustness of AdaBoost by replacing the exponential loss with some designed robust loss functions.
In this work, we present a new way to robustify AdaBoost, i.e., incorporating the robust learning idea of Self-paced Learning (SPL) into Boosting framework.
Specifically, we design a new robust Boosting algorithm based on SPL regime, i.e., SPLBoost, which can be easily implemented by slightly modifying off-the-shelf Boosting packages.
Extensive experiments and a theoretical characterization are also carried out to illustrate the merits of the proposed SPLBoost.
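The self-paced idea can be sketched with the standard hard SPL weighting scheme (a generic illustration of the regime; SPLBoost's actual regularizer may differ):

```python
import numpy as np

def self_paced_weights(losses, lam):
    # Hard self-paced weighting: samples whose current loss exceeds the "age"
    # parameter lam receive weight 0 and are ignored this round, so outliers
    # with large loss cannot dominate the boosting update. lam is gradually
    # increased during training so harder samples are admitted over time.
    return (losses <= lam).astype(float)
```

Plugging such weights into a boosting round down-weights noisy examples, which is the robustness mechanism SPLBoost exploits.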
The present paper introduces the initial implementation of a software exploration tool targeting graphical user interface (GUI) driven applications.
GUITracer facilitates the comprehension of GUI-driven applications by starting from their most conspicuous artefact - the user interface itself.
The current implementation of the tool can be used with any Java-based target application that employs one of the AWT, Swing or SWT toolkits.
The tool transparently instruments the target application and provides real time information about the GUI events fired.
For each event, call relations within the application are displayed at method, class or package level, together with detailed coverage information.
The tool facilitates feature location, program comprehension as well as GUI test creation by revealing the link between the application's GUI and its underlying code.
As such, GUITracer is intended for software practitioners developing or maintaining GUI-driven applications.
We believe our tool to be especially useful for entry-level practitioners as well as students seeking to understand complex GUI-driven software systems.
The present paper details the rationale as well as the technical implementation of the tool.
As a proof-of-concept implementation, we also discuss further development that can lead to our tool's integration into a software development workflow.
Opacity is a property that characterizes the system's capability to keep its "secret" from being inferred by an intruder that partially observes the system's behavior.
In this paper, we are concerned with enhancing the opacity using insertion functions, while at the same time, enforcing the task specification in a parametric stochastic discrete event system.
We first obtain the parametric Markov decision process that encodes all the possible insertions.
Based on this process, we convert the parameter and insertion-function co-synthesis problem into a nonlinear program.
We prove that if the output of this program satisfies all the constraints, it will be a valid solution to our problem.
Therefore, the security and the capability of enforcing the task specification can be simultaneously guaranteed.
In today's WLANs, the scheduling of packet transmissions relies solely on the collisions and successes a station may experience.
To better support traffic differentiation in dense WLANs, in this paper, we propose a distributed reservation mechanism for the Carrier Sense Multiple Access Extended Collision Avoidance (CSMA/ECA) MAC protocol, termed CSMA/ECA-DR, based on which stations can collaboratively achieve higher network performance.
In addition, an appropriate Contention Window (CW) will be chosen based on the instantaneously estimated number of active contenders in the network.
Simulation results from dense scenarios with traffic differentiation demonstrate that CSMA/ECA-DR can greatly improve the efficiency of WLANs for traffic differentiation even with large numbers of contenders.
The new frontier in cellular networks is harnessing the enormous spectrum available at millimeter wave (mmWave) frequencies above 28 GHz.
The challenging radio propagation characteristics at these frequencies, and the use of highly directional beamforming, lead to intermittent links between the base station (BS) and the user equipment (UE).
In this paper, we revisit the problem of cell selection to maintain an acceptable level of service, despite the underlying intermittent link connectivity typical of mmWave links.
We propose a Markov Decision Process (MDP) framework to study the properties and performance of our proposed cell selection strategy, which jointly considers several factors such as dynamic channel load and link quality.
We use the Value Iteration Algorithm (VIA) to solve the MDP, and obtain the optimal set of associations.
We address the multi user problem through a distributed iterative approach, in which each UE characterizes the evolution of the system based on stationary channel distribution and cell selection statistics of other UEs.
Through simulation results, we show that our proposed technique makes judicious handoff choices, thereby providing a significant improvement in the overall network capacity.
Further, our technique reduces the total number of handoffs, thus lowering the signaling overhead, while providing a higher quality of service to the UEs.
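The Value Iteration Algorithm itself can be sketched on a toy two-state MDP (the states, transitions, and rewards below are invented stand-ins for the channel-load and link-quality model of the paper):

```python
import numpy as np

# Generic value iteration on a toy 2-state, 2-action MDP.
P = np.array([  # P[a, s, s'] transition probabilities
    [[0.9, 0.1], [0.2, 0.8]],   # action 0 (e.g. stay on current cell)
    [[0.1, 0.9], [0.8, 0.2]],   # action 1 (e.g. hand off)
])
R = np.array([[1.0, 0.0],       # R[a, s] immediate reward
              [0.0, 1.0]])
gamma = 0.9

V = np.zeros(2)
for _ in range(500):            # Bellman optimality backups to a fixed point
    Q = R + gamma * P @ V       # Q[a, s]
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-9:
        V = V_new
        break
    V = V_new
policy = Q.argmax(axis=0)       # optimal association choice per state
```

Here each state admits an action with reward 1, so both values converge to 1/(1 - gamma) = 10 and the policy picks the rewarding action in each state.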
Deep convolutional networks have demonstrated the state-of-the-art performance on various medical image computing tasks.
Leveraging images from different modalities for the same analysis task holds clinical benefits.
However, the generalization capability of deep models on test data with different distributions remains a major challenge.
In this paper, we propose the PnP-AdaNet (plug-and-play adversarial domain adaptation network) for adapting segmentation networks between different modalities of medical images, e.g., MRI and CT. We propose to tackle the significant domain shift by aligning the feature spaces of source and target domains in an unsupervised manner.
Specifically, a domain adaptation module flexibly replaces the early encoder layers of the source network, and the higher layers are shared between domains.
With adversarial learning, we build two discriminators whose inputs are respectively multi-level features and predicted segmentation masks.
We have validated our domain adaptation method on cardiac structure segmentation in unpaired MRI and CT.
The experimental results with comprehensive ablation studies demonstrate the excellent efficacy of our proposed PnP-AdaNet.
Moreover, we introduce a novel benchmark on the cardiac dataset for the task of unsupervised cross-modality domain adaptation.
We will make our code and database publicly available, aiming to promote future studies on this challenging yet important research topic in medical imaging.
Offline signature verification is one of the most challenging tasks in biometrics and document forensics.
Unlike other verification problems, it needs to model minute but critical details between genuine and forged signatures, because a skilled forgery may closely resemble the genuine signature with only small deformations.
This verification task is even harder in writer-independent scenarios, which are undeniably crucial for realistic cases.
In this paper, we model an offline writer independent signature verification task with a convolutional Siamese network.
Siamese networks are twin networks with shared weights, which can be trained to learn a feature space where similar observations are placed in proximity.
This is achieved by exposing the network to a pair of similar and dissimilar observations and minimizing the Euclidean distance between similar pairs while simultaneously maximizing it between dissimilar pairs.
Experiments conducted on cross-domain datasets emphasize the capability of our network to model forgery in different languages (scripts) and handwriting styles.
Moreover, our designed Siamese network, named SigNet, exceeds the state-of-the-art results on most of the benchmark signature datasets, which paves the way for further research in this direction.
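The pull-together/push-apart training objective described above is commonly implemented as a contrastive loss; a minimal sketch (with an assumed margin of 1.0, not necessarily SigNet's hyperparameter) is:

```python
import numpy as np

def contrastive_loss(d, y, margin=1.0):
    # d: Euclidean distance between the two embeddings of a pair.
    # y: 1 for a similar (genuine-genuine) pair, 0 for a dissimilar
    #    (genuine-forged) pair.
    # Similar pairs are pulled together (penalty d^2); dissimilar pairs are
    # pushed apart until their distance reaches at least `margin`.
    return y * d ** 2 + (1 - y) * np.maximum(margin - d, 0.0) ** 2
```

Dissimilar pairs already farther apart than the margin incur zero loss, so the network focuses capacity on the hard forgeries.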
This paper addresses the problem of designing LDPC decoders robust to transient errors introduced by a faulty hardware.
We assume that the faulty hardware introduces errors during the message passing updates and we propose a general framework for the definition of the message update faulty functions.
Within this framework, we define symmetry conditions for the faulty functions, and derive two simple error models used in the analysis.
With this analysis, we propose a new interpretation of the functional Density Evolution threshold previously introduced, and show its limitations in case of highly unreliable hardware.
However, we show that under restricted decoder noise conditions, the functional threshold can be used to predict the convergence behavior of FAIDs under faulty hardware.
In particular, we reveal the existence of robust and non-robust FAIDs and propose a framework for the design of robust decoders.
We finally illustrate robust and non-robust decoders behaviors of finite length codes using Monte Carlo simulations.
Earlier formulations of the DNA assembly problem were all in the context of perfect assembly; i.e., given a set of reads from a long genome sequence, is it possible to perfectly reconstruct the original sequence?
In practice, however, it is very often the case that the read data is not sufficiently rich to permit unambiguous reconstruction of the original sequence.
While a natural generalization of the perfect assembly formulation to these cases would be to consider a rate-distortion framework, partial assemblies are usually represented in terms of an assembly graph, making the definition of a distortion measure challenging.
In this work, we introduce a distortion function for assembly graphs that can be understood as the logarithm of the number of Eulerian cycles in the assembly graph, each of which corresponds to a candidate assembly that could have generated the observed reads.
We also introduce an algorithm for the construction of an assembly graph and analyze its performance on real genomes.
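Because assembly graphs are Eulerian, this distortion can be computed in closed form via the BEST theorem; the following sketch (a generic illustration, not the paper's implementation) counts Eulerian circuits from a cofactor of the out-degree Laplacian:

```python
import numpy as np
from math import factorial, log

def log_num_eulerian_circuits(edges, n):
    # BEST theorem: ec(G) = tw(G) * prod_v (outdeg(v) - 1)! for a connected
    # digraph with indeg(v) == outdeg(v) at every vertex, where tw(G) is any
    # cofactor of the out-degree Laplacian (number of arborescences).
    # The distortion described in the text is log ec(G).
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] += 1
    L = np.diag(A.sum(axis=1)) - A          # out-degree Laplacian
    tw = round(np.linalg.det(L[1:, 1:]))    # cofactor: delete row/col 0
    ec = tw
    for v in range(n):
        ec *= factorial(int(A[v].sum()) - 1)
    return log(ec)

# A simple directed 3-cycle has exactly one Eulerian circuit,
# i.e. a unique candidate assembly and distortion 0.
print(log_num_eulerian_circuits([(0, 1), (1, 2), (2, 0)], 3))  # 0.0
```

A graph with more Eulerian circuits (more ambiguous assemblies) gets a strictly larger distortion.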
Much research has been conducted on both face identification and face verification, with greater focus on the latter.
Research on face identification has mostly focused on using closed-set protocols, which assume that all probe images used in evaluation contain identities of subjects that are enrolled in the gallery.
Real systems, however, where only a fraction of probe sample identities are enrolled in the gallery, cannot make this closed-set assumption.
Instead, they must assume an open set of probe samples and be able to reject/ignore those that correspond to unknown identities.
In this paper, we address the widespread misconception that thresholding verification-like scores is a good way to solve the open-set face identification problem, by formulating an open-set face identification protocol and evaluating different strategies for assessing similarity.
Our open-set identification protocol is based on the canonical labeled faces in the wild (LFW) dataset.
In addition to the known identities, we introduce the concepts of known unknowns (known, but uninteresting persons) and unknown unknowns (people never seen before) to the biometric community.
We compare three algorithms for assessing similarity in a deep feature space under an open-set protocol: thresholded verification-like scores, linear discriminant analysis (LDA) scores, and extreme value machine (EVM) probabilities.
Our findings suggest that thresholding EVM probabilities, which are open-set by design, outperforms thresholding verification-like scores.
Equating users' true needs and desires with behavioural measures of 'engagement' is problematic.
However, good metrics of 'true preferences' are difficult to define, as cognitive biases make people's preferences change with context and exhibit inconsistencies over time.
Yet, HCI research often glosses over the philosophical and theoretical depth of what it means to infer what users really want.
In this paper, we present an alternative yet very real discussion of this issue, via a fictive dialogue between senior executives in a tech company aimed at helping people live the life they `really' want to live.
How will the designers settle on a metric for their product to optimise?
Economies are instances of complex socio-technical systems that are shaped by the interactions of large numbers of individuals.
The individual behavior and decision-making of consumer agents is determined by complex psychological dynamics that include their own assessment of present and future economic conditions as well as those of others, potentially leading to feedback loops that affect the macroscopic state of the economic system.
We propose that the large-scale interactions of a nation's citizens with its online resources can reveal the complex dynamics of their collective psychology, including their assessment of future system states.
Here we introduce a behavioral index of Chinese Consumer Confidence (C3I) that computationally relates large-scale online search behavior recorded by Google Trends data to the macroscopic variable of consumer confidence.
Our results indicate that such computational indices may reveal the components and complex dynamics of consumer psychology as a collective socio-economic phenomenon, potentially leading to improved and more refined economic forecasting.
Information technologies today can inform each of us about the best alternatives for shortest paths from origins to destinations, but they do not contain incentives or alternatives that manage the information efficiently to get collective benefits.
To obtain such benefits, we need to have not only good estimates of how the traffic is formed but also to have target strategies to reduce enough vehicles from the best possible roads in a feasible way.
The opportunity is that during large events the traffic inconveniences in large cities are unusually high, yet temporary, and the entire population may be more willing to adopt collective recommendations for social good.
In this paper, we integrate for the first time big data resources to quantify the impact of events and propose target strategies for collective good at urban scale.
In the context of the Olympic Games in Rio de Janeiro, we first predict the expected increase in traffic.
To that end, we integrate data from: mobile phones, Airbnb, Waze, and transit information, with game schedules and information of venues.
Next, we evaluate the impact of the Olympic Games on the travel of commuters, and propose different route choice scenarios during the peak hours.
Moreover, we gather information on the trips that contribute the most to the global congestion and that could be redirected from vehicles to transit.
Interestingly, we show that (i) following new route alternatives during the event with individual shortest path can save more collective travel time than keeping the routine routes, uncovering the positive value of information technologies during events; (ii) with only a small proportion of people selected from specific areas switching from driving to public transport, the collective travel time can be reduced to a great extent.
Results are presented on-line for the evaluation of the public and policy makers.
Image Segmentation is a technique of partitioning the original image into some distinct classes.
Many possible solutions may be available for segmenting an image into a certain number of classes, each one having different quality of segmentation.
In our proposed method, multilevel thresholding technique has been used for image segmentation.
A new variant of Cuckoo Search (CS) is used to select the optimal threshold values.
In other words, the algorithm is used to obtain the best solution from the initial random threshold values, and a correlation function is used to evaluate the quality of a solution.
Finally, MSE and PSNR are measured to understand the segmentation quality.
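The two quality measures are standard; a minimal sketch of their computation for 8-bit images is:

```python
import numpy as np

def mse(original, segmented):
    # Mean squared error between the original image and its thresholded version.
    return np.mean((original.astype(float) - segmented.astype(float)) ** 2)

def psnr(original, segmented, max_val=255.0):
    # Peak signal-to-noise ratio in dB; higher values indicate the segmented
    # image is closer to the original.
    e = mse(original, segmented)
    return float("inf") if e == 0 else 10 * np.log10(max_val ** 2 / e)
```

Identical images give zero MSE (infinite PSNR); a maximally wrong 8-bit image (every pixel off by 255) gives a PSNR of 0 dB.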
Stochastic behaviors of resistive random access memory (RRAM) play an important role in the design of cross-point memory arrays.
A Monte Carlo compact model of oxide RRAM is developed and calibrated with experiments on various device stack configurations.
With Monte Carlo SPICE simulations, we show that an increase in array size and interconnect wire resistance will statistically deteriorate write functionality.
Write failure probability (WFP) has an exponential dependency on device uniformity and supply voltage (VDD), and the array bias scheme is a key knob.
Lowering array VDD leads to higher effective energy consumption (EEC) due to the increase in WFP when the variation statistics are included in the analysis.
Random-access simulations indicate that data sparsity statistically benefits write functionality and energy consumption.
Finally, we show that a pseudo-sub-array topology with uniformly distributed pre-forming cells in the pristine high resistance state is able to reduce both WFP and EEC, enabling higher net capacity for memory circuits due to improved variation tolerance.
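The statistical flavor of such an analysis can be sketched with a toy Monte Carlo model (illustrative numbers only; the paper uses a calibrated SPICE compact model with array parasitics):

```python
import numpy as np

rng = np.random.default_rng(0)

def write_failure_probability(vdd, v_threshold, sigma, n_trials=100_000):
    # Toy sketch: the effective voltage across a cell varies around VDD with
    # a device-to-device spread sigma (standing in for RRAM variation and
    # IR drop); a write fails when it falls below the switching threshold.
    # WFP rises sharply as VDD is lowered, echoing the exponential VDD
    # dependency noted in the abstract.
    v_cell = rng.normal(vdd, sigma, n_trials)
    return np.mean(v_cell < v_threshold)
```

Comparing a comfortable and a scaled supply voltage shows the trade-off: lowering VDD saves per-write energy but inflates WFP, which can raise the effective energy consumption once rewrites are accounted for.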
Path planning is typically considered in Artificial Intelligence as a graph searching problem, and R* is a state-of-the-art algorithm tailored to solve it.
The algorithm decomposes a given path-finding task into a series of subtasks, each of which can be easily (in the computational sense) solved by well-known methods (such as A*).
Parameterized random choice is used to perform the decomposition, and as a result R*'s performance largely depends on the choice of its input parameters.
In our work we formulate a range of assumptions concerning possible upper and lower bounds of R* parameters, their interdependency and their influence on R* performance.
Then we evaluate these assumptions by running a large number of experiments.
As a result we formulate a set of heuristic rules that can be used to initialize the values of the R* parameters in a way that leads to the algorithm's best performance.
Standard algorithms for finding the shortest path in a graph require that the cost of a path be additive in edge costs, and typically assume that costs are deterministic.
We consider the problem of uncertain edge costs, with potential probabilistic dependencies among the costs.
Although these dependencies violate the standard dynamic-programming decomposition, we identify a weaker stochastic consistency condition that justifies a generalized dynamic-programming approach based on stochastic dominance.
We present a revised path-planning algorithm and prove that it produces optimal paths under time-dependent uncertain costs.
We test the algorithm by applying it to a model of stochastic bus networks, and present empirical performance results comparing it to some alternatives.
Finally, we consider extensions of these concepts to a more general class of problems of heuristic search under uncertainty.
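The dominance test that replaces the scalar cost comparison can be sketched for discrete cost distributions (a generic first-order stochastic dominance check; the paper's weaker consistency condition is more refined):

```python
import numpy as np

def first_order_dominates(p_a, p_b):
    # Path-cost distribution A (weakly) first-order stochastically dominates B
    # if A's CDF lies everywhere at or above B's CDF over a shared, increasing
    # support of cost values -- i.e. low costs are at least as probable under A.
    # p_a, p_b: probability vectors over the same sorted cost support.
    cdf_a, cdf_b = np.cumsum(p_a), np.cumsum(p_b)
    return bool(np.all(cdf_a >= cdf_b - 1e-12))
```

A generalized dynamic-programming search then prunes a partial path only when another path to the same node dominates it, keeping all mutually non-dominated cost distributions instead of a single scalar label.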
Linear rules have played an increasing role in structural proof theory in recent years.
It has been observed that the set of all sound linear inference rules in Boolean logic is already coNP-complete, i.e. that every Boolean tautology can be written as a (left- and right-)linear rewrite rule.
In this paper we study properties of systems consisting only of linear inferences.
Our main result is that the length of any 'nontrivial' derivation in such a system is bounded by a polynomial.
As a consequence there is no polynomial-time decidable sound and complete system of linear inferences, unless coNP=NP.
We draw tools and concepts from term rewriting, Boolean function theory and graph theory in order to access some required intermediate results.
At the same time we make several connections between these areas that, to our knowledge, have not yet been presented and constitute a rich theoretical framework for reasoning about linear TRSs for Boolean logic.
Reinforcement learning has significant applications for multi-agent systems, especially in unknown dynamic environments.
However, most multi-agent reinforcement learning (MARL) algorithms suffer from such problems as exponential computation complexity in the joint state-action space, which makes it difficult to scale up to realistic multi-agent problems.
In this paper, a novel algorithm named negotiation-based MARL with sparse interactions (NegoSI) is presented.
In contrast to traditional sparse-interaction based MARL algorithms, NegoSI adopts the equilibrium concept and makes it possible for agents to select the non-strict Equilibrium Dominating Strategy Profile (non-strict EDSP) or Meta equilibrium for their joint actions.
The presented NegoSI algorithm consists of four parts: the equilibrium-based framework for sparse interactions, the negotiation for the equilibrium set, the minimum variance method for selecting one joint action and the knowledge transfer of local Q-values.
In this integrated algorithm, three techniques, i.e., unshared value functions, equilibrium solutions and sparse interactions are adopted to achieve privacy protection, better coordination and lower computational complexity, respectively.
To evaluate the performance of the presented NegoSI algorithm, two groups of experiments are carried out regarding three criteria: steps of each episode (SEE), rewards of each episode (REE) and average runtime (AR).
The first group of experiments is conducted using six grid world games and shows fast convergence and high scalability of the presented algorithm.
Then in the second group of experiments NegoSI is applied to an intelligent warehouse problem and simulated results demonstrate the effectiveness of the presented NegoSI algorithm compared with other state-of-the-art MARL algorithms.
Logic programming provides a very high-level view of programming, which comes at the cost of some execution efficiency.
Improving performance of logic programs is thus one of the holy grails of Prolog system implementations and a wide range of approaches have historically been taken towards this goal.
Designing computational models that both exploit the available parallelism in a given application and that try hard to reduce the explored search space has been an ongoing line of research for many years.
These goals in particular have motivated the design of several computational models, one of which is the Extended Andorra Model (EAM).
In this paper, we present a preliminary specification and implementation of the EAM with Implicit Control, the WAM2EAM, which supplies regular WAM instructions with an EAM-centered interpretation.
We present some of the experiments we have performed to best test our design for a library for MathScheme, the mechanized mathematics software system we are building.
We wish for our library design to use and reflect, as much as possible, the mathematical structure present in the objects which populate the library.
This manual describes the competition software for the Simulated Car Racing Championship, an international competition held at major conferences in the field of Evolutionary Computation and in the field of Computational Intelligence and Games.
It provides an overview of the architecture, the instructions to install the software and to run the simple drivers provided in the package, the description of the sensors and the actuators.
In this paper, we introduce a rule-based approach to annotate Locative and Directional Expressions in Arabic natural language text.
The annotation is based on a constructed semantic map of the spatiality domain.
Challenges are twofold: first, we need to study how locative and directional expressions are expressed linguistically in these texts; and second, we need to automatically annotate the relevant textual segments accordingly.
The research method we will use in this article is analytic-descriptive.
We validate this approach on a specific novel rich in these expressions and show that it achieves very promising results.
We will be using NOOJ as a software tool to implement finite-state transducers to annotate linguistic elements according to Locative and Directional Expressions.
In conclusion, NOOJ allowed us to write linguistic rules for the automatic annotation in Arabic text of Locative and Directional Expressions.
Orthogonal frequency division multiplexing (OFDM) and single-carrier frequency domain equalization (SC-FDE) are two commonly adopted modulation schemes for frequency-selective channels.
Compared to SC-FDE, OFDM generally achieves higher data rate, but at the cost of higher transmit signal peak-to-average power ratio (PAPR) that leads to lower power amplifier efficiency.
This paper proposes a new modulation scheme, called flexible multi-group single-carrier (FMG-SC), which encapsulates both OFDM and SC-FDE as special cases, thus achieving more flexible rate-PAPR trade-offs between them.
Specifically, a set of frequency subcarriers are flexibly divided into orthogonal groups based on their channel gains, and SC-FDE is applied over each of the groups to send different data streams in parallel.
We aim to maximize the achievable sum-rate of all groups by optimizing the subcarrier-group mapping.
We propose two low-complexity subcarrier grouping methods and show via simulation that they perform very close to the optimal grouping by exhaustive search.
Simulation results also show the effectiveness of the proposed FMG-SC modulation scheme with optimized subcarrier grouping in improving the rate-PAPR trade-off over conventional OFDM and SC-FDE.
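The grouping idea can be sketched as follows: subcarriers are ordered by channel gain and split into contiguous groups, with each group carrying one SC-FDE stream. The rate expression below is only an illustrative harmonic-mean proxy for SC-FDE equalization loss, not the paper's exact objective or grouping rule.

```python
import numpy as np

def group_subcarriers(gains, num_groups):
    """Greedy sketch: sort subcarriers by channel gain and split the
    sorted order into contiguous groups (one SC-FDE stream per group)."""
    order = np.argsort(gains)[::-1]          # strongest subcarriers first
    return np.array_split(order, num_groups)

def sum_rate(gains, groups, snr=1.0):
    """Illustrative sum-rate proxy: each group's per-symbol rate is the
    log of a harmonic-mean effective gain, modeling SC-FDE averaging."""
    total = 0.0
    for g in groups:
        eff = len(g) / np.sum(1.0 / (1.0 + snr * gains[g]))
        total += len(g) * np.log2(eff)
    return total
```

Grouping similar-gain subcarriers together keeps the harmonic mean close to the arithmetic mean in each group, which is the intuition for sorting before splitting.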
In 2013, Tsai et al. cryptanalyzed Yeh et al.'s scheme, showed that it is vulnerable to various cryptographic attacks, and proposed an improved scheme.
In this poster we show that Tsai et al.'s scheme is also vulnerable to an undetectable online password guessing attack; on success of the attack, the adversary can mount all major cryptographic attacks.
As part of our contribution, we propose an improved scheme that overcomes the defects in the Tsai et al. and Yeh et al. schemes.
Breast cancer is the second most common malignancy among women and has become a major public health problem in current society.
Traditional breast cancer identification requires experienced pathologists to carefully read the breast slice, which is laborious and suffers from inter-observer variations.
Consequently, an automatic classification framework for breast cancer identification is worthwhile to develop.
Recent years have witnessed the development of deep learning techniques, and an increasing number of medical applications use deep learning to improve diagnosis accuracy.
In this paper, we propose a novel training strategy, namely reversed active learning (RAL), to train a network to automatically classify breast cancer images.
Our RAL is applied to the training set of a simple convolutional neural network (CNN) to remove mislabeled images.
We evaluate the CNN trained with RAL on the publicly available ICIAR 2018 Breast Cancer Dataset (IBCD).
The experimental results show that our RAL increases the slice-based accuracy of CNN from 93.75% to 96.25%.
We present the first sample compression algorithm for nearest neighbors with non-trivial performance guarantees.
We complement these guarantees by demonstrating almost matching hardness lower bounds, which show that our bound is nearly optimal.
Our result yields new insight into margin-based nearest neighbor classification in metric spaces and allows us to significantly sharpen and simplify existing bounds.
Some encouraging empirical results are also presented.
There are several distinct failure modes for overoptimization of systems on the basis of metrics.
This occurs when a metric that can be used to improve a system is used to such an extent that further optimization is ineffective or harmful; the phenomenon is sometimes termed Goodhart's Law.
This class of failure is often poorly understood, partly because the terminology for discussing it is ambiguous, and partly because discussion using this ambiguous terminology ignores distinctions between the different failure modes of this general type.
This paper expands on an earlier discussion by Garrabrant, which notes there are "(at least) four different mechanisms" that relate to Goodhart's Law.
This paper is intended to explore these mechanisms further, and specify more clearly how they occur.
This discussion should be helpful in better understanding these types of failures in economic regulation, in public policy, in machine learning, and in Artificial Intelligence alignment.
The importance of Goodhart effects depends on the amount of power directed towards optimizing the proxy, and so the increased optimization power offered by artificial intelligence makes it especially critical for that field.
Literary works reference a variety of globally shared themes including well-known people, events, and time periods.
It is particularly interesting to locate patterns that are either invariant across time or exhibit a characteristic change across time, as they could imply something important about society that those works record.
This paper suggests the use of Google n-gram viewer as a fast prototyping method for examining time-based properties over a rich sample of literary prose.
Using this method, we find that some repeating periods of time, like Sunday, are referenced disproportionally, allowing us to pose questions such as why a day like Thursday is so unpopular.
Furthermore, by treating software as a work of prose, we can apply a similar analysis to open-source software repositories and explore time-based relations in commit logs.
Doing a simple statistical analysis on a few temporal keywords in the log records, we reinforce and weaken a few beliefs on how college students approach open source software.
Finally, we help readers working on their own temporal analyses by comparing the fundamental differences between literary works and code repositories, and suggest blogs and wikis as recently emerging forms of such works.
This paper presents a new way to study registration-based trackers by decomposing them into three constituent sub-modules: appearance model, state space model, and search method.
It is often the case that when a new tracker is introduced in the literature, it only contributes to one or two of these sub-modules while using existing methods for the rest.
Since these are often selected arbitrarily by the authors, they may not be optimal for the new method.
In such cases, our breakdown can help to experimentally find the best combination of methods for these sub-modules, while also providing a framework within which the contributions of the new tracker can be clearly demarcated and thus studied better.
We show how existing trackers can be broken down using the suggested methodology and compare the performance of the default configuration chosen by the authors against other possible combinations to demonstrate the new insights that can be gained by such an approach.
We also present an open source system that provides a convenient interface to plug in a new method for any sub-module and test it against all possible combinations of methods for the other two sub-modules, while also serving as a fast and efficient solution for practical tracking requirements.
The inability to interpret the model prediction in semantically and visually meaningful ways is a well-known shortcoming of most existing computer-aided diagnosis methods.
In this paper, we propose MDNet to establish a direct multimodal mapping between medical images and diagnostic reports; it can read images, generate diagnostic reports, retrieve images by symptom description, and visualize attention, providing justifications for the network's diagnosis process.
MDNet includes an image model and a language model.
The image model is proposed to enhance multi-scale feature ensembles and utilization efficiency.
The language model, integrated with our improved attention mechanism, aims to read and explore discriminative image feature descriptions from reports to learn a direct mapping from sentence words to image pixels.
The overall network is trained end-to-end by using our developed optimization strategy.
Based on a pathology bladder cancer image and diagnostic report (BCIDR) dataset, we conduct extensive experiments to demonstrate that MDNet outperforms comparative baselines.
The proposed image model obtains state-of-the-art performance on two CIFAR datasets as well.
In state-of-the-art Neural Machine Translation (NMT), an attention mechanism is used during decoding to enhance the translation.
At every step, the decoder uses this mechanism to focus on different parts of the source sentence to gather the most useful information before outputting its target word.
Recently, the effectiveness of the attention mechanism has also been explored for multimodal tasks, where it becomes possible to focus both on sentence parts and image regions that they describe.
In this paper, we compare several attention mechanisms on the multimodal translation task (English, image to German) and evaluate the ability of the model to make use of images to improve translation.
We surpass state-of-the-art scores on the Multi30k data set; we nevertheless identify and report different misbehaviors of the model while translating.
This paper develops new theory and algorithms to recover signals that are approximately sparse in some general dictionary (i.e., a basis, frame, or over-/incomplete matrix) but corrupted by a combination of interference having a sparse representation in a second general dictionary and measurement noise.
The algorithms and analytical recovery conditions consider varying degrees of signal and interference support-set knowledge.
Particular applications covered by the proposed framework include the restoration of signals impaired by impulse noise, narrowband interference, or saturation/clipping, as well as image in-painting, super-resolution, and signal separation.
Two application examples for audio and image restoration demonstrate the efficacy of the approach.
A sliding super point is a host, defined under a sliding time window, that is contacted by a huge number of other hosts.
It plays an important role in network security and management.
However, detecting sliding super points in real time in today's high-speed networks, which span several distributed routers, is a hard task.
Distributed sliding super point detection requires an algorithm that can estimate the number of contacting hosts incrementally, scan packets faster than they arrive, and reconstruct sliding super points at the end of a time period.
No existing algorithm satisfies these three requirements simultaneously.
To solve this problem, this paper proposes the first distributed sliding super point detection algorithm running on a GPU.
The advantage of this algorithm comes from a novel sliding estimator, which can estimate the number of contacting hosts incrementally under a sliding window, and a set of reversible hash functions, by which sliding super points can be recovered without storing additional data such as an IP list.
There are two main procedures in this algorithm: packets scanning and sliding super points reconstruction.
Both can run in parallel without any data-read conflicts.
When deployed on a low-cost GPU, this algorithm can process traffic at bandwidths as high as 680 Gb/s.
A real-world core network trace is used to evaluate the performance of this sliding super point detection algorithm on a cheap GPU, an Nvidia GTX950 with 4 GB of graphics memory.
Experiments comparing it with other algorithms under a discrete time window show that this algorithm has the highest accuracy.
Under a sliding time window, where no other algorithm can operate, it achieves the same performance as in the discrete-window case.
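A minimal sketch of the incremental sliding-estimator idea: one last-seen timestamp per hashed source bucket, with a linear-counting correction over buckets touched inside the window. This is a coarse stand-in for the paper's estimator and reversible-hash reconstruction, not its actual data structure.

```python
import math

class SlidingContactEstimator:
    """One last-seen timestamp per hashed source bucket; buckets touched
    within the window give a linear-counting estimate of the number of
    distinct contacting hosts (illustrative, not the paper's design)."""

    def __init__(self, num_buckets=1024, window=300.0):
        self.window = window
        self.num_buckets = num_buckets
        self.last_seen = [None] * num_buckets

    def observe(self, src_ip, now):
        # incremental update: O(1) per packet, overwriting the timestamp
        self.last_seen[hash(src_ip) % self.num_buckets] = now

    def estimate(self, now):
        active = sum(1 for t in self.last_seen
                     if t is not None and now - t <= self.window)
        if active == self.num_buckets:
            return float('inf')          # sketch saturated
        # linear counting: n ~= -m * ln(1 - active / m)
        return -self.num_buckets * math.log(1.0 - active / self.num_buckets)
```

Timestamps expire implicitly as the window slides, which is what lets the estimate be maintained incrementally without rescanning old packets.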
We study a two-level uncapacitated lot-sizing problem with inventory bounds that occurs in a supply chain composed of a supplier and a retailer.
The first level with the demands is the retailer level and the second one is the supplier level.
The aim is to minimize the cost of the supply chain so as to satisfy the demands when the quantity of item that can be held in inventory at each period is limited.
The inventory bounds can be imposed at the retailer level, at the supplier level or at both levels.
We propose a polynomial dynamic programming algorithm to solve this problem when the inventory bounds are set on the retailer level.
When the inventory bounds are set on the supplier level, we show that the problem is NP-hard.
We give a pseudo-polynomial algorithm which solves this problem when there are inventory bounds on both levels.
In the case where demand lot-splitting is not allowed, i.e. each demand has to be satisfied by a single order, we prove that the uncapacitated lot-sizing problem with inventory bounds is strongly NP-hard.
This implies that the two-level lot-sizing problems with inventory bounds are also strongly NP-hard when demand lot-splitting is considered.
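For context, the classical single-level uncapacitated lot-sizing dynamic program (Wagner-Whitin), which the problems above extend with inventory bounds, can be sketched as follows; the bound-feasibility checks of the paper's algorithms are omitted.

```python
def wagner_whitin(demand, setup, hold):
    """Single-level uncapacitated lot-sizing DP: dp[t] is the minimum
    cost of covering demands for periods 1..t, choosing the period j of
    the last order. No inventory bounds are enforced in this sketch."""
    T = len(demand)
    INF = float('inf')
    dp = [0.0] + [INF] * T
    for t in range(1, T + 1):
        for j in range(1, t + 1):        # last order placed in period j
            # holding cost of carrying period-k demand from j to k
            carry = sum(hold * (k - j) * demand[k - 1]
                        for k in range(j, t + 1))
            dp[t] = min(dp[t], dp[j - 1] + setup + carry)
    return dp[T]
```

Retailer-level inventory bounds would restrict which periods j are feasible for a given t, which is where the paper's polynomial algorithm departs from this textbook recursion.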
Methods for teaching machines to answer visual questions have made significant progress in the last few years, but although demonstrating impressive results on particular datasets, these methods lack some important human capabilities, including integrating new visual classes and concepts in a modular manner, providing explanations for the answer and handling new domains without new examples.
In this paper we present a system that achieves state-of-the-art results on the CLEVR dataset without any question-answer training, utilizes real visual estimators, and explains its answers.
The system includes a question representation stage followed by an answering procedure, which invokes an extendable set of visual estimators.
It can explain the answer, including its failures, and provide alternatives to negative answers.
The scheme builds upon a framework proposed recently, with extensions allowing the system to deal with novel domains without relying on training examples.
Technical Universities (TUs) exhibit a distinct ranking performance in comparison with other universities.
In this paper we identify 137 TUs included in the THE Ranking (2017 edition) and analyse their scores statistically.
The results highlight the existence of clusters of TUs showing a general high performance in the Industry Income category and, in many cases, a low performance on Research and Teaching.
Finally, the global score weights were simulated, creating several scenarios that confirmed that the majority of TUs (except those with a world-class status) would increase their final scores if industrial income was accounted for at the levels parametrised.
Communication systems for multicasting information and energy simultaneously to more than one user are investigated.
In the system under study, a transmitter sends the same message and signal to multiple receivers over distinct and independent channels.
In this setting, results for compound channels are applied to relate the operational compound capacity to the informational measurements.
The fundamental limit under a received energy constraint, called the multicast capacity-energy function, is studied and a single-letter expression is derived.
The ideas are illustrated via a numerical example with two receivers.
The problem of receiver segmentation, in which the receivers are divided into several groups, is also considered.
Over-segmentation, or super-pixel generation, is a common preliminary stage for many computer vision applications.
New acquisition technologies enable the capturing of 3D point clouds that contain color and geometrical information.
This 3D information introduces a new conceptual change that can be utilized to improve the results of over-segmentation, which uses mainly color information, and to generate clusters of points we call super-points.
We consider a variety of possible 3D extensions of the Local Variation (LV) graph based over-segmentation algorithms, and compare them thoroughly.
We consider different alternatives for constructing the connectivity graph, for assigning the edge weights, and for defining the merge criterion, which must now account for the geometric information and not only color.
Following this evaluation, we derive a new generic algorithm for over-segmentation of 3D point clouds.
We call this new algorithm Point Cloud Local Variation (PCLV).
The advantages of the new over-segmentation algorithm are demonstrated on both outdoor and cluttered indoor scenes.
Performance analysis of the proposed approach compared to state-of-the-art 2D and 3D over-segmentation algorithms shows significant improvement according to the common performance measures.
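The LV merge criterion that the 3D extensions build on can be sketched as a union-find pass over edges sorted by weight. How the edge weights mix color and 3D geometry is exactly what the paper evaluates, so the weight here is just a caller-supplied number.

```python
class DSU:
    """Union-find tracking per-component size and internal variation
    (the largest edge weight merged inside the component)."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

def local_variation_segment(n_points, edges, k=1.0):
    """Local-Variation style merging: process edges (weight, a, b) by
    increasing weight; merge two components when the edge weight does
    not exceed either component's internal variation plus k/size."""
    dsu = DSU(n_points)
    for w, a, b in sorted(edges):
        ra, rb = dsu.find(a), dsu.find(b)
        if ra == rb:
            continue
        if w <= min(dsu.internal[ra] + k / dsu.size[ra],
                    dsu.internal[rb] + k / dsu.size[rb]):
            dsu.parent[rb] = ra
            dsu.size[ra] += dsu.size[rb]
            dsu.internal[ra] = max(dsu.internal[ra], dsu.internal[rb], w)
    return [dsu.find(i) for i in range(n_points)]
```

The design questions the paper studies (connectivity graph, weight definition, merge criterion) all plug into this same skeleton.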
This article presents an experiment aimed at clarifying the transfer efficiency of a database in a cloud infrastructure.
A control unit was added to the system to direct database searches either to the local part or to the cloud.
It is shown that the data-acquisition time remains unchanged as a result of this modification.
Suggestions are made on applying the theory of dynamic systems to hybrid cloud databases.
The present work aims to attract the attention of specialists in the field of cloud databases to the apparatus of control theory.
The experiment presented in this article allows the described methods to be applied to solving important practical problems.
Modern networks are large, highly complex and dynamic.
Add to that the mobility of the agents comprising many of these networks.
It is difficult or even impossible for such systems to be managed centrally in an efficient manner.
It is imperative for such systems to attain a degree of self-management.
Self-healing, i.e., the capability of a system in a good state to recover to another good state in the face of an attack, is desirable for such systems.
In this paper, we discuss the self-healing model for dynamic reconfigurable systems.
In this model, an omniscient adversary inserts or deletes nodes from a network and the algorithm responds by adding a limited number of edges in order to maintain invariants of the network.
We look at some of the results in this model and argue for their applicability and further extensions of the results and the model.
We also look at some of the techniques we have used in our earlier work, in particular, we look at the idea of maintaining virtual graphs mapped over the existing network and assert that this may be a useful technique to use in many problem domains.
Query expansion is a method for alleviating the vocabulary mismatch problem present in information retrieval tasks.
Previous works have shown that terms selected for query expansion by traditional methods such as pseudo-relevance feedback are not always helpful to the retrieval process.
In this paper, we show that this is also true for more recently proposed embedding-based query expansion methods.
We then introduce an artificial neural network classifier to predict the usefulness of query expansion terms.
This classifier uses term word embeddings as inputs.
Experiments on four TREC newswire and web collections show that using terms selected by the classifier for expansion significantly improves retrieval performance when compared to competitive baselines.
The results are also shown to be more robust than the baselines.
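The selection step can be sketched as a scorer over term embeddings: terms whose predicted usefulness clears a threshold are kept for expansion. The linear weights below are hypothetical placeholders for the trained neural classifier.

```python
import numpy as np

def score_terms(term_vecs, w, b):
    """Sigmoid usefulness score for each candidate expansion term,
    computed from its word embedding (w, b stand in for the network)."""
    logits = term_vecs @ w + b
    return 1.0 / (1.0 + np.exp(-logits))

def select_terms(terms, term_vecs, w, b, threshold=0.5):
    """Keep only the terms the classifier predicts to be useful."""
    probs = score_terms(term_vecs, w, b)
    return [t for t, p in zip(terms, probs) if p >= threshold]
```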
This paper presents a novel approach for learning self-awareness models for autonomous vehicles.
The proposed technique is based on the availability of synchronized multi-sensor dynamic data related to different maneuvering tasks performed by a human operator.
It is shown that different machine learning approaches can be used to first learn single modality models using coupled Dynamic Bayesian Networks; such models are then correlated at event level to discover contextual multi-modal concepts.
In the presented case, visual perception and localization are used as modalities.
Cross-correlations among modalities over time are discovered from data and described as probabilistic links connecting shared and private multi-modal DBNs at the event (discrete) level.
Results are presented from experiments performed on an autonomous vehicle, highlighting the potential of the proposed approach to enable anomaly detection and autonomous decision making based on learned self-awareness models.
The relation between Science (what we can explain) and Art (what we can't) has long been acknowledged and while every science contains an artistic part, every art form also needs a bit of science.
Among all scientific disciplines, programming holds a special place for two reasons.
First, the artistic part is not only undeniable but also essential.
Second, and much like in a purely artistic discipline, the act of programming is driven partly by the notion of aesthetics: the pleasure we have in creating beautiful things.
Even though the importance of aesthetics in the act of programming is now unquestioned, more could still be written on the subject.
The field called "psychology of programming" focuses on the cognitive aspects of the activity, with the goal of improving the productivity of programmers.
While many scientists have emphasized their concern for aesthetics and the impact it has on their activity, few computer scientists have actually written about their thought process while programming.
What makes us like or dislike such and such language or paradigm?
Why do we shape our programs the way we do?
By answering these questions from the angle of aesthetics, we may be able to shed some new light on the art of programming.
Starting from the assumption that aesthetics is an inherently transversal dimension, it should be possible for every programmer to find the same aesthetic driving force in every creative activity they undertake, not just programming, and in doing so, get deeper insight on why and how they do things the way they do.
On the other hand, because our aesthetic sensitivities are so personal, all we can really do is relate our own experiences and share it with others, in the hope that it will inspire them to do the same.
My personal life has been revolving around three major creative activities, of equal importance: programming in Lisp, playing Jazz music, and practicing Aikido.
But why so many of them, why so different ones, and why these specifically?
By introspecting my personal aesthetic sensitivities, I eventually realized that my tastes in the scientific, artistic, and physical domains are all motivated by the same driving forces, hence unifying Lisp, Jazz, and Aikido as three expressions of a single essence, not so different after all.
Lisp, Jazz, and Aikido are governed by a limited set of rules which remain simple and unobtrusive.
Conforming to them is a pleasure.
Because Lisp, Jazz, and Aikido are inherently introspective disciplines, they also invite you to transgress the rules in order to find your own.
Breaking the rules is fun.
Finally, if Lisp, Jazz, and Aikido unify so many paradigms, styles, or techniques, it is not by mere accumulation but because they live at the meta-level and let you reinvent them.
Working at the meta-level is an enlightening experience.
Understand your aesthetic sensitivities and you may gain considerable insight on your own psychology of programming.
Mine is perhaps common to most lispers.
Perhaps also common to other programming communities, but that is for the reader to decide...
We propose a method to improve traditional character-based PPM text compression algorithms.
Considering a text file as a sequence of alternating words and non-words, the basic idea of our algorithm is to encode non-words and prefixes of words using character-based context models and to encode suffixes of words using dictionary models.
By using dictionary models, the algorithm can encode multiple characters as a whole, and thus enhance the compression efficiency.
The advantages of the proposed algorithm are: 1) it does not require any text preprocessing; 2) it does not need any explicit codeword to signal switches between context and dictionary models; 3) it can be applied to any character-based PPM algorithm without incurring much additional computational cost.
Test results show that significant improvements can be obtained over character-based PPM, especially in low order cases.
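The codeword-free switching in point 2 can be illustrated with a simple tokenizer: because word and non-word tokens strictly alternate, the decoder always knows which model comes next. The prefix length in split_word is an illustrative choice, not the paper's.

```python
import re

TOKEN = re.compile(r'[A-Za-z]+|[^A-Za-z]+')

def tokenize(text):
    """Split text into an alternating word / non-word stream; the
    alternation itself signals the context-vs-dictionary model switch."""
    return TOKEN.findall(text)

def split_word(word, prefix_len=2):
    """Encode the first prefix_len characters with the character-based
    context model and hand the suffix to the dictionary model."""
    return word[:prefix_len], word[prefix_len:]
```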
Dodis et al. proposed an improved version of the fuzzy vault scheme, one of the most popular primitives used in biometric cryptosystems, requiring less storage and leaking less information.
Recently, Blanton and Aliasgari have shown that the relation of two improved fuzzy vault records of the same individual may be determined by solving a system of non-linear equations.
However, they conjectured that this is feasible for small parameters only.
In this paper, we present a new attack against the improved fuzzy vault scheme based on the extended Euclidean algorithm that determines if two records are related and recovers the elements by which the protected features, e.g., the biometric templates, differ.
Our theoretical and empirical analysis demonstrates that the attack is very effective and efficient for practical parameters.
Furthermore, we show how this attack can be extended to fully recover both feature sets from related vault records much more efficiently than possible by attacking each record individually.
We complement this work by deriving lower bounds for record multiplicity attacks and use these to show that our attack is asymptotically optimal in an information theoretic sense.
Finally, we propose remedies to harden the scheme against record multiplicity attacks.
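For readers unfamiliar with the core tool, the integer form of the extended Euclidean algorithm is sketched below; the attack itself runs the analogous recursion over the polynomials stored in vault records.

```python
def egcd(a, b):
    """Extended Euclidean algorithm: returns (g, x, y) such that
    g = gcd(a, b) = a*x + b*y."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y
```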
In a reversible language, any forward computation can be undone by a finite sequence of backward steps.
Reversible computing has been studied in the context of different programming languages and formalisms, where it has been used for testing and verification, among others.
In this paper, we consider a subset of Erlang, a functional and concurrent programming language based on the actor model.
We present a formal semantics for reversible computation in this language and prove its main properties, including its causal consistency.
We also build on top of it a rollback operator that can be used to undo the actions of a process up to a given checkpoint.
Network slicing to enable resource sharing among multiple tenants --network operators and/or services-- is considered a key functionality for next generation mobile networks.
This paper provides an analysis of a well-known model for resource sharing, the 'share-constrained proportional allocation' mechanism, to realize network slicing.
This mechanism enables tenants to reap the performance benefits of sharing, while retaining the ability to customize their own users' allocation.
This results in a network slicing game in which each tenant reacts to the user allocations of the other tenants so as to maximize its own utility.
We show that, under appropriate conditions, the game associated with such strategic behavior converges to a Nash equilibrium.
At the Nash equilibrium, a tenant always achieves the same, or better, performance than under a static partitioning of resources, hence providing the same level of protection as such static partitioning.
We further analyze the efficiency and fairness of the resulting allocations, providing tight bounds for the price of anarchy and envy-freeness.
Our analysis and extensive simulation results confirm that the mechanism provides a comprehensive practical solution to realize network slicing.
Our theoretical results also fill a gap in the literature regarding the analysis of this resource allocation model under strategic players.
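At a single resource, the share-constrained proportional allocation mechanism can be sketched as follows. This is a simplified one-resource view; the paper's model spans many resources and strategic weight updates by tenants.

```python
def scpa_allocate(weights):
    """Share-constrained proportional allocation at one resource:
    weights[o] lists the per-user weights tenant o assigns to its users
    present there (summing to at most tenant o's network share); each
    user receives the resource fraction proportional to its weight."""
    total = sum(w for ws in weights.values() for w in ws)
    return {o: [w / total for w in ws] for o, ws in weights.items()}
```

Because each tenant controls only how its own share is subdivided, customization is preserved while the proportional rule shares the resource across tenants.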
Issues regarding explainable AI involve four components: users, laws & regulations, explanations and algorithms.
Together these components provide a context in which explanation methods can be evaluated regarding their adequacy.
The goal of this chapter is to bridge the gap between expert users and lay users.
Different kinds of users are identified and their concerns revealed, relevant statements from the General Data Protection Regulation are analyzed in the context of Deep Neural Networks (DNNs), a taxonomy for the classification of existing explanation methods is introduced, and finally, the various classes of explanation methods are analyzed to verify if user concerns are justified.
Overall, it is clear that (visual) explanations can be given about various aspects of the influence of the input on the output.
However, it is noted that explanation methods or interfaces for lay users are missing, and we speculate which criteria these methods/interfaces should satisfy.
Finally it is noted that two important concerns are difficult to address with explanation methods: the concern about bias in datasets that leads to biased DNNs, as well as the suspicion about unfair outcomes.
Air quality forecasting has been regarded as the key problem of air pollution early warning and control management.
In this paper, we propose a novel deep learning model for air quality (mainly PM2.5) forecasting, which learns the spatial-temporal correlation features and interdependence of multivariate air quality related time series data by hybrid deep learning architecture.
Due to the nonlinear and dynamic characteristics of multivariate air quality time series data, the base modules of our model include one-dimensional Convolutional Neural Networks (CNN) and Bi-directional Long Short-term Memory networks (Bi-LSTM).
The former is to extract the local trend features and the latter is to learn long temporal dependencies.
We then design a joint hybrid deep learning framework, based on the one-dimensional CNN and Bi-LSTM, for learning shared representation features of multivariate air quality related time series data.
The experimental results show that our model is capable of PM2.5 air pollution forecasting with satisfactory accuracy.
Near-miss experiences are one of the main sources of intense emotions.
Despite people's consistency when judging near-miss situations and when communicating about them, there is no integrated theoretical account of the phenomenon.
In particular, individuals' reaction to near-miss situations is not correctly predicted by rationality-based or probability-based optimization.
The present study suggests that emotional intensity in the case of near-miss is in part predicted by Simplicity Theory.
To improve the efficiency of surgical trajectory segmentation for robot learning in robot-assisted minimally invasive surgery, this paper presents a fast unsupervised method using video and kinematic data, followed by a promoting procedure to address the over-segmentation issue.
Unsupervised deep learning network, stacking convolutional auto-encoder, is employed to extract more discriminative features from videos in an effective way.
To further improve the accuracy of segmentation, on the one hand, a wavelet transform is used to filter out the noise present in the features from the video and kinematic data.
On the other hand, the segmentation result is promoted by identifying the adjacent segments with no state transition based on the predefined similarity measurements.
Extensive experiments on the public JIGSAWS dataset show that our method achieves much higher segmentation accuracy than state-of-the-art methods in a shorter time.
Current state-of-the-art approaches to skeleton-based action recognition are mostly based on recurrent neural networks (RNN).
In this paper, we propose a novel convolutional neural network (CNN) based framework for both action classification and detection.
Raw skeleton coordinates as well as skeleton motion are fed directly into the CNN for label prediction.
A novel skeleton transformer module is designed to rearrange and select important skeleton joints automatically.
With a simple 7-layer network, we obtain 89.3% accuracy on the validation set of the NTU RGB+D dataset.
For action detection in untrimmed videos, we develop a window proposal network to extract temporal segment proposals, which are further classified within the same network.
On the recent PKU-MMD dataset, we achieve 93.7% mAP, surpassing the baseline by a large margin.
To better detect pedestrians of various scales, deep multi-scale methods usually assign pedestrians of different scales to different in-network layers.
However, the semantic levels of features from different layers are usually inconsistent.
In this paper, we propose a multi-branch and high-level semantic network by gradually splitting a base network into multiple different branches.
As a result, the different branches have the same depth and the output features of different branches have similarly high-level semantics.
Due to the difference of receptive fields, the different branches are suitable to detect pedestrians of different scales.
Meanwhile, the multi-branch network does not introduce additional parameters by sharing convolutional weights of different branches.
To further improve detection performance, skip-layer connections among different branches are used to add context to the branch with a relatively small receptive field, and dilated convolution is incorporated into some branches to enlarge the resolutions of output feature maps.
When the branches are embedded into the Faster RCNN architecture, we further propose weighting the scores of the proposal generation network and the proposal classification network.
Experiments on KITTI dataset, Caltech pedestrian dataset, and Citypersons dataset demonstrate the effectiveness of proposed method.
On these pedestrian datasets, the proposed method achieves state-of-the-art detection performance.
Moreover, experiments on COCO benchmark show the proposed method is also suitable for general object detection.
This paper presents an iterative smoothing technique for polygonal approximation of digital image boundary.
The technique starts with finest initial segmentation points of a curve.
The contribution of each initial segmentation point towards preserving the original shape of the image boundary is determined by computing a significance measure for every such point that is sensitive to sharp turns, which are easily missed when conventional significance measures are used for detecting dominant points.
The proposed method differentiates between the situation in which a curve point lying between two other points projects directly onto the connecting line segment and the situation in which it projects beyond that segment.
It not only identifies these situations, but also computes the point's significance contribution differently in each of them.
This situation-specific treatment allows points with high curvature to be preserved even as the revised set of dominant points is derived.
The experimental results show that the proposed technique competes well with the state of the art techniques.
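The distinction between a point projecting directly onto a segment and projecting beyond it reduces to the normalized projection parameter; a minimal sketch of that geometric test (names are hypothetical, not taken from the paper):

```python
def project_onto_segment(p, a, b):
    """Return (t, distance): t is the normalized projection parameter of
    point p onto segment a-b; t outside [0, 1] means p projects beyond
    the segment's endpoints, so distance is taken to the nearest endpoint."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a == b
        return 0.0, ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    tc = max(0.0, min(1.0, t))           # clamp when beyond an endpoint
    cx, cy = ax + tc * dx, ay + tc * dy  # closest point on the segment
    dist = ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5
    return t, dist
```

A point above the middle of the segment yields t in [0, 1]; a point past an endpoint yields t > 1 or t < 0, which is exactly the case the method treats differently.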
We consider the learning of algorithmic tasks by mere observation of input-output pairs.
Rather than studying this as a black-box discrete regression problem with no assumption whatsoever on the input-output mapping, we concentrate on tasks that are amenable to the principle of divide and conquer, and study what are its implications in terms of learning.
This principle creates a powerful inductive bias that we leverage with neural architectures that are defined recursively and dynamically, by learning two scale-invariant atomic operations: how to split a given input into smaller sets, and how to merge two partially solved tasks into a larger partial solution.
Our model can be trained in weakly supervised environments, namely by just observing input-output pairs, and in even weaker environments, using a non-differentiable reward signal.
Moreover, thanks to the dynamic aspect of our architecture, we can incorporate the computational complexity as a regularization term that can be optimized by backpropagation.
We demonstrate the flexibility and efficiency of the Divide-and-Conquer Network on several combinatorial and geometric tasks: convex hull, clustering, knapsack and Euclidean TSP.
Thanks to the dynamic programming nature of our model, we show significant improvements in terms of generalization error and computational complexity.
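The learned split and merge operations instantiate the classic divide-and-conquer template. As a hand-coded (not learned) illustration of that same recursion, sorting fits the template with a halving split and an ordered merge:

```python
def divide_and_conquer(xs, split, merge):
    """Generic recursion: split the input, solve the parts, merge."""
    if len(xs) <= 1:
        return xs
    left, right = split(xs)
    return merge(divide_and_conquer(left, split, merge),
                 divide_and_conquer(right, split, merge))

def split_half(xs):
    mid = len(xs) // 2
    return xs[:mid], xs[mid:]

def merge_sorted(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]
```

In the network, `split` and `merge` are replaced by learned, scale-invariant neural modules, and the recursion depth contributes the computational-complexity regularizer mentioned above.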
Incremental learning from non-stationary data poses special challenges to the field of machine learning.
Although new algorithms have been developed for this, assessment of results and comparison of behaviors are still open problems, mainly because evaluation metrics, adapted from more traditional tasks, can be ineffective in this context.
Overall, there is a lack of common testing practices.
This paper thus presents a testbed for incremental non-stationary learning algorithms, based on specially designed synthetic datasets.
Also, test results are reported for some well-known algorithms to show that the proposed methodology is effective at characterizing their strengths and weaknesses.
It is expected that this methodology will provide a common basis for evaluating future contributions in the field.
An event-based state estimation approach for reducing communication in a networked control system is proposed.
Multiple distributed sensor-actuator-agents observe a dynamic process and sporadically exchange their measurements and inputs over a bus network.
Based on these data, each agent estimates the full state of the dynamic system, which may exhibit arbitrary inter-agent couplings.
Local event-based protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy.
This event-based scheme is shown to mimic a centralized Luenberger observer design up to guaranteed bounds, and stability is proven in the sense of bounded estimation errors for bounded disturbances.
The stability result extends to the distributed control system that results when the local state estimates are used for distributed feedback control.
Simulation results highlight the benefit of the event-based approach over classical periodic ones in reducing communication requirements.
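A minimal sketch of an event-based transmission rule of this flavor follows; the scalar signal and the fixed threshold `delta` are simplifying assumptions, as the protocol in the paper involves full state estimators on each agent:

```python
def run_event_based(measurements, delta):
    """Transmit a measurement only when it deviates from the receiver's
    last-known value by more than delta; between events the receiver
    keeps using the stale value as its estimate."""
    last_sent = measurements[0]
    n_sent = 1                        # initial value is always sent
    estimates = [measurements[0]]
    for y in measurements[1:]:
        if abs(y - last_sent) > delta:   # event: accuracy bound violated
            last_sent = y
            n_sent += 1
        estimates.append(last_sent)      # receiver-side estimate
    return estimates, n_sent
```

Slowly varying signals then trigger very few transmissions while the estimation error stays within the bound, which is the communication saving the abstract refers to.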
We propose the use of incomplete dot products (IDP) to dynamically adjust the number of input channels used in each layer of a convolutional neural network during feedforward inference.
IDP adds monotonically non-increasing coefficients, referred to as a "profile", to the channels during training.
The profile orders the contribution of each channel in non-increasing order.
At inference time, the number of channels used can be dynamically adjusted to trade off accuracy for lowered power consumption and reduced latency by selecting only a beginning subset of channels.
This approach allows for a single network to dynamically scale over a computation range, as opposed to training and deploying multiple networks to support different levels of computation scaling.
Additionally, we extend the notion to multiple profiles, each optimized for some specific range of computation scaling.
We present experiments on the computation and accuracy trade-offs of IDP for popular image classification models and datasets.
We demonstrate that, for MNIST and CIFAR-10, IDP reduces computation significantly, e.g., by 75%, without significantly compromising accuracy.
We argue that IDP provides a convenient and effective means for devices to lower computation costs dynamically to reflect the current computation budget of the system.
For example, VGG-16 with 50% IDP (using only the first 50% of channels) achieves 70% accuracy on the CIFAR-10 dataset, compared to the standard network, which achieves only 35% accuracy when using the same reduced channel set.
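A minimal sketch of the profile-weighted, truncated dot product behind IDP (the harmonic-decay profile here is an illustrative choice; in the paper the profiles are design parameters applied during training):

```python
def incomplete_dot_product(x, w, profile, fraction):
    """Profile-weighted dot product over only the first `fraction`
    of channels, mimicking IDP-style truncation at inference time."""
    n = max(1, int(len(x) * fraction))
    return sum(p * xi * wi for p, xi, wi in zip(profile[:n], x[:n], w[:n]))

# a monotonically non-increasing "profile", e.g. harmonic decay
profile = [1.0 / (i + 1) for i in range(4)]
```

Because the profile is non-increasing, the leading channels carry the largest contributions, so dropping the tail channels degrades the result gracefully rather than abruptly.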
We present an approach that exploits hierarchical Recurrent Neural Networks (RNNs) to tackle the video captioning problem, i.e., generating one or multiple sentences to describe a realistic video.
Our hierarchical framework contains a sentence generator and a paragraph generator.
The sentence generator produces one simple short sentence that describes a specific short video interval.
It exploits both temporal- and spatial-attention mechanisms to selectively focus on visual elements during generation.
The paragraph generator captures the inter-sentence dependency by taking as input the sentential embedding produced by the sentence generator, combining it with the paragraph history, and outputting the new initial state for the sentence generator.
We evaluate our approach on two large-scale benchmark datasets: YouTubeClips and TACoS-MultiLevel.
The experiments demonstrate that our approach significantly outperforms the current state-of-the-art methods with BLEU@4 scores 0.499 and 0.305 respectively.
We consider the spatial stochastic model of single-tier downlink cellular networks, where the wireless base stations are deployed according to a general stationary point process on the Euclidean plane with general i.i.d. propagation effects.
Recently, Ganti & Haenggi (2016) consider the same general cellular network model and, as one of many significant results, derive the tail asymptotics of the signal-to-interference ratio (SIR) distribution.
However, they do not mention any conditions under which the result holds.
In this paper, we complement their result by establishing a sufficient condition under which the asymptotic result is valid.
We further illustrate some examples satisfying such a sufficient condition and indicate the corresponding asymptotic results for the example models.
We also give a simple counterexample that violates the sufficient condition.
Network intrusion detection is the process of identifying malicious behaviors that target a network and its resources.
Current systems implementing intrusion detection processes observe traffic at several data collecting points in the network but analysis is often centralized or partly centralized.
These systems are not scalable and suffer from a single point of failure: attackers only need to target the central node to compromise the whole system.
This paper proposes an anomaly-based fully distributed network intrusion detection system where analysis is run at each data collecting point using a naive Bayes classifier.
Probability values computed by each classifier are shared among nodes using an iterative average consensus protocol.
The final analysis is performed redundantly and in parallel at the level of each data collecting point, thus avoiding the single point of failure issue.
We run simulations focusing on DDoS attacks with several network configurations, comparing the accuracy of our fully distributed system with a hierarchical one.
We also analyze communication costs and convergence speed during consensus phases.
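The iterative average consensus step by which the classifiers share their probability values can be sketched as follows; the ring topology and step size are illustrative assumptions:

```python
def average_consensus(values, neighbors, alpha, iters):
    """Each node repeatedly moves toward its neighbors' values;
    on a connected graph all nodes converge to the global average."""
    x = list(values)
    for _ in range(iters):
        x_new = x[:]
        for i, nbrs in neighbors.items():
            x_new[i] = x[i] + alpha * sum(x[j] - x[i] for j in nbrs)
        x = x_new
    return x

# a ring of 4 collecting points
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

After convergence every node holds the network-wide average of the locally computed probabilities, so the final analysis can run redundantly at each collecting point with no central aggregator.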
Standard artificial neural networks suffer from the well-known issue of catastrophic forgetting, making continual or lifelong learning problematic.
Recently, numerous methods have been proposed for continual learning, but due to differences in evaluation protocols it is difficult to directly compare their performance.
To enable more meaningful comparisons, we identified three distinct continual learning scenarios based on whether task identity is known and, if it is not, whether it needs to be inferred.
Performing the split and permuted MNIST task protocols according to each of these scenarios, we found that regularization-based approaches (e.g., elastic weight consolidation) failed when task identity needed to be inferred.
In contrast, generative replay combined with distillation (i.e., using class probabilities as "soft targets") achieved superior performance in all three scenarios.
In addition, we reduced the computational cost of generative replay by integrating the generative model into the main model by equipping it with generative feedback connections.
This Replay-through-Feedback approach substantially shortened training time with no or negligible loss in performance.
We believe this to be an important first step towards making the powerful technique of generative replay scalable to real-world continual learning applications.
Given a network of nodes, minimizing the spread of a contagion using a limited budget is a well-studied problem with applications in network security, viral marketing, social networks, and public health.
In real graphs, a virus may infect a node, which in turn infects its neighbor nodes, and this may trigger an epidemic in the whole graph.
The goal thus is to select the best k nodes (budget constraint) to immunize (vaccinate, screen, filter) so that the remaining graph is less prone to the epidemic.
It is known that the problem is, in all practical models, computationally intractable even for moderate sized graphs.
In this paper we employ ideas from spectral graph theory to define relevance and importance of nodes.
Using novel graph theoretic techniques, we then design an efficient approximation algorithm to immunize the graph.
Theoretical guarantees on the running time of our algorithm show that it is more efficient than any other known solution in the literature.
We test the performance of our algorithm on several real world graphs.
Experiments show that our algorithm scales well for large graphs and outperforms state of the art algorithms both in quality (containment of epidemic) and efficiency (runtime and space complexity).
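As a rough illustration of ranking nodes by spectral quantities (a generic heuristic, not the paper's algorithm), the principal eigenvector of the adjacency matrix scores node importance, and removing the top-k nodes tends to lower the spectral radius that governs epidemic thresholds:

```python
def eigenvector_scores(adj, iters=100):
    """Power iteration on A + I; entries of the principal eigenvector
    rank nodes by spectral importance.  The +I shift keeps the
    iteration from oscillating on bipartite graphs."""
    n = len(adj)
    v = [1.0 / n] * n
    for _ in range(iters):
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(x) for x in w) or 1.0
        v = [x / norm for x in w]
    return v

def top_k_to_immunize(adj, k):
    """Pick the k nodes with the largest spectral scores."""
    scores = eigenvector_scores(adj)
    return sorted(range(len(adj)), key=lambda i: -scores[i])[:k]
```

On a star graph this heuristic immunizes the hub first, matching the intuition that high-centrality nodes are the most valuable vaccination targets.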
Connectivity of wireless sensor networks (WSNs) is a fundamental global property expected to be maintained even though some sensor nodes are at fault.
In this paper, we investigate the connectivity of random geometric graphs (RGGs) in the node fault model as an abstract model of ad hoc WSNs with unreliable nodes.
In the model, each node is assumed to be stochastically at fault, i.e., removed from a graph.
As a measure of reliability, the network breakdown probability is then defined as the average probability that a resulting survival graph is disconnected over RGGs.
We examine RGGs with general connection functions as an extension of a conventional RGG model and provide two mathematical analyses: the asymptotic analysis for infinite RGGs that reveals the phase transition thresholds of connectivity, and the non-asymptotic analysis for finite RGGs that provides a useful approximation formula.
Those analyses are supported by numerical simulations in the Rayleigh SISO model reflecting a practical wireless channel.
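For finite networks, the network breakdown probability can also be estimated by straightforward Monte Carlo; the sketch below assumes a hard-disk connection function on the unit square rather than the general connection functions analyzed in the paper:

```python
import random

def is_connected(nodes, radius):
    """BFS/DFS over the geometric graph induced by the radius."""
    if not nodes:
        return True
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        xi, yi = nodes[i]
        for j, (xj, yj) in enumerate(nodes):
            if j not in seen and (xi - xj) ** 2 + (yi - yj) ** 2 <= radius ** 2:
                seen.add(j)
                stack.append(j)
    return len(seen) == len(nodes)

def breakdown_probability(n, radius, fault_prob, trials, seed=0):
    """Fraction of trials in which the survival graph is disconnected."""
    rng = random.Random(seed)
    disconnected = 0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        survivors = [p for p in pts if rng.random() > fault_prob]
        if not is_connected(survivors, radius):
            disconnected += 1
    return disconnected / trials
```

Such simulations are the natural check on the non-asymptotic approximation formula: a very large radius yields breakdown probability near 0, a very small one near 1.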
Diffusion Tensor Imaging (DTI) is an effective tool for the analysis of structural brain connectivity in normal development and in a broad range of brain disorders.
However efforts to derive inherent characteristics of structural brain networks have been hampered by the very high dimensionality of the data, relatively small sample sizes, and the lack of widely acceptable connectivity-based regions of interests (ROIs).
Typical approaches have focused either on regions defined by standard anatomical atlases that do not incorporate anatomical connectivity, or have been based on voxel-wise analysis, which results in loss of statistical power relative to structure-wise connectivity analysis.
In this work, we propose a novel, computationally efficient iterative clustering method to generate connectivity-based whole-brain parcellations that converge to a stable parcellation in a few iterations.
Our algorithm is based on a sparse representation of the whole brain connectivity matrix, which reduces the number of edges from around a half billion to a few million while incorporating the necessary spatial constraints.
We show that the resulting regions in a sense capture the inherent connectivity information present in the data, and are stable with respect to initialization and the randomization scheme within the algorithm.
These parcellations provide consistent structural regions across the subjects of population samples that are homogeneous with respect to anatomic connectivity.
Our method also derives connectivity structures that can be used to distinguish between population samples with known different structural connectivity.
In particular, new results on structural differences between population samples, such as females vs. males, normal controls vs. schizophrenia patients, and different age groups within normal controls, are also shown.
Our goal is to train a policy for autonomous driving via imitation learning that is robust enough to drive a real vehicle.
We find that standard behavior cloning is insufficient for handling complex driving scenarios, even when we leverage a perception system for preprocessing the input and a controller for executing the output on the car: 30 million examples are still not enough.
We propose exposing the learner to synthesized data in the form of perturbations to the expert's driving, which creates interesting situations such as collisions and/or going off the road.
Rather than purely imitating all data, we augment the imitation loss with additional losses that penalize undesirable events and encourage progress -- the perturbations then provide an important signal for these losses and lead to robustness of the learned model.
We show that the ChauffeurNet model can handle complex situations in simulation, and present ablation experiments that emphasize the importance of each of our proposed changes and show that the model is responding to the appropriate causal factors.
Finally, we demonstrate the model driving a car in the real world.
Unsupervised pretraining has been widely used to aid human action recognition.
However, existing methods focus on reconstructing frames that are already present rather than generating frames that occur in the future. In this paper, we propose an improved variational autoencoder model to extract features highly connected to coming scenarios, an approach also known as predictive learning.
Our framework is as follows: two-stream 3D convolutional neural networks are used to extract both spatial and temporal information as latent variables.
A resampling method is then introduced to create new normally distributed probabilistic latent variables, and finally a deconvolutional neural network uses these latent variables to generate the next frames.
Through this process, we train the model to focus on how to generate the future, so that it extracts features strongly connected to the future.
In the experimental stage, a large number of experiments on the UT and UCF101 datasets reveal that future generation does improve prediction performance.
Moreover, the future representation learning network reaches a higher score than other methods when given only half the observation.
This suggests that future representation learning outperforms traditional representation learning and other state-of-the-art methods on human action prediction problems, at least to some extent.
HTTP-based video streaming technologies allow for flexible rate selection strategies that account for time-varying network conditions.
Such rate changes may adversely affect the user's Quality of Experience; hence online prediction of the time varying subjective quality can lead to perceptually optimised bitrate allocation policies.
Recent studies have proposed to use dynamic network approaches for continuous-time prediction; yet they do not consider multiple video quality models as inputs nor consider forecasting ensembles.
Here we address the problem of predicting continuous-time subjective quality using multiple inputs fed to a non-linear autoregressive network.
By considering multiple network configurations and by applying simple averaging forecasting techniques, we are able to considerably improve prediction performance and decrease forecasting errors.
The notes which play the most important and second most important roles in expressing a raga are called Vadi and Samvadi swars respectively in (North) Indian Classical music.
Like Bageshree, Bhairavi, Shankara, Hamir and Kalingra, Rageshree is another controversial raga so far as the choice of Vadi-Samvadi is concerned, with two different opinions.
In the present work, a two minute vocal recording of raga Rageshree is subjected to a careful statistical analysis.
Our analysis is broken into three phases: the first, middle, and last portions of the recording.
Under a multinomial model set up holding appreciably in the first two phases, only one opinion is found acceptable.
In the last phase the distribution seems to be quasi-multinomial, characterized by an unstable relative occurrence of pitch across all the notes. Although the note whose relative pitch occurrence suddenly shoots up is the Vadi swar selected from our analysis of the first two phases, we treat it as an outlier demanding a separate treatment, as any outlier does in statistics.
Selection of Vadi-Samvadi notes in a quasi-multinomial set up is still an open research problem.
An interesting musical cocktail is proposed, however, embedding several ideas, namely the melodic properties of notes, note combinations, and pitch movements between notes, using a weighted combination of the psychological and statistical stability of notes, while carefully watching for a sudden shoot of one or more notes whenever there is enough evidence that the multinomial model has broken down.
In the measurement process, many parameters affect the measurement results: the influence of the probe system, the material stiffness of the measured workpiece, the calibration of the probe with a reference sphere, and thermal effects.
We want to obtain the limits of a measurement methodology to be able to validate a result.
The study is applied to a simple part.
We observe the dispersion of the position of different drilled holes (XYZ values in a coordinate system) when we change the quality of the part and the method of calculation.
We use the Design of Experiment (Taguchi method) to realize our study.
We study the influence of the part quality on the measurement results.
We consider two parameters to define the part quality (flatness and perpendicularity).
We will also study the influence of different methods of calculation to determine the coordinate system.
We can use two options in Metrolog XG software (tangent plane with or without orientation constraint).
The originality of this paper is that we present a method for the design of experiment that uses CATIA (CAD system) to generate the measured parts.
In this way we can realize a design of experiment with a larger number of experimental results.
This is a positive point for a statistical analysis.
We are also free to define the parts we want to study without manufacturing difficulties.
Neural networks with random hidden nodes have gained increasing interest from researchers and practical applications.
This is due to their unique features such as very fast training and universal approximation property.
In these networks the weights and biases of hidden nodes determining the nonlinear feature mapping are set randomly and are not learned.
Appropriate choice of the intervals from which the weights and biases are drawn is extremely important.
This topic has not yet been sufficiently explored in the literature.
In this work a method of generating random weights and biases is proposed.
This method generates the parameters of the hidden nodes in such a way that nonlinear fragments of the activation functions are located in the input space regions with data and can be used to construct the surface approximating a nonlinear target function.
The weights and biases are dependent on the input data range and activation function type.
The proposed method allows us to control the generalization degree of the model.
All of this leads to an improvement in the approximation performance of the network.
Several experiments show very promising results.
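One simple way to make the nonlinear fragments of sigmoid activations land in regions occupied by the data, in the spirit of the description above (the exact scheme below is an assumption, not the paper's formula), is to draw a random weight vector and set the bias so that the activation's transition region is centered on a randomly chosen training point:

```python
import random

def generate_hidden_node(data, weight_range, rng):
    """Random weights in [-weight_range, weight_range]; bias chosen so
    the sigmoid's steep transition region passes through a randomly
    picked data point (pre-activation zero at that point)."""
    dim = len(data[0])
    w = [rng.uniform(-weight_range, weight_range) for _ in range(dim)]
    x = rng.choice(data)  # anchor point inside the data region
    b = -sum(wi * xi for wi, xi in zip(w, x))
    return w, b
```

By construction the pre-activation w.x + b is zero at the anchor point, where a sigmoid is at its steepest, so the node contributes a genuinely nonlinear fragment where the data actually lies instead of saturating far away from it.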
The main objective of this project is to segment different breast ultrasound images to find the lesion area by discarding low-contrast regions as well as the inherent speckle noise.
The proposed method consists of three stages (removing noise, segmentation, classification) in order to extract the correct lesion.
We used the normalized cuts approach to segment ultrasound images into regions of interest where the lesion may possibly be found, and then a K-means classifier is applied to finally decide the location of the lesion.
For every original image, an annotated ground-truth image is given to perform comparison with the obtained experimental results, providing accurate evaluation measures.
This research paper describes the importance and effective use of the case study approach in educating and training software designers and software engineers, both in academia and in industry.
Following an account of the use of software engineering case studies in the education of professionals, there is a discussion of issues in training software designers and of how the case teaching method can be used to address these issues.
The paper describes a software project titled Online Tower Plotting System (OTPS) to develop a complete and comprehensive case study, along with supporting educational material.
The case study is aimed to demonstrate a variety of software areas, modules and courses: from bachelor through masters, doctorates and even for ongoing professional development.
Recommender systems recommend items more accurately by analyzing users' potential interest on different brands' items.
In conjunction with users' rating similarity, implicit feedback such as clicking items, viewing item specifications, and watching videos has proved helpful for learning user embeddings, which in turn improves rating prediction.
Most existing recommender systems focus on modeling ratings and implicit feedback, ignoring users' explicit feedback.
Explicit feedback can be used to validate the reliability of particular users and to learn about their characteristics.
By users' characteristics we mean what type of reviewers they are.
In this paper, we explore three different models for more accurate recommendation, focusing on users' explicit and implicit feedback.
The first is RHC-PMF, which predicts users' ratings more accurately based on three explicit feedback signals (rating, helpfulness score, and centrality); the second is RV-PMF, where users' implicit feedback (view relationships) is considered.
The last is RHCV-PMF, where both types of feedback are considered.
In this model, similarity of users' explicit feedback indicates similarity of their reliability and characteristics, while similarity of their implicit feedback indicates similarity of their preferences.
Extensive experiments on a real-world dataset, the Amazon.com online review dataset, show that our models outperform baseline models in terms of users' rating prediction.
The RHCV-PMF model also achieves better rating prediction than baseline models for cold-start users and cold-start items.
This study improves the performance of neural named entity recognition by a margin of up to 11% in F-score on the example of a low-resource language like German, thereby outperforming existing baselines and establishing a new state-of-the-art on each single open-source dataset.
Rather than designing deeper and wider hybrid neural architectures, we gather all available resources and perform a detailed optimization and grammar-dependent morphological processing consisting of lemmatization and part-of-speech tagging prior to exposing the raw data to any training process.
We test our approach in a threefold monolingual experimental setup of a) single, b) joint, and c) optimized training and shed light on the dependency of downstream-tasks on the size of corpora used to compute word embeddings.
Modern cities and metropolitan areas all over the world face new management challenges in the 21st century primarily due to increasing demands on living standards by the urban population.
These challenges range from climate change, pollution, transportation, and citizen engagement, to urban planning, and security threats.
The primary goal of a Smart City is to counteract these problems and mitigate their effects by means of modern ICT to improve urban administration and infrastructure.
Key ideas are to utilise network communication to inter-connect public authorities, but also to deploy and integrate numerous sensors and actuators throughout the city infrastructure, which is widely known as the Internet of Things (IoT).
Thus, IoT technologies will be an integral part and key enabler to achieve many objectives of the Smart City vision.
The contributions of this paper are as follows.
We first examine a number of IoT platforms, technologies and network standards that can help to foster a Smart City environment.
Second, we introduce the EU project MONICA which aims for demonstration of large-scale IoT deployments at public, inner-city events and give an overview on its IoT platform architecture.
And third, we provide a case-study report on Smart City activities by the City of Hamburg, with insights on recent (on-going) field tests of a vertically integrated, end-to-end IoT sensor application.
Trajectory Prediction of dynamic objects is a widely studied topic in the field of artificial intelligence.
Thanks to a large number of applications, such as predicting abnormal events and navigation systems for the blind, there have been many attempts to learn patterns of motion directly from data, using a wide variety of techniques ranging from hand-crafted features to sophisticated deep learning models for unsupervised feature learning.
All these approaches have been limited by problems such as inefficient hand-crafted features, large error propagation across the predicted trajectory, and a lack of information about the static artefacts around the dynamic moving objects.
We propose an end-to-end deep learning model that learns the motion patterns of humans in different navigational modes directly from data, using the popular sequence-to-sequence model coupled with a soft attention mechanism.
We also propose a novel approach to model the static artefacts in a scene and using these to predict the dynamic trajectories.
The proposed method, tested on trajectories of pedestrians, consistently outperforms previously proposed state of the art approaches on a variety of large scale data sets.
We also show how our architecture can be naturally extended to handle multiple modes of movement (say pedestrians, skaters, bikers and buses) simultaneously.
Greater penetration of Distributed Energy Resources (DERs) in power networks requires coordination strategies that allow for self-adjustment of contributions in a network of DERs, owing to variability in generation and demand.
In this article, a distributed scheme is proposed that enables a DER in a network to arrive at viable power reference commands that satisfy the DER's local constraints on its generation and the loads it has to service, while the aggregated behavior of multiple DERs in the network and their respective loads meets the ancillary services demanded by the grid.
The Net-load Management system for a single unit is referred to as the Local Inverter System (LIS) in this article.
A distinguishing feature of the proposed consensus based solution is the distributed finite time termination of the algorithm that allows each LIS unit in the network to determine power reference commands in the presence of communication delays in a distributed manner.
The proposed scheme allows prioritization of Renewable Energy Sources (RES) in the network and also enables auto-adjustment of contributions from LIS units with lower priority resources (non-RES).
The methods are validated using hardware-in-the-loop simulations with Raspberry PI devices as distributed control units, implementing the proposed distributed algorithm and responsible for determining and dispatching realtime power reference commands to simulated power electronics interface emulating LIS units for demand response.
In computer vision, the estimation of the fundamental matrix is a basic problem that has been extensively studied.
The accuracy of the estimation imposes a significant influence on subsequent tasks such as the camera trajectory determination and 3D reconstruction.
In this paper we propose a new method for fundamental matrix estimation that makes use of clustering a group of 4D vectors.
The key insight is the observation that among the 4D vectors constructed from matching pairs of points obtained from the SIFT algorithm, well-defined cluster points tend to be reliable inliers suitable for fundamental matrix estimation.
Based on this, we utilize a recently proposed efficient clustering method based on density-peak seeking and propose a new clustering-assisted method.
Experimental results show that the proposed algorithm is faster and more accurate than currently commonly used methods.
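The cluster-based inlier selection can be sketched as follows (an illustration on assumed synthetic data, not the authors' implementation; the cutoff `dc` and the density threshold are made-up tuning choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical matched keypoint pairs: each match gives a 4D vector
# (x1, y1, x2, y2). Inliers of a consistent motion form a tight cluster;
# outliers scatter. This sketches the density-peaks criterion: local
# density rho, and distance delta to the nearest denser point.
inliers = rng.normal([10, 10, 12, 11], 0.3, size=(40, 4))
outliers = rng.uniform(0, 30, size=(10, 4))
vecs = np.vstack([inliers, outliers])

d = np.linalg.norm(vecs[:, None] - vecs[None, :], axis=2)
dc = 1.0                               # cutoff distance (tuning parameter)
rho = (d < dc).sum(axis=1) - 1         # local density of each 4D vector
delta = np.empty(len(vecs))
for i in range(len(vecs)):
    denser = np.where(rho > rho[i])[0]
    delta[i] = d[i, denser].min() if len(denser) else d[i].max()
# cluster centers would be points with both high rho and high delta;
# here we simply treat high-density points as candidate inliers
candidates = np.where(rho >= rho.max() * 0.5)[0]
print(len(candidates), "candidate inliers out of", len(vecs))
```

In this synthetic setup the high-density candidates coincide with the planted inliers (indices below 40), mirroring the observation that well-defined cluster points tend to be reliable matches.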
In this paper, we investigate the convergence rates of two decomposition methods used to solve security constrained economic dispatch (SCED): 1) Lagrangian Relaxation (LR), and 2) Augmented Lagrangian Relaxation (ALR).
First, the centralized SCED problem is posed for a 6-bus test network and then it is decomposed into subproblems using both of the methods.
In order to model the tie-line between decomposed areas of the test network, a novel method is proposed.
The advantages and drawbacks of each method are discussed in terms of accuracy and information privacy.
We show that there is a tradeoff between the information privacy and the convergence rate.
It has been found that ALR converges faster compared to LR, due to the large amount of shared data.
Currency trading (Forex) is the largest world market in terms of volume.
We analyze trading and tweeting about the EUR-USD currency pair over a period of three years.
First, a large number of tweets were manually labeled, and a Twitter stance classification model was constructed.
The model then classifies all the tweets by the trading stance signal: buy, hold, or sell (EUR vs. USD).
The Twitter stance is compared to the actual currency rates by applying the event study methodology, well-known in financial economics.
It turns out that there are large differences in Twitter stance distribution and potential trading returns between the four groups of Twitter users: trading robots, spammers, trading companies, and individual traders.
Additionally, we observe attempts of reputation manipulation by post festum removal of tweets with poor predictions, and deleting/reposting of identical tweets to increase the visibility without tainting one's Twitter timeline.
The past several years have witnessed the rapid progress of end-to-end Neural Machine Translation (NMT).
However, there exists a discrepancy between training and inference in NMT decoding, which may lead to serious problems since the model might enter a part of the state space it has never seen during training.
To address the issue, Scheduled Sampling has been proposed.
However, Scheduled Sampling has certain limitations, and we propose two dynamic oracle-based methods to improve it.
We manage to mitigate the discrepancy by changing the training process towards a less guided scheme and meanwhile aggregating the oracle's demonstrations.
Experimental results show that the proposed approaches improve translation quality over standard NMT system.
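For reference, the baseline Scheduled Sampling mechanism that the proposed oracle-based methods build on can be sketched as follows (the inverse-sigmoid schedule and its constants are illustrative, not the paper's settings):

```python
import math
import random

# Minimal sketch of scheduled sampling at training time: with probability
# p(step), which decays over training, feed the gold previous token;
# otherwise feed the model's own prediction, narrowing the gap between
# training and inference.
def teacher_forcing_prob(step, k=1000.0):
    # inverse-sigmoid decay, a common choice of schedule
    return k / (k + math.exp(step / k))

def choose_prev_token(gold, predicted, step, rng):
    return gold if rng.random() < teacher_forcing_prob(step) else predicted

rng = random.Random(0)
early = sum(choose_prev_token("gold", "pred", 10, rng) == "gold"
            for _ in range(1000))
late = sum(choose_prev_token("gold", "pred", 20000, rng) == "gold"
           for _ in range(1000))
print(early, late)  # gold-token usage drops as training progresses
```

The dynamic-oracle variants proposed in the paper replace the fixed gold token with an oracle demonstration, but the sampling skeleton is the same.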
Deep convolutional neural networks have led to breakthrough results in practical feature extraction applications.
The mathematical analysis of these networks was pioneered by Mallat, 2012.
Specifically, Mallat considered so-called scattering networks based on identical semi-discrete wavelet frames in each network layer, and proved translation-invariance as well as deformation stability of the resulting feature extractor.
The purpose of this paper is to develop Mallat's theory further by allowing for different and, most importantly, general semi-discrete frames (such as, e.g., Gabor frames, wavelets, curvelets, shearlets, ridgelets) in distinct network layers.
This allows the extraction of wider classes of features than the point singularities resolved by the wavelet transform.
Our generalized feature extractor is proven to be translation-invariant, and we develop deformation stability results for a larger class of deformations than those considered by Mallat.
For Mallat's wavelet-based feature extractor, we get rid of a number of technical conditions.
The mathematical engine behind our results is continuous frame theory, which allows us to completely detach the invariance and deformation stability proofs from the particular algebraic structure of the underlying frames.
In the context of natural language processing, representation learning has emerged as a newly active research subject because of its excellent performance in many applications.
Learning representations of words is a pioneering study in this school of research.
However, paragraph (or sentence and document) embedding learning is more suitable/reasonable for some tasks, such as sentiment classification and document summarization.
Nevertheless, as far as we are aware, relatively little work has focused on the development of unsupervised paragraph embedding methods.
Classic paragraph embedding methods infer the representation of a given paragraph by considering all of the words occurring in the paragraph.
Consequently, those stop or function words that occur frequently may mislead the embedding learning process to produce a misty paragraph representation.
Motivated by these observations, our major contributions in this paper are twofold.
First, we propose a novel unsupervised paragraph embedding method, named the essence vector (EV) model, which aims at not only distilling the most representative information from a paragraph but also excluding the general background information to produce a more informative low-dimensional vector representation for the paragraph.
Second, in view of the increasing importance of spoken content processing, an extension of the EV model, named the denoising essence vector (D-EV) model, is proposed.
The D-EV model not only inherits the advantages of the EV model but also can infer a more robust representation for a given spoken paragraph against imperfect speech recognition.
Planarity Testing is the problem of determining whether a given graph is planar while planar embedding is the corresponding construction problem.
The bounded space complexity of these problems has been determined to be exactly Logspace by Allender and Mahajan with the aid of Reingold's result.
Unfortunately, the algorithm is quite daunting, and generalizing it to, say, the bounded genus case seems a tall order.
In this work, we present a simple planar embedding algorithm running in logspace.
We hope this algorithm will be more amenable to generalization.
The algorithm is based on the fact that 3-connected planar graphs have a unique embedding, a variant of Tutte's criterion on conflict graphs of cycles, and an explicit change of cycle basis.
We also present a logspace algorithm to find obstacles to planarity, viz. a Kuratowski minor, if the graph is non-planar.
To the best of our knowledge this is the first logspace algorithm for this problem.
We describe an innovative framework for prescription of personalised health apps by integrating Personal Health Records (PHR) with disease-specific mobile applications for managing medical conditions and the communication with clinical professionals.
The prescribed apps record multiple variables including medical history enriched with innovative features such as integration with medical monitoring devices and wellbeing trackers to provide patients and clinicians with a personalised support on disease management.
Our framework is based on an existing PHR ecosystem called TreC, uniquely positioned between healthcare providers and patients, which is being used by over 70,000 patients in the Trentino region in Northern Italy.
We also describe three important aspects of health app prescription and how medical information is automatically encoded through the TreC framework and prescribed as a personalised app, ready to be installed on the patient's smartphone.
In daily investment decisions in securities markets, the price-earnings (PE) ratio is one of the most widely applied firm valuation tools used by investment experts.
Unfortunately, recent academic developments in financial econometrics and machine learning rarely look at this tool.
In practice, fundamental PE ratios are often estimated only by subjective expert opinions.
The purpose of this research is to formalize a process of fundamental PE estimation by employing advanced dynamic Bayesian network (DBN) methodology.
The estimated PE ratio from our model can be used either as information support for an expert making investment decisions, or in an automatic trading system, as illustrated in our experiments.
Forward-backward inference and EM parameter estimation algorithms are derived with respect to the proposed DBN structure.
Unlike existing works in the literature, the economic interpretation of our DBN model is well justified by behavioral-finance evidence on volatility.
A simple but practical trading strategy is devised based on the result of Bayesian inference.
Extensive experiments show that our trading strategy equipped with the inferred PE ratios consistently outperforms standard investment benchmarks.
Two approaches are proposed for cross-pose face recognition, one is based on the 3D reconstruction of facial components and the other is based on the deep Convolutional Neural Network (CNN).
Unlike most 3D approaches that consider holistic faces, the proposed approach considers 3D facial components.
It segments a 2D gallery face into components, reconstructs the 3D surface for each component, and recognizes a probe face by component features.
The segmentation is based on the landmarks located by a hierarchical algorithm that combines the Faster R-CNN for face detection and the Reduced Tree Structured Model for landmark localization.
The core part of the CNN-based approach is a revised VGG network.
We study the performances with different settings on the training set, including the synthesized data from 3D reconstruction, the real-life data from an in-the-wild database, and both types of data combined.
We investigate the performances of the network when it is employed as a classifier or designed as a feature extractor.
The two recognition approaches and the fast landmark localization are evaluated in extensive experiments and compared to state-of-the-art methods to demonstrate their efficacy.
In this paper, we implement an information-theoretic approach to travel behaviour analysis by introducing a generative modelling framework to identify informative latent characteristics in travel decision making.
It involves developing a joint tri-partite Bayesian graphical network model using a Restricted Boltzmann Machine (RBM) generative modelling framework.
We apply this framework to mode choice survey data to identify abstract latent variables and compare its performance with a traditional latent variable model with specific latent preferences -- safety, comfort, and environmental.
Data collected from a joint stated and revealed preference mode choice survey in Quebec, Canada were used to calibrate the RBM model.
Results show a significant impact on model likelihood statistics and suggest that machine learning tools are highly suitable for modelling complex networks of conditionally independent behaviour interactions.
Autonomous vehicles (AVs) require accurate metric and topological location estimates for safe, effective navigation and decision-making.
Although many high-definition (HD) roadmaps exist, they are not always accurate since public roads are dynamic, shaped unpredictably by both human activity and nature.
Thus, AVs must be able to handle situations in which the topology specified by the map does not agree with reality.
We present the Variable Structure Multiple Hidden Markov Model (VSM-HMM) as a framework for localizing in the presence of topological uncertainty, and demonstrate its effectiveness on an AV where lane membership is modeled as a topological localization process.
VSM-HMMs use a dynamic set of HMMs to simultaneously reason about location within a set of most likely current topologies, and may therefore be applied to topological structure estimation as well as AV lane estimation.
In addition, we present an extension to the Earth Mover's Distance which allows uncertainty to be taken into account when computing the distance between belief distributions on simplices of arbitrary relative sizes.
Reservoir Computing is a bio-inspired computing paradigm for processing time dependent signals.
The performance of its analogue implementations is comparable to that of other state-of-the-art algorithms for tasks such as speech recognition or chaotic time series prediction, but is often constrained by the offline training methods commonly employed.
Here we investigated the online learning approach by training an opto-electronic reservoir computer using a simple gradient descent algorithm, programmed on an FPGA chip.
Our system was applied to wireless communications, a quickly growing domain with an increasing demand for fast analogue devices to equalise the nonlinear distorted channels.
We report error rates up to two orders of magnitude lower than previous implementations on this task.
We show that our system is particularly well-suited for realistic channel equalisation by testing it on drifting and switching channels and obtaining good performance.
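The online-gradient training loop can be sketched on a toy linear equaliser (a drastic simplification: a 3-tap LMS filter on a synthetic two-tap channel stands in for the opto-electronic reservoir and the FPGA implementation; all constants are invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample-by-sample (online) gradient descent for channel equalisation.
# The transmitter sends +/-1 symbols; the channel mixes each sample with
# its predecessor and adds noise; the equaliser is adapted online.
symbols = rng.choice([-1.0, 1.0], size=5000)
received = symbols + 0.4 * np.concatenate([[0.0], symbols[:-1]])
received += 0.01 * rng.normal(size=received.size)

taps = np.zeros(3)
lr = 0.05
errors = []
for n in range(2, received.size):
    window = received[n - 2:n + 1][::-1]   # current + two past samples
    y = taps @ window                      # equaliser output
    e = symbols[n] - y                     # error vs. transmitted symbol
    taps += lr * e * window                # stochastic gradient step
    errors.append(e * e)

print(np.mean(errors[:100]), np.mean(errors[-100:]))  # error shrinks
```

The squared error at the end of the run is orders of magnitude below its initial level, which is the qualitative behaviour the online approach relies on when tracking drifting channels.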
Various machine learning methods for writer independent recognition of Malayalam handwritten district names are discussed in this paper.
Data collected from 56 different writers are used for the experiments.
The proposed work can be used for the recognition of the district name in addresses written in Malayalam.
Different methods for dimensionality reduction are discussed.
The features considered for recognition are the Histogram of Oriented Gradients descriptor, the number of black pixels in the upper and lower halves, and the length of the image.
The classifiers used in this work are Neural Network, SVM and Random Forest.
We describe an embarrassingly parallel, anytime Monte Carlo method for likelihood-free models.
The algorithm starts with the view that the stochasticity of the pseudo-samples generated by the simulator can be controlled externally by a vector of random numbers u, in such a way that the outcome, knowing u, is deterministic.
For each instantiation of u we run an optimization procedure to minimize the distance between summary statistics of the simulator and the data.
After reweighting these samples using the prior and the Jacobian (accounting for the change of volume in transforming from the space of summary statistics to the space of parameters), we show that this weighted ensemble represents a Monte Carlo estimate of the posterior distribution.
The procedure can be run embarrassingly parallel (each node handling one sample) and anytime (by allocating resources to the worst performing sample).
The procedure is validated on six experiments.
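The overall recipe -- fix the simulator's randomness, optimize, then reweight -- can be sketched on a one-dimensional toy simulator (my own example; the real method targets general summary statistics and nontrivial Jacobians):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy simulator x = theta + u: with u fixed, the outcome is deterministic
# in theta. For each draw of u we optimize theta so the simulated summary
# matches the observed one, then reweight by the prior (the Jacobian
# d(summary)/d(theta) is 1 here, so it drops out).
def simulate(theta, u):
    return theta + u

x_obs = 1.0
def prior(t):
    return np.exp(-0.5 * t ** 2)        # N(0,1) prior, unnormalized

thetas, weights = [], []
for u in rng.normal(size=2000):
    # minimize (simulate(theta, u) - x_obs)^2 by gradient descent
    theta = 0.0
    for _ in range(100):
        theta -= 0.2 * 2 * (simulate(theta, u) - x_obs)
    thetas.append(theta)                 # each node could do this in parallel
    weights.append(prior(theta))         # times |Jacobian|^{-1} = 1

thetas, weights = np.array(thetas), np.array(weights)
post_mean = (weights * thetas).sum() / weights.sum()
print(post_mean)  # close to the exact posterior mean x_obs / 2 = 0.5
```

Each value of u is handled independently, which is exactly what makes the scheme embarrassingly parallel; the "anytime" aspect would correspond to refining the worst-optimized samples first.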
In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction.
We present a method for extracting disambiguated hypernymy relationships that propagates hypernyms to sets of synonyms (synsets), constructs embeddings for these sets, and establishes sense-aware relationships between matching synsets.
Evaluation on two gold standard datasets for English and Russian shows that the method successfully recognizes hypernymy relationships that cannot be found with standard Hearst patterns and Wiktionary datasets for the respective languages.
Mobile health applications that track activities, such as exercise, sleep, and diet, are becoming widely used.
While these activity tracking applications have the potential to improve our health, user engagement and retention are critical factors for their success.
However, long-term user engagement patterns in real-world activity tracking applications are not yet well understood.
Here we study user engagement patterns within a mobile physical activity tracking application consisting of 115 million logged activities taken by over a million users over 31 months.
Specifically, we show that over 75% of users return and re-engage with the application after prolonged periods of inactivity, no matter the duration of the inactivity.
We find a surprising result that the re-engagement usage patterns resemble those of the start of the initial engagement period, rather than being a simple continuation of the end of the initial engagement period.
This evidence points to a conceptual model of multiple lives of user engagement, extending the prevalent single life view of user activity.
We demonstrate that these multiple lives occur because the users have a variety of different primary intents or goals for using the app.
We find evidence for users being more likely to stop using the app once they achieved their primary intent or goal (e.g., weight loss).
However, these users might return once their original intent resurfaces (e.g., wanting to lose newly gained weight).
Based on insights developed in this work, including a marker of improved primary intent performance, our prediction models achieve 71% ROC AUC.
Overall, our research has implications for modeling user re-engagement in health activity tracking applications and has consequences for how notifications, recommendations as well as gamification can be used to increase engagement.
Linear prediction (LP) technique estimates an optimum all-pole filter of a given order for a frame of speech signal.
The coefficients of the all-pole filter, 1/A(z) are referred to as LP coefficients (LPCs).
The gain of the inverse of the all-pole filter, A(z), at z = 1, i.e., at frequency = 0, A(1), corresponds to the sum of LPCs, which has the property of being lower (higher) than a threshold for sonorants (fricatives).
When the inverse tangent of A(1), denoted as T(1), is used as a feature and tested on the sonorant and fricative frames of the entire TIMIT database, an accuracy of 99.07% is obtained.
Hence, we refer to T(1) as sonorant-fricative discrimination index (SFDI).
This property has also been tested for its robustness for additive white noise and on the telephone quality speech of the NTIMIT database.
These results are comparable to, or in some respects, better than the state-of-the-art methods proposed for a similar task.
Such a property may be used for segmenting a speech signal or for non-uniform frame-rate analysis.
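A minimal sketch of the A(1)-based cue on synthetic signals (the LPC sign convention, model order, and test signals are my assumptions; the paper's experiments use TIMIT frames):

```python
import numpy as np

# Estimate LPCs by solving the autocorrelation normal equations, then
# evaluate the inverse filter A(z) = 1 - sum_k a_k z^{-k} at z = 1.
# Sonorant-like (low-frequency) frames give small A(1); fricative-like
# (high-frequency, noisy) frames give large A(1).
def lpc(x, order=10):
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)]
                  for i in range(order)])
    return np.linalg.solve(R, r[1:])

def sfdi(x, order=10):
    a = lpc(x, order)
    A1 = 1.0 - a.sum()        # A(z) evaluated at z = 1 (frequency 0)
    return np.arctan(A1)      # T(1) in the paper's notation

rng = np.random.default_rng(3)
t = np.arange(400)
sonorant = np.sin(2 * np.pi * 0.02 * t) + 0.01 * rng.normal(size=t.size)
fricative = np.diff(rng.normal(size=401))   # high-pass noise, fricative-like
print(sfdi(sonorant), sfdi(fricative))      # the sonorant value is lower
```

Since the autocorrelation method yields a minimum-phase A(z), A(1) stays positive, and the arctan simply bounds the feature range.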
Current image captioning approaches generate descriptions which lack specific information, such as named entities that are involved in the images.
In this paper we propose a new task which aims to generate informative image captions, given images and hashtags as input.
We propose a simple but effective approach to tackle this problem.
We first train a combined convolutional neural network and long short-term memory network (CNN-LSTM) model to generate a template caption based on the input image.
Then we use a knowledge graph based collective inference algorithm to fill in the template with specific named entities retrieved via the hashtags.
Experiments on a new benchmark dataset collected from Flickr show that our model generates news-style image descriptions with much richer information.
Our model outperforms unimodal baselines significantly with various evaluation metrics.
The rising popularity of social media has radically changed the way news content is propagated, including interactive attempts with new dimensions.
To date, traditional news media such as newspapers, television and radio have already adapted their activities to the online news media by utilizing social media, blogs, websites, etc.
This paper provides some insight into the social media presence of worldwide popular news media outlets.
Although these large news media propagate content extensively via social media environments, very little is known about the producers, providers, and consumers of news items within the news media community on social media. To better understand these interactions, this work analyzes news items in two large social media platforms, Twitter and Facebook.
Towards that end, we collected all published posts on Twitter and Facebook from 48 news media to perform descriptive and predictive analyses using the dataset of 152K tweets and 80K Facebook posts.
We explored the set of news media that originate content by themselves on social media, those that distribute their news items to other news media, and those that consume news content from other news media and/or share replicas.
We propose a predictive model to increase news media popularity among readers based on the number of posts, number of followers and number of interactions performed within the news media community.
The results show that news media should distribute their own content and publish it first on social media in order to become popular and attract more readers to their news items.
Scholars have often relied on name initials to resolve name ambiguities in large-scale coauthorship network research.
This approach bears the risk of incorrectly merging or splitting author identities.
The use of initial-based disambiguation has been justified by the assumption that such errors would not affect research findings too much.
This paper tests this assumption by analyzing coauthorship networks from five academic fields - biology, computer science, nanoscience, neuroscience, and physics - and an interdisciplinary journal, PNAS.
Name instances in datasets of this study were disambiguated based on heuristics gained from previous algorithmic disambiguation solutions.
We use disambiguated data as a proxy of ground-truth to test the performance of three types of initial-based disambiguation.
Our results show that initial-based disambiguation can misrepresent statistical properties of coauthorship networks: it deflates the number of unique authors, the number of components, average shortest path length, clustering coefficient, and assortativity, while it inflates average productivity, density, average number of coauthors per author, and largest component size.
Also, on average, more than half of the top 10 most productive or collaborative authors drop off the lists.
Asian names were found to account for the majority of misidentification by initial-based disambiguation due to their common surname and given name initials.
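The merging effect of initial-based disambiguation can be illustrated with a toy example (the names below are invented):

```python
from collections import defaultdict

# Distinct authors who share a surname and first given-name initial
# collapse into one identity under initial-based disambiguation,
# deflating the count of unique authors.
authors = ["Kim, Jinseok", "Kim, Jihyun", "Kim, Jaewoo",
           "Smith, John", "Smith, Jane", "Lee, Soo"]

def first_initial(name):
    surname, given = name.split(", ")
    return f"{surname}, {given[0]}."

merged = defaultdict(list)
for a in authors:
    merged[first_initial(a)].append(a)

print(len(authors), "authors collapse to", len(merged), "identities")
for key, names in merged.items():
    if len(names) > 1:
        print(key, "<-", names)
```

Here six authors collapse to three identities, and the three "Kim, J." authors illustrate why common surname/initial combinations drive most of the misidentification.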
Enlarged perivascular spaces (EPVS) in the brain are an emerging imaging marker for cerebral small vessel disease, and have been shown to be related to increased risk of various neurological diseases, including stroke and dementia.
Automatic quantification of EPVS would greatly help to advance research into its etiology and its potential as a risk indicator of disease.
We propose a convolutional network regression method to quantify the extent of EPVS in the basal ganglia from 3D brain MRI.
We first segment the basal ganglia and subsequently apply a 3D convolutional regression network designed for small object detection within this region of interest.
The network takes an image as input, and outputs a quantification score of EPVS.
The network has significantly more convolution operations than pooling ones and no final activation, allowing it to span the space of real numbers.
We validated our approach using a dataset of 2000 brain MRI scans scored visually.
Experiments with varying sizes of training and test sets showed that good performance can be achieved with a training set of only 200 scans.
With a training set of 1000 scans, the intraclass correlation coefficient (ICC) between our scoring method and the expert's visual score was 0.74.
Our method outperforms by a large margin - more than 0.10 - four more conventional automated approaches based on intensities, scale-invariant feature transform, and random forest.
We show that the network learns the structures of interest and investigate the influence of hyper-parameters on the performance.
We also evaluate the reproducibility of our network using a set of 60 subjects scanned twice (scan-rescan reproducibility).
On this set our network achieves an ICC of 0.93, while the intrarater agreement reaches 0.80.
Furthermore, the automatic EPVS scoring correlates similarly to age as visual scoring.
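The agreement measure used above can be sketched as follows (I assume a two-way ICC for absolute agreement with a single measurement; the paper does not state which ICC variant it uses, and the scores below are simulated, not EPVS data):

```python
import numpy as np

# Two-way ICC (absolute agreement, single measurement) comparing an
# automatic score against a visual score, computed from ANOVA mean squares.
def icc2_1(scores):
    n, k = scores.shape                      # subjects x raters
    grand = scores.mean()
    row_means = scores.mean(axis=1)
    col_means = scores.mean(axis=0)
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)   # between subjects
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)   # between raters
    sse = ((scores - grand) ** 2).sum() - msr * (n - 1) - msc * (k - 1)
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

rng = np.random.default_rng(4)
visual = rng.poisson(5, size=200).astype(float)    # simulated expert scores
automatic = visual + rng.normal(0, 1.0, size=200)  # noisy automatic scores
print(icc2_1(np.column_stack([visual, automatic])))
```

Perfect agreement yields an ICC of 1, and the score degrades toward 0 as the automatic ratings decouple from the visual ones, which is the scale on which the reported 0.74 and 0.93 should be read.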
Machine-learning models have been recently used for detecting malicious Android applications, reporting impressive performances on benchmark datasets, even when trained only on features statically extracted from the application, such as system calls and permissions.
However, recent findings have highlighted the fragility of such in-vitro evaluations with benchmark datasets, showing that very few changes to the content of Android malware may suffice to evade detection.
How can we thus trust that a malware detector performing well on benchmark data will continue to do so when deployed in an operating environment?
To mitigate this issue, the most popular Android malware detectors use linear, explainable machine-learning models to easily identify the most influential features contributing to each decision.
In this work, we generalize this approach to any black-box machine-learning model, by leveraging a gradient-based approach to identify the most influential local features.
This enables using nonlinear models to potentially increase accuracy without sacrificing interpretability of decisions.
Our approach also highlights the global characteristics learned by the model to discriminate between benign and malware applications.
Finally, as shown by our empirical analysis on a popular Android malware detection task, it also helps identifying potential vulnerabilities of linear and nonlinear models against adversarial manipulations.
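The gradient-based identification of influential local features can be sketched for a generic differentiable model (a tiny network with random weights stands in for a trained malware detector; feature semantics are omitted):

```python
import numpy as np

# Rank each input feature of a sample by the magnitude of df/dx_i,
# where f is the (possibly nonlinear) detector's score.
def f(x, W1, b1, w2):
    h = np.tanh(W1 @ x + b1)
    return w2 @ h                      # higher score = more "malicious"

def input_gradient(x, W1, b1, w2):
    h = np.tanh(W1 @ x + b1)
    # chain rule: df/dx = W1^T ((1 - h^2) * w2)
    return W1.T @ ((1 - h ** 2) * w2)

rng = np.random.default_rng(5)
W1 = rng.normal(size=(4, 6))
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)
x = rng.normal(size=6)                 # one sample's feature vector

g = input_gradient(x, W1, b1, w2)
ranking = np.argsort(-np.abs(g))       # most influential features first
print(ranking)

# sanity check of the analytic gradient against central differences
eps = 1e-6
i = ranking[0]
e = np.zeros(6)
e[i] = eps
numeric = (f(x + e, W1, b1, w2) - f(x - e, W1, b1, w2)) / (2 * eps)
print(abs(numeric - g[i]) < 1e-5)      # should print True
```

For a linear model the gradient reduces to the weight vector, recovering the feature rankings that linear, explainable detectors already expose.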
We present a statistical-modelling method for piano reduction, i.e., converting an ensemble score into piano scores, that can control performance difficulty.
While previous studies have focused on describing the conditions for playable piano scores, playability depends on the player's skill and can change continuously with the tempo.
We thus computationally quantify performance difficulty as well as musical fidelity to the original score, and formulate the problem as optimization of musical fidelity under constraints on difficulty values.
First, performance difficulty measures are developed by means of probabilistic generative models for piano scores and the relation to the rate of performance errors is studied.
Second, to describe musical fidelity, we construct a probabilistic model integrating a prior piano-score model and a model representing how ensemble scores are likely to be edited.
An iterative optimization algorithm for piano reduction is developed based on statistical inference of the model.
We confirm the effect of the iterative procedure; we find that subjective difficulty and musical fidelity monotonically increase with controlled difficulty values; and we show that incorporating sequential dependence of pitches and fingering motion in the piano-score model improves the quality of reduction scores in high-difficulty cases.
Deep learning and convolutional neural networks (CNN) have been used intensively in many image processing topics in recent years.
As far as steganalysis is concerned, the use of CNNs achieves state-of-the-art results.
The performance of such networks often depends on the size of their learning database.
An obvious preliminary assumption could be considering that "the bigger a database is, the better the results are".
However, it appears that caution must be taken when increasing the database size if one desires to improve the classification accuracy, i.e., enhance the steganalysis efficiency.
To our knowledge, no study has been performed on the enrichment impact of a learning database on the steganalysis performance.
What kind of images can be added to the initial learning set?
What are the sensitive criteria: the camera models used for acquiring the images, the processing applied to the images, the proportions of each camera in the database, etc.?
This article continues the work carried out in a previous paper, and explores the ways to improve the performances of CNN.
It aims at studying the effects of "base augmentation" on the performance of steganalysis using a CNN.
We present the results of this study using various experimental protocols and various databases to define the good practices in base augmentation for steganalysis.
Object queries are essential in information seeking and decision making in vast areas of applications.
However, a query may involve complex conditions on objects and sets, which can be arbitrarily nested and aliased.
The objects and sets involved as well as the demand---the given parameter values of interest---can change arbitrarily.
How to implement object queries efficiently under all possible updates, and furthermore to provide complexity guarantees?
This paper describes an automatic method.
The method allows powerful queries to be written completely declaratively.
It transforms demand as well as all objects and sets into relations.
Most importantly, it defines invariants for not only the query results, but also all auxiliary values about the objects and sets involved, including those for propagating demand, and incrementally maintains all of them.
Implementation and experiments with problems from a variety of application areas, including distributed algorithms and probabilistic queries, confirm the analyzed complexities, trade-offs, and significant improvements over prior work.
This study concerned the active use of Wikipedia as a teaching tool in the classroom in higher education, trying to identify different usage profiles and their characterization.
A questionnaire survey was administered to all full-time and part-time teachers at the Universitat Oberta de Catalunya and the Universitat Pompeu Fabra, both in Barcelona, Spain.
The questionnaire was designed using the Technology Acceptance Model as a reference, including items about teachers' Web 2.0 profiles, Wikipedia usage, expertise, perceived usefulness, ease of use, visibility, and quality, as well as Wikipedia's status among colleagues and incentives to use it more actively.
Clustering and statistical analysis were carried out using the k-medoids algorithm and differences between clusters were assessed by means of contingency tables and generalized linear models (logit).
The respondents were classified in four clusters, from less to more likely to adopt and use Wikipedia in the classroom, namely averse (25.4%), reluctant (17.9%), open (29.5%) and proactive (27.2%).
Proactive faculty are mostly men teaching part-time in STEM fields, mainly engineering, while averse faculty are mostly women teaching full-time in non-STEM fields.
Nevertheless, questionnaire items related to visibility, quality, image, usefulness and expertise determine the main differences between clusters, rather than age, gender or domain.
Clusters involving a positive view of Wikipedia and at least some frequency of use clearly outnumber those with a strictly negative stance.
This goes against the common view that faculty members are mostly sceptical about Wikipedia.
Environmental factors such as academic culture and colleagues' opinions are more important than faculty members' personal characteristics, especially with respect to what they think about Wikipedia's quality.
Regularizing the gradient norm of the output of a neural network with respect to its inputs is a powerful technique, rediscovered several times.
This paper presents evidence that gradient regularization can consistently improve classification accuracy on vision tasks, using modern deep neural networks, especially when the amount of training data is small.
We introduce our regularizers as members of a broader class of Jacobian-based regularizers.
We demonstrate empirically on real and synthetic data that the learning process leads to gradients controlled beyond the training points, and results in solutions that generalize well.
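A minimal instance of gradient-norm regularization on a logistic model (my own toy setup with hand-derived gradients; deep-network implementations would use double backpropagation through automatic differentiation instead):

```python
import numpy as np

# For f(x) = sigmoid(w.x), the input gradient is sigma'(w.x) * w, so the
# penalty mean ||df/dx||^2 = mean(sigma'(z)^2) * ||w||^2 is added to the
# cross-entropy loss and differentiated with respect to w.
rng = np.random.default_rng(6)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

def train(lam, steps=2000, lr=0.1):
    w = np.zeros(5)
    n = len(y)
    for _ in range(steps):
        z = X @ w
        p = 1 / (1 + np.exp(-z))
        grad_ce = X.T @ (p - y) / n
        s = p * (1 - p)                          # sigma'(z)
        # exact gradient of the penalty mean(s^2) * ||w||^2,
        # using sigma''(z) = s * (1 - 2p)
        grad_pen = 2 * np.mean(s ** 2) * w \
                 + 2 * (w @ w) * X.T @ (s ** 2 * (1 - 2 * p)) / n
        w -= lr * (grad_ce + lam * grad_pen)
    return w

w_plain = train(0.0)
w_reg = train(1.0)
print(np.linalg.norm(w_plain), np.linalg.norm(w_reg))
```

On this separable toy data the unpenalized weights keep growing, while the gradient-norm penalty keeps the input gradients, and hence the weight norm, controlled.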
Modern multi-core systems have a large number of design parameters, most of which are discrete-valued, and this number is likely to keep increasing as chip complexity rises.
Further, the accurate evaluation of a potential design choice is computationally expensive because it requires detailed cycle-accurate system simulation.
If the discrete parameter space can be embedded into a larger continuous parameter space, then continuous space techniques can, in principle, be applied to the system optimization problem.
Such continuous space techniques often scale well with the number of parameters.
We propose a novel technique for embedding the discrete parameter space into an extended continuous space so that continuous space techniques can be applied to the embedded problem using cycle-accurate simulation for evaluating the objective function.
This embedding is implemented using simulation-based ergodic interpolation, which, unlike spatial interpolation, produces the interpolated value within a single simulation run irrespective of the number of parameters.
We have implemented this interpolation scheme in a cycle-based system simulator.
In a characterization study, we observe that the interpolated performance curves are continuous, piece-wise smooth, and have low statistical error.
We use the ergodic interpolation-based approach to solve a large multi-core design optimization problem with 31 design parameters.
Our results indicate that continuous space optimization using ergodic interpolation-based embedding can be a viable approach for large multi-core design optimization problems.
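The ergodic-interpolation idea can be illustrated with a toy model: within a single "run", a fractional parameter value is realized by dwelling at its two integer neighbors with a duty cycle equal to the fractional part, and the metric is time-averaged. The performance function and constants below are stand-ins, not the paper's simulator:

```python
import math
import random

def discrete_performance(cache_ways):
    # Stand-in for a cycle-accurate simulation metric at an integer design point.
    return 1.0 - 1.0 / (1 + cache_ways)

def ergodic_interpolate(x, steps=20000, seed=0):
    # Alternate between floor(x) and ceil(x) with duty cycle equal to the
    # fractional part of x, then time-average the observed metric.
    rng = random.Random(seed)
    lo, frac = math.floor(x), x - math.floor(x)
    total = 0.0
    for _ in range(steps):
        point = lo + 1 if rng.random() < frac else lo
        total += discrete_performance(point)
    return total / steps
```

By construction, the interpolated value at x = 2.5 lies between the metric at 2 and at 3, and a single averaging run suffices regardless of how many parameters are fractional, which is the property the abstract contrasts with spatial interpolation.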
Plasmas with varying collisionalities occur in many applications, such as tokamak edge regions, where the flows are characterized by significant variations in density and temperature.
While a kinetic model is necessary for weakly-collisional high-temperature plasmas, high collisionality in colder regions renders the equations numerically stiff due to disparate time scales.
In this paper, we propose an implicit-explicit algorithm for such cases, where the collisional term is integrated implicitly in time, while the advective term is integrated explicitly in time, thus allowing time step sizes that are comparable to the advective time scales.
This partitioning results in a more efficient algorithm than those using explicit time integrators, where the time step sizes are constrained by the stiff collisional time scales.
We implement semi-implicit additive Runge-Kutta methods in COGENT, a finite-volume gyrokinetic code for mapped, multiblock grids and test the accuracy, convergence, and computational cost of these semi-implicit methods for test cases with highly-collisional plasmas.
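The implicit-explicit partitioning can be sketched with a scalar model problem: a non-stiff advective term treated explicitly and a stiff collisional relaxation -(y - y_eq)/eps treated implicitly. This first-order IMEX Euler step is a simplification of the additive Runge-Kutta schemes named above:

```python
def imex_euler(y0, y_eq, eps, advect, dt, n_steps):
    # Explicit advection, implicit stiff relaxation -(y - y_eq)/eps.
    # Solving the implicit part exactly gives:
    #   y_{n+1} = (y_n + dt*advect(y_n) + dt*y_eq/eps) / (1 + dt/eps)
    y = y0
    for _ in range(n_steps):
        y = (y + dt * advect(y) + dt * y_eq / eps) / (1.0 + dt / eps)
    return y
```

With eps = 1e-6 and dt = 0.1, the step size exceeds the collisional scale by five orders of magnitude yet the iteration remains stable and relaxes to y_eq, which is exactly the efficiency argument made in the abstract.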
Magnetic skyrmions are promising candidates for next-generation information carriers, owing to their small size, topological stability, and ultralow depinning current density.
A wide variety of skyrmionic device concepts and prototypes have been proposed, highlighting their potential applications.
Here, we report on a bioinspired skyrmionic device with synaptic plasticity.
The synaptic weight of the proposed device can be strengthened/weakened by positive/negative stimuli, mimicking the potentiation/depression process of a biological synapse.
Both short-term plasticity (STP) and long-term potentiation (LTP) functionalities have been demonstrated for a spike-timing-dependent plasticity (STDP) scheme.
This proposal suggests new possibilities for synaptic devices for use in spiking neuromorphic computing applications.
Word segmentation is the task of inserting or deleting word boundary characters in order to separate character sequences that correspond to words in some language.
In this article we propose an approach based on a beam search algorithm and a language model working at the byte/character level, the latter component implemented either as an n-gram model or a recurrent neural network.
The resulting system analyzes the text input with no word boundaries one token at a time, which can be a character or a byte, and uses the information gathered by the language model to determine if a boundary must be placed in the current position or not.
Our aim is to use this system in a preprocessing step for a microtext normalization system.
This means that it needs to effectively cope with the data sparsity present in this kind of text.
We also strove to surpass the performance of two readily available word segmentation systems: The well-known and accessible Word Breaker by Microsoft, and the Python module WordSegment by Grant Jenks.
The results show that we have met our objectives, and we hope to continue to improve both the precision and the efficiency of our system in the future.
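The beam-search segmentation described above can be sketched with a unigram language model; the vocabulary and probabilities are toy values, and the real system works at the byte/character level with an n-gram or recurrent LM:

```python
import math

# Toy unigram language model: log-probabilities for a tiny vocabulary.
LM = {w: math.log(p) for w, p in {
    "this": 0.2, "is": 0.2, "a": 0.2, "test": 0.2,
    "at": 0.05, "his": 0.05, "est": 0.1}.items()}

def segment(text, beam_width=5, max_word_len=6):
    # Each hypothesis: (score, words_so_far, chars_consumed).
    beams = [(0.0, [], 0)]
    for _ in range(len(text)):
        candidates = []
        for score, words, i in beams:
            if i == len(text):          # finished hypothesis: keep it alive
                candidates.append((score, words, i))
                continue
            for j in range(i + 1, min(i + max_word_len, len(text)) + 1):
                w = text[i:j]
                if w in LM:
                    candidates.append((score + LM[w], words + [w], j))
        beams = sorted(candidates, key=lambda b: -b[0])[:beam_width]
    finished = [b for b in beams if b[2] == len(text)]
    return " ".join(max(finished)[1]) if finished else None
```

The beam keeps the highest-scoring partial segmentations at each step, so locally plausible but globally poor splits (here, "at est") are outscored by the segmentation the language model prefers.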
We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy.
A set of representatives provides an intuitive description of each cluster, supports the clustering process, and helps to interpret the clustering results.
The projection-based nature of the clustering approach allows us to bypass dimensionality and feature extraction problems that arise in the context of graph datasets reduced to pairwise distances or feature vectors.
While achieving high quality and (human) interpretable clusterings, the runtime of the algorithm only grows linearly with the number of graphs.
Furthermore, the approach is easy to parallelize and therefore suitable for very large datasets.
Our extensive experimental evaluation on synthetic and real world datasets demonstrates the superiority of our approach over existing structural and subspace clustering algorithms, both, from a runtime and quality point of view.
We consider the problem of property testing for differential privacy: with black-box access to a purportedly private algorithm, can we verify its privacy guarantees?
In particular, we show that any privacy guarantee that can be efficiently verified is also efficiently breakable in the sense that there exist two databases between which we can efficiently distinguish.
We give lower bounds on the query complexity of verifying pure differential privacy, approximate differential privacy, random pure differential privacy, and random approximate differential privacy.
We also give algorithmic upper bounds.
The lower bounds obtained in this work are infeasible for the scale of parameters that are typically considered reasonable in the differential privacy literature, even when we suppose that the verifier has access to an (untrusted) description of the algorithm.
A central message of this work is that verifying privacy requires compromise by either the verifier or the algorithm owner.
Either the verifier has to be satisfied with a weak privacy guarantee, or the algorithm owner has to compromise on side information or access to the algorithm.
This paper describes the stages faced during the development of an Android program which obtains and decodes live images from DJI Phantom 3 Professional Drone and implements certain features of the TensorFlow Android Camera Demo application.
Test runs were made and outputs of the application were noted.
A lake was classified as seashore, breakwater and pier with proximities of 24.44%, 21.16% and 12.96% respectively.
The joystick of the UAV controller and the laptop keyboard were classified with proximities of 19.10% and 13.96% respectively.
The laptop monitor was classified as screen, monitor and television with proximities of 18.77%, 14.76% and 14.00% respectively.
The computer used during the development of this study was classified as notebook and laptop with proximities of 20.04% and 11.68% respectively.
A tractor parked at a parking lot was classified with the proximity of 12.88%.
A group of cars in the same parking lot was classified as sports car, racer and convertible with proximities of 31.75%, 18.64% and 13.45% respectively, at an inference time of 851 ms.
We propose a novel unsupervised image segmentation algorithm, which aims to segment an image into several coherent parts.
It requires no user input, no supervised learning phase and assumes an unknown number of segments.
It achieves this by first over-segmenting the image into several hundred superpixels.
These are iteratively joined on the basis of a discriminative classifier trained on color and texture information obtained from each superpixel.
The output of the classifier is regularized by a Markov random field that lends more influence to neighbouring superpixels that are more similar.
In each iteration, similar superpixels fall under the same label, until only a few coherent regions remain in the image.
The algorithm was tested on a standard evaluation data set, where it performs on par with state-of-the-art algorithms in terms of precision and greatly outperforms the state of the art by reducing the oversegmentation of the object of interest.
Dropout is a very effective way of regularizing neural networks.
Stochastically "dropping out" units with a certain probability discourages over-specific co-adaptations of feature detectors, preventing overfitting and improving network generalization.
In addition, Dropout can be interpreted as an approximate model aggregation technique, where an exponential number of smaller networks are averaged in order to obtain a more powerful ensemble.
In this paper, we show that using a fixed dropout probability during training is a suboptimal choice.
We thus propose a time scheduling for the probability of retaining neurons in the network.
This induces an adaptive regularization scheme that smoothly increases the difficulty of the optimization problem.
This idea of "starting easy" and adaptively increasing the difficulty of the learning problem has its roots in curriculum learning and allows one to train better models.
Indeed, we prove that our optimization strategy implements a very general curriculum scheme, by gradually adding noise to both the input and intermediate feature representations within the network architecture.
Experiments on seven image classification datasets and different network architectures show that our method, named Curriculum Dropout, frequently yields better generalization and, at worst, performs just as well as the standard Dropout method.
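The time-scheduled retain probability can be sketched as follows; the exponential form decaying from 1 (no dropout, easy problem) toward an asymptotic value matches the curriculum idea described above, while the constants here are illustrative assumptions:

```python
import numpy as np

def retain_prob(t, p_inf=0.5, gamma=1e-3):
    # Scheduled probability of *retaining* a unit: starts at 1 (no dropout)
    # and decays smoothly toward the asymptotic value p_inf.
    return (1.0 - p_inf) * np.exp(-gamma * t) + p_inf

def curriculum_dropout(x, t, rng, p_inf=0.5, gamma=1e-3):
    p = retain_prob(t, p_inf, gamma)
    mask = (rng.random(x.shape) < p).astype(x.dtype)
    return x * mask / p  # inverted-dropout scaling keeps activations unbiased
```

Early in training almost no units are dropped, so the optimization problem is easy; as t grows the noise level rises toward standard Dropout, gradually increasing the difficulty of the learning problem.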
An important issue with oversampled FIR analysis filter banks (FBs) is to determine inverse synthesis FBs, when they exist.
Given any complex oversampled FIR analysis FB, we first provide an algorithm to determine whether there exists an inverse FIR synthesis system.
We also provide a method to ensure the Hermitian symmetry property on the synthesis side, which is serviceable to processing real-valued signals.
As an invertible analysis scheme corresponds to a redundant decomposition, there is no unique inverse FB.
Given a particular solution, we parameterize the whole family of inverses through a null space projection.
The resulting reduced parameter set simplifies design procedures, since the perfect reconstruction constrained optimization problem is recast as an unconstrained optimization problem.
The design of optimized synthesis FBs based on time or frequency localization criteria is then investigated, using a simple yet efficient gradient algorithm.
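In polyphase notation, the null-space parameterization of the inverses can be sketched as follows, where A(z) denotes the tall analysis polyphase matrix and S_0(z) any particular FIR left inverse (the symbols are illustrative, not necessarily the paper's notation):

```latex
S(z) \;=\; S_0(z) \;+\; \Lambda(z)\,\bigl(I - A(z)\,S_0(z)\bigr),
\qquad \Lambda(z)\ \text{an arbitrary FIR matrix.}
```

Indeed, since $S_0(z)A(z) = I$, every such $S(z)$ satisfies $S(z)A(z) = I + \Lambda(z)\bigl(A(z) - A(z)\bigr) = I$, so optimizing over the unconstrained parameter $\Lambda(z)$ preserves perfect reconstruction automatically, which is why the constrained design problem becomes unconstrained.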
Face recognition technology has demonstrated tremendous progress over the past few years, primarily due to advances in representation learning.
As we witness the widespread adoption of these systems, it is imperative to consider the security of face representations.
In this paper, we explore the practicality of using a fully homomorphic encryption based framework to secure a database of face templates.
This framework is designed to preserve the privacy of users and prevent information leakage from the templates, while maintaining their utility through template matching directly in the encrypted domain.
Additionally, we also explore a batching and dimensionality reduction scheme to trade-off face matching accuracy and computational complexity.
Experiments on benchmark face datasets (LFW, IJB-A, IJB-B, CASIA) indicate that secure face matching can be practically feasible (16 KB template size and 0.01 sec per match pair for 512-dimensional features from SphereFace) while exhibiting minimal loss in matching performance.
Modern information systems are changing the idea of "data processing" to the idea of "concept processing", meaning that instead of processing words, such systems process semantic concepts which carry meaning and share contexts with other concepts.
Ontology is commonly used as a structure that captures the knowledge about a certain area via providing concepts and relations between them.
Traditionally, concept hierarchies have been built manually by knowledge engineers or domain experts.
However, the manual construction of a concept hierarchy suffers from several limitations such as its coverage and the enormous costs of its extension and maintenance.
Ontology learning, which usually refers to (semi-)automatic support in ontology development, is typically divided into steps, going from concept identification, through hierarchical and non-hierarchical relation detection, and, seldom, axiom extraction.
It is reasonable to say that among these steps the current frontier is in the establishment of concept hierarchies, since this is the backbone of ontologies and, therefore, a good concept hierarchy is already a valuable resource for many ontology applications.
The automatic construction of concept hierarchies from texts is a complex task, and much work has proposed approaches to better extract relations between concepts.
These different proposals have never been contrasted against each other on the same set of data and across different languages.
Such a comparison is important to see whether they are complementary or incremental.
Also, we can see whether they present different tendencies towards recall and precision.
This paper evaluates these different methods on the basis of hierarchy metrics such as density and depth, and evaluation metrics such as Recall and Precision.
Results shed light over the comprehensive set of methods according to the literature in the area.
It is well known that closed-form analytical solutions for AC power flow equations do not exist in general.
This paper proposes a multi-dimensional holomorphic embedding method (MDHEM) to obtain an explicit approximate analytical AC power-flow solution by finding a physical germ solution and arbitrarily embedding each power, each load or groups of loads with respective scales.
Based on the MDHEM, the complete approximate analytical solutions to the power flow equations in the high-dimensional space become achievable, since the voltage vector of each bus can be explicitly expressed by a convergent multivariate power series of all the loads.
Unlike the traditional iterative methods for power flow calculation and inaccurate sensitivity analysis method for voltage control, the algebraic variables of a power system in all operating conditions can be prepared offline and evaluated online by only plugging in the values of any operating conditions into the scales of the non-linear multivariate power series.
Case studies implemented on the 4-bus test system and the IEEE 14-bus standard system confirm the effectiveness of the proposed method.
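For background, the single-parameter holomorphic embedding that the multi-dimensional method generalizes can be sketched as follows (standard HEM form for PQ buses, ignoring shunt and slack details; s is the embedding variable):

```latex
\sum_{k} Y_{ik}\,V_k(s) \;=\; \frac{s\,S_i^{*}}{V_i^{*}(s^{*})},
\qquad
V_i(s) \;=\; \sum_{n=0}^{\infty} V_i[n]\,s^{n}.
```

Matching powers of $s$ yields a recursion for the coefficients $V_i[n]$, and the physical solution is recovered at $s=1$. The MDHEM replaces the single scale $s$ by several independent scales (one per embedded power or load group), so the voltage becomes a multivariate power series in all operating-condition parameters.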
It is well-known that degree two finite field extensions can be equipped with a Hermitian-like structure similar to the extension of the complex field over the reals.
In this contribution, using this structure, we develop a modular character theory and the appropriate Fourier transform for some particular kind of finite Abelian groups.
Moreover we introduce the notion of bent functions for finite field valued functions rather than usual complex-valued functions, and we study several of their properties.
In particular we prove that this bentness notion is a consequence of that of Logachev, Salnikov and Yashchenko, introduced in "Bent functions on a finite Abelian group" (1997).
In addition this new bentness notion is also generalized to a vectorial setting.
The basic features of some of the most versatile and popular open source frameworks for machine learning (TensorFlow, Deep Learning4j, and H2O) are considered and compared.
Their comparative analysis was performed and conclusions were made as to the advantages and disadvantages of these platforms.
The performance tests for the de facto standard MNIST data set were carried out on the H2O framework for deep learning algorithms designed for CPU and GPU platforms in single-threaded and multithreaded modes of operation.
We take up the challenge of designing realistic computational models of large interacting cell populations.
The goal is essentially to bring Gillespie's celebrated stochastic methodology to the level of an interacting population of cells.
Specifically, we are interested in how the gold standard of single cell computational modeling, here taken to be spatial stochastic reaction-diffusion models, may be efficiently coupled with a similar approach at the cell population level.
Concretely, we target a recently proposed set of pathways for pattern formation involving Notch-Delta signaling mechanisms.
These involve cell-to-cell communication as mediated both via direct membrane contact sites as well as via cellular protrusions.
We explain how to simulate the process in growing tissue using a multilevel approach and we discuss implications for future development of the associated computational methods.
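Since the abstract builds on Gillespie's stochastic methodology, a minimal sketch of the core stochastic simulation algorithm may help; the reaction system below (a birth-death process) is a toy example, not the Notch-Delta pathway model:

```python
import random

def gillespie(rates_fn, update_fn, state, t_end, seed=0):
    # Stochastic simulation algorithm: exponentially distributed waiting
    # times, with the next reaction chosen proportionally to its propensity.
    rng = random.Random(seed)
    t = 0.0
    while t < t_end:
        rates = rates_fn(state)
        total = sum(rates)
        if total == 0.0:
            break
        t += rng.expovariate(total)
        if t >= t_end:
            break
        r = rng.random() * total
        for i, a in enumerate(rates):
            r -= a
            if r < 0.0:
                state = update_fn(state, i)
                break
    return state
```

For example, a birth-death process with birth rate 2 and per-capita death rate 0.5 is `gillespie(lambda n: [2.0, 0.5 * n], lambda n, i: n + 1 if i == 0 else n - 1, 0, 50.0)`. The multilevel methods the abstract targets couple runs like this within each cell to a population-level spatial model.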
We report on a data-driven investigation aimed at understanding the dynamics of message spreading in a real-world dynamical network of human proximity.
We use data collected by means of a proximity-sensing network of wearable sensors that we deployed at three different social gatherings, simultaneously involving several hundred individuals.
We simulate a message spreading process over the recorded proximity network, focusing on both the topological and the temporal properties.
We show that by using an appropriate technique to deal with the temporal heterogeneity of proximity events, a universal statistical pattern emerges for the delivery times of messages, robust across all the data sets.
Our results are useful to set constraints for generic processes of data dissemination, as well as to validate established models of human mobility and proximity that are frequently used to simulate realistic behaviors.
This paper proposes a novel deep architecture to address multi-label image recognition, a fundamental and practical task towards general visual understanding.
Current solutions for this task usually rely on an extra step of extracting hypothesis regions (i.e., region proposals), resulting in redundant computation and sub-optimal performance.
In this work, we achieve the interpretable and contextualized multi-label image classification by developing a recurrent memorized-attention module.
This module consists of two alternately performed components: i) a spatial transformer layer to locate attentional regions from the convolutional feature maps in a region-proposal-free way and ii) an LSTM (Long-Short Term Memory) sub-network to sequentially predict semantic labeling scores on the located regions while capturing the global dependencies of these regions.
The LSTM also outputs the parameters for computing the spatial transformer.
On large-scale benchmarks of multi-label image classification (e.g., MS-COCO and PASCAL VOC 07), our approach demonstrates superior performance over existing state-of-the-art methods in both accuracy and efficiency.
Millimeter wave (mmWave) communication is one feasible solution for high data-rate applications like vehicle-to-everything communication and next generation cellular communication.
Configuring mmWave links, which can be done through channel estimation or beam-selection, however, is a source of significant overhead.
In this paper, we propose to use spatial information extracted at sub-6 GHz to help establish the mmWave link.
First, we review the prior work on frequency dependent channel behavior and outline a simulation strategy to generate multi-band frequency dependent channels.
Second, assuming: (i) narrowband channels and a fully digital architecture at sub-6 GHz; and (ii) wideband frequency selective channels, OFDM signaling, and an analog architecture at mmWave, we outline strategies to incorporate sub-6 GHz spatial information in mmWave compressed beam selection.
We formulate compressed beam-selection as a weighted sparse signal recovery problem, and obtain the weighting information from sub-6 GHz channels.
In addition, we outline a structured precoder/combiner design to tailor the training to out-of-band information.
We also extend the proposed out-of-band aided compressed beam-selection approach to leverage information from all active OFDM subcarriers.
The simulation results for achievable rate show that out-of-band aided beam-selection can reduce the training overhead of in-band only beam-selection by 4x.
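The weighted sparse recovery step can be sketched with a weighted variant of orthogonal matching pursuit, where the sub-6 GHz information enters as per-atom weights that bias atom selection. This is an illustrative sketch of the recovery formulation, not the authors' exact algorithm:

```python
import numpy as np

def weighted_omp(A, y, weights, k):
    # Weighted OMP: out-of-band weights bias which dictionary atom is picked.
    residual = y.copy()
    support = []
    for _ in range(k):
        corr = np.abs(A.conj().T @ residual) * weights
        corr[support] = 0.0                     # do not re-pick chosen atoms
        support.append(int(np.argmax(corr)))
        As = A[:, support]
        x_s, *_ = np.linalg.lstsq(As, y, rcond=None)
        residual = y - As @ x_s                 # project out chosen atoms
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = x_s
    return x, sorted(support)
```

With uniform weights this reduces to standard OMP; concentrating weight on directions observed at sub-6 GHz steers the search toward the likely mmWave beam directions, which is how the overhead reduction arises.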
This paper describes a distributed MapReduce implementation of the minimum Redundancy Maximum Relevance algorithm, a popular feature selection method in bioinformatics and network inference problems.
The proposed approach handles both tall/narrow and wide/short datasets.
We further provide an open source implementation based on Hadoop/Spark, and illustrate its scalability on datasets involving millions of observations or features.
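The serial mRMR criterion that the distributed implementation parallelizes can be sketched as follows: greedily pick the feature maximizing relevance to the target minus mean redundancy with the already-selected set. The mutual-information estimator here assumes discrete features and is illustrative:

```python
import numpy as np

def mutual_info_discrete(x, y):
    # Plug-in MI estimate between two discrete vectors from joint counts.
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                px, py = np.mean(x == a), np.mean(y == b)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def mrmr(X, y, k):
    # minimum Redundancy Maximum Relevance, difference form.
    n_feat = X.shape[1]
    relevance = [mutual_info_discrete(X[:, j], y) for j in range(n_feat)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean(
                [mutual_info_discrete(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

The MapReduce version distributes the mutual-information computations (over features for tall/narrow data, over observations for wide/short data), while the greedy selection loop itself stays sequential.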
With the prevalence of accessible depth sensors, dynamic human body skeletons have attracted much attention as a robust modality for action recognition.
Previous methods model skeletons based on RNNs or CNNs, which have limited expressive power for irregular joints.
In this paper, we represent skeletons naturally on graphs and propose a generalized graph convolutional neural network (GGCN) for skeleton-based action recognition, aiming to capture space-time variation via spectral graph theory.
In particular, we construct a generalized graph over consecutive frames, where each joint is not only connected to its neighboring joints in the same frame strongly or weakly, but also linked with relevant joints in the previous and subsequent frames.
The generalized graphs are then fed into GGCN along with the coordinate matrix of the skeleton sequence for feature learning, where we deploy high-order and fast Chebyshev approximation of spectral graph convolution in the network.
Experiments show that we achieve the state-of-the-art performance on the widely used NTU RGB+D, UT-Kinect and SYSU 3D datasets.
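The Chebyshev approximation of spectral graph convolution mentioned above can be sketched in NumPy; the graph, filter order, and coefficients below are illustrative, and the real network learns the coefficients per layer:

```python
import numpy as np

def normalized_laplacian(A):
    # Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    d = A.sum(axis=1)
    d_inv = np.where(d > 0, d ** -0.5, 0.0)
    return np.eye(len(A)) - d_inv[:, None] * A * d_inv[None, :]

def chebyshev_conv(X, A, theta):
    # K-th order Chebyshev filter: sum_k theta_k T_k(L_hat) X,
    # with L_hat = 2 L / lambda_max - I, avoiding eigendecomposition of L
    # at filtering time (only matrix-vector products are needed).
    L = normalized_laplacian(A)
    lmax = np.linalg.eigvalsh(L).max()
    L_hat = 2.0 * L / lmax - np.eye(len(A))
    Tkm2, Tkm1 = X, L_hat @ X
    out = theta[0] * Tkm2 + (theta[1] * Tkm1 if len(theta) > 1 else 0)
    for k in range(2, len(theta)):
        Tk = 2 * L_hat @ Tkm1 - Tkm2        # Chebyshev recurrence
        out = out + theta[k] * Tk
        Tkm2, Tkm1 = Tkm1, Tk
    return out
```

In the skeleton setting, the rows of X hold joint coordinates and A encodes the strong/weak intra-frame links and the inter-frame links of the generalized graph.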
Human action recognition in 3D skeleton sequences has attracted a lot of research attention.
Recently, Long Short-Term Memory (LSTM) networks have shown promising performance in this task due to their strengths in modeling the dependencies and dynamics in sequential data.
As not all skeletal joints are informative for action recognition, and the irrelevant joints often bring noise which can degrade the performance, we need to pay more attention to the informative ones.
However, the original LSTM network does not have explicit attention ability.
In this paper, we propose a new class of LSTM network, Global Context-Aware Attention LSTM (GCA-LSTM), for skeleton based action recognition.
This network is capable of selectively focusing on the informative joints in each frame of each skeleton sequence by using a global context memory cell.
To further improve the attention capability of our network, we also introduce a recurrent attention mechanism, with which the attention performance of the network can be enhanced progressively.
Moreover, we propose a stepwise training scheme in order to train our network effectively.
Our approach achieves state-of-the-art performance on five challenging benchmark datasets for skeleton based action recognition.
This article defines the complement of a function, gives conditions for the existence of such a complement, and presents a few algorithms to construct one.
We consider the task of learning to estimate human pose in still images.
In order to avoid the high cost of full supervision, we propose to use a diverse data set, which consists of two types of annotations: (i) a small number of images are labeled using the expensive ground-truth pose; and (ii) other images are labeled using the inexpensive action label.
As action information helps narrow down the pose of a human, we argue that this approach can help reduce the cost of training without significantly affecting the accuracy.
To demonstrate this we design a probabilistic framework that employs two distributions: (i) a conditional distribution to model the uncertainty over the human pose given the image and the action; and (ii) a prediction distribution, which provides the pose of an image without using any action information.
We jointly estimate the parameters of the two aforementioned distributions by minimizing their dissimilarity coefficient, as measured by a task-specific loss function.
During both training and testing, we only require an efficient sampling strategy for both the aforementioned distributions.
This allows us to use deep probabilistic networks that are capable of providing accurate pose estimates for previously unseen images.
Using the MPII data set, we show that our approach outperforms baseline methods that either do not use the diverse annotations or rely on pointwise estimates of the pose.
In this work, we consider the problem of influence maximization on a hypergraph.
We first extend the Independent Cascade (IC) model to hypergraphs, and prove that the traditional influence maximization problem remains submodular.
We then present a variant of the influence maximization problem (HEMI) where one seeks to maximize the number of hyperedges, a majority of whose nodes are influenced.
We prove that HEMI is non-submodular under the diffusion model proposed.
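For reference, the classical greedy approach to influence maximization under the standard (graph) Independent Cascade model can be sketched as below; note this addresses the submodular baseline problem, not the non-submodular HEMI variant introduced above:

```python
import random

def ic_spread(graph, seeds, p=0.3, rng=None):
    # One Monte-Carlo run of the Independent Cascade model:
    # each newly active node tries once to activate each neighbor w.p. p.
    rng = rng or random
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def greedy_im(graph, k, runs=200, seed=0):
    # Greedy seed selection with Monte-Carlo spread estimates; for
    # submodular spread functions this gives a (1 - 1/e) guarantee.
    rng = random.Random(seed)
    nodes = set(graph) | {v for vs in graph.values() for v in vs}
    seeds = []
    for _ in range(k):
        best, best_gain = None, -1.0
        for v in nodes - set(seeds):
            gain = sum(ic_spread(graph, seeds + [v], rng=rng)
                       for _ in range(runs)) / runs
            if gain > best_gain:
                best, best_gain = v, gain
        seeds.append(best)
    return seeds
```

The non-submodularity result for HEMI means precisely that this greedy guarantee does not carry over when the objective counts majority-influenced hyperedges.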
Open-domain human-computer conversation has been attracting increasing attention over the past few years.
However, there does not exist a standard automatic evaluation metric for open-domain dialog systems; researchers usually resort to human annotation for model evaluation, which is time- and labor-intensive.
In this paper, we propose RUBER, a Referenced metric and Unreferenced metric Blended Evaluation Routine, which evaluates a reply by taking into consideration both a groundtruth reply and a query (previous user-issued utterance).
Our metric is learnable, but its training does not require labels of human satisfaction.
Hence, RUBER is flexible and extensible to different datasets and languages.
Experiments on both retrieval and generative dialog systems show that RUBER has a high correlation with human annotation.
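The blending structure of such a metric can be sketched as follows; the unreferenced score is stubbed as an input here, whereas in RUBER it comes from a trained query-reply matching network, so this is only an illustration of the referenced/unreferenced combination:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def ruber_style_score(reply_vec, groundtruth_vec, unreferenced_score,
                      blend="mean"):
    # Referenced part: embedding similarity of the reply to the ground truth.
    referenced = cosine(reply_vec, groundtruth_vec)
    if blend == "min":
        return min(referenced, unreferenced_score)
    if blend == "max":
        return max(referenced, unreferenced_score)
    return 0.5 * (referenced + unreferenced_score)
```

Blending matters because a reply can be good while differing from the single ground-truth reply (high unreferenced, low referenced), so neither component alone correlates as well with human judgment.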
Although the recent progress is substantial, deep learning methods can be vulnerable to the maliciously generated adversarial examples.
In this paper, we present a novel training procedure and a thresholding test strategy, towards robust detection of adversarial examples.
In training, we propose to minimize the reverse cross-entropy (RCE), which encourages a deep network to learn latent representations that better distinguish adversarial examples from normal ones.
In testing, we propose to use a thresholding strategy as the detector to filter out adversarial examples for reliable predictions.
Our method is simple to implement using standard algorithms, with little extra training cost compared to the common cross-entropy minimization.
We apply our method to defend various attacking methods on the widely used MNIST and CIFAR-10 datasets, and achieve significant improvements on robust predictions under all the threat models in the adversarial setting.
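One common presentation of the reverse cross-entropy objective uses a "reverse" label distribution placing uniform mass 1/(C-1) on the non-true classes; the snippet below computes that loss and is a sketch of the training objective only, not the full pipeline with the thresholding detector:

```python
import numpy as np

def reverse_cross_entropy(logits, y, eps=1e-12):
    # Reverse label distribution: zero on the true class, uniform elsewhere.
    n, c = logits.shape
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    r = np.full((n, c), 1.0 / (c - 1))
    r[np.arange(n), y] = 0.0
    # Cross-entropy between the reverse labels and the softmax prediction.
    return -np.mean(np.sum(r * np.log(p + eps), axis=1))
```

Training against the reverse labels shapes the non-maximal softmax outputs, which is what lets a simple threshold on the latent representation separate adversarial examples from normal ones at test time.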
Recently, a RGB image encryption algorithm based on DNA encoding and chaos map has been proposed.
It was reported that the encryption algorithm can be broken with four pairs of chosen plain-images and the corresponding cipher-images.
This paper re-evaluates the security of the encryption algorithm, and finds that the encryption algorithm can be broken efficiently with only one known plain-image.
The effectiveness of the proposed known-plaintext attack is supported by both rigorous theoretical analysis and experimental results.
In addition, two other security defects are also reported.
We revisit the problem of asymmetric binary hypothesis testing against a composite alternative hypothesis.
We introduce a general framework to treat such problems when the alternative hypothesis adheres to certain axioms.
In this case we find the threshold rate, the optimal error and strong converse exponents (at large deviations from the threshold) and the second order asymptotics (at small deviations from the threshold).
We apply our results to find operational interpretations of various Renyi information measures.
In case the alternative hypothesis is comprised of bipartite product distributions, we find that the optimal error and strong converse exponents are determined by variations of Renyi mutual information.
In case the alternative hypothesis consists of tripartite distributions satisfying the Markov property, we find that the optimal exponents are determined by variations of Renyi conditional mutual information.
In either case the relevant notion of Renyi mutual information depends on the precise choice of the alternative hypothesis.
As such, our work also strengthens the view that different definitions of Renyi mutual information, conditional entropy and conditional mutual information are adequate depending on the context in which the measures are used.
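For concreteness, one standard definition of the Rényi divergence, from which such mutual-information variants are built, is (one of several inequivalent conventions, per the abstract's own point):

```latex
D_\alpha(P\|Q) \;=\; \frac{1}{\alpha-1}\,\log \sum_{x} P(x)^{\alpha}\,Q(x)^{1-\alpha},
\qquad
I_\alpha(X;Y) \;=\; \min_{Q_Y}\, D_\alpha\!\left(P_{XY}\,\middle\|\,P_X \times Q_Y\right).
```

Which marginals are held fixed and which are minimized over in $I_\alpha$ is exactly the degree of freedom that, in this work, is pinned down by the choice of alternative hypothesis.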
We consider perfect secret key generation for a "pairwise independent network" model in which every pair of terminals shares a random binary string, with the strings shared by distinct terminal pairs being mutually independent.
The terminals are then allowed to communicate interactively over a public noiseless channel of unlimited capacity.
All the terminals as well as an eavesdropper observe this communication.
The objective is to generate a perfect secret key shared by a given set of terminals at the largest rate possible, and concealed from the eavesdropper.
First, we show how the notion of perfect omniscience plays a central role in characterizing perfect secret key capacity.
Second, a multigraph representation of the underlying secrecy model leads us to an efficient algorithm for perfect secret key generation based on maximal Steiner tree packing.
This algorithm attains capacity when all the terminals seek to share a key, and, in general, attains at least half the capacity.
Third, when a single "helper" terminal assists the remaining "user" terminals in generating a perfect secret key, we give necessary and sufficient conditions for the optimality of the algorithm; also, a "weak" helper is shown to be sufficient for optimality.
The goal of this paper is to identify individuals by analyzing their gait.
Instead of using binary silhouettes as input data (as done in many previous works) we propose and evaluate the use of motion descriptors based on densely sampled short-term trajectories.
We take advantage of state-of-the-art people detectors to define custom spatial configurations of the descriptors around the target person, obtaining a rich representation of the gait motion.
The local motion features (described by the Divergence-Curl-Shear descriptor) extracted on the different spatial areas of the person are combined into a single high-level gait descriptor by using the Fisher Vector encoding.
The proposed approach, coined Pyramidal Fisher Motion, is experimentally validated on the 'CASIA' dataset (parts B and C), the 'TUM GAID' dataset, the 'CMU MoBo' dataset and the recent 'AVA Multiview Gait' dataset.
The results show that this new approach achieves state-of-the-art results on the problem of gait recognition, allowing walking people to be recognized from diverse viewpoints in single and multiple camera setups, wearing different clothes, carrying bags, walking at diverse speeds, and not limited to straight walking paths.
Unmanned Aerial Vehicles (UAVs) have recently rapidly grown to facilitate a wide range of innovative applications that can fundamentally change the way cyber-physical systems (CPSs) are designed.
CPSs are a modern generation of systems with synergic cooperation between computational and physical potentials that can interact with humans through several new mechanisms.
The main advantage of using UAVs in CPS applications lies in their exceptional features, including their mobility, dynamism, effortless deployment, adaptive altitude, agility, adjustability, and effective appraisal of real-world functions anytime and anywhere.
Furthermore, from the technology perspective, UAVs are predicted to be a vital element of the development of advanced CPSs.
Therefore, in this survey, we aim to pinpoint the most fundamental and important design challenges of multi-UAV systems for CPS applications.
We highlight key and versatile aspects that span the coverage and tracking of targets and infrastructure objects, energy-efficient navigation, and image analysis using machine learning for fine-grained CPS applications.
Key prototypes and testbeds are also investigated to show how these practical technologies can facilitate CPS applications.
We present state-of-the-art algorithms that address these design challenges with both quantitative and qualitative methods, and map the challenges to important CPS applications in order to draw insightful conclusions on the challenges of each application.
Finally, we summarize potential new directions and ideas that could shape future research in these areas.
This paper considers a downlink cloud radio access network (C-RAN) in which all the base-stations (BSs) are connected to a central computing cloud via digital backhaul links with finite capacities.
Each user is associated with a user-centric cluster of BSs; the central processor shares the user's data with the BSs in the cluster, which then cooperatively serve the user through joint beamforming.
Under this setup, this paper investigates the user scheduling, BS clustering and beamforming design problem from a network utility maximization perspective.
Differing from previous works, this paper explicitly considers the per-BS backhaul capacity constraints.
We formulate the network utility maximization problem for the downlink C-RAN under two different models depending on whether the BS clustering for each user is dynamic or static over different user scheduling time slots.
In the former case, the user-centric BS cluster is dynamically optimized for each scheduled user along with the beamforming vector in each time-frequency slot, while in the latter case the user-centric BS cluster is fixed for each user and we jointly optimize the user scheduling and the beamforming vector to account for the backhaul constraints.
In both cases, the nonconvex per-BS backhaul constraints are approximated using the reweighted l1-norm technique.
This approximation allows us to reformulate the per-BS backhaul constraints into weighted per-BS power constraints and solve the weighted sum rate maximization problem through a generalized weighted minimum mean square error approach.
This paper shows that the proposed dynamic clustering algorithm can achieve significant performance gain over existing naive clustering schemes.
This paper also proposes two heuristic static clustering schemes that can already achieve a substantial portion of the gain.
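The reweighted l1-norm technique used above to approximate the nonconvex backhaul constraints can be illustrated on a generic sparse-recovery toy problem. The sketch below uses a plain least-squares objective and an ISTA solver rather than the paper's beamforming formulation; all problem sizes and parameters are made up.

```python
import numpy as np

def weighted_ista(A, b, w, lam=0.1, iters=500):
    """Solve min ||Ax - b||^2 / 2 + lam * sum_i w_i |x_i| by proximal gradient."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step size (1 / Lipschitz const.)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - step * (A.T @ (A @ x - b))   # gradient step on the smooth term
        thr = step * lam * w                 # per-coordinate soft threshold
        x = np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)
    return x

def reweighted_l1(A, b, lam=0.1, eps=1e-3, rounds=5):
    """Each round penalizes currently-small coefficients more heavily,
    pushing them toward exact zero (a surrogate for the l0 'norm')."""
    w = np.ones(A.shape[1])
    for _ in range(rounds):
        x = weighted_ista(A, b, w, lam)
        w = 1.0 / (np.abs(x) + eps)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
x_true = np.zeros(20)
x_true[[2, 7, 13]] = [1.5, -2.0, 1.0]
x_hat = reweighted_l1(A, A @ x_true)
print(np.flatnonzero(np.abs(x_hat) > 0.3))   # indices of the surviving entries
```

In the paper's setting the weights play the same role, turning each per-BS backhaul constraint into a weighted per-BS power constraint.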
One Monad to Prove Them All is a modern fairy tale about curiosity and perseverance, two important properties of a successful PhD student.
We follow the PhD student Mona on her adventure of proving properties about Haskell programs in the proof assistant Coq.
On the one hand, as a PhD student in computer science Mona observes an increasing demand for correct software products.
In particular, because of the large amount of existing software, verifying existing software products becomes more important.
Verifying programs in the functional programming language Haskell is no exception.
On the other hand, Mona is delighted to see that communities in the area of theorem proving are becoming popular.
Thus, Mona sets out to learn more about the interactive theorem prover Coq and verifying Haskell programs in Coq.
To prove properties about a Haskell function in Coq, Mona has to translate the function into Coq code.
As Coq programs have to be total and Haskell programs are often not, Mona has to model partiality explicitly in Coq.
In her quest for a solution Mona finds an ancient manuscript that explains how properties about Haskell functions can be proven in the proof assistant Agda by translating Haskell programs into monadic Agda programs.
By instantiating the monadic program with a concrete monad instance the proof can be performed in either a total or a partial setting.
Mona discovers that the proposed transformation does not work in Coq due to a restriction in the termination checker.
In fact, the transformation no longer works in Agda either, as the termination checker in Agda has been improved.
We follow Mona on an educational journey through the land of functional programming where she learns about concepts like free monads and containers as well as basics and restrictions of proof assistants like Coq.
These concepts are well-known individually, but their interplay gives rise to a solution for Mona's problem based on the originally proposed monadic transformation that has not been presented before.
When Mona starts to test her approach by proving a statement about simple Haskell functions, she realizes that her approach has an additional advantage over the original idea in Agda.
Mona's final solution not only works for a specific monad instance but even allows her to prove monad-generic properties.
Instead of proving properties over and over again for specific monad instances she is able to prove properties that hold for all monads representable by a container-based instance of the free monad.
In order to strengthen her confidence in the practicability of her approach, Mona evaluates her approach in a case study that compares two implementations for queues.
In order to share the results with other functional programmers the fairy tale is available as a literate Coq file.
If you are a citizen of the land of functional programming or are at least familiar with its customs, had a journey that involved reasoning about functional programs of your own, or are just a curious soul looking for the next story about monads and proofs, then this tale is for you.
Fully Homomorphic Encryption (FHE) refers to a set of encryption schemes that allow computations to be applied directly on encrypted data without requiring a secret key.
This enables novel application scenarios where a client can safely offload storage and computation to a third-party cloud provider without having to trust the software and the hardware vendors with the decryption keys.
Recent advances in both FHE schemes and implementations have moved such applications from theoretical possibilities into the realm of practicalities.
This paper proposes a compact and well-reasoned interface called the Homomorphic Instruction Set Architecture (HISA) for developing FHE applications.
Just as the hardware ISA interface enabled hardware advances to proceed independent of software advances in the compiler and language runtimes, HISA decouples compiler optimizations and runtimes for supporting FHE applications from advancements in the underlying FHE schemes.
This paper demonstrates the capabilities of HISA by building an end-to-end software stack for evaluating neural network models on encrypted data.
Our stack includes an end-to-end compiler, runtime, and a set of optimizations.
We show that the generated code, on a set of popular neural network architectures, is faster than hand-optimized implementations.
This research considers the task of evolving the physical structure of a robot to enhance its performance in various environments, which is a significant problem in the field of Evolutionary Robotics.
Inspired by the fields of evolutionary art and sculpture, we evolve only targeted parts of a robot, which simplifies the optimisation problem compared to traditional approaches that must simultaneously evolve both (actuated) body and brain.
Exploration fidelity is emphasised in areas of the robot most likely to benefit from shape optimisation, whilst exploiting existing robot structure and control.
Our approach uses a Genetic Algorithm to optimise collections of Bezier splines that together define the shape of a legged robot's tibia, and leg performance is evaluated in parallel in a high-fidelity simulator.
The leg is represented in the simulator as a 3D-printable file, and as such can be readily instantiated in reality.
Provisional experiments in three distinct environments show the evolution of environment-specific leg structures that are both high-performing and notably different to those evolved in the other environments.
This proof-of-concept represents an important step towards the environment-dependent optimisation of performance-critical components for a range of ubiquitous, standard, and already-capable robots that can carry out a wide variety of tasks.
Suppose there is a large file which should be transmitted (or stored) and there are several (say, m) admissible data-compressors.
It seems natural to try all the compressors and then choose the best, i.e. the one that gives the shortest compressed file.
Then one transfers (or stores) the index number of the best compressor (this requires log m bits) together with the compressed file. The only problem is the time, which increases substantially due to the need to compress the file m times (in order to find the best compressor).
We propose a method that encodes the file with the optimal compressor, but uses a relatively small additional time: the ratio of this extra time and the total time of calculation can be limited by an arbitrary positive constant.
Generally speaking, in many situations it may be necessary to find the best data compressor out of a given set, which is often done by comparing them empirically.
One of the goals of this work is to turn such a selection process into a part of the data compression method, automating and optimizing it.
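The naive baseline the paper improves on (compress with every admissible compressor, keep the shortest output, and prepend the log m-bit index) can be sketched with three standard-library compressors; the particular compressors and the one-byte index encoding here are illustrative choices.

```python
import bz2
import lzma
import zlib
from math import ceil, log2

COMPRESSORS = [zlib.compress, bz2.compress, lzma.compress]

def encode_with_best(data: bytes):
    """Naive scheme: run every compressor, keep the shortest result, and
    prepend the winner's index (ceil(log2(m)) bits of overhead; stored here
    in a whole byte for simplicity)."""
    candidates = [(i, comp(data)) for i, comp in enumerate(COMPRESSORS)]
    best_i, best = min(candidates, key=lambda t: len(t[1]))
    return bytes([best_i]) + best, ceil(log2(len(COMPRESSORS)))

payload = b"abcabcabc" * 1000
encoded, index_bits = encode_with_best(payload)
print(len(encoded), index_bits)
```

The proposed method achieves the same choice of compressor while keeping the extra compression time a bounded fraction of the total.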
We propose a novel reflection color model consisting of body essence and (mixed) neuter, and present an effective method for separating dichromatic reflection components using a single image.
Body essence is an entity invariant to interface reflection, and has two degrees of freedom unlike hue and maximum chromaticity.
As a result, the proposed method is insensitive to noise and suitable for colors around CMY (cyan, magenta, and yellow) as well as RGB (red, green, and blue), in contrast to the maximum chromaticity-based methods.
Interface reflection is separated by using a Gaussian function, which removes a critical thresholding problem.
Furthermore, the method does not require any region segmentation.
Experimental results show the efficacy of the proposed model and method.
In order to improve the performance of the recently presented improved normalized subband adaptive filter (INSAF) and proportionate INSAF algorithms for highly noisy systems, this paper proposes their set-membership versions by exploiting the theory of set-membership filtering.
Apart from obtaining smaller steady-state error, the proposed algorithms significantly reduce the overall computational complexity.
In addition, to further improve the steady-state performance for the algorithms, their smooth variants are developed by using the smoothed absolute subband output errors to update the step sizes.
Simulation results in the context of acoustic echo cancellation have demonstrated the superiority of the proposed algorithms.
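The set-membership principle behind these algorithms can be illustrated with a basic set-membership NLMS filter, a simpler fullband relative of the subband algorithms in the paper; the unknown system, error bound gamma, and signal lengths below are illustrative.

```python
import numpy as np

def sm_nlms(x, d, order=8, gamma=0.05, eps=1e-8):
    """Set-membership NLMS: the filter is updated only when the output error
    exceeds the bound gamma, and the step size shrinks as the error nears it."""
    w = np.zeros(order)
    updates = 0
    for n in range(order, len(x)):
        u = x[n - order + 1:n + 1][::-1]    # regressor, most recent sample first
        e = d[n] - w @ u
        if abs(e) > gamma:                  # estimate outside the membership set
            mu = 1.0 - gamma / abs(e)
            w += mu * e * u / (u @ u + eps)
            updates += 1
    return w, updates

rng = np.random.default_rng(1)
x = rng.standard_normal(5000)
h = np.array([0.6, -0.3, 0.2, 0.1, 0.0, 0.0, 0.0, 0.0])   # toy unknown system
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
w, updates = sm_nlms(x, d)
print(updates, "updates out of", len(x) - 8, "samples")
```

The sparse updates are what yield the reduced overall computational complexity reported in the abstract.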
In response to failures of central planning, the Chinese government has experimented not only with free-market trade zones, but with allowing non-profit foundations to operate in a decentralized fashion.
A network study shows how these foundations have connected together by sharing board members, in a structural parallel to what is seen in corporations in the United States and Europe.
This board interlocking leads to the emergence of an elite group with privileged network positions.
While the presence of government officials on non-profit boards is widespread, government officials are much less common in a subgroup of foundations that control just over half of all revenue in the network.
This subgroup, associated with business elites, not only enjoys higher levels of within-elite links, but even preferentially excludes government officials from the NGOs with higher degree.
The emergence of this structurally autonomous sphere is associated with major political and social events in the state-society relationship.
Cluster analysis reveals multiple internal components within this sphere that share similar levels of network influence.
Rather than a core-periphery structure centered around government officials, the Chinese non-profit world appears to be a multipolar one of distinct elite groups, many of which achieve high levels of independence from direct government control.
The traditional methods of biology, based on illustrative descriptions and linear-logic explanations, are discussed.
This work aims to improve this approach by introducing alternative tools to describe and represent complex biological systems.
Two models were developed, one mathematical and one computational, both designed to study the biological process involving free radicals and antioxidants.
Each model was used to study the same process but in different scenarios.
The mathematical model was used to study the biological process in an epithelial cells culture; this model was validated with the experimental data of Anne Hanneken's research group from the Department of Molecular and Experimental Medicine, published by the journal Investigative Ophthalmology and Visual Science in July 2006.
The computational model was used to study the same process in an individual.
The model was made using C++ programming language, supported by the network theory of aging.
We summarise the results of RoboCup 2D Soccer Simulation League in 2016 (Leipzig), including the main competition and the evaluation round.
The evaluation round held in Leipzig confirmed the strength of the RoboCup-2015 champion (WrightEagle, i.e., WE2015) in the League, with only the eventual finalists of the 2016 competition capable of defeating WE2015.
An extended, post-Leipzig, round-robin tournament which included the top 8 teams of 2016, as well as WE2015, with over 1000 games played for each pair, placed WE2015 third behind the champion team (Gliders2016) and the runner-up (HELIOS2016).
This establishes WE2015 as a stable benchmark for the 2D Simulation League.
We then contrast two ranking methods and suggest two options for future evaluation challenges.
The first one, "The Champions Simulation League", is proposed to include six previous champions directly competing against each other in a round-robin tournament, with a view to systematically tracing the advancements in the League.
The second proposal, "The Global Challenge", aims to increase the realism of the environmental conditions during the simulated games by simulating specific features of different participating countries.
Data analysis and monitoring of road networks in terms of reliability and performance are valuable but hard to achieve, especially when the analytical information has to be available to decision makers on time.
The gathering and analysis of the observable facts can be used to infer knowledge about traffic congestion over time and gain insights into the roads safety.
However, the continuous monitoring of live traffic information produces a vast amount of data that makes it difficult for business intelligence (BI) tools to generate metrics and key performance indicators (KPI) in nearly real-time.
In order to overcome these limitations, we propose the application of a big-data based and process-centric approach that integrates with operational traffic information systems to give insights into the road network's efficiency.
This paper demonstrates how the adoption of an existent process-oriented DSS solution with big-data support can be leveraged to monitor and analyse live traffic data on an acceptable response time basis.
This work studies the representational mapping across multimodal data such that given a piece of the raw data in one modality the corresponding semantic description in terms of the raw data in another modality is immediately obtained.
Such a representational mapping can be found in a wide spectrum of real-world applications including image/video retrieval, object recognition, action/behavior recognition, and event understanding and prediction.
To that end, we introduce a simplified training objective for learning multimodal embeddings using the skip-gram architecture by introducing convolutional "pseudowords": embeddings composed of the additive combination of distributed word representations and image features from convolutional neural networks projected into the multimodal space.
We present extensive results of the representational properties of these embeddings on various word similarity benchmarks to show the promise of this approach.
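The additive construction of a pseudoword can be sketched as follows; the dimensions, the projection matrix, and the word vectors are random stand-ins, not trained quantities.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, IMG_DIM = 50, 128                 # toy dimensions

# Random stand-ins for trained quantities: distributed word vectors,
# a CNN image feature, and the projection into the word-embedding space.
word_vecs = {"dog": rng.standard_normal(EMB_DIM),
             "runs": rng.standard_normal(EMB_DIM)}
cnn_feature = rng.standard_normal(IMG_DIM)
W_proj = 0.05 * rng.standard_normal((EMB_DIM, IMG_DIM))

def pseudoword(words, image_feature):
    """Additive combination of word embeddings with the projected image feature."""
    return sum(word_vecs[w] for w in words) + W_proj @ image_feature

v = pseudoword(["dog", "runs"], cnn_feature)
print(v.shape)
```

In training, such pseudowords stand in for context words in the skip-gram objective, placing images and text in a shared embedding space.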
Rotation estimation of known rigid objects is important for robotic applications such as dexterous manipulation.
Most existing methods for rotation estimation use intermediate representations such as templates, global or local feature descriptors, or object coordinates, which require multiple steps in order to infer the object pose.
We propose to directly regress a pose vector from raw point cloud segments using a convolutional neural network.
Experimental results show that our method can potentially achieve competitive performance compared to a state-of-the-art method, while also showing more robustness against occlusion.
Our method does not require any post processing such as refinement with the iterative closest point algorithm.
The high computational and parameter complexity of neural networks makes their training very slow and difficult to deploy on energy and storage-constrained computing systems.
Many network complexity reduction techniques have been proposed including fixed-point implementation.
However, a systematic approach for designing full fixed-point training and inference of deep neural networks remains elusive.
We describe a precision assignment methodology for neural network training in which all network parameters, i.e., activations and weights in the feedforward path, gradients and weight accumulators in the feedback path, are assigned close to minimal precision.
The precision assignment is derived analytically and enables tracking the convergence behavior of the full precision training, known to converge a priori.
Thus, our work leads to a systematic methodology of determining suitable precision for fixed-point training.
The near optimality (minimality) of the resulting precision assignment is validated empirically for four networks on the CIFAR-10, CIFAR-100, and SVHN datasets.
The complexity reduction arising from our approach is compared with other fixed-point neural network designs.
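A minimal sketch of the kind of fixed-point quantizer underlying such designs is shown below: a generic signed round-to-nearest, saturating quantizer. The paper's analytic precision-assignment method itself is not reproduced here.

```python
import numpy as np

def to_fixed_point(x, int_bits, frac_bits):
    """Quantize to a signed fixed-point grid with the given integer and
    fractional bit widths: round to nearest, then saturate at the range edges."""
    scale = 2.0 ** frac_bits
    lo = -(2.0 ** int_bits)                  # most negative representable value
    hi = 2.0 ** int_bits - 1.0 / scale       # most positive representable value
    return np.clip(np.round(x * scale) / scale, lo, hi)

vals = np.array([0.1, -0.72, 1.999, -3.5])
q = to_fixed_point(vals, int_bits=1, frac_bits=4)
print(q)
```

Assigning `int_bits` and `frac_bits` per tensor (activations, weights, gradients, accumulators) is exactly the degree of freedom the precision-assignment methodology optimizes.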
A description and annotation guidelines for the Yahoo Webscope release of Query Treebank, Version 1.0, May 2016.
We present a novel deformable groupwise registration method, applied to large 3D image groups.
Our approach extracts 3D SURF keypoints from images, computes matched pairs of keypoints, and registers the group by minimizing pair distances in a hubless way, i.e., without computing any central mean image.
Using keypoints significantly reduces the problem complexity compared to voxel-based approaches, and enables us to provide an in-core global optimization, similar to the Bundle Adjustment for 3D reconstruction.
As we aim at registering images of different patients, the matching step yields many outliers.
We therefore propose a new EM-weighting algorithm that efficiently discards these outliers.
Global optimization is carried out with a fast gradient descent algorithm.
This allows our approach to robustly register large datasets.
The result is a set of half transforms which link the volumes together and can be subsequently exploited for computational anatomy, landmark detection or image segmentation.
We show experimental results on whole-body CT scans, with groups of up to 103 volumes.
On a benchmark based on anatomical landmarks, our algorithm compares favorably with the star-groupwise voxel-based ANTs and NiftyReg approaches while being much faster.
We also discuss the limitations of our approach for lower resolution images such as brain MRI.
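The EM-weighting idea for discarding outlier matches can be sketched as a two-component Gaussian mixture on match residuals; the mixture parameters and residual values below are illustrative, not the paper's exact formulation.

```python
import numpy as np

def em_weights(residuals, sigma_in=1.0, sigma_out=10.0, p_in=0.8, rounds=5):
    """EM-style soft outlier rejection: model residuals as a mixture of a
    narrow inlier Gaussian and a broad outlier Gaussian; the E-step posterior
    inlier probability of each match becomes its weight."""
    r = np.asarray(residuals, dtype=float)
    for _ in range(rounds):
        g_in = p_in * np.exp(-r**2 / (2 * sigma_in**2)) / sigma_in
        g_out = (1 - p_in) * np.exp(-r**2 / (2 * sigma_out**2)) / sigma_out
        w = g_in / (g_in + g_out)                  # E-step: responsibilities
        p_in = w.mean()                            # M-step: mixing proportion
        sigma_in = np.sqrt((w * r**2).sum() / w.sum()) + 1e-9
    return w

residuals = np.array([0.3, -0.5, 0.1, 0.4, 25.0, -30.0])   # two gross outliers
w = em_weights(residuals)
print(np.round(w, 2))
```

Down-weighting rather than hard-thresholding lets the global optimization remain smooth while still suppressing gross mismatches.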
Logical systems with classical negation and means for sentential or propositional self-reference involve, in some way, paradoxical statements such as the liar.
However, the paradox disappears if one replaces classical by an appropriate non-classical negation such as a paraconsistent one (no paradox arises if the liar is both true and false).
We consider a non-Fregean logic which is a revised and extended version (Lewitzka 2012) of Epsilon-T-Logic originally introduced by (Straeter 1992) as a logic with a total truth predicate and propositional quantifiers.
Self-reference is achieved by means of equations between formulas which are interpreted over a model-theoretic universe of propositions.
Paradoxical statements, such as the liar, can be asserted only by unsatisfiable equations and do not correlate with propositions.
In this paper, we generalize Epsilon-T-Logic to a four-valued logic related to Dunn/Belnap logic B_4.
We also define three-valued versions related to Kleene's logic K_3 and Priest's Logic of Paradox P_3, respectively.
In this many-valued setting, models may contain liars and other "paradoxical" propositions which are ruled out by the more restrictive classical semantics.
We introduce these many-valued non-Fregean logics as extensions of abstract parameter logics such that parameter logic and extension are of the same logical type.
For this purpose, we define and study abstract logics of type B_4, K_3 and P_3.
Using semantic methods we show compactness of the consequence relation of abstract logics of type B_4, give a representation as minimally generated logics and establish a connection to the approach of (Font 1997).
Finally, we present a complete sequent calculus for the Epsilon-T-style extension of classical abstract logics simplifying constructions originally developed by (Straeter 1992, Zeitz 2000, Lewitzka 1998).
In this paper, we examine the problem of robotic manipulation of granular media.
We evaluate multiple predictive models used to infer the dynamics of scooping and dumping actions.
These models are evaluated on a task that involves manipulating the media in order to deform it into a desired shape.
Our best performing model is based on a highly-tailored convolutional network architecture with domain-specific optimizations, which we show accurately models the physical interaction of the robotic scoop with the underlying media.
We empirically demonstrate that explicitly predicting physical mechanics results in a policy that out-performs both a hand-crafted dynamics baseline, and a "value-network", which must otherwise implicitly predict the same mechanics in order to produce accurate value estimates.
How much is the h-index of an editor of a well ranked journal improved due to citations which occur after his or her appointment?
Scientific recognition within academia is widely measured nowadays by the number of citations or h-index.
Our dataset is based on a sample of four editors from a well-ranked journal (impact factor, IF, greater than 2).
The target group consists of two editors who appear to benefit from their position through an increased citation count (and subsequently h-index) within the journal.
The total number of citations for the target group exceeds 600.
The control group consists of another two editors from the same journal, for whom the relation between position and citation record remains neutral.
The total number of citations for the control group exceeds 1200.
Citation patterns were studied over the period 1975-2015.
Coercive citation for the benefit of a journal (i.e., to increase its IF) has previously been reported.
To the best of our knowledge, this is a pioneering work on coercive citations for personal (or editors) benefit.
Editorial teams should be aware of this type of potentially unethical behavior and act accordingly.
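For reference, the h-index discussed above is straightforward to compute from a list of per-paper citation counts; the counts below are toy data, not the studied editors' records.

```python
def h_index(citations):
    """h = the largest h such that at least h papers have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# Toy per-paper citation counts for one author, before and after a hypothetical
# burst of extra citations.
before = [40, 22, 9, 8, 5, 3, 1]
after = [55, 30, 15, 12, 9, 6, 2]
print(h_index(before), h_index(after))   # -> 5 6
```

The jump from 5 to 6 shows how a modest number of well-placed citations can move the index, which is what makes coercive citation for personal benefit attractive.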
The JCT-VC standardized Screen Content Coding (SCC) extension in the HEVC HM RExt + SCM reference codec offers an impressive coding efficiency performance when compared with HM RExt alone; however, it is not significantly perceptually optimized.
For instance, it does not include advanced HVS-based perceptual coding methods, such as JND-based spatiotemporal masking schemes.
In this paper, we propose a novel JND-based perceptual video coding technique for HM RExt + SCM.
The proposed method is designed to further improve the compression performance of HM RExt + SCM when applied to YCbCr 4:4:4 SC video data.
In the proposed technique, luminance masking and chrominance masking are exploited to perceptually adjust the Quantization Step Size (QStep) at the Coding Block (CB) level.
Compared with HM RExt 16.10 + SCM 8.0, the proposed method considerably reduces bitrates (Kbps), with a maximum reduction of 48.3%.
In addition, the subjective evaluations reveal that the proposed SC-PAQ technique achieves visually lossless coding at very low bitrates.
State-of-the-art network science of teams offers effective recommendation methods to answer questions such as who is the best replacement or what is the best team expansion strategy, but it lacks intuitive ways to explain why the optimization algorithm gives a specific recommendation for a given team optimization scenario.
To tackle this problem, we develop an interactive prototype system, EXTRA, as a first step towards addressing this sense-making challenge: it explains team recommendation results through the lens of the underlying network in which teams are embedded.
The main advantages are (1) Algorithm efficacy: we propose an effective and fast algorithm to explain random walk graph kernel, the central technique for networked team recommendation; (2) Intuitive visual explanation: we present intuitive visual analysis of the recommendation results, which can help users better understand the rationality of the underlying team recommendation algorithm.
Face sketches are able to capture the spatial topology of a face while lacking some facial attributes such as race, skin, or hair color.
Existing sketch-photo recognition approaches have mostly ignored the importance of facial attributes.
In this paper, we propose a new loss function, called attribute-centered loss, to train a Deep Coupled Convolutional Neural Network (DCCNN) for the facial attribute guided sketch to photo matching.
Specifically, an attribute-centered loss is proposed which learns several distinct centers, in a shared embedding space, for photos and sketches with different combinations of attributes.
The DCCNN is simultaneously trained to map photos, as well as pairs of testified attributes and corresponding forensic sketches, around their associated centers, while preserving the spatial topology information.
Importantly, the centers learn to keep a relative distance from each other, related to their number of contradictory attributes.
Extensive experiments are performed on composite (E-PRIP) and semi-forensic (IIIT-D Semi-forensic) databases.
The proposed method significantly outperforms the state-of-the-art.
Due to the increasing dependency of critical infrastructure on synchronized clocks, network time synchronization protocols have become an attractive target for attackers.
We identify data origin authentication as the key security objective and suggest employing recently proposed high-performance digital signature schemes (Ed25519 and MQQ-SIG) as the foundation of a novel set of security measures to secure multicast time synchronization.
We conduct experiments to verify the computational and communication efficiency for using these signatures in the standard time synchronization protocols NTP and PTP.
We propose additional security measures to prevent replay attacks and to mitigate delay attacks.
Our proposed solutions cover 1-step mode for NTP and PTP; we extend our security measures specifically to 2-step mode (PTP) and show that they have no impact on the precision of time synchronization.
Approaches to decision-making under uncertainty in the belief function framework are reviewed.
Most methods are shown to blend criteria for decision under ignorance with the maximum expected utility principle of Bayesian decision theory.
A distinction is made between methods that construct a complete preference relation among acts, and those that allow incomparability of some acts due to lack of information.
Methods developed in the imprecise probability framework are applicable in the Dempster-Shafer context and are also reviewed.
Shafer's constructive decision theory, which substitutes the notion of goal for that of utility, is described and contrasted with other approaches.
The paper ends by pointing out the need to carry out deeper investigation of fundamental issues related to decision-making with belief functions and to assess the descriptive, normative and prescriptive values of the different approaches.
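One standard ingredient of the reviewed methods, the lower and upper expected utilities of an act under a belief function, can be computed directly from a mass function; the frame, masses, and utilities below are illustrative.

```python
def expected_utility_interval(masses, utility):
    """Lower/upper expected utility of an act under a belief function: each
    focal set contributes its mass times the worst (resp. best) utility of the
    outcomes it contains (Choquet integrals w.r.t. belief and plausibility)."""
    lower = sum(m * min(utility[s] for s in focal) for focal, m in masses)
    upper = sum(m * max(utility[s] for s in focal) for focal, m in masses)
    return lower, upper

# Illustrative frame: utilities of one act under three weather outcomes,
# with mass 0.5 committed only to the ambiguous set {cloudy, sun}.
utility = {"rain": 0.0, "cloudy": 0.6, "sun": 1.0}
masses = [(("rain",), 0.3),
          (("cloudy", "sun"), 0.5),
          (("rain", "cloudy", "sun"), 0.2)]
print(expected_utility_interval(masses, utility))
```

When the interval of one act dominates another's, the acts are comparable; overlapping intervals are precisely where the reviewed criteria (and the incomparability they may allow) diverge.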
This article surveys the existing literature on the methods currently used by web services to track users online, as well as their purposes, implications, and possible user defenses.
A significant majority of reviewed articles and web resources are from years 2012-2014.
Privacy seems to be the Achilles' heel of today's web.
Web services make continuous efforts to obtain as much information as they can about the things we search for, the sites we visit, the people we contact, and the products we buy.
Tracking is usually performed for commercial purposes.
We present five main groups of methods used for user tracking, based on sessions, client storage, client cache, fingerprinting, or other approaches.
A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they are usually very rich in terms of using various creative methodologies.
We also show how the users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses.
We show why tracking is being used and its possible implications for the users (price discrimination, assessing financial credibility, determining insurance coverage, government surveillance, and identity theft).
For each of the tracking methods, we present possible defenses.
Apart from describing the methods and tools used to keep personal data from being tracked, we also present several tools that were used for research purposes: their main goal is to discover how, and by which entity, users are being tracked on their desktop computers or smartphones, provide this information to the users, and visualize it in an accessible, easy-to-follow way.
Finally, we present the currently proposed future approaches to track the user and show that they can potentially pose significant threats to the users' privacy.
We study motion feasibility conditions of decentralized multi-agent control systems on Lie groups with collision avoidance constraints, modeled by an undirected graph.
We first consider agents modeled as kinematic left-invariant control systems (single integrators) and then as dynamical control systems (double integrators) determined by a left-trivialized Lagrangian function.
In the kinematic approach, we study the problem of determining whether there are nontrivial trajectories of all agent kinematics that maintain the collision avoidance constraints.
Solutions of the problem give rise to linear combinations of the control inputs in a linear subspace annihilating the constraints.
In the dynamical problem, first order necessary conditions for the existence of feasible motions are obtained using techniques from variational calculus on manifolds and by introducing collision avoidance constraints among agents into an augmented action functional by using the Lagrange multipliers theorem.
Deep learning based on artificial neural networks is a powerful machine learning method that, in the last few years, has been successfully used to realize tasks, e.g., image classification, speech recognition, translation of languages, etc., that are usually simple to execute by human beings but extremely difficult to perform by machines.
This is one of the reasons why deep learning is considered to be one of the main enablers to realize the notion of artificial intelligence.
In order to identify the best architecture of an artificial neural network that allows one to fit input-output data pairs, the current methodology in deep learning methods consists of employing a data-driven approach.
Once the artificial neural network is trained, it is capable of responding to never-observed inputs by providing the optimum output based on past acquired knowledge.
In this context, a recent trend in the deep learning community is to complement pure data-driven approaches with prior information based on expert knowledge.
In this work, we describe two methods that implement this strategy, which aim at optimizing wireless communication networks.
In addition, we illustrate numerical results in order to assess the performance of the proposed approaches compared with pure data-driven implementations.
We introduce XtraPuLP, a new distributed-memory graph partitioner designed to process trillion-edge graphs.
XtraPuLP is based on the scalable label propagation community detection technique, which has been demonstrated as a viable means to produce high quality partitions with minimal computation time.
On a collection of large sparse graphs, we show that XtraPuLP partitioning quality is comparable to state-of-the-art partitioning methods.
We also demonstrate that XtraPuLP can produce partitions of real-world graphs with billion+ vertices in minutes.
Further, we show that using XtraPuLP partitions for distributed-memory graph analytics leads to significant end-to-end execution time reduction.
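The label-propagation idea behind XtraPuLP can be sketched in a few lines. The sequential toy partitioner below uses a hard balance cap, an assumption made for illustration; XtraPuLP's actual constraint handling and distributed execution are far more sophisticated:

```python
import math
from collections import Counter

def label_prop_partition(adj, num_parts, max_iters=10):
    """Toy label-propagation partitioner over an adjacency list.

    Each vertex repeatedly adopts the most common partition label among its
    neighbours, moving only when the target part still has room under a
    simple size cap. This is only a sketch of the general technique."""
    n = len(adj)
    cap = math.ceil(n / num_parts)            # crude balance constraint
    labels = [v % num_parts for v in range(n)]  # arbitrary initial assignment
    sizes = Counter(labels)
    for _ in range(max_iters):
        changed = False
        for v in range(n):
            if not adj[v]:
                continue
            counts = Counter(labels[u] for u in adj[v])
            for cand, _ in counts.most_common():
                if cand == labels[v]:
                    break                      # already holds the best label
                if sizes[cand] < cap:          # move only if there is room
                    sizes[labels[v]] -= 1
                    sizes[cand] += 1
                    labels[v] = cand
                    changed = True
                    break
        if not changed:
            break
    return labels

# Two triangles joined by a single edge, split into two parts.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
print(label_prop_partition(adj, 2))
```

A production partitioner would add weighted balance objectives and run the propagation in parallel over distributed vertex ranges.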
For the first time, a mathematical object is presented, a reversible cellular automaton, with many paradoxical qualities, chief among them: frequent quick returns to its original state, the presence of a large number of conservation laws, and paradoxical "fuzzy" symmetries that connect the current position of the automaton with its signature Main Integral.
We propose a method for annotating the location of objects in ImageNet.
Traditionally, this is cast as an image window classification problem, where each window is considered independently and scored based on its appearance alone.
Instead, we propose a method which scores each candidate window in the context of all other windows in the image, taking into account their similarity in appearance space as well as their spatial relations in the image plane.
We devise a fast and exact procedure to optimize our scoring function over all candidate windows in an image, and we learn its parameters using structured output regression.
We demonstrate on 92000 images from ImageNet that this significantly improves localization over recent techniques that score windows in isolation.
Recent advances in video super-resolution have shown that convolutional neural networks combined with motion compensation are able to merge information from multiple low-resolution (LR) frames to generate high-quality images.
Current state-of-the-art methods process a batch of LR frames to generate a single high-resolution (HR) frame and run this scheme in a sliding window fashion over the entire video, effectively treating the problem as a large number of separate multi-frame super-resolution tasks.
This approach has two main weaknesses: 1) Each input frame is processed and warped multiple times, increasing the computational cost, and 2) each output frame is estimated independently conditioned on the input frames, limiting the system's ability to produce temporally consistent results.
In this work, we propose an end-to-end trainable frame-recurrent video super-resolution framework that uses the previously inferred HR estimate to super-resolve the subsequent frame.
This naturally encourages temporally consistent results and reduces the computational cost by warping only one image in each step.
Furthermore, due to its recurrent nature, the proposed method has the ability to assimilate a large number of previous frames without increased computational demands.
Extensive evaluations and comparisons with previous methods validate the strengths of our approach and demonstrate that the proposed framework is able to significantly outperform the current state of the art.
Similarity searching of molecular structures is an important application in chemoinformatics, especially in drug discovery.
Similarity searching is a common method used for the identification of molecular structures.
It involves three principal components: structure representation, weighting scheme, and similarity coefficient.
In this paper, we introduce a Weighted Tanimoto Coefficient based on weighted Euclidean distance in order to investigate the effect of the weight function on similarity-searching results.
The Tanimoto coefficient is one of the most popular similarity coefficients used to measure the similarity between pairs of molecules.
Most research in this area is based on binary or fingerprint data.
In contrast, we used non-binary data, with the amphetamine structure set as the reference (target) structure and the rest of the dataset serving as the database.
Throughout this study, similarity searching with and without weights is shown to give clearly different results.
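The paper's exact weighting scheme is not reproduced here, but the sketch below shows one plausible weighted variant of the continuous (non-binary) Tanimoto coefficient; with all weights equal to one it reduces to the standard formula:

```python
def weighted_tanimoto(a, b, w=None):
    """Continuous Tanimoto coefficient with optional per-feature weights.

    T = sum(w*a*b) / (sum(w*a*a) + sum(w*b*b) - sum(w*a*b))
    With unit weights this is the standard non-binary Tanimoto similarity;
    the specific weight function in the paper may differ."""
    if w is None:
        w = [1.0] * len(a)
    ab = sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
    aa = sum(wi * ai * ai for wi, ai in zip(w, a))
    bb = sum(wi * bi * bi for wi, bi in zip(w, b))
    return ab / (aa + bb - ab)

ref = [1.0, 2.0, 0.0]   # hypothetical reference descriptor
mol = [1.0, 1.0, 1.0]   # hypothetical database molecule
print(weighted_tanimoto(ref, mol))              # → 0.6 (unweighted)
print(weighted_tanimoto(ref, mol, [2, 1, 1]))   # up-weight the first feature
```

Identical descriptors score exactly 1.0 regardless of the weights, so the weighted variant remains a proper similarity coefficient.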
It explores possible origins of physical human-human communication, more precisely, the hypothesis
Image steganography is a growing research field, where sensitive contents are embedded in images, keeping their visual quality intact.
Researchers have used correlated color spaces such as RGB, where modification to one channel affects the overall quality of stego-images, hence decreasing their suitability for steganographic algorithms.
Therefore, in this paper, we propose an adaptive LSB substitution method using uncorrelated color space, increasing the property of imperceptibility while minimizing the chances of detection by the human vision system.
In the proposed scheme, the input image is passed through an image scrambler, resulting in an encrypted image, preserving the privacy of image contents, and then converted to HSV color space for further processing.
The secret contents are encrypted using an iterative magic matrix encryption algorithm (IMMEA) for better security, producing the cipher contents.
An adaptive LSB substitution method is then used to embed the encrypted data inside the V-plane of HSV color model based on secret key-directed block magic LSB mechanism.
The idea of utilizing HSV color space for data hiding is inspired by its properties, including de-correlation, cost-effective processing, better stego-image quality, and suitability for steganography, as verified by our experiments against other color spaces such as RGB, YCbCr, HSI, and Lab.
The quantitative and qualitative experimental results of the proposed framework and its application for addressing the security and privacy of visual contents in online social networks (OSNs), confirm its effectiveness in contrast to state-of-the-art methods.
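The paper's full pipeline (image scrambling, IMMEA encryption, key-directed block selection) is not reproduced here, but the plain LSB-substitution step it builds on can be illustrated on a toy V-plane:

```python
import numpy as np

def embed_lsb(channel, bits):
    """Embed a bit sequence into the least significant bits of a channel
    (e.g. the V plane of an HSV image). Plain sequential LSB substitution;
    the paper's adaptive, key-directed block scheme is more involved."""
    flat = channel.flatten()                 # flatten() copies, so the input stays intact
    assert len(bits) <= flat.size
    flat[:len(bits)] = (flat[:len(bits)] & ~np.uint8(1)) | np.array(bits, dtype=np.uint8)
    return flat.reshape(channel.shape)

def extract_lsb(channel, n_bits):
    """Recover the first n_bits least significant bits."""
    return [int(b) for b in channel.flatten()[:n_bits] & 1]

v_plane = np.arange(16, dtype=np.uint8).reshape(4, 4)  # toy 4x4 V channel
secret = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(v_plane, secret)
print(extract_lsb(stego, 8))   # → [1, 0, 1, 1, 0, 0, 1, 0]
```

Each pixel changes by at most 1 intensity level, which is what keeps the visual quality of the stego-image intact.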
We propose novel model transfer-learning methods that refine a decision forest model M learned within a "source" domain using a training set sampled from a "target" domain, assumed to be a variation of the source.
We present two random forest transfer algorithms.
The first algorithm searches greedily for locally optimal modifications of each tree structure by trying to locally expand or reduce the tree around individual nodes.
The second algorithm does not modify the structure, but only the parameters (thresholds) associated with decision nodes.
We also propose to combine both methods by considering an ensemble that contains the union of the two forests.
The proposed methods exhibit impressive experimental results over a range of problems.
Previous work on surgical skill assessment using intraoperative tool motion in the operating room (OR) has focused on highly-structured surgical tasks such as cholecystectomy.
Further, these methods only considered generic motion metrics such as time and number of movements, which are of limited instructive value.
In this paper, we developed and evaluated an automated approach to the surgical skill assessment of nasal septoplasty in the OR.
The obstructed field of view and highly unstructured nature of septoplasty precludes trainees from efficiently learning the procedure.
We propose a descriptive structure of septoplasty consisting of two types of activity: (1) brushing activity directed away from the septum plane characterizing the consistency of the surgeon's wrist motion and (2) activity along the septal plane characterizing the surgeon's coverage pattern.
We derived features related to these two activity types that classify a surgeon's level of training with an average accuracy of about 72%.
The features we developed provide surgeons with personalized, actionable feedback regarding their tool motion.
Studying materials informatics from a data mining perspective can be beneficial for manufacturing and other industrial engineering applications.
Predictive data mining techniques and machine learning algorithms are combined to design a knowledge discovery system for selecting engineering materials that meet design specifications.
A predictive method (the naive Bayesian classifier) and a machine learning algorithm (the Pearson correlation coefficient method) were implemented for materials classification and selection, respectively.
The knowledge extracted from the engineering materials data sets is proposed for effective decision making in advanced engineering materials design applications.
Interactive Music Systems (IMS) have introduced a new world of music-making modalities.
But can we really say that they create music, as in true autonomous creation?
Here we discuss Video Interactive VST Orchestra (VIVO), an IMS that considers extra-musical information by adopting a simple salience based model of user-system interaction when simulating intentionality in automatic music generation.
Key features of the theoretical framework, a brief overview of pilot research, and a case study providing validation of the model are presented.
This research demonstrates that a meaningful user/system interplay is established in what we define as reflexive multidominance.
The QUIC protocol combines features that were initially found inside the TCP, TLS and HTTP/2 protocols.
The IETF is currently finalising a complete specification of this protocol.
More than a dozen independent implementations have been developed in parallel with these standardisation activities.
We propose and implement a QUIC test suite that interacts with public QUIC servers to verify their conformance with key features of the IETF specification.
Our measurements, gathered over a semester, provide a unique viewpoint on the evolution of a protocol and of its implementations.
They highlight the arrival of new features and some regressions among the different implementations.
In this study, we describe the behavior of LTE over the sea and investigate the problem of radio resource block allocation in such SINR limited maritime channels.
For simulations of such a sea environment, we considered a network scenario of the Bosphorus Strait in Istanbul, Turkey, with different numbers of ships ferrying between two ports at a given time.
After characterizing the network, we formulated and solved the radio resource allocation problem with a max-min integer linear programming method.
The radio resource allocation fairness, in terms of Jain's fairness index, was computed and compared with round-robin and opportunistic methods.
Results show that the max-min optimization method performs better than the opportunistic and round robin methods.
This in turn reflects that the max-min optimization method yields the highest minimum throughput compared to the other two methods across the different ship-density scenarios at sea.
Also, as the number of ships at sea increases, the max-min method performs significantly better, with good fairness, compared to the other two methods.
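Jain's fairness index used in the comparison above has a simple closed form, (Σx)² / (n·Σx²), which is straightforward to compute:

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Equals 1.0 when every user gets the same allocation, and approaches
    1/n as the allocation concentrates on a single user."""
    n = len(throughputs)
    s = sum(throughputs)
    return s * s / (n * sum(x * x for x in throughputs))

print(jain_fairness([5.0, 5.0, 5.0]))   # → 1.0 (perfectly even)
print(jain_fairness([9.0, 1.0, 0.0]))   # heavily skewed allocation
```

A max-min allocation drives the minimum throughput up, which pushes this index toward 1; the ship throughput values above are purely illustrative.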
In the never-ending quest for tools that enable an ISP to smooth troubleshooting and improve awareness of network behavior, much effort has been devoted to the collection of data by active and passive measurement at the data-plane and control-plane levels.
Exploitation of collected data has been mostly focused on anomaly detection and on root-cause analysis.
Our objective is somewhat in the middle.
We consider traceroutes collected by a network of probes and aim at introducing a practically applicable methodology to quickly spot measurements related to high-impact events that happened in the network.
Such a filtering process eases further in-depth human-based analysis, for example with visual tools, which are effective only when handling a limited amount of data.
We introduce the empathy relation between traceroutes as the cornerstone of our formal characterization of the traceroutes related to a network event.
Based on this model, we describe an algorithm that finds traceroutes related to high-impact events in an arbitrary set of measurements.
Evidence of the effectiveness of our approach is given by experimental results produced on real-world data.
In this paper, the problem of finding a Nash equilibrium of a multi-player game is considered.
The players are only aware of their own cost functions as well as the action space of all players.
We develop a relatively fast algorithm within the framework of inexact-ADMM.
It requires a communication graph for the information exchange between the players as well as a few mild assumptions on cost functions.
The convergence proof of the algorithm to a Nash equilibrium of the game is then provided.
Moreover, the convergence rate is investigated via simulations.
This Note investigates the bias of the sampling importance resampling (SIR) filter in estimation of the state transition noise in the state space model.
The SIR filter may suffer from sample impoverishment caused by resampling, and therefore benefits from a sampling proposal with a heavier tail, e.g., simulating larger state-transition noise for particle propagation than the true noise driving the state dynamics.
This is because a comparably big transition noise used for particle propagation can spread overlapped particles to counteract impoverishment, giving better approximation of the posterior.
As such, the SIR filter tends to yield a biased (bigger-than-the-truth) estimate of the transition noise if it is unknown and needs to be estimated, at least, in the forward-only filtering estimation.
The bias is elaborated via the direct roughening approach by means of both qualitative logical deduction and quantitative numerical simulation.
In order to avoid the "Midas Touch" problem, gaze-based interfaces for selection often introduce a dwell time: a fixed amount of time the user must fixate upon an object before it is selected.
Past interfaces have used a uniform dwell time across all objects.
Here, we propose an algorithm for adjusting the dwell times of different objects based on the inferred probability that the user intends to select them.
In particular, we introduce a probabilistic model of natural gaze behavior while surfing the web to infer the probability that each hyperlink is the intended hyperlink.
We assign shorter dwell times to more likely hyperlinks and longer dwell times to less likely hyperlinks, resulting in a variable dwell-time gaze-based browser.
We have evaluated this method objectively both in simulation and experimentally, and subjectively through questionnaires.
Our results demonstrate that the proposed algorithm achieves a better tradeoff between accuracy and speed.
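The mapping from inferred link probabilities to dwell times can be sketched as follows; the linear interpolation between a minimum and maximum dwell, and the probability values, are assumptions made for illustration, not the paper's exact assignment rule:

```python
def dwell_times(link_probs, t_min=0.15, t_max=1.0):
    """Map each hyperlink's inferred selection probability to a dwell time
    in seconds: likelier links get shorter dwells, guarding against the
    'Midas Touch' on unlikely links while speeding up likely selections."""
    return {link: t_max - p * (t_max - t_min) for link, p in link_probs.items()}

# Hypothetical per-link probabilities from the gaze model.
probs = {"home": 0.7, "about": 0.2, "legal": 0.1}
print(dwell_times(probs))
```

A probability of 1.0 yields the minimum dwell `t_min` and a probability of 0.0 the maximum `t_max`, so the fixed uniform dwell of past interfaces is the special case where the model is uninformative.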
Domain adaptation (DA) aims to generalize a learning model across training and testing data despite the mismatch of their data distributions.
In light of a theoretical estimation of upper error bound, we argue in this paper that an effective DA method should 1) search a shared feature subspace where source and target data are not only aligned in terms of distributions as most state of the art DA methods do, but also discriminative in that instances of different classes are well separated; 2) account for the geometric structure of the underlying data manifold when inferring data labels on the target domain.
In comparison with a baseline DA method which only cares about data distribution alignment between source and target, we derive three different DA models, namely CDDA, GA-DA, and DGA-DA, to highlight the contribution of Close yet Discriminative DA (CDDA) based on 1), Geometry Aware DA (GA-DA) based on 2), and finally Discriminative and Geometry Aware DA (DGA-DA) implementing jointly 1) and 2).
Using both synthetic and real data, we show the effectiveness of the proposed approach which consistently outperforms state of the art DA methods over 36 image classification DA tasks through 6 popular benchmarks.
We further carry out in-depth analysis of the proposed DA method in quantifying the contribution of each term of our DA model and provide insights into the proposed DA methods in visualizing both real and synthetic data.
We define and construct efficient depth-universal and almost-size-universal quantum circuits.
Such circuits can be viewed as general-purpose simulators for central classes of quantum circuits and can be used to capture the computational power of the circuit class being simulated.
For depth we construct universal circuits whose depth is the same order as the circuits being simulated.
For size, there is a log factor blow-up in the universal circuits constructed here.
We prove that this construction is nearly optimal.
Long Short-Term Memory (LSTM) is the primary recurrent neural network architecture for acoustic modeling in automatic speech recognition systems.
Residual learning is an efficient method to help neural networks converge easier and faster.
In this paper, we propose several types of residual LSTM methods for our acoustic modeling.
Our experiments indicate that, compared with classic LSTM, our architecture shows more than 8% relative reduction in Phone Error Rate (PER) on TIMIT tasks.
At the same time, our residual fast LSTM approach shows 4% relative reduction in PER on the same task.
We also find that these architectures achieve good results on the THCHS-30, Librispeech, and Switchboard corpora.
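A residual LSTM layer differs from a plain one only in adding the layer input back onto the hidden output. The NumPy sketch below shows one such design (there are several possible residual placements; the paper's exact variant may differ):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One step of a standard LSTM cell. W, U, b stack the input, forget,
    cell, and output gate parameters (4n rows for hidden size n)."""
    z = W @ x + U @ h + b
    n = h.size
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, g, o = sig(z[:n]), sig(z[n:2*n]), np.tanh(z[2*n:3*n]), sig(z[3*n:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def residual_lstm_step(x, h, c, W, U, b):
    """Residual variant: add the layer input (assumed to match the hidden
    dimension) back onto the hidden output, easing gradient flow when
    stacking many layers."""
    h_new, c_new = lstm_step(x, h, c, W, U, b)
    return h_new + x, c_new

rng = np.random.default_rng(0)
n = 4
x, h, c = rng.normal(size=n), np.zeros(n), np.zeros(n)
W, U, b = rng.normal(size=(4 * n, n)), rng.normal(size=(4 * n, n)), np.zeros(4 * n)
h1, c1 = residual_lstm_step(x, h, c, W, U, b)
print(h1.shape)   # → (4,)
```

Because only an addition is introduced, the residual layer keeps the cell-state dynamics of a classic LSTM unchanged while giving gradients a shortcut path through the stack.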
We present a number of powerful local mechanisms for maintaining a dynamic swarm of robots with limited capabilities and information, in the presence of external forces and permanent node failures.
We propose a set of local continuous algorithms that together produce a generalization of a Euclidean Steiner tree.
At any stage, the resulting overall shape achieves a good compromise between local thickness, global connectivity, and flexibility to further continuous motion of the terminals.
The resulting swarm behavior scales well, is robust against node failures, and performs close to the best known approximation bound for a corresponding centralized static optimization problem.
Machine translation is a natural candidate problem for reinforcement learning from human feedback: users provide quick, dirty ratings on candidate translations to guide a system to improve.
Yet, current neural machine translation training focuses on expensive human-generated reference translations.
We describe a reinforcement learning algorithm that improves neural machine translation systems from simulated human feedback.
Our algorithm combines the advantage actor-critic algorithm (Mnih et al., 2016) with the attention-based neural encoder-decoder architecture (Luong et al., 2015).
This algorithm (a) is well-designed for problems with a large action space and delayed rewards, (b) effectively optimizes traditional corpus-level machine translation metrics, and (c) is robust to skewed, high-variance, granular feedback modeled after actual human behaviors.
Inverse dynamics is used extensively in robotics and biomechanics applications.
In manipulator and legged robots, it can form the basis of an effective nonlinear control strategy by providing a robot with both accurate positional tracking and active compliance.
In biomechanics applications, inverse dynamics control can approximately determine the net torques applied at anatomical joints that correspond to an observed motion.
In the context of robot control, using inverse dynamics requires knowledge of all contact forces acting on the robot; accurately perceiving external forces applied to the robot requires filtering and thus significant time delay.
An alternative approach has been suggested in recent literature: predicting contact and actuator forces simultaneously under the assumptions of rigid body dynamics, rigid contact, and friction.
Existing such inverse dynamics approaches have used approximations to the contact models, which permits use of fast numerical linear algebra algorithms.
In contrast, we describe inverse dynamics algorithms that are derived only from first principles and use established phenomenological models like Coulomb friction.
We assess these inverse dynamics algorithms in a control context using two virtual robots: a locomoting quadrupedal robot and a fixed-base manipulator gripping a box, while using perfectly accurate sensor data from simulation.
The data collected from these experiments gives an upper bound on the performance of such controllers in situ.
For points of comparison, we assess performance on the same tasks with both error feedback control and inverse dynamics control with virtual contact force sensing.
We introduce Delay Pruning, a simple yet powerful technique to regularize dynamic Boltzmann machines (DyBM).
The recently introduced DyBM provides a particularly structured Boltzmann machine, as a generative model of a multi-dimensional time-series.
This Boltzmann machine can have infinitely many layers of units but allows exact inference and learning based on its biologically motivated structure.
DyBM uses the idea of conduction delays in the form of fixed length first-in first-out (FIFO) queues, with a neuron connected to another via this FIFO queue, and spikes from a pre-synaptic neuron travel along the queue to the post-synaptic neuron with a constant period of delay.
Here, we present Delay Pruning as a mechanism to prune the lengths of the FIFO queues (making them zero) by setting some delay lengths to one with a fixed probability, and finally selecting the best performing model with fixed delays.
The uniqueness of structure and a non-sampling based learning rule in DyBM, make the application of previously proposed regularization techniques like Dropout or DropConnect difficult, leading to poor generalization.
First, we evaluate the performance of Delay Pruning to let DyBM learn a multidimensional temporal sequence generated by a Markov chain.
Finally, we show the effectiveness of delay pruning in learning high dimensional sequences using the moving MNIST dataset, and compare it with Dropout and DropConnect methods.
Separating an image into reflectance and shading layers poses a challenge for learning approaches because no large corpus of precise and realistic ground truth decompositions exists.
The Intrinsic Images in the Wild~(IIW) dataset provides a sparse set of relative human reflectance judgments, which serves as a standard benchmark for intrinsic images.
A number of methods use IIW to learn statistical dependencies between the images and their reflectance layer.
Although learning plays an important role for high performance, we show that a standard signal processing technique achieves performance on par with current state-of-the-art.
We propose a loss function for CNN learning of dense reflectance predictions.
Our results show a simple pixel-wise decision, without any context or prior knowledge, is sufficient to provide a strong baseline on IIW.
This sets a competitive baseline which only two other approaches surpass.
We then develop a joint bilateral filtering method that implements strong prior knowledge about reflectance constancy.
This filtering operation can be applied to any intrinsic image algorithm and we improve several previous results achieving a new state-of-the-art on IIW.
Our findings suggest that the effect of learning-based approaches may have been over-estimated so far.
Explicit prior knowledge is still at least as important to obtain high performance in intrinsic image decompositions.
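The joint bilateral filtering idea, smoothing one signal with range weights taken from a guide, can be illustrated in one dimension. This sketch shows the general operator, not the paper's exact reflectance filter:

```python
import math

def joint_bilateral_1d(signal, guide, radius=2, sigma_s=1.0, sigma_r=0.1):
    """1-D joint bilateral filter: smooth `signal` with Gaussian spatial
    weights and Gaussian range weights computed from `guide`, so that
    edges present in the guide are preserved in the output."""
    out = []
    n = len(signal)
    for i in range(n):
        wsum = vsum = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)
                         - ((guide[i] - guide[j]) ** 2) / (2 * sigma_r ** 2))
            wsum += w
            vsum += w * signal[j]
        out.append(vsum / wsum)
    return out

noisy = [0.1, 0.0, 0.05, 0.9, 1.0, 0.95]   # step edge with noise
guide = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]     # clean guide preserving the edge
print(joint_bilateral_1d(noisy, guide))
```

Noise on either side of the step is averaged away while the step itself stays sharp, which is the reflectance-constancy prior the filtering stage exploits.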
Breast cancer is becoming increasingly prevalent.
Hence, its early detection is a big step in saving the life of any patient.
Mammography is a common tool in breast cancer diagnosis.
The most important step here is classification of mammogram patches as normal-abnormal and benign-malignant.
Texture of a breast in a mammogram patch plays a significant role in these classifications.
We propose a variation of the Histogram of Oriented Gradients (HOG) and Gabor filter combination, called Histogram of Oriented Texture (HOT), that exploits this fact.
We also revisit the Pass Band - Discrete Cosine Transform (PB-DCT) descriptor that captures texture information well.
All features of a mammogram patch may not be useful.
Hence, we apply a feature selection technique called Discrimination Potentiality (DP).
Our resulting descriptors, DP-HOT and DP-PB-DCT, are compared with the standard descriptors.
Density of a mammogram patch is important for classification, and has not been studied exhaustively.
The Image Retrieval in Medical Application (IRMA) database from RWTH Aachen, Germany is a standard database that provides mammogram patches, and most researchers have tested their frameworks only on a subset of patches from this database.
We apply our two new descriptors on all images of the IRMA database for density wise classification, and compare with the standard descriptors.
We achieve higher accuracy than all of the existing standard descriptors (more than 92%).
In this paper we present a working model of an automatic pill reminder and dispenser that can alleviate irregularities in taking the prescribed dosage of medicines at the times dictated by the medical practitioner. It switches from approaches predominantly dependent on human memory to automation with negligible supervision, relieving people of the error-prone task of giving the wrong medicine at the wrong time in the wrong amount.
Visual illusions teach us that what we see is not always what it is represented in the physical world.
Their special nature makes them a fascinating tool for testing and validating new vision models.
In general, current vision models are based on the concatenation of linear convolutions and non-linear operations.
In this paper we get inspiration from the similarity of this structure with the operations present in Convolutional Neural Networks (CNNs).
This motivated us to study if CNNs trained for low-level visual tasks are deceived by visual illusions.
In particular, we show that CNNs trained for image denoising, image deblurring, and computational color constancy are able to replicate the human response to visual illusions, and that the extent of this replication varies with respect to variation in architecture and spatial pattern size.
We believe that this behaviour of CNNs emerges as a by-product of training for the low-level vision tasks of denoising, color constancy, or deblurring.
Our work opens a new bridge between human perception and CNNs: in order to obtain CNNs that better replicate human behaviour, we may need to start aiming for them to better replicate visual illusions.
This paper presents Dokei, an effective supervised domain adaptation method to transform a pre-trained CNN model to one involving efficient grouped convolution.
The basis of this approach is formalised as a novel optimisation problem constrained by group sparsity pattern (GSP), and a practical solution based on structured regularisation and maximal bipartite matching is provided.
We show that it is vital to keep the connections specified by GSP when mapping pre-trained weights to grouped convolution.
We evaluate Dokei on various domains and hardware platforms to demonstrate its effectiveness.
The models resulting from Dokei are shown to be more accurate and slimmer than prior work targeting grouped convolution, and more regular and easier to deploy than other pruning techniques.
Underwater images suffer from color distortion and low contrast, because light is attenuated while it propagates through water.
Attenuation under water varies with wavelength, unlike terrestrial images where attenuation is assumed to be spectrally uniform.
The attenuation depends both on the water body and the 3D structure of the scene, making color restoration difficult.
Unlike existing single-image underwater enhancement techniques, our method takes into account multiple spectral profiles of different water types.
By estimating just two additional global parameters: the attenuation ratios of the blue-red and blue-green color channels, the problem is reduced to single image dehazing, where all color channels have the same attenuation coefficients.
Since the water type is unknown, we evaluate different parameters out of an existing library of water types.
Each type leads to a different restored image and the best result is automatically chosen based on color distribution.
We collected a dataset of images taken in different locations with varying water properties, showing color charts in the scenes.
Moreover, to obtain ground truth, the 3D structure of the scene was calculated based on stereo imaging.
This dataset enables a quantitative evaluation of restoration algorithms on natural images and shows the advantage of our method.
Ecological Momentary Assessment (EMA) data is organized in multiple levels (per-subject, per-day, etc.) and this particular structure should be taken into account in machine learning algorithms used in EMA like decision trees and its variants.
We propose a new algorithm called BBT (standing for Bagged Boosted Trees) that is enhanced by an over/under-sampling method and can provide better estimates for the conditional class probability function.
Experimental results on a real-world dataset show that BBT improves classification performance on EMA data.
Lambda calculus is the basis of functional programming and higher order proof assistants.
However, little is known about combinatorial properties of lambda terms, in particular, about their asymptotic distribution and random generation.
This paper tries to answer questions like: How many terms of a given size are there?
What is a "typical" structure of a simply typable term?
Despite their ostensible simplicity, these questions still remain unanswered, whereas solutions to such problems are essential for testing compilers and optimizing programs whose expected efficiency depends on the size of terms.
Our approach toward the aforementioned problems may be later extended to any language with bound variables, i.e., with scopes and declarations.
This paper presents two complementary approaches: one, theoretical, uses complex analysis and generating functions, the other, experimental, is based on a generator of lambda-terms.
Thanks to de Bruijn indices, we provide three families of formulas for the number of closed lambda terms of a given size and we give four relations between these numbers which have interesting combinatorial interpretations.
As a by-product of the counting formulas, we design an algorithm for generating lambda terms.
The tests performed provide us with experimental data, such as the average depth of bound variables and the average number of head lambdas.
We also create random generators for various sorts of terms.
Thereafter, we conduct experiments that answer questions like: What is the ratio of simply typable terms among all terms?
(Very small!)
How are simply typable lambda terms distributed among all lambda terms?
(A typable term almost always starts with an abstraction.)
In this paper, abstractions and applications have size 1 and variables have size 0.
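Under this size convention, the number of closed terms can be computed by a short memoized recursion over de Bruijn indices. The sketch below is an illustrative counter consistent with the stated sizes, not the paper's generating-function formulas:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def terms(n, m):
    """Number of lambda terms of size n with at most m free de Bruijn indices.
    Convention: abstractions and applications have size 1, variables size 0."""
    if n == 0:
        return m  # a bare variable: m admissible de Bruijn indices
    return (terms(n - 1, m + 1)  # abstraction: binds one more index
            + sum(terms(k, m) * terms(n - 1 - k, m)  # application: split size
                  for k in range(n)))

print([terms(n, 0) for n in range(5)])  # → [0, 1, 3, 14, 82]
```

For instance, the three closed terms of size 2 are λλ.0, λλ.1, and λ.(0 0).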
Twitter introduced lists in late 2009 as a means of curating tweets into meaningful themes.
Lists were quickly adopted by media companies as a means of organising content around news stories.
The curation of these lists is therefore important: they should contain the key information gatekeepers and present a balanced perspective on the story.
Identifying members to add to a list on an emerging topic is a delicate process.
From a network analysis perspective there are a number of views on the Twitter network that can be explored, e.g. followers, retweets, and mentions.
We present a process for integrating these views in order to recommend authoritative commentators to include on a list.
This process is evaluated on manually curated lists about unrest in Bahrain and the Iowa caucuses for the 2012 US election.
This paper describes a novel approach to analyze and control systems with multi-mode oscillation problems.
Traditional single dominant mode analysis fails to provide effective control actions when several modes have similar low damping ratios.
This work addresses this problem by considering all modes in the formulation of the system kinetic oscillation energy.
The integral of energy over time defines the total action as a measure of dynamic performance, and its sensitivity allows comparing the performance of different actuators/locations in the system to select the most effective one to damp the oscillation energy.
Time domain simulations in the IEEE 9-bus system and IEEE 39-bus system verify the findings obtained by the oscillation energy based analysis.
Applications of the proposed method in control and system planning are discussed.
We present a novel semi-supervised approach for sequence transduction and apply it to semantic parsing.
The unsupervised component is based on a generative model in which latent sentences generate the unpaired logical forms.
We apply this method to a number of semantic parsing tasks focusing on domains with limited access to labelled training data and extend those datasets with synthetically generated logical forms.
The ability to perceive, learn, and use generalities, similarities, and classes, i.e., semantic memory (SM), is central to cognition.
Machine learning (ML), neural network, and AI research has been primarily driven by tasks requiring such abilities.
However, another central facet of cognition, single-trial formation of permanent memories of experiences, i.e., episodic memory (EM), has had relatively little focus.
Only recently has EM-like functionality been added to Deep Learning (DL) models, e.g., Neural Turing Machine, Memory Networks.
However, in these cases: a) EM is implemented as a separate module, which entails substantial data movement (and so, time and power) between the DL net itself and EM; and b) individual items are stored localistically within the EM, precluding realizing the exponential representational efficiency of distributed over localist coding.
We describe Sparsey, an unsupervised, hierarchical, spatial/spatiotemporal associative memory model differing fundamentally from mainstream ML models, most crucially, in its use of sparse distributed representations (SDRs), or, cell assemblies, which admits an extremely efficient, single-trial learning algorithm that maps input similarity into code space similarity (measured as intersection).
SDRs of individual inputs are stored in superposition and because similarity is preserved, the patterns of intersections over the assigned codes reflect the similarity, i.e., statistical, structure, of all orders, not simply pairwise, over the inputs.
Thus, SM, i.e., a generative model, is built as a computationally free side effect of the act of storing episodic memory traces of individual inputs, either spatial patterns or sequences.
We report initial results on MNIST and on the Weizmann video event recognition benchmarks.
While we have not yet attained SOTA class accuracy, learning takes only minutes on a single CPU.
Recent advances in self-interference cancellation enable radios to transmit and receive on the same frequency at the same time.
Such a full duplex radio is being considered as a potential candidate for the next generation of wireless networks due to its ability to increase the spectral efficiency of wireless systems.
In this paper, the performance of full duplex radio in small cellular systems is analyzed by assuming full duplex capable base stations and half duplex user equipment.
However, using only full duplex base stations increases interference, leading to outage.
We therefore propose a mixed multi-cell system, composed of full duplex and half duplex cells.
A stochastic geometry based model of the proposed mixed system is provided, which allows us to derive the outage and area spectral efficiency of such a system.
The effect of full duplex cells on the performance of the mixed system is presented under different network parameter settings.
We show that the fraction of cells that have full duplex base stations can be used as a design parameter by the network operator to target an optimal tradeoff between area spectral efficiency and outage in a mixed system.
The recent development of multi-agent simulations brings about a need for population synthesis.
This is the task of reconstructing the entire population from a sampling survey of limited size (around 1%), supplying the initial conditions from which simulations begin.
This paper presents a new kernel density estimator for this task.
Our method is an analogue of the classical Breiman-Meisel-Purcell estimator, but employs novel techniques that harness the huge degree of freedom which is required to model high-dimensional nonlinearly correlated datasets: the crossover kernel, the k-nearest neighbor restriction of the kernel construction set and the bagging of kernels.
The performance as a statistical estimator is examined through real and synthetic datasets.
We provide an "optimization-free" parameter selection rule for our method, a theory of how our method works and a computational cost analysis.
To demonstrate the usefulness as a population synthesizer, our method is applied to a household synthesis task for an urban micro-simulator.
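For context, the classical fixed-bandwidth Gaussian KDE that such estimators generalize fits in a few lines. This is a generic textbook sketch, not the crossover-kernel, k-nearest-neighbor, bagged estimator proposed here:

```python
import math

def gaussian_kde(samples, h):
    """Classical 1D kernel density estimator with a fixed Gaussian bandwidth h."""
    n = len(samples)
    norm = n * h * math.sqrt(2 * math.pi)
    def density(x):
        # Sum one Gaussian bump per sample, centered at the sample.
        return sum(math.exp(-((x - s) / h) ** 2 / 2) for s in samples) / norm
    return density

f = gaussian_kde([0.0, 0.1, -0.2, 2.0], h=0.5)
# Density is high near the cluster of samples and vanishes far away.
print(f(0.0) > f(5.0))  # → True
```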
There is currently an active search for Post-Quantum Cryptography (PQC) solutions: cryptographic protocols resistant to attacks by means of, for instance, Shor's polynomial-time algorithm for number-theoretic problems such as integer factorization (IFP) or the discrete logarithm problem (DLP).
The use of non-commutative or non-associative structures is, among others, a valid choice for these kinds of protocols.
In our case, we focus on a permutation subgroup of high order and belonging to the symmetric group S381.
Using adequate one-way functions (OWFs), we derive a Diffie-Hellman key exchange and an ElGamal ciphering procedure that rely only on combinatorial operations.
Both OWFs pose hard search problems which are assumed not to belong to the BQP time-complexity class.
Obvious advantages of the present protocols are their conceptual simplicity, high-throughput implementations, high cryptanalytic security, and the absence of arithmetic operations, and therefore of extended-precision libraries.
Such features make them suitable for low performance and low power consumption platforms like smart cards, USB-keys and cellphones.
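The combinatorial-only flavor of such a key exchange can be sketched with permutation composition alone. The toy below works in a small symmetric group with a random generator; it illustrates the Diffie-Hellman mechanics, not the paper's high-order subgroup of S381 or its actual OWFs:

```python
import random

def compose(p, q):
    """Permutation composition: (p∘q)[i] = p[q[i]]."""
    return tuple(p[i] for i in q)

def power(p, e):
    """Raise a permutation to the e-th power by repeated squaring."""
    result = tuple(range(len(p)))  # identity permutation
    while e:
        if e & 1:
            result = compose(result, p)
        p = compose(p, p)
        e >>= 1
    return result

n = 16
g = tuple(random.sample(range(n), n))  # public generator permutation
a = random.randrange(2, 1000)          # Alice's secret exponent
b = random.randrange(2, 1000)          # Bob's secret exponent
A, B = power(g, a), power(g, b)        # exchanged in the clear
shared_alice = power(B, a)             # g^(ab)
shared_bob = power(A, b)               # g^(ba) — powers of g commute
print(shared_alice == shared_bob)      # → True
```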
In practical mobile communication engineering applications, surfaces of antenna array deployment regions are usually uneven.
Therefore, massive multi-input-multi-output (MIMO) communication systems usually transmit wireless signals by irregular antenna arrays.
To evaluate the performance of irregular antenna arrays, the matrix correlation coefficient and ergodic received gain are defined for massive MIMO communication systems with mutual coupling effects.
Furthermore, the lower bound on the ergodic achievable rate, the symbol error rate (SER), and the average outage probability are derived for the first time for multi-user massive MIMO communication systems using irregular antenna arrays.
Asymptotic results are also derived when the number of antennas approaches infinity.
Numerical results indicate that there exists a maximum achievable rate when the number of antennas keeps increasing in massive MIMO communication systems using irregular antenna arrays.
Moreover, the irregular antenna array outperforms the regular antenna array in the achievable rate of massive MIMO communication systems when the number of antennas is larger than or equal to a given threshold.
Community detection in complex networks is an important problem that has attracted much interest in recent years.
In general, a community detection algorithm chooses an objective function and captures the communities of the network by optimizing it; various heuristics are then used to solve the optimization problem and extract the communities of interest to the user.
In this article, we demonstrate the procedure to transform a graph into points of a metric space and develop the methods of community detection with the help of a metric defined for a pair of points.
We have also studied and analyzed the community structure of the network therein.
The results obtained with our approach are very competitive with most of the well-known algorithms in the literature, and this is justified over the large collection of datasets.
On the other hand, the time taken by our algorithm is considerably lower than that of other methods, which justifies the theoretical findings.
This paper presents a wp-style calculus for obtaining bounds on the expected run-time of probabilistic programs.
Its applications include determining the (possibly infinite) expected termination time of a probabilistic program and proving positive almost-sure termination: does a program terminate with probability one in finite expected time?
We provide several proof rules for bounding the run-time of loops, and prove the soundness of the approach with respect to a simple operational model.
We show that our approach is a conservative extension of Nielson's approach for reasoning about the run-time of deterministic programs.
We analyze the expected run-time of some example programs including a one-dimensional random walk and the coupon collector problem.
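The coupon collector example mentioned above has the closed-form expected run-time n·H_n, which a small script can compute exactly and check against simulation. This is an illustrative computation of the quantity the calculus bounds, not the wp-style calculus itself:

```python
from fractions import Fraction
import random

def expected_coupon_time(n):
    """Exact expected number of draws to collect all n coupons: n * H_n."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))

def simulate(n, trials=2000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, steps = set(), 0
        while len(seen) < n:
            seen.add(rng.randrange(n))
            steps += 1
        total += steps
    return total / trials

print(expected_coupon_time(2))  # → 3
```

For n = 5 the exact value is 5·(1 + 1/2 + 1/3 + 1/4 + 1/5) = 137/12 ≈ 11.42, and the simulation agrees closely.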
Spurred by the development of cloud computing, there has been considerable recent interest in the Database-as-a-Service (DaaS) paradigm.
Users lacking in expertise or computational resources can outsource their data and database management needs to a third-party service provider.
Outsourcing, however, raises an important issue of result integrity: how can the client verify with lightweight overhead that the query results returned by the service provider are correct (i.e., the same as the results of query execution locally)?
This survey focuses on categorizing and reviewing the progress on the current approaches for result integrity of SQL query evaluation in the DaaS model.
The survey also includes some potential future research directions for result integrity verification of the outsourced computations.
We propose a cost-effective framework for preference elicitation and aggregation under the Plackett-Luce model with features.
Given a budget, our framework iteratively computes the most cost-effective elicitation questions in order to help the agents make a better group decision.
We illustrate the viability of the framework with experiments on Amazon Mechanical Turk, which we use to estimate the cost of answering different types of elicitation questions.
We compare the prediction accuracy of our framework when adopting various information criteria that evaluate the expected information gain from a question.
Our experiments show that carefully designed information criteria are much more efficient, i.e., they arrive at the correct answer using fewer queries, than randomly asking questions under the budget constraint.
We study complex time series (spike trains) of online user communication while spreading messages about the discovery of the Higgs boson in Twitter.
We focus on online social interactions among users such as retweet, mention, and reply, and construct different types of active (performing an action) and passive (receiving an action) spike trains for each user.
The spike trains are analyzed by means of local variation, to quantify the temporal behavior of active and passive users, as a function of their activity and popularity.
We show that the active spike trains are bursty, independently of their activation frequency.
For passive spike trains, in contrast, the local variation of popular users presents uncorrelated (Poisson random) dynamics.
We further characterize the correlations of the local variation in different interactions.
We obtain high values of correlation, and thus consistent temporal behavior, between retweets and mentions, but only for popular users, indicating that attracting online attention aligns the dynamics of the two interactions.
This paper describes a knowledge representation model called the Object-Oriented Dynamic Network (OODN), which makes it possible to represent knowledge that can be modified over time, to build new relations between objects and classes of objects, and to represent the results of their modifications.
The model is based on representing objects via their properties and methods.
This makes it possible to classify the objects and, in a sense, to build a hierarchy of their types.
Furthermore, it enables representing the modification relation between concepts, building new classes of objects from existing classes, and creating sets and multisets of concepts.
An OODN can be represented as a connected directed graph, where nodes are concepts and edges are relations between them.
Using such a model of knowledge representation, we can consider modifications of knowledge and movement through the graph of the network as a process of logical reasoning, of finding the right solutions, of creativity, etc.
The proposed approach gives us an opportunity to model some aspects of the human knowledge system and the main mechanisms of human thought, in particular the acquisition of new experience and knowledge.
Dynamic oracles provide strong supervision for training constituency parsers with exploration, but must be custom defined for a given parser's transition system.
We explore using a policy gradient method as a parser-agnostic alternative.
In addition to directly optimizing for a tree-level metric such as F1, policy gradient has the potential to reduce exposure bias by allowing exploration during training; moreover, it does not require a dynamic oracle for supervision.
On four constituency parsers in three languages, the method substantially outperforms static oracle likelihood training in almost all settings.
For parsers where a dynamic oracle is available (including a novel oracle which we define for the transition system of Dyer et al. (2016)), policy gradient typically recaptures a substantial fraction of the performance gain afforded by the dynamic oracle.
Given the recent advances in depth prediction from Convolutional Neural Networks (CNNs), this paper investigates how predicted depth maps from a deep neural network can be deployed for accurate and dense monocular reconstruction.
We propose a method where CNN-predicted dense depth maps are naturally fused together with depth measurements obtained from direct monocular SLAM.
Our fusion scheme privileges depth prediction in image locations where monocular SLAM approaches tend to fail, e.g. along low-textured regions, and vice versa.
We demonstrate the use of depth prediction for estimating the absolute scale of the reconstruction, hence overcoming one of the major limitations of monocular SLAM.
Finally, we propose a framework to efficiently fuse semantic labels, obtained from a single frame, with dense SLAM, yielding semantically coherent scene reconstruction from a single view.
Evaluation results on two benchmark datasets show the robustness and accuracy of our approach.
A distributed discrete-time algorithm is proposed for multi-agent networks to achieve a common least squares solution of a group of linear equations, in which each agent only knows some of the equations and is only able to receive information from its nearby neighbors.
For fixed, connected, and undirected networks, the proposed discrete-time algorithm drives each agent's solution estimate to converge exponentially fast to the same least squares solution.
Moreover, the convergence does not require careful choices of time-varying small step sizes.
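The flavor of such a distributed scheme can be sketched with a scalar ratio-consensus toy: each agent privately holds one equation a_i·x = b_i and only averages with its ring neighbors, yet every estimate reaches the common least squares solution x* = Σa_i·b_i / Σa_i². This is an illustrative stand-in under simplifying assumptions, not the paper's algorithm:

```python
a = [1.0, 2.0, 3.0, 4.0]   # agent i knows only a[i], b[i]
b = [2.0, 3.0, 5.0, 9.0]
m = len(a)
num = [ai * bi for ai, bi in zip(a, b)]  # local statistic a_i * b_i
den = [ai * ai for ai in a]              # local statistic a_i^2
for _ in range(200):
    # Average consensus on both statistics, ring neighbors only.
    num = [(num[i] + num[(i - 1) % m] + num[(i + 1) % m]) / 3.0 for i in range(m)]
    den = [(den[i] + den[(i - 1) % m] + den[(i + 1) % m]) / 3.0 for i in range(m)]
x = [num[i] / den[i] for i in range(m)]  # every agent's estimate of x*
print(x)  # each entry ≈ 59/30 ≈ 1.9667
```

The averaging matrix is doubly stochastic, so both running means are preserved and the disagreement contracts geometrically; every agent ends with the same ratio Σa_i·b_i / Σa_i².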
With the rapid growth of medical imaging research, there is a great interest in the automated detection of skin lesions with computer algorithms.
The state-of-the-art datasets for skin lesions are often accompanied by a very limited amount of ground-truth labeling, as it is laborious and expensive.
Region of interest (ROI) detection is vital for locating the lesion accurately, and must be robust to the subtle features of different skin lesion types.
In this work, we propose the use of two object localization meta-architectures for end-to-end ROI skin lesion detection in dermoscopic images.
We trained Faster-RCNN-InceptionV2 and SSD-InceptionV2 on the ISBI-2017 training dataset and evaluated their performance on the ISBI-2017 test set and the PH2 and HAM10000 datasets.
Since there was no earlier work on ROI detection for skin lesions with CNNs, we compare the performance of the localization methods with a state-of-the-art segmentation method.
The localization methods proved superior to the segmentation method for ROI detection on skin lesion datasets.
In addition, based on the detected ROI, an automated natural data-augmentation method is proposed.
To demonstrate the potential of our work, we developed a real-time mobile application for automated skin lesion detection.
The codes and mobile application will be made available for further research purposes.
Humans develop a common sense of style compatibility between items based on their attributes.
We seek to automatically answer questions like "Does this shirt go well with that pair of jeans?"
In order to answer these kinds of questions, we attempt to model human sense of style compatibility in this paper.
The basic assumption of our approach is that most of the important attributes for a product in an online store are included in its title description.
Therefore it is feasible to learn style compatibility from these descriptions.
We design a Siamese Convolutional Neural Network architecture and feed it with title pairs of items, which are either compatible or incompatible.
Those pairs will be mapped from the original space of symbolic words into some embedded style space.
Our approach takes only words as input with little preprocessing, and requires no laborious and expensive feature engineering.
Time-Series Classification (TSC) has attracted a lot of attention in pattern recognition, because a wide range of applications from different domains, such as finance and health informatics, deal with time-series signals.
The Bag-of-Features (BoF) model has achieved great success in the TSC task by summarizing signals according to the frequencies of "feature words" from a data-learned dictionary.
This paper proposes embedding the Recurrence Plots (RP), a visualization technique for analysis of dynamic systems, in the BoF model for TSC.
While the traditional BoF approach extracts features from 1D signal segments, this paper uses the RP to transform time-series into 2D texture images and then applies the BoF on them.
Image representation of time-series enables us to explore different visual descriptors that are not available for 1D signals and to treat the TSC task as a texture recognition problem.
Experimental results on the UCI time-series classification archive demonstrate a significant accuracy boost by the proposed Bag of Recurrence patterns (BoR), compared not only to existing BoF models, but also to state-of-the-art algorithms.
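The RP transform itself amounts to thresholding pairwise distances between signal states. A minimal sketch, assuming a plain 1D signal without phase-space embedding (real recurrence plots usually embed the signal first):

```python
import math

def recurrence_plot(signal, eps):
    """Binary recurrence plot: R[i][j] = 1 when states i and j are eps-close."""
    n = len(signal)
    return [[1 if abs(signal[i] - signal[j]) < eps else 0 for j in range(n)]
            for i in range(n)]

# A periodic signal produces diagonal lines offset by its period (20 samples).
x = [math.sin(2 * math.pi * t / 20) for t in range(60)]
R = recurrence_plot(x, eps=0.1)
print(all(R[i][i + 20] == 1 for i in range(40)))  # → True
```

The resulting 2D binary texture is what the BoF pipeline then treats as an image.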
Networks based on distributed caching of content are a new architecture for alleviating the ongoing explosive demand for multimedia traffic.
In caching networks, coded caching is a recently proposed technique that achieves significant performance gains compared to uncoded caching schemes.
In this paper, we derive a lower bound on the average rate under a memory constraint for a family of cache allocation placements and a family of XOR-based cooperative delivery schemes.
The lower bound reveals how placement and delivery affect the rate-memory tradeoff.
Based on these insights, we design a new placement scheme and two new delivery algorithms.
On one hand, the new placement scheme can allocate the cache more flexibly than the grouping scheme.
On the other hand, the new delivery schemes can exploit more cooperative opportunities than the known schemes.
Simulations validate our design.
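The basic XOR cooperative-delivery gain can be seen in the classic two-user example (a textbook Maddah-Ali–Niesen style toy, not the new placement and delivery schemes of this paper): each user caches half of every file, and one coded broadcast serves both requests at once.

```python
def xor(x, y):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))

# Two files split into halves: A = A1|A2, B = B1|B2.
A1, A2 = b"half-A1!", b"half-A2!"
B1, B2 = b"half-B1!", b"half-B2!"
# Placement: user 1 caches (A1, B1); user 2 caches (A2, B2).
# Requests: user 1 wants A, user 2 wants B.
broadcast = xor(A2, B1)            # one coded transmission serves both users
user1_A = A1 + xor(broadcast, B1)  # user 1 cancels cached B1, recovers A2
user2_B = xor(broadcast, A2) + B2  # user 2 cancels cached A2, recovers B1
print(user1_A == A1 + A2 and user2_B == B1 + B2)  # → True
```

Uncoded delivery would need two half-file transmissions; the XOR reduces this to one.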
Recently, convolutional neural networks (CNNs) have attracted much attention in different areas of computer vision, due to their powerful abstract feature representation.
Visual object tracking is one of the interesting and important areas of computer vision that has achieved remarkable improvements in recent years.
In this work, we aim to improve both the motion and observation models in visual object tracking by leveraging representation power of CNNs.
To this end, a motion estimation network (named MEN) is utilized to seek the most likely locations of the target and prepare a further clue in addition to the previous target position.
Hence the motion estimation would be enhanced by generating a small number of candidates near two plausible positions.
The generated candidates are then fed into a trained Siamese network to detect the most probable candidate.
Each candidate is compared to an adaptable buffer, which is updated under a predefined condition.
To take into account the target appearance changes, a weighting CNN (called WCNN) adaptively assigns weights to the final similarity scores of the Siamese network using sequence-specific information.
Evaluation results on well-known benchmark datasets (OTB100, OTB50 and OTB2013) prove that the proposed tracker outperforms the state-of-the-art competitors.
We present a new network model accounting for multidimensional assortativity.
Each node is characterized by a number of features and the probability of a link between two nodes depends on common features.
We do not fix a priori the total number of possible features.
The bipartite network of the nodes and the features evolves according to a stochastic dynamics that depends on three parameters that respectively regulate the preferential attachment in the transmission of the features to the nodes, the number of new features per node, and the power-law behavior of the total number of observed features.
Our model also takes into account a mechanism of triadic closure.
We provide theoretical results and statistical estimators for the parameters of the model.
We validate our approach by means of simulations and an empirical analysis of a network of scientific collaborations.
This paper studies the joint support recovery of similar sparse vectors on the basis of a limited number of noisy linear measurements, i.e., in a multiple measurement vector (MMV) model.
The additive noise signals on each measurement vector are assumed to be Gaussian and to exhibit different variances.
The simultaneous orthogonal matching pursuit (SOMP) algorithm is generalized to weight the impact of each measurement vector on the choice of the atoms to be picked according to their noise levels.
The new algorithm is referred to as SOMP-NS where NS stands for noise stabilization.
To begin with, a theoretical framework to analyze the performance of the proposed algorithm is developed.
This framework is then used to build conservative lower bounds on the probability of partial or full joint support recovery.
Numerical simulations show that the proposed algorithm outperforms SOMP and that the theoretical lower bound provides a great insight into how SOMP-NS behaves when the weighting strategy is modified.
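The noise-stabilization idea can be illustrated with a simplified matching-pursuit-style sketch in which each measurement vector's correlations are weighted by 1/σ_k², so noisier channels influence the support choice less. The function name, toy orthonormal dictionary, and data below are ours for illustration; this is not the exact SOMP-NS algorithm:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_pursuit(Y, atoms, sigmas, n_iter):
    """Y: measurement vectors; atoms: unit-norm dictionary columns;
    sigmas: per-measurement noise levels used to weight atom selection."""
    residuals = [list(y) for y in Y]
    weights = [1.0 / s ** 2 for s in sigmas]
    support = []
    for _ in range(n_iter):
        # Noise-weighted aggregate correlation of every atom with all residuals.
        scores = [sum(w * dot(atom, r) ** 2 for w, r in zip(weights, residuals))
                  for atom in atoms]
        j = max(range(len(atoms)), key=scores.__getitem__)
        support.append(j)
        for r in residuals:  # subtract each residual's projection on atom j
            c = dot(atoms[j], r)
            for i in range(len(r)):
                r[i] -= c * atoms[j][i]
    return sorted(set(support))

# Orthonormal toy dictionary (standard basis of R^4); joint support is {0, 2}.
atoms = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
Y = [[3.0, 0.0, 1.5, 0.0], [2.0, 0.0, 2.5, 0.0]]
print(weighted_pursuit(Y, atoms, sigmas=[1.0, 2.0], n_iter=2))  # → [0, 2]
```

With equal weights the noisier second vector would pull the selection more strongly; the 1/σ² weighting discounts it.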
Opportunistic detection rules (ODRs) are variants of fixed-sample-size detection rules in which the statistician is allowed to make an early decision on the alternative hypothesis opportunistically based on the sequentially observed samples.
From a sequential decision perspective, ODRs are also mixtures of one-sided and truncated sequential detection rules.
Several results regarding ODRs are established in this paper.
In the finite regime, the maximum sample size is modeled either as a fixed finite number, or a geometric random variable with a fixed finite mean.
For both cases, the corresponding Bayesian formulations are investigated.
The former case is a slight variation of the well-known finite-length sequential hypothesis testing procedure in the literature, whereas the latter case is new, for which the Bayesian optimal ODR is shown to be a sequence of likelihood ratio threshold tests with two different thresholds: a running threshold, which is determined by solving a stationary state equation, is used when future samples are still available, and a terminal threshold (simply the ratio between the priors scaled by costs) is used when the statistician reaches the final sample and thus has to make a decision immediately.
In the asymptotic regime, the tradeoff among the exponents of the (false alarm and miss) error probabilities and the normalized expected stopping time under the alternative hypothesis is completely characterized and proved to be tight, via an information-theoretic argument.
Within the tradeoff region, one noteworthy fact is that the performance of the Stein-Chernoff Lemma is attainable by ODRs.
Modeling should play a central role in K-12 STEM education, where it could make classes much more engaging.
A model underlies every scientific theory, and models are central to all the STEM disciplines (Science, Technology, Engineering, Math).
This paper describes executable concept modeling of STEM concepts using immutable objects and pure functions in Python.
I present examples in math, physics, chemistry, and engineering, built using a proof-of-concept tool called PySTEMM.
The approach applies to all STEM areas and supports learning with pictures, narrative, animation, and graph plots.
Models can extend one another, which simplifies getting started.
The functional-programming style reduces incidental complexity and code debugging.
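The modeling style can be illustrated with a minimal immutable-object, pure-function concept model. This is our own toy example in the spirit described, not code from the PySTEMM tool:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # immutable: assignment after construction raises
class Projectile:
    """A physics concept: 1D height and velocity, SI units."""
    height: float    # meters
    velocity: float  # m/s

def step(p: Projectile, dt: float, g: float = 9.8) -> Projectile:
    """Pure function: returns a new state; the input object is never mutated."""
    return Projectile(height=p.height + p.velocity * dt,
                      velocity=p.velocity - g * dt)

p0 = Projectile(height=0.0, velocity=19.6)
p1 = step(p0, dt=1.0)
print(p1)  # a new state, one second later
print(p0)  # unchanged original state
```

Because states are values, stepping a model is just function composition, which keeps debugging simple: any state can be inspected or replayed without hidden mutation.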
In this paper, naive Bayesian and C4.5 decision tree classifiers (DTC) are applied successively to materials informatics to classify engineering materials into different classes for the selection of materials that suit the input design specifications.
The classifiers are analyzed individually, and their performance is evaluated with confusion-matrix predictive parameters and standard measures; the classification results are analyzed for different classes of materials.
Comparison of the classifiers has found that the naive Bayesian classifier is more accurate than the C4.5 DTC.
The knowledge discovered by the naive Bayesian classifier can be employed for decision making in materials selection in manufacturing industries.
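The naive Bayesian side of such a comparison fits in a few lines of pure Python. The toy features and material classes below are invented for illustration and are not the paper's dataset:

```python
from collections import Counter, defaultdict

# Training rows: (strength, corrosion_resistance) -> material class (toy data).
train = [(("high", "low"), "steel"), (("high", "high"), "stainless"),
         (("low", "high"), "aluminium"), (("high", "low"), "steel"),
         (("low", "high"), "aluminium"), (("high", "high"), "stainless")]

classes = Counter(label for _, label in train)
counts = defaultdict(Counter)  # counts[(feature_index, class)][value]
for features, label in train:
    for i, v in enumerate(features):
        counts[(i, label)][v] += 1

def predict(features):
    def score(c):
        p = classes[c] / len(train)  # class prior
        for i, v in enumerate(features):
            # Laplace-smoothed likelihood; each feature has 2 possible values.
            p *= (counts[(i, c)][v] + 1) / (classes[c] + 2)
        return p
    return max(classes, key=score)

print(predict(("high", "low")))  # → steel
```

The conditional independence assumption turns classification into a product of per-feature likelihoods, which is exactly what makes the discovered knowledge easy to inspect for materials selection.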
This paper presents a proposal (story) of how statically detecting unreachable objects (in Java) could be used to improve a particular runtime verification approach (for Java), namely parametric trace slicing.
Monitoring algorithms for parametric trace slicing depend on garbage collection to (i) cleanup data-structures storing monitored objects, ensuring they do not become unmanageably large, and (ii) anticipate the violation of (non-safety) properties that cannot be satisfied as a monitored object can no longer appear later in the trace.
The proposal is that both usages can be improved by making the unreachability of monitored objects explicit in the parametric property and statically introducing additional instrumentation points generating related events.
The ideas presented in this paper are still exploratory and the intention is to integrate the described techniques into the MarQ monitoring tool for quantified event automata.
In neural abstractive summarization, the conventional sequence-to-sequence (seq2seq) model often suffers from repetition and semantic irrelevance.
To tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context.
It consists of a convolutional gated unit that performs global encoding to improve the representations of the source-side information.
Evaluations on the LCSTS and the English Gigaword both demonstrate that our model outperforms the baseline models, and the analysis shows that our model is capable of reducing repetition.
Mammogram classification is directly related to computer-aided diagnosis of breast cancer.
Traditional methods require great effort to annotate the training data through costly manual labeling, and specialized computational models to detect these annotations at test time.
Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on whole mammogram without the aforementioned costly need to annotate the training data.
We explore three different schemes to construct deep multi-instance networks for whole mammogram classification.
Experimental results on the INbreast dataset demonstrate the robustness of proposed deep networks compared to previous work using segmentation and detection annotations in the training.
This paper proposes a new cubical space model for the representation of continuous objects and surfaces in the n-dimensional Euclidean space by discrete sets of points.
The cubical space model concerns the process of converting a continuous object into its digital counterpart, which is a graph, enabling us to apply notions and operations used in digital imaging to cubical spaces.
We formulate a definition of a simple n-cube and prove that deleting or attaching a simple n-cube does not change the homotopy type of a cubical space.
Relying on these results, we design a procedure, which preserves basic topological properties of an n-dimensional object, for constructing compressed cubical and digital models.
With the ever-increasing size of the web, relevant information extraction on the Internet with a query formed by a few keywords has become a big challenge.
To overcome this, query expansion (QE) plays a crucial role in improving the Internet searches, where the user's initial query is reformulated to a new query by adding new meaningful terms with similar significance.
QE -- as part of information retrieval (IR) -- has long attracted researchers' attention.
It has also become very influential in the fields of personalized social documents, Question Answering over Linked Data (QALD), and the Text Retrieval Conference (TREC) and REAL datasets.
This paper surveys QE techniques in IR from 1960 to 2017 with respect to core techniques, data sources used, weighting and ranking methodologies, user participation and applications (of QE techniques) -- bringing out similarities and differences.
The efficient sparse coding and reconstruction of signal vectors via linear observations has received a tremendous amount of attention over the last decade.
In this context, the automated learning of a suitable basis or overcomplete dictionary from training data sets of certain signal classes for use in sparse representations has turned out to be of particular importance regarding practical signal processing applications.
Most popular dictionary learning algorithms involve NP-hard sparse recovery problems in each iteration, which may give some indication about the complexity of dictionary learning but does not constitute an actual proof of computational intractability.
In this technical note, we show that learning a dictionary with which a given set of training signals can be represented as sparsely as possible is indeed NP-hard.
Moreover, we also establish hardness of approximating the solution to within large factors of the optimal sparsity level.
Furthermore, we give NP-hardness and non-approximability results for a recent dictionary learning variation called the sensor permutation problem.
Along the way, we also obtain a new non-approximability result for the classical sparse recovery problem from compressed sensing.
The purpose of this study is to determine whether current video datasets have sufficient data for training very deep convolutional neural networks (CNNs) with spatio-temporal three-dimensional (3D) kernels.
Recently, the performance levels of 3D CNNs in the field of action recognition have improved significantly.
However, to date, conventional research has only explored relatively shallow 3D architectures.
We examine the architectures of various 3D CNNs from relatively shallow to very deep ones on current video datasets.
Based on the results of those experiments, the following conclusions can be drawn: (i) ResNet-18 training resulted in significant overfitting for UCF-101, HMDB-51, and ActivityNet but not for Kinetics.
(ii) The Kinetics dataset has sufficient data for training of deep 3D CNNs, and enables training of ResNets with up to 152 layers, interestingly similar to 2D ResNets on ImageNet.
ResNeXt-101 achieved 78.4% average accuracy on the Kinetics test set.
(iii) Simple 3D architectures pretrained on Kinetics outperform complex 2D architectures, and the pretrained ResNeXt-101 achieved 94.5% and 70.2% on UCF-101 and HMDB-51, respectively.
The use of 2D CNNs trained on ImageNet has produced significant progress in various image tasks.
We believe that using deep 3D CNNs together with Kinetics will retrace the successful history of 2D CNNs and ImageNet, and stimulate advances in computer vision for videos.
The codes and pretrained models used in this study are publicly available. https://github.com/kenshohara/3D-ResNets-PyTorch
Active contour models based on local region fitting energy can effectively segment images with intensity inhomogeneity, but their segmentation results are prone to error if the initial contour is inappropriate.
In this paper, we present a simple and universal method of improving the robustness to the initial contour for these local fitting-based models.
The core idea of the proposed method is to exchange the fitting values on the two sides of the contour, so that the fitting values inside the contour are always larger (or smaller) than the values outside the contour during curve evolution.
In this way, the whole curve evolves along the inner (or outer) boundaries of the object, and is less likely to become stuck in the object or background.
Experimental results show that the proposed method enhances robustness to the initial contour while preserving the original advantages of the local fitting-based models.
Consider a set of agents that wish to estimate a vector of parameters of their mutual interest.
For this estimation goal, agents can sense and communicate.
When sensing, an agent measures (in additive Gaussian noise) linear combinations of the unknown vector of parameters.
When communicating, an agent can broadcast information to a few other agents, by using the channels that happen to be randomly at its disposal at the time.
To coordinate the agents towards their estimation goal, we propose a novel algorithm called FADE (Fast and Asymptotically efficient Distributed Estimator), in which agents collaborate at discrete time-steps; at each time-step, agents sense and communicate just once, while also updating their own estimate of the unknown vector of parameters.
FADE enjoys five attractive features: first, it is an intuitive estimator, simple to derive; second, it withstands dynamic networks, that is, networks whose communication channels change randomly over time; third, it is strongly consistent in that, as time-steps play out, each agent's local estimate converges (almost surely) to the true vector of parameters; fourth, it is both asymptotically unbiased and efficient, which means that, across time, each agent's estimate becomes unbiased and the mean-square error (MSE) of each agent's estimate vanishes to zero at the same rate as the MSE of the optimal estimator at an almighty central node; fifth, and most importantly, when compared with a state-of-the-art consensus+innovation (CI) algorithm, it yields estimates with substantially lower mean-square errors for the same number of communications; for example, in a sparsely connected network model with 50 agents, we find through numerical simulations that the reduction can be dramatic, reaching several orders of magnitude.
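The sense-communicate-innovate loop described above can be sketched as follows. This is an illustrative consensus+innovations-style simulation, not the exact FADE updates: the ring network, mixing weights, step size, and dimensions are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 agents on a fixed ring, each sensing 3 noisy
# linear combinations of a 2-dimensional parameter vector per time-step.
theta = np.array([1.0, -2.0])          # unknown parameter vector
n_agents = 5
H = [rng.standard_normal((3, 2)) for _ in range(n_agents)]

# Doubly stochastic mixing weights for the ring network.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i + 1) % n_agents] = 0.25
    W[i, (i - 1) % n_agents] = 0.25

x = np.zeros((n_agents, 2))            # each agent's local estimate
step = 0.05
for _ in range(2000):
    y = [Hi @ theta + 0.1 * rng.standard_normal(3) for Hi in H]  # sense
    x = W @ x                                                    # communicate
    for i in range(n_agents):                                    # innovate
        x[i] += step * H[i].T @ (y[i] - H[i] @ x[i])
```

After enough time-steps, every agent's local estimate hovers near `theta`, illustrating the consistency property claimed in the abstract.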
Temporary earth retaining structures (TERS) help prevent collapse during construction excavation.
To ensure that these structures are operating within design specifications, load forces on supports must be monitored.
Current monitoring approaches are expensive, sparse, off-line, and thus difficult to integrate into predictive models.
This work aims to show that wirelessly connected battery powered sensors are feasible, practical, and have similar accuracy to existing sensor systems.
We present the design and validation of ReStructure, an end-to-end prototype wireless sensor network for collection, communication, and aggregation of strain data.
ReStructure was validated through a six-month deployment on a real-life excavation site, with all but one node producing valid and accurate strain measurements at a higher frequency than existing systems.
These results and the lessons learnt provide the basis for future widespread wireless TERS monitoring that increases measurement density and integrates closely with predictive models to provide timely alerts of damage or potential failure.
In this letter, a very simple no-reference image quality assessment (NR-IQA) model for JPEG compressed images is proposed.
The proposed metric, called median of unique gradients (MUG), is based on simple facts about the unique gradient magnitudes of JPEG compressed images.
MUG is a parameterless metric and does not need training.
Unlike other NR-IQAs, MUG is independent of block size and cropping.
A more stable index called MUG+ is also introduced.
The experimental results on six benchmark datasets of natural images and a benchmark dataset of synthetic images show that MUG is comparable to state-of-the-art indices in the literature.
In addition, its performance remains unchanged for the case of the cropped images in which block boundaries are not known.
The MATLAB source code of the proposed metrics is available at https://dl.dropboxusercontent.com/u/74505502/MUG.m and https://dl.dropboxusercontent.com/u/74505502/MUGplus.m.
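To make the name concrete, here is a minimal sketch of "median of unique gradient magnitudes". This is our own illustrative reading of the metric's name, not the authors' released MATLAB code, and the gradient operator chosen here is an assumption.

```python
import numpy as np

def mug(image):
    """Sketch of the MUG idea: the median of the unique gradient
    magnitudes of an image (illustrative, not the reference code)."""
    img = np.asarray(image, dtype=np.float64)
    gx = np.diff(img, axis=1)   # horizontal finite-difference gradient
    gy = np.diff(img, axis=0)   # vertical finite-difference gradient
    mags = np.concatenate([np.abs(gx).ravel(), np.abs(gy).ravel()])
    # Heavy JPEG quantization collapses many gradient values together,
    # so the set of *unique* magnitudes shrinks and shifts.
    unique_mags = np.unique(mags)
    return np.median(unique_mags)
```

Because the statistic is computed over the whole set of unique magnitudes rather than per block, it needs no knowledge of block boundaries, which is consistent with the cropping-independence claimed above.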
Several BPMN graphical tools support, at least partly, the OMG's BPMN specification.
The BPMN standard is an essential guide for tool makers when implementing the rules regarding the depiction of BPMN diagrammatic constructs.
Process modelers should also know how to rigorously use BPMN constructs when depicting business processes either for business or IT purposes.
Several already published OMG standards include the formal specification of well-formedness rules concerning the metamodels they address.
However, the BPMN standard does not.
Instead, the rules regarding BPMN elements are only informally specified in natural language throughout the overall BPMN documentation.
Without strict rules concerning the correct usage of BPMN elements, no wonder that plenty of available BPMN tools fail to enforce BPMN process models' correctness.
To mitigate this problem, and thereby contribute to achieving BPMN models' correctness, we propose to supplement the BPMN metamodel with well-formedness rules expressed as OCL invariants.
This document thus brings together a set of requirements that tool makers must comply with in order to claim broader BPMN 2 compliance.
For the regular process modeler, this report provides an extensive and pragmatic catalog of BPMN elements' usage, to be followed in order to attain correct BPMN process models.
Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content.
We take a step in this direction by proposing three novel ways to incorporate affective/emotional aspects into long short term memory (LSTM) encoder-decoder neural conversation models: (1) affective word embeddings, which are cognitively engineered, (2) affect-based objective functions that augment the standard cross-entropy loss, and (3) affectively diverse beam search for decoding.
Experiments show that these techniques improve the open-domain conversational prowess of encoder-decoder networks by enabling them to produce emotionally rich responses that are more interesting and natural.
Trace norm regularization is a widely used approach for learning low rank matrices.
A standard optimization strategy is based on formulating the problem as one of low rank matrix factorization which, however, leads to a non-convex problem.
In practice this approach works well, and it is often computationally faster than standard convex solvers such as proximal gradient methods.
Nevertheless, it is not guaranteed to converge to a global optimum, and the optimization can be trapped at poor stationary points.
In this paper we show that it is possible to characterize all critical points of the non-convex problem.
This allows us to provide an efficient criterion to determine whether a critical point is also a global minimizer.
Our analysis suggests an iterative meta-algorithm that dynamically expands the parameter space and allows the optimization to escape any non-global critical point, thereby converging to a global minimizer.
The algorithm can be applied to problems such as matrix completion or multitask learning, and our analysis holds for any random initialization of the factor matrices.
Finally, we confirm the good performance of the algorithm on synthetic and real datasets.
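The factorization strategy discussed above rests on the variational identity that the trace (nuclear) norm of X equals the minimum of (||U||_F^2 + ||V||_F^2)/2 over all factorizations X = U V^T. The snippet below checks this numerically by constructing an optimal factorization from the SVD; it is an illustrative sanity check, not the paper's meta-algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
nuc = np.linalg.norm(X, ord="nuc")        # trace (nuclear) norm of X

# Optimal factorization from the SVD: U = P*sqrt(S), V = Q*sqrt(S),
# so that U V^T = P S Q^T = X and each Frobenius norm squared equals
# the sum of singular values.
P, s, Qt = np.linalg.svd(X, full_matrices=False)
U = P * np.sqrt(s)
V = Qt.T * np.sqrt(s)

print(np.allclose(U @ V.T, X))                               # True
print(np.isclose((np.sum(U**2) + np.sum(V**2)) / 2, nuc))    # True
```

Any factorization achieves a value at least `nuc`, which is why minimizing the factored objective can recover the trace-norm-regularized solution once non-global critical points are escaped.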
This paper presents a statistically sound method for measuring the accuracy with which a probabilistic model reflects the growth of a network, and a method for optimising parameters in such a model.
The technique is data-driven, and can be used for the modeling and simulation of any kind of evolving network.
The overall framework, a Framework for Evolving Topology Analysis (FETA), is tested on data sets collected from the Internet AS-level topology, social networking websites and a co-authorship network.
Statistical models of the growth of these networks are produced and tested using a likelihood-based method.
The models are then used to generate artificial topologies with the same statistical properties as the originals.
This work can be used to predict future growth patterns for a known network, or to generate artificial models of graph topology evolution for simulation purposes.
Particular application examples include strategic network planning, user profiling in social networks or infrastructure deployment in managed overlay-based services.
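The likelihood-based comparison of growth models can be sketched as follows: given the degree list at each observed attachment event, score a candidate model by the log-probability it assigns to the node that was actually chosen. The data and the two candidate models here are toy examples, not FETA's actual implementation.

```python
import math

def log_likelihood(choices, degrees_over_time, model):
    """Sum of log-probabilities a growth model assigns to each observed
    attachment choice, given the degrees at that moment."""
    ll = 0.0
    for target, degrees in zip(choices, degrees_over_time):
        weights = [model(d) for d in degrees]
        ll += math.log(weights[target] / sum(weights))
    return ll

pref = lambda d: float(d)   # preferential attachment: weight = degree
unif = lambda d: 1.0        # uniform random attachment

# Toy history (made-up): at each step the new node attached to `target`,
# given the degree list at that moment; here it always picked the hub.
degrees_over_time = [[3, 1, 1], [4, 1, 1, 1], [5, 1, 1, 1, 1]]
choices = [0, 0, 0]

ll_pref = log_likelihood(choices, degrees_over_time, pref)
ll_unif = log_likelihood(choices, degrees_over_time, unif)
print(ll_pref > ll_unif)  # True: preferential attachment fits this history better
```

The same scoring can rank arbitrary attachment models against a recorded growth trace, which is the core of the model-selection step described above.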
Recent studies on face attribute transfer have achieved great success.
Many models are able to transfer face attributes given an input image.
However, they suffer from three limitations: (1) incapability of generating image by exemplars; (2) being unable to transfer multiple face attributes simultaneously; (3) low quality of generated images, such as low-resolution or artifacts.
To address these limitations, we propose a novel model which receives two images of opposite attributes as inputs.
Our model can transfer exactly the same type of attributes from one image to another by exchanging certain part of their encodings.
All the attributes are encoded in a disentangled manner in the latent space, which enables us to manipulate several attributes simultaneously.
Besides, our model learns the residual images so as to facilitate training on higher resolution images.
With the help of multi-scale discriminators for adversarial training, it can even generate high-quality images with finer details and fewer artifacts.
We demonstrate the effectiveness of our model on overcoming the above three limitations by comparing with other methods on the CelebA face database.
A pytorch implementation is available at https://github.com/Prinsphield/ELEGANT.
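The encoding-exchange idea can be illustrated with a toy disentangled latent code in which each attribute occupies a fixed slice. The dimensions, the slice, and the attribute name are made up for illustration; the real model learns these encodings.

```python
import numpy as np

def swap_attribute(z_a, z_b, attr_slice):
    """Exchange one attribute's slice between two latent codes, leaving
    the rest of each encoding untouched."""
    z_a2, z_b2 = z_a.copy(), z_b.copy()
    z_a2[attr_slice] = z_b[attr_slice]
    z_b2[attr_slice] = z_a[attr_slice]
    return z_a2, z_b2

# Suppose (hypothetically) dimensions 2:4 encode "smiling".
z_smiling, z_neutral = np.ones(6), np.zeros(6)
z_a2, z_b2 = swap_attribute(z_smiling, z_neutral, slice(2, 4))
print(z_a2)  # [1. 1. 0. 0. 1. 1.]
```

Because only one slice moves, several attributes could be exchanged independently in the same way, which is the property that enables simultaneous multi-attribute transfer.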
In this paper, we study the complexity of execution in higher-order programming languages.
Our study has two facets: on the one hand we give an upper bound to the length of interactions between bounded P-visible strategies in Hyland-Ong game semantics.
This result covers models of programming languages with access to computational effects like non-determinism, state or control operators, but its semantic formulation causes a loose connection to syntax.
On the other hand we give a syntactic counterpart of our semantic study: a non-elementary upper bound to the length of the linear head reduction sequence (a low-level notion of reduction, close to the actual implementation of the reduction of higher-order programs by abstract machines) of simply-typed lambda-terms.
In both cases our upper bounds are proved optimal by giving matching lower bounds.
These two results, although different in scope, are proved using the same method: we introduce a simple reduction on finite trees of natural numbers, hereby called interaction skeletons.
We study this reduction and give upper bounds to its complexity.
We then apply this study by giving two simulation results: a semantic one measuring progress in game-theoretic interaction via interaction skeletons, and a syntactic one establishing a correspondence between linear head reduction of terms satisfying a locality condition called local scope and the reduction of interaction skeletons.
This result is then generalized to arbitrary terms by a local scopization transformation.
Resistive crossbars have emerged as promising building blocks for realizing DNNs due to their ability to compactly and efficiently realize the dominant DNN computational kernel, viz., vector-matrix multiplication.
However, a key challenge with resistive crossbars is that they suffer from a range of device and circuit level non-idealities such as interconnect parasitics, peripheral circuits, sneak paths, and process variations.
These non-idealities can lead to errors in vector-matrix multiplication that eventually degrade the DNN's accuracy.
There has been no study of the impact of non-idealities on the accuracy of large-scale DNNs, in part because existing device and circuit models are infeasible to use in application-level evaluation.
In this work, we present a fast and accurate simulation framework to enable evaluation and re-training of large-scale DNNs on resistive crossbar based hardware fabrics.
We first characterize the impact of crossbar non-idealities on errors incurred in the realized vector-matrix multiplications and observe that the errors have significant data and hardware-instance dependence that should be considered.
We propose a Fast Crossbar Model (FCM) to accurately capture the errors arising due to crossbar non-idealities while being four-to-five orders of magnitude faster than circuit simulation.
Finally, we develop RxNN, a software framework to evaluate and re-train DNNs on resistive crossbar systems.
RxNN is based on the popular Caffe machine learning framework, and we use it to evaluate a suite of large-scale DNNs developed for the ImageNet Challenge (ILSVRC).
Our experiments reveal that resistive crossbar non-idealities can lead to significant accuracy degradations (9.6%-32%) for these large-scale DNNs.
To the best of our knowledge, this work is the first quantitative evaluation of the accuracy of large-scale DNNs on resistive crossbar based hardware.
Although agreement between annotators has been studied in the past from a statistical viewpoint, little work has attempted to quantify the extent to which this phenomenon affects the evaluation of computer vision (CV) object detection algorithms.
Many researchers utilise ground truth (GT) in experiments and more often than not this GT is derived from one annotator's opinion.
How does the difference in opinion affect an algorithm's evaluation?
Four examples of typical CV problems are chosen, and a methodology is applied to each to quantify the inter-annotator variance and to offer insight into the mechanisms behind agreement and the use of GT.
It is found that when detecting linear objects, annotator agreement is very low.
The agreement in object position, linear or otherwise, can be partially explained through basic image properties.
Automatic object detectors are compared to annotator agreement and it is found that a clear relationship exists.
Several methods for calculating GTs from a number of annotations are applied and the resulting differences in the performance of the object detectors are quantified.
It is found that the rank of a detector is highly dependent upon the method used to form the GT.
It is also found that although the STAPLE and LSML GT estimation methods appear to represent the mean of the performance measured using the individual annotations, when there are few annotations, or there is a large variance in them, these estimates tend to degrade.
Furthermore, one of the most commonly adopted annotation combination methods, consensus voting, accentuates more obvious features, which results in an overestimation of the algorithm's performance.
Finally, it is concluded that in some datasets it may not be possible to state with any confidence that one algorithm outperforms another when evaluating upon one GT and a method for calculating confidence bounds is discussed.
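The consensus-voting effect can be seen in a toy sketch of majority voting over binary annotation masks: a pixel survives only if most annotators marked it, so subtle features that only some annotators see are suppressed while obvious ones are kept. This is illustrative code, not the exact combination procedure evaluated in the paper.

```python
import numpy as np

def consensus_gt(masks, threshold=0.5):
    """A pixel is GT-foreground only if a strict majority of the
    annotators marked it (simple majority-vote combination)."""
    return np.stack(masks).mean(axis=0) > threshold

annotators = [np.array([1, 1, 0]),   # each row: one annotator's mask
              np.array([1, 0, 0]),
              np.array([1, 0, 1])]
gt = consensus_gt(annotators)
print(gt)  # [ True False False]: only the unanimously obvious pixel survives
```

An algorithm that also detects only the obvious pixel then scores perfectly against this GT, which illustrates the overestimation effect noted above.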
Review spam is prevalent in e-commerce, used to maliciously manipulate product rankings and customers' decisions.
While spam generated by simple spamming strategies can be detected effectively, hardened spammers can evade regular detectors via more advanced strategies.
Previous work has paid more attention to evasion against text- and graph-based detectors, while evasion against behavior-based detectors has been largely ignored, leading to vulnerabilities in spam detection systems.
Since real evasion data are scarce, we first propose EMERAL (Evasion via Maximum Entropy and Rating sAmpLing) to generate evasive spams to certain existing detectors.
EMERAL can simulate spammers with different goals and levels of knowledge about the detectors, targeting different stages of the life cycle of target products.
We show that in the evasion-defense dynamic, only a few evasion types are meaningful to the spammers, and any spammer will not be able to evade too many detection signals at the same time.
We reveal that some evasions are quite insidious and can fail all detection signals.
We then propose DETER (Defense via Evasion generaTion using EmeRal), based on model re-training on diverse evasive samples generated by EMERAL.
Experiments confirm that DETER is more accurate in detecting both suspicious time window and individual spamming reviews.
In terms of security, DETER is versatile enough to be vaccinated against diverse and unexpected evasions, is agnostic about evasion strategy and can be released without privacy concern.
The expressive nature of the voice provides a powerful medium for communicating sonic ideas, motivating recent research on methods for query by vocalisation.
Meanwhile, deep learning methods have demonstrated state-of-the-art results for matching vocal imitations to imitated sounds, yet little is known about how well learned features represent the perceptual similarity between vocalisations and queried sounds.
In this paper, we address this question using similarity ratings between vocal imitations and imitated drum sounds.
We use a linear mixed effect regression model to show how features learned by convolutional auto-encoders (CAEs) perform as predictors for perceptual similarity between sounds.
Our experiments show that CAEs outperform three baseline feature sets (spectrogram-based representations, MFCCs, and temporal features) at predicting the subjective similarity ratings.
We also investigate how the size and shape of the encoded layer affects the predictive power of the learned features.
The results show that preservation of temporal information is more important than spectral resolution for this application.
Non-maximum suppression (NMS) is essential for state-of-the-art object detectors to localize objects from a set of candidate locations.
However, an accurate candidate location is sometimes not associated with a high classification score, leading to object localization failures during NMS.
In this paper, we introduce a novel bounding box regression loss for learning bounding box transformation and localization variance together.
The resulting localization variance exhibits a strong connection to localization accuracy, which is then utilized in our new non-maximum suppression method to improve localization accuracy for object detection.
On MS-COCO, we boost the AP of VGG-16 faster R-CNN from 23.6% to 29.1% with a single model and nearly no additional computational overhead.
More importantly, our method is able to improve the AP of ResNet-50 FPN fast R-CNN from 36.8% to 37.8%, which achieves state-of-the-art bounding box refinement result.
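A sketch of variance-weighted coordinate voting in the spirit of the method described above: overlapping candidates vote on the kept box's coordinates, with higher IoU and lower predicted variance giving a candidate more say. The weighting form, the `sigma_t` value, and the function names here are our own illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = np.maximum(a[:2], b[:2])
    x2, y2 = np.minimum(a[2:], b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def var_vote(boxes, scores, variances, iou_thresh=0.5, sigma_t=0.5):
    """The kept box's coordinates are a weighted average of overlapping
    candidates; weights grow with IoU against the top-scoring box and
    shrink with predicted localization variance (illustrative form)."""
    top = boxes[np.argmax(scores)]
    ious = np.array([iou(top, b) for b in boxes])
    keep = ious > iou_thresh
    w = np.exp(-((1 - ious[keep]) ** 2) / sigma_t) / variances[keep]
    return (w[:, None] * boxes[keep]).sum(axis=0) / w.sum()

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11]], dtype=float)
scores = np.array([0.9, 0.5])        # the first box wins the score vote,
variances = np.array([1.0, 0.01])    # but the second is far more certain
refined = var_vote(boxes, scores, variances)
```

Here the refined box is pulled strongly toward the low-variance candidate even though it scored lower, which is exactly the failure mode of plain NMS that the variance signal corrects.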
This study provides a conceptual overview of the literature dealing with the process of citing documents (focusing on the literature from the recent decade).
It presents theories, which have been proposed for explaining the citation process, and studies having empirically analyzed this process.
The overview is referred to as conceptual, because it is structured based on core elements in the citation process: the context of the cited document, processes from selection to citation of documents, and the context of the citing document.
The core elements are presented in a schematic representation.
The overview can be used to find answers to basic questions about the practice of citing documents.
Besides furthering understanding of the citation process, it provides basic information for the proper application of citations in research evaluation.
Developing information technology to democratize scientific knowledge and support citizen empowerment is a challenging task.
In our case, a local community suffered from air pollution caused by industrial activity.
The residents lacked the technological fluency to gather and curate diverse scientific data to advocate for regulatory change.
We collaborated with the community in developing an air quality monitoring system which integrated heterogeneous data over a large spatial and temporal scale.
The system afforded strong scientific evidence by using animated smoke images, air quality data, crowdsourced smell reports, and wind data.
In our evaluation, we report patterns of sharing smoke images among stakeholders.
Our survey study shows that the scientific knowledge provided by the system encourages agonistic discussions with regulators, empowers the community to support policy making, and rebalances the power relationship between stakeholders.
This paper describes methods for verifying the medical specialty claimed in user profiles of online communities for health-related advice.
To avoid critical situations arising from the proliferation of unverified and inaccurate information in medical online communities, it is necessary to develop a comprehensive software solution for verifying the medical specialty in user profiles of such communities.
An algorithm for forming the information profile of a medical online community user is designed.
A scheme for forming indicators of a user's professional specialization based on a training sample is presented.
A method for forming the user information profile of an online community for health-related advice by computer-linguistic analysis of its information content is suggested.
A system of indicators based on a training sample of users in medical online communities is formed.
A matrix of medical specialty indicators and a method for determining the weight coefficients of these indicators are investigated.
The proposed method of verifying the medical specialty from a user profile is tested in an online medical community.
With large student enrollments, MOOC instructors face a unique challenge in deciding when to intervene in forum discussions given their limited bandwidth.
We study this problem of instructor intervention.
Using a large sample of forum data culled from 61 courses, we design a binary classifier to predict whether an instructor should intervene in a discussion thread or not.
By incorporating novel information about a forum's type into the classification process, we improve significantly over the previous state-of-the-art.
We show how difficult this decision problem is in the real world by validating against indicative human judgment, and empirically show the problem's sensitivity to instructors' intervention preferences.
We conclude this paper with our take on the future research issues in intervention.
This paper discusses how distribution matching losses, such as those used in CycleGAN, when used to synthesize medical images can lead to mis-diagnosis of medical conditions.
It seems appealing to use these new image synthesis methods for translating images from a source to a target domain because they can produce high quality images and some even do not require paired data.
However, the basis of how these image translation models work is through matching the translation output to the distribution of the target domain.
This can cause an issue when the data provided in the target domain has an over- or under-representation of some classes (e.g., healthy or sick).
When the output of an algorithm is a transformed image there are uncertainties whether all known and unknown class labels have been preserved or changed.
Therefore, we recommend that these translated images should not be used for direct interpretation (e.g., by doctors), because they may lead to misdiagnosis of patients based on image features hallucinated by an algorithm that matches a distribution.
However, many recent papers appear to pursue exactly this goal.
Redundancy is abundant in Fog networks (i.e., many computing and storage points) and grows linearly with network size.
We demonstrate the transformational role of coding in Fog computing for leveraging such redundancy to substantially reduce the bandwidth consumption and latency of computing.
In particular, we discuss two recently proposed coding concepts, namely Minimum Bandwidth Codes and Minimum Latency Codes, and illustrate their impacts in Fog computing.
We also review a unified coding framework that includes the above two coding techniques as special cases, and enables a tradeoff between computation latency and communication load to optimize system performance.
Finally, we discuss several open problems and future research directions.
In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether.
In such cases, curiosity can serve as an intrinsic reward signal to enable the agent to explore its environment and learn skills that might be useful later in its life.
We formulate curiosity as the error in an agent's ability to predict the consequence of its own actions in a visual feature space learned by a self-supervised inverse dynamics model.
Our formulation scales to high-dimensional continuous state spaces like images, bypasses the difficulties of directly predicting pixels, and, critically, ignores the aspects of the environment that cannot affect the agent.
The proposed approach is evaluated in two environments: VizDoom and Super Mario Bros. Three broad settings are investigated: 1) sparse extrinsic reward, where curiosity allows for far fewer interactions with the environment to reach the goal; 2) exploration with no extrinsic reward, where curiosity pushes the agent to explore more efficiently; and 3) generalization to unseen scenarios (e.g. new levels of the same game) where the knowledge gained from earlier experience helps the agent explore new places much faster than starting from scratch.
Demo video and code available at https://pathak22.github.io/noreward-rl/
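The curiosity signal can be sketched as the prediction error of a small forward model in feature space: the reward is large for transitions the model cannot yet predict and decays as they become familiar. This is a linear toy model with made-up dimensions, not the paper's network (which learns the features with an inverse dynamics model).

```python
import numpy as np

rng = np.random.default_rng(0)
phi_dim, act_dim = 8, 2                 # made-up feature/action sizes
Wf = 0.01 * rng.standard_normal((phi_dim, phi_dim + act_dim))  # forward model

def intrinsic_reward(phi, action, phi_next):
    """Curiosity = error of the forward model in feature space."""
    pred = Wf @ np.concatenate([phi, action])
    return 0.5 * np.sum((phi_next - pred) ** 2)

def train_forward_model(phi, action, phi_next):
    """One normalized-LMS step on the forward model's weights."""
    global Wf
    inp = np.concatenate([phi, action])
    err = Wf @ inp - phi_next
    Wf -= (0.5 / (inp @ inp)) * np.outer(err, inp)

phi = rng.standard_normal(phi_dim)
act = rng.standard_normal(act_dim)
phi_next = rng.standard_normal(phi_dim)

before = intrinsic_reward(phi, act, phi_next)
for _ in range(50):                     # the transition becomes familiar
    train_forward_model(phi, act, phi_next)
after = intrinsic_reward(phi, act, phi_next)
print(after < before)  # True: curiosity decays for familiar transitions
```

Because the error is measured in a learned feature space rather than pixel space, unpredictable-but-irrelevant detail (e.g. noise) need not generate reward, which is the point made in the abstract.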
In this paper we aim at increasing the descriptive power of the covariance matrix, limited in capturing linear mutual dependencies between variables only.
We present a rigorous and principled mathematical pipeline to recover the kernel trick for computing the covariance matrix, enhancing it to model more complex, non-linear relationships conveyed by the raw data.
To this end, we propose Kernelized-COV, which generalizes the original covariance representation without compromising the efficiency of the computation.
In the experiments, we validate the proposed framework against many previous approaches in the literature, scoring on par or superior with respect to the state of the art on benchmark datasets for 3D action recognition.
The high probability of hardware failures prevents many advanced robots (e.g., legged robots) from being confidently deployed in real-world situations (e.g., post-disaster rescue).
Instead of attempting to diagnose the failures, robots could adapt by trial-and-error in order to be able to complete their tasks.
In this situation, damage recovery can be seen as a Reinforcement Learning (RL) problem.
However, the best RL algorithms for robotics require the robot and the environment to be reset to an initial state after each episode, that is, the robot is not learning autonomously.
In addition, most of the RL methods for robotics do not scale well with complex robots (e.g., walking robots) and either cannot be used at all or take too long to converge to a solution (e.g., hours of learning).
In this paper, we introduce a novel learning algorithm called "Reset-free Trial-and-Error" (RTE) that (1) breaks the complexity by pre-generating hundreds of possible behaviors with a dynamics simulator of the intact robot, and (2) allows complex robots to quickly recover from damage while completing their tasks and taking the environment into account.
We evaluate our algorithm on a simulated wheeled robot, a simulated six-legged robot, and a real six-legged walking robot that are damaged in several ways (e.g., a missing leg, a shortened leg, faulty motor, etc.) and whose objective is to reach a sequence of targets in an arena.
Our experiments show that the robots can recover most of their locomotion abilities in an environment with obstacles, and without any human intervention.
Unlike its image-based counterpart, point cloud based retrieval for place recognition has remained an unexplored and unsolved problem.
This is largely due to the difficulty in extracting local feature descriptors from a point cloud that can subsequently be encoded into a global descriptor for the retrieval task.
In this paper, we propose the PointNetVLAD where we leverage on the recent success of deep networks to solve point cloud based retrieval for place recognition.
Specifically, our PointNetVLAD is a combination/modification of the existing PointNet and NetVLAD, which allows end-to-end training and inference to extract the global descriptor from a given 3D point cloud.
Furthermore, we propose the "lazy triplet and quadruplet" loss functions that can achieve more discriminative and generalizable global descriptors to tackle the retrieval task.
We create benchmark datasets for point cloud based retrieval for place recognition, and the experimental results on these datasets show the feasibility of our PointNetVLAD.
Our code and links to the benchmark dataset downloads are available on our project website: http://github.com/mikacuy/pointnetvlad/
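A minimal sketch of the "lazy" triplet idea described above, with toy embeddings and an illustrative margin (not the paper's actual network or training code): rather than summing the hinge loss over all negatives, only the hardest (closest) negative contributes.

```python
import numpy as np

def lazy_triplet_loss(anchor, positive, negatives, margin=0.5):
    """Lazy triplet loss: instead of summing hinge terms over all
    negatives, take the max, i.e. only the hardest negative matters."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_negs = [np.sum((anchor - n) ** 2) for n in negatives]
    # max over negatives == using the minimum negative distance
    return max(0.0, margin + d_pos - min(d_negs))

anchor = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])          # close to the anchor
negatives = [np.array([1.0, 0.0]), np.array([0.2, 0.0])]
loss = lazy_triplet_loss(anchor, positive, negatives, margin=0.5)
```

Because only the hardest negative produces a gradient, the loss focuses training on the most discriminative comparisons, which is the intuition behind the more generalizable descriptors.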
Cyber-physical systems involve a network of discrete controllers that control physical processes.
Examples range from autonomous cars to implantable medical devices, which are highly safety critical.
The Hybrid Automata (HA) based formal approach is gaining momentum for the specification and validation of CPS.
HA combines the model of the plant along with its discrete controller resulting in a piece-wise continuous system with discontinuities.
Accurate detection of these discontinuities, using appropriate level crossing detectors, is a key challenge to simulation of CPS based on HA.
Existing techniques employ time discrete numerical integration with bracketing for level crossing detection.
These techniques involve back-tracking and are highly non-deterministic and hence error prone.
As level crossings happen based on the values of continuous variables, Quantized State System (QSS) integration may be more suitable.
Existing QSS integrators, based on fixed quanta, are also unsuitable for simulating HAs, since the quantum selected does not depend on the HA guard conditions, which are the main cause of discontinuities.
Considering this, we propose a new dynamic quanta based formal model called Quantized State Hybrid Automata (QSHA).
The developed formal model and the associated simulation framework guarantee that (1) all level crossings are accurately detected and (2) the time of each level crossing is accurate within floating point error bounds.
Interestingly, benchmarking results reveal that the proposed simulation technique takes 720, 1.33 and 4.41 times fewer simulation steps compared to standard Quantized State System (QSS)-1, Runge-Kutta (RK)-45, and Differential Algebraic System Solver (DASSL) integration based techniques respectively.
RFID systems are among the major infrastructures of the Internet of Things, which follow ISO and EPC standards.
In addition, the ISO standard defines the main layers of the supply chain, and many RFID systems rely on it for different purposes.
In this paper, we introduce addressing systems based on ISO standards, through which the range of things connected to the Internet of Things can grow.
Our proposed addressing methods can be applied to both ISO and EPC standards.
The proposed methods are simple, hierarchical, and low-cost to implement.
In addition, the presented methods enhance interoperability among RFID systems and offer high scalability, since they cover all EPC schemes and ISO supply chain standards.
Further, by using a new algorithm for long EPCs, known as the selection algorithm, they can significantly accelerate the operation of address mapping.
A self-organizing map (SOM) is a type of competitive artificial neural network, which projects the high-dimensional input space of the training samples into a low-dimensional space with the topology relations preserved.
This makes SOMs useful for organizing and visualizing complex data sets, and they have been used pervasively across numerous disciplines with different applications.
Notwithstanding these wide applications, the self-organizing map is hampered by its inherent randomness: it produces dissimilar SOM patterns even when trained on identical samples with the same parameters, which raises usability concerns for domain practitioners and precludes potential users from exploring SOM based applications in a broader spectrum.
Motivated by this practical concern, we propose a deterministic approach as a supplement to the standard self-organizing map.
In accordance with the theoretical design, the experimental results with satellite cloud data demonstrate the effective and efficient organization as well as simplification capabilities of the proposed approach.
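As an illustration of where the randomness enters, here is a minimal sketch of one SOM training step (illustrative grid size, learning rate, and neighborhood width; not the paper's deterministic variant): the stochastic parts are the weight initialization and sample order, which a fixed seed pins down.

```python
import numpy as np

def som_step(weights, x, lr, sigma):
    """One SOM update: find the best-matching unit (BMU), then pull
    each node toward the input, weighted by a Gaussian neighborhood."""
    rows, cols, _ = weights.shape
    dists = np.sum((weights - x) ** 2, axis=2)
    bmu = np.unravel_index(np.argmin(dists), (rows, cols))
    for i in range(rows):
        for j in range(cols):
            grid_d2 = (i - bmu[0]) ** 2 + (j - bmu[1]) ** 2
            h = np.exp(-grid_d2 / (2 * sigma ** 2))   # neighborhood kernel
            weights[i, j] += lr * h * (x - weights[i, j])
    return weights, bmu

rng = np.random.default_rng(0)        # random initialization: the source
weights = rng.random((5, 5, 3))       # of the run-to-run variability
w, bmu = som_step(weights, np.array([0.5, 0.5, 0.5]), lr=0.1, sigma=1.0)
```

Any approach that fixes the initialization and presentation order (as a seeded generator does here) removes the run-to-run variability the abstract describes.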
This paper presents a hierarchical framework based on deep reinforcement learning that learns a diversity of policies for humanoid balance control.
Conventional zero moment point based controllers perform limited actions during under-actuation, whereas the proposed framework can perform human-like balancing behaviors such as active push-off of ankles.
The learning is done through the design of an explainable reward based on physical constraints.
The simulated results are presented and analyzed.
The successful emergence of human-like behaviors through deep reinforcement learning proves the feasibility of using an AI-based approach for learning humanoid balancing control in a unified framework.
Tabling is a powerful resolution mechanism for logic programs that captures their least fixed point semantics more faithfully than plain Prolog.
In many tabling applications, we are not interested in the set of all answers to a goal, but only require an aggregation of those answers.
Several works have studied efficient techniques, such as lattice-based answer subsumption and mode-directed tabling, to do so for various forms of aggregation.
While much attention has been paid to expressivity and efficient implementation of the different approaches, soundness has not been considered.
This paper shows that the different implementations indeed fail to produce least fixed points for some programs.
As a remedy, we provide a formal framework that generalises the existing approaches and we establish a soundness criterion that explains for which programs the approach is sound.
This article is under consideration for acceptance in TPLP.
An Artificial Neural Network-based error compensation method is proposed for improving the accuracy of resolver-based 16-bit encoders by compensating for their respective systematic error profiles.
The error compensation procedure, for a particular encoder, involves obtaining its error profile by calibrating it on a precision rotary table, training the neural network by using a part of this data and then determining the corrected encoder angle by subtracting the ANN-predicted error from the measured value of the encoder angle.
Since it is not guaranteed that all the resolvers will have exactly similar error profiles because of the inherent differences in their construction on a micro scale, the ANN has been trained on one error profile at a time and the corresponding weight file is then used only for compensating the systematic error of this particular encoder.
The systematic nature of the error profile for each of the encoders has also been validated by repeated calibration of the encoders over a period of time and it was found that the error profiles of a particular encoder recorded at different epochs show near reproducible behavior.
The ANN-based error compensation procedure has been implemented for 4 encoders by training the ANN with their respective error profiles and the results indicate that the accuracy of encoders can be improved by nearly an order of magnitude from quoted values of ~6 arc-min to ~0.65 arc-min when their corresponding ANN-generated weight files are used for determining the corrected encoder angle.
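A minimal sketch of the compensation step under stated assumptions: the error profile here is a made-up sinusoid and a degree-9 polynomial fit stands in for the trained ANN; the corrected angle is the measured angle minus the predicted systematic error, as described above.

```python
import numpy as np

# Hypothetical calibration data: encoder angle (rad) vs. systematic
# error (arc-min). The paper trains one ANN per encoder; a polynomial
# regressor stands in for that ANN in this sketch.
theta = np.linspace(0.0, 2 * np.pi, 73)
error_profile = 6.0 * np.sin(theta)        # made-up ~6 arc-min profile

coeffs = np.polyfit(theta, error_profile, deg=9)
predict_error = lambda t: np.polyval(coeffs, t)   # arc-min

measured = 2.154                           # measured encoder angle (rad)
corrected = measured - np.radians(predict_error(measured) / 60.0)
residual = abs(predict_error(measured) - 6.0 * np.sin(measured))
```

The key point carried over from the abstract is that the regressor is fit per encoder, so the weight file (here, `coeffs`) is only valid for the encoder whose profile produced it.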
A new phenomenon emerging within virtual communities is a blurring between the social and commercial activities and motivations of participants.
This paper explores motivations for participating in social commerce at a micro-business level between members of a virtual community of Malay lifestyle bloggers.
The selected community was observed in order to understand it, and 21 participants were interviewed.
We used laddering techniques to explore community attributes, the perceived consequences, and their links to the values of participants.
We found that the virtual community relationship was the main influential factor, contributing to the sense of social support as well as customers' trust in social commerce.
We investigate the physical layer security of uplink single-carrier frequency-division multiple-access (SC-FDMA) systems.
Multiple users, Alices, send confidential messages to a common legitimate base-station, Bob, in the presence of an eavesdropper, Eve.
To secure the legitimate transmissions, each user superimposes an artificial noise (AN) signal on the time-domain SC-FDMA data block.
We reduce the computational and storage requirements at Bob's receiver by assuming simple per-subchannel detectors.
We assume that Eve has global channel knowledge of all links in addition to high computational capabilities, where she adopts high-complexity detectors such as single-user maximum likelihood (ML), multiuser minimum-mean-square-error (MMSE), and multiuser ML.
We analyze the correlation properties of the time-domain AN signal and illustrate how Eve can exploit them to reduce the AN effects.
We prove that the number of useful AN streams that can degrade Eve's signal-to-noise ratio (SNR) is dependent on the channel memories of Alices-Bob and Alices-Eve links.
Furthermore, we enhance the system security for the case of partial Alices-Bob channel knowledge at Eve, where Eve only knows the precoding matrices of the data and AN signals instead of knowing the entire Alices-Bob channel matrices, and propose a hybrid scheme that integrates temporal AN with channel-based secret-key extraction.
In a competitive market, a large number of players produce the same product.
Each firm aims to diffuse its product information widely so that its product becomes popular among potential buyers.
The more popular a firm's product is, the higher the firm's revenue.
A model is developed in which two players compete to spread information in a large network.
Players choose their initial seed nodes simultaneously and the information is diffused according to Independent Cascade model (ICM).
The main aim of each player is to choose seed nodes that spread its information to as many nodes as possible in a social network.
The rate of spreading of information also plays a very important role in information diffusion process.
Any node in the social network may be influenced by none, one, or more than one piece of information.
We also analyze how the fraction of nodes in each compartment changes with the rate of spreading of information.
Finally, a game theory model is developed to obtain the Nash equilibrium based on best response function of the players.
This model is based on Hotelling's model of electoral competition.
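A minimal sketch of the Independent Cascade diffusion referenced above, on a toy graph (illustrative edge probability and network; the paper's competitive two-player setting layers a game on top of this process): each newly activated node gets one chance to activate each inactive neighbor.

```python
import random

def independent_cascade(graph, seeds, p, rng):
    """Simulate the Independent Cascade model: each newly activated
    node gets one chance to activate each inactive neighbor with
    probability p; the process stops when no new activations occur."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

# Toy network; with p = 1.0 the cascade reaches everything reachable
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4]}
spread = independent_cascade(graph, seeds={0}, p=1.0, rng=random.Random(42))
```

In the competitive setting, each player's payoff is the expected size of its own cascade, and seed choices form the strategies over which the Nash equilibrium is computed.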
State-of-the-art methods of people counting in crowded scenes rely on deep networks to estimate people density in the image plane.
Perspective distortion effects are handled implicitly by either learning scale-invariant features or estimating density in patches of different sizes, neither of which accounts for the fact that scale changes must be consistent over the whole scene.
In this paper, we show that feeding an explicit model of the scale changes to the network considerably increases performance.
An added benefit is that it lets us reason in terms of number of people per square meter on the ground, allowing us to enforce physically-inspired temporal consistency constraints that do not have to be learned.
This yields an algorithm that outperforms state-of-the-art methods on crowded scenes, especially when perspective effects are strong.
A mobile automated video surveillance system involves real-time image and video processing algorithms that require a vast quantity of computing and storage resources.
To support the execution of mobile automated video surveillance system, a mobile ad hoc cloud computing and networking infrastructure is proposed in which multiple mobile devices interconnected through a mobile ad hoc network are combined to create a virtual supercomputing node.
An energy-efficient resource allocation scheme has also been proposed for allocating real-time automated video surveillance tasks.
To enable communication between mobile devices, a Wi-Fi Direct based mobile ad hoc cloud networking infrastructure has been developed.
More specifically, a routing layer has been developed to support communication between Wi-Fi Direct devices in a group and multi-hop communication between devices across the group.
The proposed system has been implemented on a group of Wi-Fi Direct-enabled Samsung mobile devices.
In the big data era, machine learning is one of the fundamental techniques used in intrusion detection systems (IDSs).
However, practical IDSs generally update their decision module by feeding in new data and then periodically retraining their learning models.
Hence, attacks that compromise the data used for training or testing classifiers significantly challenge the detection capability of machine learning-based IDSs.
Poisoning attack, which is one of the most recognized security threats towards machine learning-based IDSs, injects some adversarial samples into the training phase, inducing data drifting of training data and a significant performance decrease of target IDSs over testing data.
In this paper, we adopt the Edge Pattern Detection (EPD) algorithm to design a novel poisoning method that attacks several machine learning algorithms used in IDSs.
Specifically, we propose a boundary pattern detection algorithm to efficiently generate the points that are near to abnormal data but considered to be normal ones by current classifiers.
Then, we introduce a Batch-EPD Boundary Pattern (BEBP) detection algorithm to overcome the limitation of the number of edge pattern points generated by EPD and to obtain more useful adversarial samples.
Based on BEBP, we further present a moderate but effective poisoning method called chronic poisoning attack.
Extensive experiments on synthetic and three real network data sets demonstrate the performance of the proposed poisoning method against several well-known machine learning algorithms and a practical intrusion detection method named FMIFS-LSSVM-IDS.
Training deep neural networks requires many training samples, but in practice training labels are expensive to obtain and may be of varying quality, as some may be from trusted expert labelers while others might be from heuristics or other sources of weak supervision such as crowd-sourcing.
This creates a fundamental quality-versus-quantity trade-off in the learning process.
Do we learn from the small amount of high-quality data or the potentially large amount of weakly-labeled data?
We argue that if the learner could somehow know and take the label-quality into account when learning the data representation, we could get the best of both worlds.
To this end, we propose "fidelity-weighted learning" (FWL), a semi-supervised student-teacher approach for training deep neural networks using weakly-labeled data.
FWL modulates the parameter updates to a student network (trained on the task we care about) on a per-sample basis according to the posterior confidence of its label-quality estimated by a teacher (who has access to the high-quality labels).
Both student and teacher are learned from the data.
We evaluate FWL on two tasks in information retrieval and natural language processing where we outperform state-of-the-art alternative semi-supervised methods, indicating that our approach makes better use of strong and weak labels, and leads to better task-dependent data representations.
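The per-sample modulation described above can be sketched on a linear student with synthetic data (the names and the squared-error objective are illustrative; the paper's students are deep networks and the confidences come from a learned teacher): each weak label's gradient contribution is scaled by the teacher's confidence in that label.

```python
import numpy as np

def fidelity_weighted_step(w, X, y_weak, confidence, lr=0.1):
    """One gradient step on a linear student: each weak label's
    squared-error gradient is scaled by the teacher's per-sample
    confidence, so low-fidelity labels update the student less."""
    preds = X @ w
    grad = X.T @ (confidence * (preds - y_weak)) / len(y_weak)
    return w - lr * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y_true = X @ np.array([1.0, -2.0, 0.5])
y_weak = y_true + rng.normal(scale=2.0, size=100)  # noisy weak labels
conf = np.ones(100)                                # confidences in [0, 1]

w = np.zeros(3)
for _ in range(200):
    w = fidelity_weighted_step(w, X, y_weak, conf)
```

Setting `conf` below 1 for samples the teacher distrusts shrinks their influence toward zero, which is the mechanism by which FWL blends strong and weak supervision.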
In this paper we demonstrate the use of intelligent optimization methodologies on the visualization optimization of virtual / simulated environments.
The problem of automatic selection of an optimized set of views, which better describes an on-going simulation over a virtual environment is addressed in the context of the RoboCup Rescue Simulation domain.
A generic architecture for optimization is proposed and described.
We outline the possible extensions of this architecture and argue on how several problems within the fields of Interactive Rendering and Visualization can benefit from it.
We show that the skip-gram formulation of word2vec trained with negative sampling is equivalent to a weighted logistic PCA.
This connection allows us to better understand the objective, compare it to other word embedding methods, and extend it to higher dimensional models.
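A minimal numerical sketch of the claimed equivalence, under the usual SGNS formulation (toy sizes and counts; not the authors' code): the objective over a word-context count matrix is a weighted logistic likelihood of a low-rank logit matrix U V^T, which is exactly a weighted logistic PCA.

```python
import numpy as np

def sgns_objective(U, V, pos_counts, neg_counts):
    """Skip-gram with negative sampling as weighted logistic PCA:
    the logit for (word i, context j) is the low-rank inner product
    U[i] @ V[j]; positive pairs are weighted by co-occurrence counts,
    negative pairs by the negative-sampling counts."""
    logits = U @ V.T
    log_sig = -np.log1p(np.exp(-logits))     # log sigmoid(x)
    log_sig_neg = -np.log1p(np.exp(logits))  # log sigmoid(-x)
    return np.sum(pos_counts * log_sig + neg_counts * log_sig_neg)

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))                  # word embeddings, rank 2
V = rng.normal(size=(5, 2))                  # context embeddings
pos = rng.integers(0, 5, size=(4, 5)).astype(float)
neg = np.full((4, 5), 2.0)                   # k = 2 negatives per pair
ll = sgns_objective(U, V, pos, neg)
```

Maximizing `ll` over U and V is a low-rank logistic factorization of the co-occurrence data, which is what makes the comparison to other embedding methods direct.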
This paper introduces an end-to-end fine-tuning method to improve hand-eye coordination in modular deep visuo-motor policies (modular networks) where each module is trained independently.
Benefiting from weighted losses, the fine-tuning method significantly improves the performance of the policies for a robotic planar reaching task.
Consider a network of k parties, each holding a long sequence of n entries (a database), with minimum vertex-cut greater than t. We show that any empirical statistic across the network of databases can be computed by each party with perfect privacy, against any set of t < k/2 passively colluding parties, such that the worst-case distortion and communication cost (in bits per database entry) both go to zero as n, the number of entries in the databases, goes to infinity.
This is based on combining a striking dimensionality reduction result for random sampling with unconditionally secure multi-party computation protocols.
Data mining technologies have long since made their way into the field of data management.
Classification is one of the most important data mining tasks for label prediction, categorization of objects into groups, advertisement and data management.
In this paper, we focus on the standard classification problem which is predicting unknown labels in Euclidean space.
Most efforts in Machine Learning communities are devoted to methods that use probabilistic algorithms which are heavy on Calculus and Linear Algebra.
Most of these techniques have scalability issues for big data, and are hardly parallelizable if they are to maintain their high accuracies in their standard form.
Sampling is a new direction for improving scalability, using many small parallel classifiers.
In this paper, rather than conventional sampling methods, we focus on a discrete classification algorithm with O(n) expected running time.
Our approach performs a task similar to that of sampling methods.
However, we use column-wise sampling of data, rather than the row-wise sampling used in the literature.
In either case, our algorithm is completely deterministic.
Our algorithm combines 2D convex hulls to achieve high classification accuracy and scalability at the same time.
First, we thoroughly describe and prove our O(n) algorithm for finding the convex hull of a point set in 2D.
Then, we show experimentally that a classifier built on this idea is very competitive with sophisticated classification algorithms included in commercial statistical applications such as MATLAB.
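For reference, a standard 2D hull routine (Andrew's monotone chain, O(n log n) because of the sort; the paper's own algorithm achieves O(n) expected time by a different construction, so this is a baseline sketch, not their method):

```python
def convex_hull(points):
    """Andrew's monotone chain: sort points lexicographically, then
    build the lower and upper hulls in one linear pass each."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means left turn
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]   # hull vertices, counter-clockwise

hull = convex_hull([(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)])
```

Interior points such as `(0.5, 0.5)` are discarded, which is the property the classifier exploits: a class can be summarized by its hull vertices alone.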
Handwritten numeral recognition plays a vital role in postal automation services, especially in countries like India where multiple languages and scripts are used. Discrete Hidden Markov Models (HMMs) and hybrids of Neural Networks (NNs) and HMMs are popular methods in handwritten word recognition systems.
The hybrid system gives better recognition result due to better discrimination capability of the NN.
A major problem in handwriting recognition is the huge variability and distortions of patterns.
Elastic models based on local observations and dynamic programming, such as HMMs, are not efficient at absorbing this variability: their view is local, they cannot cope with length variability, and they are very sensitive to distortions.
The Support Vector Machine (SVM) is an alternative to the NN that can be used to estimate global correlations and classify the pattern; in handwritten recognition, the SVM gives better results.
The aim of this paper is to develop an approach that improves the efficiency of handwritten recognition using an artificial neural network.
An orthogonal approach to the fuzzification of both multisets and hybrid sets is presented.
In particular, we introduce L-multi-fuzzy and L-fuzzy hybrid sets, which are general enough and in spirit with the basic concepts of fuzzy set theory.
In addition, we study the properties of these structures.
Also, the usefulness of these structures is examined in the framework of mechanical multiset processing.
More specifically, we introduce a variant of fuzzy P systems and, since simple fuzzy membrane systems have been introduced elsewhere, we simply extend previously stated results and ideas.
Modeling distributions of citations to scientific papers is crucial for understanding how science develops.
However, there is a considerable empirical controversy on which statistical model fits the citation distributions best.
This paper is concerned with rigorous empirical detection of power-law behaviour in the distribution of citations received by the most highly cited scientific papers.
We have used a large, novel data set on citations to scientific papers published between 1998 and 2002 drawn from Scopus.
The power-law model is compared with a number of alternative models using a likelihood ratio test.
We have found that the power-law hypothesis is rejected for around half of the Scopus fields of science.
For these fields of science, the Yule, power-law with exponential cut-off and log-normal distributions seem to fit the data better than the pure power-law model.
On the other hand, when the power-law hypothesis is not rejected, it is usually empirically indistinguishable from most of the alternative models.
The pure power-law model seems to be the best model only for the most highly cited papers in "Physics and Astronomy".
Overall, our results seem to support theories implying that the most highly cited scientific papers follow the Yule, power-law with exponential cut-off or log-normal distribution.
Our findings suggest also that power laws in citation distributions, when present, account only for a very small fraction of the published papers (less than 1% for most of science fields) and that the power-law scaling parameter (exponent) is substantially higher (from around 3.2 to around 4.7) than found in the older literature.
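The tail-exponent estimates behind such comparisons can be sketched with the standard continuous maximum-likelihood estimator (a simplified stand-in for the full likelihood-ratio methodology the paper uses): alpha-hat = 1 + n / sum(ln(x_i / x_min)) over the tail x_i >= x_min.

```python
import math
import random

def powerlaw_alpha_mle(data, xmin):
    """Continuous power-law MLE for the scaling exponent alpha,
    computed over the tail of the data above xmin."""
    tail = [x for x in data if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

# Draw from a pure power law with alpha = 3.5 via inverse-CDF sampling:
# P(X > x) = (x / xmin)^-(alpha - 1)  =>  x = xmin * u^(-1/(alpha - 1))
rng = random.Random(1)
alpha_true, xmin = 3.5, 1.0
sample = [xmin * (1 - rng.random()) ** (-1 / (alpha_true - 1))
          for _ in range(20000)]
alpha_hat = powerlaw_alpha_mle(sample, xmin)
```

Exponents in the 3.2 to 4.7 range reported above imply very heavy but finite-variance-boundary tails, which is why the power law, when it holds, covers only the extreme top of the citation distribution.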
We provide initial seedings to the Quick Shift clustering algorithm, which approximate the locally high-density regions of the data.
Such seedings act as more stable and expressive cluster-cores than the singleton modes found by Quick Shift.
We establish statistical consistency guarantees for this modification.
We then show strong clustering performance on real datasets as well as promising applications to image segmentation.
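A minimal sketch of the underlying Quick Shift procedure on toy data (illustrative bandwidth and distance threshold; the paper's contribution is the density-based seeding, not this baseline): each point links to its nearest higher-density neighbor within a threshold, and unlinked points become the modes.

```python
import numpy as np

def quick_shift(X, bandwidth=1.0, tau=2.0):
    """Quick Shift: estimate a kernel density at each point, link each
    point to its nearest neighbor of strictly higher density within
    distance tau, and treat unlinked points as modes (cluster roots)."""
    n = len(X)
    d2 = np.sum((X[:, None] - X[None, :]) ** 2, axis=2)
    density = np.exp(-d2 / (2 * bandwidth ** 2)).sum(axis=1)
    parent = np.arange(n)
    for i in range(n):
        higher = np.where((density > density[i]) & (np.sqrt(d2[i]) <= tau))[0]
        if len(higher):
            parent[i] = higher[np.argmin(d2[i, higher])]
    labels = parent.copy()               # follow parents up to the roots
    for i in range(n):
        while labels[i] != labels[labels[i]]:
            labels[i] = labels[labels[i]]
    return labels

X = np.array([[0.0, 0], [0.1, 0], [0.2, 0], [5.0, 0], [5.1, 0]])
labels = quick_shift(X)
```

The singleton roots found here are the modes the abstract refers to; the proposed seeding replaces them with locally high-density regions, which are more stable under resampling.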
Many reinforcement-learning researchers treat the reward function as a part of the environment, meaning that the agent can only know the reward of a state if it encounters that state in a trial run.
However, we argue that this is an unnecessary limitation and instead, the reward function should be provided to the learning algorithm.
The advantage is that the algorithm can then use the reward function to check the reward for states that the agent hasn't even encountered yet.
In addition, the algorithm can simultaneously learn policies for multiple reward functions.
For each state, the algorithm would calculate the reward using each of the reward functions and add the rewards to its experience replay dataset.
The Hindsight Experience Replay algorithm developed by Andrychowicz et al.(2017) does just this, and learns to generalize across a distribution of sparse, goal-based rewards.
We extend this algorithm to linearly-weighted, multi-objective rewards and learn a single policy that can generalize across all linear combinations of the multi-objective reward.
Whereas other multi-objective algorithms teach the Q-function to generalize across the reward weights, our algorithm enables the policy to generalize, and can thus be used with continuous actions.
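The relabelling idea can be sketched as follows (hypothetical transition format; the actual algorithm also trains a goal-conditioned policy network): transitions store a reward *vector*, and replaying each one under several weight vectors yields the scalar reward w . r for every linear combination at once.

```python
import numpy as np

def relabel_with_weights(transitions, weight_vectors):
    """HER-style relabelling for linearly-weighted multi-objective
    rewards: each stored (state, action, reward-vector) transition is
    replayed under every weight vector w, producing a scalar reward
    w . r conditioned on that w."""
    replay = []
    for state, action, reward_vec in transitions:
        for w in weight_vectors:
            replay.append((state, action, w, float(np.dot(w, reward_vec))))
    return replay

transitions = [("s0", "a0", np.array([1.0, 0.0])),
               ("s1", "a1", np.array([0.0, 2.0]))]
weights = [np.array([1.0, 0.0]), np.array([0.5, 0.5])]
replay = relabel_with_weights(transitions, weights)
```

Because every experienced transition is valid under every weight vector, one rollout trains the policy across the whole family of linear reward combinations.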
In this paper we present a novel unsupervised representation learning approach for 3D shapes, which is an important research challenge as it avoids the manual effort required for collecting supervised data.
Our method trains an RNN-based neural network architecture to solve multiple view inter-prediction tasks for each shape.
Given several nearby views of a shape, we define view inter-prediction as the task of predicting the center view between the input views, and reconstructing the input views in a low-level feature space.
The key idea of our approach is to implement the shape representation as a shape-specific global memory that is shared between all local view inter-predictions for each shape.
Intuitively, this memory enables the system to aggregate information that is useful to better solve the view inter-prediction tasks for each shape, and to leverage the memory as a view-independent shape representation.
Our approach obtains the best results using a combination of L_2 and adversarial losses for the view inter-prediction task.
We show that VIP-GAN outperforms state-of-the-art methods in unsupervised 3D feature learning on three large scale 3D shape benchmarks.
In this paper, we design and evaluate a convolutional autoencoder that perturbs an input face image to impart privacy to a subject.
Specifically, the proposed autoencoder transforms an input face image such that the transformed image can be successfully used for face recognition but not for gender classification.
In order to train this autoencoder, we propose a novel training scheme, referred to as semi-adversarial training in this work.
The training is facilitated by attaching a semi-adversarial module consisting of a pseudo gender classifier and a pseudo face matcher to the autoencoder.
The objective function utilized for training this network has three terms: one to ensure that the perturbed image is a realistic face image; another to ensure that the gender attributes of the face are confounded; and a third to ensure that biometric recognition performance due to the perturbed image is not impacted.
Extensive experiments confirm the efficacy of the proposed architecture in extending gender privacy to face images.
Rapid categorization paradigms have a long history in experimental psychology: Characterized by short presentation times and speedy behavioral responses, these tasks highlight the efficiency with which our visual system processes natural object categories.
Previous studies have shown that feed-forward hierarchical models of the visual cortex provide a good fit to human visual decisions.
At the same time, recent work in computer vision has demonstrated significant gains in object recognition accuracy with increasingly deep hierarchical architectures.
But it is unclear how well these models account for human visual decisions and what they may reveal about the underlying brain processes.
We have conducted a large-scale psychophysics study to assess the correlation between computational models and human participants on a rapid animal vs. non-animal categorization task.
We considered visual representations of varying complexity by analyzing the output of different stages of processing in three state-of-the-art deep networks.
We found that recognition accuracy increases with higher stages of visual processing (higher level stages indeed outperforming human participants on the same task) but that human decisions agree best with predictions from intermediate stages.
Overall, these results suggest that human participants may rely on visual features of intermediate complexity and that the complexity of visual representations afforded by modern deep network models may exceed those used by human participants during rapid categorization.
Background: It is widely recognized that software effort estimation is a regression problem.
Model Tree (MT) is one of the Machine Learning based regression techniques useful for software effort estimation but, like other machine learning algorithms, MT has a large configuration space and requires careful setting of its parameters.
The choice of such parameters is dataset dependent, so no general guideline can govern this process; this forms the motivation of this work.
Aims: This study investigates the effect of using the Bees optimization algorithm to find the choice of MT parameters that best fits a dataset and therefore improves prediction accuracy.
Method: We used MT with optimal parameters identified by the Bees algorithm to construct software effort estimation model.
The model has been validated over eight datasets drawn from two main sources: PROMISE and ISBSG.
We also used 3-fold cross validation to empirically assess the prediction accuracy of the different estimation models.
As benchmarks, results are also compared to those obtained with Stepwise Regression, Case-Based Reasoning, and Multi-Layer Perceptron.
Results: The results obtained from the combination of MT and the Bees algorithm are encouraging and outperform other well-known estimation methods applied to the employed datasets.
They are also interesting enough to suggest the effectiveness of MT among the techniques that are suitable for effort estimation.
Conclusions: The use of the Bees algorithm enabled us to automatically find optimal MT parameters required to construct effort estimation models that fit each individual dataset.
It also provided a significant improvement in prediction accuracy.
Extracting valuable facts or informative summaries from multi-dimensional tables, i.e. insight mining, is an important task in data analysis and business intelligence.
However, ranking the importance of insights remains a challenging and unexplored task.
The main challenge is that explicitly scoring an insight or giving it a rank requires a thorough understanding of the tables and costs a lot of manual efforts, which leads to the lack of available training data for the insight ranking problem.
In this paper, we propose an insight ranking model that consists of two parts: A neural ranking model explores the data characteristics, such as the header semantics and the data statistical features, and a memory network model introduces table structure and context information into the ranking process.
We also build a dataset with text assistance.
Experimental results show that our approach largely improves the ranking precision across multiple evaluation metrics.
The design and architecture of a cloud storage system play a vital role in cloud computing infrastructure, improving storage capacity as well as cost effectiveness.
A cloud storage system usually provides users with efficient, elastic storage space.
One of the challenges of a cloud storage system is balancing the provision of huge elastic storage capacity against the expensive investment required for it.
To address this issue in the cloud storage infrastructure, a low-cost PC cluster based storage server is configured to serve large amounts of data to cloud users.
Moreover, one contribution of this work is an analytical model based on the M/M/1 queuing network model, applied to the intended architecture to characterize response time, storage utilization, and pending time while the system is running.
According to the analytical results from experimental testing, more than 90% of the storage space can be utilized.
This paper describes two parts: (i) the design and architecture of the PC cluster based cloud storage system, including detailed configurations of the services related to cloud applications; and (ii) the analytical model, enhanced to increase storage utilization on the target architecture.
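The closed-form M/M/1 quantities behind such an analysis can be sketched directly, with hypothetical arrival and service rates standing in for measured workload figures:

```python
def mm1_metrics(arrival_rate, service_rate):
    """Standard M/M/1 results: utilization rho = lambda/mu, mean number
    in system L = rho/(1 - rho), mean response time W = 1/(mu - lambda),
    and mean waiting (pending) time Wq = rho/(mu - lambda)."""
    assert arrival_rate < service_rate, "queue is unstable when lambda >= mu"
    rho = arrival_rate / service_rate
    L = rho / (1 - rho)
    W = 1 / (service_rate - arrival_rate)
    Wq = rho / (service_rate - arrival_rate)
    return rho, L, W, Wq

# Hypothetical numbers: 90 requests/s against a server handling 100/s
rho, L, W, Wq = mm1_metrics(90.0, 100.0)
```

At 90% utilization the mean response time is already ten times the bare service time, which illustrates the response-time versus utilization trade-off the analytical model is used to navigate.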
It is possible to associate a highly constrained subset of relative 6 DoF poses between two 3D shapes, as long as the local surface orientation, the normal vector, is available at every surface point.
Local shape features can be used to find putative point correspondences between the models due to their ability to handle noisy and incomplete data.
However, this correspondence set is usually contaminated by outliers in practical scenarios, which has led to many past contributions based on robust detectors such as the Hough transform or RANSAC.
The key insight of our work is that a single correspondence between oriented points on the two models is constrained to cast votes in a 1 DoF rotational subgroup of the full group of poses, SE(3).
Kernel density estimation allows combining the set of votes efficiently to determine a full 6 DoF candidate pose between the models.
This modal pose with the highest density is stable under challenging conditions, such as noise, clutter, and occlusions, and provides the output estimate of our method.
We first analyze the robustness of our method in relation to noise and show that it handles high outlier rates much better than RANSAC for the task of 6 DoF pose estimation.
We then apply our method to four state-of-the-art datasets for 3D object recognition that contain occluded and cluttered scenes.
Our method achieves perfect recall on two LIDAR data sets and outperforms competing methods on two RGB-D data sets, thus setting a new standard for general 3D object recognition using point cloud data.
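The vote-combination step relies on kernel density estimation; as a hedged one-dimensional sketch (the actual method operates on poses in SE(3), and the kernel, bandwidth, and vote data here are our illustrative assumptions), a modal estimate can be extracted from noisy votes as follows:

```python
import numpy as np

def kde_mode(votes, grid, bandwidth=0.1):
    """Gaussian-kernel density over candidate values on a grid; the grid
    point of highest density is the modal estimate, which stays stable
    under outlier votes (clutter, occlusion)."""
    diffs = (grid[:, None] - votes[None, :]) / bandwidth
    density = np.exp(-0.5 * diffs ** 2).sum(axis=1)
    return grid[np.argmax(density)]
```

With a cluster of votes near 1.0 and a couple of outliers, the modal estimate remains near 1.0, illustrating the robustness to contaminated correspondence sets described above.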
An observer increases in relative entropy as it receives information from what it is observing.
In a system of only an observer and the observed, an increase in the relative entropy of the observer is a decrease in the relative entropy of the observed.
Linking together these directional entropy disequilibria, we show that NAND and NOR functionality arises in such networks at very low levels of complexity.
In this paper we present two variants of a method for symmetric matrix inversion, based on modified Gaussian elimination.
Both methods avoid the computation of square roots and reduce machine time.
Further, both of them can be used efficiently not only for positive (semi-)definite matrices, but for the inversion of any non-singular symmetric matrix.
We use simulation to verify the results presented in this paper.
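The paper's exact elimination scheme is not reproduced here; a minimal sketch of the square-root-free idea, using an LDL^T factorization (our assumption, not necessarily the authors' variant), is:

```python
import numpy as np

def ldlt_inverse(A):
    """Invert a symmetric matrix via a square-root-free LDL^T
    factorization: A = L D L^T with unit-lower-triangular L and
    diagonal D, so no square roots are ever computed."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - L[j, :j] ** 2 @ d[:j]
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - (L[i, :j] * L[j, :j]) @ d[:j]) / d[j]
    # A^{-1} = L^{-T} D^{-1} L^{-1}
    Linv = np.linalg.inv(L)  # triangular; back-substitution would also do
    return Linv.T @ np.diag(1.0 / d) @ Linv
```

Without pivoting this sketch can hit a zero pivot on an indefinite matrix; handling arbitrary non-singular symmetric matrices, as the paper claims, would require symmetric pivoting.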
In order to improve usability and safety, modern unmanned aerial vehicles (UAVs) are equipped with sensors to monitor the environment, such as laser-scanners and cameras.
One important aspect in this monitoring process is to detect obstacles in the flight path in order to avoid collisions.
Since a large number of consumer UAVs suffer from tight weight and power constraints, our work focuses on obstacle avoidance based on a lightweight stereo camera setup.
We use disparity maps, which are computed from the camera images, to locate obstacles and to automatically steer the UAV around them.
For disparity map computation we optimize the well-known semi-global matching (SGM) approach for the deployment on an embedded FPGA.
The disparity maps are then converted into simpler representations, the so-called U-/V-maps, which are used for obstacle detection.
Obstacle avoidance is based on a reactive approach that finds the shortest path around the obstacles as soon as they come within a critical distance of the UAV.
One of the fundamental goals of our work was the reduction of development costs by closing the gap between application development and hardware optimization.
Hence, we aimed at using high-level synthesis (HLS) for porting our algorithms, which are written in C/C++, to the embedded FPGA.
We evaluated our implementation of the disparity estimation on the KITTI Stereo 2015 benchmark.
The integrity of the overall realtime reactive obstacle avoidance algorithm has been evaluated by using Hardware-in-the-Loop testing in conjunction with two flight simulators.
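The U-/V-map representation mentioned above amounts to column-wise and row-wise disparity histograms; a minimal sketch (function and variable names are ours):

```python
import numpy as np

def uv_maps(disparity, max_disp):
    """Build U- and V-disparity maps from an integer disparity image.

    U-map: for each image column, a histogram over disparity values.
    V-map: for each image row, a histogram over disparity values."""
    h, w = disparity.shape
    u_map = np.zeros((max_disp + 1, w), dtype=np.int32)
    v_map = np.zeros((h, max_disp + 1), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            d = disparity[y, x]
            if 0 <= d <= max_disp:
                u_map[d, x] += 1
                v_map[y, d] += 1
    return u_map, v_map
```

Obstacles at roughly constant disparity show up as high-count cells in the U-map and as vertical segments in the V-map, which makes threshold-based detection cheap enough for an embedded platform.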
The holy Quran is the sacred book of the Muslims.
It contains information about many domains.
People often search for particular concepts of the holy Quran based on the relations among those concepts.
An ontological model of the holy Quran can be useful in such a scenario.
In this paper, we have modeled nature-related concepts of the holy Quran using OWL (Web Ontology Language) and RDF (Resource Description Framework).
Our methodology involves identifying nature-related concepts mentioned in the holy Quran and identifying the relations among those concepts.
These concepts and relations are represented as classes/instances and properties of an OWL ontology.
The results section shows that, using the ontological model, SPARQL queries can retrieve verses and concepts of interest.
Thus, this modeling supports semantic search and querying over the holy Quran.
In this work, we have used the Sahih International English translation of the holy Quran and the Protege OWL editor, and we have used SPARQL for querying.
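The ontology itself is in OWL/RDF and queried with SPARQL; as a self-contained illustration of the triple-pattern matching that SPARQL performs, here is a toy in-memory example (the triples and names below are invented, not actual ontology content):

```python
# A toy triple store illustrating the subject-predicate-object queries
# that SPARQL runs over an OWL/RDF ontology. The triples are invented
# examples for illustration only.
triples = [
    ("Rain", "isA", "NaturalPhenomenon"),
    ("Rain", "mentionedIn", "Verse 2:22"),
    ("Tree", "isA", "Plant"),
    ("Tree", "mentionedIn", "Verse 14:24"),
]

def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]
```

A SPARQL basic graph pattern such as `?c :mentionedIn ?v` plays the role of `match(p="mentionedIn")` here, binding the free positions to matching subjects and objects.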
Deep hashing methods have received much attention recently, which achieve promising results by taking advantage of the strong representation power of deep networks.
However, most existing deep hashing methods learn a whole set of hashing functions independently, while ignoring the correlations between different hashing functions that could greatly promote retrieval accuracy.
Inspired by the sequential decision ability of deep reinforcement learning, we propose a new Deep Reinforcement Learning approach for Image Hashing (DRLIH).
Our proposed DRLIH approach models the hashing learning problem as a sequential decision process, which learns each hashing function by correcting the errors imposed by previous ones and promotes retrieval accuracy.
To the best of our knowledge, this is the first work to address the hashing problem from a deep reinforcement learning perspective.
The main contributions of our proposed DRLIH approach can be summarized as follows: (1) We propose a deep reinforcement learning hashing network.
In the proposed network, we utilize recurrent neural network (RNN) as agents to model the hashing functions, which take actions of projecting images into binary codes sequentially, so that the current hashing function learning can take previous hashing functions' error into account.
(2) We propose a sequential learning strategy based on proposed DRLIH.
We define the state as a tuple of internal features of RNN's hidden layers and image features, which can reflect history decisions made by the agents.
We also propose an action group method to enhance the correlation of hash functions in the same group.
Experiments on three widely-used datasets demonstrate the effectiveness of our proposed DRLIH approach.
Accurate information of inertial parameters is critical to motion planning and control of space robots.
Before the launch, only a rudimentary estimate of the inertial parameters is available from experiments and computer-aided design (CAD) models.
After the launch, on-orbit operations substantially alter the value of inertial parameters.
In this work, we propose a new momentum model-based method for identifying the minimal parameters of a space robot while on orbit.
Minimal parameters are combinations of the inertial parameters of the links and uniquely define the momentum and dynamic models.
Consequently, they are sufficient for motion planning and control of both the satellite and robotic arms mounted on it.
The key to the proposed framework is the unique formulation of momentum model in the linear form of minimal parameters.
Further, to estimate the minimal parameters, we propose a novel joint trajectory planning and optimization technique based on direction combinations of joints' velocity.
The efficacy of the identification framework is demonstrated on a 12 degrees-of-freedom, spatial, dual-arm space robot.
The methodology is developed for tree-type space robots, requires only pose and twist data, and scales with an increasing number of joints.
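The linear-in-parameters momentum model is what makes the identification tractable: stacking momentum observations h_k = Y_k φ over many configurations yields a least-squares estimate of the minimal parameters φ. A generic sketch with synthetic regressors (the construction of Y from the robot's kinematics is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(42)
n_params = 6                   # number of minimal (base) parameters
phi_true = rng.standard_normal(n_params)

# Stack momentum observations h_k = Y_k @ phi from many configurations.
# For a real space robot, Y_k would be built from pose and twist data.
Y = rng.standard_normal((50 * 6, n_params))   # stacked regressor matrices
h = Y @ phi_true                              # corresponding momentum samples

# Least-squares estimate of the minimal parameters
phi_est, *_ = np.linalg.lstsq(Y, h, rcond=None)
```

In practice the joint trajectories are chosen (as in the proposed optimization) so that the stacked regressor is well conditioned, which keeps the estimate accurate under measurement noise.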
Power grids are one of the most important components of infrastructure in today's world.
Every nation is dependent on the security and stability of its own power grid to provide electricity to the households and industries.
A malfunction of even a small part of a power grid can cause loss of productivity, revenue and in some cases even life.
Thus, it is imperative to design a system which can detect the health of the power grid and take protective measures accordingly even before a serious anomaly takes place.
To achieve this objective, we have set out to create an artificially intelligent system which can analyze the grid information at any given time and determine the health of the grid through the usage of sophisticated formal models and novel machine learning techniques like recurrent neural networks.
Our system simulates grid conditions, including stimuli such as faults, generator output fluctuations, and load fluctuations, using Siemens PSS/E software, and this data is used to train various classifiers such as SVMs and LSTMs, which are subsequently tested.
The results are excellent, with our methods achieving very high accuracy on the data.
This model can easily be scaled to handle larger and more complex grid architectures.
An undesirable side effect of reversible color space transformation, which consists of lifting steps, is that while removing correlation it contaminates transformed components with noise from other components.
To remove correlation without increasing noise, we integrate denoising into the lifting steps and obtain a reversible image component transformation.
For JPEG-LS, JPEG 2000, and JPEG XR algorithms in lossless mode, we find that the proposed method applied to the RDgDb color space transformation with a simple denoising filter is especially effective for images in the native optical resolutions of acquisition devices, but may lead to increased bitrates for typical images.
We also present an efficient estimator of image component transformation effects.
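A reversible color space transformation of the kind discussed consists of integer lifting steps; a minimal sketch in the spirit of RDgDb (this variant, which keeps R and stores successive channel differences, is our illustrative assumption, not necessarily the exact RDgDb definition):

```python
def forward(r, g, b):
    """Integer-reversible lifting steps: keep R, store differences.
    (Illustrative variant, not necessarily the exact RDgDb definition.)"""
    dg = g - r
    db = b - g
    return r, dg, db

def inverse(r, dg, db):
    """Exact inverse: each lifting step is undone in reverse order."""
    g = dg + r
    b = db + g
    return r, g, b
```

The lifting structure is what allows denoising to be integrated: replacing a predictor with a denoised one, e.g. dg = g - f(r) for some denoising filter f, remains perfectly invertible, because the inverse recomputes f(r) from the already-reconstructed component.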
Improving the quality of end-of-life care for hospitalized patients is a priority for healthcare organizations.
Studies have shown that physicians tend to over-estimate prognoses, which in combination with treatment inertia results in a mismatch between patients' wishes and actual care at the end of life.
We describe a method to address this problem using Deep Learning and Electronic Health Record (EHR) data, which is currently being piloted, with Institutional Review Board approval, at an academic medical center.
The EHR data of admitted patients are automatically evaluated by an algorithm, which brings patients who are likely to benefit from palliative care services to the attention of the Palliative Care team.
The algorithm is a Deep Neural Network trained on the EHR data from previous years, to predict all-cause 3-12 month mortality of patients as a proxy for patients that could benefit from palliative care.
Our predictions enable the Palliative Care team to take a proactive approach in reaching out to such patients, rather than relying on referrals from treating physicians or conducting time-consuming chart reviews of all patients.
We also present a novel interpretation technique which we use to provide explanations of the model's predictions.
Convolutional sparse representations are a form of sparse representation with a structured, translation invariant dictionary.
Most convolutional dictionary learning algorithms to date operate in batch mode, requiring simultaneous access to all training images during the learning process, which results in very high memory usage and severely limits the training data that can be used.
Very recently, however, a number of authors have considered the design of online convolutional dictionary learning algorithms that offer far better scaling of memory and computational cost with training set size than batch methods.
This paper extends our prior work, improving a number of aspects of our previous algorithm; proposing an entirely new one, with better performance, and that supports the inclusion of a spatial mask for learning from incomplete data; and providing a rigorous theoretical analysis of these methods.
We consider the problem of object recognition in 3D using an ensemble of attribute-based classifiers.
We propose two new concepts to improve classification in practical situations, and show their implementation in an approach implemented for recognition from point-cloud data.
First, the viewing conditions can have a strong influence on classification performance.
We study the impact of the distance between the camera and the object and propose an approach to fuse multiple attribute classifiers, which incorporates distance into the decision making.
Second, lack of representative training samples often makes it difficult to learn the optimal threshold value for best positive and negative detection rate.
We address this issue by setting two threshold values in our attribute classifiers instead of just one, distinguishing a positive, a negative, and an uncertainty class, and we prove the theoretical correctness of this approach.
Empirical studies demonstrate the effectiveness and feasibility of the proposed concepts.
To face future reliability challenges, it is necessary to quantify the risk of error in any part of a computing system.
To this goal, the Architectural Vulnerability Factor (AVF) has long been used for chips.
However, this metric is used for offline characterisation, which is inappropriate for memory.
We survey the literature and formalise one of the metrics used, the Memory Vulnerability Factor, and extend it to take into account false errors.
These are reported errors which would have no impact on the program if they were ignored.
We measure the False Error Aware MVF (FEA) and related metrics precisely in a cycle-accurate simulator, and compare them with the effects of injecting faults in a program's data, in native parallel runs.
Our findings show that MVF and FEA are the only two metrics that are safe to use at runtime, as they both consistently give an upper bound on the probability of incorrect program outcome.
FEA gives a tighter bound than MVF, and is the metric that correlates best with the incorrect outcome probability of all considered metrics.
Universities and research centers in Spain are subject to a national open access (OA) mandate and to their own OA institutional policies, if any, but compliance with these requirements has not been fully monitored yet.
We studied the degree of OA archiving of publications of 28 universities within the period 2012-2014.
Of these, 12 have an institutional OA mandate, 9 do not require but request or encourage OA of scholarly outputs, and 7 do not have a formal OA statement but are well known for their support of the OA movement.
The potential OA rate was calculated according to the publisher open access policies indicated in Sherpa/Romeo directory.
The universities showed an asymmetric distribution: 1% to 63% of the articles indexed by the Web of Science in the same period were archived in repositories, of which 1% to 35% were OA and the rest were closed access.
For articles on work carried out with public funding and subject to the Spanish Science law, the percentage was similar or slightly higher.
However, the analysis of potential OA showed that the figure could have reached 80% in some cases.
This means that the real proportion of articles in OA is far below what it could potentially be.
For most deep learning algorithms training is notoriously time consuming.
Since most of the computation in training neural networks is typically spent on floating point multiplications, we investigate an approach to training that eliminates the need for most of these.
Our method consists of two parts: First we stochastically binarize weights to convert multiplications involved in computing hidden states to sign changes.
Second, while back-propagating error derivatives, in addition to binarizing the weights, we quantize the representations at each layer to convert the remaining multiplications into binary shifts.
Experimental results across 3 popular datasets (MNIST, CIFAR10, SVHN) show that this approach not only does not hurt classification performance but can result in even better performance than standard stochastic gradient descent training, paving the way to fast, hardware-friendly training of neural networks.
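The two parts above can be sketched as follows (the hard-sigmoid binarization probability and power-of-two quantization are the commonly used formulations for this family of methods; treat the exact details as assumptions):

```python
import numpy as np

def stochastic_binarize(w, rng):
    """Binarize weights to {-1, +1} with P(+1) = clip((w + 1) / 2, 0, 1),
    so E[w_b] = w for w in [-1, 1]. Multiplications by binarized weights
    reduce to sign changes."""
    p = np.clip((w + 1.0) / 2.0, 0.0, 1.0)
    return np.where(rng.random(w.shape) < p, 1.0, -1.0)

def quantize_pow2(x):
    """Quantize magnitudes to the nearest power of two, so the remaining
    multiplications become binary shifts."""
    sign = np.sign(x)
    mag = np.where(x == 0, 0.0, 2.0 ** np.round(np.log2(np.abs(x) + 1e-12)))
    return sign * mag
```

Because the binarization is unbiased in expectation, the accumulated gradient signal is preserved on average, which is why training quality need not suffer.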
Network Functions Virtualization (NFV) aims to support service providers to deploy various services in a more agile and cost-effective way.
However, the softwarization and cloudification of network functions can result in severe congestion and low network performance.
In this paper, we propose a solution to address this issue.
We analyze and solve the online load balancing problem using multipath routing in NFV to optimize network performance in response to the dynamic changes of user demands.
In particular, we first formulate the optimization problem of load balancing as a mixed integer linear program for achieving the optimal solution.
We then develop the ORBIT algorithm that solves the online load balancing problem.
The performance guarantee of ORBIT is analytically proved in comparison with the optimal offline solution.
The experiment results on real-world datasets show that ORBIT performs very well for distributing traffic of each service demand across multipaths without knowledge of future demands, especially under high-load conditions.
Training state-of-the-art offline handwriting recognition (HWR) models requires large labeled datasets, but unfortunately such datasets are not available in all languages and domains due to the high cost of manual labeling.
We address this problem by showing how high resource languages can be leveraged to help train models for low resource languages.
We propose a transfer learning methodology where we adapt HWR models trained on a source language to a target language that uses the same writing script.
This methodology only requires labeled data in the source language, unlabeled data in the target language, and a language model of the target language.
The language model is used in a bootstrapping fashion to refine predictions in the target language for use as ground truth in training the model.
Using this approach we demonstrate improved transferability among French, English, and Spanish languages using both historical and modern handwriting datasets.
In the best case, transferring with the proposed methodology results in character error rates nearly as good as full supervised training.
While Wikipedia exists in 287 languages, its content is unevenly distributed among them.
In this work, we investigate the generation of open domain Wikipedia summaries in underserved languages using structured data from Wikidata.
To this end, we propose a neural network architecture equipped with copy actions that learns to generate single-sentence and comprehensible textual summaries from Wikidata triples.
We demonstrate the effectiveness of the proposed approach by evaluating it against a set of baselines on two languages of different natures: Arabic, a morphologically rich language with a larger vocabulary than English, and Esperanto, a constructed language known for its easy acquisition.
We conjecture that humans start acquiring grasping skills as early as the infant stage by virtue of two key processes.
First, infants attempt to learn grasps for known objects by imitating humans.
Secondly, knowledge acquired during this process is reused in learning to grasp novel objects.
We argue that these processes of active and transfer learning boil down to a random search of grasps on an object, suitably biased by prior experience.
In this paper we introduce active learning of grasps for known objects as well as transfer learning of grasps for novel objects grounded on kernel adaptive, mode-hopping Markov Chain Monte Carlo.
Our experiments show promising applicability of our proposed learning methods.
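The kernel-adaptive, mode-hopping sampler is not reproduced here; a plain random-walk Metropolis-Hastings sketch over a one-dimensional "grasp score" illustrates the biased random search idea (the target, proposal, and step size are our assumptions):

```python
import math
import random

def metropolis_hastings(log_score, x0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings: propose a Gaussian perturbation and
    accept with probability min(1, exp(delta log score)), so samples
    concentrate where the score is high."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, step)
        if rng.random() < math.exp(min(0.0, log_score(x_new) - log_score(x))):
            x = x_new
        samples.append(x)
    return samples
```

Prior experience enters through the score function: grasps similar to previously successful ones get higher scores, biasing the chain toward them while still exploring.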
Person-to-person evaluations are prevalent in all kinds of discourse and important for establishing reputations, building social bonds, and shaping public opinion.
Such evaluations can be analyzed separately using signed social networks and textual sentiment analysis, but this misses the rich interactions between language and social context.
To capture such interactions, we develop a model that predicts individual A's opinion of individual B by synthesizing information from the signed social network in which A and B are embedded with sentiment analysis of the evaluative texts relating A to B.
We prove that this problem is NP-hard but can be relaxed to an efficiently solvable hinge-loss Markov random field, and we show that this implementation outperforms text-only and network-only versions in two very different datasets involving community-level decision-making: the Wikipedia Requests for Adminship corpus and the Convote U.S. Congressional speech corpus.
While a number of touch-based visualization systems have appeared in recent years, relatively little work has been done to evaluate these systems.
The prevailing methods compare these systems to desktop-class applications or utilize traditional training-based usability studies.
We argue that existing studies, while useful, fail to address a key aspect of mobile application usage - initial impression and discoverability-driven usability.
Over the past few years, we have developed a tablet-based visualization system, Tangere, for analyzing tabular data in a multiple coordinated view configuration.
This article describes a discoverability-based user study of Tangere in which the system is compared to a commercially available visualization system for tablets - Tableau's Vizable.
The study highlights aspects of each system's design that resonate with study participants, and we reflect upon those findings to identify design principles for future tablet-based data visualization systems.
The popularity of ASR (automatic speech recognition) systems, like Google Voice, Cortana, brings in security concerns, as demonstrated by recent attacks.
The impacts of such threats, however, are less clear, since they are either less stealthy (producing noise-like voice commands) or require the physical presence of an attack device (using ultrasound).
In this paper, we demonstrate that not only are more practical and surreptitious attacks feasible but they can even be automatically constructed.
Specifically, we find that the voice commands can be stealthily embedded into songs, which, when played, can effectively control the target system through ASR without being noticed.
For this purpose, we developed novel techniques that address a key technical challenge: integrating the commands into a song in a way that can be effectively recognized by ASR through the air, in the presence of background noise, while not being detected by a human listener.
Our research shows that this can be done automatically against real world ASR applications.
We also demonstrate that such CommanderSongs can be spread through the Internet (e.g., YouTube) and radio, potentially affecting millions of ASR users.
We further present a new mitigation technique that controls this threat.
Stewards of social science data face a fundamental tension.
On one hand, they want to make their data accessible to as many researchers as possible to facilitate new discoveries.
At the same time, they want to restrict access to their data as much as possible in order to protect the people represented in the data.
In this paper, we provide a case study addressing this common tension in an uncommon setting: the Fragile Families Challenge, a scientific mass collaboration designed to yield insights that could improve the lives of disadvantaged children in the United States.
We describe our process of threat modeling, threat mitigation, and third-party guidance.
We also describe the ethical principles that formed the basis of our process.
We are open about our process and the trade-offs that we made in the hopes that others can improve on what we have done.
Community detection is one of the most studied problems on complex networks.
Although hundreds of methods have been proposed so far, there is still no universally accepted formal definition of what a good community is.
As a consequence, the problem of the evaluation and the comparison of the quality of the solutions produced by these algorithms is still an open question, despite constant progress on the topic.
In this article, we investigate how using a multi-criteria evaluation can solve some of the existing problems of community evaluation, in particular the question of multiple equally-relevant solutions of different granularity.
After exploring several approaches, we introduce a new quality function, called MDensity, and propose a method that can be related both to a widely used community detection metric, the Modularity, and to the Precision/Recall approach, ubiquitous in information retrieval.
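Since MDensity is related to Modularity, it helps to state the standard Newman-Girvan Modularity for a hard partition concretely; a minimal sketch:

```python
def modularity(edges, communities):
    """Newman-Girvan modularity Q = sum_c (e_c / m - (d_c / (2m))^2)
    for an undirected graph given as an edge list and a node-to-community
    mapping; e_c counts intra-community edges, d_c sums degrees."""
    m = len(edges)
    intra = {}   # e_c: edges with both endpoints in community c
    deg = {}     # d_c: total degree of community c
    for u, v in edges:
        cu, cv = communities[u], communities[v]
        deg[cu] = deg.get(cu, 0) + 1
        deg[cv] = deg.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in deg.items())
```

For two triangles joined by a single bridge edge and partitioned into the two triangles, Q = 5/14, reflecting a clearly better-than-random community structure.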
This paper presents generalized probabilistic models for high-order projective dependency parsing and an algorithmic framework for learning these statistical models involving dependency trees.
Partition functions and marginals for high-order dependency trees can be computed efficiently, by adapting our algorithms which extend the inside-outside algorithm to higher-order cases.
To show the effectiveness of our algorithms, we perform experiments on three languages---English, Chinese and Czech, using maximum conditional likelihood estimation for model training and L-BFGS for parameter estimation.
Our methods achieve competitive performance for English, and outperform all previously reported dependency parsers for Chinese and Czech.
There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on.
It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations.
In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts.
We formulate the manipulation planning as a structured prediction problem and design a deep learning model that can handle large noise in the manipulation demonstrations and learns features from three different modalities: point-clouds, language and trajectory.
In order to collect a large number of manipulation demonstrations for different objects, we developed a new crowd-sourcing platform called Robobarista.
We test our model on our dataset consisting of 116 objects with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations.
We further show that our robot can even manipulate objects it has never seen before.
This work presents CascadeCNN, an automated toolflow that pushes the quantisation limits of any given CNN model, aiming to perform high-throughput inference.
A two-stage architecture tailored for any given CNN-FPGA pair is generated, consisting of a low- and high-precision unit in a cascade.
A confidence evaluation unit is employed to identify misclassified cases from the excessively low-precision unit and forward them to the high-precision unit for re-processing.
Experiments demonstrate that the proposed toolflow can achieve a performance boost up to 55% for VGG-16 and 48% for AlexNet over the baseline design for the same resource budget and accuracy, without the need of retraining the model or accessing the training data.
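The cascade's gating logic can be sketched as follows (the softmax-confidence criterion and threshold are illustrative assumptions, not the toolflow's exact confidence evaluation unit):

```python
import numpy as np

def cascade_predict(low_logits, high_fn, threshold=0.9):
    """Accept low-precision predictions whose softmax confidence exceeds
    `threshold`; forward the rest to the high-precision unit `high_fn`,
    which maps sample indices to labels."""
    exp = np.exp(low_logits - low_logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    confident = probs.max(axis=1) >= threshold
    preds = probs.argmax(axis=1)
    if not confident.all():
        preds[~confident] = high_fn(np.flatnonzero(~confident))
    return preds, confident
```

The throughput gain comes from the fact that only the low-confidence minority of samples pays the cost of the high-precision unit.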
Classification systems typically act in isolation, meaning they are required to implicitly memorize the characteristics of all candidate classes in order to classify.
The cost of this is increased memory usage and poor sample efficiency.
We propose a model which instead verifies using reference images during the classification process, reducing the burden of memorization.
The model uses iterative nondifferentiable queries in order to classify an image.
We demonstrate that such a model is feasible to train and can match baseline accuracy while being more parameter efficient.
However, we show that finding the correct balance between image recognition and verification is essential to pushing the model towards desired behavior, suggesting that a pipeline of recognition followed by verification is a more promising approach.
Compressed sensing is a technique for finding sparse solutions to underdetermined linear systems.
This technique relies on properties of the sensing matrix such as the restricted isometry property.
Sensing matrices that satisfy this property with optimal parameters are mainly obtained via probabilistic arguments.
Deciding whether a given matrix satisfies the restricted isometry property is a non-trivial computational problem.
Indeed, we show in this paper that restricted isometry parameters cannot be approximated in polynomial time within any constant factor under the assumption that the hidden clique problem is hard.
Moreover, on the positive side we propose an improvement on the brute-force enumeration algorithm for checking the restricted isometry property.
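The brute-force baseline being improved upon checks every k-column submatrix; a minimal sketch of computing the order-k restricted isometry constant this way:

```python
import itertools
import numpy as np

def rip_constant(A, k):
    """Brute-force restricted isometry constant of order k:
    delta_k = max over all k-column submatrices A_S of the largest
    eigenvalue deviation of A_S^T A_S from the identity."""
    n = A.shape[1]
    delta = 0.0
    for S in itertools.combinations(range(n), k):
        G = A[:, S].T @ A[:, S]
        eigs = np.linalg.eigvalsh(G)
        delta = max(delta, abs(eigs[0] - 1.0), abs(eigs[-1] - 1.0))
    return delta
```

The cost grows as C(n, k) submatrices, which is exactly why the problem's hardness matters and why pruning this enumeration is valuable.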
Hyperparameters are critical in machine learning, as different hyperparameters often result in models with significantly different performance.
Hyperparameters may be deemed confidential because of their commercial value and the confidentiality of the proprietary algorithms that the learner uses to learn them.
In this work, we propose attacks on stealing the hyperparameters that are learned by a learner.
We call our attacks hyperparameter stealing attacks.
Our attacks are applicable to a variety of popular machine learning algorithms such as ridge regression, logistic regression, support vector machine, and neural network.
We evaluate the effectiveness of our attacks both theoretically and empirically.
For instance, we evaluate our attacks on Amazon Machine Learning.
Our results demonstrate that our attacks can accurately steal hyperparameters.
We also study countermeasures.
Our results highlight the need for new defenses against our hyperparameter stealing attacks for certain machine learning algorithms.
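For ridge regression the attack admits a closed form: the learned weights w satisfy the first-order optimality condition of ||Xw - y||^2 + lambda * ||w||^2, so lambda can be recovered exactly from (X, y, w). A sketch on synthetic data (the objective's scaling convention is an assumption):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((100, 5))
y = rng.standard_normal(100)
lam_true = 3.0

# Learner: ridge solution w = (X^T X + lam I)^{-1} X^T y
w = np.linalg.solve(X.T @ X + lam_true * np.eye(5), X.T @ y)

# Attacker: recover lam from (X, y, w) via the stationarity condition
# X^T (X w - y) + lam * w = 0  =>  lam = -w^T X^T (X w - y) / (w^T w)
lam_est = -(w @ X.T @ (X @ w - y)) / (w @ w)
```

The same stationarity-condition idea extends to the other algorithms considered, with least-squares recovery replacing the exact solution when the condition is over-determined.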
Despite being one of the most basic tasks in software development, debugging is still performed in a mostly manual way, leading to high cost and low performance.
To address this problem, researchers have studied promising approaches, such as Spectrum-based Fault Localization (SFL) techniques, which pinpoint program elements more likely to contain faults.
This survey discusses the state-of-the-art of SFL, including the different techniques that have been proposed, the type and number of faults they address, the types of spectra they use, the programs they utilize in their validation, the testing data that support them, and their use in industrial settings.
Notwithstanding the advances, there are still challenges for the industry to adopt these techniques, which we analyze in this paper.
SFL techniques should propose new ways to generate reduced sets of suspicious entities, combine different spectra to fine-tune the fault localization ability, use strategies to collect fine-grained coverage levels from suspicious coarser levels for balancing execution costs and output precision, and propose new techniques to cope with multiple-fault programs.
Moreover, additional user studies are needed to understand better how SFL techniques can be used in practice.
We conclude by presenting a concept map about topics and challenges for future research in SFL.
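A representative SFL ranking metric is Ochiai, which scores each program element from per-test coverage and pass/fail outcomes; a minimal sketch (an illustration of the technique, not an artifact from the survey):

```python
import math

def ochiai(coverage, failing):
    """Ochiai suspiciousness per program element.

    coverage[t][e] = 1 if test t executes element e; failing[t] is True
    if test t failed. ef/ep count failing/passing tests covering e;
    score = ef / sqrt(total_failing * (ef + ep))."""
    n_elems = len(coverage[0])
    total_fail = sum(failing)
    scores = []
    for e in range(n_elems):
        ef = sum(1 for t, row in enumerate(coverage) if row[e] and failing[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[e] and not failing[t])
        denom = math.sqrt(total_fail * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores
```

Elements covered mostly by failing tests rank highest, which is the ranked list of suspicious entities the developer inspects.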
In light of the 40th jubilee of Requirements Engineering (RE), roughly 40 experts met in Switzerland to discuss where our discipline stands today.
As of today, the common view is, indisputably, that RE as a discipline is stable and respected, as pointed out by Sarah Gregory when covering the seminar in her column to which articles like this one are invited to present ongoing research.
However, it is also evident that after 40 years of promising research, conducting research that industry needs is still an ongoing challenge.
Research that industry needs means research that solves industrial problems practitioners face; but do we really understand those problems?
Here, I want to revisit this research challenge and outline an initiative, the Naming the Pain in Requirements Engineering Initiative, which aims to tackle this problem.
In this article, we quantitatively analyze how the term "fake news" is being shaped in news media in recent years.
We study the perception and the conceptualization of this term in the traditional media using eight years of data collected from news outlets based in 20 countries.
Our results not only corroborate previous indications of a high increase in the usage of the expression "fake news", but also show contextual changes around this expression after the United States presidential election of 2016.
Among other results, we found changes in the related vocabulary, in the mentioned entities, in the surrounding topics and in the contextual polarity around the term "fake news", suggesting that this expression underwent a change in perception and conceptualization after 2016.
These outcomes expand the understanding of the usage of the term "fake news", helping to comprehend and more accurately characterize this relevant social phenomenon linked to misinformation and manipulation.
Machine Reading Comprehension (MRC) has recently become enormously popular and attracted considerable attention.
However, existing reading comprehension datasets are mostly in English.
To add diversity in reading comprehension datasets, in this paper we propose a new Chinese reading comprehension dataset for accelerating related research in the community.
The proposed dataset contains two different types: cloze-style reading comprehension and user query reading comprehension, associated with large-scale training data as well as human-annotated validation and hidden test set.
Along with this dataset, we also hosted the first Evaluation on Chinese Machine Reading Comprehension (CMRC-2017) and successfully attracted tens of participants, which suggests the potential impact of this dataset.
When analyzing the genome, researchers have discovered that proteins bind to DNA based on certain patterns of the DNA sequence known as "motifs".
However, it is difficult to manually construct motifs due to their complexity.
Recently, externally learned memory models have proven to be effective methods for reasoning over inputs and supporting sets.
In this work, we present memory matching networks (MMN) for classifying DNA sequences as protein binding sites.
Our model learns a memory bank of encoded motifs, which are dynamic memory modules, and then matches a new test sequence to each of the motifs to classify the sequence as a binding or nonbinding site.
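The matching step can be sketched as follows (a schematic reading, not the authors' architecture: the memory bank, label assignment, dimensions, and random values are all placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned state: M motif embeddings of dimension d in a
# memory bank, each tagged with a binding label (1 = binding motif).
M, d = 8, 16
memory = rng.normal(size=(M, d))
binding = rng.integers(0, 2, size=M)

def classify(seq_embedding, memory, binding):
    """Match an encoded DNA sequence against every motif in the memory
    bank by cosine similarity, then take a softmax-weighted vote over
    the motifs' binding labels."""
    sims = memory @ seq_embedding
    sims = sims / (np.linalg.norm(memory, axis=1) * np.linalg.norm(seq_embedding))
    w = np.exp(sims - sims.max())
    w = w / w.sum()
    return float(w @ binding)  # probability that the sequence is a binding site

p = classify(rng.normal(size=d), memory, binding)
```

In the actual model the memory modules are dynamic and trained end-to-end; this sketch only shows the match-then-vote classification pattern.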
Recently, much progress has been made in image-to-image translation, owing to the success of conditional Generative Adversarial Networks (cGANs).
Unpaired methods based on cycle-consistency loss, such as DualGAN, CycleGAN, and DiscoGAN, have become especially popular.
However, translation tasks that require high-level visual information conversion remain very challenging, such as photo-to-caricature translation, which demands satire, exaggeration, lifelikeness, and artistry.
We present an approach for learning to translate faces in the wild from the source photo domain to the target caricature domain with different styles, which can also be used for other high-level image-to-image translation tasks.
In order to capture global structure along with local statistics during translation, we design a dual-pathway model with one coarse discriminator and one fine discriminator.
For the generator, we add an extra perceptual loss, in association with the adversarial and cycle-consistency losses, to achieve representation learning for the two domains.
The style can also be learned from an auxiliary noise input.
Experiments on photo-to-caricature translation of faces in the wild show a considerable performance gain of our proposed method over state-of-the-art translation methods, as well as its potential for real applications.
After defining a pure-action profile in a nonatomic aggregative game, where players have specific compact convex pure-action sets and nonsmooth convex cost functions, as a square-integrable function, we characterize a Wardrop equilibrium as a solution to an infinite-dimensional generalized variational inequality.
We show the existence of Wardrop equilibrium and variational Wardrop equilibrium, a concept of equilibrium adapted to the presence of coupling constraints, in monotone nonatomic aggregative games.
The uniqueness of (variational) Wardrop equilibrium is proved for strictly or aggregatively strictly monotone nonatomic aggregative games.
We then show that, for a sequence of finite-player aggregative games with aggregative constraints, if the players' pure-action sets converge to those of a strongly (resp. aggregatively strongly) monotone nonatomic aggregative game, and the aggregative constraints in the finite-player games converge to the aggregative constraint of the nonatomic game, then a sequence of so-called variational Nash equilibria in these finite-player games converges to the variational Wardrop equilibrium in pure-action profile (resp. aggregate-action profile).
In particular, it allows the construction of an auxiliary sequence of games with finite-dimensional equilibria to approximate the infinite-dimensional equilibrium in such a nonatomic game.
Finally, we show how to construct auxiliary finite-player games for two general classes of nonatomic games.
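Schematically, the variational-inequality characterization mentioned above takes the following form (notation ours and simplified relative to the paper):

```latex
% A profile x^* in the feasible set X is a (variational) Wardrop
% equilibrium if and only if it solves the infinite-dimensional
% generalized variational inequality
\int_{\Theta} \big\langle F_{\theta}(x^{*}),\; x_{\theta} - x^{*}_{\theta} \big\rangle \,\mathrm{d}\theta \;\geq\; 0
\qquad \text{for all } x \in X,
% where \Theta is the (nonatomic) player set and F_\theta(x^*) is a
% subgradient of player \theta's cost evaluated at x^*.
```

Monotonicity of the operator built from the players' costs is what yields existence, and strict (or aggregatively strict) monotonicity yields uniqueness.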
Several studies assert that the random access procedure of the Long Term Evolution (LTE) cellular standard may not be effective whenever a massive number of simultaneous connection attempts are performed by terminals, as may happen in a typical Internet of Things or Smart City scenario.
Nevertheless, simulation studies in real deployment scenarios are missing because many system-level simulators do not implement the LTE random access procedure in detail.
In this paper, we propose a patch for the LTE module of ns-3, one of the most prominent open-source network simulators, to improve the accuracy of the routine that simulates the LTE Random Access Channel (RACH).
The patched version of the random access procedure is compared with the default one and the issues arising from massive simultaneous access from mobile terminals in LTE are assessed via a simulation campaign.
For the efficient execution of deep convolutional neural networks (CNN) on edge devices, various approaches have been presented which reduce the bit width of the network parameters down to 1 bit.
Binarization of the first layer was always excluded, as it leads to a significant error increase.
Here, we present the novel concept of binary input layer (BIL), which allows the usage of binary input data by learning bit specific binary weights.
The concept is evaluated on three datasets (PAMAP2, SVHN, CIFAR-10).
Our results show that this approach is particularly beneficial for multimodal datasets (PAMAP2), where it outperforms networks using full-precision weights in the first layer by 1.92 percentage points (pp) while consuming only 2% of the chip area.
We present Neural Wavetable, a proof-of-concept wavetable synthesizer that uses neural networks to generate playable wavetables.
The system can produce new, distinct waveforms through the interpolation of traditional wavetables in an autoencoder's latent space.
It is available as a VST/AU plugin for use in a Digital Audio Workstation.
The iterative decoding threshold of low-density parity-check (LDPC) codes over the binary erasure channel (BEC) fulfills an upper bound depending only on the variable and check nodes with minimum distance 2.
This bound is a consequence of the stability condition, and is here referred to as stability bound.
In this paper, a stability bound over the BEC is developed for doubly-generalized LDPC codes, where the variable and the check nodes can be generic linear block codes, assuming maximum a posteriori erasure correction at each node.
It is proved that in this generalized context as well the bound depends only on the variable and check component codes with minimum distance 2.
A condition is also developed, namely the derivative matching condition, under which the bound is achieved with equality.
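For context, the classic stability bound for conventional LDPC codes over the BEC, of which the result above is a generalization, reads:

```latex
% The iterative decoding threshold \varepsilon^* satisfies
\varepsilon^{*} \;\leq\; \frac{1}{\lambda'(0)\,\rho'(1)} ,
% where \lambda and \rho are the edge-perspective variable and check
% degree distributions. Since \lambda'(0) is the fraction of edges
% attached to degree-2 variable nodes, only minimum-distance-2
% components enter the bound, mirroring the generalized statement.
```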
We analyze the path arrival rate for an in-room radio channel with directive antennas.
The impulse response of this channel exhibits a transition from early separate components to a diffuse reverberation tail.
Under the assumption that the transmitter's (or receiver's) position and orientation are picked uniformly at random, we derive an exact expression for the mean arrival rate in a rectangular room as predicted by mirror source theory.
The rate is quadratic in delay, inversely proportional to the room volume, and proportional to the product of beam coverage fractions of the transmitter and receiver antennas.
Making use of the exact formula, we characterize the onset of the diffuse tail by defining a "mixing time" as the point in time where the arrival rate exceeds one component per transmit pulse duration.
We also give an approximation for the power-delay spectrum.
It turns out that the power-delay spectrum is unaffected by the antenna directivity.
However, Monte Carlo simulations show that antenna directivity does indeed play an important role for the distribution of instantaneous mean delay and rms delay spread.
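The stated dependencies are consistent with a mirror-source counting argument (symbols ours; the paper's exact expression may differ in constants):

```latex
% Counting mirror sources inside a sphere of radius c\tau around the
% receiver gives roughly \tfrac{4}{3}\pi (c\tau)^3 / V sources, so
% differentiating with respect to \tau yields a mean arrival rate
\lambda(\tau) \;=\; \frac{4\pi c^{3}\tau^{2}}{V}\, q_{T}\, q_{R} ,
% quadratic in delay \tau, inversely proportional to the room volume V,
% and proportional to the product of the transmit and receive beam
% coverage fractions q_T and q_R.
```

Setting \(\lambda(\tau)\) equal to one component per transmit pulse duration then gives the "mixing time" that marks the onset of the diffuse tail.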
This paper provides a technical introduction to the PATSTAT Register database, which contains bibliographical, procedural and legal status data on patent applications handled by the European Patent Office.
It presents eight MySQL queries that cover some of the most relevant aspects of the database for research purposes.
It targets academic researchers and practitioners who are familiar with the PATSTAT database and the MySQL language.
Novel scientific knowledge is constantly produced by the scientific community.
Understanding the level of novelty characterized by scientific literature is key for modeling scientific dynamics and analyzing the growth mechanisms of scientific knowledge.
Metrics derived from bibliometrics and citation analysis were effectively used to characterize the novelty in scientific development.
However, time is required before we can observe links between documents such as citation links or patterns derived from the links, which makes these techniques more effective for retrospective analysis than predictive analysis.
In this study, we present a new approach to measuring the novelty of a research topic in a scientific community over a specific period by tracking semantic changes of the terms and characterizing the research topic in their usage context.
The semantic changes are derived from the text data of scientific literature by temporal embedding learning techniques.
We validated the effects of the proposed novelty metric on predicting the future growth of scientific publications and investigated the relations between novelty and growth by panel data analysis applied in a large-scale publication dataset (MEDLINE/PubMed).
Key findings based on the statistical investigation indicate that the novelty metric has significant predictive effects on the growth of scientific literature and the predictive effects may last for more than ten years.
We demonstrated the effectiveness and practical implications of the novelty metric in three case studies.
Enabling technologies for an energy-sustainable Internet of Things (IoT) are of paramount importance, given the proliferation of low-power network devices with high data communication demands.
In this paper, we consider a Multiple Input Single Output (MISO) multicasting IoT system comprising a multiantenna Transmitter (TX) simultaneously transferring information and power to low-power, data-hungry IoT Receivers (RXs).
Each IoT device is assumed to be equipped with Power Splitting (PS) hardware that enables Energy Harvesting (EH) and imposes an individual Quality of Service (QoS) constraint to the downlink communication.
We study the joint design of TX precoding and IoT PS ratios for the considered MISO Simultaneous Wireless Information and Power Transfer (SWIPT) multicasting IoT system, with the objective of maximizing the minimum harvested energy among the IoT RXs while satisfying their individual QoS requirements.
In our novel EH fairness maximization formulation, we adopt a generic Radio Frequency (RF) EH model capturing practical rectification operation, and resulting in a nonconvex optimization problem.
For this problem, we first present an equivalent semi-definite relaxation formulation and then prove that it admits a unique globally optimal solution.
We also derive tight upper and lower bounds on the globally optimal solution that are exploited in obtaining low complexity algorithmic implementations for the targeted joint design.
Analytical expressions for the optimal TX beamforming directions, power allocation, and IoT PS ratios are also presented.
Our representative numerical results, including comparisons with benchmark designs, corroborate the usefulness of the proposed framework and provide useful insights on the interplay of critical system parameters.
In network tomography, one goal is to identify a small set of failed links in a network, by sending a few packets through the network and seeing which reach their destination.
This problem can be seen as a variant of combinatorial group testing, which has been studied before under the moniker "graph-constrained group testing."
The main contribution of this work is to show that for most graphs, the "constraints" imposed by the underlying network topology are no constraint at all.
That is, the number of tests required to identify the failed links in "graph-constrained" group testing is near-optimal even for the corresponding group testing problem with no graph constraints.
Our approach is based on a simple randomized construction of tests; to analyze our construction, we prove new results about the size of giant components in randomly sparsified graphs.
Finally, we provide empirical results which suggest that our connected-subgraph tests perform better not just in theory but also in practice, and in particular perform better on a real-world network topology.
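The randomized construction can be sketched as follows (our own illustrative reading: keep each link independently and use the connected components of the sparsified graph as tests; the paper's construction and analysis are more refined):

```python
import random

def sparsify_components(edges, n, p, seed=0):
    """One round of randomized test construction on a graph with n nodes:
    keep each edge independently with probability p, then return the
    connected components of the kept subgraph; each component's edge set
    is one group test (a connected subgraph a probe packet can traverse)."""
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    parent = list(range(n))                      # union-find over nodes
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in kept:
        parent[find(u)] = find(v)
    comps = {}
    for e in kept:
        comps.setdefault(find(e[0]), []).append(e)
    return list(comps.values())

def run_test(test_edges, failed):
    # The probe packet is lost iff its connected subgraph contains at
    # least one failed link: an OR measurement, as in group testing.
    return any(e in failed for e in test_edges)

# Toy path graph 0-1-2-3; with p = 1 the whole path is a single test.
tests = sparsify_components([(0, 1), (1, 2), (2, 3)], 4, p=1.0)
```

Repeating this for several values of p yields the full test matrix; the analysis hinges on the sparsified graph retaining a giant component.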
A general framework of spatio-spectral segmentation for multi-spectral images is introduced in this paper.
The method is based on classification-driven stochastic watershed (WS) by Monte Carlo simulations, and it gives more regular and reliable contours than standard WS.
The present approach is decomposed into several sequential steps.
First, a dimensionality-reduction stage is performed using the factor-correspondence analysis method.
In this context, a new way to select the factor axes (eigenvectors) according to their spatial information is introduced.
Then, a spectral classification produces a spectral pre-segmentation of the image.
Subsequently, a probability density function (pdf) of contours containing spatial and spectral information is estimated by simulation using a stochastic WS approach driven by the spectral classification.
The pdf of the contours is finally segmented by a WS controlled by markers from a regularization of the initial classification.
A number of algorithms for computing the simulation preorder are available.
Let Sigma denote the state space, -> the transition relation and Psim the partition of Sigma induced by simulation equivalence.
The algorithms by Henzinger, Henzinger, Kopke and by Bloom and Paige run in O(|Sigma||->|)-time and, as far as time-complexity is concerned, they are the best available algorithms.
However, these algorithms have the drawback of a space complexity that is more than quadratic in the size of the state space.
The algorithm by Gentilini, Piazza, Policriti--subsequently corrected by van Glabbeek and Ploeger--appears to provide the best compromise between time and space complexity.
Gentilini et al.'s algorithm runs in O(|Psim|^2|->|)-time while the space complexity is in O(|Psim|^2 + |Sigma|log|Psim|).
We present here a new efficient simulation algorithm that is obtained as a modification of Henzinger et al.'s algorithm and whose correctness is based on some techniques used in applications of abstract interpretation to model checking.
Our algorithm runs in O(|Psim||->|)-time and O(|Psim||Sigma|log|Sigma|)-space.
Thus, this algorithm improves the best known time bound while retaining an acceptable space complexity that is in general less than quadratic in the size of the state space.
An experimental evaluation showed good comparative results with respect to Henzinger, Henzinger and Kopke's algorithm.
Recently Trajectory-pooled Deep-learning Descriptors were shown to achieve state-of-the-art human action recognition results on a number of datasets.
This paper improves their performance by applying rank pooling to each trajectory, encoding the temporal evolution of deep learning features computed along the trajectory.
This leads to Evolution-Preserving Trajectory (EPT) descriptors, a novel type of video descriptor that significantly outperforms Trajectory-pooled Deep-learning Descriptors.
EPT descriptors are defined based on dense trajectories, and they provide complementary benefits to video descriptors that are not based on trajectories.
In particular, we show that the combination of EPT descriptors and VideoDarwin leads to state-of-the-art performance on Hollywood2 and UCF101 datasets.
With the growing economy, e-learning has gained increasing attention, as it conveys knowledge globally with improved interactivity and assistance at reduced cost.
For the past few years, accidents have become a severe problem for railway units, owing to irresponsibility, lack of knowledge, and improper guidance of station controllers (learners).
While focusing on e-learning technologies, railway units failed to address learners' needs, cultural diversity, and background skills through ethnically impartial e-learning environments, which resulted in inadequate training and degraded performance.
The purpose of this study is to understand the vision of a global diverse group of station traffic controllers about e-learning courses developed by their individual railway units.
The opinions of these officials have been verified by questionnaires on the basis of course organization, course accuracy, course effectiveness, course relevance, course productivity and course interactivity.
The results obtained show that the developed e-learning course was highly helpful, interactive, creative, and user-friendly for learners.
This helps e-learning gain acceptance among independent learners.
The task of person re-identification has recently received rising attention due to the high performance achieved by new methods based on deep learning.
In particular, in the context of video-based re-identification, many state-of-the-art works have explored the use of Recurrent Neural Networks (RNNs) to process input sequences.
In this work, we revisit this tool by deriving an approximation which reveals the small effect of recurrent connections, leading to a much simpler feed-forward architecture.
Using the same parameters as the recurrent version, our proposed feed-forward architecture obtains very similar accuracy.
More importantly, our model can be combined with a new training process to significantly improve re-identification performance.
Our experiments demonstrate that the proposed models converge substantially faster than recurrent ones, with accuracy improvements by up to 5% on two datasets.
The performance achieved is better or on par with other RNN-based person re-identification techniques.
SQL declaratively specifies what the desired output of a query is.
This work shows that a non-standard interpretation of the SQL semantics can, instead, disclose where a piece of the output originated in the input and why that piece found its way into the result.
We derive such data provenance for very rich SQL dialects (including recursion, windowed aggregates, and user-defined functions) at the fine-grained level of individual table cells.
The approach is non-invasive and implemented as a compositional source-level SQL rewrite: an input SQL query is transformed into its own interpreter that wields data dependencies instead of regular values.
We deliberately design this transformation to preserve the shape of both data and query, which allows provenance derivation to scale to complex queries without overwhelming the underlying database system.
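The idea of an interpreter that "wields data dependencies instead of regular values" can be modeled in miniature (plain Python rather than the paper's SQL-to-SQL rewrite; the cell identifiers are invented):

```python
class Cell:
    """A value tagged with the set of input table cells it depends on:
    a toy model of cell-level where-provenance."""
    def __init__(self, value, prov):
        self.value = value
        self.prov = frozenset(prov)

# Toy input column with per-cell provenance identifiers.
t = [Cell(10, {"r1.c1"}), Cell(20, {"r2.c1"}), Cell(5, {"r3.c1"})]

# Interpret "SELECT SUM(c1) FROM t WHERE c1 > 8": the output value is
# the sum, and its provenance is the union of the contributing cells.
selected = [c for c in t if c.value > 8]
total = Cell(sum(c.value for c in selected),
             frozenset().union(*(c.prov for c in selected)))
```

Every operator propagates dependency sets alongside values in this fashion; the paper's contribution is doing this compositionally inside SQL itself, for far richer dialects.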
A solution to the problem of asymptotically optimum perfect universal steganography of finite memoryless sources with a passive warden is provided, which is then extended to contemplate a distortion constraint.
The solution rests on the fact that Slepian's Variant I permutation coding implements first-order perfect universal steganography of finite host signals with optimum embedding rate.
The duality between perfect universal steganography with asymptotically optimum embedding rate and lossless universal source coding with asymptotically optimum compression rate is evinced in practice by showing that permutation coding can be implemented by means of adaptive arithmetic coding.
Next, a distortion constraint between the host signal and the information-carrying signal is considered.
Such a constraint is essential whenever real-world host signals with memory (e.g., images, audio, or video) are decorrelated to conform to the memoryless assumption.
The constrained version of the problem requires trading off embedding rate and distortion.
Partitioned permutation coding is shown to be a practical way to implement this trade-off, performing close to an unattainable upper bound on the rate-distortion function of the problem.
High-utility Itemset Mining (HUIM) finds itemsets from a transaction database with utility no less than a user-defined threshold where the utility of an itemset is defined as the sum of the utilities of its items.
In this paper, we introduce the notion of generalized utility functions that need not be the sum of individual utilities.
In particular, we study subadditive monotone (SM) utility functions and prove that it generalizes the HUIM problem mentioned above.
Turning to algorithms, existing HUIM methods use upper bounds like 'Transaction Weighted Utility' and 'Exact-Utility, Remaining Utility' for efficient search-space exploration.
We derive analogous and tighter upper-bounds for SM utility functions and explain how existing HUIM algorithms of different classes can be adapted using our upper bound.
We experimentally compare adaptations of some of the latest algorithms and point out some caveats that should be kept in mind when handling general utility functions.
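For reference, the classic additive utility and the Transaction-Weighted Utility bound that HUIM algorithms build on can be sketched as follows (toy data; the paper's analogous bounds for subadditive monotone utilities are tighter):

```python
def itemset_utility(itemset, db):
    """Classic HUIM utility: the sum of item utilities over every
    transaction containing the whole itemset. Each transaction is a
    dict mapping item -> utility of that item in the transaction."""
    s = set(itemset)
    return sum(sum(t[i] for i in s) for t in db if s <= t.keys())

def twu(itemset, db):
    """Transaction-Weighted Utility: the total utility of every
    transaction containing the itemset. It upper-bounds the utility of
    the itemset and all its supersets, which enables search-space pruning."""
    s = set(itemset)
    return sum(sum(t.values()) for t in db if s <= t.keys())

db = [{"a": 5, "b": 2, "c": 1},
      {"a": 3, "c": 6},
      {"b": 4, "c": 2}]
u = itemset_utility({"a", "c"}, db)   # (5+1) + (3+6) = 15
bound = twu({"a", "c"}, db)           # 8 + 9 = 17 >= u
```

Any candidate whose TWU falls below the threshold can be pruned together with all its supersets, which is the pruning pattern the generalized bounds must preserve.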
Acute kidney injury (AKI) in critically ill patients is associated with significant morbidity and mortality.
Development of novel methods to identify patients with AKI earlier will allow for testing of novel strategies to prevent or reduce the complications of AKI.
We developed data-driven prediction models to estimate the risk of new AKI onset.
We generated models from clinical notes within the first 24 hours following intensive care unit (ICU) admission extracted from Medical Information Mart for Intensive Care III (MIMIC-III).
From the clinical notes, we generated clinically meaningful word and concept representations and embeddings, respectively.
Five supervised learning classifiers and a knowledge-guided deep learning architecture were used to construct the prediction models.
The best configuration yielded a competitive AUC of 0.779.
Our work suggests that natural language processing of clinical notes can be applied to assist clinicians in identifying the risk of incident AKI onset in critically ill patients upon admission to the ICU.
The complexity of the graph isomorphism problem for trapezoid graphs has been open for over a decade.
This paper shows that the problem is GI-complete.
More precisely, we show that the graph isomorphism problem is GI-complete for comparability graphs of partially ordered sets with interval dimension 2 and height 3.
In contrast, the problem is known to be solvable in polynomial time for comparability graphs of partially ordered sets with interval dimension at most 2 and height at most 2.
The log-rank conjecture is one of the fundamental open problems in communication complexity.
It speculates that the deterministic communication complexity of any two-party function is equal to the log of the rank of its associated matrix, up to polynomial factors.
Despite much research, we still know very little about this conjecture.
Recently, there has been renewed interest in this conjecture and its relations to other fundamental problems in complexity theory.
This survey describes some of the recent progress, and hints at potential directions for future research.
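For reference, the conjecture states:

```latex
% Log-rank conjecture (Lov\'asz--Saks): for every two-party Boolean
% function f with communication matrix M_f,
D(f) \;\leq\; \big(\log \operatorname{rank}(M_f)\big)^{O(1)} ,
% where D(f) is the deterministic communication complexity and the
% rank is taken over the reals. The converse direction,
% \log \operatorname{rank}(M_f) \leq D(f), always holds, since a
% protocol with c bits partitions M_f into at most 2^c monochromatic
% rectangles.
```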
Despite the advancements in search engine features, ranking methods, technologies, and the availability of programmable APIs, current-day open-access digital libraries still rely on crawl-based approaches for acquiring their underlying document collections.
In this paper, we propose a novel search-driven framework for acquiring documents for scientific portals.
Within our framework, publicly-available research paper titles and author names are used as queries to a Web search engine.
Next, research papers and sources of research papers are identified from the search results using accurate classification modules.
Our experiments highlight not only the performance of our individual classifiers but also the effectiveness of our overall Search/Crawl framework.
Indeed, we were able to obtain approximately 0.665 million research documents through our fully-automated framework using about 0.076 million queries.
These prolific results position Web search as an effective alternative to crawl methods for acquiring both the actual documents and seed URLs for future crawls.
We introduce EigenRec, a versatile and efficient Latent-Factor framework for Top-N Recommendations that includes the well-known PureSVD algorithm as a special case.
EigenRec builds a low-dimensional model of an inter-item proximity matrix that combines a similarity component with a scaling operator designed to control the influence of prior item popularity on the final model.
Seeing PureSVD within our framework provides intuition about its inner workings, exposes its inherent limitations, and also, paves the path towards painlessly improving its recommendation performance.
A comprehensive set of experiments on the MovieLens and Yahoo datasets, based on widely applied performance metrics, indicates that EigenRec outperforms several state-of-the-art algorithms in terms of Standard and Long-Tail recommendation accuracy, exhibiting low susceptibility to sparsity, even in its most extreme manifestations -- the Cold-Start problems.
At the same time, EigenRec has an attractive computational profile and can be readily applied in large-scale recommendation settings.
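The framework can be sketched as follows (our own simplified parameterization with a single popularity-scaling exponent d and cosine similarity; the paper's similarity components and scaling operator are more general):

```python
import numpy as np

rng = np.random.default_rng(1)
R = (rng.random((20, 12)) < 0.3).astype(float)   # toy user-item matrix

def eigenrec_scores(R, d=0.4, f=5):
    """EigenRec-style model: cosine inter-item similarity rescaled by
    item-popularity norms raised to exponent d, truncated to the top-f
    eigenvectors. With d = 1 the scaling cancels the cosine
    normalization, recovering the raw R^T R proximity used by PureSVD."""
    norms = np.linalg.norm(R, axis=0) + 1e-12
    cos = (R.T @ R) / np.outer(norms, norms)     # similarity component
    scale = np.outer(norms ** d, norms ** d)     # scaling operator
    A = scale * cos                              # inter-item proximity
    w, V = np.linalg.eigh(A)                     # ascending eigenvalues
    Vf = V[:, -f:]                               # top-f eigenvectors
    return R @ Vf @ Vf.T                         # Top-N scores per user

scores = eigenrec_scores(R)
```

Tuning d between 0 and 1 interpolates between popularity-insensitive and popularity-dominated proximity, which is how the scaling operator controls the popularity prior.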
We propose a bridge between functional and object-oriented programming in the first-year curriculum.
Traditionally, curricula that begin with functional programming transition to a professional, usually object-oriented, language in the second course.
This transition poses obstacles for students, and often results in confusing the details of development environments, syntax, and libraries with the fundamentals of OO programming that the course should focus on.
Instead, we propose to begin the second course with a sequence of custom teaching languages which minimize the transition from the first course, and allow students to focus on core ideas.
After working through the sequence of pedagogical languages, we then transition to Java, at which point students have a strong command of the basic principles.
We have 3 years of experience with this course, with notable success.
Word2vec (Mikolov et al., 2013) has proven to be successful in natural language processing by capturing the semantic relationships between different words.
Built on top of single-word embeddings, paragraph vectors (Le and Mikolov, 2014) find fixed-length representations for pieces of text with arbitrary lengths, such as documents, paragraphs, and sentences.
In this work, we propose a novel interpretation for neural-network-based paragraph vectors by developing an unsupervised generative model whose maximum likelihood solution corresponds to traditional paragraph vectors.
This probabilistic formulation allows us to go beyond point estimates of parameters and to perform Bayesian posterior inference.
We find that the entropy of paragraph vectors decreases with the length of documents, and that information about posterior uncertainty improves performance in supervised learning tasks such as sentiment analysis and paraphrase detection.
The spread of ideas in the scientific community is often viewed as a competition, in which good ideas spread further because of greater intrinsic fitness, and publication venue and citation counts correlate with importance and impact.
However, relatively little is known about how structural factors influence the spread of ideas, and specifically how where an idea originates might influence how it spreads.
Here, we investigate the role of faculty hiring networks, which embody the set of researcher transitions from doctoral to faculty institutions, in shaping the spread of ideas in computer science, and the importance of where in the network an idea originates.
We consider comprehensive data on the hiring events of 5032 faculty at all 205 Ph.D.-granting departments of computer science in the U.S. and Canada, and on the timing and titles of 200,476 associated publications.
Analyzing five popular research topics, we show empirically that faculty hiring can and does facilitate the spread of ideas in science.
Having established such a mechanism, we then analyze its potential consequences using epidemic models to simulate the generic spread of research ideas and quantify the impact of where an idea originates on its longterm diffusion across the network.
We find that research from prestigious institutions spreads more quickly and completely than work of similar quality originating from less prestigious institutions.
Our analyses establish the theoretical trade-offs between university prestige and the quality of ideas necessary for efficient circulation.
Our results establish faculty hiring as an underlying mechanism that drives the persistent epistemic advantage observed for elite institutions, and provide a theoretical lower bound for the impact of structural inequality in shaping the spread of ideas in science.
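The generic spreading mechanism can be illustrated with a simple SI (susceptible-infected) simulation on a directed hiring network (a toy sketch, not the paper's calibrated epidemic model; the network and parameters are invented):

```python
import random

def si_spread(adj, seed_node, beta, steps, rng):
    """SI epidemic on a directed hiring network: a department that has
    adopted the idea transmits it along each outgoing hiring edge with
    probability beta per step; adopters never recover."""
    infected = {seed_node}
    for _ in range(steps):
        new = set()
        for u in infected:
            for v in adj.get(u, []):
                if v not in infected and rng.random() < beta:
                    new.add(v)
        infected |= new
    return infected

# Toy prestige chain: department 0 places faculty at 1, 1 at 2, and so on,
# so an idea seeded at the top of the hierarchy can reach every node.
adj = {i: [i + 1] for i in range(9)}
reached = si_spread(adj, 0, beta=1.0, steps=9, rng=random.Random(0))
```

Seeding the same process at nodes with few outgoing hiring edges reaches far fewer departments, which is the structural asymmetry the paper quantifies on real hiring data.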
Purpose.
To evaluate the interference immunity of data exchange by spread-spectrum signals with variable entropy for telemetric information exchange with autonomous mobile robots.
Methodology.
The results have been obtained by the theoretical investigations and have been confirmed by the modeling experiments.
Findings.
The interference immunity, in the form of the dependence of bit error probability on the normalized signal/noise ratio, has been obtained for data exchange by spread-spectrum signals with variable entropy. It has been proved that the interference immunity factor (the needed normalized signal/noise ratio) is at least 2 dB better, under the condition of equal time complexity, compared with correlation processing methods for orthogonal signals.
Originality.
For the first time the interference immunity in form of dependence of bit error probability on normalized signal/noise ratio of the data exchange by spread spectrum signals with variable entropy has been obtained.
Practical value.
The obtained results prove the feasibility of using variable entropy spread spectrum signals data exchange method in the distributed telemetric information processing systems in specific circumstances.
We study two-player games played on the infinite graph of sentential forms induced by a context-free grammar (that comes with an ownership partitioning of the non-terminals).
The winning condition is inclusion of the derived terminal word in the language of a finite automaton.
Our contribution is a new algorithm to decide the winning player and to compute her strategy.
It is based on a novel representation of all plays starting in a non-terminal.
The representation uses the domain of Boolean formulas over the transition monoid of the target automaton.
The elements of the monoid are essentially procedure summaries, and our approach can be seen as the first summary-based algorithm for the synthesis of recursive programs.
We show that our algorithm has optimal (doubly exponential) time complexity, that it is compatible with recent antichain optimizations, and that it admits a lazy evaluation strategy.
Our preliminary experiments show encouraging results, indicating a speed-up of three orders of magnitude over a competitor.
Whereas it is believed that techniques such as Adam, batch normalization and, more recently, SeLU nonlinearities "solve" the exploding gradient problem, we show that this is not the case in general and that in a range of popular MLP architectures, exploding gradients exist and that they limit the depth to which networks can be effectively trained, both in theory and in practice.
We explain why exploding gradients occur and highlight the *collapsing domain problem*, which can arise in architectures that avoid exploding gradients.
ResNets have significantly lower gradients and thus can circumvent the exploding gradient problem, enabling the effective training of much deeper networks.
We show this is a direct consequence of the Pythagorean equation.
By noticing that *any neural network is a residual network*, we devise the *residual trick*, which reveals that introducing skip connections simplifies the network mathematically, and that this simplicity may be the major cause for their success.
Solar energy generation requires efficient monitoring and management in moving towards technologies for net-zero energy buildings.
This paper presents a dependable control system based on the Internet of Things (IoT) to control and manage the energy flow of renewable energy collected by solar panels within a microgrid.
Data for optimal control include not only measurements from local sensors but also meteorological information retrieved in real-time from online sources.
For system fault tolerance across the whole distributed control system featuring multiple controllers, dependable controllers are developed to control and optimise the tracking performance of photovoltaic arrays to maximally capture solar radiation and maintain system resilience and reliability in real time despite failures of one or more redundant controllers due to a problem with communication, hardware or cybersecurity.
Experimental results have been obtained to evaluate the validity of the proposed approach.
A cryptographic hash function is a deterministic procedure that compresses an arbitrary block of numerical data and returns a fixed-size bit string.
Many hash functions exist, such as MD5, HAVAL, and SHA.
It has been reported that these hash functions are no longer secure.
Our work focuses on the construction of a new hash function based on the composition of functions.
The construction uses the NP-completeness of three-dimensional contingency tables and relaxes the constraint that a hash function should also be a compression function.
Detection of Alzheimer's Disease (AD) from neuroimaging data such as MRI through machine learning has been a subject of intense research in recent years.
The recent success of deep learning in computer vision has advanced such research further.
However, common limitations of such algorithms are their reliance on a large number of training images and the need for careful optimization of the deep network architecture.
In this paper, we attempt solving these issues with transfer learning, where state-of-the-art architectures such as VGG and Inception are initialized with pre-trained weights from large benchmark datasets consisting of natural images, and the fully-connected layer is re-trained with only a small number of MRI images.
We employ image entropy to select the most informative slices for training.
Through experimentation on the OASIS MRI dataset, we show that with training size almost 10 times smaller than the state-of-the-art, we reach comparable or even better performance than current deep-learning based methods.
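The entropy-based slice selection described above can be sketched with plain NumPy; the function names and the [0, 1] intensity range are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def slice_entropy(img, bins=32):
    """Shannon entropy of a 2-D slice's intensity histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def select_informative_slices(volume, k=2):
    """Return indices of the k slices with highest entropy.

    `volume` is a hypothetical (n_slices, H, W) array with values in [0, 1].
    """
    scores = [slice_entropy(s) for s in volume]
    return sorted(np.argsort(scores)[-k:].tolist())
```

The selected slices would then be fed to the re-trained fully-connected layer of the pre-trained network.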
Cloud computing is a cost-effective way for start-up life sciences laboratories to store and manage their data.
However, in many instances the data stored over the cloud can be redundant, which makes cloud-based data management inefficient and costly, since one has to pay for every byte of data stored over the cloud.
Here, we tested efficient management of data generated by an electron cryo microscopy (cryoEM) lab on a cloud-based environment.
The test data was obtained from cryoEM repository EMPIAR.
All the images were subjected to an in-house parallelized version of principal component analysis.
An efficient cloud-based MapReduce modality was used for parallelization.
We showed that large data in order of terabytes could be efficiently reduced to its minimal essential self in a cost-effective scalable manner.
Furthermore, using spot instances on Amazon EC2 was shown to reduce costs by about 27 percent.
This approach could be scaled to data of any large volume and type.
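The MapReduce-style decomposition of principal component analysis described above can be sketched as follows; this is a minimal single-machine illustration of the idea (accumulating per-chunk sufficient statistics), not the authors' cloud implementation:

```python
import numpy as np

def chunked_pca(chunks, n_components=2):
    """PCA via per-chunk partial sums (the 'map' step) combined in a
    single 'reduce' step, so the full data never sits in memory at once.

    `chunks` is an iterable of (n_i, d) arrays; names are illustrative.
    """
    n, s, ss = 0, None, None
    for X in chunks:                      # map: accumulate sufficient statistics
        n += X.shape[0]
        s = X.sum(axis=0) if s is None else s + X.sum(axis=0)
        ss = X.T @ X if ss is None else ss + X.T @ X
    mean = s / n
    cov = ss / n - np.outer(mean, mean)   # reduce: covariance from the sums
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order[:n_components]], eigvals[order[:n_components]]
```

Because only the d-by-d sums cross chunk boundaries, terabyte-scale image sets can be reduced chunk by chunk.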
Sentence simplification aims to simplify the content and structure of complex sentences, and thus make them easier to interpret for human readers, and easier to process for downstream NLP applications.
Recent advances in neural machine translation have paved the way for novel approaches to the task.
In this paper, we adapt an architecture with augmented memory capacities called Neural Semantic Encoders (Munkhdalai and Yu, 2017) for sentence simplification.
Our experiments demonstrate the effectiveness of our approach on different simplification datasets, both in terms of automatic evaluation measures and human judgments.
Change detection is one of the most challenging issues when analyzing remotely sensed images.
Comparing several multi-date images acquired through the same kind of sensor is the most common scenario.
Conversely, designing robust, flexible and scalable algorithms for change detection becomes even more challenging when the images have been acquired by two different kinds of sensors.
This situation arises in case of emergency under critical constraints.
This paper presents, to the best of authors' knowledge, the first strategy to deal with optical images characterized by dissimilar spatial and spectral resolutions.
Typical considered scenarios include change detection between panchromatic or multispectral and hyperspectral images.
The proposed strategy consists of a 3-step procedure: i) inferring a high spatial and spectral resolution image by fusing the two observed images, one characterized by a low spatial resolution and the other by a low spectral resolution; ii) predicting two images with, respectively, the same spatial and spectral resolutions as the observed images by degrading the fused one; and iii) applying a decision rule to each pair of observed and predicted images with the same spatial and spectral resolutions to identify changes.
The performance of the proposed framework is evaluated on real images with simulated realistic changes.
In this paper we address the challenge of assessing the quality of Wikipedia pages using scores derived from edit contribution and contributor authoritativeness measures.
The hypothesis is that pages with significant contributions from authoritative contributors are likely to be high-quality pages.
Contributions are quantified using edit longevity measures and contributor authoritativeness is scored using centrality metrics in either the Wikipedia talk or co-author networks.
The results suggest that it is useful to take into account the contributor authoritativeness when assessing the information quality of Wikipedia content.
The percentile visualization of the quality scores provides some insights about the anomalous articles, and can be used to help Wikipedia editors to identify Start and Stub articles that are of relatively good quality.
We propose Hilbert transform (HT) and analytic signal (AS) construction for signals over graphs.
This is motivated by the popularity of HT, AS, and modulation analysis in conventional signal processing, and the observation that complementary insight is often obtained by viewing conventional signals in the graph setting.
Our definitions of HT and AS use a conjugate-symmetry-like property exhibited by the graph Fourier transform (GFT).
We show that a real graph signal (GS) can be represented using a smaller number of GFT coefficients than the signal length.
We show that the graph HT (GHT) and graph AS (GAS) operations are linear and shift-invariant over graphs.
Using the GAS, we define amplitude, phase, and frequency modulations for a GS.
Further, we use convex optimization to develop an alternative definition of envelope for a GS.
We illustrate the proposed concepts by showing applications to synthesized and real-world signals.
For example, we show that the GHT is suitable for anomaly detection/analysis over networks and that GAS reveals complementary information in speech signals.
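As background, the conjugate-symmetry construction reduces to the classical analytic signal when the graph is a directed cycle, where the GFT coincides with the DFT; a minimal sketch of that special case (not the paper's general graph definition):

```python
import numpy as np

def analytic_signal(x):
    """Classical analytic signal via the FFT: zero out negative
    frequencies and double the positive ones.  On a directed cycle
    graph the GFT reduces to the DFT, so this is the cycle-graph
    special case of a graph analytic signal.
    """
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)
```

The imaginary part of the result is the Hilbert transform, and its magnitude gives the amplitude envelope used in modulation analysis.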
For any company, multiple channels are available for reaching a population in order to market its products.
Some of the most well-known channels are (a) mass media advertisement, (b) recommendations using social advertisement, and (c) viral marketing using social networks.
The company would want to maximize its reach while also accounting for simultaneous marketing of competing products, where the product marketings may not be independent.
In this direction, we propose and analyze a multi-featured generalization of the classical linear threshold model.
We hence develop a framework for integrating the considered marketing channels into the social network, and an approach for allocating budget among these channels.
A website can be designed easily, but achieving efficient user navigation is not an easy task, since user behavior keeps changing and the developer's view is often quite different from what users want; one way to improve navigation is to reorganize the website structure.
For this reorganization, the proposed strategy uses the farthest-first traversal clustering algorithm to cluster on two numeric parameters, and the Apriori algorithm to find users' frequent traversal paths.
Our aim is to perform the reorganization with as few changes to the website structure as possible.
Latent periodic elements in genomes play important roles in genomic functions.
Many complex periodic elements in genomes are difficult to detect with commonly used digital signal processing (DSP) methods.
We present a novel method to compute the periodic power spectrum of a DNA sequence based on the nucleotide distributions on periodic positions of the sequence.
The method directly calculates the full periodic spectrum of a DNA sequence, rather than the frequency spectrum given by the Fourier transform.
The magnitude of the periodic power spectrum reflects the strength of the periodicity signals, thus, the algorithm can capture all the latent periodicities in DNA sequences.
We apply this method on detection of latent periodicities in different genome elements, including exons and microsatellite DNA sequences.
The results show that the method minimizes the impact of spectral leakage, captures a much broader range of latent periodicities in genomes, and outperforms the conventional Fourier transform.
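A simplified version of a distribution-based periodicity measure might look like the following; the squared-deviation score used here is an illustrative statistic, not necessarily the paper's exact formula:

```python
import numpy as np

def periodic_power_spectrum(seq, max_period=10):
    """Periodicity strength for each candidate period p, computed
    directly from nucleotide distributions at positions i mod p
    (a simplified sketch of the distribution-based approach).
    """
    alphabet = "ACGT"
    n = len(seq)
    background = np.array([seq.count(b) / n for b in alphabet])
    spectrum = {}
    for p in range(2, max_period + 1):
        score = 0.0
        for r in range(p):
            sub = seq[r::p]                      # positions congruent to r mod p
            freq = np.array([sub.count(b) / len(sub) for b in alphabet])
            score += ((freq - background) ** 2).sum()
        spectrum[p] = score / p
    return spectrum
```

On an exon-like repeat such as "ATG" repeated many times, the score peaks at period 3, the well-known codon periodicity.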
Spatiotemporal forecasting has various applications in the neuroscience, climate, and transportation domains.
Traffic forecasting is one canonical example of such a learning task.
The task is challenging due to (1) complex spatial dependency on road networks, (2) non-linear temporal dynamics with changing road conditions and (3) inherent difficulty of long-term forecasting.
To address these challenges, we propose to model the traffic flow as a diffusion process on a directed graph and introduce Diffusion Convolutional Recurrent Neural Network (DCRNN), a deep learning framework for traffic forecasting that incorporates both spatial and temporal dependency in the traffic flow.
Specifically, DCRNN captures the spatial dependency using bidirectional random walks on the graph, and the temporal dependency using the encoder-decoder architecture with scheduled sampling.
We evaluate the framework on two real-world large scale road network traffic datasets and observe consistent improvement of 12% - 15% over state-of-the-art baselines.
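The diffusion-convolution operation at the core of DCRNN can be sketched as below (forward random walk only; the paper also uses the reverse walk and learns the weights rather than fixing them):

```python
import numpy as np

def diffusion_conv(adj, x, theta):
    """One diffusion-convolution step: a weighted sum of K random-walk
    matrix powers applied to the graph signal.
    """
    d = adj.sum(axis=1, keepdims=True)
    walk = adj / np.where(d > 0, d, 1.0)      # row-normalized transition matrix
    out = np.zeros_like(x, dtype=float)
    power = np.eye(adj.shape[0])
    for t in theta:                           # out = sum_k theta_k * W^k x
        out += t * (power @ x)
        power = power @ walk
    return out
```

In DCRNN this operation replaces the matrix multiplications inside a GRU cell, so spatial and temporal dependencies are modeled jointly.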
Surface parameterizations have been widely applied to computer graphics and digital geometry processing.
In this paper, we propose a novel stretch energy minimization (SEM) algorithm for the computation of equiareal parameterizations of simply connected open surfaces with a very small area distortion and a highly improved computational efficiency.
In addition, the existence of nontrivial limit points of the SEM algorithm is guaranteed under some mild assumptions of the mesh quality.
Numerical experiments indicate that the efficiency, accuracy, and robustness of the proposed SEM algorithm outperform other state-of-the-art algorithms.
Applications of the SEM on surface remeshing and surface registration for simply connected open surfaces are demonstrated thereafter.
Thanks to the SEM algorithm, the computations for these applications can be carried out efficiently and robustly.
Over the past two decades, High-Performance Computing (HPC) communities have developed many models for delivering education aiming to help students understand and harness the power of parallel and distributed computing.
Most of these courses either lack a hands-on component or heavily focus on theoretical characterization behind complex algorithms.
To bridge the gap between application and scientific theory, NVIDIA Deep Learning Institute (DLI) (www.nvidia.com/dli) has designed an on-line education and training platform that helps students, developers, and engineers solve real-world problems in a wide range of domains using deep learning and accelerated computing.
DLI's accelerated computing course content starts with the fundamentals of accelerating applications with CUDA and OpenACC in addition to other courses in training and deploying neural networks for deep learning.
Advanced and domain-specific courses in deep learning are also available.
The online platform enables students to use the latest AI frameworks, SDKs, and GPU-accelerated technologies on fully-configured GPU servers in the cloud so the focus is more on learning and less on environment setup.
Students are offered project-based assessment and certification at the end of some courses.
To support academics and university researchers teaching accelerated computing and deep learning, the DLI University Ambassador Program enables educators to teach free DLI courses to university students, faculty, and researchers.
Fairness in algorithmic decision-making processes is attracting increasing concern.
When an algorithm is applied to human-related decision-making, an estimator that solely optimizes predictive power can learn biases present in the existing data, which motivates the notion of fairness in machine learning. While several different notions of fairness have been studied in the literature, few studies examine how these notions affect individuals.
We compare several policies induced by well-known fairness criteria, including the color-blind (CB), demographic parity (DP), and equalized odds (EO) criteria.
We show that the EO is the only criterion among them that removes group-level disparity.
Empirical studies on the social welfare and disparity of these policies are conducted.
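The DP and EO criteria discussed above can be made concrete with small helper functions for the corresponding disparity gaps (illustrative definitions for binary predictions and two groups, not the paper's exact evaluation code):

```python
import numpy as np

def demographic_parity_gap(pred, group):
    """|P(pred=1 | group=0) - P(pred=1 | group=1)|."""
    return abs(pred[group == 0].mean() - pred[group == 1].mean())

def equalized_odds_gap(pred, label, group):
    """Max over y of |P(pred=1 | y, group=0) - P(pred=1 | y, group=1)|."""
    gaps = []
    for y in (0, 1):
        m0 = pred[(label == y) & (group == 0)].mean()
        m1 = pred[(label == y) & (group == 1)].mean()
        gaps.append(abs(m0 - m1))
    return max(gaps)
```

A perfectly accurate predictor has zero EO gap by construction, yet can still show a DP gap whenever the groups' base rates differ, which illustrates why the two criteria induce different policies.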
Most previous works on opinion modeling lack the simultaneous study of individual mental activity and group behavior.
Motivated by attitude change theory, group behavior theory, and evolutionary game theory from sociology and psychology, this paper proposes an agent-based online opinion formation model.
In this model, there are three factors influencing the persuasion process, including credibility of the leaders, characteristic of the recipient, and group environment.
The proposed model is applied to Twitter to analyze the influence of topic type, parameter changing, and opinion leaders on opinion formation.
Experimental results show that the opinion evolution of controversial topics exhibits greater uncertainty and sustainability.
The ratio of benefit to cost has a significant impact on opinion formation and a moderate ratio will result in the longest relaxation time or most unified global opinions.
Furthermore, celebrities with a large number of followers are more capable of influencing public opinion than experts.
This paper enriches research on opinion formation modeling, and the results provide managerial insights for businesses on public relations and market prediction.
GPUs are dedicated processors used for complex calculations and simulations and they can be effectively used for tropical algebra computations.
Tropical algebra is based on max-plus algebra and min-plus algebra.
In this paper we propose and design a library based on tropical algebra that provides standard vector and matrix operations, namely Basic Tropical Algebra Subroutines (BTAS).
The BTAS library is tested by implementing a sequential version of the Floyd-Warshall algorithm on the CPU and a parallel version on the GPU.
The developed library delivered substantially better results on a less expensive GPU compared with the same computation on the CPU.
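A minimal min-plus (tropical) routine of the kind BTAS provides, together with its use for all-pairs shortest paths, could look like this in plain NumPy (a CPU sketch; the library itself targets the GPU):

```python
import numpy as np

def minplus_matmul(a, b):
    """Tropical (min-plus) matrix product: C[i,j] = min_k (A[i,k] + B[k,j])."""
    return np.min(a[:, :, None] + b[None, :, :], axis=1)

def shortest_paths(w):
    """All-pairs shortest paths by repeated tropical squaring of the
    weight matrix (same result as the Floyd-Warshall algorithm).
    """
    n = w.shape[0]
    d = w.copy()
    np.fill_diagonal(d, 0.0)
    steps = 1
    while steps < n:            # d <- d (x) d until transitive closure
        d = minplus_matmul(d, d)
        steps *= 2
    return d
```

The tropical product is exactly the kernel that parallelizes well on a GPU, since each C[i,j] is an independent reduction.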
The potential number of drug like small molecules is estimated to be between 10^23 and 10^60 while current databases of known compounds are orders of magnitude smaller with approximately 10^8 compounds.
This discrepancy has led to an interest in generating virtual libraries using hand crafted chemical rules and fragment based methods to cover a larger area of chemical space and generate chemical libraries for use in in silico drug discovery endeavors.
Here it is explored to what extent a recurrent neural network with long short term memory cells can figure out sensible chemical rules and generate synthesizable molecules by being trained on existing compounds encoded as SMILES.
The networks can to a high extent generate novel, but chemically sensible molecules.
The properties of the molecules are tuned by training on two different datasets consisting of fragment like molecules and drug like molecules.
The produced molecules and the training databases have very similar distributions of molar weight, predicted logP, number of hydrogen bond acceptors and donors, number of rotatable bonds and topological polar surface area when compared to their respective training sets.
The compounds are in most cases synthesizable, as assessed with the SA score and Wiley ChemPlanner.
Machine vision applications are low-cost, high-precision measurement systems frequently used in production lines.
With these systems, which provide contactless control and measurement, production facilities are able to reach high production volumes without errors.
Machine vision operations such as product counting, error control, dimension measurement can be performed through a camera.
In this paper, a machine vision application is proposed, which can perform object-independent product counting.
The proposed approach is based on Otsu thresholding and Hough transformation and performs automatic counting independently of product type and color.
A single camera is used in the system.
Through this camera, images of the products passing along a conveyor are taken, and various image processing algorithms are applied to them.
Using images obtained from a real experimental setup, a real-time machine vision application was implemented.
As a result of the experimental studies performed, it has been determined that the proposed approach gives fast, accurate and reliable results.
Recently, the rapid development of word embedding and neural networks has brought new inspiration to various NLP and IR tasks.
In this paper, we describe a staged hybrid model combining Recurrent Convolutional Neural Networks (RCNN) with highway layers.
The highway network module, incorporated in the middle, takes the output of the bi-directional Recurrent Neural Network (Bi-RNN) module from the first stage and provides the input to the Convolutional Neural Network (CNN) module in the last stage.
The experiment shows that our model outperforms common neural network models (CNN, RNN, Bi-RNN) on a sentiment analysis task.
Moreover, an analysis of how sequence length influences the RCNN with highway layers shows that our model learns good representations for long texts.
We develop a static complexity analysis for a higher-order functional language with structural list recursion.
The complexity of an expression is a pair consisting of a cost and a potential.
The former is defined to be the size of the expression's evaluation derivation in a standard big-step operational semantics.
The latter is a measure of the "future" cost of using the value of that expression.
A translation function tr maps target expressions to complexities.
Our main result is the following Soundness Theorem: If t is a term in the target language, then the cost component of tr(t) is an upper bound on the cost of evaluating t. The proof of the Soundness Theorem is formalized in Coq, providing certified upper bounds on the cost of any expression in the target language.
Existing works for extracting navigation objects from webpages focus on navigation menus, so as to reveal the information architecture of the site.
However, web 2.0 sites such as social networks and e-commerce portals are making the content structure of a web site increasingly difficult to understand.
Dynamic and personalized elements, such as top stories and recommended lists in a webpage, are vital to understanding the dynamic nature of web 2.0 sites.
To better understand the content structure in web 2.0 sites, in this paper we propose a new extraction method for navigation objects in a webpage.
Our method will extract not only the static navigation menus, but also the dynamic and personalized page-specific navigation lists.
Since the navigation objects in a webpage naturally come in blocks, we first cluster hyperlinks into different blocks by exploiting spatial locations of hyperlinks, the hierarchical structure of the DOM-tree and the hyperlink density.
Then we identify navigation objects from those blocks using the SVM classifier with novel features such as anchor text lengths etc.
Experiments on real-world data sets with webpages from various domains and styles verified the effectiveness of our method.
Normalization methods are a central building block in the deep learning toolbox.
They accelerate and stabilize training, while decreasing the dependence on manually tuned learning rate schedules.
When learning from multi-modal distributions, the effectiveness of batch normalization (BN), arguably the most prominent normalization method, is reduced.
As a remedy, we propose a more flexible approach: by extending the normalization to more than a single mean and variance, we detect modes of data on-the-fly, jointly normalizing samples that share common features.
We demonstrate that our method outperforms BN and other widely used normalization techniques in several experiments, including single and multi-task datasets.
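The core normalization step, given mode assignments, might be sketched as follows; the on-the-fly mode detection described above is replaced by fixed assignments for illustration:

```python
import numpy as np

def mode_normalize(x, assignments, eps=1e-5):
    """Normalize each sample with the mean/variance of its own mode
    rather than one global batch statistic (as plain BN would).
    """
    out = np.empty_like(x, dtype=float)
    for m in np.unique(assignments):
        idx = assignments == m
        mu = x[idx].mean(axis=0)
        var = x[idx].var(axis=0)
        out[idx] = (x[idx] - mu) / np.sqrt(var + eps)
    return out
```

With a single mode this reduces to standard batch normalization; with several modes, samples from different clusters of a multi-modal batch are no longer forced through one shared mean and variance.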
Cloud computing provides engineers or scientists a place to run complex computing tasks.
Finding a workflow's deployment configuration in a cloud environment is not easy.
Traditional workflow scheduling algorithms were based on some heuristics, e.g. reliability greedy, cost greedy, cost-time balancing, etc., or more recently, the meta-heuristic methods, such as genetic algorithms.
These methods are very slow and not suitable for rescheduling in the dynamic cloud environment.
This paper introduces RIOT (Randomized Instance Order Types), a stochastic method for workflow scheduling.
RIOT groups the tasks in the workflow onto virtual machines via a probability model and then uses an effective surrogate-based method to assess a large number of potential schedules.
Experiments on dozens of study cases showed that RIOT executes tens of times faster than traditional methods while generating results comparable to those of other methods.
Given a large population, it is an intensive task to gather individual preferences over a set of alternatives and arrive at an aggregate or collective preference of the population.
We show that the social network underlying the population can be harnessed to accomplish this task effectively, by sampling the preferences of a small subset of representative nodes.
We first develop a Facebook app to create a dataset consisting of preferences of nodes and the underlying social network, using which, we develop models that capture how preferences are distributed among nodes in a typical social network.
We hence propose an appropriate objective function for the problem of selecting best representative nodes.
We devise two algorithms, namely, Greedy-min which provides a performance guarantee for a wide class of popular voting rules, and Greedy-sum which exhibits excellent performance in practice.
We compare the performance of these proposed algorithms against random-polling and popular centrality measures, and provide a detailed analysis of the obtained results.
Our analysis suggests that selecting representatives using social network information is advantageous for aggregating preferences related to personal topics (e.g., lifestyle), while random polling with a reasonable sample size is good enough for aggregating preferences related to social topics (e.g., government policies).
On a constant quest for inspiration, designers can become more effective with tools that facilitate their creative process and let them overcome design fixation.
This paper explores the practicality of applying neural style transfer as an emerging design tool for generating creative digital content.
To this aim, the present work explores a well-documented neural style transfer algorithm (Johnson 2016) in four experiments on four relevant visual parameters: number of iterations, learning rate, total variation, content vs. style weight.
The results allow a pragmatic recommendation of parameter configuration (number of iterations: 200 to 300, learning rate: 2e-1 to 4e-1, total variation: 1e-4 to 1e-8, content weights vs. style weights: 50:100 to 200:100) that saves extensive experimentation time and lowers the technical entry barrier.
With this rule-of-thumb insight, visual designers can effectively apply deep learning to create artistic visual variations of digital content.
This could enable designers to leverage AI to create state-of-the-art design works.
In this correspondence, we illustrate among other things the use of the stationarity property of the set of capacity-achieving inputs in capacity calculations.
In particular, as a case study, we consider a bit-patterned media recording channel model and formulate new lower and upper bounds on its capacity that yield improvements over existing results.
Inspired by the observation that the new bounds are tight at low noise levels, we also characterize the capacity of this model as a series expansion in the low-noise regime.
The key to these results is the realization of stationarity in the supremizing input set in the capacity formula.
While the property is prevalent in capacity formulations in the ergodic-theoretic literature, we show that this realization is possible in the Shannon-theoretic framework where a channel is defined as a sequence of finite-dimensional conditional probabilities, by defining a new class of consistent stationary and ergodic channels.
A self-learning optimal control algorithm for episodic fixed-horizon manufacturing processes with time-discrete control actions is proposed and evaluated on a simulated deep drawing process.
The control model is built during consecutive process executions under optimal control via reinforcement learning, using the measured product quality as reward after each process execution.
Prior model formulation, which is required by state-of-the-art algorithms from model predictive control and approximate dynamic programming, is therefore obsolete.
This avoids several difficulties, namely in system identification, accurate modelling, and runtime complexity, that arise when dealing with processes subject to nonlinear dynamics and stochastic influences.
Instead of using pre-created process and observation models, value function-based reinforcement learning algorithms build functions of expected future reward, which are used to derive optimal process control decisions.
The expectation functions are learned online, by interacting with the process.
The proposed algorithm takes stochastic variations of the process conditions into account and is able to cope with partial observability.
A Q-learning-based method for adaptive optimal control of partially observable episodic fixed-horizon manufacturing processes is developed and studied.
The resulting algorithm is instantiated and evaluated by applying it to a simulated stochastic optimal control problem in metal sheet deep drawing.
We present a general framework and method for simultaneous detection and segmentation of an object in a video that moves (or comes into view of the camera) at some unknown time in the video.
The method is an online approach based on motion segmentation, and it operates under dynamic backgrounds caused by a moving camera or moving nuisances.
The goal of the method is to detect and segment the object as soon as it moves.
Due to stochastic variability in the video and unreliability of the motion signal, several frames are needed to reliably detect the object.
The method is designed to detect and segment with minimum delay subject to a constraint on the false alarm rate.
The method is derived as a problem of Quickest Change Detection.
Experiments on a dataset show the effectiveness of our method in minimizing detection delay subject to false alarm constraints.
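Quickest change detection is classically instantiated by the CUSUM procedure; a textbook sketch for a known Gaussian mean shift (not the paper's video-specific detector) looks like this:

```python
import numpy as np

def cusum(samples, pre_mean, post_mean, sigma, threshold):
    """CUSUM statistic for a mean shift in Gaussian noise; returns the
    first index where the statistic crosses the threshold, or -1.
    """
    llr = (post_mean - pre_mean) / sigma**2 * (
        samples - (pre_mean + post_mean) / 2.0
    )                                   # per-sample log-likelihood ratio
    s = 0.0
    for i, l in enumerate(llr):
        s = max(0.0, s + l)             # reset at zero, accumulate evidence
        if s >= threshold:
            return i
    return -1
```

The threshold directly trades detection delay against the false alarm rate, which is exactly the constraint structure of the minimum-delay formulation above.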
Coronary heart disease is one of the leading causes of mortality in the world, often due to plaque burden inside the arteries.
Intravascular Ultrasound (IVUS) has been recognized as a powerful imaging technology that captures real-time, high-resolution images of the coronary arteries and can be used to analyze these plaques.
IVUS segmentation involves the extraction of two arterial wall components, namely the lumen and the media.
In this paper, we investigate the effectiveness of Convolutional Neural Networks, including U-Net, for segmenting ultrasound scans of arteries.
In particular, the proposed segmentation network was built on the U-Net with a VGG16 encoder.
Experiments evaluating the proposed segmentation architecture show promising quantitative and qualitative results.
Imitation learning has traditionally been applied to learn a single task from demonstrations thereof.
The requirement of structured and isolated demonstrations limits the scalability of imitation learning approaches as they are difficult to apply to real-world scenarios, where robots have to be able to execute a multitude of tasks.
In this paper, we propose a multi-modal imitation learning framework that is able to segment and imitate skills from unlabelled and unstructured demonstrations by learning skill segmentation and imitation learning jointly.
The extensive simulation results indicate that our method can efficiently separate the demonstrations into individual skills and learn to imitate them using a single multi-modal policy.
The video of our experiments is available at http://sites.google.com/view/nips17intentiongan.
In this paper we analyze the Friedkin-Johnsen model of opinions when the coefficients weighting the agent susceptibilities to interpersonal influence approach 1.
We will show that in this case, under suitable assumptions, the model converges to a quasi-consensus condition among the agents.
In general, the achieved consensus value will differ from the one obtained by the corresponding DeGroot model.
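The limit behaviour described above can be reproduced with a small simulation of the Friedkin-Johnsen update x(k+1) = λWx(k) + (1−λ)x(0), a sketch that assumes a single scalar susceptibility λ shared by all agents and a row-stochastic influence matrix W (the paper treats per-agent coefficients):

```python
def fj_step(x, x0, W, lam):
    """One Friedkin-Johnsen update: x' = lam*W@x + (1-lam)*x0,
    with a shared scalar susceptibility lam for simplicity."""
    n = len(x)
    return [lam * sum(W[i][j] * x[j] for j in range(n)) + (1 - lam) * x0[i]
            for i in range(n)]

def fj_fixed_point(x0, W, lam, iters=2000):
    """Iterate the update to (numerical) convergence; the map is a
    contraction for lam < 1, so iteration converges."""
    x = list(x0)
    for _ in range(iters):
        x = fj_step(x, x0, W, lam)
    return x
```

With W the uniform averaging matrix on two agents and initial opinions 0 and 1, the steady-state spread equals 1−λ, so opinions approach a quasi-consensus as λ approaches 1, illustrating the phenomenon the abstract analyzes.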
The steady increase in the volume of indicators of compromise (IoC) as well as their volatile nature makes their processing challenging.
Once compromised infrastructures are cleaned up, threat actors move on to other target infrastructures or simply change attack strategies.
To ease the evaluation of IoCs as well as to harness the combined analysis capabilities, threat intelligence sharing platforms were introduced in order to foster collaboration on a community level.
In this paper, the open-source threat intelligence platform MISP is used to implement and showcase a generic scoring model for decaying IoCs shared within MISP communities matching their heterogeneous objectives.
The model takes into account existing meta-information shared along with indicators of compromise, facilitating the decision-making process for machines with regard to the validity of the shared indicators of compromise.
The model is applied on common use-cases that are normally encountered during incident response.
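The kind of decay such a scoring model captures can be sketched as a score that starts at a base value and falls to zero over the indicator's lifetime; the parameterization below is illustrative, not MISP's exact implementation:

```python
def ioc_score(base_score, elapsed_days, lifetime_days, decay_speed):
    """Decayed IoC score (illustrative): starts at base_score, reaches
    0 at lifetime_days; decay_speed shapes the curve (larger values
    make the score drop faster early on)."""
    if elapsed_days >= lifetime_days:
        return 0.0
    return base_score * (1 - (elapsed_days / lifetime_days) ** (1 / decay_speed))
```

Different communities can then express their heterogeneous objectives by attaching different lifetimes and decay speeds to the same indicator type.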
Motivated by a project to create a system for people who are deaf or hard-of-hearing that would use automatic speech recognition (ASR) to produce real-time text captions of spoken English during in-person meetings with hearing individuals, we have augmented a transcript of the Switchboard conversational dialogue corpus with an overlay of word-importance annotations, with a numeric score for each word, to indicate its importance to the meaning of each dialogue turn.
Further, we demonstrate the utility of this corpus by training an automatic word importance labeling model; our best performing model has an F-score of 0.60 in an ordinal 6-class word-importance classification task with an agreement (concordance correlation coefficient) of 0.839 with the human annotators (agreement score between annotators is 0.89).
Finally, we discuss our intended future applications of this resource, particularly for the task of evaluating ASR performance, i.e. creating metrics that predict ASR-output caption text usability for DHH users better than Word Error Rate (WER).
Boolean satisfiability (SAT) has an extensive application domain in computer science, especially in electronic design automation applications.
Circuit synthesis, optimization, and verification problems can be solved by transforming original problems to SAT problems.
However, the SAT problem is known to be NP-complete, which means no efficient general algorithm for solving it is known.
Therefore, an efficient SAT solver to enhance the performance is always desired.
We propose a hardware acceleration method for SAT problems.
By surveying the properties of SAT problems and the decoding of low-density parity-check (LDPC) codes, a special class of error-correcting codes, we discover that both of them are constraint satisfaction problems.
The belief propagation algorithm has been successfully applied to the decoding of LDPC, and the corresponding decoder hardware designs are extensively studied.
Therefore, we propose a belief propagation based algorithm to solve SAT problems.
With this algorithm, the SAT solver can be accelerated by hardware.
A software simulator is implemented to verify the proposed algorithm and the performance improvement is estimated.
Our experimental results show that the time complexity does not increase with the size of the SAT problems, and the proposed method can achieve at least a 30x speedup compared to MiniSat.
In this paper we give a compact presentation of the theory of abstract spaces for convolutional codes and convolutional encoders, and show a connection between them that seems to be missing in the literature.
We use it for a short proof of two facts: the size of a convolutional encoder of a polynomial matrix is at least its inner degree, and the minimal encoder has the size of the external degree if the matrix is reduced.
Conference publications in computer science (CS) have attracted scholarly attention due to their unique status as a main research outlet unlike other science fields where journals are dominantly used for communicating research findings.
One frequent research question has been how different conference and journal publications are, considering a paper as a unit of analysis.
This study takes an author-based approach to analyze publishing patterns of 517,763 scholars who have ever published both in CS conferences and journals for the last 57 years, as recorded in DBLP.
The analysis shows that the majority of CS scholars tend to make their scholarly debut, publish more papers, and collaborate with more coauthors in conferences than in journals.
Importantly, conference papers seem to serve as a distinct channel of scholarly communication, not a mere preceding step to journal publications: coauthors and title words of authors across conferences and journals tend not to overlap much.
This study corroborates findings of previous studies on this topic from a distinctive perspective, and suggests that conference authorship in CS calls for special attention from scholars and administrators outside CS who mine authorship data and evaluate scholarly performance with a focus on journal publications.
Despite the growing attention of researchers, healthcare managers, and policy makers, data gathering and information management practices are largely untheorized areas.
In this work, some early-stage conceptualizations are presented and discussed: Patient-Generated Health Data (PGHD), Observations of Daily Living (ODLs), and Personal Health Information Management (PHIM).
As I shall try to demonstrate, these labels are not neutral; rather, they underpin quite different perspectives with respect to health, the patient-doctor relationship, and the status of data.
Modeling fashion compatibility is challenging due to its complexity and subjectivity.
Existing work focuses on predicting compatibility between product images (e.g. an image containing a t-shirt and an image containing a pair of jeans).
However, these approaches ignore real-world 'scene' images (e.g. selfies); such images are hard to deal with due to their complexity, clutter, and variations in lighting and pose, but on the other hand could potentially provide key context (e.g. the user's body type, or the season) for making more accurate recommendations.
In this work, we propose a new task called 'Complete the Look', which seeks to recommend visually compatible products based on scene images.
We design an approach to extract training data for this task, and propose a novel way to learn the scene-product compatibility from fashion or interior design images.
Our approach measures compatibility both globally and locally via CNNs and attention mechanisms.
Extensive experiments show that our method achieves significant performance gains over alternative systems.
Human evaluation and qualitative analysis are also conducted to further understand model behavior.
We hope this work could lead to useful applications which link large corpora of real-world scenes with shoppable products.
A new characteristic of paired nodes in a directed weighted complex network is considered.
A method (named the K-method) for calculating this characteristic in complex networks is proposed.
The method is based on transforming the initial network and then applying the Kirchhoff rules.
The scope of applicability of the method to sparse complex networks is outlined.
The nodes of these complex networks are concepts of the real world, and the connections have the cause-effect character of so-called "cognitive maps".
Two new characteristics of concept nodes with a semantic interpretation are proposed, namely "pressure" and "influence", which take into account the influence of all nodes on each other.
In this paper, we study the implications of the commonplace assumption that most social media studies make with respect to the nature of message shares (such as retweets) as a predominantly positive interaction.
By analyzing two large longitudinal Brazilian Twitter datasets containing 5 years of conversations on two polarizing topics - Politics and Sports - we empirically demonstrate that groups holding antagonistic views can actually retweet each other more often than they retweet other groups.
We show that assuming retweets as endorsement interactions can lead to misleading conclusions with respect to the level of antagonism among social communities, and that this apparent paradox is explained in part by the use of retweets to quote the original content creator out of the message's original temporal context, for humor and criticism purposes.
As a consequence, messages diffused on online media can have their polarity reversed over time, which poses challenges for social and computer scientists aiming to classify and track opinion groups on online media.
On the other hand, we found that the time users take to retweet a message after it has been originally posted can be a useful signal to infer antagonism in social platforms, and that surges of out-of-context retweets correlate with sentiment drifts triggered by real-world events.
We also discuss how such evidence can be embedded in sentiment analysis models.
Network technologies are traditionally centered on wireline solutions.
Wireless broadband technologies nowadays provide users the unlimited broadband access that was previously offered only to wireline users.
In this paper, we discuss some of the upcoming standards of one of the emerging wireless broadband technologies, i.e. IEEE 802.11.
The newest and emerging standards fix technology issues or add functionality, and are expected to overcome many of the current standing problems with IEEE 802.11.
Spreadsheets that are informally created are harder to test than they should be.
Simple cross-foot checks and easy readability are modest but attainable goals for every spreadsheet developer.
This paper lists some tips on building self-checking into a spreadsheet in order to provide more confidence to the reader that a spreadsheet is robust.
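A cross-foot check of the kind mentioned above can also be expressed outside the spreadsheet; the following sketch (a hypothetical helper, not from the paper) verifies that stored row totals, stored column totals, and the grand total all reconcile:

```python
def cross_foot_ok(data, row_totals, col_totals, tol=1e-9):
    """Cross-foot check for a numeric table: each stored row/column
    total must match the recomputed sum, and the grand total computed
    from rows must equal the one computed from columns."""
    rows_ok = all(abs(sum(r) - t) <= tol for r, t in zip(data, row_totals))
    cols_ok = all(abs(sum(c) - t) <= tol for c, t in zip(zip(*data), col_totals))
    grand_ok = abs(sum(row_totals) - sum(col_totals)) <= tol
    return rows_ok and cols_ok and grand_ok
```

In a spreadsheet, the same idea is a cell comparing SUM of the totals row against SUM of the totals column, flagging any discrepancy to the reader.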
We study a semi-supervised learning method based on the similarity graph and the Regularized Laplacian.
We give a convenient optimization formulation of the Regularized Laplacian method and establish its various properties.
In particular, we show that the kernel of the method can be interpreted in terms of discrete and continuous time random walks and possesses several important properties of proximity measures.
Both optimization and linear algebra methods can be used for efficient computation of the classification functions.
We demonstrate on numerical examples that the Regularized Laplacian method is competitive with respect to other state-of-the-art semi-supervised learning methods.
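For concreteness, the classification functions of such methods can be obtained by solving (I + βL)f = y with graph Laplacian L = D − W; the sketch below (assuming a symmetric similarity matrix W and a simple Jacobi solver, not the paper's code) illustrates the linear-algebra route:

```python
def regularized_laplacian_labels(W, y, beta=1.0, iters=500):
    """Semi-supervised scores f solving (I + beta*L) f = y, L = D - W,
    via Jacobi iteration (the system is strictly diagonally dominant,
    so the iteration converges). y holds +1/-1 for labelled nodes and
    0 for unlabelled ones; sign(f) gives the predicted class."""
    n = len(y)
    d = [sum(W[i]) for i in range(n)]  # node degrees
    f = [0.0] * n
    for _ in range(iters):
        f = [(y[i] + beta * sum(W[i][j] * f[j] for j in range(n)))
             / (1 + beta * d[i]) for i in range(n)]
    return f
```

On a 4-node path graph with the endpoints labelled +1 and −1, the two interior nodes receive scores whose signs match their nearest labelled endpoint.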
Historically studies of behaviour on networks have focused on the behaviour of individuals (node-based) or on the aggregate behaviour of the entire network.
We propose a new method to decompose a temporal network into macroscale components and to analyse the behaviour of these components, or collectives of nodes, across time.
This method utilises all available information in the temporal network (i.e. no temporal aggregation), combining both topological and temporal structure using temporal motifs and inter-event times.
This allows us to create an embedding of a temporal network in order to describe behaviour over time and at different timescales.
We illustrate this method using an example of digital communication data collected from an online social network.
With the growing demand for real-time traffic monitoring nowadays, software-based image processing can hardly meet the real-time data processing requirement due to its serial processing nature.
In this paper, the implementation of a hardware-based feature detection and networking system prototype for real-time traffic monitoring as well as data transmission is presented.
The hardware architecture of the proposed system is mainly composed of three parts: data collection, feature detection, and data transmission.
Overall, the presented prototype can tolerate a high data rate of about 60 frames per second.
By integrating the feature detection and data transmission functions, the presented system can be further developed for various VANET application scenarios to improve road safety and traffic efficiency, for example, detecting vehicles that violate traffic rules or enforcing parking regulations.
Video semantic segmentation has recently been one of the focal points of research in computer vision.
It serves as a perception foundation for many fields such as robotics and autonomous driving.
The fast development of semantic segmentation is attributable largely to large-scale datasets, especially for deep learning related methods.
Currently, there already exist several semantic segmentation datasets for complex urban scenes, such as the Cityscapes and CamVid datasets.
They have been the standard datasets for comparison among semantic segmentation methods.
In this paper, we introduce a new high resolution UAV video semantic segmentation dataset as complement, UAVid.
Our UAV dataset consists of 30 video sequences capturing high resolution images.
In total, 300 images have been densely labelled with 8 classes for the urban scene understanding task.
Our dataset brings out new challenges.
We provide several deep learning baseline methods, among which the proposed novel Multi-Scale-Dilation net performs the best via multi-scale feature extraction.
We have also explored the usability of sequence data by leveraging a CRF model in both the spatial and temporal domains.
In this paper, we propose a single-agent modal logic framework for reasoning about goal-directed "knowing how", based on ideas from linguistics, philosophy, modal logic, and automated planning.
We first define a modal language to express "I know how to guarantee phi given psi" with a semantics not based on standard epistemic models but labelled transition systems that represent the agent's knowledge of his own abilities.
A sound and complete proof system is given to capture the valid reasoning patterns about "knowing how" where the most important axiom suggests its compositional nature.
Neural machine translation (NMT) models are usually trained with the word-level loss using the teacher forcing algorithm, which not only evaluates the translation improperly but also suffers from exposure bias.
Sequence-level training under the reinforcement framework can mitigate the problems of the word-level loss, but its performance is unstable due to the high variance of the gradient estimation.
On these grounds, we present a method with a differentiable sequence-level training objective based on probabilistic n-gram matching which can avoid the reinforcement framework.
In addition, this method performs greedy search in the training which uses the predicted words as context just as at inference to alleviate the problem of exposure bias.
Experiment results on the NIST Chinese-to-English translation tasks show that our method significantly outperforms the reinforcement-based algorithms and achieves an improvement of 1.5 BLEU points on average over a strong baseline system.
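The hard quantity that such a differentiable objective relaxes, the clipped n-gram matches between a candidate and a reference, can be sketched as follows (an illustrative counter; the paper's loss replaces these hard counts with model probabilities to make the objective differentiable):

```python
from collections import Counter

def ngram_match_count(candidate, reference, n):
    """Clipped n-gram match count between two token sequences: each
    candidate n-gram is credited at most as many times as it occurs
    in the reference (the quantity underlying BLEU-style scores)."""
    def ngrams(seq):
        return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    return sum(min(count, ref[gram]) for gram, count in cand.items())
```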
In this work, we investigated the contribution of the glottal waveform to human vocal emotion expression.
Seven emotional states are considered, comprising moderate and intense versions of three emotional families (anger, joy, and sadness) plus a neutral state, with speech samples in Mandarin Chinese.
The glottal waveforms extracted from speech samples of different emotional states are first analyzed in both the time domain and the frequency domain to discover their differences.
Comparative emotion classifications are then carried out based on features extracted from the original whole speech signal and from the glottal wave signal alone.
In experiments on the generation of a performance-driven hierarchical classifier architecture and on pairwise classification of individual emotional states, the small difference between the accuracies obtained from the speech signal and the glottal signal showed that a majority of emotional cues in speech could be conveyed through the glottal waveform.
The emotional pair best distinguished by the glottal waveform is intense anger against moderate sadness, with an accuracy of 92.45%.
It is also concluded in this work that the glottal waveform represents valence cues better than arousal cues of emotion.
Despite the tremendous empirical success of neural models in natural language processing, many of them lack the strong intuitions that accompany classical machine learning approaches.
Recently, connections have been shown between convolutional neural networks (CNNs) and weighted finite state automata (WFSAs), leading to new interpretations and insights.
In this work, we show that some recurrent neural networks also share this connection to WFSAs.
We characterize this connection formally, defining rational recurrences to be recurrent hidden state update functions that can be written as the Forward calculation of a finite set of WFSAs.
We show that several recent neural models use rational recurrences.
Our analysis provides a fresh view of these models and facilitates devising new neural architectures that draw inspiration from WFSAs.
We present one such model, which performs better than two recent baselines on language modeling and text classification.
Our results demonstrate that transferring intuitions from classical models like WFSAs can be an effective approach to designing and understanding neural models.
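The Forward calculation referenced in the definition of rational recurrences can be sketched as a simple dynamic program over transition weight matrices (an illustrative WFSA scorer, not the paper's code; `initial` and `final` are state weight vectors and `transitions[sym]` a square matrix per symbol):

```python
def wfsa_forward(initial, transitions, final, string):
    """Forward score of `string` under a WFSA: the sum over all
    accepting paths of the product of transition weights, computed
    by propagating a state-weight vector symbol by symbol."""
    alpha = list(initial)
    n = len(alpha)
    for sym in string:
        T = transitions[sym]
        alpha = [sum(alpha[i] * T[i][j] for i in range(n)) for j in range(n)]
    return sum(alpha[j] * final[j] for j in range(n))
```

A rational recurrence, in the abstract's sense, is a hidden state update expressible as this propagation step for a finite set of such automata.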
Rapid miniaturization and cost reduction of computing, along with the availability of wearable and implantable physiological sensors have led to the growth of human Body Area Network (BAN) formed by a network of such sensors and computing devices.
One promising application of such a network is wearable health monitoring where the collected data from the sensors would be transmitted and analyzed to assess the health of a person.
Typically, the devices in a BAN are connected through wireless (WBAN), which suffers from energy inefficiency due to the high-energy consumption of wireless transmission.
Human Body Communication (HBC) uses the relatively low loss human body as the communication medium to connect these devices, promising order(s) of magnitude better energy-efficiency and built-in security compared to WBAN.
In this paper, we demonstrate a health monitoring device and system built using Commercial-Off-The-Shelf (COTS) sensors and components, which can collect data from physiological sensors and either a) transmit it through intra-body HBC to another device (hub) worn on the body or b) upload health data through HBC-based human-machine interaction to an HBC-capable machine.
The system design constraints and signal transfer characteristics for the implemented HBC-based wearable health monitoring system are measured and analyzed, showing reliable connectivity with >8x power savings compared to Bluetooth Low Energy (BTLE).
In the paper, the control problem with limitations on the magnitude and rate of the control action in aircraft control systems is studied.
The existence of hidden oscillations in the case of actuator position and rate limitations is demonstrated by the examples of the pilot-involved oscillations (PIO) phenomenon in piloted aircraft and the airfoil flutter suppression system.
In the real world, a learning system could receive an input that looks nothing like anything it has seen during training, and this can lead to unpredictable behaviour.
We thus need to know whether any given input belongs to the population distribution of the training data to prevent unpredictable behaviour in deployed systems.
A recent surge of interest on this problem has led to the development of sophisticated techniques in the deep learning literature.
However, due to the absence of a standardized problem formulation or an exhaustive evaluation, it is not evident if we can rely on these methods in practice.
What makes this problem different from a typical supervised learning setting is that we cannot model the diversity of out-of-distribution samples in practice.
The distribution of outliers used in training may not be the same as the distribution of outliers encountered in the application.
Therefore, classical approaches that learn inliers vs. outliers with only two datasets can yield optimistic results.
We introduce OD-test, a three-dataset evaluation scheme as a practical and more reliable strategy to assess progress on this problem.
The OD-test benchmark provides a straightforward means of comparison for methods that address the out-of-distribution sample detection problem.
We present an exhaustive evaluation of a broad set of methods from related areas on image classification tasks.
Furthermore, we show that for realistic applications of high-dimensional images, the existing methods have low accuracy.
Our analysis reveals areas of strength and weakness of each method.
Abstract Meaning Representation (AMR) annotation efforts have mostly focused on English.
In order to train parsers on other languages, we propose a method based on annotation projection, which involves exploiting annotations in a source language and a parallel corpus of the source language and a target language.
Using English as the source language, we show promising results for Italian, Spanish, German and Chinese as target languages.
Besides evaluating the target parsers on non-gold datasets, we further propose an evaluation method that exploits the English gold annotations and does not require access to gold annotations for the target languages.
This is achieved by inverting the projection process: a new English parser is learned from the target language parser and evaluated on the existing English gold standard.
In this paper, a new offline actor-critic learning algorithm is introduced: Sampled Policy Gradient (SPG).
SPG samples in the action space to calculate an approximated policy gradient by using the critic to evaluate the samples.
This sampling allows SPG to search the action-Q-value space more globally than deterministic policy gradient (DPG), enabling it to theoretically avoid more local optima.
SPG is compared to Q-learning and the actor-critic algorithms CACLA and DPG in a pellet collection task and a self play environment in the game Agar.io.
The online game Agar.io has become massively popular on the internet due to intuitive game design and the ability to instantly compete against players around the world.
From the point of view of artificial intelligence this game is also very intriguing: The game has a continuous input and action space and allows to have diverse agents with complex strategies compete against each other.
The experimental results show that Q-learning and CACLA outperform a pre-programmed greedy bot in the pellet collection task, but all algorithms fail to outperform this bot in a fighting scenario.
The SPG algorithm is analyzed to have great extendability through offline exploration and it matches DPG in performance even in its basic form without extensive sampling.
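The core idea of SPG, sampling in the action space and letting the critic pick the target, can be sketched as follows (illustrative only: a scalar action with Gaussian perturbations, and `critic` standing in for the learned Q-function; the actor would then be regressed toward the returned action):

```python
import random

def spg_target_action(actor_action, critic, n_samples=32, sigma=0.5, rng=random):
    """Sampled Policy Gradient target (sketch): perturb the actor's
    current action, score every sample with the critic Q(a), and
    return the best-scoring sample as the action to move toward.
    Starting from the actor's own action guarantees the target is
    never worse than the current policy under the critic."""
    best = actor_action
    best_q = critic(actor_action)
    for _ in range(n_samples):
        cand = actor_action + rng.gauss(0.0, sigma)
        q = critic(cand)
        if q > best_q:
            best, best_q = cand, q
    return best
```

Because candidates are drawn globally around the current action rather than following the critic's local gradient, this search can escape local optima that trap deterministic policy gradient, which is the advantage the abstract claims.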
We introduce a universe of regular datatypes with variable binding information, for which we define generic formation and elimination (i.e. induction /recursion) operators.
We then define a generic alpha-equivalence relation over the types of the universe based on name-swapping, and derive iteration and induction principles which work modulo alpha-conversion capturing Barendregt's Variable Convention.
We instantiate the resulting framework so as to obtain the Lambda Calculus and System F, for which we derive substitution operations and substitution lemmas for alpha-conversion and substitution composition.
The whole work is carried out in Constructive Type Theory and machine-checked by the system Agda.
While RANSAC-based methods are robust to incorrect image correspondences (outliers), their hypothesis generators are not robust to correct image correspondences (inliers) with positional error (noise).
This slows down their convergence because hypotheses drawn from a minimal set of noisy inliers can deviate significantly from the optimal model.
This work addresses this problem by introducing ANSAC, a RANSAC-based estimator that accounts for noise by adaptively using more than the minimal number of correspondences required to generate a hypothesis.
ANSAC estimates the inlier ratio (the fraction of correct correspondences) of several ranked subsets of candidate correspondences and generates hypotheses from them.
Its hypothesis-generation mechanism prioritizes the use of subsets with high inlier ratio to generate high-quality hypotheses.
ANSAC uses an early termination criterion that keeps track of the inlier ratio history and terminates when it has not changed significantly for a period of time.
The experiments show that ANSAC finds good homography and fundamental matrix estimates in a few iterations, consistently outperforming state-of-the-art methods.
Automated program repair techniques, which aim to generate correct patches for real-world defects automatically, have gained a lot of attention in the last decade.
Many different techniques and tools have been proposed and developed.
However, even the most sophisticated program repair techniques can only repair a small portion of defects while producing a lot of incorrect patches.
A possible reason for this low performance is that the test suites of real world programs are usually too weak to guarantee the behavior of the program.
To understand to what extent defects can be fixed with weak test suites, we analyzed 50 real-world defects from Defects4J and found that up to 84% of them could be correctly fixed.
This result suggests that there is plenty of space for current automated program repair techniques to improve.
Furthermore, we summarized seven fault localization strategies and seven patch generation strategies that were useful in localizing and fixing these defects, and compared those strategies with current repair techniques.
The results indicate potential directions to improve automatic program repair in the future research.
One of the activities of the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA) is fostering Virtual Biodiversity Expeditions (VBEs) by bringing domain scientists and cyber infrastructure specialists together as a team.
Over the past few years PRAGMA members have been collaborating on virtualizing the Lifemapper software.
Virtualization and cloud computing have introduced great flexibility and efficiency into IT projects.
Virtualization provides application scalability, maximizes resource utilization, and creates a more efficient, agile, and automated infrastructure.
However, there are downsides to the complexity inherent in these environments, including the need for special techniques to deploy cluster hosts, dependence on virtual environments, and challenging application installation, management, and configuration.
In this paper, we report on progress of the Lifemapper virtualization framework focused on a reproducible and highly configurable infrastructure capable of fast deployment.
A key contribution of this work is describing the practical experience in taking a complex, clustered, domain-specific, data analysis and simulation system and making it available to operate on a variety of system configurations.
Uses of this portability range from whole cluster replication to teaching and experimentation on a single laptop.
System virtualization is used to practically define and make portable the full application stack, including all of its complex set of supporting software.
The need for customizable properties in autonomous robotic platforms, such as in-home nursing care for the elderly and parallel implementations of human-to-machine control interfaces, creates an opportunity to introduce methods that deploy commonly available mobile devices running robotic command applications in managed code.
This paper will discuss a human-to-machine interface and demonstrate a prototype consisting of a mobile device running a configurable application communicating with a mobile robot using a managed, type-safe language, C#.NET, over Bluetooth.
Programs that transform other programs often require access to the internal structure of the program to be transformed.
This is at odds with the usual extensional view of functional programming, as embodied by the lambda calculus and SK combinator calculus.
The recently-developed SF combinator calculus offers an alternative, intensional model of computation that may serve as a foundation for developing principled languages in which to express intensional computation, including program transformation.
Until now there have been no static analyses for reasoning about or verifying programs written in SF-calculus.
We take the first step towards remedying this by developing a formulation of the popular control flow analysis 0CFA for SK-calculus and extending it to support SF-calculus.
We prove its correctness and demonstrate that the analysis is invariant under the usual translation from SK-calculus into SF-calculus.
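For background, the extensional SK-calculus that the abstract contrasts with SF-calculus reduces by the rules K x y → x and S f g x → f x (g x); a minimal normal-order reducer (a sketch with terms encoded as nested pairs, the intensional F combinator omitted) can be written as:

```python
def step(t):
    """One leftmost (normal-order) reduction step, or None if t is
    already in normal form. Terms are 'S', 'K', or pairs (f, x)
    representing the application of f to x."""
    if not isinstance(t, tuple):
        return None
    f, x = t
    # K a b -> a
    if isinstance(f, tuple) and f[0] == 'K':
        return f[1]
    # S a b c -> a c (b c)
    if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == 'S':
        a, b, c = f[0][1], f[1], x
        return ((a, c), (b, c))
    rf = step(f)              # otherwise reduce the function part first
    if rf is not None:
        return (rf, x)
    rx = step(x)              # then the argument
    if rx is not None:
        return (f, rx)
    return None

def reduce_sk(term, fuel=1000):
    """Reduce to normal form; `fuel` bounds the work, since reduction
    need not terminate in general."""
    for _ in range(fuel):
        nxt = step(term)
        if nxt is None:
            return term
        term = nxt
    return term
```

For example, I = S K K behaves as the identity: applying it to any term reduces back to that term, a purely extensional fact that holds without ever inspecting the argument's structure, in contrast to the factorisation combinator F of SF-calculus.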
Wearable robotic hand rehabilitation devices can allow greater freedom and flexibility than their workstation-like counterparts.
However, the field is generally lacking effective methods by which the user can operate the device: such controls must be effective, intuitive, and robust to the wide range of possible impairment patterns.
Even when focusing on a specific condition, such as stroke, the variety of encountered upper limb impairment patterns means that a single sensing modality, such as electromyography (EMG), might not be sufficient to enable controls for a broad range of users.
To address this significant gap, we introduce a multimodal sensing and interaction paradigm for an active hand orthosis.
In our proof-of-concept implementation, EMG is complemented by other sensing modalities, such as finger bend and contact pressure sensors.
We propose multimodal interaction methods that utilize this sensory data as input, and show they can enable tasks for stroke survivors who exhibit different impairment patterns.
We believe that robotic hand orthoses developed as multimodal sensory platforms will help address some of the key challenges in physical interaction with the user.
In recent years, sharing of security information among organizations, particularly information on both successful and failed security breaches, has been proposed as a method for improving the state of cybersecurity.
However, there is a conflict between individual and social goals in these agreements: despite the benefits of making such information available, the associated disclosure costs (e.g., drop in market value and loss of reputation) act as a disincentive for firms' full disclosure.
In this work, we take a game theoretic approach to understanding firms' incentives for disclosing their security information given such costs.
We propose a repeated game formulation of these interactions, allowing for the design of inter-temporal incentives (i.e., conditioning future cooperation on the history of past interactions).
Specifically, we show that a rating/assessment system can play a key role in enabling the design of appropriate incentives for supporting cooperation among firms.
We further show that in the absence of a monitor, similar incentives can be designed if participating firms are provided with a communication platform, through which they can share their beliefs about others' adherence to the agreement.
Recently there has been significant interest in training machine-learning models at low precision: by reducing precision, one can reduce computation and communication by one order of magnitude.
We examine training at reduced precision, both from a theoretical and practical perspective, and ask: is it possible to train models at end-to-end low precision with provable guarantees?
Can this lead to consistent order-of-magnitude speedups?
We present a framework called ZipML to answer these questions.
For linear models, the answer is yes.
We develop a framework based on one simple but novel strategy called double sampling.
Our framework is able to execute training at low precision with no bias, guaranteeing convergence, whereas naive quantization would introduce significant bias.
We validate our framework across a range of applications, and show that it enables an FPGA prototype that is up to 6.5x faster than an implementation using full 32-bit precision.
We further develop a variance-optimal stochastic quantization strategy and show that it can make a significant difference in a variety of settings.
When applied to linear models together with double sampling, we save up to another 1.7x in data movement compared with uniform quantization.
When training deep networks with quantized models, we achieve higher accuracy than the state-of-the-art XNOR-Net.
Finally, we extend our framework through approximation to non-linear models, such as SVM.
We show that, although using low-precision data induces bias, we can appropriately bound and control the bias.
We find in practice 8-bit precision is often sufficient to converge to the correct solution.
Interestingly, however, in practice we notice that our framework does not always outperform the naive rounding approach.
We discuss this negative result in detail.
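Unbiased low-precision training of this kind typically relies on stochastic rounding, which quantizes a value so that it is correct in expectation, whereas nearest rounding introduces a systematic bias. A minimal sketch of that idea (illustrative only, not the authors' ZipML implementation):

```python
import random

def stochastic_round(x, step=1.0):
    """Quantize x to a multiple of `step` so that E[result] == x."""
    lo = (x // step) * step          # quantization level just below x
    p = (x - lo) / step              # probability of rounding up
    return lo + step if random.random() < p else lo

random.seed(0)
samples = [stochastic_round(0.3) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(abs(mean - 0.3) < 0.01)        # unbiased: the average recovers 0.3
```

Nearest rounding of 0.3 to integers would always yield 0, so gradients quantized that way accumulate bias; the stochastic scheme trades this bias for variance.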
This paper proposes an image-processing-based method for personalization of calorie consumption assessment during exercising.
An experiment is carried out where several actions are required in an exercise called broadcast gymnastics, especially popular in Japan and China.
We use Kinect, which captures body actions by separating the body into joints and the segments that contain them, to monitor body movements, measure the velocity of each body joint, and capture the subject's image for calculating the mass associated with each body joint, which differs for each subject.
Using the kinetic energy formula, we obtain the kinetic energy of each body joint, from which the calories consumed during exercise are calculated.
We evaluate the performance of our method by benchmarking it to Fitbit, a smart watch well-known for health monitoring during exercise.
The experimental results in this paper show that our method outperforms the state-of-the-art calorie assessment method on which it builds, in terms of the error rate from Fitbit's ground-truth values.
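The per-joint kinetic-energy accumulation described above can be sketched as follows; the joint names, masses, and speeds are illustrative placeholders, not values from the experiment:

```python
def exercise_kcal(joint_masses, joint_speeds):
    """Sum per-joint kinetic energy 0.5*m*v^2 over all frames,
    converted from joules to kilocalories (1 kcal = 4184 J).

    joint_masses : dict, joint name -> mass in kg
    joint_speeds : list of dicts, one per frame, joint name -> speed in m/s
    """
    joules = 0.0
    for frame in joint_speeds:
        for joint, v in frame.items():
            joules += 0.5 * joint_masses[joint] * v * v
    return joules / 4184.0

masses = {"hand": 0.5, "elbow": 1.5}
frames = [{"hand": 2.0, "elbow": 1.0}, {"hand": 3.0, "elbow": 1.5}]
print(exercise_kcal(masses, frames))
```

In the actual system the masses would come from the image-based body-segment estimation and the speeds from Kinect joint tracking.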
This paper proposes a novel entropy encoding technique for lossless data compression.
Representing a message string by its lexicographic index among the permutations of its symbols results in a compressed version matching the Shannon entropy of the message.
Commercial data compression standards make use of Huffman or arithmetic coding at some stage of the compression process.
In the proposed method, as in arithmetic coding, the entire string is mapped to an integer, but the mapping is not based on fractional numbers.
Unlike both arithmetic and Huffman coding, no prior entropy model of the source is required.
A simple, intuitive algorithm based on multinomial coefficients is developed for entropy encoding; it adaptively uses fewer bits for more frequent symbols.
Correctness of the algorithm is demonstrated by an example.
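The core idea, indexing a string within the set of permutations of its symbols via multinomial coefficients, can be illustrated by a lexicographic rank computation (an illustrative reconstruction, not necessarily the paper's exact algorithm):

```python
from math import factorial
from collections import Counter

def multiset_perm_count(counts):
    """Number of distinct permutations of a multiset with these counts."""
    total = factorial(sum(counts.values()))
    for c in counts.values():
        total //= factorial(c)      # multinomial coefficient
    return total

def lex_rank(s):
    """0-based index of s in the sorted list of permutations of its symbols."""
    counts = Counter(s)
    rank = 0
    for ch in s:
        # count permutations starting with any still-available smaller symbol
        for smaller in sorted(c for c in counts if c < ch and counts[c] > 0):
            counts[smaller] -= 1
            rank += multiset_perm_count(counts)
            counts[smaller] += 1
        counts[ch] -= 1             # consume the actual symbol
    return rank

print(lex_rank("aab"), lex_rank("aba"), lex_rank("baa"))  # → 0 1 2
```

The rank, together with the symbol counts, identifies the string uniquely, which is what makes such an index usable as a compressed representation.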
Microfluidic devices are utilized to control and direct flow behavior in a wide variety of applications, particularly in medical diagnostics.
A particularly popular form of microfluidics -- called inertial microfluidic flow sculpting -- involves placing a sequence of pillars to controllably deform an initial flow field into a desired one.
Inertial flow sculpting can be formally defined as an inverse problem, where one identifies a sequence of pillars (chosen, with replacement, from a finite set of pillars, each of which produces a specific transformation) whose composite transformation results in a user-defined desired transformation.
Like most inverse problems in engineering, this one is computationally demanding, and traditional approaches rely on search and optimization strategies.
In this paper, we pose this inverse problem as a Reinforcement Learning (RL) problem.
We train a DoubleDQN agent to learn from this environment.
The results suggest that learning is possible using a DoubleDQN model with the success frequency reaching 90% in 200,000 episodes and the rewards converging.
While most of the results are obtained by fixing a particular target flow shape to simplify the learning problem, we later demonstrate how to transfer the learning of an agent based on one target shape to another, i.e. from one design to another and thus be useful for a generic design of a flow shape.
The emergence of smartwatches poses new challenges to information security.
Although there are mature touch-based authentication methods for smartphones, the effectiveness of using these methods on smartwatches is still unclear.
We conducted a user study (n=16) to evaluate how authentication methods (PIN and Pattern), UIs (Square and Circular), and display sizes (38mm and 42mm) affect authentication accuracy, speed, and security.
Circular UIs are tailored to smartwatches with fewer UI elements.
Results show that 1) PIN is more accurate and secure than Pattern; 2) Pattern is much faster than PIN; 3) Square UIs are more secure but less accurate than Circular UIs; 4) display size does not affect accuracy or speed, but security; 5) Square PIN is the most secure method of all.
The study also reveals a security concern that participants' favorite method is not the best in any of the measures.
We finally discuss implications for future touch-based smartwatch authentication design.
Point-Of-Interest (POI) recommendation aims to mine a user's visiting history and find her/his potentially preferred places.
Although location recommendation methods have been studied and improved pervasively, challenges remain with respect to employing various influences, including the temporal aspect.
Inspired by the fact that time comprises numerous granular slots (e.g., minute, hour, day, week), in this paper we define a new problem: performing recommendation by exploiting all of these diversified temporal factors.
In particular, we argue that most existing methods only focus on a limited number of time-related features and neglect others.
Furthermore, considering a specific granularity (e.g. time of a day) in recommendation cannot always apply to each user or each dataset.
To address the challenges, we propose a probabilistic generative model, named after Multi-aspect Time-related Influence (MATI) to promote POI recommendation.
We also develop a novel optimization algorithm based on Expectation Maximization (EM).
Our MATI model first detects a user's temporal multivariate orientation using her check-in log in Location-based Social Networks (LBSNs).
It then performs recommendation using temporal correlations between the user and proposed locations.
Our method is adaptable to various types of recommendation systems and can work efficiently in multiple time-scales.
Extensive experimental results on two large-scale LBSN datasets verify the effectiveness of our method over other competitors.
The emergence of academic search engines (mainly Google Scholar and Microsoft Academic Search) that aspire to index the entirety of current academic knowledge has revived and increased interest in the size of the academic web.
The main objective of this paper is to propose various methods to estimate the current size (number of indexed documents) of Google Scholar (May 2014) and to determine its validity, precision and reliability.
To do this, we present, apply and discuss three empirical methods: an external estimate based on empirical studies of Google Scholar coverage, and two internal estimate methods based on direct, empty and absurd queries, respectively.
The results, despite providing disparate values, place the estimated size of Google Scholar at around 160 to 165 million documents.
However, all the methods show considerable limitations and uncertainties due to inconsistencies in the Google Scholar search functionalities.
Monitoring the number of insect pests is a crucial component in pheromone-based pest management systems.
In this paper, we propose an automatic detection pipeline based on deep learning for identifying and counting pests in images taken inside field traps.
Applied to a commercial codling moth dataset, our method shows promising performance both qualitatively and quantitatively.
Compared to previous attempts at pest detection, our approach uses no pest-specific engineering which enables it to adapt to other species and environments with minimal human effort.
It is amenable to implementation on parallel hardware and therefore capable of deployment in settings where real-time performance is required.
The IoT area has grown significantly in the last few years and is expected to reach a gigantic amount of 50 billion devices by 2020.
The appearance of serverless architectures, specifically Function as a Service (FaaS), raises the question of the suitability of using such architectures in IoT environments.
Combining IoT with a serverless architectural design can be effective when trying to make use of the local processing power that exists in a local network of IoT devices and creating a fog layer that leverages computational capabilities that are closer to the end-user.
In this approach, a component placed between the device and the serverless platform decides, when a device requests the execution of a serverless function and based on previous execution metrics, whether the function should be executed locally, in the fog layer of a local network of IoT devices, or remotely, on one of the available cloud servers.
This approach therefore allows functions to be dynamically allocated to the most suitable layer.
Binary classification is one of the most common problems in machine learning.
It consists of predicting whether a given element belongs to a particular class.
In this paper, a new algorithm for binary classification is proposed using a hypergraph representation.
Each element to be classified is partitioned according to its interactions with the training set.
For each class, a seminorm over the training set partition is learnt to represent the distribution of evidence supporting this class.
The method is agnostic to data representations, can work with multiple data sources or in non-metric spaces, and accommodates missing values.
As a result, it drastically reduces the need for data preprocessing or feature engineering.
Empirical validation demonstrates its high potential on a wide range of well-known datasets and the results are compared to the state-of-the-art.
The time complexity is given and empirically validated.
Its capacity to provide good performances without hyperparameter tuning compared to standard classification methods is studied.
Finally, the limitation of the model space is discussed, and some potential solutions proposed.
The notion of o-polynomial comes from finite projective geometry.
In 2011 and later, it has been shown that those objects play an important role in symmetric cryptography and coding theory to design bent Boolean functions, bent vectorial Boolean functions, semi-bent functions and to construct good linear codes.
In this note, we characterize o-polynomials by the Walsh transform of the associated vectorial functions.
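For reference, the (extended) Walsh transform of a vectorial Boolean function \(F:\mathbb{F}_{2^n}\to\mathbb{F}_{2^n}\) is conventionally defined as below; this is the standard definition, reproduced here for context, and the precise form used in the note may differ:

```latex
W_F(u,v) \;=\; \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\operatorname{Tr}\!\left(v\,F(x) + u\,x\right)},
\qquad u \in \mathbb{F}_{2^n},\; v \in \mathbb{F}_{2^n}^{*},
```

where \(\operatorname{Tr}\) denotes the absolute trace from \(\mathbb{F}_{2^n}\) to \(\mathbb{F}_2\).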
The ubiquity of online fashion shopping demands effective recommendation services for customers.
In this paper, we study two types of fashion recommendation: (i) suggesting an item that matches existing components in a set to form a stylish outfit (a collection of fashion items), and (ii) generating an outfit with multimodal (images/text) specifications from a user.
To this end, we propose to jointly learn a visual-semantic embedding and the compatibility relationships among fashion items in an end-to-end fashion.
More specifically, we consider a fashion outfit to be a sequence (usually from top to bottom and then accessories) and each item in the outfit as a time step.
Given the fashion items in an outfit, we train a bidirectional LSTM (Bi-LSTM) model to sequentially predict the next item conditioned on previous ones to learn their compatibility relationships.
Further, we learn a visual-semantic space by regressing image features to their semantic representations aiming to inject attribute and category information as a regularization for training the LSTM.
The trained network can not only perform the aforementioned recommendations effectively but also predict the compatibility of a given outfit.
We conduct extensive experiments on our newly collected Polyvore dataset, and the results provide strong qualitative and quantitative evidence that our framework outperforms alternative methods.
Revision control is a vital component in the collaborative development of artifacts such as software code and multimedia.
While revision control has been widely deployed for text files, very few attempts to control the versioning of binary files can be found in the literature.
This can be inconvenient for graphics applications that use a significant amount of binary data, such as images, videos, meshes, and animations.
Existing strategies, such as storing whole files for individual revisions or storing simple binary deltas, respectively consume significant storage or obscure semantic information.
To overcome these limitations, in this paper we present a revision control system for digital images that stores revisions in form of graphs.
Besides being integrated with Git, our revision control system also facilitates artistic creation processes in common image editing and digital painting workflows.
A preliminary user study demonstrates the usability of the proposed system.
The paper investigates epistemic properties of information flow under communication protocols with a given topological structure of the communication network.
The main result is a sound and complete logical system that describes all such properties.
The system consists of a variation of the multi-agent epistemic logic S5 extended by a new network-specific Gateway axiom.
With the advancement of research in word sense disambiguation and deep learning, large sense-annotated datasets are increasingly important for training supervised systems.
However, gathering high-quality sense-annotated data for as many instances as possible is an arduous task.
This has led to the proliferation of automatic and semi-automatic methods for overcoming the so-called knowledge-acquisition bottleneck.
In this paper we present an overview of currently available sense-annotated corpora, both manually and automatically constructed, for various languages and resources (i.e., WordNet, Wikipedia, BabelNet).
General statistics and specific features of each sense-annotated dataset are also provided.
Non-extractive fish abundance estimation with the aid of visual analysis has drawn increasing attention.
Unstable illumination, ubiquitous noise and low frame rate video capturing in the underwater environment, however, make conventional tracking methods unreliable.
In this paper, we present a multiple fish tracking system for low-contrast and low-frame-rate stereo videos with the use of a trawl-based underwater camera system.
An automatic fish segmentation algorithm overcomes the low-contrast issues by adopting a histogram backprojection approach on double local-thresholded images to ensure an accurate segmentation on the fish shape boundaries.
Built upon a reliable feature-based object matching method, a multiple-target tracking algorithm via a modified Viterbi data association is proposed to overcome the poor motion continuity and frequent entrance/exit of fish targets under low-frame-rate scenarios.
In addition, a computationally efficient block-matching approach performs successful stereo matching, which enables an automatic fish-body tail compensation to greatly reduce segmentation error and allows for an accurate fish length measurement.
Experimental results show that an effective and reliable tracking performance for multiple live fish with underwater stereo cameras is achieved.
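Histogram backprojection, the segmentation primitive used above, replaces each pixel with the frequency of its intensity bin in a model histogram, so that model-like regions light up. A grayscale toy sketch of the basic operation (illustrative only; the paper's pipeline additionally applies double local thresholding):

```python
def histogram(pixels, bins=8):
    """Histogram of 8-bit intensities over `bins` equal-width bins."""
    h = [0] * bins
    for px in pixels:
        h[min(px * bins // 256, bins - 1)] += 1
    return h

def backproject(image, model_hist, bins=8):
    """Map each pixel to its bin's (peak-normalized) model frequency."""
    peak = max(model_hist) or 1
    return [[model_hist[min(px * bins // 256, bins - 1)] / peak for px in row]
            for row in image]

model = histogram([200, 210, 220, 202])   # intensities sampled from a target
image = [[10, 205], [215, 50]]
print(backproject(image, model))          # target-like pixels score near 1.0
```

In the actual system the model histogram would be built from fish-body samples, and the backprojected map would then be thresholded to obtain shape boundaries.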
Shadow detection and shadow removal are fundamental and challenging tasks, requiring an understanding of the global image semantics.
This paper presents a novel deep neural network design for shadow detection and removal by analyzing the image context in a direction-aware manner.
To achieve this, we first formulate the direction-aware attention mechanism in a spatial recurrent neural network (RNN) by introducing attention weights when aggregating spatial context features in the RNN.
By learning these weights through training, we can recover direction-aware spatial context (DSC) for detecting and removing shadows.
This design is developed into the DSC module and embedded in a convolutional neural network (CNN) to learn the DSC features in different levels.
Moreover, we design a weighted cross entropy loss to make the training for shadow detection effective, and further adapt the network for shadow removal by using a Euclidean loss function and formulating a color transfer function to address the color and luminosity inconsistencies in the training pairs.
We employ two shadow detection benchmark datasets and two shadow removal benchmark datasets, and perform various experiments to evaluate our method.
Experimental results show that our method clearly outperforms state-of-the-art methods for both shadow detection and shadow removal.
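A weighted cross entropy of the kind mentioned, which up-weights the rarer shadow pixels, can be sketched per pixel as follows; the weighting scheme shown is a common choice, not necessarily the paper's exact loss:

```python
from math import log

def weighted_bce(preds, labels, w_pos=2.0, w_neg=1.0):
    """Weighted binary cross entropy over flattened pixel predictions.
    w_pos > w_neg compensates for shadow pixels being the minority class."""
    total = 0.0
    for p, y in zip(preds, labels):
        p = min(max(p, 1e-7), 1.0 - 1e-7)   # clip for numerical stability
        total += -(w_pos * y * log(p) + w_neg * (1 - y) * log(1.0 - p))
    return total / len(preds)

# A confident correct prediction costs little; a confident miss costs a lot.
print(weighted_bce([0.9, 0.1], [1, 0]) < weighted_bce([0.1, 0.9], [1, 0]))
```

In practice the positive weight is often set from the background-to-shadow pixel ratio of the training set.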
In an increasingly polarized world, demagogues who reduce complexity down to simple arguments based on emotion are gaining in popularity.
Are opinions and online discussions falling into demagoguery?
In this work, we aim to provide computational tools to investigate this question and, by doing so, explore the nature and complexity of online discussions and their space of opinions, uncovering where each participant lies.
More specifically, we present a modeling framework to construct latent representations of opinions in online discussions which are consistent with human judgements, as measured by online voting.
If two opinions are close in the resulting latent space of opinions, it is because humans think they are similar.
Our modeling framework is theoretically grounded and establishes a surprising connection between opinions and voting models and the sign-rank of a matrix.
Moreover, it also provides a set of practical algorithms to both estimate the dimension of the latent space of opinions and infer where opinions expressed by the participants of an online discussion lie in this space.
Experiments on a large dataset from Yahoo! News, Yahoo! Finance, Yahoo! Sports, and the Newsroom app suggest that unidimensional opinion models may often be unable to accurately represent online discussions, provide insights into human judgements and opinions, and show that our framework is able to circumvent language nuances such as sarcasm or humor by relying on human judgements instead of textual analysis.
The dynamic scalability of resources, a problem in Infrastructure as a Service (IaaS), has been a hotspot for research and industry communities.
Given the heterogeneous and dynamic nature of Cloud workloads, Quality of Service (QoS) depends on allocating appropriate workloads to appropriate resources.
A workload is an abstraction of the work that an instance, or set of instances, is going to perform.
Running a web service or acting as a Hadoop data node are valid workloads.
Dynamic resources can be managed efficiently with the help of workloads.
Unless workload is considered a fundamental capability, Cloud resources cannot be utilized in an efficient manner.
In this paper, different workloads have been identified and categorized along with their characteristics and constraints.
The metrics based on Quality of Service (QoS) requirements have been identified for each workload and have been analyzed for creating better application design.
Today it is crucial for organizations to pay even greater attention to quality management, as the importance of this function in achieving ultimate business objectives is becoming increasingly clear.
Ensuring compliance with Capability Maturity Model Integration (CMMI) or International Organization for Standardization (ISO) standards through the Quality Management Function is a basic business demand nowadays.
However, Quality Management Function and its processes need to be made much more mature to prevent delivery outages and to achieve business excellence through their review and auditing capability.
Many organizations now face challenges in determining the maturity of the Quality Management group along with the service offered by them and the right way to elevate the maturity of the same.
The objective of this whitepaper is to propose a new model, the Audit Maturity Model which will provide organizations with a measure of their maturity in quality management in the perspective of auditing, along with recommendations for preventing delivery outage, and identifying risk to achieve business excellence.
This will enable organizations to assess Quality Management maturity higher than basic hygiene and will also help them to identify gaps and to take corrective actions for achieving higher maturity levels.
Hence the objective is to envisage a new auditing model as a part of organisation quality management function which can be a guide for them to achieve higher level of maturity and ultimately help to achieve delivery and business excellence.
Distant pointing is still not efficient, accurate or flexible enough for many applications, although many researchers have focused on it.
To improve upon distant pointing, we propose MPP3D, which is especially suitable for high-resolution displays.
MPP3D uses two dimensions of hand positioning to move a pointer, and it also uses the third dimension to adjust the precision of the movement.
Based on the idea of MPP3D, we propose four techniques which combine two ways of mapping and two techniques for precision adjustment.
We further provide three types of mapping scheme and visual feedback for each technique.
The potential of the proposed techniques was investigated through experimentation.
The results show that these techniques were competent for usual computer operations with a cursor, and the adjustment for pointing precision was beneficial for both pointing efficiency and accuracy.
We present a framework for learning efficient holistic representation for handwritten word images.
The proposed method uses a deep convolutional neural network with traditional classification loss.
The major strengths of our work lie in: (i) the efficient usage of synthetic data to pre-train a deep network, (ii) an adapted version of ResNet-34 architecture with region of interest pooling (referred as HWNet v2) which learns discriminative features with variable sized word images, and (iii) realistic augmentation of training data with multiple scales and elastic distortion which mimics the natural process of handwriting.
We further investigate the process of fine-tuning at various layers to reduce the domain gap between the synthetic and real domains, and also analyze the invariances learned at different layers using recent visualization techniques proposed in the literature.
Our representation leads to state of the art word spotting performance on standard handwritten datasets and historical manuscripts in different languages with minimal representation size.
On the challenging IAM dataset, our method is the first to report an mAP above 0.90 for word spotting with a representation size of just 32 dimensions.
Furthermore, we also present results on printed document datasets in English and Indic scripts, which validates the generic nature of the proposed framework for learning word image representations.
We investigate the recovery of signals exhibiting a sparse representation in a general (i.e., possibly redundant or incomplete) dictionary that are corrupted by additive noise admitting a sparse representation in another general dictionary.
This setup covers a wide range of applications, such as image inpainting, super-resolution, signal separation, and recovery of signals that are impaired by, e.g., clipping, impulse noise, or narrowband interference.
We present deterministic recovery guarantees based on a novel uncertainty relation for pairs of general dictionaries and we provide corresponding practicable recovery algorithms.
The recovery guarantees we find depend on the signal and noise sparsity levels, on the coherence parameters of the involved dictionaries, and on the amount of prior knowledge about the signal and noise support sets.
Point clouds obtained from 3D scans are typically sparse, irregular, and noisy, and require consolidation.
In this paper, we present the first deep learning based edge-aware technique to facilitate the consolidation of point clouds.
We design our network to process points grouped in local patches, and train it to learn and help consolidate points, with deliberate attention to edges.
To achieve this, we formulate a regression component to simultaneously recover 3D point coordinates and point-to-edge distances from upsampled features, and an edge-aware joint loss function to directly minimize distances from output points to 3D meshes and to edges.
Compared with previous neural network based works, our consolidation is edge-aware.
During the synthesis, our network can attend to the detected sharp edges and enable more accurate 3D reconstructions.
Also, we trained our network on virtual scanned point clouds, demonstrated the performance of our method on both synthetic and real point clouds, presented various surface reconstruction results, and showed how our method outperforms the state of the art.
In Graph Theory a number of results were devoted to studying the computational complexity of the number modulo 2 of a graph's edge set decompositions of various kinds, first of all including its Hamiltonian decompositions, as well as the number modulo 2 of, say, Hamiltonian cycles/paths etc.
While the problems of finding a Hamiltonian decomposition and Hamiltonian cycle are NP-complete, counting these objects modulo 2 in polynomial time is yet possible for certain types of regular undirected graphs.
Some of the most known examples are the theorems about the existence of an even number of Hamiltonian decompositions in a 4-regular graph and an even number of such decompositions where two given edges e and g belong to different cycles (Thomason, 1978), as well as an even number of Hamiltonian cycles passing through any given edge in a regular odd-degreed graph (Smith's theorem).
The present article introduces a new algebraic technique which generalizes the notion of counting modulo 2 via applying fields of characteristic 2 and determinants and, for instance, yields a polynomial-time formula for the number modulo 2 of a 4-regular bipartite graph's Hamiltonian decompositions such that a given edge and a given path of length 2 belong to different Hamiltonian cycles, hence refining/extending (in a computational sense) Thomason's result for bipartite graphs.
This technique also provides a polynomial-time calculation of the number modulo 2 of a graph's edge set decompositions into simple cycles, each containing at least one element of a given set of its edges, which is a similar kind of extension of Thomason's theorem.
As cloud computing is increasingly transforming the information technology landscape, organizations and businesses are exhibiting strong interest in Software-as-a-Service (SaaS) offerings that can help them increase business agility and reduce their operational costs.
They increasingly demand services that can meet their functional and non-functional requirements.
Given the plethora and the variety of SaaS offerings, we propose, in this paper, a framework for SaaS provisioning, which relies on brokered Service Level agreements (SLAs), between service consumers and SaaS providers.
The Cloud Service Broker (CSB) helps service consumers find the right SaaS providers that can fulfil their functional and non-functional requirements.
The proposed selection algorithm ranks potential SaaS providers by matching their offerings against the requirements of the service consumer using an aggregate utility function.
Furthermore, the CSB is in charge of conducting SLA negotiation with selected SaaS providers, on behalf of service consumers, and performing SLA compliance monitoring.
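An aggregate utility of the weighted-additive kind used for ranking can be sketched as follows; the attribute names, weights, and normalization against an ideal offering are hypothetical illustrations, not the paper's exact scoring function:

```python
def utility(offering, weights, ideal):
    """Score a provider offering as a weighted sum of normalized attributes.

    offering, ideal : dict, attribute -> value (higher is better)
    weights         : dict, attribute -> weight, summing to 1
    """
    return sum(w * min(offering[a] / ideal[a], 1.0) for a, w in weights.items())

def rank_providers(offerings, weights, ideal):
    """Return provider names sorted by decreasing aggregate utility."""
    return sorted(offerings, key=lambda p: utility(offerings[p], weights, ideal),
                  reverse=True)

weights = {"availability": 0.5, "throughput": 0.3, "support": 0.2}
ideal = {"availability": 0.999, "throughput": 1000, "support": 10}
offers = {
    "ProviderA": {"availability": 0.995, "throughput": 900, "support": 8},
    "ProviderB": {"availability": 0.999, "throughput": 600, "support": 10},
}
print(rank_providers(offers, weights, ideal))
```

The weights would be derived from the service consumer's stated non-functional requirements, letting the broker compare otherwise incomparable SLA attributes on a single scale.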
We consider the well-studied partial sums problem in succinct space, where one is to maintain an array of n k-bit integers subject to updates such that partial sums queries can be efficiently answered.
We present two succinct versions of the Fenwick Tree, which is known for its simplicity and practicality.
Our results hold in the encoding model where one is allowed to reuse the space from the input data.
Our main result is the first that only requires nk + o(n) bits of space while still supporting sum/update in O(log_b n) / O(b log_b n) time where 2 <= b <= log^O(1) n. The second result shows how optimal time for sum/update can be achieved while only slightly increasing the space usage to nk + o(nk) bits.
Beyond Fenwick Trees, the results are primarily based on bit-packing and sampling - making them very practical - and they also allow for simple optimal parallelization.
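For context, the classical (non-succinct) Fenwick Tree on which these results build supports prefix sums and point updates in O(log n) time; a standard sketch:

```python
class FenwickTree:
    """Classical Fenwick (binary indexed) tree over n integers, 1-indexed."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def update(self, i, delta):
        """Add delta to element i (1 <= i <= n)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i          # move to the next node covering i

    def prefix_sum(self, i):
        """Sum of elements 1..i."""
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i          # drop the lowest set bit
        return s

ft = FenwickTree(8)
for idx, val in enumerate([3, 1, 4, 1, 5, 9, 2, 6], start=1):
    ft.update(idx, val)
print(ft.prefix_sum(4), ft.prefix_sum(8))  # → 9 31
```

The succinct variants in the paper reduce the tree's space overhead to o(n) or o(nk) extra bits via bit-packing and sampling, while preserving this traversal structure.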
Building structures can allow a robot to surmount large obstacles, expanding the set of areas it can reach.
This paper presents a planning algorithm to automatically determine what structures a construction-capable robot must build in order to traverse its entire environment.
Given an environment, a set of building blocks, and a robot capable of building structures, we seek an optimal set of structures (using a minimum number of building blocks) that can be built to make the entire environment traversable with respect to the robot's movement capabilities.
We show that this problem is NP-Hard, and present a complete, optimal algorithm that solves it using a branch-and-bound strategy.
The algorithm runs in exponential time in the worst case, but solves typical problems with practical speed.
In hardware experiments, we show that the algorithm solves 3D maps of real indoor environments in about one minute, and that the structures selected by the algorithm allow a robot to traverse the entire environment.
An accompanying video is available online at https://youtu.be/B9WM557NP44.
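At its core, the structure-selection problem is a minimum-cost cover search; a generic branch-and-bound over candidate structures, shown on a toy set-cover instance (illustrative only, not the paper's planner), looks like this:

```python
def branch_and_bound(candidates, covers_all, cost):
    """Find the cheapest subset of `candidates` for which covers_all(subset)
    holds, pruning branches whose cost already reaches the best found."""
    best = (float("inf"), None)

    def recurse(chosen, remaining):
        nonlocal best
        c = cost(chosen)
        if c >= best[0]:
            return                      # bound: cannot beat the incumbent
        if covers_all(chosen):
            best = (c, list(chosen))    # new incumbent solution
            return
        if not remaining:
            return
        head, *tail = remaining
        recurse(chosen + [head], tail)  # branch: take head
        recurse(chosen, tail)           # branch: skip head

    recurse([], list(candidates))
    return best

# Toy instance: cover cells {1,2,3,4} with the fewest structures.
cands = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
best_cost, best_subset = branch_and_bound(
    cands, lambda chosen: set().union(*chosen or [set()]) == {1, 2, 3, 4}, len)
print(best_cost)  # → 2
```

In the paper's setting the cost would count building blocks rather than structures, and the cover test would be traversability of the environment under the robot's movement model.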
The investigation of spatio-temporal dynamics of bacterial cells and their molecular components requires automated image analysis tools to track cell shape properties and molecular component locations inside the cells.
In the study of bacteria aging, the molecular components of interest are protein aggregates accumulated near bacteria boundaries.
This particular location makes the correspondence between aggregates and cells highly ambiguous, since accurately computing bacteria boundaries in phase-contrast time-lapse imaging is a challenging task.
This paper proposes an active skeleton formulation for bacteria modeling which provides several advantages: an easy computation of shape properties (perimeter, length, thickness, orientation), an improved boundary accuracy in noisy images, and a natural bacteria-centered coordinate system that permits the intrinsic location of molecular components inside the cell.
Starting from an initial skeleton estimate, the medial axis of the bacterium is obtained by minimizing an energy function which incorporates bacteria shape constraints.
Experimental results on biological images and comparative evaluation of the performances validate the proposed approach for modeling cigar-shaped bacteria like Escherichia coli.
The Image-J plugin of the proposed method can be found online at http://fluobactracker.inrialpes.fr.
In this paper, the problem of joint caching and resource allocation is investigated for a network of cache-enabled unmanned aerial vehicles (UAVs) that service wireless ground users over the LTE licensed and unlicensed (LTE-U) bands.
The considered model focuses on users that can access both licensed and unlicensed bands while receiving contents from either the cache units at the UAVs directly or via content server-UAV-user links.
This problem is formulated as an optimization problem which jointly incorporates user association, spectrum allocation, and content caching.
To solve this problem, a distributed algorithm based on the machine learning framework of liquid state machine (LSM) is proposed.
Using the proposed LSM algorithm, the cloud can predict the users' content request distribution while having only limited information on the network's and users' states.
The proposed algorithm also enables the UAVs to autonomously choose the optimal resource allocation strategies that maximize the number of users with stable queues depending on the network states.
Based on the users' association and content request distributions, the optimal contents that need to be cached at UAVs as well as the optimal resource allocation are derived.
Simulation results using real datasets show that the proposed approach yields up to 33.3% and 50.3% gains, respectively, in terms of the number of users that have stable queues compared to two baseline algorithms: Q-learning with cache and Q-learning without cache.
The results also show that LSM improves the convergence time by up to 33.3% compared to conventional learning algorithms such as Q-learning.
This paper presents a word-entity duet framework for utilizing knowledge bases in ad-hoc retrieval.
In this work, the query and documents are modeled by word-based representations and entity-based representations.
Ranking features are generated by the interactions between the two representations, incorporating information from the word space, the entity space, and the cross-space connections through the knowledge graph.
To handle the uncertainties from the automatically constructed entity representations, an attention-based ranking model AttR-Duet is developed.
With back-propagation from ranking labels, the model learns simultaneously how to demote noisy entities and how to rank documents with the word-entity duet.
Evaluation results on TREC Web Track ad-hoc task demonstrate that all of the four-way interactions in the duet are useful, the attention mechanism successfully steers the model away from noisy entities, and together they significantly outperform both word-based and entity-based learning to rank systems.
The computerization of research activities has led to the creation of large specialized information resources, platforms, services, and software to support scientific research.
However, their shortcomings prevent comprehensive support of scientific activity, and the absence of a single entry point divides the scientific community into fragmented interest groups.
Based on an analysis of existing solutions and approaches to information and communication technology tools for various types of scientific activity, and taking the research lifecycle into account, this article proposes and formulates the basic principles of designing and implementing an integrated information system to support scientific research.
Clustering is crucial for many computer vision applications such as robust tracking, object detection and segmentation.
This work presents a real-time clustering technique that takes advantage of the unique properties of event-based vision sensors.
Since event-based sensors trigger events only when the intensity changes, the data is sparse, with low redundancy.
Thus, our approach redefines the well-known mean-shift clustering method using asynchronous events instead of conventional frames.
The potential of our approach is demonstrated in a multi-target tracking application using Kalman filters to smooth the trajectories.
We evaluated our method on an existing dataset with patterns of different shapes and speeds, and a new dataset that we collected.
The sensor was attached to the Baxter robot in an eye-in-hand setup monitoring real-world objects in an action manipulation task.
Clustering accuracy achieved an F-measure of 0.95, reducing the computational cost by 88% compared to the frame-based method.
The average error for tracking was 2.5 pixels and the clustering achieved a consistent number of clusters along time.
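A minimal sketch of the core idea, clustering asynchronous events instead of frames (the running-mean update rule and parameters here are illustrative assumptions, not the authors' exact formulation):

```python
import math

def event_mean_shift(events, bandwidth=5.0, alpha=0.1):
    """Assign each asynchronous event (x, y) to a cluster and shift
    that cluster's center toward the event, instead of re-clustering
    whole frames. Returns the list of cluster centers."""
    centers = []
    for x, y in events:
        # find the nearest existing cluster center
        best, dist = None, float("inf")
        for i, (cx, cy) in enumerate(centers):
            d = math.hypot(x - cx, y - cy)
            if d < dist:
                best, dist = i, d
        if best is not None and dist <= bandwidth:
            cx, cy = centers[best]          # shift center toward the event
            centers[best] = (cx + alpha * (x - cx), cy + alpha * (y - cy))
        else:
            centers.append((float(x), float(y)))   # spawn a new cluster
    return centers
```

Each event costs only a nearest-center lookup and one incremental update, which is what makes the approach attractive for sparse, low-redundancy event streams.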
Sparse representation of structured signals requires modelling strategies that maintain specific signal properties, in addition to preserving original information content and achieving simpler signal representation.
Therefore, the major design challenge is to introduce adequate problem formulations and offer solutions that will efficiently lead to desired representations.
In this context, sparse representation of covariance and precision matrices, which appear as feature descriptors or mixture model parameters, respectively, is the main focus of this paper.
Mesh labeling is the key problem of classifying the facets of a 3D mesh with a label among a set of possible ones.
State-of-the-art methods model mesh labeling as a Markov Random Field over the facets.
These algorithms map image segmentations to the mesh by minimizing an energy function that comprises a data term, a smoothness terms, and class-specific priors.
The latter favor a labeling with respect to another depending on the orientation of the facet normals.
In this paper we propose a novel energy term that acts as a prior, but does not require any prior knowledge about the scene nor scene-specific relationship among classes.
It bootstraps from a coarse mapping of the 2D segmentations on the mesh, and it favors the facets to be labeled according to the statistics of the mesh normals in their neighborhood.
We tested our approach on five different datasets and, even though we do not inject prior knowledge, our method adapts to the data and outperforms the state-of-the-art.
Fingerprint-based indoor localization methods are promising due to the high availability of deployed access points and compatibility with commercial-off-the-shelf user devices.
However, to train regression models for localization, an extensive site survey is required to collect fingerprint data from the target areas.
In this paper, we consider the problem of informative path planning (IPP) to find the optimal walk for site survey subject to a budget constraint.
IPP for location fingerprint collection is related to the well-known orienteering problem (OP) but is more challenging due to edge-based non-additive rewards and revisits.
Given the NP-hardness of IPP, we propose two heuristic approaches: a greedy algorithm and a genetic algorithm.
We show, using experimental data collected from two indoor environments with different characteristics, that the two algorithms have low computational complexity and can generally achieve higher utility and lower localization errors than extensions of two state-of-the-art OP approaches.
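A toy sketch of the greedy heuristic's flavor (the graph encoding, reward model, and ratio rule are assumptions made for illustration; the paper's actual algorithms handle edge revisits and non-additive rewards more carefully):

```python
def greedy_walk(graph, reward, start, budget):
    """Greedy survey walk: repeatedly traverse the incident edge with the
    best reward-to-cost ratio among edges not yet surveyed, while the
    budget allows. `graph[u]` maps neighbor v -> edge cost;
    `reward[frozenset((u, v))]` is the utility of surveying edge (u, v)."""
    node, spent, collected = start, 0.0, 0.0
    walk, surveyed = [start], set()
    while True:
        options = [
            (reward[frozenset((node, v))] / c, c, v)
            for v, c in graph[node].items()
            if frozenset((node, v)) not in surveyed and spent + c <= budget
        ]
        if not options:
            return walk, collected
        _, c, v = max(options, key=lambda o: o[0])  # best reward per cost
        spent += c
        collected += reward[frozenset((node, v))]
        surveyed.add(frozenset((node, v)))
        node = v
        walk.append(v)
```

Edges are keyed by unordered pairs (`frozenset`) so that traversing an already-surveyed edge in either direction yields no further reward, mirroring the edge-based reward structure described above.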
Assessing the magnitude of cause-and-effect relations is one of the central challenges found throughout the empirical sciences.
The problem of identification of causal effects is concerned with determining whether a causal effect can be computed from a combination of observational data and substantive knowledge about the domain under investigation, which is formally expressed in the form of a causal graph.
In many practical settings, however, the knowledge available to the researcher is not strong enough to specify a unique causal graph.
Another line of investigation attempts to use observational data to learn a qualitative description of the domain called a Markov equivalence class, which is the collection of causal graphs that share the same set of observed features.
In this paper, we marry both approaches and study the problem of causal identification from an equivalence class, represented by a partial ancestral graph (PAG).
We start by deriving a set of graphical properties of PAGs that are carried over to its induced subgraphs.
We then develop an algorithm to compute the effect of an arbitrary set of variables on an arbitrary outcome set.
We show that the algorithm is strictly more powerful than the current state of the art found in the literature.
A bag-of-words based probabilistic classifier is trained using regularized logistic regression to detect vandalism in the English Wikipedia.
Isotonic regression is used to calibrate the class membership probabilities.
Learning curve, reliability, ROC, and cost analysis are performed.
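For concreteness, isotonic calibration of class membership probabilities is typically computed with the pool-adjacent-violators algorithm; the sketch below is a generic implementation of that algorithm, not the authors' exact setup:

```python
def pava(labels):
    """Pool-Adjacent-Violators: given 0/1 outcomes ordered by increasing
    classifier score, return non-decreasing calibrated probabilities."""
    blocks = [[float(y), 1] for y in labels]   # each block: [mean, size]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][0] > blocks[i + 1][0]:    # monotonicity violated: pool
            m1, n1 = blocks[i]
            m2, n2 = blocks[i + 1]
            blocks[i] = [(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2]
            del blocks[i + 1]
            i = max(i - 1, 0)                  # pooling may create new violations
        else:
            i += 1
    calibrated = []
    for mean, size in blocks:
        calibrated.extend([mean] * size)
    return calibrated
```

The fitted step function maps raw classifier scores to empirical vandalism frequencies while preserving the score ranking.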
Current neural network-based classifiers are susceptible to adversarial examples even in the black-box setting, where the attacker only has query access to the model.
In practice, the threat model for real-world systems is often more restrictive than the typical black-box model where the adversary can observe the full output of the network on arbitrarily many chosen inputs.
We define three realistic threat models that more accurately characterize many real-world classifiers: the query-limited setting, the partial-information setting, and the label-only setting.
We develop new attacks that fool classifiers under these more restrictive threat models, where previous methods would be impractical or ineffective.
We demonstrate that our methods are effective against an ImageNet classifier under our proposed threat models.
We also demonstrate a targeted black-box attack against a commercial classifier, overcoming the challenges of limited query access, partial information, and other practical issues to break the Google Cloud Vision API.
Context awareness is an essential capability for robots that are to be as adaptive as possible in challenging environments.
Although there are many context modeling efforts, they assume a fixed structure and number of contexts.
In this paper, we propose an incremental deep model that extends Restricted Boltzmann Machines.
Our model gets one scene at a time, and gradually extends the contextual model when necessary, either by adding a new context or a new context layer to form a hierarchy.
We show on a scene classification benchmark that our method converges to a good estimate of the contexts of the scenes, and performs better or on-par on several tasks compared to other incremental models or non-incremental models.
Answering visual questions requires acquiring everyday commonsense knowledge and modeling the semantic connections among different parts of images, which is too difficult for VQA systems to learn from images when the only supervision comes from answers.
Meanwhile, image captioning systems with beam search strategy tend to generate similar captions and fail to diversely describe images.
To address the aforementioned issues, we present a system to have these two tasks compensate with each other, which is capable of jointly producing image captions and answering visual questions.
In particular, we utilize question and image features to generate question-related captions and use the generated captions as additional features to provide new knowledge to the VQA system.
For image captioning, our system attains more informative results in terms of the relative improvements on VQA tasks, as well as competitive results on automated metrics.
Applying our system to the VQA tasks, our results on VQA v2 dataset achieve 65.8% using generated captions and 69.1% using annotated captions in validation set and 68.4% in the test-standard set.
Further, an ensemble of 10 models results in 69.7% in the test-standard split.
In complex inferential tasks like question answering, machine learning models must confront two challenges: the need to implement a compositional reasoning process, and, in many applications, the need for this reasoning process to be interpretable to assist users in both development and prediction.
Existing models designed to produce interpretable traces of their decision-making process typically require these traces to be supervised at training time.
In this paper, we present a novel neural modular approach that performs compositional reasoning by automatically inducing a desired sub-task decomposition without relying on strong supervision.
Our model allows linking different reasoning tasks through shared modules that handle common routines across tasks.
Experiments show that the model is more interpretable to human evaluators compared to other state-of-the-art models: users can better understand the model's underlying reasoning procedure and predict when it will succeed or fail based on observing its intermediate outputs.
We consider the problem of blind identification and equalization of single-input multiple-output (SIMO) nonlinear channels.
Specifically, the nonlinear model consists of multiple single-channel Wiener systems that are excited by a common input signal.
The proposed approach is based on a well-known blind identification technique for linear SIMO systems.
By transforming the output signals into a reproducing kernel Hilbert space (RKHS), a linear identification problem is obtained, which we propose to solve through an iterative procedure that alternates between canonical correlation analysis (CCA) to estimate the linear parts, and kernel canonical correlation analysis (KCCA) to estimate the memoryless nonlinearities.
The proposed algorithm is able to operate on systems with as few as two output channels, on relatively small data sets and on colored signals.
Simulations are included to demonstrate the effectiveness of the proposed technique.
We consider the problem of maximizing the harvested power in Multiple Input Multiple Output (MIMO) Simultaneous Wireless Information and Power Transfer (SWIPT) systems with power splitting reception.
Different from recently proposed designs, our novel problem formulation targets the jointly optimal transmit precoding and receive uniform power splitting (UPS) ratio that maximize the harvested power, while ensuring that the Quality-of-Service (QoS) requirement of the MIMO link is satisfied.
We assume generic practical Radio Frequency (RF) Energy Harvesting (EH) receive operation that results in a non-convex optimization problem for the design parameters, which we then solve optimally after formulating it in an equivalent generalized convex form.
Our representative results including comparisons of achievable EH gains with benchmark schemes provide key insights on various system parameters.
Meningioma brain tumour discrimination is challenging as many histological patterns are mixed between the different subtypes.
In clinical practice, dominant patterns are investigated for signs of specific meningioma pathology; however, simple observation can result in inter- and intra-observer variation due to the complexity of the histopathological patterns.
Also employing a computerised feature extraction approach applied at a single resolution scale might not suffice in accurately delineating the mixture of histopathological patterns.
In this work we propose a novel multiresolution feature extraction approach for characterising the textural properties of the different pathological patterns (i.e. mainly cell nuclei shape, orientation and spatial arrangement within the cytoplasm).
The pattern textural properties are characterised at various scales and orientations for an improved separability between the different extracted features.
The Gabor filter energy output of each magnitude response was combined with four other fixed-resolution texture signatures (2 model-based and 2 statistical-based) with and without cell nuclei segmentation.
The highest classification accuracy of 95% was reported when combining the Gabor filter energy and the meningioma subimage fractal signature as a feature vector, without performing any prior cell nuclei segmentation.
This indicates that characterising the cell-nuclei self-similarity properties via Gabor filters can assist in achieving an improved meningioma subtype classification, which can help overcome variations in reported diagnoses.
We design monitor optimisations for detectEr, a runtime-verification tool synthesising systems of concurrent monitors from correctness properties for Erlang programs.
We implement these optimisations as part of the existing tool and show that they yield considerably lower runtime overheads when compared to the unoptimised monitor synthesis.
This paper discusses two existing approaches to the correlation analysis between automatic evaluation metrics and human scores in the area of natural language generation.
Our experiments show that depending on the usage of a system- or sentence-level correlation analysis, correlation results between automatic scores and human judgments are inconsistent.
Data communication in cloud-based distributed stream data analytics often involves a collection of parallel and pipelined TCP flows.
As the standard TCP congestion control mechanism is designed for achieving "fairness" among competing flows and is agnostic to the application layer contexts, the bandwidth allocation among a set of TCP flows traversing bottleneck links often leads to sub-optimal application-layer performance measures, e.g., stream processing throughput or average tuple complete latency.
Motivated by this, and enabled by the rapid development of Software-Defined Networking (SDN) techniques, in this paper we re-investigate the design space of the bandwidth allocation problem and propose a cross-layer framework that utilizes additional information obtained from the application layer and provides on-the-fly, dynamic bandwidth adjustment algorithms to help stream analytics applications achieve better performance at runtime.
We implement a prototype cross-layer bandwidth allocation framework based on a popular open-source distributed stream processing platform, Apache Storm, together with the OpenDaylight controller, and carry out extensive experiments with real-world analytical workloads on top of a local cluster consisting of 10 workstations interconnected by a SDN-enabled switch.
The experiment results clearly validate the effectiveness and efficiency of our proposed framework and algorithms.
We focus on generative autoencoders, such as variational or adversarial autoencoders, which jointly learn a generative model alongside an inference model.
Generative autoencoders are those which are trained to softly enforce a prior on the latent distribution learned by the inference model.
We call the distribution to which the inference model maps observed samples the learned latent distribution, which may not be consistent with the prior.
We formulate a Markov chain Monte Carlo (MCMC) sampling process, equivalent to iteratively decoding and encoding, which allows us to sample from the learned latent distribution.
Since the generative model learns to map from the learned latent distribution, rather than from the prior, we may use MCMC to improve the quality of samples drawn from the generative model, especially when the learned latent distribution is far from the prior.
Using MCMC sampling, we are able to reveal previously unseen differences between generative autoencoders trained either with or without a denoising criterion.
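The sampling process above can be sketched as alternately decoding and re-encoding; the toy 1-D model below uses hand-picked linear maps, purely to illustrate the chain drifting from a prior draw toward the learned latent distribution's mode, and is not the paper's networks:

```python
def mcmc_latent(encode, decode, z0, steps=50):
    """Iteratively decode and re-encode a latent sample, so the chain
    drifts from an arbitrary start (e.g. a prior draw) toward the
    learned latent distribution."""
    z = z0
    for _ in range(steps):
        x = decode(z)   # generate an observation from the current latent
        z = encode(x)   # map the observation back into latent space
    return z

# Toy model: suppose the learned latent distribution is centered at
# z = 2 while the prior is centered at 0; this encode-decode pair
# contracts any latent value toward that learned mode.
decode = lambda z: 3.0 * z + 1.0
encode = lambda x: 0.5 * ((x - 1.0) / 3.0) + 1.0
```

Here each decode-encode round halves the distance to z = 2, so the chain converges regardless of where in the prior it starts.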
Nowadays, many vegetables are grown inside greenhouses in which the environment is controlled and nutrition can be supplied through the water supply using an electrical pump, a practice known as fertigation.
The dosage of nutrients in water is known for many vegetable plants, so that by controlling the water supply all the needs for the plants to grow can be met.
Furthermore, the water supply can be controlled using an electrical pump that is activated according to the plants' condition with respect to water supply.
In order to supply water and nutrition in the right amount and at the right time, the plants' condition can be observed using a CCD camera attached to image processing facilities, following a speaking-plant approach.
In this study, plant development during the growing period is observed using image processing.
Three populations of tomato plants, with insufficient, adequate, and excessive nutrition in water, are captured using a CCD camera every three days, and the images are analyzed using a purpose-built computer program to measure plant height.
The results show that the development of the plants can be monitored using this method.
The response of plant growth under the same conditions was then monitored and used as input for the fertigation system to turn the electrical pump on and off automatically, so that the fertigation system could maintain the growth of the plants.
In this work, we investigate an efficient numerical approach for solving higher order statistical methods for blind and semi-blind signal recovery from non-ideal channels.
We develop numerical algorithms based on convex optimization relaxation for minimization of higher order statistical cost functions.
The new formulation through convex relaxation overcomes the local convergence problem of existing gradient descent based algorithms and applies to several well-known cost functions for effective blind signal recovery including blind equalization and blind source separation in both single-input-single-output (SISO) and multi-input-multi-output (MIMO) systems.
We also propose a fourth order pilot based cost function that benefits from this approach.
The simulation results demonstrate that our approach is suitable for short-length packet data transmission using only a few pilot symbols.
We propose a data-dependent denoising procedure to restore noisy images.
Different from existing denoising algorithms which search for patches from either the noisy image or a generic database, the new algorithm finds patches from a database that contains only relevant patches.
We formulate the denoising problem as an optimal filter design problem and make two contributions.
First, we determine the basis function of the denoising filter by solving a group sparsity minimization problem.
The optimization formulation generalizes existing denoising algorithms and offers systematic analysis of the performance.
Improvement methods are proposed to enhance the patch search process.
Second, we determine the spectral coefficients of the denoising filter by considering a localized Bayesian prior.
The localized prior leverages the similarity of the targeted database, alleviates the intensive Bayesian computation, and links the new method to the classical linear minimum mean squared error estimation.
We demonstrate applications of the proposed method in a variety of scenarios, including text images, multiview images and face images.
Experimental results show the superiority of the new algorithm over existing methods.
For popular websites, the most important concern is to distribute incoming load dynamically among web servers, so that they can respond to their clients without delay or failure.
Different websites use different strategies to distribute load among web servers, but most schemes concentrate on only one factor, the number of requests; none of them consider that different types of requests require different levels of processing effort to answer, maintain a status record of all the web servers associated with one domain name, or provide a mechanism to handle the situation when one of the servers is not working.
Therefore, there is a fundamental need to develop a strategy for dynamic load allocation on the web side.
In this paper, an effort has been made to introduce a cluster-based framework to solve the load distribution problem.
This framework aims to distribute load among clusters on the basis of their operational capabilities.
Moreover, the experimental results are presented with the help of an example, the algorithm, and an analysis of the algorithm.
Lack of moderation in online communities enables participants to engage in personal aggression, harassment, or cyberbullying, issues that have been accentuated by extremist radicalisation in the contemporary post-truth politics scenario.
This kind of hostility is usually expressed by means of toxic language, profanity or abusive statements.
Recently Google has developed a machine-learning-based toxicity model in an attempt to assess the hostility of a comment; unfortunately, it has been suggested that said model can be deceived by adversarial attacks that manipulate the text sequence of the comment.
In this paper we firstly characterise such adversarial attacks as using obfuscation and polarity transformations.
The former deceives by corrupting toxic trigger content with typographic edits, whereas the latter deceives by grammatical negation of the toxic content.
Then, we propose a two-stage approach to counter these anomalies, building upon a recently proposed text deobfuscation method and the toxicity scoring model.
Lastly, we conducted an experiment with approximately 24,000 distorted comments, showing that it is feasible to restore the toxicity scores of the adversarial variants, while incurring roughly a twofold increase in processing time.
Even though novel adversarial challenges will keep arising from the versatile nature of written language, we anticipate that techniques combining machine learning and text pattern recognition, each targeting different layers of linguistic features, will be needed to achieve robust detection of toxic language, thus fostering aggression-free digital interaction.
Urban and Bierman introduced a calculus of proof terms for the sequent calculus LK with a strongly normalizing reduction relation.
We extend this calculus to simply-typed higher-order logic with inferences for induction and equality, albeit without strong normalization.
We implement this calculus in GAPT, our library for proof transformations.
Evaluating the normalization on both artificial and real-world benchmarks, we show that this algorithm is typically several orders of magnitude faster than the existing Gentzen-like cut-reduction, and an order of magnitude faster than any other cut-elimination procedure implemented in GAPT.
In this paper we describe an end-to-end neural model for Named Entity Recognition (NER) based on a bi-directional RNN-LSTM.
Almost all NER systems for Hindi use Language Specific features and handcrafted rules with gazetteers.
Our model is language independent and uses no domain specific features or any handcrafted rules.
Our models rely on semantic information in the form of word vectors which are learnt by an unsupervised learning algorithm on an unannotated corpus.
Our model attained state-of-the-art performance in both English and Hindi without the use of any morphological analysis or gazetteers of any sort.
Plastic surgery and disguise variations are two of the most challenging co-variates of face recognition.
State-of-the-art deep learning models are not sufficiently successful due to the limited availability of training samples.
In this paper, a novel framework is proposed which transfers fundamental visual features learnt from a generic image dataset to supplement a supervised face recognition model.
The proposed algorithm combines off-the-shelf supervised classifier and a generic, task independent network which encodes information related to basic visual cues such as color, shape, and texture.
Experiments are performed on IIITD plastic surgery face dataset and Disguised Faces in the Wild (DFW) dataset.
Results showcase that the proposed algorithm achieves state-of-the-art results on both datasets.
Specifically, on the DFW database, the proposed algorithm yields over 87% verification accuracy at 1% false accept rate, which is 53.8% better than baseline results computed using VGGFace.
The detection of a volumetric attack involves collecting statistics on the network traffic, and identifying suspicious activities.
We assume that available statistical information includes the number of packets and the number of bytes passed per flow.
We apply methods of machine learning to detect malicious traffic.
A prototype project is implemented as a module for the Floodlight controller.
The prototype was tested on the Mininet simulation platform.
The simulated topology includes a number of edge switches, a connected graph of core switches, and a number of server and user hosts.
The server hosts run simple web servers.
The user hosts simulate web clients.
The controller employs Dijkstra's algorithm to find the best flow in the graph.
The controller periodically polls the edge switches and provides current and historical statistics on each active flow.
The streaming analytics evaluates the traffic volume and detects volumetric attacks.
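The controller's path selection can be sketched with a standard Dijkstra search over the switch graph; the topology and link weights below are made up purely for illustration:

```python
import heapq

def dijkstra(adj, src, dst):
    """Shortest path by link weight, as an SDN controller might route a
    flow. `adj[u]` maps neighbor -> weight; returns (cost, path)."""
    pq = [(0.0, src, [src])]
    seen = set()
    while pq:
        cost, u, path = heapq.heappop(pq)
        if u == dst:
            return cost, path
        if u in seen:
            continue
        seen.add(u)
        for v, w in adj.get(u, {}).items():
            if v not in seen:
                heapq.heappush(pq, (cost + w, v, path + [v]))
    return float("inf"), []
```

In a real deployment the edge weights would be derived from the statistics the controller polls from the edge switches, rather than being static.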
The Citizen Broadband Radio Service band (3550-3700 MHz) is seen as one of the key frequency bands for enabling improvements in the performance of wireless broadband and cellular systems.
A careful study of interference caused by a secondary cellular communication system coexisting with an incumbent naval radar is required to establish a pragmatic protection distance, which not only protects the incumbent from harmful interference but also increases the spectrum access opportunity for the secondary system.
In this context, this paper investigates the co-channel and adjacent channel coexistence of a ship-borne naval radar and a wide-area cellular communication system and presents the analysis of interference caused by downlink transmission in the cellular system on the naval radar for different values of radar protection distance.
The results of such analysis suggest that maintaining a protection distance of 30 km from the radar will ensure the required INR protection criterion of -6 dB at the radar receiver with > 0.9 probability, even when the secondary network operates in the same channel as the radar.
Novel power control algorithms to assign operating powers to the coexisting cellular devices are also proposed to further reduce the protection distance from radar while still meeting the radar INR protection requirement.
Task-motion planning (TMP) addresses the problem of efficiently generating executable and low-cost task plans in a discrete space such that the (initially unknown) action costs are determined by motion plans in a corresponding continuous space.
However, a task-motion plan can be sensitive to unexpected domain uncertainty and changes, leading to suboptimal behaviors or execution failures.
In this paper, we propose a novel framework, TMP-RL, which is an integration of TMP and reinforcement learning (RL) from the execution experience, to solve the problem of robust task-motion planning in dynamic and uncertain domains.
TMP-RL features two nested planning-learning loops.
In the inner TMP loop, the robot generates a low-cost, feasible task-motion plan by iteratively planning in the discrete space and updating relevant action costs evaluated by the motion planner in continuous space.
In the outer loop, the plan is executed, and the robot learns from the execution experience via model-free RL, to further improve its task-motion plans.
RL in the outer loop is more accurate to the current domain but also more expensive, and using less costly task and motion planning leads to a jump-start for learning in the real world.
Our approach is evaluated on a mobile service robot conducting navigation tasks in an office area.
Results show that the TMP-RL approach significantly improves adaptability and robustness (in comparison to TMP methods) and leads to rapid convergence (in comparison to task planning (TP)-RL methods).
We also show that TMP-RL can reuse learned values to smoothly adapt to new scenarios during long-term deployments.
In January 2015 we distributed an online survey about failures in robotics and intelligent systems among robotics researchers.
The aim of this survey was to find out which types of failures currently exist, what their origins are, and how systems are monitored and debugged - with a special focus on performance bugs.
This report summarizes the findings of the survey.
As of today, abuse is a pressing issue to participants and administrators of Online Social Networks (OSN).
Abuse on Twitter can stem from arguments generated to influence the outcome of a political election, from the use of bots to automatically spread misinformation, and, generally speaking, from activities that deny, disrupt, degrade or deceive other participants and/or the network.
Given the difficulty of finding and accessing a large enough sample of abuse ground truth on the Twitter platform, we built and deployed a custom crawler that we use to judiciously collect a new dataset from Twitter, with the aim of characterizing the nature of abusive users, a.k.a. abusive birds, in the wild.
We provide a comprehensive set of features based on users' attributes, as well as social-graph metadata.
The former includes metadata about the account itself, while the latter is computed from the social graph between the sender and the receiver of each message.
Attribute-based features are useful for characterizing users' accounts in OSN, while graph-based features can reveal the dynamics of information dissemination across the network.
In particular, we derive the Jaccard index as a key feature to reveal the benign or malicious nature of directed messages on Twitter.
To the best of our knowledge, we are the first to propose such a similarity metric to characterize abuse on Twitter.
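The Jaccard index used above is a set-overlap measure; the sketch below shows how it could score a directed message, assuming (hypothetically) that it is computed over the follower sets of the sender and the receiver:

```python
def jaccard_index(a, b):
    """Jaccard similarity between two sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical follower sets of the sender and receiver of a tweet.
sender_followers = {"u1", "u2", "u3", "u4"}
receiver_followers = {"u3", "u4", "u5"}

print(jaccard_index(sender_followers, receiver_followers))  # 2 shared / 5 total = 0.4
```

A low overlap between the two neighborhoods would then serve as one signal that a directed message is unsolicited.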
We develop new polynomial methods for studying systems of word equations.
We use them to improve some earlier results and to analyze how sizes of systems of word equations satisfying certain independence properties depend on the lengths of the equations.
These methods give the first nontrivial upper bounds for the sizes of the systems.
Quadrature sampling has been widely applied in coherent radar systems to extract in-phase and quadrature (I and Q) components in the received radar signal.
However, the sampling is inefficient because the received signal contains only a small number of significant target signals.
This paper incorporates the compressive sampling (CS) theory into the design of the quadrature sampling system, and develops a quadrature compressive sampling (QuadCS) system to acquire the I and Q components with low sampling rate.
The QuadCS system first randomly projects the received signal into a compressive bandpass signal and then utilizes the quadrature sampling to output compressive I and Q components.
The compressive outputs are used to reconstruct the I and Q components.
To understand the system performance, we establish the frequency domain representation of the QuadCS system.
With the waveform-matched dictionary, we prove that the QuadCS system satisfies the restricted isometry property with overwhelming probability.
For K target signals in the observation interval T, simulations show that the QuadCS requires just O(Klog(BT/K)) samples to stably reconstruct the signal, where B is the signal bandwidth.
The reconstructed signal-to-noise ratio decreases by 3dB for every octave increase in the target number K and increases by 3dB for every octave increase in the compressive bandwidth.
Theoretical analyses and simulations verify that the proposed QuadCS is a valid system to acquire the I and Q components in the received radar signals.
Rule-based modelling allows molecular interactions to be represented in a compact and natural way.
The underlying molecular dynamics, by the laws of stochastic chemical kinetics, behaves as a continuous-time Markov chain.
However, this Markov chain enumerates all possible reaction mixtures, rendering the analysis of the chain computationally demanding and often prohibitive in practice.
We here describe how it is possible to efficiently find a smaller, aggregate chain, which preserves certain properties of the original one.
Formal methods and lumpability notions are used to define algorithms for automated and efficient construction of such smaller chains (without ever constructing the original ones).
We here illustrate the method on an example and we discuss the applicability of the method in the context of modelling large signalling pathways.
Hand gesture recognition has extensive applications in virtual reality, sign language recognition, and computer games.
The direct interface of hand gestures provides a new way of communicating with the virtual environment.
In this paper, a novel real-time approach to hand gesture recognition is presented.
In the suggested method, the hand gesture is first extracted from the main image by image segmentation and morphological operations, and then sent to the feature extraction stage.
In the feature extraction stage, the cross-correlation coefficient is applied to the gesture to recognize it.
The proposed approach is applied to an American Sign Language (ASL) database, and an accuracy rate of 98.34% is obtained.
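As a rough illustration of the feature-extraction step, a normalized cross-correlation coefficient between a segmented gesture mask and a stored template can be computed as below; the binarized masks and the single-template comparison are simplifying assumptions, not the paper's exact pipeline:

```python
import numpy as np

def ncc(image, template):
    """Normalized cross-correlation coefficient between two equal-size arrays."""
    a = image - image.mean()
    b = template - template.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

# Hypothetical binarized gesture masks (1 = hand pixel) after segmentation
# and morphological cleanup.
gesture = np.array([[0, 1, 1],
                    [0, 1, 0],
                    [1, 1, 0]], dtype=float)
template = gesture.copy()

score = ncc(gesture, template)  # close to 1.0 for an exact match
```

In a full recognizer, the segmented gesture would be scored against one template per ASL sign and assigned the label of the highest-scoring template.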
Through a series of examples, we illustrate some important drawbacks that the action logic framework suffers from in its ability to represent the dynamics of information updates.
We argue that these problems stem from the fact that the action model, a central construct designed to encode agents' uncertainty about actions, is itself effectively common knowledge amongst the agents.
In response to these difficulties, we motivate and propose an alternative semantics that avoids them by (roughly speaking) endogenizing the action model.
We discuss the relationship to action logic, and provide a sound and complete axiomatization.
Building multi-turn information-seeking conversation systems is an important and challenging research topic.
Although several advanced neural text matching models have been proposed for this task, they are generally not efficient for industrial applications.
Furthermore, they rely on a large amount of labeled data, which may not be available in real-world applications.
To alleviate these problems, we study transfer learning for multi-turn information seeking conversations in this paper.
We first propose an efficient and effective multi-turn conversation model based on convolutional neural networks.
After that, we extend our model to adapt the knowledge learned from a resource-rich domain to enhance the performance.
Finally, we deployed our model in an industrial chatbot called AliMe Assist (https://consumerservice.taobao.com/online-help) and observed a significant improvement over the existing online model.
We present a method to determine Fashion DNA, coordinate vectors locating fashion items in an abstract space.
Our approach is based on a deep neural network architecture that ingests curated article information such as tags and images, and is trained to predict sales for a large set of frequent customers.
In the process, a dual space of customer style preferences naturally arises.
Interpretation of the metric of these spaces is straightforward: The product of Fashion DNA and customer style vectors yields the forecast purchase likelihood for the customer-item pair, while the angle between Fashion DNA vectors is a measure of item similarity.
Importantly, our models are able to generate unbiased purchase probabilities for fashion items based solely on article information, even in absence of sales data, thus circumventing the "cold-start problem" of collaborative recommendation approaches.
Likewise, they generalize easily and reliably to customers outside the training set.
We experiment with Fashion DNA models based on visual and/or tag item data, evaluate their recommendation power, and discuss the resulting article similarities.
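The interplay of the two spaces can be sketched as follows; the dimensions, the random stand-in embeddings, and the sigmoid link from inner product to purchase likelihood are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the learned embeddings: item Fashion DNA vectors
# and customer style vectors living in the same 8-dimensional abstract space.
fashion_dna = rng.normal(size=(5, 8))   # 5 items
style_vecs = rng.normal(size=(3, 8))    # 3 customers

# Forecast purchase likelihood for each customer-item pair: here a sigmoid of
# the inner product (the sigmoid link is an assumption for illustration).
logits = style_vecs @ fashion_dna.T          # shape (3, 5)
likelihood = 1.0 / (1.0 + np.exp(-logits))

# Item similarity: cosine of the angle between Fashion DNA vectors.
unit = fashion_dna / np.linalg.norm(fashion_dna, axis=1, keepdims=True)
cos_sim = unit @ unit.T                      # shape (5, 5), diagonal = 1
```

Because the item score depends only on the Fashion DNA vector, a brand-new article with tags and images (but no sales history) can be scored the same way, which is how the cold-start problem is sidestepped.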
To analyze the failure risk of asynchronous digital circuits, the time parameter is introduced into Boolean algebra, replacing the arithmetic operations by logical ones.
An example of constructing the signals passing through logic elements is considered, using the mathematical apparatus described below.
In this paper, we propose a novel edge preserving and multi-scale contextual neural network for salient object detection.
The proposed framework aims to address two limitations of existing CNN-based methods.
First, region-based CNN methods lack sufficient context to accurately locate salient objects, since they deal with each region independently.
Second, pixel-based CNN methods suffer from blurry boundaries due to the presence of convolutional and pooling layers.
Motivated by these, we first propose an end-to-end edge-preserved neural network based on Fast R-CNN framework (named RegionNet) to efficiently generate saliency map with sharp object boundaries.
Later, to further improve it, multi-scale spatial context is attached to RegionNet to consider the relationship between regions and the global scenes.
Furthermore, our method can be generally applied to RGB-D saliency detection by depth refinement.
The proposed framework achieves both clear detection boundary and multi-scale contextual robustness simultaneously for the first time, and thus achieves an optimized performance.
Experiments on six RGB and two RGB-D benchmark datasets demonstrate that the proposed method achieves state-of-the-art performance.
The surprising results of Karp, Vazirani and Vazirani and (respectively) Buchbinder et al. are examples where rather simple randomizations provide provably better approximations than the corresponding deterministic counterparts for online bipartite matching and (respectively) unconstrained non-monotone submodular maximization.
We show that seemingly strong extensions of the deterministic online computation model can at best match the performance of naive randomization.
More specifically, for bipartite matching, we show that in the priority model (allowing very general ways to order the input stream), we cannot improve upon the trivial 1/2-approximation achieved by any greedy maximal matching algorithm and likewise cannot improve upon this approximation by any log n/log log n number of online algorithms running in parallel.
The latter result yields an improved log log n - log log log n lower bound for the number of advice bits needed.
For max-sat, we adapt the recent de-randomization approach of Buchbinder and Feldman applied to the Buchbinder et al algorithm for max-sat to obtain a deterministic 3/4-approximation algorithm using width 2n parallelism.
In order to improve upon this approximation, we show that exponential width parallelism of online algorithms is necessary (in a model that is more general than what is needed for the width 2n algorithm).
This paper focuses on a novel and challenging vision task, dense video captioning, which aims to automatically describe a video clip with multiple informative and diverse caption sentences.
The proposed method is trained without explicit annotation of fine-grained sentence to video region-sequence correspondence, but is only based on weak video-level sentence annotations.
It differs from existing video captioning systems in three technical aspects.
First, we propose lexical fully convolutional neural networks (Lexical-FCN) with weakly supervised multi-instance multi-label learning to weakly link video regions with lexical labels.
Second, we introduce a novel submodular maximization scheme to generate multiple informative and diverse region-sequences based on the Lexical-FCN outputs.
A winner-takes-all scheme is adopted to weakly associate sentences to region-sequences in the training phase.
Third, a sequence-to-sequence learning based language model is trained with the weakly supervised information obtained through the association process.
We show that the proposed method can not only produce informative and diverse dense captions, but also outperform state-of-the-art single video captioning methods by a large margin.
Real-world optimisation problems are often dynamic.
Previously good solutions must be updated or replaced due to changes in objectives and constraints.
It is often claimed that evolutionary algorithms are particularly suitable for dynamic optimisation because a large population can contain different solutions that may be useful in the future.
However, rigorous theoretical demonstrations for how populations in dynamic optimisation can be essential are sparse and restricted to special cases.
This paper provides theoretical explanations of how populations can be essential in evolutionary dynamic optimisation in a general and natural setting.
We describe a natural class of dynamic optimisation problems where a sufficiently large population is necessary to keep track of moving optima reliably.
We establish a relationship between the population-size and the probability that the algorithm loses track of the optimum.
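A minimal toy experiment (not the paper's formal setting) illustrating how a population can be used to track a moving optimum; the drift model, mutation operator, and parameters are all illustrative assumptions:

```python
import random

def track_moving_optimum(pop_size, steps=300, move_prob=0.3, seed=0):
    """Toy dynamic optimisation: a (mu+1)-style population of integers chases
    an integer optimum that drifts over time; returns the fraction of steps
    at which at least one individual sits exactly on the optimum."""
    rng = random.Random(seed)
    optimum, pop, hits = 0, [0] * pop_size, 0
    for _ in range(steps):
        if rng.random() < move_prob:
            optimum += rng.choice([-1, 1])              # the optimum moves
        child = rng.choice(pop) + rng.choice([-1, 1])   # mutate a parent
        # Keep the pop_size individuals closest to the current optimum.
        pop = sorted(pop + [child], key=lambda v: abs(v - optimum))[:pop_size]
        hits += any(v == optimum for v in pop)
    return hits / steps

print(track_moving_optimum(1), track_moving_optimum(20))
```

Varying `pop_size` in such a toy model gives an empirical feel for the population-size versus tracking-reliability relationship that the paper establishes rigorously.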
Due to various green initiatives, renewable energy will be massively incorporated into the future smart grid.
However, the intermittency of the renewables may result in power imbalance, thus adversely affecting the stability of a power system.
Frequency regulation may be used to maintain the power balance at all times.
As electric vehicles (EVs) become popular, they may be connected to the grid to form a vehicle-to-grid (V2G) system.
An aggregation of EVs can be coordinated to provide frequency regulation services.
However, V2G is a dynamic system where the participating EVs come and go independently.
Thus it is not easy to estimate the regulation capacities for V2G.
In a preliminary study, we modeled an aggregation of EVs with a queueing network, whose structure allows us to estimate the capacities for regulation-up and regulation-down, separately.
The estimated capacities from the V2G system can be used for establishing a regulation contract between an aggregator and the grid operator, and facilitating a new business model for V2G.
In this paper, we extend our previous development by designing a smart charging mechanism which can adapt to given characteristics of the EVs and make the performance of the actual system follow the analytical model.
In this paper, we focus on image inpainting task, aiming at recovering the missing area of an incomplete image given the context information.
Recent developments in deep generative models enable an efficient end-to-end framework for image synthesis and inpainting tasks, but existing methods based on generative models do not exploit segmentation information to constrain object shapes, which usually leads to blurry results on the boundary.
To tackle this problem, we propose to introduce the semantic segmentation information, which disentangles the inter-class difference and intra-class variation for image inpainting.
This leads to much clearer recovered boundary between semantically different regions and better texture within semantically consistent segments.
Our model factorizes the image inpainting process into segmentation prediction (SP-Net) and segmentation guidance (SG-Net) as two steps, which predict the segmentation labels in the missing area first, and then generate segmentation guided inpainting results.
Experiments on multiple public datasets show that our approach outperforms existing methods in optimizing the image inpainting quality, and the interactive segmentation guidance provides possibilities for multi-modal predictions of image inpainting.
We introduce a seemingly impossible task: given only an audio clip of someone speaking, decide which of two face images is the speaker.
In this paper we study this, and a number of related cross-modal tasks, aimed at answering the question: how much can we infer from the voice about the face and vice versa?
We study this task "in the wild", employing the datasets that are now publicly available for face recognition from static images (VGGFace) and speaker identification from audio (VoxCeleb).
These provide training and testing scenarios for both static and dynamic testing of cross-modal matching.
We make the following contributions: (i) we introduce CNN architectures for both binary and multi-way cross-modal face and audio matching, (ii) we compare dynamic testing (where video information is available, but the audio is not from the same video) with static testing (where only a single still image is available), and (iii) we use human testing as a baseline to calibrate the difficulty of the task.
We show that a CNN can indeed be trained to solve this task in both the static and dynamic scenarios, and is even well above chance on 10-way classification of the face given the voice.
The CNN matches human performance on easy examples (e.g. different gender across faces) but exceeds human performance on more challenging examples (e.g. faces with the same gender, age and nationality).
The rapid uptake of mobile devices and the rising popularity of mobile applications and services pose unprecedented demands on mobile and wireless networking infrastructure.
Upcoming 5G systems are evolving to support exploding mobile traffic volumes, agile management of network resource to maximize user experience, and extraction of fine-grained real-time analytics.
Fulfilling these tasks is challenging, as mobile environments are increasingly complex, heterogeneous, and evolving.
One potential solution is to resort to advanced machine learning techniques to help manage the rise in data volumes and algorithm-driven applications.
The recent success of deep learning underpins new and powerful tools that tackle problems in this space.
In this paper we bridge the gap between deep learning and mobile and wireless networking research, by presenting a comprehensive survey of the crossovers between the two areas.
We first briefly introduce essential background and state-of-the-art in deep learning techniques with potential applications to networking.
We then discuss several techniques and platforms that facilitate the efficient deployment of deep learning onto mobile systems.
Subsequently, we provide an encyclopedic review of mobile and wireless networking research based on deep learning, which we categorize by different domains.
Drawing from our experience, we discuss how to tailor deep learning to mobile environments.
We complete this survey by pinpointing current challenges and open future directions for research.
It has been shown that increasing model depth improves the quality of neural machine translation.
However, different architectural variants to increase model depth have been proposed, and so far, there has been no thorough comparative study.
In this work, we describe and evaluate several existing approaches to introduce depth in neural machine translation.
Additionally, we explore novel architectural variants, including deep transition RNNs, and we vary how attention is used in the deep decoder.
We introduce a novel "BiDeep" RNN architecture that combines deep transition RNNs and stacked RNNs.
Our evaluation is carried out on the English to German WMT news translation dataset, using a single-GPU machine for both training and inference.
We find that several of our proposed architectures improve upon existing approaches in terms of speed and translation quality.
We obtain best improvements with a BiDeep RNN of combined depth 8, obtaining an average improvement of 1.5 BLEU over a strong shallow baseline.
We release our code for ease of adoption.
We prove that every 1-planar graph G has a z-parallel visibility representation, i.e., a 3D visibility representation in which the vertices are isothetic disjoint rectangles parallel to the xy-plane, and the edges are unobstructed z-parallel visibilities between pairs of rectangles.
In addition, the constructed representation is such that there is a plane that intersects all the rectangles, and this intersection defines a bar 1-visibility representation of G.
Gaussian mixture alignment is a family of approaches that are frequently used for robustly solving the point-set registration problem.
However, since they use local optimisation, they are susceptible to local minima and can only guarantee local optimality.
Consequently, their accuracy is strongly dependent on the quality of the initialisation.
This paper presents the first globally-optimal solution to the 3D rigid Gaussian mixture alignment problem under the L2 distance between mixtures.
The algorithm, named GOGMA, employs a branch-and-bound approach to search the space of 3D rigid motions SE(3), guaranteeing global optimality regardless of the initialisation.
The geometry of SE(3) was used to find novel upper and lower bounds for the objective function and local optimisation was integrated into the scheme to accelerate convergence without voiding the optimality guarantee.
The evaluation empirically supported the optimality proof and showed that the method performed much more robustly on two challenging datasets than an existing globally-optimal registration solution.
Accurate prediction of students' knowledge is a fundamental building block of personalized learning systems.
Here, we propose a novel ensemble model to predict student knowledge gaps.
Applying our approach to student trace data from the online educational platform Duolingo, we achieved the highest score on both evaluation metrics for all three datasets in the 2018 Shared Task on Second Language Acquisition Modeling.
We describe our model and discuss the relevance of the task compared to how it would be set up in a production environment for personalized education.
Content marketing is today one of the most notable approaches in the marketing processes of companies.
The value of this kind of marketing has grown over time, thanks to the latest developments in computer and communication technologies.
Nowadays, social media based platforms in particular play a great role in enabling companies to design multimedia-oriented, interactive content.
On the other hand, there is still more to be done toward improved content marketing approaches.
In this context, the objective of this study is to focus on intelligent content marketing, which can be achieved by using artificial intelligence.
Artificial intelligence is today one of the most remarkable research fields, and it lends itself easily to multidisciplinary use.
So, this study aims to discuss its potential for improving content marketing.
In detail, the study enables readers to improve their awareness of the intersection of content marketing and artificial intelligence.
Furthermore, the authors introduce some example models of intelligent content marketing that can be realized using current Web technologies and artificial intelligence techniques.
Image quality assessment (IQA) is traditionally classified into full-reference (FR) IQA and no-reference (NR) IQA according to whether the original image is required.
Although NR-IQA is widely used in practical applications, room for improvement remains because of the lack of a reference image.
Inspired by the fact that in many applications, such as parameter selection, a series of distorted images are available, the authors propose a novel comparison-based image quality assessment (C-IQA) method.
The new comparison-based framework parallels FR-IQA by requiring two input images, and resembles NR-IQA by not using the original image.
As a result, the new comparison-based approach has more application scenarios than FR-IQA does, and takes greater advantage of the accessible information than the traditional single-input NR-IQA does.
Further, C-IQA is compared with other state-of-the-art NR-IQA methods on two widely used IQA databases.
Experimental results show that C-IQA outperforms the other NR-IQA methods for parameter selection, and the parameter trimming framework combined with C-IQA saves the computation of iterative image reconstruction up to 80%.
We investigate the maximum coding rate for a given average blocklength and error probability over a K-user discrete memoryless broadcast channel for the scenario where a common message is transmitted using variable-length stop-feedback codes.
For the point-to-point case, Polyanskiy et al.(2011) demonstrated that variable-length coding combined with stop-feedback significantly increases the speed of convergence of the maximum coding rate to capacity.
This speed-up manifests itself in the absence of a square-root penalty in the asymptotic expansion of the maximum coding rate for large blocklengths, i.e., zero dispersion.
In this paper, we present nonasymptotic achievability and converse bounds on the maximum coding rate of the common-message K-user discrete memoryless broadcast channel, which strengthen and generalize the ones reported in Trillingsgaard et al.(2015) for the two-user case.
An asymptotic analysis of these bounds reveals that zero dispersion cannot be achieved for certain common-message broadcast channels (e.g., the binary symmetric broadcast channel).
Furthermore, we identify conditions under which our converse and achievability bounds are tight up to the second order.
Through numerical evaluations, we illustrate that our second-order expansions approximate accurately the maximum coding rate and that the speed of convergence to capacity is indeed slower than for the point-to-point case.
With the popularity of mobile devices and the development of geo-positioning technology, location-based services (LBS) attract much attention and top-k spatial keyword queries become increasingly complex.
It is common for clients to issue a query to find, for example, a restaurant serving pizza and steak that is particularly low in price and noise level.
However, most prior works have focused only on the spatial keyword while ignoring these independent numerical attributes.
In this paper we demonstrate, for the first time, the Attributes-Aware Spatial Keyword Query (ASKQ), and devise a two-layer hybrid index structure called Quad-cluster Dual-filtering R-Tree (QDR-Tree).
In the keyword cluster layer, a Quad-Cluster Tree (QC-Tree) is built based on the hierarchical clustering algorithm using kernel k-means to classify keywords.
In the spatial layer, for each leaf node of the QC-Tree, we attach a Dual-Filtering R-Tree (DR-Tree) with two filtering algorithms, namely, keyword bitmap-based and attributes skyline-based filtering.
Accordingly, efficient query processing algorithms are proposed.
Through theoretical analysis, we have verified the optimization both in processing time and space consumption.
Finally, massive experiments with real-data demonstrate the efficiency and effectiveness of QDR-Tree.
During the last two decades, there has been prolific growth in chaos-based image encryption algorithms.
To an extent, these algorithms have been able to provide an alternative for exchanging large media files (images and videos) over networks in a secure way.
However, there have been some issues with the implementation of chaos based image ciphers in practice.
One of them is reduced/small key space due to the fact that chaotic behavior is only observed for certain range of system parameters/initial conditions of the chaotic system used in such algorithms.
To overcome this difficulty, we propose a simple, efficient and robust image encryption algorithm based on combined applications of quasigroups and chaotic standard map.
The proposed image cipher is based on the popular substitution-diffusion architecture (Shannon), where a quasigroup of order 256 and the chaotic standard map are used for the substitution and permutation of image pixels, respectively.
Due to the introduction of quasigroup as part of the secret key along with the parameter and initial conditions of the chaotic standard map, the key space has been increased significantly.
The proposed image cipher is very fast because the substitution based on the quasigroup operations is very simple and can be executed easily through lookup-table operations on Latin squares (which are the Cayley operation tables of quasigroups), while the permutation is performed row-by-row as well as column-by-column using the pseudo-random number sequences generated by the chaotic standard map.
The security and performance have been analyzed through histograms, correlation coefficients, information entropy, key sensitivity analysis, differential analysis, key space analysis, etc., and the results demonstrate the efficiency and robustness of the proposed image cipher against possible security threats.
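A heavily simplified single-round sketch of the substitution-permutation idea; the cyclic-group Latin square, the key parameters, and the flat (rather than row/column-wise) permutation are illustrative assumptions, not the cipher's actual construction:

```python
import numpy as np

def latin_square(n=256):
    """Cayley table of the cyclic group Z_n: a simple quasigroup of order n."""
    idx = np.arange(n)
    return (idx[:, None] + idx[None, :]) % n

def chaotic_permutation(size, x=0.1, y=0.3, k=18.0):
    """Permutation indices derived from iterates of the chaotic standard map."""
    vals = []
    for _ in range(size):
        y = (y + k * np.sin(x)) % (2 * np.pi)
        x = (x + y) % (2 * np.pi)
        vals.append(x)
    return np.argsort(vals)

def encrypt(img, key_row=37):
    q = latin_square()
    sub = q[key_row, img]                # substitution: quasigroup table lookup
    perm = chaotic_permutation(sub.size)
    return sub.reshape(-1)[perm].reshape(img.shape)  # chaotic permutation

img = np.arange(16).reshape(4, 4)        # toy 4x4 "image" with pixel values 0..15
cipher = encrypt(img)
```

Each Latin-square row is itself a permutation of 0..255, so the substitution step is invertible given the key row, and the permutation can be undone by inverting the argsort indices.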
In cloud computing systems slow processing nodes, often referred to as "stragglers", can significantly extend the computation time.
Recent results have shown that error correction coding can be used to reduce the effect of stragglers.
In this work we introduce a scheme that, in addition to using error correction to distribute mixed jobs across nodes, is also able to exploit the work completed by all nodes, including stragglers.
We first consider vector-matrix multiplication and apply maximum distance separable (MDS) codes to small blocks of sub-matrices.
The worker nodes process blocks sequentially, working block-by-block, transmitting partial per-block results to the master as they are completed.
Sub-blocking allows a more continuous completion process, which thereby allows us to exploit the work of a much broader spectrum of processors and reduces computation time.
We then apply this technique to matrix-matrix multiplication using product code.
In this case, we show that the order of computing sub-tasks is a new degree of design freedom that can be exploited to reduce computation time further.
We propose a novel approach to analyze the finishing time, which is different from typical order statistics.
Simulation results show that the expected computation time decreases by a factor of at least two compared to previous methods.
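The block-coding idea behind the scheme can be sketched for matrix-vector products with an (n, k) MDS code; the real-valued Vandermonde generator and the toy sizes below are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def mds_encode(A, n, k):
    """Row-split A into k blocks and encode them into n coded blocks with a
    Vandermonde (MDS) generator, so results from any k workers suffice."""
    blocks = np.split(A, k)
    G = np.vander(np.arange(1.0, n + 1), k, increasing=True)   # n x k generator
    return [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)], G

def mds_decode(partials, worker_ids, G):
    """Invert the k x k sub-generator to recover the uncoded block products."""
    coeff_inv = np.linalg.inv(G[worker_ids, :])
    recovered = coeff_inv @ np.stack(partials)   # k uncoded block products
    return np.concatenate(recovered)

# Toy example: compute y = A @ x with n = 4 workers, any k = 2 of which suffice.
A = np.arange(16.0).reshape(8, 2)
x = np.array([1.0, 2.0])
coded, G = mds_encode(A, n=4, k=2)
partials = [coded[i] @ x for i in (0, 3)]        # only workers 0 and 3 finished
y = mds_decode(partials, [0, 3], G)              # matches A @ x
```

The paper's refinement is to apply such codes to small sub-blocks so that even stragglers' partially completed blocks contribute, rather than discarding slow workers entirely.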
This paper is devoted to the factorization of multivariate polynomials into products of linear forms, a problem which has applications to differential algebra, to the resolution of systems of polynomial equations and to Waring decomposition (i.e., decomposition in sums of d-th powers of linear forms; this problem is also known as symmetric tensor decomposition).
We provide three black box algorithms for this problem.
Our main contribution is an algorithm motivated by the application to Waring decomposition.
This algorithm reduces the corresponding factorization problem to simultaneous matrix diagonalization, a standard task in linear algebra.
The algorithm relies on ideas from invariant theory, and more specifically on Lie algebras.
Our second algorithm reconstructs a factorization from several bi-variate projections.
Our third algorithm reconstructs it from the determination of the zero set of the input polynomial, which is a union of hyperplanes.
The goal of this work is to improve images of traffic scenes that are degraded by natural causes such as fog, rain and limited visibility during the night.
For these applications, it is next to impossible to obtain pixel-perfect pairs of the same scene with and without the degrading conditions.
This makes conventional supervised learning approaches unsuitable; however, it is easy to collect unpaired images of the scenes in a clear and in a degraded condition.
To enhance the images taken in a poor visibility condition, domain transfer models can be trained to transform an image from the degraded to the clear domain.
A well-known concept for unsupervised domain transfer is the cycle-consistent generative adversarial model.
Unfortunately, the resulting generators often change the structure of the scene.
This causes an undesirable change in the semantics.
We propose three ways to cope with this problem depending on the type of degradation.
A multiple-input multiple-output (MIMO) version of the dirty paper channel is studied, where the channel input and the dirt experience the same fading process and the fading channel state is known at the receiver (CSIR).
This represents settings where signal and interference sources are co-located, such as in the broadcast channel.
First, a variant of Costa's dirty paper coding (DPC) is presented, whose achievable rates are within a constant gap to capacity for all signal and dirt powers.
Additionally, a lattice coding and decoding scheme is proposed, whose decision regions are independent of the channel realizations.
Under Rayleigh fading, the gap to capacity of the lattice coding scheme vanishes with the number of receive antennas, even at finite Signal-to-Noise Ratio (SNR).
Thus, although the capacity of the fading dirty paper channel remains unknown, this work shows it is not far from its dirt-free counterpart.
The insights from the dirty paper channel directly lead to transmission strategies for the two-user MIMO broadcast channel (BC), where the transmitter emits a superposition of desired and undesired (dirt) signals with respect to each receiver.
The performance of the lattice coding scheme is analyzed under different fading dynamics for the two users, showing that high-dimensional lattices achieve rates close to capacity.
SRAM-based FPGAs are increasingly popular in the aerospace industry due to their field programmability and low cost.
However, they suffer from cosmic radiation induced Single Event Upsets (SEUs).
In safety-critical applications, the dependability of the design is a prime concern since failures may have catastrophic consequences.
An early analysis of the relationship between dependability metrics, performability-area trade-off, and different mitigation techniques for such applications can reduce the design effort while increasing the design confidence.
This paper introduces a novel methodology, based on probabilistic model checking, for the analysis of the reliability, availability, safety and performance-area tradeoffs of safety-critical systems for early design decisions.
Starting from the high-level description of a system, a Markov reward model is constructed from the Control Data Flow Graph (CDFG) and a component characterization library targeting FPGAs.
The proposed model and exhaustive analysis capture all the failure states (based on the fault detection coverage) and repairs possible in the system.
We present quantitative results based on an FIR filter circuit to illustrate the applicability of the proposed approach and to demonstrate that a wide range of useful dependability and performability properties can be analyzed using the proposed methodology.
The modeling results show the relationship between different mitigation techniques and fault detection coverage, exposing their direct impact on the design for early decisions.
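As minimal background for the Markov reward modeling above, the steady-state availability of a single repairable component can be computed as follows (the failure and repair rates below are hypothetical, not from the paper):

```python
# Minimal background sketch (hypothetical rates): steady-state availability
# of a 2-state (up/down) Markov model, with SEU-induced failure rate lam
# and repair (e.g., scrubbing) rate mu.
def steady_state_availability(lam, mu):
    """Long-run fraction of time in the 'up' state: mu / (lam + mu)."""
    return mu / (lam + mu)

avail = steady_state_availability(lam=1e-4, mu=1e-1)
print(avail)  # close to 1 when repair is much faster than failure
```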
This paper considers a scenario in which a source-destination pair needs to establish a confidential connection against an external eavesdropper, aided by the interference generated by another source-destination pair that exchanges public messages.
The goal is to compute the maximum achievable secrecy degrees of freedom (S.D.o.F) region of a MIMO two-user wiretap network.
First, a cooperative secrecy transmission scheme is proposed, whose feasible set is shown to achieve all S.D.o.F. pairs on the S.D.o.F. region boundary.
In this way, the determination of the S.D.o.F. region is reduced to a problem of maximizing the S.D.o.F. pair over the proposed transmission scheme.
The maximum achievable S.D.o.F. region boundary points are obtained in closed form, and the construction of the precoding matrices achieving the maximum S.D.o.F. region boundary is provided.
The obtained analytical expressions clearly show the relation between the maximum achievable S.D.o.F. region and the number of antennas at each terminal.
A synchronizing word of a deterministic automaton is a word over the alphabet of colors (considered as letters) of its edges that maps the automaton to a single state.
A coloring of the edges of a directed graph is synchronizing if it turns the graph into a deterministic finite automaton possessing a synchronizing word.
The road coloring problem asks for a synchronizing coloring of a finite, strongly connected directed graph with constant out-degree of all its vertices, provided the greatest common divisor of the lengths of all its cycles is one.
The problem was posed by Adler, Goodwyn and Weiss over 30 years ago and evoked noticeable interest among specialists in the theory of graphs, deterministic automata and symbolic dynamics.
The problem is even described in Wikipedia, the popular Internet encyclopedia.
The positive solution of the road coloring problem is presented.
A paraphrase is a restatement of the meaning of a text in other words.
Paraphrases have been studied to enhance the performance of many natural language processing tasks.
In this paper, we propose a novel task iParaphrasing to extract visually grounded paraphrases (VGPs), which are different phrasal expressions describing the same visual concept in an image.
These extracted VGPs have the potential to improve language and image multimodal tasks such as visual question answering and image captioning.
How to model the similarity between VGPs is key to iParaphrasing.
We apply various existing methods as well as propose a novel neural network-based method with image attention, and report the results of the first attempt toward iParaphrasing.
In horizontal collaborations, carriers form coalitions in order to perform parts of their logistics operations jointly.
By exchanging transportation requests among each other, they can operate more efficiently and in a more sustainable way.
Collaborative vehicle routing has been extensively discussed in the literature.
We identify three major streams of research: (i) centralized collaborative planning, (ii) decentralized planning without auctions, and (iii) auction-based decentralized planning.
For each of them we give a structured overview on the state of knowledge and discuss future research directions.
Estimation of social influence in networks can be substantially biased in observational studies due to homophily and network correlation in exposure to exogenous events.
Randomized experiments, in which the researcher intervenes in the social system and uses randomization to determine how to do so, provide a methodology for credibly estimating causal effects of social behaviors.
In addition to addressing questions central to the social sciences, these estimates can form the basis for effective marketing and public policy.
In this review, we discuss the design space of experiments to measure social influence through combinations of interventions and randomizations.
We define an experiment as a combination of (1) a target population of individuals connected by an observed interaction network, (2) a set of treatments whereby the researcher will intervene in the social system, (3) a randomization strategy which maps individuals or edges to treatments, and (4) a measurement of an outcome of interest after treatment has been assigned.
We review experiments that demonstrate potential experimental designs and we evaluate their advantages and tradeoffs for answering different types of causal questions about social influence.
We show how randomization also provides a basis for statistical inference when analyzing these experiments.
Nested relational query languages have been explored extensively, and underlie industrial language-integrated query systems such as Microsoft's LINQ.
However, relational databases do not natively support nested collections in query results.
This can lead to major performance problems: if programmers write queries that yield nested results, then such systems typically either fail or generate a large number of queries.
We present a new approach to query shredding, which converts a query returning nested data to a fixed number of SQL queries.
Our approach, in contrast to prior work, handles multiset semantics, and generates an idiomatic SQL:1999 query directly from a normal form for nested queries.
We provide a detailed description of our translation and present experiments showing that it offers comparable or better performance than a recent alternative approach on a range of examples.
Suppose that Alice and Bob are each given an infinite string, and they want to decide whether their two strings are in a given relation.
How much communication do they need?
How can communication be even defined and measured for infinite strings?
In this article, we propose a formalism for a notion of infinite communication complexity, prove that it satisfies some natural properties and coincides, for relevant applications, with the classical notion of amortized communication complexity.
Moreover, an application is given to tackling a conjecture about tilings and multidimensional sofic shifts.
The letter 'E' (for electronic) has transformed everything in this world, as well as the whole globe itself.
To a great extent, this shift helps build an eco-friendly, green world.
In the educational field, the electronic medium has played a major role.
It has influenced and changed almost every component of education into electronic form, such as e-books, online courses, etc.
Throughout the world, leading universities voluntarily offer online courses.
Generally, we refer to these as Massive Open Online Courses (MOOCs).
There are many debates going on about the success and consequences of MOOCs.
Many highlight that these courses are self-paced and economical, and that they provide quality training to all, irrespective of geographical constraints.
But many other academics contest these points and list many other disadvantages of MOOCs.
This paper explores the basics of MOOCs in its initial section.
The following section deals with the advantages and disadvantages of MOOCs in general.
We, the researchers, collected details about the awareness of MOOCs among teachers and students at a higher education institution in Oman, as well as details about the implementation and usage of MOOCs within Omani educational society.
Based on the collected information, we evaluate and present findings about the impact of MOOCs on Omani higher education.
We believe that, with appropriate improvements, MOOCs may become an important medium in Omani educational institutions.
The suggestions are listed in the discussion and recommendation section.
An intriguing open question is whether measurements made on Big Data recording human activities can yield high-fidelity proxies of socio-economic development and well-being.
Can we monitor and predict the socio-economic development of a territory just by observing the behavior of its inhabitants through the lens of Big Data?
In this paper, we design a data-driven analytical framework that uses mobility measures and social measures extracted from mobile phone data to estimate indicators for socio-economic development and well-being.
We discover that the diversity of mobility, defined in terms of entropy of the individual users' trajectories, exhibits (i) significant correlation with two different socio-economic indicators and (ii) the highest importance in predictive models built to predict the socio-economic indicators.
Our analytical framework opens an interesting perspective to study human behavior through the lens of Big Data by means of new statistical indicators that quantify and possibly "nowcast" the well-being and the socio-economic development of a territory.
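The entropy-based diversity measure can be sketched as follows (a simplified formulation over discrete visit counts, assumed here for illustration; the paper's measure is defined over individual trajectories from mobile phone data):

```python
# Simplified sketch of mobility diversity: Shannon entropy of a user's
# visit frequencies over discretized locations.
from collections import Counter
from math import log2

def trajectory_entropy(visits):
    counts = Counter(visits)
    n = len(visits)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A user spread uniformly over 4 cells has maximal diversity (2 bits);
# a user who never leaves one cell has zero diversity.
print(trajectory_entropy(['a', 'b', 'c', 'd']))  # 2.0
print(trajectory_entropy(['a', 'a', 'a', 'a']))  # 0.0
```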
An automated approach to text readability assessment is essential to a language and can be a powerful tool for improving the understandability of texts written and published in that language.
However, the Persian language, which is spoken by over 110 million speakers, lacks such a system.
Unlike other languages such as English, French, and Chinese, very limited research studies have been carried out to build an accurate and reliable text readability assessment system for the Persian language.
In the present research, the first Persian dataset for text readability assessment was gathered and the first model for Persian text readability assessment using machine learning was introduced.
The experiments showed that this model was accurate and could assess the readability of Persian texts with a high degree of confidence.
The results of this study can be used in a number of applications such as medical and educational text readability evaluation and have the potential to be the cornerstone of future studies in Persian text readability assessment.
The Deep Learning NLP domain lacks procedures for analyzing model robustness.
In this paper we propose a framework which validates robustness of any Question Answering model through model explainers.
We propose that a robust model should transgress the initial notion of semantic similarity induced by word embeddings to learn a more human-like understanding of meaning.
We test this property by manipulating questions in two ways: swapping an important question word for (1) its semantically correct synonym and (2) a word that is close in embedding space.
We estimate the importance of words in the asked questions with the Locally Interpretable Model Agnostic Explanations (LIME) method.
With these two steps we compare state-of-the-art Q&A models.
We show that although accuracy of state-of-the-art models is high, they are very fragile to changes in the input.
Moreover, we propose two adversarial training scenarios which raise model sensitivity to true synonyms by up to 7% in accuracy.
Our findings help to understand which models are more stable and how they can be improved.
In addition, we have created and published a new dataset that may be used for validation of robustness of a Q&A model.
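A much-simplified stand-in for the word-importance step might look like the following leave-one-out score drop (the keyword "model" below is a toy assumption, not any real Q&A system, and LIME itself fits a local surrogate rather than removing words):

```python
# Toy stand-in for explainer-based word importance: the drop in a model's
# score when a word is removed. The keyword scorer is a fabricated example.
def toy_score(question_words):
    keywords = {'capital', 'france'}  # hypothetical salient words
    return sum(w in keywords for w in question_words) / len(question_words)

def word_importance(words):
    base = toy_score(words)
    return {w: base - toy_score([u for u in words if u != w]) for w in words}

imp = word_importance(['what', 'is', 'the', 'capital', 'of', 'france'])
# 'capital' and 'france' score highest; these are the words whose swapping
# would most plausibly change the model's answer.
```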
In this report, some cosmological correlation functions are used to evaluate the differential performance between C2075 and P100 GPU cards.
In the past, the correlation functions used in this work have been widely studied and exploited on some previous GPU architectures.
The analysis of the performance indicates that a speedup in the range from 13 to 15 is achieved without any additional optimization process for the P100 card.
We analyse a quantum-like Bayesian Network that puts together cause/effect relationships and semantic similarities between events.
These semantic similarities constitute acausal connections according to the Synchronicity principle and provide new relationships to quantum-like probabilistic graphical models.
As a consequence, beliefs (or any other event) can be represented in vector spaces, in which quantum parameters are determined by the similarities that these vectors share between them.
Events attached by a semantic meaning do not need to have an explanation in terms of cause and effect.
An adversarial example is an example that has been adjusted to produce a wrong label when presented to a system at test time.
To date, adversarial example constructions have been demonstrated for classifiers, but not for detectors.
If adversarial examples that could fool a detector exist, they could be used to (for example) maliciously create security hazards on roads populated with smart vehicles.
In this paper, we demonstrate a construction that successfully fools two standard detectors, Faster RCNN and YOLO.
The existence of such examples is surprising: attacking a classifier is very different from attacking a detector, and the structure of detectors - which must search for their own bounding boxes, and which cannot estimate those boxes very accurately - makes it quite likely that adversarial patterns would be strongly disrupted.
We show that our construction produces adversarial examples that generalize well across sequences digitally, even though large perturbations are needed.
We also show that our construction yields physical objects that are adversarial.
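As background only, the general idea of constructing an adversarial example by stepping along the loss gradient can be sketched on a toy logistic-regression classifier (the weights below are illustrative assumptions; the paper's attack on Faster RCNN and YOLO is far more involved):

```python
# Background sketch: FGSM-style perturbation of a toy logistic-regression
# classifier. Weights and inputs are fabricated for illustration.
import numpy as np

w, b = np.array([2.0, -3.0]), 0.0

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def loss(x, y):
    """Cross-entropy loss for true label y."""
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

x, y = np.array([1.0, 1.0]), 1
grad_x = (sigmoid(w @ x + b) - y) * w   # gradient of the loss w.r.t. x
x_adv = x + 0.1 * np.sign(grad_x)       # small step that increases the loss
```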
Renewable energy prediction, and particularly global radiation forecasting, is a challenge studied by a growing number of research teams.
This paper proposes an original technique to model the insolation time series based on combining Artificial Neural Network (ANN) and Auto-Regressive and Moving Average (ARMA) model.
While ANN by its non-linear nature is effective to predict cloudy days, ARMA techniques are more dedicated to sunny days without cloud occurrences.
Thus, three hybrid models are suggested: the first simply uses ARMA for the 6 spring and summer months and an optimized ANN for the other part of the year; the second model is equivalent to the first but with seasonal learning; the last model switches depending on the error observed in the previous hour.
These models were used to forecast the hourly global radiation for five places in Mediterranean area.
The forecasting performance was compared among several models: the 3 models mentioned above, and the best ANN and ARMA for each location.
In the best configuration, the coupling of ANN and ARMA allows an improvement of more than 1%, with a maximum in autumn (3.4%) and a minimum in winter (0.9%) where ANN alone is the best.
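The first hybrid model's seasonal switching rule can be sketched as follows (taking April through September as the six spring/summer months is an assumption, and the predictors themselves are stand-in labels for the fitted ARMA and ANN models):

```python
# Sketch of the first hybrid's switching rule. The month range is an
# assumed reading of "6 months in spring and summer".
def select_predictor(month):
    """Use ARMA for the spring/summer half-year, the optimized ANN otherwise."""
    return 'ARMA' if 4 <= month <= 9 else 'ANN'

schedule = [select_predictor(m) for m in range(1, 13)]
print(schedule)
```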
Performance evaluation is a key issue for designers and users of Database Management Systems (DBMSs).
Performance is generally assessed with software benchmarks that help, e.g., test architectural choices, compare different technologies or tune a system.
In the particular context of data warehousing and On-Line Analytical Processing (OLAP), although the Transaction Processing Performance Council (TPC) aims at issuing standard decision-support benchmarks, few benchmarks do actually exist.
We present in this chapter the Data Warehouse Engineering Benchmark (DWEB), which allows generating various ad-hoc synthetic data warehouses and workloads.
DWEB is fully parameterized to fulfill various data warehouse design needs.
However, two levels of parameterization keep it relatively easy to tune.
We also expand on our previous work on DWEB by presenting its new Extract, Transform, and Load (ETL) feature as well as its new execution protocol.
A Java implementation of DWEB is freely available on-line, which can be interfaced with most existing relational DBMSs.
To the best of our knowledge, DWEB is the only easily available, up-to-date benchmark for data warehouses.
We propose an efficient and scalable method for incrementally building a dense, semantically annotated 3D map in real-time.
The proposed method assigns class probabilities to each region, not each element (e.g., surfel or voxel), of the 3D map, which is built up through a robust SLAM framework and incrementally segmented with a geometric-based segmentation method.
Unlike all other approaches, our method can run at over 30 Hz while performing all processing components, including SLAM, segmentation, 2D recognition, and updating the class probabilities of each segmentation label at every incoming frame, thanks to the high efficiency of the computationally intensive stages of our framework.
By utilizing a specifically designed CNN to improve the frame-wise segmentation result, we can also achieve high accuracy.
We validate our method on the NYUv2 dataset by comparing with the state of the art in terms of accuracy and computational efficiency, and by means of an analysis in terms of time and space complexity.
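The incremental update of per-segment class probabilities can be sketched as a Bayesian fusion of frame-wise predictions (the numbers below are illustrative, not from the paper):

```python
# Sketch: fuse frame-wise CNN class predictions into a segment's running
# class-probability estimate via Bayesian updating. Values are illustrative.
def update_class_probs(prior, likelihood):
    post = [p * l for p, l in zip(prior, likelihood)]
    z = sum(post)
    return [p / z for p in post]

probs = [1/3, 1/3, 1/3]               # segment prior over 3 classes
for frame_pred in ([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]):
    probs = update_class_probs(probs, frame_pred)
print(probs)  # mass concentrates on the repeatedly predicted class
```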
Guided troubleshooting is an inherent task in the domain of technical support services.
When a customer experiences an issue with the functioning of a technical service or a product, an expert user helps guide the customer through a set of steps comprising a troubleshooting procedure.
The objective is to identify the source of the problem through a set of diagnostic steps and observations, and arrive at a resolution.
Procedures containing these sets of diagnostic steps and observations in response to different problems are common artifacts in the body of technical support documentation.
The ability to use machine learning and linguistics to understand and leverage these procedures for applications like intelligent chatbots or robotic process automation is crucial.
Existing research on question answering or intelligent chatbots does not look within procedures or understand them deeply.
In this paper, we outline a system for mining procedures from technical support documents.
We create models for solving important subproblems like extraction of procedures, identifying decision points within procedures, identifying blocks of instructions corresponding to these decision points and mapping instructions within a decision block.
We also release a dataset containing our manual annotations on publicly available support documents, to promote further research on the problem.
The paper presents three self-stabilizing protocols for basic fair and reliable link communication primitives.
We assume a link-register communication model under read/write atomicity, where every process can read from but cannot write into its neighbours' registers.
The first primitive guarantees that any process writes a new value in its register(s) only after all its neighbours have read the previous value, whatever the initial scheduling of processes' actions.
The second primitive implements a "weak rendezvous" communication mechanism by using an alternating bit protocol: whenever a process consecutively writes n values (possibly the same ones) in a register, each neighbour is guaranteed to read each value from the register at least once.
On the basis of the previous protocol, the third primitive implements a "quasi rendezvous": in words, this primitive further ensures that there exists exactly one reading between two writing operations.
All protocols are self-stabilizing and run in asynchronous arbitrary networks.
The goal of the paper is in handling each primitive by a separate procedure, which can be used as a "black box" in more involved self-stabilizing protocols.
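The alternating-bit "weak rendezvous" idea can be sketched in a sequential simulation (a deliberate simplification of the read/write-atomicity model above, without asynchrony or stabilization from arbitrary states):

```python
# Sequential sketch of the alternating-bit "weak rendezvous": the writer
# publishes a new value only after the reader has acknowledged the previous
# one, so every written value (even a repeated one) is read at least once.
class Link:
    def __init__(self):
        self.value, self.bit, self.ack = None, 0, 0

def write(link, v):
    assert link.ack == link.bit        # neighbour has read the last value
    link.value, link.bit = v, 1 - link.bit

def read(link):
    if link.bit != link.ack:           # a fresh value is available
        link.ack = link.bit
        return link.value
    return None

link, seen = Link(), []
for v in [10, 10, 20]:                 # possibly repeated values
    write(link, v)
    seen.append(read(link))
print(seen)  # [10, 10, 20]
```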
This short text summarizes the work in biology proposed in our book, Perspectives on Organisms, where we analyse the unity proper to organisms by looking at it from different viewpoints.
We discuss the theoretical roles of biological time, complexity, theoretical symmetries, singularities and critical transitions.
We explicitly borrow from the conclusions in some key chapters and introduce them by a reflection on "incompleteness", also proposed in the book.
We consider that incompleteness is a fundamental notion to understand the way in which we construct knowledge.
Then we will introduce an approach to biological dynamics where randomness is central to the theoretical determination: randomness does not oppose biological stability but contributes to it by variability, adaptation, and diversity.
Then, evolutionary and ontogenetic trajectories are continual changes of coherence structures involving symmetry changes within an ever-changing global stability.
Email tracking allows email senders to collect fine-grained behavior and location data on email recipients, who are uniquely identifiable via their email address.
Such tracking invades user privacy in that email tracking techniques gather data without user consent or awareness.
Striving to increase privacy in email communication, this paper develops a detection engine to be the core of a selective tracking blocking mechanism in the form of three contributions.
First, a large collection of email newsletters is analyzed to show the wide usage of tracking over different countries, industries and time.
Second, we propose a set of features geared towards the identification of tracking images under real-world conditions.
Novel features are devised to be computationally feasible and efficient, generalizable and resilient towards changes in tracking infrastructure.
Third, we test the predictive power of these features in a benchmarking experiment using a selection of state-of-the-art classifiers to clarify the effectiveness of model-based tracking identification.
We evaluate the expected accuracy of the approach on out-of-sample data, over increasing periods of time, and when faced with unknown senders.
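Features of the kind described can be sketched as follows (an assumed, illustrative feature set, not the paper's exact one): tracking pixels are typically tiny images whose URLs carry long, high-entropy identifiers.

```python
# Illustrative tracking-image features (assumed set): tiny dimensions and
# long, high-entropy URL query strings are typical of tracking pixels.
from collections import Counter
from math import log2
from urllib.parse import urlparse

def string_entropy(s):
    n = len(s)
    return -sum(c / n * log2(c / n) for c in Counter(s).values()) if n else 0.0

def image_features(url, width, height):
    q = urlparse(url).query
    return {
        'is_1x1': width == 1 and height == 1,
        'query_len': len(q),
        'query_entropy': string_entropy(q),
    }

f = image_features('https://t.example.com/p.gif?uid=a8f3e2c19b', 1, 1)
print(f)
```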
Conventional approaches to image de-fencing suffer from non-robust fence detection and are limited to processing images of static scenes.
In this position paper, we propose an automatic de-fencing algorithm for images of dynamic scenes.
We divide the problem of image de-fencing into the tasks of automated fence detection, motion estimation and fusion of data from multiple frames of a captured video of the dynamic scene.
Fences are detected automatically using two approaches, namely, employing Gabor filter and a machine learning method.
We cast the fence removal problem in an optimization framework, by modeling the formation of the degraded observations.
The inverse problem is solved using split Bregman technique assuming total variation of the de-fenced image as the regularization constraint.
In this paper, systems of linear differential equations with crisp real coefficients and with initial condition described by a vector of fuzzy numbers are studied.
A new method based on the geometric representations of linear transformations is proposed to find a solution.
The most important difference between this method and methods offered in previous papers is that the solution is considered to be a fuzzy set of real vector-functions rather than a fuzzy vector-function.
Each member of the set satisfies the given system with a certain possibility.
It is shown that at any time the solution constitutes a fuzzy region in the coordinate space, the alpha-cuts of which are nested parallelepipeds.
The proposed method is illustrated with examples.
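The nested alpha-cut structure can be illustrated on a one-dimensional triangular fuzzy number (a toy case assumed for illustration; the paper treats vectors of fuzzy numbers, whose alpha-cuts are parallelepipeds):

```python
# Toy illustration: alpha-cuts of a triangular fuzzy number (a, b, c)
# are nested intervals that shrink from the support to the core.
def alpha_cut_triangular(a, b, c, alpha):
    """Return the interval [lo, hi] at membership level alpha."""
    return (a + alpha * (b - a), c - alpha * (c - b))

cut_0 = alpha_cut_triangular(0.0, 1.0, 3.0, 0.0)      # support
cut_half = alpha_cut_triangular(0.0, 1.0, 3.0, 0.5)   # intermediate cut
cut_1 = alpha_cut_triangular(0.0, 1.0, 3.0, 1.0)      # core (a point)
print(cut_0, cut_half, cut_1)
```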
Biomimetic entirely soft robots with animal-like behavior and integrated artificial nervous systems will open up totally new perspectives and applications.
However, until now all presented studies on soft robots were limited to partly soft designs, since every design still needed conventional, stiff electronics to sense, process signals and activate actuators.
We present the first soft robot with an integrated artificial nervous system entirely made of dielectric elastomers - and without any conventional stiff electronic parts.
Supplied with only one external DC voltage, the robot autonomously generates all signals necessary to drive its actuators, and translates an in-plane electromechanical oscillation into a crawling locomotion movement.
Thereby, all functional parts are made of polymer materials and carbon.
Besides the basic design of the world's first entirely soft robot, we present prospects for controlling the general behavior of such robots.
Topic modeling is a very powerful technique in data analysis and data mining but it is generally slow.
Many parallelization approaches have been proposed to speed up the learning process.
However, they are usually not very efficient because of the many kinds of overhead, especially the load-balancing problem.
We address this problem by proposing three partitioning algorithms, which either run more quickly or achieve better load balance than current partitioning algorithms.
These algorithms can easily be extended to improve parallelization efficiency on other topic models similar to LDA, e.g., Bag of Timestamps, which is an extension of LDA with time information.
We evaluate these algorithms on two popular datasets, NIPS and NYTimes.
We also build a dataset containing over 1,000,000 scientific publications in the computer science domain from 1951 to 2010 to experiment with Bag of Timestamps parallelization, which we design to demonstrate the proposed algorithms' extensibility.
The results strongly confirm the advantages of these algorithms.
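A simple greedy baseline for the load-balancing goal might look like the following (one possible strategy for contrast, not one of the paper's three partitioning algorithms):

```python
# Greedy load-balancing baseline: assign each document (weighted by token
# count) to the currently least-loaded worker, longest documents first.
import heapq

def greedy_partition(doc_lengths, n_workers):
    heap = [(0, w, []) for w in range(n_workers)]
    heapq.heapify(heap)
    for doc, length in sorted(enumerate(doc_lengths),
                              key=lambda t: -t[1]):
        load, w, docs = heapq.heappop(heap)
        docs.append(doc)
        heapq.heappush(heap, (load + length, w, docs))
    return sorted(heap)

parts = greedy_partition([90, 10, 40, 60, 50], 2)
print([(load, docs) for load, _, docs in parts])
```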
We study the information rates of unipolar orthogonal frequency division multiplexing (OFDM) in discrete-time optical intensity channels (OIC) with Gaussian noise under average optical power constraint.
Several single-, double-, and multi-component unipolar OFDM schemes are considered under the assumption that independent and identically distributed (i.i.d.) Gaussian or complex Gaussian codebook ensembles and nearest-neighbor decoding (minimum Euclidean distance decoding) are used.
We obtain an array of information rate results.
These results validate existing signal-to-noise-and-distortion-ratio (SNDR) based rate analysis, establish the equivalence of information rates of certain schemes, and demonstrate the evident benefits of using component-multiplexing at high signal-to-noise-ratio (SNR).
For double- and multi-component schemes, the component power allocation strategies that maximize the information rates are investigated.
In particular, by utilizing a power allocation strategy, we prove that several multi-component schemes approach the high SNR capacity of the discrete-time Gaussian OIC under average power constraint to within 0.07 bits.
3D objects (artefacts) are made to fulfill functions.
Designing an object often starts with defining a list of functionalities that it should provide, also known as functional requirements.
Today, the design of 3D object models is still a slow and largely artisanal activity, with few Computer-Aided Design (CAD) tools existing to aid the exploration of the design solution space.
To accelerate the design process, we introduce an algorithm for generating object shapes with desired functionalities.
Following the concept of form follows function, we assume that existing object shapes were rationally chosen to provide desired functionalities.
First, we use an artificial neural network to learn a function-to-form mapping by analysing a dataset of objects labeled with their functionalities.
Then, we combine forms providing one or more desired functions, generating an object shape that is expected to provide all of them.
Finally, we verify in simulation whether the generated object possesses the desired functionalities, by defining and executing functionality tests on it.
Evidence of signatures associated with cryptographic modes of operation is established.
Motivated by some analogies between cryptographic and dynamical systems, in particular with chaos theory, we propose an algorithm based on Lyapunov exponents of discrete dynamical systems to estimate the divergence among ciphertexts as the encryption algorithm is applied iteratively.
The results allow one to distinguish among six modes of operation, namely ECB, CBC, OFB, CFB, CTR and PCBC, using the DES, IDEA, TEA and XTEA block ciphers of 64 bits, as well as the AES, RC6, Twofish, Seed, Serpent and Camellia block ciphers of 128 bits.
Furthermore, the proposed methodology enables a classification of modes of operation of cryptographic systems according to their strength.
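The divergence idea can be illustrated with a stand-in primitive (iterated SHA-256 chaining in place of an actual block-cipher mode, purely for demonstration; the paper's Lyapunov-exponent estimator is more elaborate):

```python
# Illustration with a stand-in primitive: normalized Hamming distance
# between iterated "encryptions" of plaintexts differing in a single bit.
# SHA-256 chaining is used here only as a deterministic avalanche function.
import hashlib

def hamming_frac(a, b):
    diff = sum(bin(x ^ y).count('1') for x, y in zip(a, b))
    return diff / (8 * len(a))

def iterate(data, rounds):
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data

p0 = bytes(32)
p1 = bytes([1]) + bytes(31)            # flip one bit of the plaintext
divergence = [hamming_frac(iterate(p0, r), iterate(p1, r)) for r in (1, 2, 3)]
print(divergence)  # hovers near 0.5, the random-distance baseline
```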
The class of Gaussian Process (GP) methods for Temporal Difference learning has shown promise for data-efficient model-free Reinforcement Learning.
In this paper, we consider a recent variant of the GP-SARSA algorithm, called Sparse Pseudo-input Gaussian Process SARSA (SPGP-SARSA), and derive recursive formulas for its predictive moments.
This extension promotes greater memory efficiency, since previous computations can be reused, and, interestingly, it provides a technique for updating value estimates on multiple timescales.
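For background, the standard GP predictive moments that sparse variants such as SPGP-SARSA build on can be computed as follows (a generic regression sketch with toy data, not the paper's recursive pseudo-input formulas):

```python
# Background sketch: standard GP predictive mean and variance in NumPy.
# Kernel, data and test point are toy choices for illustration.
import numpy as np

def rbf(a, b, ls=1.0):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

X = np.array([0.0, 1.0, 2.0])
y = np.sin(X)
Xs = np.array([1.5])

K = rbf(X, X) + 1e-6 * np.eye(len(X))   # jitter for numerical stability
Ks = rbf(Xs, X)
mean = Ks @ np.linalg.solve(K, y)                   # predictive mean
var = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)   # predictive covariance
print(mean, var)
```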
In the field of mutation analysis, mutation is the systematic generation of mutated programs (i.e., mutants) from an original program.
The concept of mutation has been widely applied to various testing problems, including test set selection, fault localization, and program repair.
However, surprisingly little focus has been given to the theoretical foundation of mutation-based testing methods, making it difficult to understand, organize, and describe various mutation-based testing methods.
This paper aims to consider a theoretical framework for understanding mutation-based testing methods.
While there is a solid testing framework for general testing, this is incongruent with mutation-based testing methods, because it focuses on the correctness of a program for a test, while the essence of mutation-based testing concerns the differences between programs (including mutants) for a test.
In this paper, we begin the construction of our framework by defining a novel testing factor, called a test differentiator, to transform the paradigm of testing from the notion of correctness to the notion of difference.
We formally define behavioral differences of programs for a set of tests as a mathematical vector, called a d-vector.
We explore the multi-dimensional space represented by d-vectors, and provide a graphical model for describing the space.
Based on our framework and formalization, we interpret existing mutation-based fault localization methods and mutant set minimization as applications, and identify novel implications for future work.
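The d-vector notion can be illustrated with a minimal sketch, assuming a 0/1 encoding of behavioral difference per test (the function names and the toy programs below are our illustrative assumptions; the paper develops the notion formally):

```python
def d_vector(original, mutant, tests):
    """Behavioral difference of `mutant` from `original` over `tests`:
    1 where their outputs differ, 0 where they agree."""
    return [1 if mutant(t) != original(t) else 0 for t in tests]

# Toy example: the mutant differs from the original only on negative inputs.
original = lambda x: x * x
mutant = lambda x: x * abs(x)
print(d_vector(original, mutant, [-2, -1, 0, 1, 2]))  # → [1, 1, 0, 0, 0]
```

Each mutant thus maps to a point in the multi-dimensional space over the test set, which is the space the paper's graphical model describes.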
Epistemic logic with non-standard knowledge operators, especially the "knowing-value" operator, has recently gathered much attention.
With the "knowing-value" operator, we can express knowledge of individual variables, but not of the relations between them in general.
In this paper, we propose a new operator Kf to express knowledge of the functional dependencies between variables.
The semantics of this Kf operator uses a function domain which imposes a constraint on what counts as a functional dependency relation.
By adjusting this function domain, different interesting logics arise, and in this paper we axiomatize three such logics in a single agent setting.
Then we show how these three logics can be unified by allowing the function domain to vary relative to different agents and possible worlds.
A multiagent axiomatization is given in this case.
Unpaired image-to-image translation aims to convert an image from one domain (input domain A) to another domain (target domain B) without paired training examples.
The state-of-the-art Cycle-GAN demonstrated the power of Generative Adversarial Networks combined with a cycle-consistency loss.
While its results are promising, there is scope for optimization in the training process.
This paper introduces a new neural network architecture, which only learns the translation from domain A to B and eliminates the need for reverse mapping (B to A), by introducing a new Deviation-loss term.
Furthermore, a few other improvements to Cycle-GAN are identified and incorporated into this new architecture, contributing to a significantly shorter training duration.
We present an algorithm for creating high resolution anatomically plausible images consistent with acquired clinical brain MRI scans with large inter-slice spacing.
Although large data sets of clinical images contain a wealth of information, time constraints during acquisition result in sparse scans that fail to capture much of the anatomy.
These characteristics often render computational analysis impractical as many image analysis algorithms tend to fail when applied to such images.
Highly specialized algorithms that explicitly handle sparse slice spacing do not generalize well across problem domains.
In contrast, we aim to enable application of existing algorithms that were originally developed for high resolution research scans to significantly undersampled scans.
We introduce a generative model that captures fine-scale anatomical structure across subjects in clinical image collections and derive an algorithm for filling in the missing data in scans with large inter-slice spacing.
Our experimental results demonstrate that the resulting method outperforms state-of-the-art upsampling super-resolution techniques, and promises to facilitate subsequent analysis not previously possible with scans of this quality.
Our implementation is freely available at https://github.com/adalca/papago .
Geospatial extensions of SPARQL like GeoSPARQL and stSPARQL have recently been defined and corresponding geospatial RDF stores have been implemented.
However, there is no widely used benchmark for evaluating geospatial RDF stores which takes into account recent advances to the state of the art in this area.
In this paper, we develop a benchmark, called Geographica, which uses both real-world and synthetic data to test the offered functionality and the performance of some prominent geospatial RDF stores.
In this paper, we present Watasense, an unsupervised system for word sense disambiguation.
Given a sentence, the system chooses the most relevant sense of each input word with respect to the semantic similarity between the given sentence and the synset constituting the sense of the target word.
Watasense has two modes of operation.
The sparse mode uses the traditional vector space model to estimate the most similar word sense corresponding to its context.
The dense mode, instead, uses synset embeddings to cope with the sparsity problem.
We describe the architecture of the present system and also conduct its evaluation on three different lexical semantic resources for Russian.
We found that the dense mode substantially outperforms the sparse one on all datasets according to the adjusted Rand index.
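The sparse mode's core idea, choosing the sense whose synset is most similar to the sentence in a vector space model, can be sketched as follows (the `disambiguate` helper and the toy synsets are illustrative assumptions, not Watasense's actual API):

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def disambiguate(context_words, senses):
    """Pick the sense whose synset bag-of-words is most similar
    to the sentence context (sparse-mode sketch)."""
    ctx = Counter(context_words)
    return max(senses, key=lambda s: cosine(ctx, Counter(senses[s])))

senses = {
    "bank/finance": ["money", "account", "loan", "deposit"],
    "bank/river": ["river", "shore", "water", "slope"],
}
print(disambiguate(["deposit", "money", "at", "the", "bank"], senses))
# → bank/finance
```

The dense mode replaces the sparse bag-of-words vectors with synset embeddings, so that similarity is no longer zero when the context and synset share no surface words.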
Numerous pattern recognition applications can be formulated as learning from graph-structured data, including social networks, protein-interaction networks, World Wide Web data, knowledge graphs, etc.
While convolutional neural networks (CNNs) have enabled great advances in gridded image/video understanding tasks, very limited attention has been devoted to transferring these successful network structures (including Inception nets, residual nets, dense nets, etc.) to convolutional networks on graphs, due to their irregular and complex geometric topologies (unordered vertices, varying numbers of adjacent edges/vertices).
In this paper, we aim to give a comprehensive analysis of which design choices matter when transferring different classical network structures to graph CNNs, particularly for the basic graph recognition problem.
Specifically, we first review general graph CNN methods, especially the spectral filtering operation on irregular graph data.
We then introduce the basic structures of ResNet, Inception and DenseNet into graph CNN and construct these network structures on graph, named as G_ResNet, G_Inception, G_DenseNet.
In particular, this work seeks to advance graph CNNs by shedding light on how these classical network structures work and by providing guidelines for choosing appropriate graph network frameworks.
Finally, we comprehensively evaluate the performance of these different network structures on several public graph datasets (including social networks and bioinformatic datasets), and demonstrate how different network structures work on graph CNN in the graph recognition task.
In this paper we propose an end-to-end trainable deep neural network model for egocentric activity recognition.
Our model is built on the observation that egocentric activities are highly characterized by the objects and their locations in the video.
Based on this, we develop a spatial attention mechanism that enables the network to attend to regions containing objects that are correlated with the activity under consideration.
We learn highly specialized attention maps for each frame using class-specific activations from a CNN pre-trained for generic image recognition, and use them for spatio-temporal encoding of the video with a convolutional LSTM.
Our model is trained in a weakly supervised setting using raw video-level activity-class labels.
Nonetheless, on standard egocentric activity benchmarks our model surpasses the currently best-performing method, which leverages strong supervision from hand segmentation and object locations during training, by up to 6 percentage points in recognition accuracy.
We visually analyze attention maps generated by the network, revealing that the network successfully identifies the relevant objects present in the video frames which may explain the strong recognition performance.
We also discuss an extensive ablation analysis regarding the design choices.
The recent advancement in computing technologies and the resulting vision-based applications have given rise to a novel practice called telemedicine, in which patient diagnostic images or allied information are used to recommend, or even perform, diagnostic procedures remotely.
However, accurate and optimal telemedicine requires seamless, flawless biomedical information about the patient.
On the contrary, medical data transmitted over an insecure channel is prone to manipulation or corruption by attackers.
The existing cryptosystems alone are not sufficient to deal with these issues; hence, in this paper, a highly robust reversible image steganography model has been developed for secret information hiding.
Unlike traditional wavelet transform techniques, we incorporated Discrete Ripplet Transformation (DRT) technique for message embedding in the medical cover images.
In addition, to assure seamless communication over insecure channel, a dual cryptosystem model containing proposed steganography scheme and RSA cryptosystem has been developed.
One of the key novelties of the proposed research work is the use of adaptive genetic algorithm (AGA) for optimal pixel adjustment process (OPAP) that enriches data hiding capacity as well as imperceptibility features.
The performance assessment reveals that the proposed steganography model outperforms other wavelet-transformation-based approaches in terms of PSNR, embedding capacity, imperceptibility, etc.
Existing counting methods often adopt regression-based approaches and cannot precisely localize the target objects, which hinders the further analysis (e.g., high-level understanding and fine-grained classification).
In addition, most prior work focuses on counting objects in static environments with fixed cameras.
Motivated by the advent of unmanned flying vehicles (i.e., drones), we are interested in detecting and counting objects in such dynamic environments.
We propose Layout Proposal Networks (LPNs) and spatial kernels to simultaneously count and localize target objects (e.g., cars) in videos recorded by the drone.
Different from the conventional region proposal methods, we leverage the spatial layout information (e.g., cars often park regularly) and introduce these spatially regularized constraints into our network to improve the localization accuracy.
To evaluate our counting method, we present a new large-scale car parking lot dataset (CARPK) that contains nearly 90,000 cars captured from different parking lots.
To the best of our knowledge, it is the first and the largest drone view dataset that supports object counting, and provides the bounding box annotations.
Soft-input soft-output (SISO) detection algorithms form the basis for iterative decoding.
The associated computational complexity often poses significant challenges for practical receiver implementations, in particular in the context of multiple-input multiple-output wireless systems.
In this paper, we present a low-complexity SISO sphere decoder which is based on the single tree search paradigm, proposed originally for soft-output detection in Studer et al., IEEE J-SAC, 2008.
The algorithm incorporates clipping of the extrinsic log-likelihood ratios in the tree search, which not only results in significant complexity savings but also makes it possible to cover a large performance/complexity trade-off region by adjusting a single parameter.
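The clipping operation itself is elementary; a hedged sketch of it (the function name and example values are illustrative, not the paper's implementation):

```python
def clip_llr(llr, lmax):
    """Clip an extrinsic log-likelihood ratio to [-lmax, +lmax].
    The single parameter lmax trades detection performance for
    tree-search complexity in the SISO sphere decoder."""
    return max(-lmax, min(lmax, llr))

print([clip_llr(l, 3.0) for l in [-7.2, -1.5, 0.0, 2.4, 9.9]])
# → [-3.0, -1.5, 0.0, 2.4, 3.0]
```

Bounding the LLR magnitude bounds how far metric computations can diverge inside the tree search, which is why a single scalar controls the entire performance/complexity trade-off.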
This article presents for the first time a global method for registering 3D curves with 3D surfaces without requiring an initialization.
The algorithm works with point+vector 2-tuples, which consist of pairs of points augmented with the information of their tangents or normals.
A closed-form solution for determining the alignment transformation from a pair of matching 2-tuples is proposed.
In addition, the set of necessary conditions for two 2-tuples to match is derived.
This allows fast search of correspondences that are used in a hypothesise-and-test framework for accomplishing global registration.
Comparative experiments demonstrate that the proposed algorithm is the first effective solution for curve vs surface registration, with the method achieving accurate alignment in situations of small overlap and large percentage of outliers in a fraction of a second.
The proposed framework is extended to the cases of curve vs curve and surface vs surface registration, with the former being particularly relevant since it is also a largely unsolved problem.
Most existing methods determine relation types only after all the entities have been recognized; thus, the interaction between relation types and entity mentions is not fully modeled.
This paper presents a novel paradigm to deal with relation extraction by regarding the related entities as the arguments of a relation.
We apply a hierarchical reinforcement learning (HRL) framework in this paradigm to enhance the interaction between entity mentions and relation types.
The whole extraction process is decomposed into a hierarchy of two-level RL policies for relation detection and entity extraction respectively, so that it is more feasible and natural to deal with overlapping relations.
Our model was evaluated on public datasets collected via distant supervision, and results show that it gains better performance than existing methods and is more powerful for extracting overlapping relations.
We have previously introduced Deep Random Secrecy, a new cryptologic technique capable of ensuring secrecy arbitrarily close to perfection against unlimited passive eavesdropping opponents.
We have also previously introduced an extended protocol, based on Deep Random Secrecy, capable of resisting unlimited active MITM attacks.
The main limitation of those protocols, in their initially presented version, is the large quantity of information that needs to be exchanged between the legitimate partners to distill secure digits.
We have defined and shown the existence of an absolute constant, called the Cryptologic Limit, which represents the upper bound of the secrecy rate that can be reached by Deep Random Secrecy protocols.
Lastly, we have already presented practical algorithms to generate Deep Randomness from classical computing resources.
This article presents an optimization technique, based on recombination and reuse of random bits, which dramatically increases the bandwidth performance of the previously introduced protocols without jeopardizing the entropy of the secret information.
That optimization makes it possible to envision an implementation of Deep Random Secrecy at very reasonable cost.
The article also summarizes former results in the perspective of a comprehensive implementation.
The temporal component of videos provides an important clue for activity recognition, as a number of activities can be reliably recognized based on the motion information.
In view of that, this work proposes a novel temporal stream for two-stream convolutional networks based on images computed from the optical flow magnitude and orientation, named the Magnitude-Orientation Stream (MOS), to learn motion in a richer manner.
Our method applies simple nonlinear transformations on the vertical and horizontal components of the optical flow to generate input images for the temporal stream.
Experimental results, carried out on two well-known datasets (HMDB51 and UCF101), demonstrate that using our proposed temporal stream as input to existing neural network architectures can improve their performance for activity recognition.
Results demonstrate that our temporal stream provides complementary information able to improve the classical two-stream methods, indicating the suitability of our approach to be used as a temporal video representation.
This paper presents new similarity, cardinality and entropy measures for bipolar fuzzy sets and their particular forms, such as intuitionistic, paraconsistent and fuzzy sets.
All of these are constructed in the framework of multi-valued representations and are based on a penta-valued logic that uses the following logical values: true, false, unknown, contradictory and ambiguous.
A new distance for bounded real intervals is also defined.
With the rapid development of economy in China over the past decade, air pollution has become an increasingly serious problem in major cities and caused grave public health concerns in China.
Recently, a number of studies have dealt with air quality and air pollution.
Among them, some attempt to predict and monitor the air quality from different sources of information, ranging from deployed physical sensors to social media.
These methods are either too expensive or unreliable, prompting us to search for a novel and effective way to sense the air quality.
In this study, we propose to employ the state of the art in computer vision techniques to analyze photos that can be easily acquired from online social media.
Next, we establish the correlation between the haze level computed directly from photos and the official PM2.5 record of the city where and when each photo was taken.
Our experiments based on both synthetic and real photos have shown the promise of this image-based approach to estimating and monitoring air pollution.
Many videos depict people, and it is their interactions that inform us of their activities, relation to one another and the cultural and social setting.
With advances in human action recognition, researchers have begun to address the automated recognition of these human-human interactions from video.
The main challenges stem from dealing with the considerable variation in recording settings, the appearance of the people depicted and the performance of their interaction.
This survey provides a summary of these challenges and datasets, followed by an in-depth discussion of relevant vision-based recognition and detection methods.
We focus on recent, promising work based on convolutional neural networks (CNNs).
Finally, we outline directions to overcome the limitations of the current state-of-the-art.
We propose an interpretation of the first-order answer set programming (FOASP) in terms of intuitionistic proof theory.
It is obtained by two polynomial translations between FOASP and the bounded-arity fragment of the Sigma_1 level of the Mints hierarchy in first-order intuitionistic logic.
It follows that the fragment of Sigma_1 formulas using predicates of fixed arity (in particular unary) is of the same strength as FOASP.
Our construction reveals a close similarity between constructive provability and stable entailment, or equivalently, between the construction of an answer set and an intuitionistic refutation.
This paper is under consideration for publication in Theory and Practice of Logic Programming.
Using a human-oriented formal example proof of the (lim+) theorem, i.e. that the sum of limits is the limit of the sum, which is of value for reference on its own, we exhibit a non-permutability of beta-steps and delta+-steps (according to Smullyan's classification), which is not visible with non-liberalized delta-rules and not serious with further liberalized delta-rules, such as the delta++-rule.
Besides a careful presentation of the search for a proof of (lim+) with several pedagogical intentions, the main subject is to explain why the order of beta-steps plays such a practically important role in some calculi.
The reconstruction of a deterministic data field from binary-quantized noisy observations of sensors which are randomly deployed over the field domain is studied.
The study focuses on the extremes of lack of deterministic control in the sensor deployment, lack of knowledge of the noise distribution, and lack of sensing precision and reliability.
Such adverse conditions are motivated by possible real-world scenarios where a large collection of low-cost, crudely manufactured sensors are mass-deployed in an environment where little can be assumed about the ambient noise.
A simple estimator that reconstructs the entire data field from these unreliable, binary-quantized, noisy observations is proposed.
Technical conditions for the almost sure and integrated mean squared error (MSE) convergence of the estimate to the data field, as the number of sensors tends to infinity, are derived and their implications are discussed.
For finite-dimensional, bounded-variation, and Sobolev-differentiable function classes, specific integrated MSE decay rates are derived.
For the first and third function classes these rates are found to be minimax order optimal with respect to infinite precision sensing and known noise distribution.
Model precision in a classification task is highly dependent on the feature space that is used to train the model.
Moreover, whether the features are sequential or static will dictate which classification method can be applied as most of the machine learning algorithms are designed to deal with either one or another type of data.
In real-life scenarios, however, it is often the case that both static and dynamic features are present, or can be extracted from the data.
In this work, we demonstrate how generative models such as Hidden Markov Models (HMM) and Long Short-Term Memory (LSTM) artificial neural networks can be used to extract temporal information from the dynamic data.
We explore how the extracted information can be combined with the static features in order to improve the classification performance.
We evaluate the existing techniques and suggest a hybrid approach, which outperforms other methods on several public datasets.
Datalog has become a popular language for writing static analyses.
Because Datalog is very limited, some implementations of Datalog for static analysis have extended it with new language features.
However, even with these features it is hard or impossible to express a large class of analyses because they use logical formulae to represent program state.
FormuLog fills this gap by extending Datalog to represent, manipulate, and reason about logical formulae.
We have used FormuLog to implement declarative versions of symbolic execution and abstract model checking, analyses previously out of the scope of Datalog-based languages.
While this paper focuses on the design of FormuLog and one of the analyses we have implemented in it, it also touches on a prototype implementation of the language and identifies performance optimizations that we believe will be necessary to scale FormuLog to real-world static analysis problems.
In large and active software projects, it becomes impractical for a developer to stay aware of all project activity.
While it might not be necessary to know about each commit or issue, it is arguably important to know about the ones that are unusual.
To investigate this hypothesis, we identified unusual events in 200 GitHub projects using a comprehensive list of ways in which an artifact can be unusual and asked 140 developers responsible for or affected by these events to comment on the usefulness of the corresponding information.
Based on 2,096 answers, we identify the subset of unusual events that developers consider particularly useful, including large code modifications and unusual amounts of reviewing activity, along with qualitative evidence on the reasons behind these answers.
Our findings provide a means for reducing the amount of information that developers need to parse in order to stay up to date with development activity in their projects.
The privacy implications of third-party tracking are a well-studied problem.
Recent research has shown that besides data aggregators and behavioral advertisers, online social networks also act as trackers via social widgets.
Existing cookie policies are not enough to solve these problems, pushing users to employ blacklist-based browser extensions to prevent such tracking.
Unfortunately, such approaches require maintaining and distributing blacklists, which are often too general and adversely affect non-tracking services for advertisements and analytics.
In this paper, we propose and advocate for a general third-party cookie policy that prevents third-party tracking with cookies and preserves the functionality of social widgets without requiring a blacklist and adversely affecting non-tracking services.
We implemented a proof-of-concept of our policy as browser extensions for Mozilla Firefox and Google Chrome.
To date, our extensions have been downloaded about 11.8K times and have over 2.8K daily users combined.
We introduce style augmentation, a new form of data augmentation based on random style transfer, for improving the robustness of convolutional neural networks (CNN) over both classification and regression based tasks.
During training, our style augmentation randomizes texture, contrast and color, while preserving shape and semantic content.
This is accomplished by adapting an arbitrary style transfer network to perform style randomization, by sampling input style embeddings from a multivariate normal distribution instead of inferring them from a style image.
In addition to standard classification experiments, we investigate the effect of style augmentation (and data augmentation generally) on domain transfer tasks.
We find that data augmentation significantly improves robustness to domain shift, and can be used as a simple, domain agnostic alternative to domain adaptation.
Comparing style augmentation against a mix of seven traditional augmentation techniques, we find that it can be readily combined with them to improve network performance.
We validate the efficacy of our technique with domain transfer experiments in classification and monocular depth estimation, illustrating consistent improvements in generalization.
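The randomization step, sampling a style embedding from a multivariate normal rather than encoding a style image, can be sketched as follows (the embedding dimensionality, the blending parameter `alpha`, and the function name are illustrative assumptions; the downstream style transfer network is not shown):

```python
import numpy as np

def sample_style_embedding(mean, cov, alpha=0.5, rng=None):
    """Sample a random style embedding from N(mean, cov) and blend it
    with the mean embedding; alpha controls augmentation strength
    (alpha=0 reproduces the mean style, alpha=1 is fully random)."""
    if rng is None:
        rng = np.random.default_rng()
    z = rng.multivariate_normal(mean, cov)
    return alpha * z + (1 - alpha) * mean

mean = np.zeros(8)   # illustrative 8-D embedding
cov = np.eye(8)
emb = sample_style_embedding(mean, cov, alpha=0.7)
print(emb.shape)     # → (8,)
```

The sampled embedding would then condition an arbitrary style transfer network, so each training image receives a random texture, contrast and color while its shape and semantic content are preserved.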
Personalization of information has taken recommender systems to a very high level.
With personalization, these systems can generate user-specific recommendations accurately and efficiently.
User profiling supports personalization: information retrieval is tailored to each scenario by maintaining a separate profile for every individual user.
The main objective of this paper is to explore the field of personalization in the context of user profiling, and to make researchers aware of user profiling.
Various trends, techniques and applications are discussed in this paper to fulfil this objective.
Software architecture (SA) is celebrating 25 years.
This is so if we consider the seminal papers establishing SA as a distinct discipline and scientific publications that have identified cornerstones of both research and practice, like architecture views, architecture description languages, and architecture evaluation.
With the pervasive use of cloud provisioning, the dynamic integration of multi-party distributed services, and the steep increase in the digitalization of business and society, making sound design decisions encompasses an increasingly-large and complex problem space.
The role of SA is essential as never before, so much so that no organization undertakes `serious' projects without the support of suitable architecture practices.
But how did SA practice evolve over the past 25 years, and what are the challenges ahead?
There have been various attempts to summarize the state of research and practice of SA.
Still, we miss the practitioners' view on the questions above.
To fill this gap, we have first extracted the top-10 topics resulting from the analysis of 5,622 scientific papers.
Then, we have used such topics to design an online survey filled out by 57 SA practitioners with 5 to 20+ years of experience.
We present the results of the survey with a special focus on the SA topics that SA practitioners perceive, in the past, present and future, as the most impactful.
We finally use the results to draw preliminary takeaways.
The construction of a reference ontology for a large domain still remains a hard human task.
The process is sometimes assisted by software tools that facilitate the information extraction from a textual corpus.
Despite the widespread use of XML Schema files on the internet, and especially in the B2B domain, tools that offer a complete semantic analysis of XML schemas are rare.
In this paper we introduce Janus, a tool for automatically building a reference knowledge base starting from XML Schema files.
Janus also provides different useful views to simplify B2B application integration.
An accurate modeling of skin effect inside conductors is of capital importance to solve transmission line and scattering problems.
This paper presents a surface-based formulation to model skin effect in conductors of arbitrary cross section, and compute the per-unit-length impedance of a multiconductor transmission line.
The proposed formulation is based on the Dirichlet-Neumann operator that relates the longitudinal electric field to the tangential magnetic field on the boundary of a conductor.
We demonstrate how the surface operator can be obtained through the contour integral method for conductors of arbitrary shape.
The proposed algorithm is simple to implement, efficient, and can handle arbitrary cross-sections, a main advantage over the existing eigenfunction-based approach, which is available only for canonical conductor shapes.
The versatility of the method is illustrated through a diverse set of examples, which includes transmission lines with trapezoidal, curved, and V-shaped conductors.
Numerical results demonstrate the accuracy, versatility, and efficiency of the proposed technique.
Cryptography is an important field in the area of data encryption.
There are different cryptographic techniques available varying from the simplest to complex.
One of the more complex symmetric-key cryptography techniques is the Data Encryption Standard (DES) Algorithm.
This paper explores a unique approach to key generation using a fingerprint.
The generated key is used as an input key to the DES Algorithm.
The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large (image) databases, a specified number of images similar in visual and semantic content to a so-called query image.
To bridge the semantic gap that exists between the representation of an image by low-level features (namely, colour, shape, texture) and its high-level semantic content as perceived by humans, CBIR systems typically make use of the relevance feedback (RF) mechanism.
RF iteratively incorporates user-given inputs regarding the relevance of retrieved images, to improve retrieval efficiency.
One approach is to vary the weights of the features dynamically via feature reweighting.
In this work, an attempt has been made to improve retrieval accuracy by enhancing a CBIR system based on color features alone, through implicit incorporation of shape information obtained through prior segmentation of the images.
Novel schemes for feature reweighting, as well as for initialization of the relevant set for improved relevance feedback, have also been proposed for boosting the performance of RF-based CBIR.
At the same time, new measures for evaluation of retrieval accuracy have been suggested, to overcome the limitations of existing measures in the RF context.
Results of extensive experiments have been presented to illustrate the effectiveness of the proposed approaches.
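A common baseline for RF feature reweighting, against which schemes like the ones above are typically compared, weights each feature inversely to its spread over the user-marked relevant images (low variance implies the feature is discriminative for this query). A minimal sketch, with made-up feature values; this is the classic baseline, not the paper's novel scheme:

```python
import numpy as np

def reweight(features_relevant, eps=1e-6):
    """Given a (num_relevant_images, num_features) array of feature
    values for the images the user marked relevant, weight each
    feature by the inverse of its standard deviation, normalized
    to sum to 1."""
    std = np.std(features_relevant, axis=0)
    w = 1.0 / (std + eps)
    return w / w.sum()

# Toy relevant set: feature 2 is nearly constant across relevant images,
# so it receives the largest weight.
rel = np.array([[0.9, 0.2, 0.51],
                [0.8, 0.7, 0.49],
                [0.7, 0.4, 0.50]])
w = reweight(rel)
print(w.argmax())  # → 2
```

The reweighted distance is then used to re-rank the database for the next feedback round.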
A heterogeneous resource, such as a land-estate, is already divided among several agents in an unfair way.
It should be re-divided among the agents in a way that balances fairness with ownership rights.
We present re-division protocols that attain various trade-off points between fairness and ownership rights, in various settings differing in the geometric constraints on the allotments: (a) no geometric constraints; (b) connectivity --- the cake is a one-dimensional interval and each piece must be a contiguous interval; (c) rectangularity --- the cake is a two-dimensional rectangle or rectilinear polygon and the pieces should be rectangles; (d) convexity --- the cake is a two-dimensional convex polygon and the pieces should be convex.
Our re-division protocols have implications on another problem: the price-of-fairness --- the loss of social welfare caused by fairness requirements.
Each protocol implies an upper bound on the price-of-fairness with the respective geometric constraints.
In this paper, we report on experiments with the use of local measures for depth motion for visual action recognition from MPEG encoded RGBD video sequences.
We show that such measures can be combined with local space-time video descriptors for appearance to provide a computationally efficient method for recognition of actions.
Fisher vectors are used for encoding and concatenating a depth descriptor with existing RGB local descriptors.
We then employ a linear SVM for recognizing manipulation actions using such vectors.
We evaluate the effectiveness of such measures by comparison to the state-of-the-art using two recent datasets for action recognition in kitchen environments.
This paper reports the analysis of audio and visual features in predicting the continuous emotion dimensions under the seventh Audio/Visual Emotion Challenge (AVEC 2017); the work was carried out as part of a B.Tech. second-year internship project.
For visual features we used the HOG (Histogram of Gradients) features, Fisher encodings of SIFT (Scale-Invariant Feature Transform) features based on Gaussian mixture model (GMM) and some pretrained Convolutional Neural Network layers as features; all these extracted for each video clip.
For audio features we used the Bag-of-audio-words (BoAW) representation of the LLDs (low-level descriptors) generated by openXBOW provided by the organisers of the event.
We then trained a fully connected neural network regression model on the dataset for each of these modalities.
Finally, we applied multimodal fusion to the outputs of these models and report the Concordance correlation coefficient on both the Development and Test sets.
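The Concordance correlation coefficient used as the AVEC evaluation metric follows directly from its definition (Lin, 1989); a minimal sketch:

```python
import numpy as np

def concordance_cc(y_true, y_pred):
    """Concordance correlation coefficient, the standard AVEC metric
    for continuous emotion dimensions:
        CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    It penalizes both low correlation and systematic bias."""
    x, y = np.asarray(y_true, float), np.asarray(y_pred, float)
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return 2 * cov / (x.var() + y.var() + (mx - my) ** 2)
```

Unlike Pearson correlation, a constant offset between predictions and labels lowers the CCC, which is why it is preferred for dimensional emotion prediction.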
This paper proposes a joint segmentation and deconvolution Bayesian method for medical ultrasound (US) images.
Contrary to piecewise homogeneous images, US images exhibit heavy characteristic speckle patterns correlated with the tissue structures.
The generalized Gaussian distribution (GGD) has been shown to be one of the most relevant distributions for characterizing the speckle in US images.
Thus, we propose a GGD-Potts model defined by a label map coupling US image segmentation and deconvolution.
The Bayesian estimators of the unknown model parameters, including the US image, the label map, and all the hyperparameters, are difficult to express in closed form.
Thus, we investigate a Gibbs sampler to generate samples distributed according to the posterior of interest.
These generated samples are finally used to compute the Bayesian estimators of the unknown parameters.
The performance of the proposed Bayesian model is compared with existing approaches via several experiments conducted on realistic synthetic data and in vivo US images.
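The Gibbs sampling step described above can be illustrated on a toy target, a zero-mean bivariate normal with correlation rho, where both full conditionals are Gaussian; this is only a minimal sketch of the mechanism, not the paper's GGD-Potts model.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter=5000, burn_in=500, seed=0):
    """Toy Gibbs sampler: alternately draw x | y ~ N(rho*y, 1-rho^2)
    and y | x ~ N(rho*x, 1-rho^2), then keep post-burn-in samples,
    which can be averaged to form Bayesian (posterior-mean) estimates."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    samples = []
    for t in range(n_iter):
        x = rng.normal(rho * y, np.sqrt(1 - rho ** 2))
        y = rng.normal(rho * x, np.sqrt(1 - rho ** 2))
        if t >= burn_in:
            samples.append((x, y))
    return np.array(samples)

s = gibbs_bivariate_normal(0.8)
# the empirical correlation of the draws should approach rho
```

In the paper's setting the same loop cycles through the US image, the label map, and the hyperparameters, each drawn from its full conditional.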
Recent machine learning algorithms dedicated to solving semi-linear PDEs are improved by using different neural network architectures and different parameterizations.
These algorithms are compared to a new one that solves a fixed point problem by using deep learning techniques.
This new algorithm appears to be competitive in terms of accuracy with the best existing algorithms.
Liveliness detection acts as a safeguard against spoofing attacks.
Most researchers have used vision-based techniques to detect the liveliness of the user, but these techniques are highly sensitive to illumination effects.
It is therefore very hard to design a system that works robustly under all circumstances.
The literature shows that most research utilizes eye blinks or mouth movement to detect liveliness, while another group has used face texture to distinguish between real users and impostors.
The classification results of all these approaches decrease drastically under variable lighting conditions.
Hence, in this paper we introduce a fuzzy expert system capable of handling most of the cases that arise in real time.
We used two testing conditions to evaluate the performance of the system: (a) bad illumination and (b) little movement in the eyes and mouth of a real user.
The system behaves well in both cases: in the first its False Rejection Rate (FRR) is 0.28, and in the second its FRR is 0.4.
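A fuzzy expert system of this kind combines graded memberships through IF-THEN rules; the sketch below uses triangular membership functions and two illustrative rules over hypothetical blink-rate and texture scores. The rule base, thresholds, and variable names are assumptions for illustration, not the paper's actual system.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b,
    falling to zero at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def liveliness_score(blink_rate, texture_score):
    """Tiny Mamdani-style sketch with two illustrative rules:
      R1: IF blink is frequent AND texture is natural THEN live
      R2: IF blink is rare     AND texture is flat    THEN spoof
    AND is min; the output is the weighted average of the rule
    consequents (live = 1, spoof = 0)."""
    blink_freq = tri(blink_rate, 0.2, 0.6, 1.0)
    blink_rare = tri(blink_rate, -0.4, 0.0, 0.4)
    tex_nat = tri(texture_score, 0.3, 0.7, 1.1)
    tex_flat = tri(texture_score, -0.1, 0.3, 0.7)
    r1 = min(blink_freq, tex_nat)   # evidence for "live"
    r2 = min(blink_rare, tex_flat)  # evidence for "spoof"
    if r1 + r2 == 0:
        return 0.5                  # no rule fires: undecided
    return r1 / (r1 + r2)
```

Because memberships degrade gracefully, a borderline input (dim light, slow blinking) yields an intermediate score rather than a hard misclassification.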
The launch of Google Scholar (GS) marked the beginning of a revolution in the scientific information market.
This search engine, unlike traditional databases, automatically indexes information from the academic web.
Its ease of use, together with its wide coverage and fast indexing speed, have made it the first tool most scientists currently turn to when they need to carry out a literature search.
Additionally, the fact that its search results were accompanied from the beginning by citation counts, as well as the later development of secondary products which leverage this citation data (such as Google Scholar Metrics and Google Scholar Citations), made many scientists wonder about its potential as a source of data for bibliometric analyses.
The goal of this chapter is to lay the foundations for the use of GS as a supplementary source (and in some disciplines, arguably the best alternative) for scientific evaluation.
First, we present a general overview of how GS works.
Second, we present empirical evidence about its main characteristics (size, coverage, and growth rate).
Third, we carry out a systematic analysis of the main limitations this search engine presents as a tool for the evaluation of scientific performance.
Lastly, we discuss the main differences between GS and other more traditional bibliographic databases in light of the correlations found between their citation data.
We conclude that Google Scholar presents a broader view of the academic world because it has brought to light a great amount of sources that were not previously visible.
Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously.
In this chapter, we advocate a rule-based approach to multi-label classification.
Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts.
Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain.
Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data.
Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification.
While mainly focusing on our own previous work, we also provide a short overview of related work in this area.
In this paper, we propose a convolutional neural network (CNN) with 3-D rank-1 filters, each composed as the outer product of 1-D filters.
After training, the 3-D rank-1 filters can be decomposed into 1-D filters at test time for fast inference.
The reason that we train 3-D rank-1 filters in the training stage instead of consecutive 1-D filters is that a better gradient flow can be obtained with this setting, which makes the training possible even in the case where the network with consecutive 1-D filters cannot be trained.
The 3-D rank-1 filters are updated by both the gradient flow and the outer product of the 1-D filters in every epoch, where the gradient flow tries to find a solution that minimizes the loss function, while the outer product operation constrains the filter parameters to lie on a rank-1 subspace.
Furthermore, we show that the convolution with the rank-1 filters results in low rank outputs, constraining the final output of the CNN also to live on a low dimensional subspace.
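The outer-product construction can be sketched directly: a 3-D filter built from three 1-D filters has a rank-1 mode unfolding, which is what makes the separable decomposition at test time possible.

```python
import numpy as np

def rank1_filter(u, v, w):
    """Build a 3-D filter as the outer product of three 1-D filters,
    f[i, j, k] = u[i] * v[j] * w[k]; every matricization of the
    result has rank 1."""
    return np.einsum('i,j,k->ijk', u, v, w)

u = np.array([1.0, 2.0])
v = np.array([1.0, -1.0, 3.0])
w = np.array([0.5, 2.0])
f = rank1_filter(u, v, w)            # shape (2, 3, 2)
# mode-1 unfolding: rows indexed by u, columns by (v, w) pairs
unfold = f.reshape(f.shape[0], -1)
```

At inference, convolving with `f` can then be replaced by three cheap consecutive 1-D convolutions along each axis.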
The question whether an ontology can safely be replaced by another, possibly simpler, one is fundamental for many ontology engineering and maintenance tasks.
It underpins, for example, ontology versioning, ontology modularization, forgetting, and knowledge exchange.
What safe replacement means depends on the intended application of the ontology.
If, for example, it is used to query data, then the answers to any relevant ontology-mediated query should be the same over any relevant data set; if, in contrast, the ontology is used for conceptual reasoning, then the entailed subsumptions between concept expressions should coincide.
This gives rise to different notions of ontology inseparability such as query inseparability and concept inseparability, which generalize corresponding notions of conservative extensions.
We survey results on various notions of inseparability in the context of description logic ontologies, discussing their applications, useful model-theoretic characterizations, algorithms for determining whether two ontologies are inseparable (and, sometimes, for computing the difference between them if they are not), and the computational complexity of this problem.
Many internal software metrics and external quality attributes of Java programs correlate strongly with program size.
This knowledge has been used pervasively in quantitative studies of software through practices such as normalization on size metrics.
This paper reports size-related super- and sublinear effects that have not been known before.
Findings obtained on a very large collection of Java programs -- 30,911 projects hosted at Google Code as of Summer 2011 -- unveil how certain characteristics of programs vary disproportionately with program size, sometimes even non-monotonically.
Many of the specific parameters of nonlinear relations are reported.
This result gives further insight into the differences between "programming in the small" and "programming in the large."
The reported findings carry important consequences for OO software metrics, and software research in general: metrics that have been known to correlate with size can now be properly normalized so that all the information that is left in them is size-independent.
The results obtained by analyzing signals with the Square Wave Method (SWM) introduced previously can be presented in the frequency domain clearly and precisely by using the Square Wave Transform (SWT) described here.
As an example, the SWT is used to analyze a sequence of samples (that is, of measured values) taken from an electroencephalographic recording.
We provide a framework for determining the centralities of agents in a broad family of random networks.
Current understanding of network centrality is largely restricted to deterministic settings, but practitioners frequently use random network models to accommodate data limitations or prove asymptotic results.
Our main theorems show that on large random networks, centrality measures are close to their expected values with high probability.
We illustrate the economic consequences of these results by presenting three applications: (1) In network formation models based on community structure (called stochastic block models), we show network segregation and differences in community size produce inequality.
Benefits from peer effects tend to accrue disproportionately to bigger and better-connected communities.
(2) When link probabilities depend on geography, we can compute and compare the centralities of agents in different locations.
(3) In models where connections depend on several independent characteristics, we give a formula that determines centralities 'characteristic-by-characteristic'.
The basic techniques from these applications, which use the main theorems to reduce questions about random networks to deterministic calculations, extend to many network games.
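The reduction from random networks to deterministic calculations can be sketched for application (1): compute eigenvector centrality on the expected adjacency matrix of a stochastic block model, where the bigger, better-connected community ends up more central. The block sizes and probabilities below are illustrative assumptions.

```python
import numpy as np

def expected_adjacency_sbm(sizes, P):
    """Expected adjacency matrix of a stochastic block model with
    community sizes `sizes` and block link-probability matrix P."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    A = P[labels][:, labels]
    np.fill_diagonal(A, 0.0)   # no self-loops
    return A

def eigen_centrality(A):
    """Eigenvector centrality: Perron eigenvector of A, normalized."""
    vals, vecs = np.linalg.eigh(A)
    v = np.abs(vecs[:, np.argmax(vals)])
    return v / v.sum()

# two communities, a big one (30 nodes) and a small one (10 nodes),
# denser within communities than across
A = expected_adjacency_sbm([30, 10], np.array([[0.5, 0.1],
                                               [0.1, 0.5]]))
c = eigen_centrality(A)
```

The main theorems guarantee that centralities on a large realized network concentrate around these deterministic expected-network values.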
In this paper, we propose a normalized cut segmentation algorithm with a spatial regularization prior and an adaptive similarity matrix.
We integrate the well-known expectation-maximum(EM) method in statistics and the regularization technique in partial differential equation (PDE) method into normalized cut (Ncut).
The introduced EM technique allows our method to adaptively update the similarity matrix, which helps us obtain a better classification criterion than the classical Ncut method, while the regularization prior guarantees that the proposed algorithm performs robustly under noise.
To unify these three quite different methods (EM, spatial regularization, and spectral graph clustering), we build a variational framework that combines them into a general normalized cut segmentation algorithm.
The well-defined theory of the proposed model is also given in the paper.
Numerical experiments show that our method achieves promising segmentation performance compared with existing methods such as the traditional Ncut algorithm and the variational Chan-Vese model.
Decision making in modern large-scale and complex systems such as communication networks, smart electricity grids, and cyber-physical systems motivate novel game-theoretic approaches.
This paper investigates big strategic (non-cooperative) games where a finite number of individual players each have a large number of continuous decision variables and input data points.
Such high-dimensional decision spaces and big data sets lead to computational challenges, relating to efforts in non-linear optimization scaling up to large systems of variables.
In addition to these computational challenges, real-world players often have limited information about their preference parameters due to the prohibitive cost of identifying them or due to operating in dynamic online settings.
The challenge of limited information is exacerbated in high dimensions and big data sets.
Motivated by both computational and information limitations that constrain the direct solution of big strategic games, our investigation centers around reductions using linear transformations such as random projection methods and their effect on Nash equilibrium solutions.
Specific analytical results are presented for quadratic games and approximations.
In addition, an adversarial learning game is presented where random projection and sampling schemes are investigated.
I assume in this paper that the proposition "I cannot know your intentional states" is true.
I consider its consequences on the use of so-called "intentional concepts" for Requirements Engineering.
I argue that if you take this proposition to be true, then intentional concepts (e.g., goal, belief, desire, intention, etc.) start to look less relevant (though not irrelevant), despite being the focus of significant research attention over the past three decades.
I identify substantial problems that arise if you use instances of intentional concepts to reflect intentional states.
I sketch an approach to address these problems.
In it, intentional concepts have a less prominent role, while notions of time, uncertainty, prediction, observability, evidence, and learning are at the forefront.
Testing has become an indispensable activity of software development, yet writing good and relevant tests remains a quite challenging task.
One well-known problem is that it is often impossible or unrealistic to test for every outcome, as the input and/or output of a program component can range over incredibly large, if not infinite, domains.
A common approach to tackle this issue is to test only classes of cases, and to assume that those classes cover all (or at least most) of the cases a component is likely to be exposed to.
Unfortunately, such assumptions can prove wrong in many situations, causing an otherwise well-tested program to fail on a particular input.
In this short paper, we propose to leverage formal verification, in particular model checking techniques, as a way to better identify cases for which the aforementioned assumptions do not hold, and ultimately strengthen the confidence one can have in a test suite.
The idea is to extract a formal specification of the data types of a program, in the form of a term rewriting system, and to check that specification against a set of properties specified by the programmer.
Cases for which those properties do not hold can then be identified using model checking, and selected as test cases.
This work presents an in-depth analysis of the majority of the deep neural networks (DNNs) proposed in the state of the art for image recognition.
For each DNN multiple performance indices are observed, such as recognition accuracy, model complexity, computational complexity, memory usage, and inference time.
The behavior of such performance indices and some combinations of them are analyzed and discussed.
To measure the indices, we experiment with the DNNs on two different computer architectures: a workstation equipped with an NVIDIA Titan X Pascal and an embedded system based on an NVIDIA Jetson TX1 board.
This experimentation allows a direct comparison between DNNs running on machines with very different computational capacity.
This study is useful for researchers to have a complete view of which solutions have been explored so far and which research directions are worth exploring in the future, and for practitioners to select the DNN architecture(s) that best fit the resource constraints of practical deployments and applications.
To complete this work, all the DNNs, as well as the software used for the analysis, are available online.
This article constructs a formal theory based on a binary operator of directional associative relation, and introduces the notion of an associative normal form for image constructions.
A model of a commutative semigroup that presents a sentence as the three components of an interrogative linguistic image construction is considered.
Class prediction is an important application of microarray gene expression data analysis.
The high-dimensionality of microarray data, where the number of genes (variables) is very large compared to the number of samples (observations), makes the application of many prediction techniques (e.g., logistic regression, discriminant analysis) difficult.
An efficient way to solve this problem is by using dimension-reduction statistical techniques.
Increasingly used in psychology-related applications, the Rasch model (RM) provides an appealing framework for handling high-dimensional microarray data.
In this paper, we study the potential of RM-based modeling in dimensionality reduction with binarized microarray gene expression data and investigate its prediction accuracy in the context of class prediction using linear discriminant analysis.
Two different publicly available microarray data sets are used to illustrate a general framework of the approach.
Performance of the proposed method is assessed by re-randomization scheme using principal component analysis (PCA) as a benchmark method.
Our results show that RM-based dimension reduction is as effective as PCA-based dimension reduction.
The method is general and can be applied to the other high-dimensional data problems.
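The PCA benchmark pipeline (reduce dimension, then classify in the reduced space) can be sketched with plain SVD; for brevity the sketch substitutes a nearest-centroid classifier for the linear discriminant analysis step used in the paper, and the data are synthetic.

```python
import numpy as np

def pca_reduce(X, k):
    """PCA via SVD: project samples onto the top-k principal
    components (the benchmark dimension-reduction step)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, mean, Vt[:k]

def nearest_centroid_predict(Z, y, Z_new):
    """Simple stand-in classifier for the class-prediction step
    (the paper uses linear discriminant analysis)."""
    classes = np.unique(y)
    cents = np.array([Z[y == c].mean(axis=0) for c in classes])
    d = ((Z_new[:, None, :] - cents[None]) ** 2).sum(-1)
    return classes[d.argmin(axis=1)]

# toy "high-dimensional" data: 6 samples, 10 features, 2 classes
X = np.zeros((6, 10))
X[:3, 0], X[3:, 0] = 5.0, -5.0           # class-separating feature
X[:, 1] = [0.1, 0.2, 0.3, 0.1, 0.2, 0.3]  # shared nuisance variation
y = np.array([0, 0, 0, 1, 1, 1])
Z, mu, Vk = pca_reduce(X, 2)
pred = nearest_centroid_predict(Z, y, Z)
```

The RM-based reduction proposed in the paper slots into the same pipeline in place of `pca_reduce`, operating on binarized expression values.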
Sparse code multiple access (SCMA) scheme is considered to be one promising non-orthogonal multiple access technology for the future fifth generation (5G) communications.
Due to the sparse nature, message passing algorithm (MPA) has been used as the receiver to achieve close to maximum likelihood (ML) detection performance with much lower complexity.
However, the complexity order of MPA is still exponential with the size of codebook and the degree of signal superposition on a given resource element.
In this paper, we propose a novel low complexity iterative receiver based on expectation propagation algorithm (EPA), which reduces the complexity order from exponential to linear.
Simulation results demonstrate that the proposed EPA receiver achieves nearly the same block error rate (BLER) performance as the conventional message passing algorithm (MPA) receiver with orders-of-magnitude lower complexity.
Reliable 4D aircraft trajectory prediction, whether in a real-time setting or for analysis of counterfactuals, is important to the efficiency of the aviation system.
Toward this end, we first propose a highly generalizable efficient tree-based matching algorithm to construct image-like feature maps from high-fidelity meteorological datasets - wind, temperature and convective weather.
We then model the track points on trajectories as conditional Gaussian mixtures with parameters to be learned from our proposed deep generative model, which is an end-to-end convolutional recurrent neural network that consists of a long short-term memory (LSTM) encoder network and a mixture density LSTM decoder network.
The encoder network embeds last-filed flight plan information into fixed-size hidden state variables and feeds the decoder network, which further learns the spatiotemporal correlations from the historical flight tracks and outputs the parameters of Gaussian mixtures.
Convolutional layers are integrated into the pipeline to learn representations from the high-dimension weather features.
During the inference process, beam search, adaptive Kalman filter, and Rauch-Tung-Striebel smoother algorithms are used to reduce the variance of the generated trajectories.
We provide a complete characterisation of the phenomenon of adversarial examples - inputs intentionally crafted to fool machine learning models.
We aim to cover all the important concerns in this field of study: (1) the conjectures on the existence of adversarial examples, (2) the security, safety and robustness implications, (3) the methods used to generate and (4) protect against adversarial examples and (5) the ability of adversarial examples to transfer between different machine learning models.
We provide ample background information in an effort to make this document self-contained.
Therefore, this document can be used as survey, tutorial or as a catalog of attacks and defences using adversarial examples.
In this paper, adaptive neural control (ANC) is investigated for a class of strict-feedback nonlinear stochastic systems with unknown parameters, unknown nonlinear functions and stochastic disturbances.
The new controller of adaptive neural network with state feedback is presented by using a universal approximation of radial basis function neural network and backstepping.
An adaptive neural network state-feedback controller is designed by constructing a suitable Lyapunov function.
Adaptive bounding design technique is used to deal with the unknown nonlinear functions and unknown parameters.
It is shown that global asymptotic stability in probability can be achieved for the closed-loop system.
The simulation results are presented to demonstrate the effectiveness of the proposed control strategy in the presence of unknown parameters, unknown nonlinear functions and stochastic disturbances.
Representing knowledge as high-dimensional vectors in a continuous semantic vector space can help overcome the brittleness and incompleteness of traditional knowledge bases.
We present a method for performing deductive reasoning directly in such a vector space, combining analogy, association, and deduction in a straightforward way at each step in a chain of reasoning, drawing on knowledge from diverse sources and ontologies.
We present PFDCMSS, a novel message-passing based parallel algorithm for mining time-faded heavy hitters.
The algorithm is a parallel version of the recently published FDCMSS sequential algorithm.
We formally prove its correctness by showing that the underlying data structure, a sketch augmented with a Space Saving stream summary holding exactly two counters, is mergeable.
Whilst the mergeability of traditional sketches follows immediately from theory, we show that merging our augmented sketch is non-trivial.
Nonetheless, the resulting parallel algorithm is fast and simple to implement.
To the best of our knowledge, PFDCMSS is the first parallel algorithm solving the problem of mining time-faded heavy hitters on message-passing parallel architectures.
Extensive experimental results confirm that PFDCMSS retains the extreme accuracy and error bound provided by FDCMSS whilst providing excellent parallel scalability.
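The Space Saving stream summary underlying the sketch, and the merge operation that parallelization requires, can be sketched as follows. This is a simplified illustration with a generic number of counters and a naive merge that keeps the largest survivors; the paper's augmented sketch holds exactly two counters per cell and requires the correctness proof mentioned above, including error bookkeeping omitted here.

```python
class SpaceSaving:
    """Space Saving summary with at most k counters: when a new item
    arrives and the summary is full, it replaces the minimum counter
    and inherits its count (+1)."""
    def __init__(self, k):
        self.k, self.counts = k, {}

    def update(self, item, w=1):
        if item in self.counts or len(self.counts) < self.k:
            self.counts[item] = self.counts.get(item, 0) + w
        else:
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + w

    def merge(self, other):
        """Naive merge: sum counts item-wise and keep the k largest
        survivors (no error tracking, unlike the paper's sketch)."""
        merged = dict(self.counts)
        for item, c in other.counts.items():
            merged[item] = merged.get(item, 0) + c
        top = sorted(merged.items(), key=lambda kv: -kv[1])[:self.k]
        out = SpaceSaving(self.k)
        out.counts = dict(top)
        return out
```

In the message-passing algorithm, each worker summarizes its share of the stream locally and the partial summaries are merged pairwise into a global one.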
We propose a version of the follow-the-perturbed-leader online prediction algorithm in which the cumulative losses are perturbed by independent symmetric random walks.
The forecaster is shown to achieve an expected regret of the optimal order O(sqrt(n log N)) where n is the time horizon and N is the number of experts.
More importantly, it is shown that the forecaster changes its prediction at most O(sqrt(n log N)) times, in expectation.
We also extend the analysis to online combinatorial optimization and show that even in this more general setting, the forecaster rarely switches between experts while having a regret of near-optimal order.
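The random-walk-perturbed forecaster can be sketched directly: the cumulative loss of each expert is perturbed by an independent symmetric random walk, and the forecaster follows the perturbed leader. The loss matrix below is an illustrative toy input.

```python
import numpy as np

def fpl_random_walk(losses, seed=0):
    """Follow-the-perturbed-leader where each expert's cumulative
    loss is perturbed by an independent symmetric random walk with
    +-1 increments; returns the predictions and the number of
    times the forecaster switched experts."""
    rng = np.random.default_rng(seed)
    n, N = losses.shape
    cum = np.zeros(N)           # cumulative losses so far
    walk = np.zeros(N)          # one random walk per expert
    picks, switches, prev = [], 0, None
    for t in range(n):
        walk += rng.choice([-1.0, 1.0], size=N)  # symmetric step
        leader = int(np.argmin(cum + walk))
        picks.append(leader)
        if prev is not None and leader != prev:
            switches += 1
        prev = leader
        cum += losses[t]
    return picks, switches
```

Because the perturbation is a random walk rather than fresh noise each round, consecutive perturbed leaders tend to coincide, which is the mechanism behind the O(sqrt(n log N)) bound on the number of switches.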
Social dilemmas, where mutual cooperation can lead to high payoffs but participants face incentives to cheat, are ubiquitous in multi-agent interaction.
We wish to construct agents that cooperate with pure cooperators, avoid exploitation by pure defectors, and incentivize cooperation from the rest.
However, often the actions taken by a partner are (partially) unobserved or the consequences of individual actions are hard to predict.
We show that in a large class of games good strategies can be constructed by conditioning one's behavior solely on outcomes (i.e., one's past rewards).
We call this consequentialist conditional cooperation.
We show how to construct such strategies using deep reinforcement learning techniques and demonstrate, both analytically and experimentally, that they are effective in social dilemmas beyond simple matrix games.
We also show the limitations of relying purely on consequences and discuss the need for understanding both the consequences of and the intentions behind an action.
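A minimal hand-coded sketch of consequentialist conditional cooperation in the iterated prisoner's dilemma: the agent looks only at its own average past reward and cooperates while payoffs are consistent with mutual cooperation. The payoff values are the standard textbook ones and the threshold is an assumption for illustration; the paper learns such strategies with deep reinforcement learning rather than hand-coding them.

```python
def ccc_action(past_rewards, threshold):
    """Choose an action based only on one's own past rewards:
    cooperate while average payoffs look like mutual cooperation,
    defect otherwise. `threshold` is an assumed payoff level."""
    if not past_rewards:
        return 'C'                       # start by cooperating
    avg = sum(past_rewards) / len(past_rewards)
    return 'C' if avg >= threshold else 'D'

# prisoner's dilemma payoffs to the row player: R=3, S=0, T=5, P=1
PD = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def play(partner_moves, threshold=2.0):
    """Play the CCC sketch against a fixed partner move sequence."""
    rewards, history = [], []
    for their in partner_moves:
        mine = ccc_action(rewards, threshold)
        history.append(mine)
        rewards.append(PD[(mine, their)])
    return history
```

Note the strategy never observes the partner's actions, only its own rewards, which is exactly what makes it applicable when partner actions are partially unobserved.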
Automatic language identification is a natural language processing problem that tries to determine the natural language of a given content.
In this paper we present a statistical method for automatic language identification of written text using dictionaries containing stop words and diacritics.
We propose different approaches that combine the two dictionaries to accurately determine the language of textual corpora.
This method was chosen because stop words and diacritics are very specific to a language; although some languages share similar words and special characters, they are not all common.
The languages taken into account were Romance languages, because they are very similar and it is usually hard to distinguish between them from a computational point of view.
We have tested our method using a Twitter corpus and a news article corpus.
Both corpora consist of UTF-8 encoded text, so the diacritics could be taken into account; when a text has no diacritics, only the stop words are used to determine its language.
The experimental results show that the proposed method has an accuracy of over 90% for small texts and over 99.8% for
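The stop-word component of such a method reduces to scoring each candidate language by how many of the text's tokens appear in its stop-word list; a minimal sketch with tiny illustrative dictionaries (real systems use much larger lists plus diacritic statistics):

```python
# toy stop-word dictionaries for three Romance languages; these
# short lists are illustrative assumptions, not the paper's data
STOP_WORDS = {
    'spanish': {'el', 'la', 'de', 'que', 'y', 'en', 'los'},
    'french': {'le', 'la', 'de', 'que', 'et', 'les', 'des'},
    'italian': {'il', 'la', 'di', 'che', 'e', 'in', 'per'},
}

def identify_language(text):
    """Score each language by the fraction of tokens that are stop
    words of that language; return the best-scoring language."""
    tokens = text.lower().split()
    scores = {lang: sum(t in sw for t in tokens) / max(len(tokens), 1)
              for lang, sw in STOP_WORDS.items()}
    return max(scores, key=scores.get)
```

Shared words such as "la" or "de" score for several languages at once, which is why combining stop words with diacritic evidence, as the paper proposes, improves discrimination between closely related languages.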
Mobile devices combine communication capabilities like no other gadget.
Moreover, they now comprise a wider set of applications while still maintaining reduced size and weight.
They have started to include accessibility features that enable the inclusion of disabled people.
However, these inclusive efforts still fall short considering the possibilities of such devices.
This is mainly due to the lack of interoperability and extensibility of current mobile operating systems (OS).
In this paper, we present a case study of a multi-impaired person where access to basic mobile applications was provided in an applicational basis.
We outline the main flaws in current mobile OS and suggest how these could further empower developers to provide accessibility components.
These could then be compounded to provide system-wide inclusion to a wider range of (multi)-impairments.
We empirically investigate learning from partial feedback in neural machine translation (NMT), when partial feedback is collected by asking users to highlight a correct chunk of a translation.
We propose a simple and effective way of utilizing such feedback in NMT training.
We demonstrate how the common machine translation problem of domain mismatch between training and deployment can be reduced solely based on chunk-level user feedback.
We conduct a series of simulation experiments to test the effectiveness of the proposed method.
Our results show that chunk-level feedback outperforms sentence-based feedback by up to 2.61% BLEU absolute.
This paper provides a case for using Bayesian data analysis (BDA) to make more grounded claims regarding practical significance of software engineering research.
We show that using BDA, here combined with cumulative prospect theory (CPT), is appropriate when a researcher or practitioner wants to make clearer connections between statistical findings and practical significance in empirical software engineering research.
To illustrate our point we provide an example case using previously published data.
We build a multilevel Bayesian model for this data, for which we compare the out of sample predictive power.
Finally, we use our model to make out of sample predictions while, ultimately, connecting this to practical significance using CPT.
Throughout the case that we present, we argue that a Bayesian approach is a natural, theoretically well-grounded, practical work-flow for data analysis in empirical software engineering.
By including prior beliefs, assuming parameters are drawn from a probability distribution, assuming the true value is a random variable for uncertainty intervals, using counter-factual plots for sanity checks, conducting predictive posterior checks, and out of sample predictions, we will better understand the phenomenon being studied, while at the same time avoid the obsession with p-values.
Recent studies have numerically demonstrated the possible advantages of the asynchronous non-orthogonal multiple access (ANOMA) over the conventional synchronous non-orthogonal multiple access (NOMA).
The ANOMA makes use of the oversampling technique by intentionally introducing a timing mismatch between symbols of different users.
Focusing on a two-user uplink system, for the first time, we analytically prove that the ANOMA with a sufficiently large frame length can always outperform the NOMA in terms of the sum throughput.
To this end, we derive the expression for the sum throughput of the ANOMA as a function of signal-to-noise ratio (SNR), frame length, and normalized timing mismatch.
Based on the derived expression, we find that users should transmit at full powers to maximize the sum throughput.
In addition, we obtain the optimal timing mismatch as the frame length goes to infinity.
Moreover, we comprehensively study the impact of timing error on the ANOMA throughput performance.
Two types of timing error, i.e., the synchronization timing error and the coordination timing error, are considered.
We derive the throughput loss incurred by both types of timing error and find that the synchronization timing error has a greater impact on the throughput performance compared to the coordination timing error.
The Bulk Synchronous Parallel (BSP) computational model has emerged as the dominant distributed framework to build large-scale iterative graph processing systems.
While its implementations (e.g., Pregel, Giraph, and Hama) achieve high scalability, frequent synchronization and communication among the workers can cause substantial parallel inefficiency.
To help address this critical concern, this paper introduces the GraphHP (Graph Hybrid Processing) platform, which inherits the friendly vertex-centric BSP programming interface and optimizes its synchronization and communication overhead.
To achieve the goal, we first propose a hybrid execution model which differentiates between the computations within a graph partition and across the partitions, and decouples the computations within a partition from distributed synchronization and communication.
By implementing the computations within a partition by pseudo-superstep iteration in memory, the hybrid execution model can effectively reduce synchronization and communication overhead while not requiring heavy scheduling overhead or graph-centric sequential algorithms.
We then demonstrate how the hybrid execution model can be easily implemented within the BSP abstraction to preserve its simple programming interface.
Finally, we evaluate our implementation of the GraphHP platform on classical BSP applications and show that it performs significantly better than the state-of-the-art BSP implementations.
Our GraphHP implementation is based on Hama, but can easily generalize to other BSP platforms.
An important problem in the implementation of Markov Chain Monte Carlo algorithms is to determine the convergence time, or the number of iterations before the chain is close to stationarity.
For many Markov chains used in practice this time is not known.
Even in cases where the convergence time is known to be polynomial, the theoretical bounds are often too crude to be practical.
Thus, practitioners like to carry out some form of statistical analysis in order to assess convergence.
This has led to the development of a number of methods known as convergence diagnostics which attempt to diagnose whether the Markov chain is far from stationarity.
We study the problem of testing convergence in the following settings and prove that it is computationally hard. First, given a Markov chain that mixes rapidly, it is Statistical Zero Knowledge-hard (SZK-hard) to distinguish whether, starting from a given state, the chain is close to stationarity by time t or far from stationarity at time ct for a constant c; we show this problem lies in AM intersect coAM.
Second, given a Markov chain that mixes rapidly, it is coNP-hard to distinguish whether it is close to stationarity by time t or far from stationarity at time ct for a constant c; this problem lies in coAM.
Finally, it is PSPACE-complete to distinguish whether the Markov chain is close to stationarity by time t or far from being mixed at time ct for c at least 1.
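As a concrete illustration of the quantity these diagnostics try to assess, the worst-case total-variation distance to stationarity can be computed exactly for a small, explicitly given chain. This is a toy numpy sketch, not related to the hardness constructions above:

```python
import numpy as np

def tv_to_stationarity(P, t):
    """Worst-case total-variation distance to stationarity after t steps:
    d(t) = max_x || P^t(x, .) - pi ||_TV for an explicit transition matrix P."""
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi /= pi.sum()
    Pt = np.linalg.matrix_power(P, t)
    return max(0.5 * np.abs(Pt[x] - pi).sum() for x in range(P.shape[0]))

# Lazy random walk on a 3-cycle: irreducible and aperiodic, so d(t) -> 0.
P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
```

For this chain the second-largest eigenvalue is 0.25, so the distance decays geometrically; deciding how large t must be for an arbitrary chain given only oracle access is what the results above show to be hard.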
Delay-coordinate reconstruction is a proven modeling strategy for building effective forecasts of nonlinear time series.
The first step in this process is the estimation of good values for two parameters, the time delay and the embedding dimension.
Many heuristics and strategies have been proposed in the literature for estimating these values.
Few, if any, of these methods were developed with forecasting in mind, however, and their results are not optimal for that purpose.
Even so, these heuristics---intended for other applications---are routinely used when building delay coordinate reconstruction-based forecast models.
In this paper, we propose a new strategy for choosing optimal parameter values for forecast methods that are based on delay-coordinate reconstructions.
The basic calculation involves maximizing the shared information between each delay vector and the future state of the system.
We illustrate the effectiveness of this method on several synthetic and experimental systems, showing that this metric can be calculated quickly and reliably from a relatively short time series, and that it provides a direct indication of how well a near-neighbor based forecasting method will work on a given delay reconstruction of that time series.
This allows a practitioner to choose reconstruction parameters that avoid any pathologies, regardless of the underlying mechanism, and maximize the predictive information contained in the reconstruction.
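A minimal sketch of the underlying idea, assuming a simple histogram-based mutual-information estimator and a single delayed coordinate rather than the full delay vector used in the paper:

```python
import numpy as np

def mutual_info(x, y, bins=16):
    """Histogram estimate of the mutual information I(X; Y) in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def delay_forecast_info(series, tau, horizon=1):
    """Shared information between a delayed coordinate and the future value:
    a candidate delay tau is better when this quantity is larger."""
    x = series[:-(tau + horizon)]
    y = series[tau + horizon:]
    return mutual_info(x, y)

# A deterministic signal carries far more predictive information than noise.
rng = np.random.default_rng(0)
t = np.arange(2000)
sine = np.sin(0.1 * t)
noise = rng.standard_normal(2000)
```

In practice the paper's metric conditions on the whole delay vector; this scalar version only conveys the flavor of "maximize information about the future" as a parameter-selection criterion.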
Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments.
Because these experiments are computation- and data-intensive, they require high-performance computing (HPC) techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems (SWfMS) and databases.
In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments.
This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application.
Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information.
We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow.
We analyze each workflow from both computational and scientific domain perspectives, by using queries to a provenance and annotation database.
Some of these queries are available as a pre-built feature of the BioWorkbench web application.
Through the provenance data, we show that the framework is scalable and achieves high performance, reducing the case studies' execution time by up to 98%.
We also show how the application of machine learning techniques can enrich the analysis process.
Transportation processes, which play a prominent role in the life and social sciences, are typically described by discrete models on lattices.
For studying their dynamics a continuous formulation of the problem via partial differential equations (PDE) is employed.
In this paper we propose a symbolic computation approach to derive mean-field PDEs from a lattice-based model.
We start with the microscopic equations, which give the probability of finding a particle at a given lattice site.
Then the PDEs are formally derived by Taylor expansions of the probability densities and by passing to an appropriate limit as the time steps and the distances between lattice sites tend to zero.
We present an implementation in a computer algebra system that performs this transition for a general class of models.
In order to rewrite the mean-field PDEs in a conservative formulation, we adapt and implement symbolic integration methods that can handle unspecified functions in several variables.
To illustrate our approach, we consider an application in crowd motion analysis where the dynamics of bidirectional flows are studied.
However, the presented approach can be applied to various transportation processes of multiple species with variable size in any dimension, for example, to confirm several proposed mean-field models for cell motility.
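The core expansion-and-limit step can be sketched in SymPy for the simplest case, an unbiased random walk, under the diffusive scaling h^2 = 2Dk; the paper's implementation handles a far more general class of models:

```python
import sympy as sp

# Symbols: lattice spacing h, time step k, macroscopic diffusivity D.
x, t, h, k, D = sp.symbols('x t h k D', positive=True)
p = sp.Function('p')

# Taylor expansions of the site probabilities around (x, t), mirroring
# the formal expansion step of the symbolic derivation.
px  = sp.Derivative(p(x, t), x)
pxx = sp.Derivative(p(x, t), (x, 2))
pt  = sp.Derivative(p(x, t), t)
p_left  = p(x, t) - h * px + h**2 / 2 * pxx   # p(x - h, t) to 2nd order
p_right = p(x, t) + h * px + h**2 / 2 * pxx   # p(x + h, t) to 2nd order
p_next  = p(x, t) + k * pt                    # p(x, t + k) to 1st order

# Master equation for an unbiased walk: jump left or right with prob. 1/2.
residual = sp.Rational(1, 2) * p_left + sp.Rational(1, 2) * p_right - p_next

# Passing to the limit with h^2 = 2 D k yields the heat equation p_t = D p_xx.
pde = sp.expand(residual / k).subs(h**2, 2 * D * k)
```

Here `pde` reduces to `D*pxx - pt`, i.e., the mean-field diffusion equation for this lattice model.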
In social networks, link prediction, which predicts missing links in current networks and new or dissolving links in future networks, is important for mining and analyzing the evolution of social networks.
In the past decade, much work has been done on link prediction in social networks.
The goal of this paper is to comprehensively review, analyze and discuss the state-of-the-art of the link prediction in social networks.
A systematic categorization of link prediction techniques and problems is presented.
Then link prediction techniques and problems are analyzed and discussed.
Typical applications of link prediction are also addressed.
Achievements and roadmaps of some active research groups are introduced.
Finally, some future challenges of the link prediction in social networks are discussed.
In this paper, classical Lagrangian mechanics is used to model the dynamics of an underactuated system, specifically a rotary inverted pendulum described by two equations of motion.
A basic design of the system is proposed in the SOLIDWORKS 3D CAD software, which, based on the material and dimensions of the model, provides the physical variables necessary for modeling.
To verify the results obtained, the CAD model simulated in the SimMechanics environment of MATLAB is compared with the mathematical model consisting of the Euler-Lagrange equations implemented in Simulink, solved with the ODE23tb method included in the MATLAB libraries for systems of equations of the type and order obtained.
This article also presents a topological analysis of the pendulum trajectories through a phase-space diagram, which allows the identification of stable and unstable regions of the system.
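As an illustration of this kind of numerical phase-space analysis, here is a pure-Python sketch for a simple (not rotary) frictionless pendulum, integrated with a classical RK4 step as a stand-in for ODE23tb; on a conserved-energy contour the trajectory traces a stable closed orbit:

```python
import math

def pendulum_rhs(state, g=9.81, l=0.5):
    """Frictionless simple pendulum: theta'' = -(g/l) sin(theta)."""
    theta, omega = state
    return (omega, -(g / l) * math.sin(theta))

def rk4_step(f, state, dt):
    """One classical Runge-Kutta step (a simple stand-in for ODE23tb)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, g=9.81, l=0.5, m=1.0):
    """Total mechanical energy; constant along orbits of the ideal pendulum."""
    theta, omega = state
    return 0.5 * m * (l * omega) ** 2 + m * g * l * (1 - math.cos(theta))

# Trace a trajectory in the (theta, omega) phase space from rest at theta = 1.
state = (1.0, 0.0)
trajectory = [state]
for _ in range(2000):
    state = rk4_step(pendulum_rhs, state, 0.001)
    trajectory.append(state)
```

Plotting theta against omega for several initial energies would reproduce the qualitative phase portrait: closed orbits around the stable equilibrium and separatrices through the unstable one.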
We provide a method for answering elementary science test questions using instructional materials.
We posit that there is a hidden structure that explains the correctness of an answer given the question and instructional materials and present a unified max-margin framework that learns to find these hidden structures (given a corpus of question-answer pairs and instructional materials), and uses what it learns to answer novel elementary science questions.
Our evaluation shows that our framework outperforms several strong baselines.
In IEEE 802.11 networks, selfish stations can pursue a better quality of service (QoS) through selfish MAC-layer attacks.
Such attacks are easy to perform, secure routing protocols do not prevent them, and their detection may be complex.
Two-hop relay topologies allow a new angle of attack: a selfish relay can tamper with either source traffic, transit traffic, or both.
We consider the applicability of selfish attacks and their variants in the two-hop relay topology, quantify their impact, and study defense measures.
The age of the root of the Indo-European language family has received much attention since the application of Bayesian phylogenetic methods by Gray and Atkinson (2003).
The root age of the Indo-European family has tended to decrease from an age that supported the Anatolian origin hypothesis to an age that supports the Steppe origin hypothesis with the application of new models (Chang et al., 2015).
However, none of the published work in Indo-European phylogenetics has studied the effect of tree priors on phylogenetic analyses of the Indo-European family.
In this paper, I intend to fill this gap by exploring the effect of tree priors on different aspects of the Indo-European family's phylogenetic inference.
I apply three tree priors---Uniform, Fossilized Birth-Death (FBD), and Coalescent---to five publicly available datasets of the Indo-European language family.
I evaluate the posterior distribution of the trees from the Bayesian analysis using Bayes Factor, and find that there is support for the Steppe origin hypothesis in the case of two tree priors.
I report the median and 95% highest posterior density (HPD) interval of the root ages for all the three tree priors.
A model comparison suggested that either the Uniform prior or the FBD prior is more suitable than the Coalescent prior for the datasets belonging to the Indo-European language family.
We propose a language-agnostic way of automatically generating sets of semantically similar clusters of entities along with sets of "outlier" elements, which may then be used to perform an intrinsic evaluation of word embeddings in the outlier detection task.
We used our methodology to create a gold-standard dataset, which we call WikiSem500, and evaluated multiple state-of-the-art embeddings.
The results show a correlation between performance on this dataset and performance on sentiment analysis.
The discriminative power of modern deep learning models for 3D human action recognition is growing ever more potent.
In conjunction with the recent resurgence of 3D human action representation with 3D skeletons, the quality and the pace of recent progress have been significant.
However, the inner workings of state-of-the-art learning based methods in 3D human action recognition still remain mostly black-box.
In this work, we propose to use a new class of models known as Temporal Convolutional Neural Networks (TCN) for 3D human action recognition.
Compared to popular LSTM-based Recurrent Neural Network models, given interpretable input such as 3D skeletons, a TCN provides a way to explicitly learn readily interpretable spatio-temporal representations for 3D human action recognition.
We provide our strategy for re-designing the TCN with interpretability in mind and show how such characteristics of the model are leveraged to construct a powerful 3D activity recognition method.
Through this work, we wish to take a step towards a spatio-temporal model that is easier to understand, explain and interpret.
The resulting model, Res-TCN, achieves state-of-the-art results on the largest 3D human action recognition dataset, NTU-RGBD.
Numerous variants of Self-Organizing Maps (SOMs) have been proposed in the literature, including those which also possess an underlying structure, and in some cases this structure can itself be defined by the user. Although the concepts of growing the SOM and updating it have been studied, the whole issue of using a self-organizing Adaptive Data Structure (ADS) to further enhance the properties of the underlying SOM has remained unexplored.
In an earlier work, we imposed an arbitrary, user-defined, tree-like topology onto the codebooks, which consequently enforced a neighborhood phenomenon and the so-called tree-based Bubble of Activity.
In this paper, we consider how the underlying tree itself can be rendered dynamic and adaptively transformed.
To do this, we present methods by which a SOM with an underlying Binary Search Tree (BST) structure can be adaptively re-structured using Conditional Rotations (CONROT).
These rotations on the nodes of the tree are local, can be done in constant time, and performed so as to decrease the Weighted Path Length (WPL) of the entire tree.
In doing this, we introduce the pioneering concept referred to as Neural Promotion, where neurons gain prominence in the Neural Network (NN) as their significance increases.
We are not aware of any research which deals with the issue of Neural Promotion.
The advantage of such a scheme is that the user need not be aware of any of the topological peculiarities of the stochastic data distribution.
Rather, the algorithm, referred to as the TTOSOM with Conditional Rotations (TTOCONROT), converges in such a manner that the neurons are ultimately placed in the input space so as to represent its stochastic distribution, and additionally, the neighborhood properties of the neurons suit the best BST that represents the data.
These properties have been confirmed by our experimental results on a variety of data sets.
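A minimal sketch of the mechanism behind conditional rotations, assuming toy access weights on the nodes: a single constant-time rotation promotes a heavily accessed node and thereby decreases the Weighted Path Length (WPL):

```python
class Node:
    """BST node carrying an access weight (e.g., an access frequency)."""
    def __init__(self, key, weight, left=None, right=None):
        self.key, self.weight = key, weight
        self.left, self.right = left, right

def wpl(node, depth=1):
    """Weighted path length: sum over nodes of weight times depth."""
    if node is None:
        return 0
    return node.weight * depth + wpl(node.left, depth + 1) + wpl(node.right, depth + 1)

def rotate_up(parent, child_side):
    """Single rotation promoting a child toward the root, in constant time."""
    child = getattr(parent, child_side)
    if child_side == 'left':          # right rotation
        parent.left, child.right = child.right, parent
    else:                             # left rotation
        parent.right, child.left = child.left, parent
    return child                      # new subtree root

# A frequently accessed key (weight 10) sits below a rarely accessed one:
root = Node(2, weight=1, left=Node(1, weight=10))
before = wpl(root)                    # 1*1 + 10*2 = 21
root = rotate_up(root, 'left')        # promote the hot node
after = wpl(root)                     # 10*1 + 1*2 = 12
```

A conditional-rotation scheme would perform such a rotation only when a local criterion on the subtree weights guarantees the WPL decreases, which is exactly what happens here.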
Conceived in the early 1990s, Experience Replay (ER) has been shown to be a successful mechanism to allow online learning algorithms to reuse past experiences.
Traditionally, ER can be applied to all machine learning paradigms (i.e., unsupervised, supervised, and reinforcement learning).
Recently, ER has contributed to improving the performance of deep reinforcement learning.
Yet, its application to many practical settings is still limited by the memory requirements of ER, necessary to explicitly store previous observations.
To remedy this issue, we explore a novel approach, Online Contrastive Divergence with Generative Replay (OCD_GR), which uses the generative capability of Restricted Boltzmann Machines (RBMs) instead of recorded past experiences.
The RBM is trained online, and does not require the system to store any of the observed data points.
We compare OCD_GR to ER on 9 real-world datasets, considering a worst-case scenario (data points arriving in sorted order) as well as a more realistic one (sequential random-order data points).
Our results show that in 64.28% of the cases OCD_GR outperforms ER and in the remaining 35.72% it has an almost equal performance, while having a considerably reduced space complexity (i.e., memory usage) at a comparable time complexity.
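A toy sketch of the generative-replay ingredient, assuming a tiny Bernoulli RBM trained online with one-step contrastive divergence (CD-1); this illustrates the general idea of replacing stored experiences with samples from a generative model, not the OCD_GR algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(42)

class RBM:
    """Tiny Bernoulli RBM trained online; no observations are ever stored."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_update(self, v0):
        """One CD-1 step on a single observation, then discard it."""
        p_h0 = self._sigmoid(v0 @ self.W + self.b_h)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        p_v1 = self._sigmoid(h0 @ self.W.T + self.b_v)   # model reconstruction
        p_h1 = self._sigmoid(p_v1 @ self.W + self.b_h)
        self.W += self.lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
        self.b_v += self.lr * (v0 - p_v1)
        self.b_h += self.lr * (p_h0 - p_h1)
        return p_v1

    def sample_visible(self, n_steps=10):
        """Generate a replayed observation by Gibbs sampling from the model."""
        v = (rng.random(self.b_v.shape) < 0.5).astype(float)
        for _ in range(n_steps):
            h = (rng.random(self.b_h.shape) < self._sigmoid(v @ self.W + self.b_h)).astype(float)
            v = (rng.random(self.b_v.shape) < self._sigmoid(h @ self.W.T + self.b_v)).astype(float)
        return v

rbm = RBM(n_visible=6, n_hidden=4)
pattern = np.array([1., 0., 1., 0., 1., 0.])
for _ in range(200):
    rbm.cd1_update(pattern)
```

After training, `sample_visible` produces observations resembling the data stream, so a learner can replay past experience without the memory cost of an explicit buffer.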
Conventional surveillance systems for monitoring infectious diseases, such as influenza, face challenges due to shortage of skilled healthcare professionals, remoteness of communities and absence of communication infrastructures.
Internet-based approaches for surveillance are appealing logistically as well as economically.
Search engine queries and Twitter have been the primary data sources used in such approaches.
The aim of this study is to assess the predictive power of an alternative data source, Instagram.
By using 317 weeks of publicly available data from Instagram, we trained several machine learning algorithms to both nowcast and forecast the number of official influenza-like illness incidents in Finland where population-wide official statistics about the weekly incidents are available.
In addition to date and hashtag count features of online posts, we were also able to utilize the visual content of the posted images with the help of deep convolutional neural networks.
Our best nowcasting model reached a mean absolute error of 11.33 incidents per week and a correlation coefficient of 0.963 on the test data.
Forecasting models for predicting 1 week and 2 weeks ahead showed statistical significance as well by reaching correlation coefficients of 0.903 and 0.862, respectively.
This study demonstrates how social media and in particular, digital photographs shared in them, can be a valuable source of information for the field of infodemiology.
This paper investigates reversibility properties of 1-dimensional 3-neighborhood d-state finite cellular automata (CAs) of length n under periodic boundary condition.
A tool named reachability tree has been developed from de Bruijn graph which represents all possible reachable configurations of an n-cell CA.
This tool has been used to test reversibility of CAs.
We have identified a large set of reversible CAs using this tool by following some greedy strategies.
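For intuition, reversibility of a small finite CA can be checked by brute force; tools like the reachability tree exist precisely because this enumeration explodes for larger n and d. The rule encoding below assumes the standard Wolfram numbering for the elementary case d = 2:

```python
from itertools import product

def ca_step(config, rule):
    """One synchronous update of a 3-neighborhood CA on a periodic ring."""
    n = len(config)
    return tuple(rule[(config[(i - 1) % n], config[i], config[(i + 1) % n])]
                 for i in range(n))

def is_reversible(rule, n, states=2):
    """Brute-force test: the global map of a finite CA is reversible
    iff it is injective, i.e., all states**n configurations have
    distinct images."""
    images = {ca_step(c, rule) for c in product(range(states), repeat=n)}
    return len(images) == states ** n

def wolfram_rule(number):
    """Local rule table of an elementary (d = 2) CA from its Wolfram number."""
    return {(a, b, c): (number >> (a * 4 + b * 2 + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}
```

For example, the additive rule 150 is reversible on a ring of 5 cells, while rule 110 is not (both the all-zeros and all-ones configurations map to all zeros).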
Coupled natural systems are generally modeled at multiple abstraction levels.
Both structural scale and behavioral complexity of these models are determinants in the kinds of questions that can be posed and answered.
As scale and complexity of models increase, simulation efficiency must increase to resolve tradeoffs between model resolution and simulation time.
From this vantage point, we will show some problems and solutions using as an example a vegetation-landscape model where individual plants belonging to different species are represented as collectives that undergo growth and decline cycles spanning hundreds of years.
Collective plant entities are assigned to cells of a static, two-dimensional grid.
This coarse-grain model, guided by homomorphic modeling ideas, is derived from a fine-grain model representing plants as individual objects.
These models are developed using Python and GRASS tools.
A set of experiments is devised to reveal some barriers in modeling and simulating this class of systems.
A wireless network is realized by mobile devices which communicate over radio channels.
Since real-life experiments with real devices are very difficult, simulation is used very often.
Among many other important properties that have to be defined for simulative experiments, the mobility model and the radio propagation model have to be selected carefully.
Both have strong impact on the performance of mobile wireless networks, e.g., the performance of routing protocols varies with these models.
There are many mobility and radio propagation models proposed in literature.
Each of them was developed with different objectives and is not suited for every physical scenario.
In common wireless network simulators, researchers generally consider simple radio propagation models and neglect obstacles in the propagation environment.
In this paper, we study the performance of wireless network simulation by considering different radio propagation models and taking obstacles in the propagation environment into account.
We analyze the performance of wireless networks with the OPNET Modeler and quantify parameters such as throughput, packets received, and attenuation.
Recently, some papers have been devoted to the study of the connections between binary block codes and BCK-algebras.
In this paper, we try to generalize these results to n-ary block codes, providing an algorithm which allows us to construct a BCK-algebra from a given n-ary block code.
The aim of this paper is to alter the abstract definition of a program in the theoretical programming model that has been developed at Eotvos Lorand University for many years in order to investigate methods that support designing correct programs.
The motivation for this modification is to make the dynamic properties of programs appear in the model.
This new definition of a program makes it possible to extend the model with the concept of subprograms while preserving the earlier results of the original programming model.
Model checking has been successfully used in many computer science fields, including artificial intelligence, theoretical computer science, and databases.
Most of the proposed solutions make use of classical, point-based temporal logics, while little work has been done in the interval temporal logic setting.
Recently, a non-elementary model checking algorithm for Halpern and Shoham's modal logic of time intervals HS over finite Kripke structures (under the homogeneity assumption) and an EXPSPACE model checking procedure for two meaningful fragments of it have been proposed.
In this paper, we show that more efficient model checking procedures can be developed for some expressive enough fragments of HS.
The newly released Orange D4D mobile phone database provides new insights into the use of mobile technology in a developing country.
Here we perform a series of spatial data analyses that reveal important geographic aspects of mobile phone use in Cote d'Ivoire.
We first map the locations of base stations with respect to the population distribution and the number and duration of calls at each base station.
On this basis, we estimate the energy consumed by the mobile phone network.
Finally, we perform an analysis of inter-city mobility, and identify high-traffic roads in the country.
PL for SOA formally proposes a software engineering methodology, development techniques, and support tools for the provision of service product lines.
We propose rigorous modeling techniques for the specification and verification of formal notations and languages for service computing with support for variability.
Through these cutting-edge technologies, increased levels of flexibility and adaptivity can be achieved.
This will involve developing semantics of variability over behavioural models of services.
Such tools will assist organizations to plan, optimize and control the quality of software service provision, both at design and at run time by making it possible to develop flexible and cost-effective software systems that support high levels of reuse.
We tackle this challenge from two levels.
We use feature modeling from product line engineering and, from a services point of view, the orchestration language Orc.
We introduce the Smart Grid as the service product line to apply the techniques to.
Visual object tracking is a challenging computer vision task with numerous real-world applications.
Here we propose a simple but efficient Spectral Filter Tracking (SFT) method.
To characterize rotational and translational invariance of tracking targets, the candidate image region is modeled as a pixelwise grid graph.
Instead of conventional graph matching, we convert the tracking into a plain least-squares regression problem to estimate the best center coordinate of the target.
But different from the holistic regression of correlation-filter-based methods, SFT can operate on localized surrounding regions of each pixel (i.e., vertex) by using spectral graph filters, which makes it more robust to local variations and cluttered backgrounds. To bypass the eigenvalue decomposition of the graph Laplacian matrix L, we parameterize the spectral graph filters as polynomials of L using spectral graph theory, in which L^k exactly encodes the k-hop local neighborhood of each vertex.
Finally, the filter parameters (i.e., polynomial coefficients) as well as feature projecting functions are jointly integrated into the regression model.
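A minimal numpy sketch of the key trick, assuming hypothetical filter coefficients: applying a polynomial of the Laplacian, y = sum_k theta_k L^k x, with repeated matrix-vector products, so no eigenvalue decomposition is ever needed:

```python
import numpy as np

def grid_laplacian(rows, cols):
    """Combinatorial Laplacian L = D - A of a 4-connected pixel grid graph."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                      # right neighbor
                A[i, i + 1] = A[i + 1, i] = 1
            if r + 1 < rows:                      # bottom neighbor
                A[i, i + cols] = A[i + cols, i] = 1
    return np.diag(A.sum(axis=1)) - A

def polynomial_filter(L, theta, x):
    """Apply y = sum_k theta_k L^k x via repeated mat-vec products.
    Each power L^k only mixes values within the k-hop neighborhood of a
    vertex, which is what makes the filter spatially localized."""
    y = np.zeros_like(x)
    Lkx = x.copy()
    for coeff in theta:
        y += coeff * Lkx
        Lkx = L @ Lkx
    return y

L = grid_laplacian(4, 4)
rng = np.random.default_rng(1)
x = rng.standard_normal(16)            # a toy per-pixel feature signal
theta = [1.0, -0.3, 0.05]              # hypothetical learned coefficients
y = polynomial_filter(L, theta, x)
```

The same output could be obtained in the spectral domain by diagonalizing L and scaling each eigencomponent by the polynomial evaluated at its eigenvalue; the point of the parameterization is that the polynomial form avoids that decomposition.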
Compared with word embedding based on point representation, distribution-based word embedding shows more flexibility in expressing uncertainty and therefore embeds richer semantic information when representing words.
The Wasserstein distance provides a natural notion of dissimilarity between probability measures and has a closed-form solution when measuring the distance between two Gaussian distributions.
Therefore, with the aim of representing words in a highly efficient way, we propose to operate a Gaussian word embedding model with a loss function based on the Wasserstein distance.
Also, external information from ConceptNet will be used to semi-supervise the results of the Gaussian word embedding.
Thirteen datasets from the word similarity task, together with one from the word entailment task, and six datasets from the downstream document classification task will be evaluated in this paper to test our hypothesis.
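The closed-form solution mentioned above can be sketched directly; this is a numpy illustration of the standard 2-Wasserstein formula for Gaussians, not the paper's training code:

```python
import numpy as np

def _sqrtm_psd(M):
    """Matrix square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.T

def w2_gaussian(mu1, cov1, mu2, cov2):
    """Closed-form 2-Wasserstein distance between two Gaussians:

    W2^2 = ||mu1 - mu2||^2
           + Tr(cov1 + cov2 - 2 (cov2^{1/2} cov1 cov2^{1/2})^{1/2})
    """
    s2 = _sqrtm_psd(cov2)
    cross = _sqrtm_psd(s2 @ cov1 @ s2)
    w2_sq = np.sum((mu1 - mu2) ** 2) + np.trace(cov1 + cov2 - 2 * cross)
    return np.sqrt(max(w2_sq, 0.0))
```

When the covariances are equal the distance reduces to the Euclidean distance between the means, which is why it behaves smoothly as a loss for embedding training.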
Recently, merging signal processing techniques with information security services has found a lot of attention.
Steganography and steganalysis are among those trends.
Like their counterparts in cryptology, steganography and steganalysis are in a constant battle.
Steganography methods try to hide the presence of covert messages in innocuous-looking data, whereas steganalysis methods try to reveal existence of such messages and to break steganography methods.
The stream nature of audio signals, their popularity, and their wide spread usage make them very suitable media for steganography.
This has led to a very rich literature on both steganography and steganalysis of audio signals.
This paper intends to conduct a comprehensive review of audio steganalysis methods spanning nearly fifteen years.
Furthermore, we implement some of the most recent audio steganalysis methods and conduct a comparative analysis on their performances.
Finally, the paper provides some possible directions for future research on audio steganalysis.
The Internet-of-things (IoT) is the paradigm where anything will be connected.
There are two main approaches to handle the surge in the uplink (UL) traffic the IoT is expected to generate, namely, Scheduled UL (SC-UL) and random access uplink (RA-UL) transmissions.
SC-UL is perceived as a viable tool to control Quality-of-Service (QoS) levels while entailing some overhead in the scheduling request prior to any UL transmission.
On the other hand, RA-UL is a simple single-phase transmission strategy.
While this obviously eliminates scheduling overheads, very little is known about how scalable RA-UL is.
At this critical junction, there is a dire need to analyze the scalability of these two paradigms.
To that end, this paper develops a spatiotemporal mathematical framework to analyze and assess the performance of SC-UL and RA-UL.
The developed paradigm jointly utilizes stochastic geometry and queueing theory.
Based on such a framework, we show that the answer to the "scheduling vs. random access paradox" actually depends on the operational scenario.
Particularly, the RA-UL scheme offers low access delays but suffers from limited scalability, i.e., it cannot support a large number of IoT devices.
On the other hand, SC-UL transmission is better suited for higher device intensities and traffic rates.
Generative models that learn disentangled representations for different factors of variation in an image can be very useful for targeted data augmentation.
By sampling from the disentangled latent subspace of interest, we can efficiently generate new data necessary for a particular task.
Learning disentangled representations is a challenging problem, especially when certain factors of variation are difficult to label.
In this paper, we introduce a novel architecture that disentangles the latent space into two complementary subspaces by using only weak supervision in form of pairwise similarity labels.
Inspired by the recent success of cycle-consistent adversarial architectures, we use cycle-consistency in a variational auto-encoder framework.
Our non-adversarial approach is in contrast with the recent works that combine adversarial training with auto-encoders to disentangle representations.
We show compelling results of disentangled latent subspaces on three datasets and compare with recent works that leverage adversarial training.
Natural Language Interfaces and tools such as spellcheckers and Web search in one's own language are known to be useful in ICT-mediated communication.
Most languages in Southern Africa are under-resourced, however.
Therefore, it would be very useful if both the generic and the few language-specific NLP tools could be reused or easily adapted across languages.
This depends on the notion, and extent, of similarity between the languages.
We assess this from the angle of orthography and corpora.
Twelve versions of the Universal Declaration of Human Rights (UDHR) are examined, revealing clusters of languages that are thus more or less amenable to cross-language adaptation of NLP tools; these clusters do not match Guthrie zones.
To examine the generalisability of these results, we zoom in on isiZulu both quantitatively and qualitatively with four other corpora and texts in different genres.
The results show that the UDHR is a typical text document orthographically.
The results also provide insight into the usability of typical measures such as lexical diversity and genre, and show that the same statistic may mean different things in different documents.
While NLTK for Python could be used for basic analyses of text, it, and similar NLP tools, will need considerable customization.
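For illustration, the simplest lexical-diversity statistic, the type-token ratio, can be computed in a few lines; its well-known sensitivity to text length and to morphology (agglutinating languages such as isiZulu inflate the type count) is one reason the same statistic may mean different things in different documents:

```python
import re

def lexical_diversity(text):
    """Type-token ratio: distinct word forms divided by total tokens.

    Tokenization is a naive Unicode letter-run split; real corpus work
    would need language-aware tokenization, which is part of the
    customization effort discussed above.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0
```

For example, "the cat sat on the mat" has six tokens but five types, giving a ratio of 5/6; a longer text in the same style would typically score lower simply because repeats accumulate.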
Identifying (and fixing) homonymous and synonymous author profiles is one of the major tasks of curating personalized bibliographic metadata repositories like the dblp computer science bibliography.
In this paper, we present and evaluate a machine learning approach to identify homonymous author bibliographies using a simple multilayer perceptron setup.
We train our model on a novel gold-standard data set derived from the past years of active, manual curation at the dblp computer science bibliography.
The problem of landmark recognition has achieved excellent results in small-scale datasets.
When dealing with large-scale retrieval, issues that were irrelevant with small amounts of data quickly become fundamental for an efficient retrieval phase.
In particular, computational time needs to be kept as low as possible, whilst the retrieval accuracy has to be preserved as much as possible.
In this paper we propose a novel multi-index hashing method called Bag of Indexes (BoI) for Approximate Nearest Neighbors (ANN) search.
It drastically reduces query time and outperforms state-of-the-art methods in accuracy for large-scale landmark recognition.
It has been demonstrated that this family of algorithms can be applied to different embedding techniques, such as VLAD and R-MAC, obtaining excellent results in very short time on several public datasets: Holidays+Flickr1M, Oxford105k, and Paris106k.
This paper presents a framework for exact discovery of the top-k sequential patterns under Leverage.
It combines (1) a novel definition of the expected support for a sequential pattern - a concept on which most interestingness measures directly rely - with (2) SkOPUS: a new branch-and-bound algorithm for the exact discovery of top-k sequential patterns under a given measure of interest.
Our interestingness measure employs the partition approach.
A pattern is interesting to the extent that it is more frequent than can be explained by assuming independence between any of the pairs of patterns from which it can be composed.
The larger the support compared to the expectation under independence, the more interesting is the pattern.
We build on these two elements to exactly extract the k sequential patterns with highest leverage, consistent with our definition of expected support.
We conduct experiments on both synthetic data with known patterns and real-world datasets; both experiments confirm the consistency and relevance of our approach with regard to the state of the art.
This article was published in Data Mining and Knowledge Discovery and is accessible at http://dx.doi.org/10.1007/s10618-016-0467-9.
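To make the notion of expected support concrete, here is a simplified, itemset-style sketch of leverage under the partition approach; the paper's actual definition is for sequential patterns, where order matters:

```python
from itertools import combinations

def support(pattern, transactions):
    """Fraction of transactions containing every item of the pattern."""
    return sum(1 for t in transactions if set(pattern) <= t) / len(transactions)

def leverage(pattern, transactions):
    """Observed support minus the largest expected support under
    independence over all two-way partitions of the pattern
    (simplified, unordered itemset version)."""
    expected = max(
        support(part, transactions)
        * support(tuple(set(pattern) - set(part)), transactions)
        for r in range(1, len(pattern))
        for part in combinations(pattern, r)
    )
    return support(pattern, transactions) - expected

# Toy data: 'a' and 'b' co-occur more often than independence predicts.
transactions = [set(t) for t in
                ["ab", "ab", "ab", "ab", "c", "c", "a", "b"]]
```

Here support({a,b}) = 4/8 while support(a) * support(b) = 25/64, so the leverage is 7/64: the pattern is interesting precisely to the extent that its support exceeds every such independence-based expectation.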
Content-based routing (CBR) is a powerful model that supports scalable asynchronous communication among large sets of geographically distributed nodes.
Yet, preserving privacy represents a major limitation for the wide adoption of CBR, notably when the routers are located in public clouds.
Indeed, a CBR router must see the content of the messages sent by data producers, as well as the filters (or subscriptions) registered by data consumers.
This represents a major deterrent for companies for which data is a key asset, as for instance in the case of financial markets or to conduct sensitive business-to-business transactions.
While there exist some techniques for privacy-preserving computation, they are either prohibitively slow or too limited to be usable in real systems.
In this paper, we follow a different strategy by taking advantage of trusted hardware extensions that have just been introduced in off-the-shelf processors and provide a trusted execution environment.
We exploit Intel's new software guard extensions (SGX) to implement a CBR engine in a secure enclave.
Thanks to the hardware-based trusted execution environment (TEE), the compute-intensive CBR operations can operate on decrypted data shielded by the enclave and leverage efficient matching algorithms.
Extensive experimental evaluation shows that SGX adds only limited overhead to insecure plaintext matching outside secure enclaves while providing much better performance and more powerful filtering capabilities than alternative software-only solutions.
To the best of our knowledge, this work is the first to demonstrate the practical benefits of SGX for privacy-preserving CBR.
Recurrent neural networks are powerful dynamical systems, but they are very sensitive to their hyper-parameter configuration.
Moreover, properly training a recurrent neural network is a difficult task; therefore, selecting an appropriate configuration is critical.
Varied strategies have been proposed to tackle this issue; however, most of them remain impractical because of the time and resources needed.
In this study, we propose a low computational cost model to evaluate the expected performance of a given architecture based on the distribution of the error of random samples.
We validate our proposal empirically using three use cases.
The min-rank of a digraph was shown by Bar-Yossef et al. (2006) to represent the length of an optimal scalar linear solution of the corresponding instance of the Index Coding with Side Information (ICSI) problem.
In this work, the graphs and digraphs of near-extreme min-ranks are characterized.
Those graphs and digraphs correspond to the ICSI instances having near-extreme transmission rates when using optimal scalar linear index codes.
In particular, it is shown that the decision problem whether a digraph has min-rank two is NP-complete.
By contrast, the same question for graphs can be answered in polynomial time.
Additionally, a new upper bound on the min-rank of a digraph, the circuit-packing bound, is presented.
This bound is often tighter than the previously known bounds.
By employing this new bound, we present several families of digraphs whose min-ranks can be found in polynomial time.
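For reference, the min-rank discussed here is standardly defined (over a field $\mathbb{F}$, following Bar-Yossef et al.) as the minimum rank of a matrix that fits the digraph $D$ on $n$ vertices:

```latex
\operatorname{minrk}_{\mathbb{F}}(D) \;=\; \min\bigl\{\, \operatorname{rank}(M) \;:\;
M \in \mathbb{F}^{n\times n},\ M_{ii}\neq 0 \text{ for all } i,\
M_{ij}=0 \text{ whenever } i\neq j \text{ and } (i,j)\notin E(D) \,\bigr\}
```

For the ICSI connection, $\mathbb{F}$ is typically taken to be $\mathrm{GF}(2)$.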
High-quality video streaming, either in the form of Video-On-Demand (VOD) or live streaming, usually requires converting (i.e., transcoding) video streams to match the characteristics of viewers' devices (e.g., in terms of spatial resolution or supported formats).
Considering the computational cost of the transcoding operation and the surge in video streaming demands, Streaming Service Providers (SSPs) are becoming reliant on cloud services to guarantee Quality of Service (QoS) of streaming for their viewers.
Cloud providers offer heterogeneous computational services in form of different types of Virtual Machines (VMs) with diverse prices.
Effective utilization of cloud services for video transcoding requires detailed performance analysis of different video transcoding operations on the heterogeneous cloud VMs.
In this research, for the first time, we provide a thorough analysis of the performance of video stream transcoding on heterogeneous cloud VMs.
Providing such analysis is crucial for efficient prediction of transcoding time on heterogeneous VMs and for the functionality of any scheduling methods tailored for video transcoding.
Based upon the findings of this analysis and by considering the cost difference of heterogeneous cloud VMs, in this research, we also provide a model to quantify the degree of suitability of each cloud VM type for various transcoding tasks.
The provided model can supply resource (VM) provisioning methods with accurate performance and cost trade-offs to efficiently utilize cloud services for video streaming.
We propose an approach to indexing raster images of dictionary pages that requires very little manual effort and enables direct access to the appropriate pages of the dictionary for lookup.
Accessibility is further improved by feedback and crowdsourcing that enables highlighting of the specific location on the page where the lookup word is found, annotation, digitization, and fielded searching.
This approach is equally applicable to simple scripts as well as to complex writing systems.
Using our proposed approach, we have built a Web application called "Dictionary Explorer", which supports word indexes in various languages; each language can have multiple dictionaries associated with it.
Word lookup gives direct access to appropriate pages of all the dictionaries of that language simultaneously.
The application has exploration features like searching, pagination, and navigating the word index through a tree-like interface.
The application also supports feedback, annotation, and digitization features.
Apart from the scanned images, "Dictionary Explorer" aggregates results from various sources and user contributions in Unicode.
We have evaluated the time required for indexing dictionaries of different sizes and complexities in the Urdu language and examined various trade-offs in our implementation.
Using our approach, a single person can make a dictionary of 1,000 pages searchable in less than an hour.
Literature analysis is a key step in obtaining background information in biomedical research.
However, it is difficult for researchers to efficiently obtain knowledge of interest because of the massive amount of published biomedical literature.
Therefore, efficient and systematic search strategies are required, which allow ready access to the substantial amount of literature.
In this paper, we propose a novel search system, named Co-Occurrence based on Co-Operational Formation with Advanced Method (COCOFAM), which is suitable for large-scale literature analysis.
COCOFAM is based on integrating both Spark for local clusters and a global job scheduler to gather crowdsourced co-occurrence data on global clusters.
It allows users to obtain information of interest from the substantial amount of literature.
Robot-Assisted Therapy (RAT) has successfully been used in HRI research by including social robots in health-care interventions, by virtue of their ability to engage human users in both social and emotional dimensions.
Research projects on this topic exist all over the globe in the USA, Europe, and Asia.
All of these projects have the overall ambitious goal to increase the well-being of a vulnerable population.
Typical work in RAT is performed using remote-controlled robots, a technique called Wizard-of-Oz (WoZ).
The robot is usually controlled, unbeknownst to the patient, by a human operator.
However, WoZ has been demonstrated to not be a sustainable technique in the long-term.
Providing the robots with autonomy (while remaining under the supervision of the therapist) has the potential to lighten the therapist's burden, not only in the therapeutic session itself but also in longer-term diagnostic tasks.
Therefore, there is a need for exploring several degrees of autonomy in social robots used in therapy.
Increasing the autonomy of robots might also bring about a new set of challenges.
In particular, there will be a need to answer new ethical questions regarding the use of robots with a vulnerable population, as well as a need to ensure ethically-compliant robot behaviours.
Therefore, in this workshop we want to gather findings and explore which degree of autonomy might help to improve health-care interventions and how we can overcome the ethical challenges inherent to it.
Multi-threaded programs have traditionally fallen into one of two domains: cooperative and competitive.
These two domains have traditionally remained mostly disjoint, with cooperative threading used for increasing throughput in compute-intensive applications such as scientific workloads, and competitive threading used for increasing responsiveness in interactive applications such as GUIs and games.
As multicore hardware becomes increasingly mainstream, there is a need for bridging these two disjoint worlds, because many applications mix interaction and computation and would benefit from both cooperative and competitive threading.
In this paper, we present techniques for programming and reasoning about parallel interactive applications that can use both cooperative and competitive threading.
Our techniques enable the programmer to write rich parallel interactive programs by creating and synchronizing with threads as needed, and by assigning threads user-defined and partially ordered priorities.
To ensure important responsiveness properties, we present a modal type system analogous to S4 modal logic that precludes low-priority threads from delaying high-priority threads, thereby statically preventing a crucial set of priority-inversion bugs.
We then present a cost model that allows reasoning about responsiveness and completion time of well-typed programs.
The cost model extends the traditional work-span model for cooperative threading to account for competitive scheduling decisions needed to ensure responsiveness.
Finally, we show that our proposed techniques are realistic by implementing them as an extension to the Standard ML language.
Fractional Repetition (FR) codes are a well-known class of Distributed Replication-based Simple Storage (Dress) codes for Distributed Storage Systems (DSSs).
In such systems, replicas of data packets, encoded by a Maximum Distance Separable (MDS) code, are stored on distributed nodes.
Most of the available constructions for FR codes are based on combinatorial designs and graph theory.
In this work, we propose an elegant sequence-based approach for the construction of FR codes.
In particular, we propose a class of codes known as Flower codes and study their basic properties.
In manufacturing, steel and other metals are mainly cut and shaped during the fabrication process by computer numerical control (CNC) machines.
To keep the fabrication process highly productive and efficient, engineers need to monitor CNC machines in real time and manage the lifetime of machine tools.
In a real manufacturing process, breakage of machine tools usually happens without any indication; this problem has seriously affected the fabrication process for many years.
Previous studies suggested many different approaches for monitoring and detecting the breakage of machine tools.
However, a large gap still exists between academic experiments and complex real fabrication processes, owing to the high demands of real-time detection and the difficulty of data acquisition and transmission.
In this work, we use a spindle-current approach to detect the breakage of machine tools, which offers real-time monitoring, low cost, and easy installation.
We analyze features of the current of a milling machine spindle through tool wear processes, and then predict the status of tool breakage with a convolutional neural network (CNN).
In addition, we use a BP neural network to assess the reliability of the CNN.
The results show that our CNN approach can detect tool breakage with an accuracy of 93%, while the best performance of the BP network is 80%.
Some recent works revealed that deep neural networks (DNNs) are vulnerable to so-called adversarial attacks where input examples are intentionally perturbed to fool DNNs.
In this work, we revisit the DNN training process that includes adversarial examples into the training dataset so as to improve DNN's resilience to adversarial attacks, namely, adversarial training.
Our experiments show that different adversarial strengths, i.e., perturbation levels of adversarial examples, have different working zones to resist the attack.
Based on this observation, we propose a multi-strength adversarial training method (MAT) that combines adversarial training examples with different adversarial strengths to defend against adversarial attacks.
Two training structures - mixed MAT and parallel MAT - are developed to facilitate the tradeoffs between training time and memory occupation.
Our results show that MAT can substantially minimize the accuracy degradation of deep learning systems under adversarial attacks on MNIST, CIFAR-10, CIFAR-100, and SVHN.
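The idea behind mixed MAT can be sketched on a toy model: each training pass mixes fast-gradient-sign perturbations of several strengths into the updates. The model, data, and strength values below are illustrative, not the paper's DNN setup:

```python
# Toy sketch of mixed multi-strength adversarial training (MAT) on a
# 1-D logistic model; illustrates the idea only.
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, w, b, eps):
    """Fast-gradient-sign perturbation of a scalar input."""
    p = sigmoid(w * x + b)
    grad_x = (p - y) * w          # dLoss/dx for cross-entropy loss
    return x + eps * (1 if grad_x > 0 else -1 if grad_x < 0 else 0)

random.seed(0)
data = [(random.uniform(0.5, 2.0), 1) for _ in range(50)] + \
       [(random.uniform(-2.0, -0.5), 0) for _ in range(50)]

w, b, lr = 0.0, 0.0, 0.1
strengths = [0.0, 0.1, 0.3]      # 0.0 = the clean example itself
for epoch in range(200):
    for x, y in data:
        for eps in strengths:    # mixed MAT: all strengths in each pass
            xa = fgsm(x, y, w, b, eps)
            p = sigmoid(w * xa + b)
            w -= lr * (p - y) * xa
            b -= lr * (p - y)

correct = sum((sigmoid(w * x + b) > 0.5) == (y == 1) for x, y in data)
print(correct, "of", len(data), "clean examples classified correctly")
```

A parallel MAT variant would instead train one copy of the model per strength and combine them, trading memory for training time.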
This paper describes the design and implementation of a neural controller used in an automatic daylight control system.
The automatic lighting control system (ALCS) attempts to maintain the illuminance at the desired level on the working plane even when the daylight contribution varies.
Daylight therefore represents the perturbation signal for the ALCS.
The mathematical model of the process is unknown, and the applied control structure needs the inverse model of the process.
For this purpose, another artificial neural network (ANN) is used to identify the inverse model of the process online.
In fact, this ANN identifies the inverse model of the process plus the perturbation signal.
In this way, the learning signal for the neural controller has better accuracy for the present application.
A wide range of numerical methods exists for computing polynomial approximations of solutions of ordinary differential equations based on Chebyshev series expansions or Chebyshev interpolation polynomials.
We consider the application of such methods in the context of rigorous computing (where we need guarantees on the accuracy of the result), and from the complexity point of view.
It is well-known that the order-n truncation of the Chebyshev expansion of a function over a given interval is a near-best uniform polynomial approximation of the function on that interval.
In the case of solutions of linear differential equations with polynomial coefficients, the coefficients of the expansions obey linear recurrence relations with polynomial coefficients.
Unfortunately, these recurrences do not lend themselves to a direct recursive computation of the coefficients, owing among other things to a lack of initial conditions.
We show how they can nevertheless be used, as part of a validated process, to compute good uniform approximations of D-finite functions together with rigorous error bounds, and we study the complexity of the resulting algorithms.
Our approach is based on a new view of a classical numerical method going back to Clenshaw, combined with a functional enclosure method.
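For intuition, such coefficient recurrences stem from the standard Chebyshev-basis identities for multiplication by $x$ and for derivatives, for example:

```latex
x\,T_n(x) = \tfrac{1}{2}\bigl(T_{n+1}(x) + T_{|n-1|}(x)\bigr), \qquad
2\,T_n(x) = \frac{T'_{n+1}(x)}{n+1} - \frac{T'_{n-1}(x)}{n-1} \quad (n \ge 2).
```

Applying these identities to a linear ODE with polynomial coefficients links each Chebyshev coefficient of the solution to finitely many of its neighbors, yielding the linear recurrences mentioned above.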
Network traffic modeling is a critical problem for urban applications, mainly because of their diversity and node density.
As wireless sensor networks are central to the development of smart cities, careful consideration of the traffic model helps in choosing appropriate protocols and adapting network parameters to reach the best performance on energy-latency tradeoffs.
In this paper, we compare the performance of two off-the-shelf medium access control protocols on two different kinds of traffic models, and then evaluate their application-end information delay and energy consumption while varying traffic parameters and network density.
From the simulation results, we highlight some limits induced by network density and occurrence frequency of event-driven applications.
When it comes to real-time urban services, protocol selection should be taken into account, even dynamically, with special attention to the energy-delay tradeoff.
To this end, we provide several insights on parking sensor networks.
We revisit two NP-hard geometric partitioning problems - convex decomposition and surface approximation.
Building on recent developments in geometric separators, we present quasi-polynomial time algorithms for these problems with improved approximation guarantees.
Human motion prediction models have applications in various fields of computer vision.
Without taking into account the inherent stochasticity in the prediction of future pose dynamics, such methods often converge to a deterministic, undesired mean of multiple probable outcomes.
To address this, we propose a novel probabilistic generative approach called Bidirectional Human motion prediction GAN, or BiHMP-GAN.
To be able to generate multiple probable human-pose sequences, conditioned on a given starting sequence, we introduce a random extrinsic factor r, drawn from a predefined prior distribution.
Furthermore, to enforce a direct content loss on the predicted motion sequence and also to avoid mode-collapse, a novel bidirectional framework is incorporated by modifying the usual discriminator architecture.
The discriminator is also trained to regress this extrinsic factor r, which is used alongside the intrinsic factor (the encoded starting pose sequence) to generate a particular pose sequence.
To further regularize the training, we introduce a novel recursive prediction strategy.
In spite of being in a probabilistic framework, the enhanced discriminator architecture allows predictions of an intermediate part of pose sequence to be used as a conditioning for prediction of the latter part of the sequence.
The bidirectional setup also provides a new direction to evaluate the prediction quality against a given test sequence.
For a fair assessment of BiHMP-GAN, we report performance of the generated motion sequence using (i) a critic model trained to discriminate between real and fake motion sequence, and (ii) an action classifier trained on real human motion dynamics.
Outcomes of both qualitative and quantitative evaluations, on the probabilistic generations of the model, demonstrate the superiority of BiHMP-GAN over previously available methods.
We present a novel layerwise optimization algorithm for the learning objective of Piecewise-Linear Convolutional Neural Networks (PL-CNNs), a large class of convolutional neural networks.
Specifically, PL-CNNs employ piecewise linear non-linearities such as the commonly used ReLU and max-pool, and an SVM classifier as the final layer.
The key observation of our approach is that the problem corresponding to the parameter estimation of a layer can be formulated as a difference-of-convex (DC) program, which happens to be a latent structured SVM.
We optimize the DC program using the concave-convex procedure, which requires us to iteratively solve a structured SVM problem.
This allows us to design an optimization algorithm with an optimal learning rate that does not require any tuning.
Using the MNIST, CIFAR and ImageNet data sets, we show that our approach always improves over the state of the art variants of backpropagation and scales to large data and large network settings.
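The concave-convex procedure underlying this approach can be sketched on a toy 1-D difference-of-convex objective (illustrative only; the paper applies it to a latent structured SVM):

```python
# Minimal illustration of the concave-convex procedure (CCCP) on a toy
# 1-D difference-of-convex objective f(x) = g(x) - h(x).

def cccp(x0, grad_h, argmin_surrogate, iters=20):
    """Iteratively minimize g(x) - <grad_h(x_t), x>: linearizing the
    concave part -h gives a convex upper bound of g - h, tight at x_t."""
    x = x0
    for _ in range(iters):
        x = argmin_surrogate(grad_h(x))
    return x

# f(x) = x**2 - 2*abs(x): convex g(x) = x**2 minus convex h(x) = 2*abs(x).
grad_h = lambda x: 2.0 if x > 0 else -2.0        # a subgradient of h
argmin_surrogate = lambda s: s / 2.0             # argmin_x of x**2 - s*x
x_star = cccp(x0=0.3, grad_h=grad_h, argmin_surrogate=argmin_surrogate)
print(x_star)  # converges to one of the two global minima at x = +-1
```

Each iteration solves a convex surrogate, which in the paper's setting is exactly a structured SVM problem; the objective value is guaranteed not to increase.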
We study the lobby index (l-index for short) as a local node centrality measure for complex networks.
The l-index is compared with degree (a local measure) and with betweenness and eigenvector centralities (two global measures) in the case of a biological network (the yeast protein-protein interaction network) and a linguistic network (Moby Thesaurus II).
In both networks, the l-index has poor correlation with betweenness but correlates with degree and eigenvector centrality.
Being a local measure, the l-index is advantageous: it carries more information about a node's neighborhood than degree centrality, and it requires less time to compute than eigenvector centrality.
Results suggest that the l-index outperforms degree and eigenvector measures for ranking purposes, making it a suitable tool for this task.
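By its standard definition, the l-index of a node is the largest k such that the node has at least k neighbors of degree at least k. A minimal sketch on a toy adjacency-list graph (the graph is illustrative):

```python
# l-index (lobby index): largest k such that node x has at least k
# neighbors of degree >= k; computed locally from x's neighborhood.

def lobby_index(adj, x):
    degs = sorted((len(adj[v]) for v in adj[x]), reverse=True)
    k = 0
    for i, d in enumerate(degs, start=1):
        if d >= i:
            k = i
        else:
            break
    return k

# Small undirected example graph as an adjacency list.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
print(lobby_index(adj, "a"))  # -> 2: "a" has 2 neighbors of degree >= 2
```

Note that only the degrees of a node's direct neighbors are needed, which is why the measure stays cheap compared with eigenvector centrality.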
Supervised learning tends to produce more accurate classifiers than unsupervised learning in general.
This implies that training data is preferred with annotations.
When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck.
The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images.
A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required.
This annotation task is performed by human oracles, which introduces inaccuracies and errors into the annotations (i.e., the ground truth), since the task is inherently cumbersome and sometimes ambiguous.
As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision.
However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA).
In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual-to-real-world DA.
As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data.
While doing so, we investigate questions such as how the domain gap behaves for virtual vs. real data with respect to the dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Generative adversarial networks (GANs) are a recent approach to train generative models of data, which have been shown to work particularly well on image data.
In the current paper we introduce a new model for texture synthesis based on GAN learning.
By extending the input noise distribution space from a single vector to a whole spatial tensor, we create an architecture with properties well suited to the task of texture synthesis, which we call spatial GAN (SGAN).
To our knowledge, this is the first successful completely data-driven texture synthesis method based on GANs.
Our method has the following features, which make it a state-of-the-art algorithm for texture synthesis: high image quality of the generated textures, very high scalability w.r.t. the output texture size, fast real-time forward generation, and the ability to fuse multiple diverse source images into complex textures.
To illustrate these capabilities we present multiple experiments with different classes of texture images and use cases.
We also discuss some limitations of our method with respect to the types of texture images it can synthesize, and compare it to other neural techniques for texture generation.
Structured illumination microscopy (SIM) is a very important super-resolution microscopy technique, which provides high speed super-resolution with about two-fold spatial resolution enhancement.
Several attempts aimed at improving the performance of SIM reconstruction algorithm have been reported.
However, most of these highlight only one specific aspect of the SIM reconstruction -- such as the determination of the illumination pattern phase shift accurately -- whereas other key elements -- such as determination of modulation factor, estimation of object power spectrum, Wiener filtering frequency components with inclusion of object power spectrum information, translocating and the merging of the overlapping frequency components -- are usually glossed over superficially.
In addition, most of the reported work lies scattered throughout the literature, and a comprehensive review of the theoretical background is lacking.
The purpose of the present work is two-fold: 1) to collect the essential theoretical details of the SIM algorithm in one place, thereby making them readily accessible to readers for the first time; and 2) to provide an open-source SIM reconstruction code (named OpenSIM), which enables users to interactively vary the code parameters and study their effect on the reconstructed SIM image.
Over the past few years, many black-hat marketplaces have emerged that facilitate access to reputation manipulation services such as fake Facebook likes, fraudulent search engine optimization (SEO), or bogus Amazon reviews.
In order to deploy effective technical and legal countermeasures, it is important to understand how these black-hat marketplaces operate, shedding light on the services they offer, who is selling, who is buying, what are they buying, who is more successful, why are they successful, etc.
Toward this goal, in this paper, we present a detailed micro-economic analysis of a popular online black-hat marketplace, namely, SEOClerks.com.
As the site provides non-anonymized transaction information, we set out to analyze the selling and buying behavior of individual users, propose a strategy to identify key users, and study their tactics as compared to other (non-key) users.
We find that key users: (1) are mostly located in Asian countries, (2) are focused more on selling black-hat SEO services, (3) tend to list more lower priced services, and (4) sometimes buy services from other sellers and then sell at higher prices.
Finally, we discuss the implications of our analysis with respect to devising effective economic and legal intervention strategies against marketplace operators and key users.
A classical theorem of Erdos, Lovasz and Spencer asserts that the densities of connected subgraphs in large graphs are independent.
We prove an analogue of this theorem for permutations and we then apply the methods used in the proof to give an example of a finitely approximable permutation parameter that is not finitely forcible.
The latter answers a question posed by two of the authors and Moreira and Sampaio.
Products of Hidden Markov Models (PoHMMs) are an interesting class of generative models which have received little attention since their introduction.
This may be due in part to their more computationally expensive gradient-based learning algorithm, and to the intractability of computing the log-likelihood of sequences under the model.
In this paper, we demonstrate how the partition function can be estimated reliably via Annealed Importance Sampling.
We perform experiments using contrastive divergence learning on rainfall data and data captured from pairs of people dancing.
Our results suggest that advances in learning and evaluation for undirected graphical models and recent increases in available computing power make PoHMMs worth considering for complex time-series modeling tasks.
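The AIS idea can be sketched on a tiny discrete toy target where the exact partition function is computable by enumeration (illustrative only, not the PoHMM-specific estimator):

```python
# Sketch of Annealed Importance Sampling (AIS) for a partition function
# on a small discrete toy target; bridging from a uniform base.
import math, random

random.seed(1)
STATES = range(10)
f1 = lambda x: math.exp(-(x - 3) ** 2 / 2.0)   # unnormalized target
# Base: uniform over STATES (unnormalized f0 = 1, so Z0 = 10).

def ais_estimate(n_runs=2000, n_temps=50):
    betas = [j / n_temps for j in range(n_temps + 1)]
    weights = []
    for _ in range(n_runs):
        x = random.choice(STATES)
        log_w = 0.0
        for b_prev, b in zip(betas, betas[1:]):
            log_w += (b - b_prev) * math.log(f1(x))   # ratio f1**b / f1**b_prev
            # Metropolis step leaving f1**b invariant (uniform proposal).
            y = random.choice(STATES)
            if random.random() < min(1.0, (f1(y) / f1(x)) ** b):
                x = y
        weights.append(math.exp(log_w))
    z0 = float(len(STATES))
    return z0 * sum(weights) / len(weights)

exact = sum(f1(x) for x in STATES)
print(round(exact, 3), round(ais_estimate(), 3))
```

The average importance weight times the base partition function gives an unbiased estimate of the target partition function, which here can be checked against the exact sum.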
Due to the rapidly rising popularity of Massive Open Online Courses (MOOCs), there is a growing demand for scalable automated support technologies for student learning.
Transferring traditional educational resources to online contexts has become an increasingly relevant problem in recent years.
For learning science theories to be applicable, educators need a way to identify learning behaviors of students which contribute to learning outcomes, and use them to design and provide personalized intervention support to the students.
Click logs are an important source of information about students' learning behaviors; however, the current literature offers limited understanding of how these behaviors are represented within click logs.
In this project, we exploit the temporal dynamics of student behaviors, both to model behavior via graphical modeling approaches and to predict performance via recurrent neural network approaches, in order to first identify student behaviors and then use them to predict final course outcomes.
Our experiments showed that the long short-term memory (LSTM) model is capable of learning long-term dependencies in a sequence and outperforms other strong baselines in the prediction task.
Further, these sequential approaches to click log analysis can be successfully imported to other courses when used with results obtained from graphical model behavior modeling.
While Kolmogorov complexity is the accepted absolute measure of information content in an individual finite object, a similarly absolute notion is needed for the information distance between two individual objects, for example, two pictures.
We give several natural definitions of a universal information metric, based on length of shortest programs for either ordinary computations or reversible (dissipationless) computations.
It turns out that these definitions are equivalent up to an additive logarithmic term.
We show that the information distance is a universal cognitive similarity distance.
We investigate the maximal correlation of the shortest programs involved, the maximal uncorrelation of programs (a generalization of the Slepian-Wolf theorem of classical information theory), and the density properties of the discrete metric spaces induced by the information distances.
A related distance measures the amount of nonreversibility of a computation.
Using the physical theory of reversible computation, we give an appropriate (universal, anti-symmetric, and transitive) measure of the thermodynamic work required to transform one object into another by the most efficient process.
Information distance between individual objects is needed in pattern recognition where one wants to express effective notions of "pattern similarity" or "cognitive similarity" between individual objects and in thermodynamics of computation where one wants to analyse the energy dissipation of a computation from a particular input to a particular output.
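One of the equivalent definitions referred to above, as given in the literature, is the max distance based on conditional Kolmogorov complexity:

```latex
E(x, y) \;=\; \max\{\, K(x \mid y),\; K(y \mid x) \,\}
```

Intuitively, this is the length of the shortest program that translates between the two objects in whichever direction is harder.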
Coreference resolution is an intermediate step for text understanding.
It is used in tasks and domains for which we do not necessarily have coreference annotated corpora.
Therefore, generalization is of special importance for coreference resolution.
However, while recent coreference resolvers have notable improvements on the CoNLL dataset, they struggle to generalize properly to new domains or datasets.
In this paper, we investigate the role of linguistic features in building more generalizable coreference resolvers.
We show that generalization improves only slightly by merely using a set of additional linguistic features.
However, employing features and subsets of their values that are informative for coreference resolution, considerably improves generalization.
Thanks to better generalization, our system achieves state-of-the-art results in out-of-domain evaluations, e.g., on WikiCoref, our system, which is trained on CoNLL, achieves on-par performance with a system designed for this dataset.
The vast majority of today's mobile malware targets Android devices.
This has pushed the research effort in Android malware analysis in the last years.
An important task of malware analysis is the classification of malware samples into known families.
Static malware analysis is known to fall short against techniques that change static characteristics of the malware (e.g. code obfuscation), while dynamic analysis has proven effective against such techniques.
To the best of our knowledge, the most notable work on Android malware family classification purely based on dynamic analysis is DroidScribe.
With respect to DroidScribe, our approach is easier to reproduce.
Our methodology only employs publicly available tools, does not require any modification to the emulated environment or Android OS, and can collect data from physical devices.
The latter is a key factor, since modern mobile malware can detect the emulated environment and hide their malicious behavior.
Our approach relies on resource consumption metrics available from the proc file system.
Features are extracted through detrended fluctuation analysis and correlation.
Finally, an SVM is employed to classify malware into families.
We provide an experimental evaluation on malware samples from the Drebin dataset, where we obtain a classification accuracy of 82%, showing that our methodology achieves an accuracy comparable to that of DroidScribe.
Furthermore, we make the software we developed publicly available, to ease the reproducibility of our results.
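As an illustrative sketch only (not the authors' released software), detrended fluctuation analysis of a resource-consumption trace can be implemented in a few lines of NumPy; the window sizes and the white-noise test signal below are hypothetical choices:

```python
import numpy as np

def dfa_exponent(signal, window_sizes=(4, 8, 16, 32, 64)):
    """Detrended fluctuation analysis: returns the scaling exponent alpha."""
    x = np.asarray(signal, dtype=float)
    y = np.cumsum(x - x.mean())                   # integrated (profile) series
    fluctuations = []
    for n in window_sizes:
        n_windows = len(y) // n
        t = np.arange(n)
        f2 = []
        for i in range(n_windows):
            seg = y[i * n:(i + 1) * n]
            coef = np.polyfit(t, seg, 1)          # linear detrend per window
            f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        fluctuations.append(np.sqrt(np.mean(f2)))
    # slope of the log-log fit is the DFA scaling exponent
    alpha, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return alpha

# Uncorrelated white noise is expected to give alpha close to 0.5.
rng = np.random.default_rng(0)
print(round(dfa_exponent(rng.standard_normal(4096)), 2))
```

In the paper's pipeline, exponents like this (together with correlation features) would feed the SVM as per-metric features; the exact feature construction is the authors' own.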
It has been suggested that changes in physiological arousal precede potentially dangerous aggressive behavior in youth with autism spectrum disorder (ASD) who are minimally verbal (MV-ASD).
The current work tests this hypothesis through time-series analyses on biosignals acquired prior to proximal aggression onset.
We implement ridge-regularized logistic regression models on physiological biosensor data wirelessly recorded from 15 MV-ASD youth over 64 independent naturalistic observations in a hospital inpatient unit.
Our results demonstrate proof-of-concept, feasibility, and incipient validity predicting aggression onset 1 minute before it occurs using global, person-dependent, and hybrid classifier models.
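A minimal sketch of ridge-regularized logistic regression implemented with plain NumPy gradient descent; the one-feature toy "biosignal" data and all hyperparameters here are illustrative assumptions, not the study's actual features or models:

```python
import numpy as np

def fit_ridge_logistic(X, y, lam=0.01, lr=0.1, n_iter=2000):
    """Ridge-regularized logistic regression via batch gradient descent.
    X: (n_samples, n_features) window-level features, y in {0, 1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))       # predicted probabilities
        grad_w = X.T @ (p - y) / len(y) + lam * w    # L2 penalty on weights only
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data: one feature (e.g. a windowed arousal statistic) that
# rises before the positive "onset" labels, plus noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 1))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)
w, b = fit_ridge_logistic(X, y)
acc = np.mean(((X @ w + b) > 0) == (y > 0.5))
print(acc)
```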
We introduce a novel multimodal machine translation model that utilizes parallel visual and textual information.
Our model jointly optimizes the learning of a shared visual-language embedding and a translator.
The model leverages a visual attention grounding mechanism that links the visual semantics with the corresponding textual semantics.
Our approach achieves competitive state-of-the-art results on the Multi30K and the Ambiguous COCO datasets.
We also collected a new multilingual multimodal product description dataset to simulate a real-world international online shopping scenario.
On this dataset, our visual attention grounding model outperforms other methods by a large margin.
Having more followers has become a norm in recent social media and micro-blogging communities.
This battle has been taking shape since the early days of Twitter.
Despite this strong competition for followers, many Twitter users are continuously losing their followers.
This work addresses the problem of identifying the reasons behind the drop in followers of Twitter users.
As a first step, we extract various features by analyzing the content of the posts made by the Twitter users who lose followers consistently.
We then leverage these features to early detect follower loss.
We propose various models and achieve an overall accuracy of 73% with high precision and recall.
Our model outperforms the baseline model by 19.67% (w.r.t. accuracy), 33.8% (w.r.t. precision), and 14.3% (w.r.t. recall).
Continuous integration (CI) tools integrate code changes by automatically compiling, building, and executing test cases upon submission of code changes.
Use of CI tools is getting increasingly popular, yet how proprietary projects reap the benefits of CI remains unknown.
To investigate the influence of CI on software development, we analyze 150 open source software (OSS) projects, and 123 proprietary projects.
For OSS projects, we observe the expected benefits after CI adoption, e.g., improvements in bug and issue resolution.
However, for the proprietary projects, we cannot make similar observations.
Our findings indicate that adoption of CI alone might not be enough to improve the software development process.
CI can be effective for software development if practitioners use CI's feedback mechanism efficiently, by applying the practice of making frequent commits.
For our set of proprietary projects we observe that practitioners commit less frequently, and hence do not use CI effectively for obtaining feedback on the submitted code changes.
Based on our findings, we recommend that industry practitioners adopt the best practices of CI, for example making frequent commits, to reap the benefits of CI tools.
DynamicGEM is an open-source Python library for learning node representations of dynamic graphs.
It consists of state-of-the-art algorithms for defining embeddings of nodes whose connections evolve over time.
The library also contains the evaluation framework for four downstream tasks on the network: graph reconstruction, static and temporal link prediction, node classification, and temporal visualization.
We have implemented various metrics to evaluate the state-of-the-art methods, and examples of evolving networks from various domains.
The library provides easy-to-use functions to call and evaluate the methods, along with extensive usage documentation.
Furthermore, DynamicGEM provides a template to add new algorithms with ease to facilitate further research on the topic.
Videos are one of the best documentation options for rich and effective communication.
They allow experiencing the overall context of a situation by representing concrete realizations of certain requirements.
Despite 35 years of research on integrating videos in requirements engineering (RE), videos are not an established documentation option in terms of RE best practices.
Several approaches use videos but omit the details about how to produce them.
Software professionals lack knowledge on how to communicate visually with videos since they are not directors.
Therefore, they do not necessarily have the skills required either to produce good videos in general or to deduce what constitutes a good video for an existing approach.
The discipline of video production provides numerous generic guidelines that represent best practices on how to produce a good video with specific characteristics.
We propose to analyze this existing know-how to learn what constitutes a good video for visual communication.
As a plan of action, we suggest a literature study of video production guidelines.
We expect to identify quality characteristics of good videos in order to derive a quality model.
Software professionals may use such a quality model for videos as an orientation for planning, shooting, post-processing, and viewing a video.
Thus, we want to encourage and enable software professionals to produce good videos at moderate costs, yet sufficient quality.
The "Smart City" (SC) concept revolves around the idea of embodying cutting-edge ICT solutions in the very fabric of future cities, in order to offer new and better services to citizens while lowering the city management costs in monetary, social, and environmental terms.
In this framework, communication technologies are perceived as subservient to the SC services, providing the means to collect and process the data needed to make the services function.
In this paper, we propose a new vision in which technology and SC services are designed to take advantage of each other in a symbiotic manner.
According to this new paradigm, which we call "SymbioCity", SC services can indeed be exploited to improve the performance of the same communication systems that provide them with data.
Suggestive examples of this symbiotic ecosystem are discussed in the paper.
This vision is then substantiated in a proof-of-concept case study, where we show how the traffic monitoring service provided by the London Smart City initiative can be used to predict the density of users in a certain zone and optimize the cellular service in that area.
Seeking a general framework for reasoning about and comparing programming languages, we derive a new view of Milner's CCS.
We construct a category E of 'plays', and a subcategory V of 'views'.
We argue that presheaves on V adequately represent 'innocent' strategies, in the sense of game semantics.
We equip innocent strategies with a simple notion of interaction.
We then prove decomposition results for innocent strategies, and, restricting to presheaves of finite ordinals, prove that innocent strategies are a final coalgebra for a polynomial functor derived from the game.
This leads to a translation of CCS with recursive equations.
Finally, we propose a notion of 'interactive equivalence' for innocent strategies, which is close in spirit to Beffara's interpretation of testing equivalences in concurrency theory.
In this framework, we consider analogues of fair testing and must testing.
We show that must testing is strictly finer in our model than in CCS, since it avoids what we call 'spatial unfairness'.
Still, it differs from fair testing, and we show that it coincides with a relaxed form of fair testing.
In this paper, we present an unsupervised learning framework for analyzing activities and interactions in surveillance videos.
In our framework, three levels of video events are connected by Hierarchical Dirichlet Process (HDP) model: low-level visual features, simple atomic activities, and multi-agent interactions.
Atomic activities are represented as distribution of low-level features, while complicated interactions are represented as distribution of atomic activities.
This learning process is unsupervised.
Given a training video sequence, low-level visual features are extracted based on optic flow and then clustered into different atomic activities and video clips are clustered into different interactions.
The HDP model automatically decides the number of clusters, i.e., the categories of atomic activities and interactions.
Based on the learned atomic activities and interactions, a training dataset is generated to train the Gaussian Process (GP) classifier.
The trained GP models are then applied to newly captured videos to classify interactions and detect abnormal events in real time.
Furthermore, the temporal dependencies between video events learned by HDP-Hidden Markov Models (HMM) are effectively integrated into GP classifier to enhance the accuracy of the classification in newly captured videos.
Our framework couples the benefits of the generative model (HDP) with the discriminant model (GP).
We provide detailed experiments showing that our framework achieves favorable real-time performance on video event classification in a crowded traffic scene.
We propose an automatic algorithm, named SDI, for the segmentation of skin lesions in dermoscopic images, articulated into three main steps: selection of the image ROI, selection of the segmentation band, and segmentation.
We present extensive experimental results achieved by the SDI algorithm on the lesion segmentation dataset made available for the ISIC 2017 challenge on Skin Lesion Analysis Towards Melanoma Detection, highlighting its advantages and disadvantages.
Evolutionary algorithms based on modeling the statistical dependencies (interactions) between the variables have been proposed to solve a wide range of complex problems.
These algorithms learn and sample probabilistic graphical models able to encode and exploit the regularities of the problem.
This paper investigates the effect of using probabilistic modeling techniques as a way to enhance the behavior of MOEA/D framework.
MOEA/D is a decomposition based evolutionary algorithm that decomposes a multi-objective optimization problem (MOP) in a number of scalar single-objective subproblems and optimizes them in a collaborative manner.
MOEA/D framework has been widely used to solve several MOPs.
The proposed algorithm, MOEA/D using probabilistic Graphical Models (MOEA/D-GM) is able to instantiate both univariate and multi-variate probabilistic models for each subproblem.
To validate the introduced framework algorithm, an experimental study is conducted on a multi-objective version of the deceptive function Trap5.
The results show that the variant of the framework (MOEA/D-Tree), where tree models are learned from the matrices of the mutual information between the variables, is able to capture the structure of the problem.
MOEA/D-Tree is able to achieve significantly better results than both MOEA/D using genetic operators and MOEA/D using univariate probability models, in terms of the approximation to the true Pareto front.
The variance component tests used in genome-wide association studies of thousands of individuals become computationally prohibitive when multiple traits are analysed in the context of omics studies.
We introduce two high-throughput algorithms -- CLAK-CHOL and CLAK-EIG -- for single and multiple phenotype genome-wide association studies (GWAS).
The algorithms, generated with the help of an expert system, reduce the computational complexity to the point that thousands of traits can be analyzed for association with millions of polymorphisms in a matter of days on a standard workstation.
By taking advantage of problem specific knowledge, CLAK-CHOL and CLAK-EIG significantly outperform the current state-of-the-art tools in both single and multiple trait analysis.
This paper addresses the sensitivity of neural image caption generators to their visual input.
A sensitivity analysis and omission analysis based on image foils is reported, showing that the extent to which image captioning architectures retain and are sensitive to visual information varies depending on the type of word being generated and the position in the caption as a whole.
We motivate this work in the context of broader goals in the field to achieve more explainability in AI.
This paper proposes to perform authorship analysis using the Fast Compression Distance (FCD), a similarity measure based on compression with dictionaries directly extracted from the written texts.
The FCD computes a similarity between two documents through an effective binary search on the intersection set between the two related dictionaries.
In the reported experiments the proposed method is applied to documents which are heterogeneous in style, written in five different languages and coming from different historical periods.
Results are comparable to the state of the art and outperform traditional compression-based methods.
Robust estimators, like the median of a point set, are important for data analysis in the presence of outliers.
We study robust estimators for locationally uncertain points with discrete distributions.
That is, each point in a data set has a discrete probability distribution describing its location.
The probabilistic nature of uncertain data makes it challenging to compute such estimators, since the true value of the estimator is now described by a distribution rather than a single point.
We show how to construct and estimate the distribution of the median of a point set.
Building the approximate support of the distribution takes near-linear time, and assigning probability to that support takes quadratic time.
We also develop a general approximation technique for distributions of robust estimators with respect to ranges with bounded VC dimension.
This includes the geometric median for high dimensions and the Siegel estimator for linear regression.
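The object being estimated, the distribution of the median of locationally uncertain points, can be sketched by Monte Carlo sampling in the 1-D case; this only illustrates the target distribution, not the paper's near-linear-time construction:

```python
import numpy as np
from collections import Counter

def median_distribution(points, n_samples=20000, rng=None):
    """Monte Carlo estimate of the distribution of the median of a set of
    locationally uncertain 1-D points.  Each point is a discrete distribution
    given as (locations, probabilities)."""
    rng = np.random.default_rng(0) if rng is None else rng
    draws = np.column_stack([
        rng.choice(locs, size=n_samples, p=probs) for locs, probs in points
    ])
    medians = np.median(draws, axis=1)
    counts = Counter(medians)
    return {m: c / n_samples for m, c in sorted(counts.items())}

# Three uncertain points; the middle one is equally likely at 1 or 3, so the
# median itself is a random variable supported on {1, 3}.
pts = [([0.0], [1.0]), ([1.0, 3.0], [0.5, 0.5]), ([4.0], [1.0])]
dist = median_distribution(pts)
print(dist)
```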
We show that the mechanisms used in the name data networking (NDN) and the original content centric networking (CCN) architectures may not detect Interest loops, even if the network in which they operate is static and no faults occur.
Furthermore, we show that no correct Interest forwarding strategy can be defined that allows Interest aggregation and attempts to detect Interest looping by identifying Interests uniquely.
We introduce SIFAH (Strategy for Interest Forwarding and Aggregation with Hop-Counts), the first Interest forwarding strategy shown to be correct under any operational conditions of a content centric network.
SIFAH operates by having forwarding information bases (FIBs) store the next hops and number of hops to named content, and by having each Interest state the name of the requested content and the hop count from the router forwarding an Interest to the content.
We present the results of simulation experiments using the ndnSIM simulator comparing CCN and NDN with SIFAH.
The results of these experiments illustrate the negative impact of undetected Interest looping when Interests are aggregated in CCN and NDN, and the performance advantages of using SIFAH.
A novel framework is presented for the analysis of multi-level coding that takes into account degrees of freedom attended and ignored by the different levels of analysis.
It can be shown that for a multi-level coding system, skipped or incomplete error correction at many levels can save energy while providing results as good as those of perfect correction.
This holds in both the discrete and continuous cases.
This has relevance to approximate computing, and also to deep learning networks, which can readily be construed as multiple levels of inadequate error correction reacting to some input signal, but which are typically considered beyond analysis by traditional information theoretical methods.
The finding also has significance in natural systems, e.g. neuronal signaling, vision, and molecular genetics, which can be characterized as relying on multiple layers of inadequate error correction.
In the context of Noisy Multi-Objective Optimization, dealing with uncertainties requires the decision maker to define some preferences about how to handle them, through some statistics (e.g., mean, median) to be used to evaluate the qualities of the solutions, and define the corresponding Pareto set.
Approximating these statistics requires repeated samplings of the population, drastically increasing the overall computational cost.
To tackle this issue, this paper proposes to directly estimate the probability of each individual to be selected, using some Hoeffding races to dynamically assign the estimation budget during the selection step.
The proposed racing approach is validated against static budget approaches with NSGA-II on noisy versions of the ZDT benchmark functions.
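A minimal sketch of a Hoeffding race, assuming bounded noisy evaluations: all surviving candidates are sampled in rounds, and a candidate is dropped once its confidence upper bound falls below the best lower bound. The three synthetic candidates and the bound constants below are illustrative, not the paper's exact racing scheme:

```python
import math
import numpy as np

def hoeffding_race(sample_fns, value_range=1.0, delta=0.05, max_pulls=3000):
    """Race k noisy candidates; eliminate with Hoeffding confidence bounds."""
    k = len(sample_fns)
    sums = np.zeros(k)
    counts = np.zeros(k)
    alive = set(range(k))
    for t in range(1, max_pulls + 1):
        for i in list(alive):
            sums[i] += sample_fns[i]()
            counts[i] += 1
        means = sums / np.maximum(counts, 1)
        # Hoeffding radius with a crude union bound over arms and rounds
        radius = value_range * math.sqrt(math.log(2 * k * max_pulls / delta) / (2 * t))
        best_lower = max(means[i] - radius for i in alive)
        alive = {i for i in alive if means[i] + radius >= best_lower}
        if len(alive) == 1:
            break
    return alive, counts

# Three synthetic candidates with noisy values of different means.
rng = np.random.default_rng(2)
arms = [lambda m=m: m + 0.5 * rng.uniform() for m in (0.1, 0.2, 0.6)]
alive, counts = hoeffding_race(arms)
print(alive)
```

The budget-saving effect is visible in `counts`: clearly dominated candidates stop being sampled early, which is the mechanism the paper uses to allocate the estimation budget during selection.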
Egocentric spatial memory (ESM) defines a memory system with encoding, storing, recognizing and recalling the spatial information about the environment from an egocentric perspective.
We introduce an integrated deep neural network architecture for modeling ESM.
It learns to estimate the occupancy state of the world and progressively construct top-down 2D global maps from egocentric views in a spatially extended environment.
During the exploration, our proposed ESM model updates belief of the global map based on local observations using a recurrent neural network.
It also augments the local mapping with a novel external memory to encode and store latent representations of the visited places over long-term exploration in large environments, which enables agents to perform place recognition and hence loop closure.
Our proposed ESM network contributes in the following aspects: (1) without feature engineering, our model predicts free space based on egocentric views efficiently in an end-to-end manner; (2) different from other deep learning-based mapping systems, ESMN deals with continuous actions and states, which is vitally important for robotic control in real applications.
In the experiments, we demonstrate its accurate and robust global mapping capacities in 3D virtual mazes and realistic indoor environments by comparing with several competitive baselines.
Mathematical theorems are human knowledge able to be accumulated in the form of symbolic representation, and proving theorems has been considered intelligent behavior.
Based on the BHK interpretation and the Curry-Howard isomorphism, proof assistants, software capable of interacting with humans for constructing formal proofs, have been developed in the past several decades.
Since proofs can be considered and expressed as programs, proof assistants simplify and verify a proof by computationally evaluating the program corresponding to the proof.
Thanks to the transformation from logic to computation, it is now possible to generate or search for formal proofs directly in the realm of computation.
Evolutionary algorithms, known to be flexible and versatile, have been successfully applied to handle a variety of scientific and engineering problems in numerous disciplines for also several decades.
The primary goal of this study is to examine the feasibility of establishing a link between evolutionary algorithms, as the program generator, and proof assistants, as the proof verifier, in order to automatically find formal proofs of a given logic sentence.
In the article, we describe in detail our first, ad-hoc attempt to fully automatically prove theorems as well as the preliminary results.
Ten simple theorems from various branches of mathematics were proven, and most of these theorems cannot be proven by using the tactic auto alone in Coq, the adopted proof assistant.
The implication and potential influence of this study are discussed, and the developed source code, together with the obtained experimental results, is released as open source.
This paper introduces a network for volumetric segmentation that learns from sparsely annotated volumetric images.
We outline two attractive use cases of this method: (1) In a semi-automated setup, the user annotates some slices in the volume to be segmented.
The network learns from these sparse annotations and provides a dense 3D segmentation.
(2) In a fully-automated setup, we assume that a representative, sparsely annotated training set exists.
Trained on this data set, the network densely segments new volumetric images.
The proposed network extends the previous u-net architecture from Ronneberger et al. by replacing all 2D operations with their 3D counterparts.
The implementation performs on-the-fly elastic deformations for efficient data augmentation during training.
It is trained end-to-end from scratch, i.e., no pre-trained network is required.
We test the performance of the proposed method on a complex, highly variable 3D structure, the Xenopus kidney, and achieve good results for both use cases.
Energy storage systems (ESS) are expected to be an indispensable resource for mitigating the effects of high penetrations of distributed generation on networks in the near future.
This paper analyzes the benefits of ESS in unbalanced low voltage (LV) networks regarding three aspects, namely power losses, hosting capacity, and network unbalance.
To this end, a mixed integer quadratic programming (MIQP) model is developed to minimize annual energy losses and determine the sizing and placement of ESS, while satisfying voltage constraints.
A real unbalanced LV UK grid is adopted to examine the effects of ESS under two scenarios: the installation of one community ESS (CESS) and multiple distributed ESSs (DESSs).
The results illustrate that both scenarios present high performance in accomplishing the above tasks, while DESSs, with the same aggregated size, are slightly better.
This margin is expected to be amplified as the aggregated size of DESSs increases.
Separate selling of two independent goods is shown to yield at least 62% of the optimal revenue, and at least 73% when the goods satisfy the Myerson regularity condition.
This improves the 50% result of Hart and Nisan (2017, originally circulated in 2012).
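A quick numerical illustration, under the hypothetical assumption of two i.i.d. uniform values on [0, 1] (a distribution satisfying Myerson regularity): grid-searching posted prices shows separate selling recovering well above the 73% bound relative to pure bundling, which is itself only a lower bound on the optimal revenue:

```python
import numpy as np

rng = np.random.default_rng(3)
v = rng.uniform(0, 1, size=(200000, 2))   # i.i.d. U[0,1] values for two goods

def revenue_separate(prices, v):
    """Posted-price separate selling: buyer takes each good iff value >= price."""
    return sum(p * np.mean(v[:, i] >= p) for i, p in enumerate(prices))

def revenue_bundle(price, v):
    """Pure bundling: buyer takes the bundle iff total value >= price."""
    return price * np.mean(v.sum(axis=1) >= price)

sep_prices = np.linspace(0.01, 0.99, 200)
bun_prices = np.linspace(0.01, 1.99, 400)
best_sep = max(revenue_separate((p, p), v) for p in sep_prices)
best_bun = max(revenue_bundle(p, v) for p in bun_prices)
print(best_sep, best_bun, best_sep / best_bun)
```

For U[0,1] the optimal separate price is 1/2 per good (revenue 1/4 each), so `best_sep` is close to 0.5, and the ratio to the best bundle revenue stays far above the theoretical guarantee.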
The implementation of smart building technology in the form of smart infrastructure applications has great potential to improve sustainability and energy efficiency by leveraging a humans-in-the-loop strategy.
However, human preference in regard to living conditions is usually unknown and heterogeneous in its manifestation as control inputs to a building.
Furthermore, the occupants of a building typically lack the independent motivation necessary to contribute to and play a key role in the control of smart building infrastructure.
Moreover, true human actions and their integration with sensing/actuation platforms remains unknown to the decision maker tasked with improving operational efficiency.
By modeling user interaction as a sequential discrete game between non-cooperative players, we introduce a gamification approach for supporting user engagement and integration in a human-centric cyber-physical system.
We propose the design and implementation of a large-scale network game with the goal of improving the energy efficiency of a building through the utilization of cutting-edge Internet of Things (IoT) sensors and cyber-physical systems sensing/actuation platforms.
A benchmark utility learning framework that employs robust estimation for classical discrete choice models is provided for the derived high-dimensional imbalanced data.
To improve forecasting performance, we extend the benchmark utility learning scheme by leveraging Deep Learning end-to-end training with Deep bi-directional Recurrent Neural Networks.
We apply the proposed methods to high dimensional data from a social game experiment designed to encourage energy efficient behavior among smart building occupants in Nanyang Technological University (NTU) residential housing.
Using occupant-retrieved actions for resources such as lighting and A/C, we simulate the game defined by the estimated utility functions.
Earthquake signal detection is at the core of observational seismology.
A good detection algorithm should be sensitive to small and weak events with a variety of waveform shapes, robust to background noise and non-earthquake signals, and efficient for processing large data volumes.
Here, we introduce the Cnn-Rnn Earthquake Detector (CRED), a detector based on deep neural networks.
The network uses a combination of convolutional layers and bi-directional long short-term memory units in a residual structure.
It learns the time-frequency characteristics of the dominant phases in an earthquake signal from three component data recorded on a single station.
We train the network using 500,000 seismograms (250k associated with tectonic earthquakes and 250k identified as noise) recorded in Northern California, and test it, achieving an F-score of 99.95.
The robustness of the trained model with respect to the noise level and non-earthquake signals is shown by applying it to a set of semi-synthetic signals.
The model is applied to one month of continuous data recorded at Central Arkansas to demonstrate its efficiency, generalization, and sensitivity.
Our model is able to detect more than 700 microearthquakes as small as -1.3 ML induced during hydraulic fracturing far from the training region.
The performance of the model is compared with STA/LTA, template matching, and FAST algorithms.
Our results indicate an efficient and reliable performance of CRED.
This framework holds great promise in lowering the detection threshold while minimizing false positive detection rates.
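For reference, the classic STA/LTA baseline mentioned above is simple to sketch: the detector triggers when the short-term average of signal energy greatly exceeds the long-term average. The window lengths and the synthetic trace with an injected high-amplitude burst are illustrative assumptions:

```python
import numpy as np

def sta_lta(trace, sta_len=20, lta_len=200):
    """STA/LTA characteristic function: ratio of short-term to long-term
    average of the squared amplitude, aligned at the common window end."""
    energy = np.asarray(trace, dtype=float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    sta = (csum[sta_len:] - csum[:-sta_len]) / sta_len
    lta = (csum[lta_len:] - csum[:-lta_len]) / lta_len
    n = min(len(sta), len(lta))
    return sta[-n:] / np.maximum(lta[-n:], 1e-12)

# Synthetic trace: Gaussian noise with a short high-amplitude "event".
rng = np.random.default_rng(4)
trace = rng.normal(0, 1, 2000)
trace[1200:1260] += 8 * rng.normal(0, 1, 60)
ratio = sta_lta(trace)
print(int(np.argmax(ratio)))
```

A simple trigger would declare a detection wherever `ratio` exceeds a fixed threshold; sensitivity to weak events and false triggers on noise bursts are exactly the weaknesses the learned detector targets.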
Multi-view image-based rendering consists in generating a novel view of a scene from a set of source views.
In general, this works by first doing a coarse 3D reconstruction of the scene, and then using this reconstruction to establish correspondences between source and target views, followed by blending the warped views to get the final image.
Unfortunately, discontinuities in the blending weights, due to scene geometry or camera placement, result in artifacts in the target view.
In this paper, we show how to avoid these artifacts by imposing additional constraints on the image gradients of the novel view.
We propose a variational framework in which an energy functional is derived and optimized by iteratively solving a linear system.
We demonstrate this method on several structured and unstructured multi-view datasets, and show that it numerically outperforms state-of-the-art methods, and eliminates artifacts that result from visibility discontinuities.
The U.S. Government has been the target of cyber-attacks from all over the world.
Just recently, former President Obama accused the Russian government of leaking emails to WikiLeaks and declared that the U.S. might be forced to respond.
While Russia denied involvement, it is clear that the U.S. has to take some defensive measures to protect its data infrastructure.
Insider threats have been the cause of other sensitive information leaks too, including the infamous Edward Snowden incident.
Most of the recent leaks were in the form of text.
Due to the nature of text data, security classifications are assigned manually.
In an adversarial environment, insiders can leak texts through E-mail, printers, or any untrusted channels.
The optimal defense is to automatically detect the unstructured text security class and enforce the appropriate protection mechanism without degrading services or daily tasks.
Unfortunately, existing Data Leak Prevention (DLP) systems are not well suited for detecting unstructured texts.
In this paper, we compare two recent approaches in the literature for text security classification, evaluating them on actual sensitive text data from the WikiLeaks dataset.
In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses.
For automatically parsing spoken utterances, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and prosodic features.
We find that different types of acoustic-prosodic features are individually helpful, and together give statistically significant improvements in parse and disfluency detection F1 scores over a strong text-only baseline.
For this study with known sentence boundaries, error analyses show that the main benefit of acoustic-prosodic features is in sentences with disfluencies, attachment decisions are most improved, and transcription errors obscure gains from prosody.
In this paper, we evaluate the spectral radius ratio for node degree as a basis to analyze the variation in node degrees during the evolution of scale-free networks and small-world networks.
Spectral radius is the principal eigenvalue of the adjacency matrix of a network graph and spectral radius ratio for node degree is the ratio of the spectral radius and the average node degree.
We observe a very high positive correlation between the spectral radius ratio for node degree and the coefficient of variation of node degree (ratio of the standard deviation of node degree and average node degree).
We show how the spectral radius ratio for node degree can be used to tune the operating parameters of the evolution models for scale-free and small-world networks, to evaluate the impact of the number of links added per node during the evolution of a scale-free network, and to evaluate the impact of the rewiring probability during the evolution of a small-world network from a regular network.
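The metric itself is straightforward to compute with plain NumPy; the two extreme topologies below (a ring lattice and a star) are illustrative examples, not the evolution models studied in the paper:

```python
import numpy as np

def spectral_radius_ratio(adj):
    """Spectral radius ratio for node degree: principal eigenvalue of the
    adjacency matrix divided by the average node degree."""
    adj = np.asarray(adj, dtype=float)
    spectral_radius = max(abs(np.linalg.eigvals(adj)))
    return float(spectral_radius) / adj.sum(axis=1).mean()

def cv_degree(adj):
    """Coefficient of variation of the node degrees (std / mean)."""
    deg = np.asarray(adj).sum(axis=1)
    return deg.std() / deg.mean()

n = 10
# Regular ring lattice: every node has degree 2, so the ratio is exactly 1
# and the coefficient of variation is 0.
ring = np.zeros((n, n))
for i in range(n):
    ring[i, (i + 1) % n] = ring[(i + 1) % n, i] = 1
print(spectral_radius_ratio(ring), cv_degree(ring))

# Star graph: one hub of degree n-1, heavily skewed degrees, ratio above 1.
star = np.zeros((n, n))
star[0, 1:] = star[1:, 0] = 1
print(spectral_radius_ratio(star), cv_degree(star))
```

The two outputs track each other: the skewed star has both a large ratio and a large coefficient of variation, which is the positive correlation the paper reports.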
End-to-end networks trained for task-oriented dialog, such as for recommending restaurants to a user, suffer from out-of-vocabulary (OOV) problem -- the entities in the Knowledge Base (KB) may not be seen by the network at training time, making it hard to use them in dialog.
We propose a novel Hierarchical Pointer Generator Memory Network (HyP-MN), in which the next word may be generated from the decoder vocabulary or copied from a hierarchical memory maintaining KB results and previous utterances.
This hierarchical memory layout along with a novel KB dropout helps to alleviate the OOV problem.
Evaluating over the dialog bAbI tasks, we find that HyP-MN outperforms state-of-the-art results, with considerable improvements (10% on OOV test set).
HyP-MN also achieves competitive performance on various real-world datasets such as CamRest676 and the In-car assistant dataset.
Currently, various hardware and software companies are developing augmented reality devices, most prominently Microsoft with its Hololens.
Besides gaming, such devices can be used for serious pervasive applications, like interactive mobile simulations to support engineers in the field.
Interactive simulations have high demands on resources, which the mobile device alone is unable to satisfy.
Therefore, we propose a framework to support mobile simulations by distributing the computation between the mobile device and a remote server based on the reduced basis method.
Evaluations show that we can speed up the numerical computation by over 131 times while using 73 times less energy.
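A minimal sketch of the reduced basis idea behind such a split, assuming a linear system depending affinely on a parameter, with a snapshot-based offline stage (server) and a tiny projected solve online (mobile device); the operator, parameter values, and basis size are hypothetical:

```python
import numpy as np

# Full-order problem: A(mu) x = b, with A depending affinely on parameter mu.
rng = np.random.default_rng(5)
n = 400
A0 = np.diag(np.arange(1.0, n + 1))        # base operator (SPD)
G = rng.normal(0, 0.01, (n, n))
A1 = G @ G.T                               # small symmetric PSD perturbation
b = rng.normal(size=n)

def solve_full(mu):
    return np.linalg.solve(A0 + mu * A1, b)

# Offline stage: snapshots at a few parameter values span the reduced basis V.
snapshots = np.column_stack([solve_full(mu) for mu in (0.0, 0.5, 1.0)])
V, _ = np.linalg.qr(snapshots)             # orthonormal n x 3 basis

# Online stage: Galerkin projection turns each solve into a tiny 3x3 system.
def solve_reduced(mu):
    Ar = V.T @ (A0 + mu * A1) @ V
    br = V.T @ b
    return V @ np.linalg.solve(Ar, br)

mu = 0.7
x_full = solve_full(mu)
err = np.linalg.norm(solve_reduced(mu) - x_full) / np.linalg.norm(x_full)
print(err)
```

The online solve involves only 3x3 algebra (plus basis assembly), which is the kind of work a mobile device can afford while the expensive offline stage stays on the server.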
We present a system for identifying conceptual shifts between visual categories, which will form the basis for a co-creative drawing system to help users draw more creative sketches.
The system recognizes human sketches and matches them to structurally similar sketches from categories to which they do not belong.
This would allow a co-creative drawing system to produce an ambiguous sketch that blends features from both categories.
Data mining practitioners are facing challenges from data with network structure.
In this paper, we address a specific class of global-state networks, which comprises a set of network instances sharing a similar structure yet having different values at local nodes.
Each instance is associated with a global state which indicates the occurrence of an event.
The objective is to uncover a small set of discriminative subnetworks that can optimally classify global network values.
Unlike most existing studies which explore an exponential subnetwork space, we address this difficult problem by adopting a space transformation approach.
Specifically, we present an algorithm that optimizes a constrained dual-objective function to learn a low-dimensional subspace that is capable of discriminating networks labelled by different global states, while reconciling with the common network topology shared across instances.
Our algorithm takes an appealing approach from spectral graph learning and we show that the globally optimum solution can be achieved via matrix eigen-decomposition.
This paper presents an angle-based approach for distributed formation shape stabilization of multi-agent systems in the plane.
We develop an angle rigidity theory to study whether a planar framework can be determined by angles between segments uniquely up to translations, rotations, scalings and reflections.
The proposed angle rigidity theory is applied to the formation stabilization problem, where multiple single-integrator modeled agents cooperatively achieve an angle-constrained formation.
During the formation process, the global coordinate system is unknown for each agent and wireless communications between agents are not required.
Moreover, by utilizing the advantage of high degrees of freedom, we propose a distributed control law for agents to stabilize a desired formation shape with desired orientation and scale.
Two simulation examples illustrate the effectiveness of the proposed control strategies.
In this paper we propose a theory for solving the multi-target control problem in automatic driving by introducing it into a machine learning framework that captures the knowledge of skilled drivers.
Several core problems in automatic driving have not yet been fully addressed by researchers, such as how to optimally control the multiple objective functions of energy saving, safe driving, headway distance control, and comfortable driving, as well as whether the networks that automatic driving relies on and high-performance chips such as GPUs can cope with complex driving environments.
To address these problems, we develop a new theory that maps multi-target objective functions defined in different spaces into a common one, and on this basis introduce a machine learning framework, Super Deep Learning (SDL), for optimal multi-target control based on knowledge acquisition.
We present optimal multi-target control that combines the fuzzy relationships among the objective functions with the knowledge of skilled drivers acquired by machine learning.
Theoretically, the impact of this method will exceed that of the fuzzy control methods used in automatic train operation.
Previous approaches to facial expression analysis use three different techniques: facial action units, geometric features, and graph-based modelling.
However, these techniques have been treated separately, even though they are interrelated.
Facial expression analysis is significantly improved by exploiting the mappings between the major geometric features involved in facial expressions and the subset of facial action units whose presence or absence is unique to each expression.
This paper combines dimension reduction techniques and image classification with search-space pruning achieved by this unique subset of facial action units.
Results on a publicly available facial expression database show a 70% improvement in runtime while maintaining emotion recognition accuracy.
We present a survey on maritime object detection and tracking approaches, which are essential for the development of a navigational system for autonomous ships.
The electro-optical (EO) sensor considered here is a video camera that operates in the visible or infrared spectrum; such sensors conventionally complement radar and sonar and have demonstrated their effectiveness for situational awareness at sea over the last few years.
This paper provides a comprehensive overview of various approaches of video processing for object detection and tracking in the maritime environment.
We follow an approach-based taxonomy wherein the advantages and limitations of each approach are compared.
The object detection system consists of the following modules: horizon detection, static background subtraction and foreground segmentation.
Each of these modules has been studied extensively in maritime situations and has been shown to be challenging due to background motion, especially from waves and wakes.
The main processes involved in object tracking include video frame registration, dynamic background subtraction, and the object tracking algorithm itself.
The challenges for robust tracking arise due to camera motion, dynamic background and low contrast of tracked object, possibly due to environmental degradation.
The survey also discusses multisensor approaches and commercial maritime systems that use EO sensors.
The survey also highlights methods from computer vision research which hold promise to perform well in maritime EO data processing.
The performance of several maritime and computer vision techniques is evaluated on the newly proposed Singapore Maritime Dataset.
In this paper we extend the classical notion of strong and weak backdoor sets for SAT and CSP by allowing that different instantiations of the backdoor variables result in instances that belong to different base classes; the union of the base classes forms a heterogeneous base class.
Backdoor sets to heterogeneous base classes can be much smaller than backdoor sets to homogeneous ones, hence they are much more desirable but possibly harder to find.
We draw a detailed complexity landscape for the problem of detecting strong and weak backdoor sets into heterogeneous base classes for SAT and CSP.
We present a novel cross-view classification algorithm where the gallery and probe data come from different views.
A popular approach to tackle this problem is the multi-view subspace learning (MvSL) that aims to learn a latent subspace shared by multi-view data.
Despite promising results obtained on some applications, the performance of existing methods deteriorates dramatically when the multi-view data is sampled from nonlinear manifolds or suffers from heavy outliers.
To circumvent this drawback, motivated by the Divide-and-Conquer strategy, we propose Multi-view Hybrid Embedding (MvHE), a unique method of dividing the problem of cross-view classification into three subproblems and building one model for each subproblem.
Specifically, the first model is designed to remove view discrepancy, whereas the second and third models attempt to discover the intrinsic nonlinear structure and to increase discriminability in intra-view and inter-view samples respectively.
The kernel extension is conducted to further boost the representation power of MvHE.
Extensive experiments are conducted on four benchmark datasets.
Our method demonstrates clear advantages over state-of-the-art MvSL-based cross-view classification approaches in terms of classification accuracy and robustness.
Depth from focus (DFF) is one of the classical ill-posed inverse problems in computer vision.
Most approaches recover the depth at each pixel based on the focal setting which exhibits maximal sharpness.
Yet, it is not obvious how to reliably estimate the sharpness level, particularly in low-textured areas.
In this paper, we propose `Deep Depth From Focus (DDFF)' as the first end-to-end learning approach to this problem.
One of the main challenges we face is the data hunger of deep neural networks.
In order to obtain a significant amount of focal stacks with corresponding groundtruth depth, we propose to leverage a light-field camera with a co-calibrated RGB-D sensor.
This allows us to digitally create focal stacks of varying sizes.
Compared to existing benchmarks our dataset is 25 times larger, enabling the use of machine learning for this inverse problem.
We compare our results with state-of-the-art DFF methods and we also analyze the effect of several key deep architectural components.
These experiments show that our proposed method `DDFFNet' achieves state-of-the-art performance in all scenes, reducing depth error by more than 75% compared to the classical DFF methods.
In light of the tremendous amount of data produced by social media, a large body of research has revisited the relevance estimation of user-generated content.
Most of the studies have stressed the multidimensional nature of relevance and proved the effectiveness of combining the different criteria that it embodies.
Traditional relevance estimates combination methods are often based on linear combination schemes.
However, despite their simplicity, those aggregation mechanisms fall short in real-life applications, since they rely on the unrealistic assumption that the relevance dimensions are independent.
In this paper, we propose to tackle this issue through the design of a novel fuzzy-based document ranking model.
We also propose an automated methodology to capture the importance of relevance dimensions, as well as information about their interaction.
This model, based on the Choquet integral, makes it possible to optimize the aggregated document relevance scores using any target information retrieval relevance metric.
Experiments on the TREC Microblog task and a social personalized information retrieval task highlight that our model significantly outperforms a wide range of state-of-the-art aggregation operators, as well as representative learning-to-rank methods.
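To make the aggregation concrete: the discrete Choquet integral sorts the per-criterion scores in ascending order and weights each increment by the capacity (fuzzy measure) of the criteria that remain at that level. The criteria names and capacity values below are illustrative assumptions, not the ones learned in the experiments.

```python
def choquet_integral(scores, capacity):
    """Discrete Choquet integral of per-criterion relevance scores.

    scores:   dict criterion -> score in [0, 1]
    capacity: dict frozenset-of-criteria -> weight in [0, 1],
              monotone, with capacity of the full set == 1.
    """
    items = sorted(scores.items(), key=lambda kv: kv[1])  # ascending by score
    total, prev = 0.0, 0.0
    remaining = set(scores)
    for crit, x in items:
        total += (x - prev) * capacity[frozenset(remaining)]
        prev = x
        remaining.discard(crit)
    return total

# Two interacting relevance dimensions (illustrative capacity, not the paper's):
cap = {
    frozenset(): 0.0,
    frozenset({"topicality"}): 0.4,
    frozenset({"recency"}): 0.3,
    frozenset({"topicality", "recency"}): 1.0,  # super-additive: positive interaction
}
print(choquet_integral({"topicality": 0.9, "recency": 0.5}, cap))
```

With an additive capacity the integral collapses to an ordinary weighted sum; super- or sub-additive capacities are precisely what let the model encode interaction between relevance dimensions.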
According to E. T. Jaynes and E. P. Wigner, entropy is an anthropomorphic concept, in the sense that many thermodynamic systems correspond to a single physical system.
The physical system can be examined from many points of view each time examining different variables and calculating entropy differently.
In this paper we discuss how this concept may be applied to information entropy, and how Shannon's definition of entropy can fit Jaynes's and Wigner's statement.
This is achieved by generalizing Shannon's notion of information entropy and this is the main contribution of the paper.
Then we discuss how entropy under these considerations may be used for the comparison of password complexity and as a measure of diversity useful in the analysis of the behavior of genetic algorithms.
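As a minimal example of the password-complexity use case, here is Shannon's classical per-symbol entropy of a password's empirical character distribution; the generalized definition proposed in the paper would refine this, so this is only the baseline notion.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Shannon entropy (bits per symbol) of the empirical character
    distribution of s -- a crude proxy for password complexity."""
    counts = Counter(s)
    n = len(s)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(shannon_entropy("aaaaaaaa"))   # 0.0: a single repeated symbol carries no diversity
print(shannon_entropy("abcdabcd"))   # 2.0: four equiprobable symbols
```

The same quantity, applied to the distribution of genomes in a population, is the diversity measure alluded to for analyzing genetic algorithms.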
The paper focuses on the problem of vision-based obstacle detection and tracking for unmanned aerial vehicle navigation.
A real-time object localization and tracking strategy from monocular image sequences is developed by effectively integrating the object detection and tracking into a dynamic Kalman model.
At the detection stage, the object of interest is automatically detected and localized from a saliency map computed via the image background connectivity cue at each frame; at the tracking stage, a Kalman filter is employed to provide a coarse prediction of the object state, which is further refined via a local detector incorporating the saliency map and the temporal information between two consecutive frames.
Compared to existing methods, the proposed approach does not require any manual initialization for tracking, runs much faster than the state-of-the-art trackers of its kind, and achieves competitive tracking performance on a large number of image sequences.
Extensive experiments demonstrate the effectiveness and superior performance of the proposed approach.
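The coarse predict-then-refine loop described above can be sketched with a one-dimensional constant-velocity Kalman filter; the matrices, noise levels, and unit time step below are illustrative choices, not those of the paper.

```python
def kalman_step(x, P, z, q=1e-2, r=1.0):
    """One predict/update cycle of a 1-D constant-velocity Kalman filter.

    x: state [position, velocity], P: 2x2 covariance (list of lists),
    z: measured position, q/r: process/measurement noise (illustrative).
    """
    # Predict with transition F = [[1, 1], [0, 1]] (unit time step).
    x = [x[0] + x[1], x[1]]
    P = [[P[0][0] + P[1][0] + P[0][1] + P[1][1] + q, P[0][1] + P[1][1]],
         [P[1][0] + P[1][1],                         P[1][1] + q]]
    # Update with a measurement of position only (H = [1, 0]).
    S = P[0][0] + r
    K = [P[0][0] / S, P[1][0] / S]          # Kalman gain
    y = z - x[0]                            # innovation
    x = [x[0] + K[0] * y, x[1] + K[1] * y]
    P = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
         [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
    return x, P

# Track a target moving one unit per frame.
x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for t in range(1, 40):
    x, P = kalman_step(x, P, float(t))
```

In the paper's setting, the measurement z would come from the saliency-based local detector rather than from ground truth, and the state would cover both image coordinates.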
Recent captioning models are limited in their ability to scale and describe concepts unseen in paired image-text corpora.
We propose the Novel Object Captioner (NOC), a deep visual semantic captioning model that can describe a large number of object categories not present in existing image-caption datasets.
Our model takes advantage of external sources -- labeled images from object recognition datasets, and semantic knowledge extracted from unannotated text.
We propose minimizing a joint objective which can learn from these diverse data sources and leverage distributional semantic embeddings, enabling the model to generalize and describe novel objects outside of image-caption datasets.
We demonstrate that our model exploits semantic information to generate captions for hundreds of object categories in the ImageNet object recognition dataset that are not observed in MSCOCO image-caption training data, as well as many categories that are observed very rarely.
Both automatic evaluations and human judgements show that our model considerably outperforms prior work in being able to describe many more categories of objects.
In this demo, we present PackageBuilder, a system that extends database systems to support package queries.
A package is a collection of tuples that individually satisfy base constraints and collectively satisfy global constraints.
The need for package support arises in a variety of scenarios: For example, in the creation of meal plans, users are not only interested in the nutritional content of individual meals (base constraints), but also care to specify daily consumption limits and control the balance of the entire plan (global constraints).
We introduce PaQL, a declarative SQL-based package query language, and the interface abstractions which allow users to interactively specify package queries and easily navigate through their results.
To efficiently evaluate queries, the system employs pruning and heuristics, as well as state-of-the-art constraint optimization solvers.
We demonstrate PackageBuilder by allowing attendees to interact with the system's interface, to define PaQL queries and to observe how query evaluation is performed.
Environment perception is an important task with great practical value and bird view is an essential part for creating panoramas of surrounding environment.
Due to the large gap and severe deformation between the frontal view and bird view, generating a bird view image from a single frontal view is challenging.
To tackle this problem, we propose the BridgeGAN, i.e., a novel generative model for bird view synthesis.
First, an intermediate view, i.e., homography view, is introduced to bridge the large gap.
Next, conditioned on the three views (frontal view, homography view and bird view) in our task, a multi-GAN based model is proposed to learn the challenging cross-view translation.
Extensive experiments conducted on a synthetic dataset have demonstrated that the images generated by our model are much better than those generated by existing methods, with more consistent global appearance and sharper details.
Ablation studies and discussions show its reliability and robustness in some challenging cases.
Using different methods for laying out a graph can lead to very different visual appearances, with which the viewer perceives different information.
Selecting a "good" layout method is thus important for visualizing a graph.
The selection can be highly subjective and dependent on the given task.
A common approach to selecting a good layout is to use aesthetic criteria and visual inspection.
However, fully calculating various layouts and their associated aesthetic metrics is computationally expensive.
In this paper, we present a machine learning approach to large graph visualization based on computing the topological similarity of graphs using graph kernels.
For a given graph, our approach can show what the graph would look like in different layouts and estimate their corresponding aesthetic metrics.
An important contribution of our work is the development of a new framework to design graph kernels.
Our experimental study shows that our estimation calculation is considerably faster than computing the actual layouts and their aesthetic metrics.
Also, our graph kernels outperform the state-of-the-art ones in both time and accuracy.
In addition, we conducted a user study to demonstrate that the topological similarity computed with our graph kernel matches perceptual similarity assessed by human users.
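For readers unfamiliar with graph kernels, here is one standard construction for measuring topological similarity, the Weisfeiler-Lehman subtree kernel. The paper designs its own kernels, so this is only a representative sketch of the general idea.

```python
from collections import Counter

def wl_features(adj, rounds=3):
    """Histogram of Weisfeiler-Lehman node labels accumulated over a few
    refinement rounds; adj maps each node to its neighbor list."""
    labels = {v: str(len(adj[v])) for v in adj}   # initial label: degree
    feats = Counter(labels.values())
    for _ in range(rounds):
        # Refine: each node's new label combines its own label with the
        # sorted multiset of its neighbors' labels.
        labels = {v: labels[v] + "|" + ".".join(sorted(labels[u] for u in adj[v]))
                  for v in adj}
        feats.update(labels.values())
    return feats

def wl_kernel(adj1, adj2, rounds=3):
    """Unnormalized WL subtree kernel: dot product of label histograms."""
    f1, f2 = wl_features(adj1, rounds), wl_features(adj2, rounds)
    return sum(f1[lbl] * f2[lbl] for lbl in f1)

path3 = {0: [1], 1: [0, 2], 2: [1]}          # a path on three nodes
tri   = {0: [1, 2], 1: [0, 2], 2: [0, 1]}    # a triangle
print(wl_kernel(path3, path3), wl_kernel(path3, tri))   # prints: 20 3
```

A graph is always more similar to itself than to a structurally different graph under this kernel, which is the property a layout-estimation pipeline built on kernel similarity relies on.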
Estimating the Worst-Case Execution Time (WCET) of an application is an essential task in the context of developing real-time or safety-critical software, but it is also a complex and error-prone process.
Conventional approaches require at least some manual inputs from the user, such as loop bounds and infeasible path information, which are hard to obtain and can lead to unsafe results if they are incorrect.
This is aggravated by the lack of a comprehensive explanation of the WCET estimate, i.e., a specific trace showing how WCET was reached.
It is therefore hard to spot incorrect inputs and hard to improve the worst-case timing of the application.
Meanwhile, modern processors have reached a complexity that defies analysis and puts more and more burden on the practitioner.
In this article we show how all of these issues can be significantly mitigated or even solved, if we use processors that are amenable to WCET analysis.
We define and identify such processors, and then we propose an automated tool set which estimates a precise WCET without unsafe manual inputs, and also reconstructs a maximum-detail view of the WCET path that can be examined in a debugger environment.
Our approach is based on Model Checking, which however is known to scale badly with growing application size.
We address this issue by shifting the analysis to source code level, where source code transformations can be applied that retain the timing behavior, but reduce the complexity.
Our experiments show that fast and precise estimates can be achieved with Model Checking, that its scalability can even exceed current approaches, and that new opportunities arise in the context of "timing debugging".
In visual recognition, the key to the performance improvement of ResNet is the stack of deep sequential convolutional layers with identity mappings established by shortcut connections.
This results in multiple paths of data flow within the network, and the paths are merged with equal weights.
However, it is questionable whether it is correct to use the fixed and predefined weights at the mapping units of all paths.
In this paper, we introduce the active weighted mapping method which infers proper weight values based on the characteristic of input data on the fly.
The weight values of each mapping unit are not fixed but changed as the input image is changed, and the most proper weight values for each mapping unit are derived according to the input image.
For this purpose, channel-wise information is embedded from both the shortcut connection and convolutional block, and then the fully connected layers are used to estimate the weight values for the mapping units.
We train the backbone network and the proposed module alternately for a more stable learning of the proposed method.
Results of the extensive experiments show that the proposed method works successfully on various backbone architectures, from ResNet to DenseNet.
We also verify the superiority and generality of the proposed method on various datasets in comparison with the baseline.
In this paper, we propose a robust visual tracking method which exploits the relationships of targets in adjacent frames using patchwise joint sparse representation.
Two sets of overlapping patches with different sizes are extracted from target candidates to construct two dictionaries with consideration of joint sparse representation.
By applying this representation into structural sparse appearance model, we can take two-fold advantages.
First, the correlation of target patches over time is considered.
Second, using this local appearance model with different patch sizes takes the local features of the target into account thoroughly.
Furthermore, the position of candidate patches and their occlusion levels are utilized simultaneously to obtain the final likelihood of target candidates.
Evaluations on a recent challenging benchmark show that our tracking method outperforms the state-of-the-art trackers.
Re-speaking is a mechanism for obtaining high quality subtitles for use in live broadcast and other public events.
Because it relies on humans performing the actual re-speaking, the task of estimating the quality of the results is non-trivial.
Most organisations rely on humans to perform the actual quality assessment, but purely automatic methods have been developed for other similar problems, like Machine Translation.
This paper compares several of these methods: BLEU, EBLEU, NIST, METEOR, METEOR-PL, TER, and RIBES.
These will then be matched to the human-derived NER metric, commonly used in re-speaking.
While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot of training data.
One way to increase the speed at which agents are able to learn to perform tasks is by leveraging the input of human trainers.
Although such input can take many forms, real-time, scalar-valued feedback is especially useful in situations where it proves difficult or impossible for humans to provide expert demonstrations.
Previous approaches have shown the usefulness of human input provided in this fashion (e.g., the TAMER framework), but they have thus far not considered high-dimensional state spaces or employed the use of deep learning.
In this paper, we do both: we propose Deep TAMER, an extension of the TAMER framework that leverages the representational power of deep neural networks in order to learn complex tasks in just a short amount of time with a human trainer.
We demonstrate Deep TAMER's success by using it and just 15 minutes of human-provided feedback to train an agent that performs better than humans on the Atari game of Bowling - a task that has proven difficult for even state-of-the-art reinforcement learning methods.
Abstraction tasks are challenging for multi-modal sequences, as they require a deeper semantic understanding and novel text generation from the data.
Although recurrent neural networks (RNNs) can be used to model the context of time sequences, in most cases the long-term dependencies of multi-modal data cause the gradients in back-propagation-through-time training of RNNs to vanish.
Recently, inspired by the Multiple Time-scale Recurrent Neural Network (MTRNN), an extension of the Gated Recurrent Unit (GRU) called the Multiple Time-scale Gated Recurrent Unit (MTGRU) has been proposed to learn long-term dependencies in natural language processing.
Particularly it is also able to accomplish the abstraction task for paragraphs given that the time constants are well defined.
In this paper, we compare the MTRNN and the MTGRU in terms of their learning performance as well as their abstraction representation at the higher level (with slower neural activation).
This was done in two studies based on a smaller dataset (two-dimensional time sequences from non-linear functions) and a relatively large dataset (43-dimensional time sequences from iCub manipulation tasks with multi-modal data).
We conclude that gated recurrent mechanisms may be necessary for learning long-term dependencies in large-dimension multi-modal datasets (e.g. learning of robot manipulation), even when natural language commands are not involved.
For smaller learning tasks with simple time sequences, however, a generic recurrent model such as the MTRNN is sufficient to accomplish the abstraction task.
How many copies of a parallelepiped are needed to ensure that for every point in the parallelepiped a copy of each other point exists, such that the distance between them equals the distance of the pair of points when the opposite sides of the parallelepiped are identified?
This question is answered in Euclidean space by constructing the smallest domain that fulfills the above condition.
We also describe how to obtain all primitive cells of a lattice (i.e., closures of fundamental domains) that realise the smallest number of copies needed and give them explicitly in 2D and 3D.
Logic programs with aggregates (LPA) are one of the major linguistic extensions to Logic Programming (LP).
In this work, we propose a generalization of the notions of unfounded set and well-founded semantics for programs with monotone and antimonotone aggregates (LPAma programs).
In particular, we present a new notion of unfounded set for LPAma programs, which is a sound generalization of the original definition for standard (aggregate-free) LP.
On this basis, we define a well-founded operator for LPAma programs, the fixpoint of which is called well-founded model (or well-founded semantics) for LPAma programs.
The most important properties of unfounded sets and the well-founded semantics for standard LP are retained by this generalization, notably existence and uniqueness of the well-founded model, together with a strong relationship to the answer set semantics for LPAma programs.
We show that one of the D-well-founded semantics, defined by Pelov, Denecker, and Bruynooghe for a broader class of aggregates using approximating operators, coincides with the well-founded model as defined in this work on LPAma programs.
We also discuss some complexity issues, most importantly we give a formal proof of tractable computation of the well-founded model for LPA programs.
Moreover, we prove that for general LPA programs, which may contain aggregates that are neither monotone nor antimonotone, deciding satisfaction of aggregate expressions with respect to partial interpretations is coNP-complete.
As a consequence, a well-founded semantics for general LPA programs that allows for tractable computation is unlikely to exist, which justifies the restriction on LPAma programs.
Finally, we present a prototype system extending DLV, which supports the well-founded semantics for LPAma programs, at the time of writing the only implemented system that does so.
Experiments with this prototype show significant computational advantages of aggregate constructs over equivalent aggregate-free encodings.
Increasing data traffic demands over wireless spectrum have necessitated spectrum sharing and coexistence between heterogeneous systems such as radar and cellular communications systems.
In this context, we specifically investigate the co-channel coexistence between an air traffic control (ATC) radar and a wide area cellular communication (comms) system.
We present a comprehensive characterization and analysis of interference caused by the comms system on the ATC radar with respect to multiple parameters such as radar range, protection radius around the radar, and radar antenna elevation angle.
The analysis suggests that maintaining a protection radius of 50 km around the radar will ensure the required INR protection criterion of -10 dB at the radar receiver with ~0.9 probability, even when the radar beam is in the same horizon as the comms BS.
Detailed evaluations of the radar target detection performance provide a framework to choose appropriate protection radii around the radar to meet specific performance requirements.
This paper explores the use of Pyramid Vector Quantization (PVQ) to reduce the computational cost for a variety of neural networks (NNs) while, at the same time, compressing the weights that describe them.
This is based on the fact that the dot product between an N dimensional vector of real numbers and an N dimensional PVQ vector can be calculated with only additions and subtractions and one multiplication.
This is advantageous since tensor products, commonly used in NNs, can be reduced to a dot product or a set of dot products.
Finally, we stress that any NN architecture based on an operation that can be reduced to a dot product can benefit from the techniques described here.
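The cost claim can be sketched directly: a PVQ codeword is an integer vector k (with sum of |k_i| equal to a pyramid parameter K) scaled by a gain, so accumulating x·k needs only additions and subtractions of x's entries, while multiplying by the gain at the end is the single multiplication. The function name below is ours.

```python
def pvq_dot(x, k, gain):
    """Dot product of real vector x with the PVQ-coded vector gain * k.
    k has integer entries with sum(|k_i|) == K; the accumulation uses only
    additions/subtractions of x's entries, and the gain contributes the
    single multiplication mentioned in the text."""
    acc = 0.0
    for xi, ki in zip(x, k):
        for _ in range(abs(ki)):
            acc = acc + xi if ki > 0 else acc - xi
    return gain * acc  # the one multiplication

x = [0.5, -1.0, 2.0]
k = [2, 0, -1]                  # K = sum of |k_i| = 3
print(pvq_dot(x, k, 0.25))      # 0.25 * (0.5 + 0.5 - 2.0) = -0.25
```

The result agrees with the conventional dot product of x and gain*k, but a hardware or fixed-point implementation never forms the real-valued weight vector.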
We apply cross-lingual Latent Semantic Indexing to the Bilingual Document Alignment Task at WMT16.
Reduced-rank singular value decomposition of a bilingual term-document matrix derived from known English/French page pairs in the training data allows us to map monolingual documents into a joint semantic space.
Two variants of cosine similarity between the vectors that place each document into the joint semantic space are combined with a measure of string similarity between corresponding URLs to produce 1:1 alignments of English/French web pages in a variety of domains.
The system achieves a recall of ca. 88% if no in-domain data is used for building the latent semantic model, and 93% if such data is included.
Analysing the system's errors on the training data, we argue that evaluating aligner performance based on exact URL matches underestimates their true performance, and we propose an alternative that is able to account for duplicates and near-duplicates in the underlying data.
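The scoring idea, combining semantic-space similarity with URL string similarity, can be sketched as follows. The blending weight, the use of `difflib.SequenceMatcher`, and the toy vectors are our illustrative assumptions; the paper combines two cosine variants with its own string-similarity measure.

```python
import math
from difflib import SequenceMatcher

def cosine(u, v):
    """Cosine similarity of two vectors in the joint semantic space."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def alignment_score(vec_en, vec_fr, url_en, url_fr, w=0.8):
    """Blend semantic similarity with URL string similarity.
    The weight w and SequenceMatcher are illustrative choices."""
    url_sim = SequenceMatcher(None, url_en, url_fr).ratio()
    return w * cosine(vec_en, vec_fr) + (1 - w) * url_sim

print(alignment_score([0.9, 0.1], [0.8, 0.2],
                      "example.com/en/news/42", "example.com/fr/news/42"))
```

Candidate 1:1 alignments would then be chosen by maximizing this score over English/French page pairs.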
Modern computer threats are far more complicated than those seen in the past.
They are constantly evolving, altering their appearance and perpetually changing their disguise.
Under such circumstances, detecting known threats, a fortiori zero-day attacks, requires new tools, which are able to capture the essence of their behavior, rather than some fixed signatures.
In this work, we propose novel universal anomaly detection algorithms, which are able to learn the normal behavior of systems and alert for abnormalities, without any prior knowledge on the system model, nor any knowledge on the characteristics of the attack.
The suggested method utilizes the Lempel-Ziv universal compression algorithm to optimally assign probabilities to normal behavior (during learning), then estimates the likelihood of new data (during operation) and classifies it accordingly.
The suggested technique is generic, and can be applied to different scenarios.
Indeed, we apply it to key problems in computer security.
The first is detecting Botnets Command and Control (C&C) channels.
A Botnet is a logical network of compromised machines which are remotely controlled by an attacker using a C&C infrastructure, in order to perform malicious activities.
We derive a detection algorithm based on timing data, which can be collected without deep inspection, from open as well as encrypted flows.
We evaluate the algorithm on real-world network traces, showing how a universal, low complexity C&C identification system can be built, with high detection rates and low false-alarm probabilities.
Further applications include malicious tools detection via system calls monitoring and data leakage identification.
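The core idea, a shorter universal codelength for behavior resembling what was learned, can be illustrated with LZ78 incremental parsing; this is a toy proxy for the probability assignment, not the authors' estimator, and the traffic strings are fabricated for illustration.

```python
def lz78_phrases(seq):
    """Number of phrases in the LZ78 incremental parsing of seq -- a proxy
    for the universal codelength underlying the probability assignment."""
    seen, phrase, count = set(), "", 0
    for sym in seq:
        phrase += sym
        if phrase not in seen:     # extend until the phrase is new
            seen.add(phrase)
            count += 1
            phrase = ""
    return count + (1 if phrase else 0)

def anomaly_score(seq):
    """Phrases per symbol: regular, repetitive behavior compresses well
    (low score); erratic behavior does not (high score)."""
    return lz78_phrases(seq) / len(seq)

normal    = "req ack req ack " * 8    # repetitive 'normal' traffic pattern
anomalous = "q7f!x0m#relw92vz" * 2    # little exploitable structure
print(anomaly_score(normal), anomaly_score(anomalous))
```

Thresholding such a score, with the dictionary trained on known-normal traces, gives a generic detector that needs no model of the attack, which is the property exploited for C&C detection from timing data.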
Multi-label learning is concerned with the classification of data with multiple class labels.
This is in contrast to the traditional classification problem where every data instance has a single label.
Due to the exponential size of the output space, exploiting intrinsic information in the feature and label spaces has been the major thrust of research in recent years, with parametrization and embedding as the prime focus.
Researchers have studied several aspects of embedding which include label embedding, input embedding, dimensionality reduction and feature selection.
These approaches differ from one another in their capability to capture other intrinsic properties, such as label correlation and local invariance.
We assume here that the input data form groups; as a result, the label matrix exhibits a sparsity pattern, and the labels corresponding to objects in the same group have similar sparsity.
In this paper, we study the embedding of labels together with the group information with an objective to build an efficient multi-label classification.
We assume the existence of a low-dimensional space onto which the feature vectors and label vectors can be embedded.
In order to achieve this, we address three sub-problems: (1) identification of groups of labels; (2) embedding of label vectors into a low-rank space so that the sparsity characteristic of individual groups remains invariant; and (3) determining a linear mapping that embeds the feature vectors onto the same set of points as in stage 2 in the low-dimensional space.
We compare our method with seven well-known algorithms on twelve benchmark data sets.
Our experimental analysis manifests the superiority of our proposed method over state-of-the-art algorithms for multi-label learning.
This paper proposes a model of information cascades as directed spanning trees (DSTs) over observed documents.
In addition, we propose a contrastive training procedure that exploits partial temporal ordering of node infections in lieu of labeled training links.
This combination of model and unsupervised training makes it possible to improve on models that use infection times alone and to exploit arbitrary features of the nodes and of the text content of messages in information cascades.
With only basic node and time lag features similar to previous models, the DST model achieves performance with unsupervised training comparable to strong baselines on a blog network inference task.
Unsupervised training with additional content features achieves significantly better results, reaching half the accuracy of a fully supervised model.
Obesity treatment requires obese patients to record all food intakes per day.
Computer vision has been introduced to estimate calories from food images.
In order to increase detection accuracy and reduce the error of volume estimation in food calorie estimation, we present our calorie estimation method in this paper.
To estimate the calories in food, a top view and a side view are needed.
Faster R-CNN is used to detect the food and calibration object.
The GrabCut algorithm is used to obtain each food's contour.
Then the volume is estimated from the contour and the corresponding calibration object.
Finally we estimate each food's calorie.
The experimental results show that our estimation method is effective.
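Once detection and segmentation are done, the pipeline reduces to a simple final computation: convert pixels to physical units with the calibration object, estimate volume, then multiply by density and energy density. The sketch below is illustrative; the density/energy values and the cylinder-like volume model are assumptions, not the paper's calibrated tables.

```python
# Hypothetical densities (g/cm^3) and energy values (kcal/g); a real system
# would look these up per detected food class.
FOOD_DB = {"apple": (0.78, 0.52), "banana": (0.94, 0.89)}

def pixel_scale(ref_pixels, ref_size_cm):
    """cm-per-pixel scale from a calibration object of known size (e.g., a coin)."""
    return ref_size_cm / ref_pixels

def estimate_volume(top_area_px, side_height_px, scale):
    """Crude volume estimate: top-view area times side-view height (cylinder-like model)."""
    area_cm2 = top_area_px * scale ** 2
    height_cm = side_height_px * scale
    return area_cm2 * height_cm

def estimate_calories(food, top_area_px, side_height_px, ref_pixels, ref_size_cm):
    """kcal = volume (cm^3) x density (g/cm^3) x energy density (kcal/g)."""
    density, kcal_per_g = FOOD_DB[food]
    s = pixel_scale(ref_pixels, ref_size_cm)
    volume = estimate_volume(top_area_px, side_height_px, s)
    return volume * density * kcal_per_g

kcal = estimate_calories("apple", top_area_px=5000, side_height_px=300,
                         ref_pixels=100, ref_size_cm=2.5)
assert kcal > 0
```

In practice the top-view area and side-view height would come from the Faster R-CNN detections and GrabCut contours described above.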
Zero automata are a probabilistic extension of parity automata on infinite trees.
The satisfiability of a certain probabilistic variant of MSO, called TMSO+zero, reduces to the emptiness problem for zero automata.
We introduce a variant of zero automata called nonzero automata.
We prove that for every zero automaton there is an equivalent nonzero automaton of quadratic size, and that the emptiness problem for nonzero automata is decidable, both in NP and in coNP.
These results imply that TMSO+zero has decidable satisfiability.
Human communication typically has an underlying structure.
This is reflected in the fact that in many user generated videos, a starting point, ending, and certain objective steps between these two can be identified.
In this paper, we propose a method for parsing a video into such semantic steps in an unsupervised way.
The proposed method is capable of providing a semantic "storyline" of the video composed of its objective steps.
We accomplish this using both visual and language cues in a joint generative model.
The proposed method can also provide a textual description for each of the identified semantic steps and video segments.
We evaluate this method on a large number of complex YouTube videos and show results of unprecedented quality for this intricate and impactful problem.
This paper studies the structure of a parabolic partial differential equation on graphs and digital n-dimensional manifolds, which are digital models of continuous n-manifolds.
Conditions for the existence of solutions of equations are determined and investigated.
Numerical solutions of the equation on a Klein bottle, a projective plane, a 4D sphere and a Moebius strip are presented.
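As a toy illustration of a parabolic (heat-type) equation on a graph, the sketch below runs explicit-Euler diffusion driven by the graph Laplacian on a 6-vertex cycle, a simple digital model of a circle. This is a generic finite-difference scheme under assumed parameters, not the paper's specific construction for digital n-manifolds.

```python
import numpy as np

def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

def heat_step(u, L, tau):
    """One explicit-Euler step of du/dt = -L u on the graph."""
    return u - tau * L @ u

# A 6-vertex cycle: a digital model of a circle.
n = 6
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1
L = graph_laplacian(adj)

u = np.zeros(n)
u[0] = 1.0                            # initial heat concentrated at one vertex
for _ in range(200):
    u = heat_step(u, L, tau=0.1)      # tau * lambda_max < 2 ensures stability

# Heat diffuses toward the uniform state while total heat is conserved.
assert abs(u.sum() - 1.0) < 1e-9
assert np.max(np.abs(u - 1.0 / n)) < 1e-3
```

Conservation follows because each row of L sums to zero, so every Euler step preserves the total heat exactly.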
The detection and localization of a target from samples of its generated field is a problem of interest in a broad range of applications.
Often, the target field admits structural properties that enable the design of detection strategies that need fewer samples while retaining good performance.
This paper designs a sampling and localization strategy which exploits separability and unimodality in target fields and theoretically analyzes the trade-off achieved between sampling density, noise level and convergence rate of localization.
In particular, the strategy adopts an exploration-exploitation approach to target detection and utilizes the theory of low-rank matrix completion, coupled with unimodal regression, on decaying and approximately separable target fields.
The assumptions on the field are fairly generic and apply to many decay profiles, since no specific knowledge of the field is necessary beyond its admitting an approximately rank-one representation.
Extensive numerical experiments and comparisons are performed to test the efficacy and robustness of the presented approach.
Numerical results suggest that the proposed strategy outperforms algorithms based on mean-shift clustering, surface interpolation and naive low-rank matrix completion with peak detection, under low sampling density.
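A minimal sketch of the rank-one idea: for an approximately separable field, the leading SVD pair recovers the two unimodal factors, whose maxima give the peak location. The real strategy works from sparse samples via matrix completion; here the field is fully observed for simplicity, and all names are illustrative.

```python
import numpy as np

def rank_one_localize(samples):
    """Localize the peak of an approximately separable (rank-one) field.

    `samples` is a noisy field matrix; the best rank-one approximation
    (leading SVD pair) denoises it, and the peak is read off the two
    unimodal factor vectors.
    """
    U, s, Vt = np.linalg.svd(samples, full_matrices=False)
    u, v = np.abs(U[:, 0]), np.abs(Vt[0, :])   # separable factors f(x), g(y)
    return int(np.argmax(u)), int(np.argmax(v))

# Synthetic decaying separable field with a peak at (12, 30).
rng = np.random.default_rng(0)
x = np.exp(-0.1 * (np.arange(40) - 12) ** 2)
y = np.exp(-0.1 * (np.arange(60) - 30) ** 2)
field = np.outer(x, y) + 0.02 * rng.standard_normal((40, 60))

assert rank_one_localize(field) == (12, 30)
```

Replacing the plain argmax with unimodal regression on each factor, as the paper does, makes the peak estimate robust to heavier noise.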
Graph is a useful data structure to model various real life aspects like email communications, co-authorship among researchers, interactions among chemical compounds, and so on.
Supporting such real-life interactions produces a knowledge-rich, massive data repository.
However, efficiently understanding the underlying trends and patterns is hard due to the large size of the graph.
Therefore, this paper presents a scalable compression solution to compute a summary of a weighted graph.
All the aforementioned interactions from various domains are represented as edge weights in a graph.
Therefore, creating a summary graph while considering this vital aspect is necessary to learn insights of different communication patterns.
By evaluating the proposed method on two real-world, publicly available datasets against a state-of-the-art technique, we obtain an order-of-magnitude performance gain and better summarization accuracy.
In many real world networks, a vertex is usually associated with a transaction database that comprehensively describes the behaviour of the vertex.
A typical example is the social network, where the behaviour of every user is depicted by a transaction database that stores his daily posted contents.
A transaction database is a set of transactions, where a transaction is a set of items.
Every path of the network is a sequence of vertices that induces multiple sequences of transactions.
The sequences of transactions induced by all of the paths in the network form an extremely large sequence database.
Finding frequent sequential patterns in such a sequence database uncovers interesting subsequences that frequently appear in many paths of the network.
However, it is a challenging task, since the sequence database induced by a database graph is too large to be explicitly induced and stored.
In this paper, we propose the novel notion of database graph, which naturally models a wide spectrum of real world networks by associating each vertex with a transaction database.
Our goal is to find the top-k frequent sequential patterns in the sequence database induced from a database graph.
We prove that this problem is #P-hard.
To tackle this problem, we propose an efficient two-step sampling algorithm that approximates the top-k frequent sequential patterns with provable quality guarantee.
Extensive experimental results on synthetic and real-world data sets demonstrate the effectiveness and efficiency of our method.
Parallel coordinate plots (PCPs) are among the most useful techniques for the visualization and exploration of high-dimensional data spaces.
They are especially useful for the representation of correlations among the dimensions, which identify relationships and interdependencies between variables.
However, within these high-dimensional spaces, PCPs face difficulties in displaying the correlation between combinations of dimensions and generally require additional display space as the number of dimensions increases.
In this paper, we present a new technique for high-dimensional data visualization in which a set of low-dimensional PCPs are interactively constructed by sampling user-selected subsets of the high-dimensional data space.
In our technique, we first construct a graph visualization of sets of well-correlated dimensions.
Users observe this graph and are able to interactively select the dimensions by sampling from its cliques, thereby dynamically specifying the most relevant lower dimensional data to be used for the construction of focused PCPs.
Our interactive sampling overcomes the shortcomings of the PCPs by enabling the visualization of the most meaningful dimensions (i.e., the most relevant information) from high-dimensional spaces.
We demonstrate the effectiveness of our technique through two case studies, where we show that the proposed interactive low-dimensional space constructions were pivotal for visualizing the high-dimensional data and discovering new patterns.
Recent renewed interest in optimizing and analyzing floating-point programs has led to a diverse array of new tools for numerical programs.
These tools are often complementary, each focusing on a distinct aspect of numerical programming.
Building reliable floating point applications typically requires addressing several of these aspects, which makes easy composition essential.
This paper describes the composition of two recent floating-point tools: Herbie, which performs accuracy optimization, and Daisy, which performs accuracy verification.
We find that the combination provides numerous benefits to users, such as being able to use Daisy to check whether Herbie's unsound optimizations improved the worst-case roundoff error, as well as benefits to tool authors, including uncovering a number of bugs in both tools.
The combination also allowed us to compare the different program rewriting techniques implemented by these tools for the first time.
The paper lays out a road map for combining other floating-point tools and for surmounting common challenges.
Visual representation is crucial for a visual tracking method's performance.
Conventionally, visual representations adopted in visual tracking rely on hand-crafted computer vision descriptors.
These descriptors were developed generically without considering tracking-specific information.
In this paper, we propose to learn complex-valued invariant representations from tracked sequential image patches, via strong temporal slowness constraint and stacked convolutional autoencoders.
The deep slow local representations are learned offline on unlabeled data and transferred to the observational model of our proposed tracker.
The proposed observational model retains old training samples to alleviate drift and collects negative samples that are coherent with the target's motion pattern for better discriminative tracking.
With the learned representation and online training samples, a logistic regression classifier is adopted to distinguish target from background, and retrained online to adapt to appearance changes.
Subsequently, the observational model is integrated into a particle filter framework to perform visual tracking.
Experimental results on various challenging benchmark sequences demonstrate that the proposed tracker performs favourably against several state-of-the-art trackers.
To solve text-based question answering tasks that require relational reasoning, it is necessary to memorize a large amount of information and to find the question-relevant information in the memory.
Most approaches have been based on external memory and the four components proposed by Memory Networks.
The distinctive component among them is the way of finding the necessary information, and it largely determines performance.
Recently, a simple but powerful neural network module for reasoning called Relation Network (RN) has been introduced.
We analyzed RN from the viewpoint of Memory Networks and realized that its MLP component is able to reveal the complicated relations between question and object pairs.
Motivated by this, we introduce a model that uses an MLP to find relevant information within the Memory Network architecture.
It shows new state-of-the-art results in jointly trained bAbI-10k story-based question answering tasks and bAbI dialog-based question answering tasks.
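The RN module referred to above has a simple generic form: an MLP g_theta applied to every (object, object, question) triple, summed, then passed through an MLP f_phi. A minimal NumPy sketch with random, untrained weights (purely illustrative; the layer sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """A tiny random ReLU MLP, returned as a closure over its weights."""
    Ws = [rng.standard_normal((a, b)) * 0.1 for a, b in zip(sizes, sizes[1:])]
    def forward(x):
        for W in Ws[:-1]:
            x = np.maximum(x @ W, 0.0)
        return x @ Ws[-1]
    return forward

def relation_network(objects, question, g, f):
    """RN(O, q) = f_phi( sum over object pairs of g_theta([o_i, o_j, q]) )."""
    pair_sum = sum(
        g(np.concatenate([o_i, o_j, question]))
        for o_i in objects for o_j in objects
    )
    return f(pair_sum)

d_obj, d_q = 8, 4
g = mlp([2 * d_obj + d_q, 32, 16])   # g_theta scores one (o_i, o_j, q) triple
f = mlp([16, 32, 10])                # f_phi maps the aggregate to answer logits

objects = [rng.standard_normal(d_obj) for _ in range(5)]
question = rng.standard_normal(d_q)
logits = relation_network(objects, question, g, f)
assert logits.shape == (10,)
```

The pairwise sum makes the module permutation-invariant over objects, which is what lets the MLP focus purely on relations.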
Classification and clustering algorithms have each proven successful in different contexts.
Both of them have their own advantages and limitations.
For instance, although classification algorithms are more powerful than clustering methods in predicting class labels of objects, they do not perform well when there is a lack of sufficient manually labeled reliable data.
On the other hand, although clustering algorithms do not produce label information for objects, they provide supplementary constraints (e.g., if two objects are clustered together, it is more likely that the same label is assigned to both of them) that one can leverage for label prediction of a set of unknown objects.
Therefore, systematic utilization of both these types of algorithms together can lead to better prediction performance.
In this paper, we propose a novel algorithm, called EC3, that merges classification and clustering in order to support both binary and multi-class classification.
EC3 is based on a principled combination of multiple classification and multiple clustering methods using an optimization function.
We theoretically show the convexity and optimality of the problem and solve it by block coordinate descent method.
We additionally propose iEC3, a variant of EC3 that handles imbalanced training data.
We perform an extensive experimental analysis by comparing EC3 and iEC3 with 14 baseline methods (7 well-known standalone classifiers, 5 ensemble classifiers, and 2 existing methods that merge classification and clustering) on 13 standard benchmark datasets.
We show that our methods outperform other baselines for every single dataset, achieving at most 10% higher AUC.
Moreover, our methods are faster (1.21 times faster than the best baseline) and more resilient to noise and class imbalance than the best baseline method.
Given n red and n blue points in general position in the plane, it is well-known that there is a perfect matching formed by non-crossing line segments.
We characterize the bichromatic point sets which admit exactly one non-crossing matching.
We give several geometric descriptions of such sets, and find an O(n log n) algorithm that checks whether a given bichromatic set has this property.
In this paper we give an exponential lower bound for Cunningham's least recently considered (round-robin) rule as applied to parity games, Markov decision processes and linear programs.
This improves a recent subexponential bound of Friedmann for this rule on these problems.
The round-robin rule fixes a cyclical order of the variables and chooses the next pivot variable starting from the previously chosen variable and proceeding in the given circular order.
It is perhaps the simplest example from the class of history-based pivot rules.
Our results are based on a new lower bound construction for parity games.
Due to the nature of the construction we are also able to obtain an exponential lower bound for the round-robin rule applied to acyclic unique sink orientations of hypercubes (AUSOs).
Furthermore these AUSOs are realizable as polytopes.
We believe these are the first such results for history based rules for AUSOs, realizable or not.
The paper is self-contained and requires no previous knowledge of parity games.
The mood of a text and the intention of the writer can be reflected in the typeface.
However, in designing a typeface, it is difficult to keep the style of various characters consistent, especially for languages with lots of morphological variations such as Chinese.
In this paper, we propose a Typeface Completion Network (TCN) which takes one character as an input, and automatically completes the entire set of characters in the same style as the input characters.
Unlike existing models proposed for image-to-image translation, TCN embeds a character image into two separate vectors representing typeface and content.
Combined with a reconstruction loss from the latent space, and with other various losses, TCN overcomes the inherent difficulty in designing a typeface.
Also, compared to previous image-to-image translation models, TCN generates high quality character images of the same typeface with a much smaller number of model parameters.
We validate our proposed model on the Chinese and English character datasets, which is paired data, and the CelebA dataset, which is unpaired data.
In these datasets, TCN outperforms recently proposed state-of-the-art models for image-to-image translation.
The source code of our model is available at https://github.com/yongqyu/TCN.
We study the application of active learning techniques to the translation of unbounded data streams via interactive neural machine translation.
The main idea is to select, from an unbounded stream of source sentences, those that are worth being supervised by a human agent.
The user will interactively translate those samples.
Once validated, these data are useful for adapting the neural machine translation model.
We propose two novel methods for selecting the samples to be validated.
We exploit the information from the attention mechanism of a neural machine translation system.
Our experiments show that incorporating active learning techniques into this pipeline reduces the effort required during the process while increasing the quality of the translation system.
Moreover, it enables balancing the human effort required to achieve a certain translation quality.
Finally, our neural system outperforms classical approaches by a large margin.
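One plausible instantiation of an attention-based selection criterion (an assumption for illustration, not necessarily the paper's exact method) scores a translation by the entropy of its attention distributions: diffuse attention suggests an uncertain translation worth sending to the human agent.

```python
import math

def attention_entropy(attention_rows):
    """Mean entropy of the attention distribution over source words,
    one row per decoded target word; higher entropy means a more
    diffuse (less confident) alignment."""
    total = 0.0
    for row in attention_rows:
        total += -sum(p * math.log(p) for p in row if p > 0)
    return total / len(attention_rows)

def select_for_supervision(stream, score_fn, threshold):
    """Keep only the source sentences whose translations score above threshold."""
    return [sent for sent, attn in stream if score_fn(attn) > threshold]

# Toy stream: (sentence, attention matrix) pairs from a hypothetical NMT system.
confident = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]]   # sharp alignments
diffuse   = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]]       # uncertain alignments
stream = [("an easy sentence", confident), ("a hard sentence", diffuse)]

assert select_for_supervision(stream, attention_entropy, 0.8) == ["a hard sentence"]
```

The threshold then directly trades human effort against translation quality, as discussed above.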
In this paper, second-order hidden Markov model (HMM2) has been used and implemented to improve the recognition performance of text-dependent speaker identification systems under neutral talking condition.
Our results show that HMM2 improves the recognition performance under neutral talking condition compared to the first-order hidden Markov model (HMM1).
The recognition performance has been improved by 9%.
Browsing privacy solutions face an uphill battle to deployment.
Many operate counter to the economic objectives of popular online services (e.g., by completely blocking ads) and do not provide enough incentive for users who may be subject to performance degradation for deploying them.
In this study, we take a step towards realizing a system for online privacy that is mutually beneficial to users and online advertisers: an information market.
This system not only maintains economic viability for online services, but also provides users with financial compensation to encourage them to participate.
We prototype and evaluate an information market that provides privacy and revenue to users while preserving and sometimes improving their Web performance.
We evaluate feasibility of the market via a one month field study with 63 users and find that users are indeed willing to sell their browsing information.
We also use Web traces of millions of users to drive a simulation study to evaluate the system at scale.
We find that the system can indeed be profitable to both users and online advertisers.
In this paper, we present a consensus-based framework for decentralized estimation of deterministic parameters in wireless sensor networks (WSNs).
In particular, we propose an optimization algorithm to design (possibly complex) sensor gains in order to achieve an estimate of the parameter of interest that is as accurate as possible.
The proposed design algorithm employs a cyclic approach capable of handling various sensor gain constraints.
In addition, each iteration of the proposed design framework is comprised of the Gram-Schmidt process and power-method like iterations, and as a result, enjoys a low-computational cost.
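The interplay of the Gram-Schmidt process and power-method-like iterations can be sketched generically: power iterations find a dominant direction while a Gram-Schmidt projection enforces orthogonality constraints at every step. This is an illustrative numerical kernel under assumed inputs, not the paper's full gain-design algorithm.

```python
import numpy as np

def constrained_power_iteration(A, constraints, iters=500):
    """Dominant eigenvector of symmetric A restricted to the orthogonal
    complement of the constraint vectors, via power iterations with a
    Gram-Schmidt projection at every step."""
    # Orthonormalize the constraint set once (Gram-Schmidt via QR).
    Q, _ = np.linalg.qr(np.column_stack(constraints))
    rng = np.random.default_rng(0)
    x = rng.standard_normal(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x -= Q @ (Q.T @ x)            # project out the constraint directions
        x /= np.linalg.norm(x)
    return x

A = np.diag([5.0, 4.0, 1.0])
c = np.array([1.0, 0.0, 0.0])         # forbid the true dominant direction e1
x = constrained_power_iteration(A, [c])

# The iteration converges to the best direction allowed: e2.
assert abs(x @ c) < 1e-8
assert abs(abs(x[1]) - 1.0) < 1e-6
```

Each step costs one matrix-vector product plus a small projection, which is the source of the low computational cost noted above.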
With the increasing availability of large databases of 3D CAD models, depth-based recognition methods can be trained on a virtually unlimited number of synthetically rendered images.
However, discrepancies with the real data acquired from various depth sensors still noticeably impede progress.
Previous works adopted unsupervised approaches to generate more realistic depth data, but they all require real scans for training, even if unlabeled.
This still represents a strong requirement, especially when considering real-life/industrial settings where real training images are hard or impossible to acquire, but texture-less 3D models are available.
We thus propose a novel approach leveraging only CAD models to bridge the realism gap.
Purely trained on synthetic data, playing against an extensive augmentation pipeline in an unsupervised manner, our generative adversarial network learns to effectively segment depth images and recover the clean synthetic-looking depth information even from partial occlusions.
As our solution is not only fully decoupled from the real domains but also from the task-specific analytics, the pre-processed scans can be handed to any kind and number of recognition methods also trained on synthetic data.
Through various experiments, we demonstrate how this simplifies their training and consistently enhances their performance, with results on par with the same methods trained on real data, and better than usual approaches doing the reverse mapping.
For medical volume visualization, one of the most important tasks is to reveal clinically relevant details from the 3D scan (CT, MRI ...), e.g. the coronary arteries, without obscuring them with less significant parts.
These volume datasets contain different materials which are difficult to extract and visualize with 1D transfer functions based solely on the attenuation coefficient.
Multi-dimensional transfer functions allow a much more precise classification of data which makes it easier to separate different surfaces from each other.
Unfortunately, setting up multi-dimensional transfer functions can become a fairly complex task, generally accomplished by trial and error.
This paper briefly reviews neural networks and then presents an efficient way to speed up the visualization process via semi-automatic transfer function generation.
We describe how to use neural networks to detect distinctive features shown in the 2D histogram of the volume data and how to use this information for data classification.
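The 2D feature space referred to above is conventionally built from the attenuation value and the gradient magnitude of each voxel. A minimal sketch of computing such a histogram from a volume (illustrative, and independent of the neural-network classification stage):

```python
import numpy as np

def intensity_gradient_histogram(volume, bins=64):
    """2D histogram over (attenuation value, gradient magnitude), the
    standard feature space for multi-dimensional transfer functions."""
    gx, gy, gz = np.gradient(volume.astype(float))
    grad_mag = np.sqrt(gx**2 + gy**2 + gz**2)
    hist, val_edges, grad_edges = np.histogram2d(
        volume.ravel(), grad_mag.ravel(), bins=bins)
    return hist, val_edges, grad_edges

# Synthetic volume: a dense sphere in a light background; the boundary
# voxels show up as a high-gradient arch in the histogram.
z, y, x = np.mgrid[-16:16, -16:16, -16:16]
volume = np.where(x**2 + y**2 + z**2 < 10**2, 200.0, 20.0)

hist, _, _ = intensity_gradient_histogram(volume)
assert hist.shape == (64, 64)
assert hist.sum() == volume.size
```

Material boundaries trace characteristic arches in this histogram, which is exactly the structure a detector can exploit for semi-automatic transfer function generation.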
We present a distributed algorithm for a swarm of active particles to camouflage in an environment.
Each particle is equipped with sensing, computation and communication, allowing the system to take color and gradient information from the environment and self-organize into an appropriate pattern.
Current artificial camouflage systems are either limited to static patterns, which are adapted for specific environments, or rely on back-projection, which depend on the viewer's point of view.
Inspired by the camouflage abilities of the cuttlefish, we propose a distributed estimation and pattern formation algorithm that allows the swarm to quickly adapt to different environments.
We present convergence results both in simulation and on a swarm of miniature "Droplet" robots for a variety of patterns.
We introduce a problem called the Minimum Shared-Power Edge Cut (MSPEC).
The input to the problem is an undirected edge-weighted graph with distinguished vertices s and t, and the goal is to find an s-t cut by assigning "powers" at the vertices and removing an edge if the sum of the powers at its endpoints is at least its weight.
The objective is to minimize the sum of the assigned powers.
MSPEC is a graph generalization of a barrier coverage problem in a wireless sensor network: given a set of unit disks with centers in a rectangle, what is the minimum total amount by which we must shrink the disks to permit an intruder to cross the rectangle undetected, i.e., without entering any disk.
This is a more sophisticated measure of barrier coverage than the minimum number of disks whose removal breaks the barrier.
We develop a fully polynomial time approximation scheme (FPTAS) for MSPEC.
We give polynomial time algorithms for the special cases where the edge weights are uniform, or the power values are restricted to a bounded set.
Although MSPEC is related to network flow and matching problems, its computational complexity (in P or NP-hard) remains open.
The Domain Name System (DNS), one of the most important pieces of Internet infrastructure, is vulnerable to attacks because its designers did not take security issues into consideration at the beginning.
The defects of DNS may prevent users from accessing websites; worse, users might suffer huge economic losses.
In order to correct the DNS wrong resource records, we propose a Self-Feedback Correction System for DNS (SFCSD), which can find and track a large number of common websites' domain name and IP address correct correspondences to provide users with a real-time auto-updated correct (IP, Domain) binary tuple list.
By passively matching specific strings against SSL, DNS, and HTTP traffic, filtering with CDN CNAME and non-homepage URL feature strings, and verifying with a webpage fingerprint algorithm, SFCSD obtains a large number of highly likely correct IP addresses, to which a final active manual correction is applied.
Its self-feedback mechanism expands the search range and improves performance.
Experiments show that SFCSD achieves 94.3% precision and a 93.07% recall rate with optimal threshold selection on the test dataset.
Running stand-alone at 8 Gbps, it finds almost 1,000 likely correct (IP, Domain) pairs per day for each specific string and corrects almost 200.
Anthropomimetic robots are robots that sense, behave, interact and feel like humans.
By this definition, anthropomimetic robots require human-like physical hardware and actuation, but also brain-like control and sensing.
The most self-evident realization to meet those requirements would be a human-like musculoskeletal robot with a brain-like neural controller.
While both musculoskeletal robotic hardware and neural control software have existed for decades, a scalable approach that could be used to build and control an anthropomimetic human-scale robot has not been demonstrated yet.
Combining Myorobotics, a framework for musculoskeletal robot development, with SpiNNaker, a neuromorphic computing platform, we present the proof-of-principle of a system that can scale to dozens of neurally-controlled, physically compliant joints.
At its core, it implements a closed-loop cerebellar model which provides real-time low-level neural control at minimal power consumption and maximal extensibility: higher-order (e.g., cortical) neural networks and neuromorphic sensors like silicon-retinae or -cochleae can naturally be incorporated.
Vehicular Ad Hoc Networks (VANETs) are a very promising research area offering many useful and critical applications, including safety applications.
Most of these applications require that each vehicle know its current position precisely in real time.
GPS is the most common positioning technique for VANET.
However, it is not accurate.
Moreover, GPS signals cannot be received in tunnels, underground, or near tall buildings, so no positioning service can be obtained in these locations.
Even though Differential GPS (DGPS) can provide high accuracy, it still offers no coverage in these locations.
In this paper, we provide positioning techniques for VANET that can provide accurate positioning service in the areas where GPS signals are hindered by the obstacles.
Experimental results show significant improvement in the accuracy.
When combined with DGPS, this allows the continuity of a precise positioning service that can be used by most VANET applications.
Recent empirical studies show that the performance of GenProg is not satisfactory, particularly for Java.
In this paper, we propose ARJA, a new GP based repair approach for automated repair of Java programs.
To be specific, we present a novel lower-granularity patch representation that properly decouples the search subspaces of likely-buggy locations, operation types and potential fix ingredients, enabling GP to explore the search space more effectively.
Based on this new representation, we formulate automated program repair as a multi-objective search problem and use NSGA-II to look for simpler repairs.
To reduce the computational effort and search space, we introduce a test filtering procedure that can speed up the fitness evaluation of GP and three types of rules that can be applied to avoid unnecessary manipulations of the code.
Moreover, we also propose a type matching strategy that can create new potential fix ingredients by exploiting the syntactic patterns of the existing statements.
We conduct a large-scale empirical evaluation of ARJA along with its variants on both seeded bugs and real-world bugs in comparison with several state-of-the-art repair approaches.
Our results verify the effectiveness and efficiency of the search mechanisms employed in ARJA and also show its superiority over the other approaches.
In particular, compared to jGenProg (an implementation of GenProg for Java), an ARJA version fully following the redundancy assumption can generate a test-suite adequate patch for more than twice as many bugs (from 27 to 59), and a correct patch for nearly four times as many (from 5 to 18), on the 224 real-world bugs considered in Defects4J.
Furthermore, ARJA is able to correctly fix several real multi-location bugs that are hard to repair for most existing repair approaches.
We explore the use of segments learnt using Byte Pair Encoding (referred to as BPE units) as basic units for statistical machine translation between related languages and compare it with orthographic syllables, which are currently the best performing basic units for this translation task.
BPE identifies the most frequent character sequences as basic units, while orthographic syllables are linguistically motivated pseudo-syllables.
We show that BPE units modestly outperform orthographic syllables as units of translation, showing up to an 11% increase in BLEU score.
While orthographic syllables can be used only for languages whose writing systems use vowel representations, BPE is writing system independent and we show that BPE outperforms other units for non-vowel writing systems too.
Our results are supported by extensive experimentation spanning multiple language families and writing systems.
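BPE learning itself is simple enough to sketch: repeatedly merge the most frequent adjacent symbol pair across the word-frequency vocabulary. A minimal illustration (word-level frequencies with a `</w>` end-of-word marker; the toy corpus is invented):

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn BPE merge operations: repeatedly merge the most frequent
    adjacent symbol pair across the word-frequency vocabulary."""
    vocab = Counter()
    for word in corpus.split():
        vocab[tuple(word) + ("</w>",)] += 1
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the merge everywhere in the vocabulary.
        merged_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged_vocab[tuple(out)] += freq
        vocab = merged_vocab
    return merges

merges = learn_bpe("low low low lower lowest", 2)
assert merges[0] == ("l", "o")   # "lo" is the most frequent pair
```

Because the merges depend only on character co-occurrence statistics, the procedure is writing system independent, which is exactly the property exploited above for non-vowel writing systems.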
Access control is a crucial part of a system's security, restricting what actions users can perform on resources.
Therefore, access control is a core component when dealing with e-Health data and resources, discriminating which is available for a certain party.
We consider that current systems that attempt to ensure the sharing of policies between facilities are prone to system and network faults and do not ensure the integrity of the policy lifecycle.
By approaching this problem with the use of a distributed ledger, namely a consortium blockchain, where the operations are stored as transactions, we ensure that the different facilities have knowledge about all the parties that can act over the e-Health resources while maintaining integrity, auditability, authenticity, and scalability.
Deep networks have recently been shown to be vulnerable to universal perturbations: there exist very small image-agnostic perturbations that cause most natural images to be misclassified by such classifiers.
In this paper, we propose the first quantitative analysis of the robustness of classifiers to universal perturbations, and draw a formal link between the robustness to universal perturbations, and the geometry of the decision boundary.
Specifically, we establish theoretical bounds on the robustness of classifiers under two decision boundary models (flat and curved models).
We show in particular that the robustness of deep networks to universal perturbations is driven by a key property of their curvature: there exist shared directions along which the decision boundary of deep networks is systematically positively curved.
Under such conditions, we prove the existence of small universal perturbations.
Our analysis further provides a novel geometric method for computing universal perturbations, in addition to explaining their properties.
The Rate Control Protocol (RCP) is a congestion control protocol that relies on explicit feedback from routers.
RCP estimates the flow rate using two forms of feedback: rate mismatch and queue size.
However, it remains an open design question whether queue size feedback in RCP is useful, given the presence of rate mismatch.
The model we consider has RCP flows operating over a single bottleneck, with heterogeneous time delays.
We first derive a sufficient condition for global stability, and then highlight how this condition favors the design choice of having only rate mismatch in the protocol definition.
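The interplay of the two feedback terms can be illustrated with a toy single-flow, single-bottleneck simulation of the textbook RCP control law; the gains, delay, and capacity below are illustrative values, not the paper's:

```python
def simulate_rcp(C=100.0, alpha=0.5, beta=0.25, d=0.1, steps=60):
    """Toy single-bottleneck RCP simulation (illustrative constants).

    The router updates its advertised rate R from two feedback terms:
    rate mismatch (C - y) and queue size q, following the textbook
    RCP control law; T is the update interval (here T = d).
    """
    T = d
    R, q = 10.0, 0.0                   # initial advertised rate and queue
    for _ in range(steps):
        y = R                          # single flow: aggregate arrival rate = R
        q = max(0.0, q + (y - C) * T)  # queue grows only when y exceeds C
        R = R * (1 + (T / d) * (alpha * (C - y) - beta * q / d) / C)
    return R, q

R, q = simulate_rcp()
```

With these constants the rate converges monotonically toward the link capacity while the queue stays empty, so only the rate-mismatch term is active, which is the design question the paper examines.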
Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications.
These collections may reach huge sizes, but are formed mostly of documents that are near-copies of others.
Traditional techniques for indexing these collections fail to properly exploit their regularities in order to reduce space.
We introduce new techniques for compressing inverted indexes that exploit this near-copy regularity.
They are based on run-length, Lempel-Ziv, or grammar compression of the differential inverted lists, instead of the usual practice of gap-encoding them.
We show that, in this highly repetitive setting, our compression methods significantly reduce the space obtained with classical techniques, at the price of moderate slowdowns.
Moreover, our best methods are universal, that is, they do not need to know the versioning structure of the collection, nor that a clear versioning structure even exists.
We also introduce compressed self-indexes in the comparison.
These are designed for general strings (not only natural language texts) and represent the text collection plus the index structure (not an inverted index) in integrated form.
We show that these techniques can compress much further, using a small fraction of the space required by our new inverted indexes.
Yet, they are orders of magnitude slower.
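A rough illustration of why repetitiveness helps: near-copy documents tend to occupy consecutive identifiers, so the gap sequences produced by classical gap-encoding are dominated by long runs of 1s that run-length compression collapses (the paper applies this to differential inverted lists, but the effect on runs is similar). The posting list below is hypothetical:

```python
def gaps(postings):
    """Classic gap-encoding: store differences between consecutive doc ids."""
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]

def run_length(seq):
    """Run-length encode a sequence as (value, run) pairs."""
    runs = []
    for x in seq:
        if runs and runs[-1][0] == x:
            runs[-1][1] += 1
        else:
            runs.append([x, 1])
    return [tuple(r) for r in runs]

# Versioned documents produce blocks of consecutive doc ids, so the
# gap sequence is dominated by 1s and collapses to a few runs.
postings = [5, 6, 7, 8, 9, 40, 41, 42, 90]
g = gaps(postings)      # [5, 1, 1, 1, 1, 31, 1, 1, 48]
r = run_length(g)       # [(5, 1), (1, 4), (31, 1), (1, 2), (48, 1)]
```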
Developing an appropriate design process for a conceptual model is a stepping stone toward designing car bodies.
This paper presents a methodology to design a lightweight and modular space frame chassis for a sedan electric car.
The dual phase high strength steel with improved mechanical properties is employed to reduce the weight of the car body.
Finite element analysis yields two models used to predict the performance of each component.
The first model is a beam structure with a rapid response in structural stiffness simulation.
This model is used to perform the static tests, including modal frequency, bending stiffness, and torsional stiffness evaluation.
The second model, i.e., a shell model, is proposed to illustrate each module's mechanical behavior as well as its crashworthiness efficiency.
In order to perform the crashworthiness analysis, the explicit nonlinear dynamic solver provided by ABAQUS, a commercial finite element software, is used.
The results of finite element beam and shell models are in line with the concept design specifications.
Implementation of this procedure leads to a lightweight and modular concept for an electric car.
Over the last 25 years four million e-mail addresses have accumulated in the PGP web of trust.
In a study, each of them was tested for vitality, with the result that 40% were unreachable.
Of the mailboxes proven to be reachable, 46.77% turned out to be operated by one of three organizations.
In this article, the authors share their results and challenges during the study.
Over the last five years, methods based on Deep Convolutional Neural Networks (DCNNs) have shown impressive performance improvements for object detection and recognition problems.
This has been made possible due to the availability of large annotated datasets, a better understanding of the non-linear mapping between input images and class labels as well as the affordability of GPUs.
In this paper, we present the design details of a deep learning system for unconstrained face recognition, including modules for face detection, association, alignment and face verification.
The quantitative performance evaluation is conducted using the IARPA Janus Benchmark A (IJB-A), the JANUS Challenge Set 2 (JANUS CS2), and the LFW dataset.
The IJB-A dataset includes real-world unconstrained faces of 500 subjects with significant pose and illumination variations, which makes it much harder than the Labeled Faces in the Wild (LFW) and YouTube Faces (YTF) datasets.
JANUS CS2 is the extended version of IJB-A, which contains not only all the images/frames of IJB-A but also the original videos for evaluating video-based face verification systems.
Some open issues regarding DCNNs for face verification problems are then discussed.
Two channels are equivalent if their maximum likelihood (ML) decoders coincide for every code.
We show that this equivalence relation partitions the space of channels into a generalized hyperplane arrangement.
With this, we define a coding distance between channels in terms of their ML-decoders which is meaningful from the decoding point of view, in the sense that the closer two channels are, the larger is the probability of them sharing the same ML-decoder.
We give explicit formulas for these probabilities.
Feature selection has been studied widely in the literature.
However, the efficacy of the selection criteria for low sample size applications is neglected in most cases.
Most of the existing feature selection criteria are based on the sample similarity.
However, the distance measures become insignificant for high dimensional low sample size (HDLSS) data.
Moreover, the variance of a feature with a few samples is pointless unless it represents the data distribution efficiently.
Instead of looking at the samples in groups, we evaluate their efficiency in a pairwise fashion.
In our investigation, we noticed that considering a pair of samples at a time and selecting the features that bring them closer or put them far away is a better choice for feature selection.
Experimental results on benchmark data sets demonstrate the effectiveness of the proposed method with low sample size, which outperforms many other state-of-the-art feature selection methods.
Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words, sentences and documents in context.
Celebrated methods can be categorized as prediction-based and count-based methods according to the training objectives and model architectures.
Their pros and cons have been extensively analyzed and evaluated in recent studies, but there is relatively less work continuing the line of research to develop an enhanced learning method that brings together the advantages of the two model families.
In addition, the interpretation of the learned word representations still remains somewhat opaque.
Motivated by the observations and considering the pressing need, this paper presents a novel method for learning the word representations, which not only inherits the advantages of classic word embedding methods but also offers a clearer and more rigorous interpretation of the learned word representations.
Built upon the proposed word embedding method, we further formulate a translation-based language modeling framework for the extractive speech summarization task.
A series of empirical evaluations demonstrate the effectiveness of the proposed word representation learning and language modeling techniques in extractive speech summarization.
It is well known that normality (all factors of given length appear in an infinite sequence with the same frequency) can be described as incompressibility via finite automata.
Still, the statement and proof of this result as given by Becher and Heiber in terms of "lossless finite-state compressors" do not follow the standard scheme of the Kolmogorov complexity definition (the automaton is used for compression, not decompression).
We modify this approach to make it more similar to the traditional Kolmogorov complexity theory (and simpler) by explicitly defining the notion of automatic Kolmogorov complexity and using its simple properties.
Other known notions (Shallit--Wang, Calude--Salomaa--Roblot) of description complexity related to finite automata are discussed (see the last section).
As a byproduct, we obtain simple proofs of classical results about normality (equivalence of definitions with aligned occurrences and all occurrences, Wall's theorem saying that a normal number remains normal when multiplied by a rational number, and Agafonov's result saying that normality is preserved by automatic selection rules).
Many natural language processing tasks can be modeled into structured prediction and solved as a search problem.
In this paper, we distill an ensemble of multiple models trained with different initialization into a single model.
In addition to learning to match the ensemble's probability output on the reference states, we also use the ensemble to explore the search space and learn from the encountered states in the exploration.
Experimental results on two typical search-based structured prediction tasks -- transition-based dependency parsing and neural machine translation -- show that distillation effectively improves the single model's performance: the final model achieves improvements of 1.32 LAS and 2.65 BLEU over strong baselines on the two tasks, respectively, and outperforms the greedy structured prediction models in the previous literature.
We present a self-training approach to unsupervised dependency parsing that reuses existing supervised and unsupervised parsing algorithms.
Our approach, called `iterated reranking' (IR), starts with dependency trees generated by an unsupervised parser, and iteratively improves these trees using the richer probability models used in supervised parsing that are in turn trained on these trees.
Our system achieves accuracy 1.8% higher than the state-of-the-art parser of Spitkovsky et al. (2013) on the WSJ corpus.
The design of flow control systems remains a challenge due to the nonlinear nature of the equations that govern fluid flow.
However, recent advances in computational fluid dynamics (CFD) have enabled the simulation of complex fluid flows with high accuracy, opening the possibility of using learning-based approaches to facilitate controller design.
We present a method for learning the forced and unforced dynamics of airflow over a cylinder directly from CFD data.
The proposed approach, grounded in Koopman theory, is shown to produce stable dynamical models that can predict the time evolution of the cylinder system over extended time horizons.
Finally, by performing model predictive control with the learned dynamical models, we are able to find a straightforward, interpretable control law for suppressing vortex shedding in the wake of the cylinder.
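At its core, the Koopman-based regression step fits a linear operator to snapshot pairs by least squares. The pure-Python sketch below recovers a known 2-state linear system from its trajectory; real pipelines first lift the state through a dictionary of observables, which this toy omits, and the system matrix is hypothetical:

```python
def fit_linear_dynamics(X, Y):
    """Least-squares fit of A in Y ~ A X from 2-state snapshot pairs.

    Minimal sketch of the Koopman/DMD regression step: solves the
    2x2 normal equations for each output row via Cramer's rule.
    """
    # Gram matrix entries of X^T X.
    g11 = sum(x[0] * x[0] for x in X)
    g12 = sum(x[0] * x[1] for x in X)
    g22 = sum(x[1] * x[1] for x in X)
    det = g11 * g22 - g12 * g12
    A = []
    for i in range(2):
        b1 = sum(x[0] * y[i] for x, y in zip(X, Y))
        b2 = sum(x[1] * y[i] for x, y in zip(X, Y))
        A.append([(b1 * g22 - b2 * g12) / det,
                  (g11 * b2 - g12 * b1) / det])
    return A

# Generate a trajectory of a known linear system x_{k+1} = A_true x_k.
A_true = [[0.9, 0.1], [0.0, 0.8]]
x = [1.0, 0.5]
X, Y = [], []
for _ in range(10):
    nxt = [A_true[0][0] * x[0] + A_true[0][1] * x[1],
           A_true[1][0] * x[0] + A_true[1][1] * x[1]]
    X.append(x)
    Y.append(nxt)
    x = nxt

A_hat = fit_linear_dynamics(X, Y)
```

Because the snapshot data is exactly linear here, the fit recovers the true operator; with nonlinear flow data, the quality of the lifted linear model determines how far ahead the dynamics can be predicted.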
In this paper, we propose an ad-hoc on-demand distance vector routing algorithm for mobile ad-hoc networks taking into account node mobility.
The changing topology of such mobile ad-hoc networks provokes overhead messages for searching available routes and maintaining found routes.
The overhead messages impede data delivery from sources to destination and deteriorate network performance.
To overcome such a challenge, our proposed algorithm estimates link duration based on neighboring node mobility and chooses the most reliable route.
The proposed algorithm also applies the estimate for route maintenance to lessen the number of overhead messages.
Via simulations, the proposed algorithm is verified in various mobile environments.
In the low mobility environment, by reducing route maintenance messages, the proposed algorithm significantly improves network performance metrics such as packet delivery rate and end-to-end delay.
In the high mobility environment, the reduction of route discovery messages enhances network performance, since the proposed algorithm provides more reliable routes.
We explore several oversampling techniques for an imbalanced multi-label classification problem, a setting often encountered when developing models for Computer-Aided Diagnosis (CADx) systems.
While most CADx systems aim to optimize classifiers for overall accuracy without considering the relative distribution of each class, we look into using synthetic sampling to increase per-class performance when predicting the degree of malignancy.
Using low-level image features and a random forest classifier, we show that using synthetic oversampling techniques increases the sensitivity of the minority classes by an average of 7.22 percentage points, with as much as a 19.88 percentage point increase in sensitivity for a particular minority class.
Furthermore, the analysis of low-level image feature distributions for the synthetic nodules reveals that these nodules can provide insights on how to preprocess image data for better classification performance or how to supplement the original datasets when more data acquisition is feasible.
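Synthetic oversampling in the SMOTE style can be sketched as follows; this minimal version (the data points are hypothetical, and it is not the paper's exact pipeline) interpolates each new point between a minority sample and one of its nearest neighbours:

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Minimal SMOTE-style oversampling sketch: each synthetic point is an
    interpolation between a minority sample and one of its k nearest
    neighbours, so new points stay inside the minority class region."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist2(x, p))[:k]
        nb = rng.choice(neighbours)
        t = rng.random()              # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + t * (ni - xi)
                               for xi, ni in zip(x, nb)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new_points = smote(minority, n_new=6)
```

Because each synthetic point is a convex combination of two real minority samples, the oversampled class keeps the original feature distribution, which is what makes the subsequent feature-distribution analysis meaningful.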
Accurate localization of other traffic participants is a vital task in autonomous driving systems.
State-of-the-art systems employ a combination of sensing modalities such as RGB cameras and LiDARs for localizing traffic participants, but most such demonstrations have been confined to plain roads.
We demonstrate, to the best of our knowledge, the first results for monocular object localization and shape estimation on surfaces that do not share the same plane with the moving monocular camera.
We approximate road surfaces by local planar patches and use semantic cues from vehicles in the scene to initialize a local bundle-adjustment like procedure that simultaneously estimates the pose and shape of the vehicles, and the orientation of the local ground plane on which the vehicle stands as well.
We evaluate the proposed approach on the KITTI and SYNTHIA-SF benchmarks, for a variety of road plane configurations.
The proposed approach significantly improves the state-of-the-art for monocular object localization on arbitrarily-shaped roads.
The Discontinuous Reception (DRX) mechanism is commonly employed in current LTE networks to improve energy efficiency of user equipment (UE).
DRX allows UEs to monitor the physical downlink control channel (PDCCH) discontinuously when there is no downlink traffic for them, thus reducing their energy consumption.
However, DRX power savings are achieved at the expense of some increase in packet delay since downlink traffic transmission must be deferred until the UEs resume listening to the PDCCH.
In this paper, we present a promising mechanism that reduces energy consumption of UEs using DRX while simultaneously maintaining average packet delay around a desired target.
Furthermore, our proposal is able to achieve significant power savings without either increasing signaling overhead or requiring any changes to deployed wireless protocols.
Support Vector Data Description (SVDD) is a popular outlier detection technique which constructs a flexible description of the input data.
SVDD computation time is high for large training datasets which limits its use in big-data process-monitoring applications.
We propose a new iterative sampling-based method for SVDD training.
The method incrementally learns the training data description at each iteration by computing SVDD on an independent random sample selected with replacement from the training data set.
The experimental results indicate that the proposed method is extremely fast and provides a good data description.
In this paper, we propose a new deep learning approach, called neural association model (NAM), for probabilistic reasoning in artificial intelligence.
We propose to use neural networks to model association between any two events in a domain.
Neural networks take one event as input and compute a conditional probability of the other event to model how likely these two events are to be associated.
The actual meaning of the conditional probabilities varies between applications and depends on how the models are trained.
In this work, as two case studies, we have investigated two NAM structures, namely deep neural networks (DNN) and relation-modulated neural nets (RMNN), on several probabilistic reasoning tasks in AI, including recognizing textual entailment, triple classification in multi-relational knowledge bases and commonsense reasoning.
Experimental results on several popular datasets derived from WordNet, FreeBase and ConceptNet have all demonstrated that both DNNs and RMNNs perform equally well and they can significantly outperform the conventional methods available for these reasoning tasks.
Moreover, compared with DNNs, RMNNs are superior in knowledge transfer, where a pre-trained model can be quickly extended to an unseen relation after observing only a few training samples.
To further prove the effectiveness of the proposed models, in this work, we have applied NAMs to solving challenging Winograd Schema (WS) problems.
Experiments conducted on a set of WS problems prove that the proposed models have the potential for commonsense reasoning.
Wireless Sensor Network (WSN) consists of large number of low-cost, resource-constrained sensor nodes.
The constraints of wireless sensor nodes stem from their characteristics: low memory, low computational power, a small communication range, and limited energy; moreover, they are deployed in hostile areas and left unattended.
These characteristics make the network vulnerable to several attacks, such as the sinkhole attack.
A sinkhole attack is an attack in which a compromised node tries to attract network traffic by advertising fake routing updates.
One impact of a sinkhole attack is that it can be used to launch further attacks, such as selective forwarding, acknowledgment spoofing, and dropping or altering routing information.
It can also be used to send bogus information to the base station.
This paper focuses on exploring and analyzing existing solutions for detecting and identifying sinkhole attacks in wireless sensor networks.
The analysis is based on the advantages and limitations of the proposed solutions.
We consider the problem of detecting data races in program traces that have been compressed using straight line programs (SLP), which are special context-free grammars that generate exactly one string, namely the trace that they represent.
We consider two classical approaches to race detection --- using the happens-before relation and the lockset discipline.
We present algorithms for both these methods that run in time linear in the size of the compressed SLP representation.
Typical program executions almost always exhibit patterns that lead to significant compression.
Thus, our algorithms are expected to result in large speedups when compared with analyzing the uncompressed trace.
Our experimental evaluation of these new algorithms on standard benchmarks confirms this observation.
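An SLP is a grammar in which every nonterminal has exactly one rule, so it derives exactly one string. A toy example (hypothetical rules) shows how an 8-character trace is represented by just 3 rules; the point of the linear-time algorithms is that they process the rules directly and never perform this full expansion:

```python
def expand(slp, symbol):
    """Fully expand a straight-line program starting from `symbol`.

    Shown only to demonstrate what string the grammar represents;
    compressed-trace algorithms avoid this exponential blow-up.
    """
    if symbol not in slp:          # terminal character
        return symbol
    return "".join(expand(slp, s) for s in slp[symbol])

# SLP for the repetitive trace "abababab": 3 rules vs. 8 characters.
slp = {
    "S": ["B", "B"],
    "B": ["A", "A"],
    "A": ["a", "b"],
}
trace = expand(slp, "S")           # "abababab"
```

Doubling rules like these mean a grammar of n rules can represent a trace of length 2^n, which is why running directly on the SLP can yield exponential speedups on highly repetitive executions.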
Sentence specificity quantifies the level of detail in a sentence, characterizing the organization of information in discourse.
While this information is useful for many downstream applications, specificity prediction systems predict very coarse labels (binary or ternary) and are trained on and tailored toward specific domains (e.g., news).
The goal of this work is to generalize specificity prediction to domains where no labeled data is available and output more nuanced real-valued specificity ratings.
We present an unsupervised domain adaptation system for sentence specificity prediction, specifically designed to output real-valued estimates from binary training labels.
To calibrate the values of these predictions appropriately, we regularize the posterior distribution of the labels towards a reference distribution.
We show that our framework generalizes well to three different domains, achieving a 50%-68% reduction in mean absolute error over the current state-of-the-art system trained for news sentence specificity.
We also demonstrate the potential of our work in improving the quality and informativeness of dialogue generation systems.
The participatory Web has enabled the ubiquitous and pervasive access of information, accompanied by an increase of speed and reach in information sharing.
Data dissemination services such as news aggregators are expected to provide up-to-date, real-time information to the end users.
News aggregators are in essence recommendation systems that filter and rank news stories in order to select the few that will appear on the user's front screen at any time.
One of the main challenges in such systems is to address the recency and latency problems, that is, to identify as soon as possible how important a news story is.
In this work we propose an integrated framework that aims at predicting the importance of news items upon their publication with a focus on recent and highly popular news, employing resampling strategies, and at translating the result into concrete news rankings.
We perform an extensive experimental evaluation using real-life datasets of the proposed framework as both a stand-alone system and when applied to news recommendations from Google News.
Additionally, we propose and evaluate a combinatorial solution to the augmentation of official media recommendations with social information.
Results show that the proposed approach complements and enhances the news rankings generated by state-of-the-art systems.
The advancement of mobile technologies and the proliferation of map-based applications have enabled a user to access a wide variety of services that range from information queries to navigation systems.
Due to the popularity of map-based applications among users, the service provider often needs to answer a large number of simultaneous queries.
Thus, processing queries efficiently on spatial networks (i.e., road networks) has become an important research area in recent years.
In this paper, we focus on path queries that find the shortest path between a source and a destination of the user.
In particular, we address the problem of finding the shortest paths for a large number of simultaneous path queries in road networks.
Traditional systems that consider one query at a time are not suitable for many applications due to high computational and service costs.
These systems cannot guarantee required response time in high load conditions.
We propose an efficient group based approach that provides a practical solution with reduced cost.
The key concept for our approach is to group queries that share a common travel path and then compute the shortest path for the group.
Experimental results show that our approach is on average ten times faster than the traditional approach, at the cost of sacrificing at most 0.5% accuracy in the worst case, which is acceptable for most users.
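The grouping idea can be illustrated in miniature: queries that share a source (a simplified stand-in for sharing a common travel path) are answered with a single Dijkstra search instead of one search per query. The graph and queries below are hypothetical:

```python
import heapq
from collections import defaultdict

def dijkstra(graph, src):
    """Single-source shortest-path distances on a weighted digraph."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def answer_grouped(graph, queries):
    """Group queries by source and run one search per group, instead of
    one search per query."""
    by_src = defaultdict(list)
    for s, t in queries:
        by_src[s].append(t)
    answers = {}
    for s, targets in by_src.items():
        dist = dijkstra(graph, s)      # shared computation for the group
        for t in targets:
            answers[(s, t)] = dist.get(t, float("inf"))
    return answers

graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0), ("d", 5.0)],
    "c": [("d", 1.0)],
    "d": [],
}
ans = answer_grouped(graph, [("a", "c"), ("a", "d"), ("b", "d")])
```

Here two of the three queries share source "a", so the batch is served by two searches instead of three; at the scale of thousands of simultaneous queries, such sharing is where the speedup comes from.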
Single individual haplotyping is an NP-hard problem that emerges when attempting to reconstruct an organism's inherited genetic variations using data typically generated by high-throughput DNA sequencing platforms.
Genomes of diploid organisms, including humans, are organized into homologous pairs of chromosomes that differ from each other in a relatively small number of variant positions.
Haplotypes are ordered sequences of the nucleotides in the variant positions of the chromosomes in a homologous pair; for diploids, haplotypes associated with a pair of chromosomes may be conveniently represented by means of complementary binary sequences.
In this paper, we consider a binary matrix factorization formulation of the single individual haplotyping problem and efficiently solve it by means of alternating minimization.
We analyze the convergence properties of the alternating minimization algorithm and establish theoretical bounds for the achievable haplotype reconstruction error.
The proposed technique is shown to outperform existing methods when applied to synthetic as well as real-world Fosmid-based HapMap NA12878 datasets.
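The alternating minimization step can be sketched on a noiseless toy instance: with alleles coded as +/-1, a homologous pair corresponds to a rank-1 sign matrix, and the algorithm alternates between the read assignments and the haplotype. This is a simplified stand-in for the paper's formulation, with a planted instance:

```python
def sign(x):
    return 1 if x >= 0 else -1

def haplotype_altmin(R, iters=10):
    """Alternating minimization sketch for rank-1 +/-1 factorization.

    R[i][j] is the +/-1 allele that read i reports at variant j.
    We alternate between the read memberships s (which chromosome of
    the pair each read came from) and the haplotype h.
    """
    m, n = len(R), len(R[0])
    s = [1] * m                        # init: assign all reads to one strand
    for _ in range(iters):
        h = [sign(sum(s[i] * R[i][j] for i in range(m))) for j in range(n)]
        s = [sign(sum(h[j] * R[i][j] for j in range(n))) for i in range(m)]
    return h, s

# Planted noiseless instance: haplotype h_true, read strands s_true.
h_true = [1, -1, 1, 1, -1]
s_true = [1, 1, -1, 1]
R = [[si * hj for hj in h_true] for si in s_true]
h, s = haplotype_altmin(R)
```

On noiseless data the iteration reaches a fixed point immediately and reconstructs the planted rank-1 structure (up to the inherent sign ambiguity, which corresponds to swapping the two complementary haplotypes).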
Scalable user- and application-aware resource allocation for heterogeneous applications sharing an enterprise network is still an unresolved problem.
The main challenges are: (i) How to define user- and application-aware shares of resources?
(ii) How to determine an allocation of shares of network resources to applications?
(iii) How to allocate the shares per application in heterogeneous networks at scale?
In this paper we propose solutions to the three challenges and introduce a system design for enterprise deployment.
Defining the necessary resource shares per application is hard, as the intended use case and user's preferences influence the resource demand.
Utility functions based on user experience enable a mapping of network resources in terms of throughput and latency budget to a common user-level utility scale.
A multi-objective MILP is formulated to solve the throughput- and delay-aware embedding of each utility function under a max-min fairness criterion.
The allocation of resources in traditional networks with policing and scheduling cannot distinguish large numbers of classes.
We propose a resource allocation system design for enterprise networks based on Software-Defined Networking principles to achieve delay-constrained routing in the network and application pacing at the end-hosts.
The system design is evaluated against best effort networks with applications competing for the throughput of a constrained link.
The competing applications belong to the five application classes web browsing, file download, remote terminal work, video streaming, and Voice-over-IP.
The results show that the proposed methodology improves the minimum and total utility, minimizes packet loss and queuing delay at bottlenecks, establishes fairness in terms of utility between applications, and achieves predictable application performance at high link utilization.
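The max-min fairness criterion itself can be illustrated, in a far simpler setting than the paper's MILP, by progressive filling on a single link; the application names and demands below are hypothetical:

```python
def max_min_fair(capacity, demands):
    """Progressive-filling sketch of max-min fair allocation on one link.

    Raise all unsatisfied allocations together; whenever a flow's demand
    is met at the current equal share, freeze it at its demand and
    redistribute the remaining capacity among the others.
    """
    alloc = {f: 0.0 for f in demands}
    active = set(demands)
    remaining = capacity
    while active and remaining > 1e-9:
        share = remaining / len(active)
        satisfied = {f for f in active if demands[f] - alloc[f] <= share}
        if not satisfied:
            # No flow can be fully satisfied: split what is left equally.
            for f in active:
                alloc[f] += share
            remaining = 0.0
            break
        for f in satisfied:
            remaining -= demands[f] - alloc[f]
            alloc[f] = demands[f]
            active.remove(f)
    return alloc

# Link of 10 Mbit/s shared by VoIP, web browsing, and two downloads.
alloc = max_min_fair(10.0, {"voip": 0.1, "web": 4.0, "dl1": 8.0, "dl2": 8.0})
```

The low-rate VoIP flow receives its full demand, while the three larger flows split the remainder equally; the paper's MILP applies the same max-min principle, but over utility values rather than raw throughput and jointly with delay budgets.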
We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility.
The reward function depends on the hidden state (or goal) of both agents, so the agents must infer the other players' hidden goals from their observed behavior in order to solve the tasks.
We propose a new approach for learning in these domains: Self Other-Modeling (SOM), in which an agent uses its own policy to predict the other agent's actions and update its belief of their hidden state in an online manner.
We evaluate this approach on three different tasks and show that the agents are able to learn better policies using their estimate of the other players' hidden states, in both cooperative and adversarial settings.
Robot learning from demonstration (LfD) is a research paradigm that can play an important role in addressing the issue of scaling up robot learning.
Since this type of approach enables non-robotics experts to teach robots new knowledge without any professional background in mechanical engineering or computer programming, robots can appear in the real world even without prior knowledge of any task, like a newborn baby.
There is a growing body of literature that employs the LfD approach for training robots.
In this paper, I present a survey of recent research in this area while focusing on studies for human-robot collaborative tasks.
Since there are different aspects between stand-alone tasks and collaborative tasks, researchers should consider these differences to design collaborative robots for more effective and natural human-robot collaboration (HRC).
In this regard, many researchers have shown increased interest in building better communication frameworks between robots and humans, because communication is a key issue in applying the LfD paradigm to human-robot collaboration.
I thus first review some recent works that focus on designing better communication channels and methods, then deal with another interesting research method, interactive/active learning, and finally present other recent approaches that tackle a more challenging problem: learning complex tasks.
Detecting epileptic seizure through analysis of the electroencephalography (EEG) signal becomes a standard method for the diagnosis of epilepsy.
In a manual way, monitoring of long term EEG is tedious and error prone.
Therefore, a reliable automatic seizure detection method is desirable.
A critical challenge to automatic seizure detection is that seizure morphologies exhibit considerable variabilities.
In order to capture essential seizure patterns, this paper leverages an attention mechanism and a bidirectional long short-term memory (BiLSTM) model to exploit both spatially and temporally discriminating features and account for seizure variabilities.
The attention mechanism captures spatial features more effectively according to the contributions of brain areas to seizures.
The BiLSTM model extracts more discriminating temporal features in the forward and backward directions.
By accounting for both spatial and temporal variations of seizures, the proposed method is more robust across subjects.
The testing results over the noisy real data of CHB-MIT show that the proposed method outperforms the current state-of-the-art methods.
In both mixed-patient and cross-patient experiments, the average sensitivity and specificity are both higher, while their corresponding standard deviations are lower, than those of the methods in comparison.
Energy optimization has become a crucial issue in the realm of ICT.
This paper addresses the problem of energy consumption in a Metro Ethernet network.
Ethernet technology deployments have been increasing tremendously because of their simplicity and low cost.
However, much research remains to be conducted to address energy efficiency in Ethernet networks.
In this paper, we propose a novel Energy Aware Forwarding Strategy for Metro Ethernet networks based on a modification of the Internet Energy Aware Routing (EAR) algorithm.
Our contribution identifies the set of links to turn off while maintaining, in the active state, the links with minimum energy impact.
Our proposed algorithm could be a superior choice for use in networks with low saturation, as it involves a tradeoff between maintaining good network performance and minimizing the active links in the network.
Performance evaluation shows that, at medium load traffic, energy savings of 60% can be achieved.
At high loads, energy savings of 40% can be achieved without affecting the network performance.
In this paper, we give precise mathematical form to the idea of a structure whose data and axioms are faithfully represented by a graphical calculus; some prominent examples are operads, polycategories, properads, and PROPs.
Building on the established presentation of such structures as algebras for monads on presheaf categories, we describe a characteristic property of the associated monads---the shapeliness of the title---which says that "any two operations of the same shape agree".
An important part of this work is the study of analytic functors between presheaf categories, which are a common generalisation of Joyal's analytic endofunctors on sets and of the parametric right adjoint functors on presheaf categories introduced by Diers and studied by Carboni--Johnstone, Leinster and Weber.
Our shapely monads will be found among the analytic endofunctors, and may be characterised as the submonads of a universal analytic monad with "exactly one operation of each shape".
In fact, shapeliness also gives a way to define the data and axioms of a structure directly from its graphical calculus, by generating a free shapely monad on the basic operations of the calculus.
In this paper we do this for some of the examples listed above; in future work, we intend to do so for graphical calculi such as Milner's bigraphs, Lafont's interaction nets, or Girard's multiplicative proof nets, thereby obtaining canonical notions of denotational model.
Clone-and-own approach is a natural way of source code reuse for software developers.
To assess how known bugs and security vulnerabilities of a cloned component affect an application, developers and security analysts need to identify an original version of the component and understand how the cloned component is different from the original one.
Although developers may record the original version information in a version control system and/or directory names, such information is often either unavailable or incomplete.
In this research, we propose a code search method that takes as input a set of source files and extracts all the components including similar files from a software ecosystem (i.e., a collection of existing versions of software packages).
Our method employs an efficient file similarity computation using b-bit minwise hashing technique.
We use an aggregated file similarity for ranking components.
To evaluate the effectiveness of this tool, we analyzed 75 cloned components in Firefox and Android source code.
The tool took about two hours to report the original components from 10 million files in Debian GNU/Linux packages.
Recall of the top-five components in the extracted lists is 0.907, while recall of a baseline using SHA-1 file hash is 0.773, according to the ground truth recorded in the source code repositories.
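The b-bit minwise hashing step can be illustrated with a small sketch (the function names and the MD5-based hash family are our illustrative assumptions, not the paper's implementation): each file is reduced to a short signature, and the fraction of matching b-bit values approximates file similarity.

```python
import hashlib

def minhash_signature(tokens, num_hashes=64, b=1):
    """b-bit minwise hashing: keep only the lowest b bits of each min-hash."""
    sig = []
    for i in range(num_hashes):
        # Derive independent hash functions by salting with the index i.
        m = min(int(hashlib.md5(f"{i}:{t}".encode()).hexdigest(), 16)
                for t in tokens)
        sig.append(m & ((1 << b) - 1))  # retain the lowest b bits only
    return sig

def similarity(sig_a, sig_b):
    """Fraction of matching b-bit values approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

Note that with small b, even unrelated sets collide on roughly half the positions, so practical estimators unbias the raw matching rate; the sketch omits that correction.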
Recently the widely used multi-view learning model, Canonical Correlation Analysis (CCA) has been generalised to the non-linear setting via deep neural networks.
Existing deep CCA models typically first decorrelate the feature dimensions of each view before the different views are maximally correlated in a common latent space.
This feature decorrelation is achieved by enforcing an exact decorrelation constraint; these models are thus computationally expensive due to the matrix inversion or SVD operations required for exact decorrelation at each training iteration.
Furthermore, the decorrelation step is often separated from the gradient descent based optimisation, resulting in sub-optimal solutions.
We propose a novel deep CCA model Soft CCA to overcome these problems.
Specifically, exact decorrelation is replaced by soft decorrelation via a mini-batch based Stochastic Decorrelation Loss (SDL) to be optimised jointly with the other training objectives.
Extensive experiments show that the proposed soft CCA is more effective and efficient than existing deep CCA models.
In addition, our SDL loss can be applied to other deep models beyond multi-view learning, and obtains superior performance compared to existing decorrelation losses.
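The idea of replacing exact decorrelation with a soft penalty can be sketched in NumPy (our own naming; the paper's SDL is defined for mini-batch training of deep networks, and only the covariance penalty itself is shown here):

```python
import numpy as np

def stochastic_decorrelation_loss(features):
    """Soft decorrelation: penalize off-diagonal entries of the mini-batch
    feature covariance instead of enforcing exact whitening via SVD/inversion."""
    z = features - features.mean(axis=0, keepdims=True)  # center each dimension
    cov = z.T @ z / features.shape[0]                    # mini-batch covariance
    off_diag = cov - np.diag(np.diag(cov))               # zero out the diagonal
    return np.abs(off_diag).sum()                        # L1 penalty on correlations
```

Because the penalty is an ordinary differentiable loss, it can be optimised jointly with the other training objectives by gradient descent, which is the computational point of the soft approach.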
Multimedia content delivery over the Internet predominantly uses the Hypertext Transfer Protocol (HTTP) as its primary protocol, and multiple proprietary solutions exist.
The MPEG standard Dynamic Adaptive Streaming over HTTP (DASH) provides an interoperable solution and in recent years various adaptation logics/algorithms have been proposed.
However, to the best of our knowledge, there is no comprehensive evaluation of the various logics/algorithms.
Therefore, this paper provides a comprehensive evaluation of ten different adaptation logics/algorithms, which have been proposed in the past years.
The evaluation is done both objectively and subjectively.
The former uses a predefined bandwidth trajectory within a controlled environment; the latter is conducted in a real-world environment adopting crowdsourcing.
The results shall provide insights about which strategy can be adopted in actual deployment scenarios.
Additionally, the evaluation methodology described in this paper can be used to evaluate any other/new adaptation logic and to compare it directly with the results reported here.
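As a minimal example of one family of adaptation logics covered by such evaluations — simple throughput-based rate selection — consider the following sketch (the names and the 0.8 safety margin are illustrative assumptions, not a logic from the paper):

```python
def select_bitrate(throughput_history, ladder, safety=0.8):
    """Pick the highest representation whose bitrate stays within a safety
    margin of the smoothed measured throughput (a simple rate-based logic).
    `ladder` is the list of available bitrates, sorted ascending."""
    est = sum(throughput_history) / len(throughput_history)  # mean smoothing
    budget = safety * est
    candidates = [r for r in ladder if r <= budget]
    return candidates[-1] if candidates else min(ladder)     # fall back to lowest
```

Real adaptation logics differ mainly in how they estimate throughput, account for the playout buffer, and damp quality oscillations — exactly the dimensions such an evaluation compares.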
Multi-label image classification is a fundamental but challenging task towards general visual understanding.
Existing methods have found that region-level cues (e.g., features from RoIs) can facilitate multi-label classification.
Nevertheless, such methods usually require laborious object-level annotations (i.e., object labels and bounding boxes) for effective learning of the object-level visual features.
In this paper, we propose a novel and efficient deep framework to boost multi-label classification by distilling knowledge from weakly-supervised detection task without bounding box annotations.
Specifically, given the image-level annotations, (1) we first develop a weakly-supervised detection (WSD) model, and then (2) construct an end-to-end multi-label image classification framework augmented by a knowledge distillation module that guides the classification model by the WSD model according to the class-level predictions for the whole image and the object-level visual features for object RoIs.
The WSD model is the teacher model and the classification model is the student model.
After this cross-task knowledge distillation, the performance of the classification model is significantly improved and the efficiency is maintained since the WSD model can be safely discarded in the test phase.
Extensive experiments on two large-scale datasets (MS-COCO and NUS-WIDE) show that our framework outperforms state-of-the-art methods in both accuracy and efficiency.
This paper explains the genetic algorithm for novices in the field.
The basic philosophy of the genetic algorithm and its flowchart are described.
A step-by-step numerical computation of the genetic algorithm for solving a simple mathematical equality problem is briefly explained.
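The flavour of such a step-by-step computation can be sketched for an equality like a + 2b + 3c + 4d = 30 (the concrete operators below — roulette-wheel selection, one-point crossover, random-reset mutation — are standard textbook choices assumed for illustration):

```python
import random

def fitness(ch):
    # How close a + 2b + 3c + 4d comes to 30 (smaller error -> higher fitness).
    a, b, c, d = ch
    return 1.0 / (1.0 + abs(a + 2*b + 3*c + 4*d - 30))

def evolve(pop_size=20, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 30) for _ in range(4)] for _ in range(pop_size)]
    for _ in range(generations):
        # Roulette-wheel selection: sample parents proportionally to fitness.
        weights = [fitness(c) for c in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # One-point crossover on consecutive parent pairs.
        nxt = []
        for i in range(0, pop_size, 2):
            p1, p2 = parents[i], parents[i + 1]
            cut = rng.randint(1, 3)
            nxt += [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
        # Mutation: occasionally replace one gene with a random value.
        for ch in nxt:
            if rng.random() < 0.1:
                ch[rng.randrange(4)] = rng.randint(0, 30)
        pop = nxt
    return max(pop, key=fitness)
```

A chromosome here is simply the tuple (a, b, c, d); any chromosome satisfying the equality exactly attains the maximum fitness of 1.0.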
One of the classical approaches for estimating the frequencies and damping factors in a spectrally sparse signal is the MUltiple SIgnal Classification (MUSIC) algorithm, which exploits the low-rank structure of an autocorrelation matrix.
Low-rank matrices have also received considerable attention recently in the context of optimization algorithms with partial observations.
In this work, we offer a novel optimization-based perspective on the classical MUSIC algorithm that could lead to future developments and understanding.
In particular, we propose an algorithm for spectral estimation that involves searching for the peaks of the dual polynomial corresponding to a certain nuclear norm minimization (NNM) problem, and we show that this algorithm is in fact equivalent to MUSIC itself.
Building on this connection, we also extend the classical MUSIC algorithm to the missing data case.
We provide exact recovery guarantees for our proposed algorithms and quantify how the sample complexity depends on the true spectral parameters.
Simulation results also indicate that the proposed algorithms significantly outperform some relevant existing methods in frequency estimation of damped exponentials.
In recent years, the decoding algorithms in communication networks have become increasingly complex, aiming to achieve high reliability in correctly decoding received messages.
These decoding algorithms involve computationally complex operations requiring high performance computing hardware, which are generally expensive.
A cost-effective solution is to enhance the Instruction Set Architecture (ISA) of the processors by creating new custom instructions for the computational parts of the decoding algorithms.
In this paper, we propose to utilize the custom instruction approach to efficiently implement the widely used Viterbi decoding algorithm by adding the assembly language instructions to the ISA of DLX, PicoJava II and NIOS II processors, which represent RISC, stack and FPGA-based soft-core processor architectures, respectively.
By using the custom instruction approach, the execution time of the Viterbi algorithm is significantly improved by approximately 3 times for DLX and PicoJava II, and by 2 times for NIOS II.
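For reference, the computational core being mapped to custom instructions — hard-decision Viterbi decoding of a rate-1/2, constraint-length-3 convolutional code — looks like this in plain software (a generic sketch, not the DLX/PicoJava/NIOS implementation):

```python
def conv_encode(bits, g1=0b111, g2=0b101):
    """Rate-1/2, constraint-length-3 convolutional encoder (generators 7, 5 octal)."""
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state                      # 3-bit shift register
        out += [bin(reg & g1).count("1") % 2, bin(reg & g2).count("1") % 2]
        state = reg >> 1
    return out

def viterbi_decode(rx, n_bits, g1=0b111, g2=0b101):
    """Hard-decision Viterbi decoding over the 4-state trellis."""
    INF = float("inf")
    metric = [0] + [INF] * 3                        # path metric per state
    paths = [[] for _ in range(4)]                  # surviving input sequences
    for t in range(n_bits):
        new_metric = [INF] * 4
        new_paths = [None] * 4
        for s in range(4):
            if metric[s] == INF:
                continue
            for b in (0, 1):
                reg = (b << 2) | s
                o = [bin(reg & g1).count("1") % 2, bin(reg & g2).count("1") % 2]
                ns = reg >> 1
                # Hamming branch metric against the received pair.
                d = (o[0] != rx[2 * t]) + (o[1] != rx[2 * t + 1])
                if metric[s] + d < new_metric[ns]:
                    new_metric[ns] = metric[s] + d
                    new_paths[ns] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(4), key=lambda s: metric[s])
    return paths[best]
```

The inner add-compare-select step (branch metric, accumulate, keep the survivor) is precisely the kind of bit-level kernel that benefits from a dedicated instruction.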
We present a uniform method for translating an arbitrary nondeterministic finite automaton (NFA) into a deterministic mass action input/output chemical reaction network (I/O CRN) that simulates it.
The I/O CRN receives its input as a continuous time signal consisting of concentrations of chemical species that vary to represent the NFA's input string in a natural way.
The I/O CRN exploits the inherent parallelism of chemical kinetics to simulate the NFA in real time with a number of chemical species that is linear in the size of the NFA.
We prove that the simulation is correct and that it is robust with respect to perturbations of the input signal, the initial concentrations of species, the output (decision), and the rate constants of the reactions of the I/O CRN.
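The NFA semantics being simulated can be stated compactly: track the set of reachable states in parallel, which is the parallelism the I/O CRN realizes with species concentrations (a plain software sketch of the NFA itself, not of the CRN construction):

```python
def nfa_accepts(delta, start, accept, s):
    """Run the NFA by tracking the set of currently reachable states.
    `delta` maps (state, symbol) to a set of successor states."""
    current = {start}
    for ch in s:
        # Follow every enabled transition from every active state in parallel.
        current = {q for p in current for q in delta.get((p, ch), ())}
    return bool(current & accept)
```

Tracking one set of size at most |Q| is why a simulation linear in the size of the NFA is plausible: each state corresponds to one species whose concentration encodes membership in the reachable set.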
We give a new algorithm to construct optimal alphabetic ternary trees, where every internal node has at most three children.
This algorithm generalizes the classic Hu-Tucker algorithm, though the overall computational complexity has yet to be determined.
Development of reliable methods for optimised energy storage and generation is one of the most imminent challenges in modern power systems.
In this paper, an adaptive approach to the load leveling problem is proposed, using novel dynamic models based on the Volterra integral equations of the first kind with piecewise continuous kernels.
These integral equations efficiently solve such an inverse problem, taking into account both the time-dependent efficiencies and the generation/storage availability of each energy storage technology.
In this analysis a direct numerical method is employed to find the least-cost dispatch of available storages.
The proposed collocation type numerical method has second order accuracy and enjoys self-regularization properties, which is associated with confidence levels of system demand.
This adaptive approach is suitable for energy storage optimisation in real time.
The efficiency of the proposed methodology is demonstrated on the Single Electricity Market of the Republic of Ireland and on Sakhalin Island in the Russian Far East.
In this paper the problem of image restoration (denoising and inpainting) is approached using sparse approximation of local image blocks.
The local image blocks are extracted by sliding square windows over the image.
An adaptive block-size selection procedure for local sparse approximation is proposed, which affects the global recovery of the underlying image.
Ideally, the adaptive local block selection yields the minimum mean square error (MMSE) in the recovered image.
This framework gives us a clustered image based on the selected block size, then each cluster is restored separately using sparse approximation.
The results obtained using the proposed framework are very much comparable with the recently proposed image restoration techniques.
This paper proposes a hybrid technique for secured optimal power flow coupled with enhancing voltage stability with FACTS device installation.
The performance of a hybrid approach combining the Improved Gravitational Search Algorithm (IGSA) and the Firefly Algorithm (FA) is analyzed by optimally placing a TCSC controller.
The algorithm is implemented in the MATLAB working platform, and power flow security and voltage stability are evaluated on the IEEE 30-bus transmission system.
The optimal results are compared with those available in the literature, and the superior performance of the algorithm is demonstrated in terms of minimum generation cost and reduced real power losses while sustaining voltage stability.
Bipedal locomotion skills are challenging to develop.
Control strategies often use local linearization of the dynamics in conjunction with reduced-order abstractions to yield tractable solutions.
In these model-based control strategies, the controller is often not fully aware of many details, including torque limits, joint limits, and other non-linearities that are necessarily excluded from the control computations for simplicity.
Deep reinforcement learning (DRL) offers a promising model-free approach for controlling bipedal locomotion which can more fully exploit the dynamics.
However, current results in the machine learning literature are often based on ad-hoc simulation models that are not based on corresponding hardware.
Thus it remains unclear how well DRL will succeed on realizable bipedal robots.
In this paper, we demonstrate the effectiveness of DRL using a realistic model of Cassie, a bipedal robot.
By formulating a feedback control problem as finding the optimal policy for a Markov Decision Process, we are able to learn robust walking controllers that imitate a reference motion with DRL.
Controllers for different walking speeds are learned by imitating simple time-scaled versions of the original reference motion.
Controller robustness is demonstrated through several challenging tests, including sensory delay, walking blindly on irregular terrain and unexpected pushes at the pelvis.
We also show we can interpolate between individual policies and that robustness can be improved with an interpolated policy.
This paper is concerned with the design of cooperative distributed Model Predictive Control (MPC) for linear systems.
Motivated by the special structure of the distributed models in some existing literature, we propose to apply a state transformation to the original system and global cost function.
This has major implications on the closed-loop stability analysis and the mechanism of the resultant cooperative framework.
It turns out that the proposed framework can be implemented without cooperative iterations being performed in the local optimizations, thus allowing one to compute the local inputs in parallel and independently from each other while requiring only partial plant-wide state information.
The proposed framework can also be realized with cooperative iterations, thereby keeping the advantages of the technique in the former reference.
Under certain conditions, closed-loop stability for both implementation procedures can be guaranteed a priori by appropriate selections of the original local cost functions.
The strengths and benefits of the proposed method are highlighted by means of two numerical examples.
Recognizing fonts has become an important task in document analysis, due to the increasing number of available digital documents in different fonts and emphases.
A generic font-recognition system independent of language, script and content is desirable for processing various types of documents.
At the same time, categorizing calligraphy styles in handwritten manuscripts is important for palaeographic analysis, but has not been studied sufficiently in the literature.
We address the font-recognition problem as analysis and categorization of textures.
We extract features using complex wavelet transform and use support vector machines for classification.
Extensive experimental evaluations on different datasets in four languages and comparisons with state-of-the-art studies show that our proposed method achieves higher recognition accuracy while being computationally simpler.
Furthermore, on a new dataset generated from Ottoman manuscripts, we show that the proposed method can also be used for categorizing Ottoman calligraphy with high accuracy.
Workflow provenance typically assumes that each module is a "black-box", so that each output depends on all inputs (coarse-grained dependencies).
Furthermore, it does not model the internal state of a module, which can change between repeated executions.
In practice, however, an output may depend on only a small subset of the inputs (fine-grained dependencies) as well as on the internal state of the module.
We present a novel provenance framework that marries database-style and workflow-style provenance, by using Pig Latin to expose the functionality of modules, thus capturing internal state and fine-grained dependencies.
A critical ingredient in our solution is the use of a novel form of provenance graph that models module invocations and yields a compact representation of fine-grained workflow provenance.
It also enables a number of novel graph transformation operations, allowing users to choose the desired level of granularity in provenance querying (ZoomIn and ZoomOut) and supporting "what-if" workflow analytic queries.
We implemented our approach in the Lipstick system and developed a benchmark in support of a systematic performance evaluation.
Our results demonstrate the feasibility of tracking and querying fine-grained workflow provenance.
In asynchronous physical-layer network coding (APNC) systems, the symbols from multiple transmitters to a common receiver may be misaligned.
The knowledge of the amount of symbol misalignment, hence its estimation, is important to PNC decoding.
This paper addresses the problem of symbol-misalignment estimation and the problem of optimal PNC decoding given the misalignment estimate, assuming the APNC system uses the root-raised-cosine pulse to carry signals (RRC-APNC).
First, we put forth an optimal symbol-misalignment estimator that makes use of double baud-rate samples.
Then, we devise optimal decoders for RRC-APNC in the presence of inaccurate symbol-misalignment estimates.
In particular, we present a new whitening transformation to whiten the noise of the double baud-rate samples.
Finally, we investigate the decoding performance of various estimation-and-decoding schemes for RRC-APNC.
Extensive simulations show that: (i) Our double baud-rate estimator yields substantially more accurate symbol-misalignment estimates than the baud-rate estimator does.
The mean-square-error (MSE) gains are up to 8 dB.
(ii) An overall estimation-and-decoding scheme in which both estimation and decoding are based on double baud-rate samples yields much better performance than other schemes.
Compared with a scheme in which both estimation and decoding are based on baud-rate samples, the double baud-rate sampling scheme yields 4.5 dB gains in symbol error rate (SER) performance in an AWGN channel and 2 dB gains in packet error rate (PER) performance in a Rayleigh fading channel.
Recent studies in social media spam and automation provide anecdotal evidence of the rise of a new generation of spambots, so-called social spambots.
Here, for the first time, we extensively study this novel phenomenon on Twitter and we provide quantitative evidence that a paradigm-shift exists in spambot design.
First, we measure Twitter's current capabilities of detecting the new social spambots.
Later, we assess the human performance in discriminating between genuine accounts, social spambots, and traditional spambots.
Then, we benchmark several state-of-the-art techniques proposed by the academic literature.
Results show that neither Twitter, nor humans, nor cutting-edge applications are currently capable of accurately detecting the new social spambots.
Our results call for new approaches capable of turning the tide in the fight against this rising phenomenon.
We conclude by reviewing the latest literature on spambots detection and we highlight an emerging common research trend based on the analysis of collective behaviors.
Insights derived from both our extensive experimental campaign and survey shed light on the most promising directions of research and lay the foundations for the arms race against the novel social spambots.
Finally, to foster research on this novel phenomenon, we make publicly available to the scientific community all the datasets used in this study.
The file system provides the mechanism for online storage and access to file contents, including data and programs.
This paper covers the high-level details of file systems, as well as related topics such as the disk cache, the file system interface to the kernel, and the user-level APIs that use the features of the file system.
It aims to give the reader a thorough understanding of how a file system works in general.
The file system is a central component of the operating system.
It is used to create, manipulate, store, and retrieve data.
At the highest level, a file system is a way to manage information on a secondary storage medium.
There are many layers below and above the file system, all of which are described here.
This paper provides explanatory background for file system designers and for researchers in the area.
The complete path from the user process to the secondary storage device is covered.
File systems remain an area of active research, and there is always a need for further work; ongoing efforts target efficient, secure, and energy-saving techniques.
Hardware keeps becoming faster and cheaper, while software has not kept pace with hardware technology, so research is needed in this area to bridge the technology gap.
We study opportunistic scheduling and the sum capacity of cellular networks with a full-duplex multi-antenna base station and a large number of single-antenna half-duplex users.
Simultaneous uplink and downlink over the same band results in uplink-to-downlink interference, degrading performance.
We present a simple opportunistic joint uplink-downlink scheduling algorithm that exploits multiuser diversity and treats interference as noise.
We show that in homogeneous networks, our algorithm achieves the same sum capacity as what would have been achieved if there was no uplink-to-downlink interference, asymptotically in the number of users.
The algorithm does not require interference CSI at the base station or uplink users.
It is also shown that for a simple class of heterogeneous networks without sufficient channel diversity, it is not possible to achieve the corresponding interference-free system capacity.
We discuss the potential for using device-to-device side-channels to overcome this limitation in heterogeneous networks.
In this paper a new distributed asynchronous algorithm is proposed for time synchronization in networks with random communication delays, measurement noise and communication dropouts.
Three different types of the drift correction algorithm are introduced, based on different kinds of local time increments.
Under nonrestrictive conditions concerning network properties, it is proved that all the algorithm types provide convergence in the mean square sense and with probability one (w.p.1) of the corrected drifts of all the nodes to the same value (consensus).
An estimate of the convergence rate of these algorithms is derived.
For offset correction, a new algorithm is proposed containing a compensation parameter coping with the influence of random delays and special terms taking care of the influence of both linearly increasing time and drift correction.
It is proved that the corrected offsets of all the nodes converge in the mean square sense and w.p.1.
An efficient offset correction algorithm based on consensus on local compensation parameters is also proposed.
It is shown that the overall time synchronization algorithm can also be implemented as a flooding algorithm with one reference node.
It is proved that it is possible to achieve bounded error between local corrected clocks in the mean square sense and w.p.1.
Simulation results provide an additional practical insight into the algorithm properties and show its advantage over the existing methods.
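The consensus mechanism underlying drift correction can be illustrated with a much simpler pairwise-gossip sketch (deliberately ignoring the random delays, measurement noise, and dropouts that the paper's algorithms are designed to handle):

```python
import random

def drift_consensus(drifts, rounds=2000, step=0.5, seed=1):
    """Pairwise gossip: each update moves two randomly chosen nodes' corrected
    drifts toward each other, driving all nodes to a common value (consensus)."""
    rng = random.Random(seed)
    d = list(drifts)
    n = len(d)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)
        # Symmetric pairwise update preserves the mean and shrinks disagreement.
        mid = step * (d[j] - d[i])
        d[i] += mid
        d[j] -= mid
    return d
```

With step = 0.5 each update is exact pairwise averaging, so the network mean is preserved while the spread between nodes decays geometrically.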
The impression of free will is the feeling that our choices are imposed neither from within nor from outside.
It is the sense that we are the ultimate cause of our acts.
In direct opposition to universal determinism, the existence of free will continues to be debated.
In this paper, free will is linked to a decisional mechanism: an agent has free will if, having performed a predictable choice Cp, it can immediately perform another choice Cr at random.
The intangible feeling of free will is replaced by a decision-making process in which a predictable decision is immediately followed by an unpredictable one.
This paper examines fundamental error characteristics for a general class of matrix completion problems, where the matrix of interest is a product of two a priori unknown matrices, one of which is sparse, and the observations are noisy.
Our main contributions come in the form of minimax lower bounds on the expected per-element squared error for this problem under several common noise models.
Specifically, we analyze scenarios where the corruptions are characterized by additive Gaussian noise or additive heavier-tailed (Laplace) noise, Poisson-distributed observations, and highly-quantized (e.g., one-bit) observations, as instances of our general result.
Our results establish that the error bounds derived in (Soni et al., 2016) for complexity-regularized maximum likelihood estimators achieve, up to multiplicative constants and logarithmic factors, the minimax error rates in each of these noise scenarios, provided that the nominal number of observations is large enough and the sparse factor has, on average, at least one non-zero per column.
Benefiting from its succinctness and robustness, skeleton-based human action recognition has recently attracted much attention.
Most existing methods utilize local networks, such as recurrent networks, convolutional neural networks, and graph convolutional networks, to extract spatio-temporal dynamics hierarchically.
As a consequence, the local and non-local dependencies, which respectively contain more details and more semantics, are captured asynchronously at different layer depths.
Moreover, being limited to the spatio-temporal domain, these methods ignore patterns in the frequency domain.
To better extract information from multi-domains, we propose a residual frequency attention (rFA) to focus on discriminative patterns in the frequency domain, and a synchronous local and non-local (SLnL) block to simultaneously capture the details and semantics in the spatio-temporal domain.
To optimize the whole process, we also propose a soft-margin focal loss (SMFL), which automatically conducts adaptive data selection and encourages intrinsic margins in classifiers.
Extensive experiments are performed on several large-scale action recognition datasets and our approach significantly outperforms other state-of-the-art methods.
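For context, the standard focal loss that SMFL builds on down-weights easy examples by a factor depending on the predicted probability of the true class; SMFL's adaptive soft margin is the paper's addition and is not reproduced here:

```python
import math

def focal_loss(p, gamma=2.0):
    """Standard focal loss for true-class probability p: the (1 - p)^gamma
    factor suppresses the loss of easy (high-p) examples. With gamma = 0 it
    reduces to plain cross-entropy."""
    return -((1.0 - p) ** gamma) * math.log(p)
```

The effect is a form of implicit data selection: confidently classified samples contribute almost nothing to the gradient, focusing training on hard examples.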
Deep reinforcement learning algorithms can learn complex behavioral skills, but real-world application of these methods requires a large amount of experience to be collected by the agent.
In practical settings, such as robotics, this involves repeatedly attempting a task and resetting the environment between attempts.
However, not all tasks are easily or automatically reversible.
In practice, this learning process requires extensive human intervention.
In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt.
By learning a value function for the reset policy, we can automatically determine when the forward policy is about to enter a non-reversible state, providing for uncertainty-aware safety aborts.
Our experiments illustrate that proper use of the reset policy can greatly reduce the number of manual resets required to learn a task, can reduce the number of unsafe actions that lead to non-reversible states, and can automatically induce a curriculum.
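The abort mechanism can be caricatured in a toy loop: whenever the reset policy's value estimate at the next state drops below a threshold, control is handed back before the state becomes hard to reverse (the dynamics, value function, and threshold below are all toy stand-ins, not the paper's learned components):

```python
def safe_rollout(start, forward, reset_value, threshold, horizon):
    """Run the forward policy, but trigger a safety abort whenever the reset
    policy's value estimate says the next state would be hard to reverse."""
    s, aborts, trace = start, 0, []
    for _ in range(horizon):
        nxt = forward(s)
        if reset_value(nxt) < threshold:  # uncertainty-aware safety abort
            aborts += 1
            s = start                      # hand control to the reset policy
        else:
            s = nxt
        trace.append(s)
    return trace, aborts
```

In the real method both policies are learned jointly, and the abort count itself is a useful signal: frequent aborts near a region indicate states the agent should approach more cautiously.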
The 2011 Grand Challenge in Service conference aimed to explore, analyse and evaluate complex service systems, utilising a case scenario of improving the perception of safety in the London Borough of Sutton, which provided a common context linking the contributions.
The key themes that emerged included value co-creation, systems and networks, ICT and complexity, for which we summarise the contributions.
Contributions on value co-creation are based mainly on empirical research and provide a variety of insights including the importance of better understanding collaboration within value co-creation.
Contributions on the systems perspective, considered to arise from networks of value co-creation, include efforts to understand the implications of the interactions within service systems, as well as their interactions with social systems, to co-create value.
Contributions within the technological sphere, providing ever greater connectivity between entities, focus on the creation of new value constellations and new demand being fulfilled through hybrid offerings of physical assets, information and people.
Contributions on complexity, arising from the value co-creation networks of technology-enabled service systems, focus on the challenges in understanding, managing and analysing these complex service systems.
The theory and applications all show the importance of understanding service for the future.
Tor is vulnerable to network-level adversaries who can observe both ends of the communication to deanonymize users.
Recent work has shown that Tor is susceptible to the previously unknown active BGP routing attacks, called RAPTOR attacks, which expose Tor users to more network-level adversaries.
In this paper, we aim to mitigate and detect such active routing attacks against Tor.
First, we present a new measurement study on the resilience of the Tor network to active BGP prefix attacks.
We show that ASes with high Tor bandwidth can be less resilient to attacks than other ASes.
Second, we present a new Tor guard relay selection algorithm that incorporates resilience of relays into consideration to proactively mitigate such attacks.
We show that the algorithm successfully improves the security for Tor clients by up to 36% on average (up to 166% for certain clients).
Finally, we build a live BGP monitoring system that can detect routing anomalies on the Tor network in real time by performing an AS origin check and novel detection analytics.
Our monitoring system successfully detects simulated attacks that are modeled after multiple known attack types as well as a real-world hijack attack (performed by us), while having low false positive rates.
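The resilience-aware selection idea can be sketched as weighted sampling (the weighting formula below is our illustrative assumption, not the paper's guard selection algorithm):

```python
import random

def select_guard(relays, alpha=0.5, rng=None):
    """Sample a guard relay with probability proportional to a blend of its
    bandwidth and its resilience to BGP prefix attacks. alpha = 0 recovers
    pure bandwidth-weighted selection; alpha = 1 weights fully by resilience."""
    rng = rng or random.Random(0)
    weights = [r["bandwidth"] * (1 - alpha)
               + r["bandwidth"] * r["resilience"] * alpha
               for r in relays]
    return rng.choices(relays, weights=weights, k=1)[0]
```

The tension such a scheme must balance is the one the paper quantifies: shifting weight toward resilient relays improves routing-attack security but skews load away from high-bandwidth relays.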
This paper describes our system designed for the NLPCC 2016 shared task on word segmentation on micro-blog texts.
Network densification has always been an important factor to cope with the ever increasing capacity demand.
Deploying more base stations (BSs) improves the spatial frequency utilization, which increases the network capacity.
However, such improvement comes at the expense of shrinking the BSs' footprints, which increases the handover (HO) rate and may diminish the foreseen capacity gains.
In this paper, we propose a cooperative HO management scheme to mitigate the HO effect on throughput gains achieved via cellular network densification.
The proposed HO scheme relies on skipping HO to the nearest BS at some instances along the user's trajectory while enabling cooperative BS service during HO execution at other instances.
To this end, we develop a mathematical model, via stochastic geometry, to quantify the performance of the proposed HO scheme in terms of coverage probability and user throughput.
The results show that the proposed cooperative HO scheme outperforms the always best connected based association at high mobility.
Also, the value of BS cooperation combined with handover skipping is quantified with respect to the HO-skipping-only scheme that has recently appeared in the literature.
In particular, the proposed cooperative HO scheme shows throughput gains of 12% to 27% over the always best connected scheme and of 17% on average over the HO-skipping-only scheme, at user velocities ranging from 80 km/h to 160 km/h.
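A back-of-envelope sketch of why skipping can pay off at high mobility (all numbers below are illustrative assumptions, not the paper's stochastic-geometry model):

```python
def throughput(rate_bps, ho_per_s, outage_s):
    """Effective throughput once per-handover outage time is subtracted."""
    return rate_bps * max(0.0, 1.0 - ho_per_s * outage_s)

# Illustrative numbers only: dense cells, fast user
v = 40.0                 # user speed, m/s (144 km/h)
cell = 50.0              # metres between cell-boundary crossings
ho_rate = v / cell       # handovers per second when always best-connected

best_conn = throughput(rate_bps=10e6, ho_per_s=ho_rate, outage_s=0.5)
# Skipping every other HO halves the HO rate, but being farther from the
# serving BS part of the time lowers the achievable rate (assume 80%)
skipping = throughput(rate_bps=8e6, ho_per_s=ho_rate / 2, outage_s=0.5)
```

With these toy numbers the always-best-connected user spends 40% of its time in HO outage and ends up with less effective throughput than the skipping user, mirroring the high-mobility regime where the proposed scheme wins.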
Perfect tracking control for real-world Euler-Lagrange systems is challenging due to uncertainties in the system model and external disturbances.
The magnitude of the tracking error can be reduced either by increasing the feedback gains or improving the model of the system.
The latter is clearly preferable, as it allows good tracking performance to be maintained at low feedback gains.
However, accurate models are often difficult to obtain.
In this article, we address the problem of stable high-performance tracking control for unknown Euler-Lagrange systems.
In particular, we employ Gaussian Process regression to obtain a data-driven model that is used for the feed-forward compensation of unknown dynamics of the system.
The model fidelity is used to adapt the feedback gains allowing low feedback gains in state space regions of high model confidence.
The proposed control law guarantees a globally bounded tracking error with a specific probability.
Simulation studies demonstrate the superiority over state-of-the-art tracking control approaches.
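The gain-adaptation idea in this abstract can be sketched in a few lines (a toy 1-D illustration, not the paper's controller: the kernel, the data, and the gain schedule are all assumptions):

```python
import numpy as np

def rbf(a, b, ls=0.5, var=1.0):
    """Squared-exponential kernel matrix between 1-D sample vectors."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

def gp_predict(x_train, y_train, x_query, noise=1e-3):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(rbf(x_query, x_query)) - np.sum(v ** 2, axis=0)
    return mu, np.maximum(var, 0.0)

# Unknown dynamics term f(x) = sin(x), sampled only on [0, 3]
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 3.0, 30)
y_train = np.sin(x_train) + 0.01 * rng.standard_normal(30)

x_query = np.array([1.5, 6.0])        # inside vs. far outside the data
mu, var = gp_predict(x_train, y_train, x_query)

# Variance-adaptive feedback gain: low gain where the model is confident
k_min, k_max = 1.0, 10.0
std = np.sqrt(var)
gains = k_min + (k_max - k_min) * std / (std.max() + 1e-12)
```

Near the training data the posterior variance collapses and the gain stays close to `k_min`; far from it the gain rises toward `k_max`, which is the confidence-dependent behaviour the abstract describes.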
In the era of Big Data and Deep Learning, there is a common view that machine learning approaches are the only way to achieve robust and scalable information extraction and summarization.
It has been recently proposed that the CNL approach could be scaled up, building on the concept of embedded CNL and, thus, allowing for CNL-based information extraction from e.g. normative or medical texts that are rather controlled by nature but still infringe the boundaries of CNL.
Although it is arguable if CNL can be exploited to approach the robust wide-coverage semantic parsing for use cases like media monitoring, its potential becomes much more obvious in the opposite direction: generation of story highlights from the summarized AMR graphs, which is in the focus of this position paper.
This paper focuses on preserving the privacy of sensitive patterns when inducing decision trees.
We adopt a record augmentation approach for hiding sensitive classification rules in binary datasets.
Such a hiding methodology is preferred over other heuristic solutions like output perturbation or cryptographic techniques - which restrict the usability of the data - since the raw data itself is readily available for public use.
We show some key lemmas which are related to the hiding process and we also demonstrate the methodology with an example and an indicative experiment using a prototype hiding tool.
Recognition of low resolution face images is a challenging problem in many practical face recognition systems.
Methods proposed in the face recognition literature for this problem assume that the probe is low resolution, while a high-resolution gallery is available for recognition.
These attempts have been aimed at modifying the probe image such that the resultant image provides better discrimination.
We formulate the problem differently by leveraging the information available in the high resolution gallery image and propose a dictionary learning approach for classifying the low-resolution probe image.
An important feature of our algorithm is that it can handle resolution change along with illumination variations.
Furthermore, we also kernelize the algorithm to handle non-linearity in data and present a joint dictionary learning technique for robust recognition at low resolutions.
The effectiveness of the proposed method is demonstrated using standard datasets and a challenging outdoor face dataset.
It is shown that our method is efficient and can perform significantly better than many competitive low resolution face recognition algorithms.
Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent.
When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable.
However, such models are generated as part of a larger analytical process, which, in particular, comprises data collection and filtering.
Selection bias in data collection or in data pre-processing may affect the model learned.
Although model induction algorithms are designed to learn to generalize, they pursue optimization of predictive accuracy.
It remains unclear how interpretability is instead impacted.
We conduct an experimental analysis to investigate whether interpretable models are able to cope with data selection bias as far as interpretability is concerned.
Neural networks have proven effective at solving difficult problems but designing their architectures can be challenging, even for image classification problems alone.
Our goal is to minimize human participation, so we employ evolutionary algorithms to discover such networks automatically.
Despite significant computational requirements, we show that it is now possible to evolve models with accuracies within the range of those published in the last year.
Specifically, we employ simple evolutionary techniques at unprecedented scales to discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial initial conditions and reaching accuracies of 94.6% (95.6% for ensemble) and 77.0%, respectively.
To do this, we use novel and intuitive mutation operators that navigate large search spaces; we stress that no human participation is required once evolution starts and that the output is a fully-trained model.
Throughout this work, we place special emphasis on the repeatability of results, the variability in the outcomes and the computational requirements.
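The evolutionary loop described above can be illustrated with a minimal steady-state sketch (the bit-string genome and the toy fitness function are stand-ins for architectures and validation accuracy, not the paper's actual search space or mutation operators):

```python
import random

def mutate(genome):
    """Flip one randomly chosen bit (a stand-in for an architecture mutation)."""
    i = random.randrange(len(genome))
    child = list(genome)
    child[i] ^= 1
    return child

def fitness(genome):
    """Toy objective standing in for validation accuracy."""
    return sum(genome)

def evolve(genome_len=20, pop_size=16, steps=1000, seed=0):
    """Steady-state evolution: tournament selection, mutate, replace worst."""
    random.seed(seed)
    pop = [[0] * genome_len for _ in range(pop_size)]   # trivial start
    for _ in range(steps):
        a, b = random.sample(pop, 2)
        parent = a if fitness(a) >= fitness(b) else b
        child = mutate(parent)
        worst = min(range(pop_size), key=lambda i: fitness(pop[i]))
        pop[worst] = child                              # best is never lost
    return max(pop, key=fitness)

best = evolve()
```

As in the abstract, the search starts from trivial initial conditions and needs no human intervention once it is running; only the scale and the mutation operators differ from the real experiments.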
Networks observed in the real world, such as social networks and collaboration networks, exhibit temporal dynamics, i.e., nodes and edges appear and/or disappear over time.
In this paper, we propose a generative, latent space based, statistical model for such networks (called dynamic networks).
We consider the case where the number of nodes is fixed, but the presence of edges can vary over time.
Our model allows the number of communities in the network to be different at different time steps.
We use a neural network based methodology to perform approximate inference in the proposed model and its simplified version.
Experiments done on synthetic and real world networks for the task of community detection and link prediction demonstrate the utility and effectiveness of our model as compared to other similar existing approaches.
While autonomous multirotor micro aerial vehicles (MAVs) are uniquely well suited for certain types of missions benefiting from stationary flight capabilities, their more widespread usage still faces many hurdles, due in particular to their limited range and the difficulty of fully automating their deployment and retrieval.
In this paper we address these issues by solving the problem of the automated landing of a quadcopter on a ground vehicle moving at relatively high speed.
We present our system architecture, including the structure of our Kalman filter for the estimation of the relative position and velocity between the quadcopter and the landing pad, as well as our controller design for the full rendezvous and landing maneuvers.
The system is experimentally validated by successfully landing in multiple trials a commercial quadcopter on the roof of a car moving at speeds of up to 50 km/h.
The minimum vertex cover problem is an NP-hard problem whose aim is to find the minimum number of vertices that cover a graph.
In this paper, a learning-automaton-based algorithm is proposed to find a minimum vertex cover of a graph.
In the proposed algorithm, each vertex of the graph is equipped with a learning automaton whose two actions place the corresponding vertex in, or out of, the candidate vertex cover set.
Owing to the characteristics of learning automata, this algorithm significantly reduces the number of vertices needed to cover the graph.
The proposed algorithm iteratively shrinks the candidate vertex cover by updating the automata's action probabilities.
As the algorithm proceeds, the candidate solution approaches an optimal solution of the minimum vertex cover problem.
To evaluate the proposed algorithm, several experiments were conducted on the DIMACS dataset and compared to conventional methods.
Experimental results show a clear superiority of the proposed algorithm over the other methods.
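A heavily simplified sketch of the vertex-cover automaton idea (an L_R-I-flavoured toy, not the paper's exact update scheme; the graph, learning rate, and reward rule are assumptions):

```python
import random

def la_vertex_cover(edges, n, iters=2000, lr=0.1, seed=0):
    """Toy learning-automaton search: each vertex keeps p = P(join cover)."""
    random.seed(seed)
    p = [0.5] * n
    best = set(range(n))                 # trivial cover: all vertices
    for _ in range(iters):
        cand = {v for v in range(n) if random.random() < p[v]}
        is_cover = all(u in cand or v in cand for u, v in edges)
        if is_cover and len(cand) < len(best):
            best = set(cand)
            # Reward the improving configuration (linear reward-inaction)
            for v in range(n):
                p[v] += lr * (1.0 - p[v]) if v in cand else -lr * p[v]
    return best

# Star graph: center 0 and leaves 1..4; the minimum cover is {0}
edges = [(0, i) for i in range(1, 5)]
cover = la_vertex_cover(edges, 5)
```

Each automaton's action probability drifts toward the configurations that yielded smaller valid covers, which is the probability-update mechanism the abstract relies on.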
Optimizing for long term value is desirable in many practical applications, e.g., recommender systems.
The most common approach for long term value optimization is supervised learning using long term value as the target.
Unfortunately, long term metrics take a long time to measure (e.g., will customers finish reading an ebook?), and vanilla forecasters cannot learn from examples until the outcome is observed.
In practical systems where new items arrive frequently, such delay can increase the training-serving skew, thereby negatively affecting the model's predictions for new products.
We argue that intermediate observations (e.g., if customers read a third of the book in 24 hours) can improve a model's predictions.
We formalize the problem as a semi-stochastic model, where instances are selected by an adversary but, given an instance, the intermediate observation and the outcome are sampled from a factored joint distribution.
We propose an algorithm that exploits intermediate observations and theoretically quantify how much it can outperform any prediction method that ignores the intermediate observations.
Motivated by the theoretical analysis, we propose two neural network architectures: Factored Forecaster (FF) which is ideal if our assumptions are satisfied, and Residual Factored Forecaster (RFF) that is more robust to model mis-specification.
Experiments on two real world datasets, a dataset derived from GitHub repositories and another dataset from a popular marketplace, show that RFF outperforms both FF as well as an algorithm that ignores intermediate observations.
Impulsive dynamical systems are a well-established area of dynamical systems theory, which is used in this work to analyze several basic properties of reset control systems: existence and uniqueness of solutions, and continuous dependence on the initial condition (well-posedness).
The work scope is about reset control systems with a linear and time-invariant base system, and a zero-crossing resetting law.
A necessary and sufficient condition for existence and uniqueness of solutions, based on the well-posedness of reset instants, is developed.
As a result, it is shown that reset control systems (with strictly proper plants) do not have Zeno solutions.
It is also shown that full reset and partial reset (with a special structure) always produce well-posed reset instants.
Moreover, a definition of continuous dependence on the initial condition is developed, and also a sufficient condition for reset control systems to satisfy that property.
Finally, this property is used to analyze sensitivity of reset control systems to sensor noise.
This work also includes a number of illustrative examples motivating the key concepts and main results.
Leveraging large historical data in electronic health record (EHR), we developed Doctor AI, a generic predictive model that covers observed medical conditions and medication uses.
Doctor AI is a temporal model using recurrent neural networks (RNN) and was developed and applied to longitudinal time stamped EHR data from 260K patients over 8 years.
Encounter records (e.g. diagnosis codes, medication codes or procedure codes) were input to RNN to predict (all) the diagnosis and medication categories for a subsequent visit.
Doctor AI assesses the history of patients to make multilabel predictions (one label for each diagnosis or medication category).
Based on separate blind test set evaluation, Doctor AI can perform differential diagnosis with up to 79% recall@30, significantly higher than several baselines.
Moreover, we demonstrate great generalizability of Doctor AI by adapting the resulting models from one institution to another without losing substantial accuracy.
The predictive processing (PP) hypothesizes that the predictive inference of our sensorimotor system is encoded implicitly in the regularities between perception and action.
We propose a neural architecture in which such regularities of active inference are encoded hierarchically.
We further suggest that this encoding emerges during the embodied learning process when the appropriate action is selected to minimize the prediction error in perception.
Therefore, this predictive stream in the sensorimotor loop is generated in a top-down manner.
Specifically, it is constantly modulated by the motor actions and is updated by the bottom-up prediction error signals.
In this way, the top-down prediction originally comes from the prior experience from both perception and action representing the higher levels of this hierarchical cognition.
In our proposed embodied model, we extend the PredNet Network, a hierarchical predictive coding network, with the motor action units implemented by a multi-layer perceptron network (MLP) to modulate the network top-down prediction.
Two experiments, a minimalistic-world experiment and a mobile robot experiment, are conducted to evaluate the proposed model qualitatively.
In the neural representation, the causal inference of the predictive percept from motor actions can also be observed while the agent interacts with the environment.
Lifelogging cameras capture everyday life from a first-person perspective, but generate so much data that it is hard for users to browse and organize their image collections effectively.
In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections.
We develop and explore novel techniques based on deep learning to generate captions for both individual images and image streams, using temporal consistency constraints to create summaries that are both more compact and less noisy.
We evaluate our techniques with quantitative and qualitative results, and apply captioning to an image retrieval application for finding potentially private images.
Our results suggest that our automatic captioning algorithms, while imperfect, may work well enough to help users manage lifelogging photo collections.
Software is a field of rapid changes: the best technology today becomes obsolete in the near future.
If we review the graduate attributes of any of the software engineering programs across the world, life-long learning is one of them.
The social and psychological aspects of professional development are linked with rewards.
In organizations, where people are provided with learning opportunities and there is a culture that rewards learning, people embrace changes easily.
However, the software industry tends to be short-sighted and its primary focus is more on current project success; it usually ignores the capacity building of the individual or team.
It is hoped that our software engineering colleagues will be motivated to conduct more research into the area of software psychology so as to understand more completely the possibilities for increased effectiveness and personal fulfillment among software engineers working alone and in teams.
In meetings where important decisions get made, what items receive more attention may influence the outcome.
We examine how different types of rhetorical (de-)emphasis -- including hedges, superlatives, and contrastive conjunctions -- correlate with what gets revisited later, controlling for item frequency and speaker.
Our data consists of transcripts of recurring meetings of the Federal Reserve's Open Market Committee (FOMC), where important aspects of U.S. monetary policy are decided on.
Surprisingly, we find that words appearing in the context of hedging, which is usually considered a way to express uncertainty, are more likely to be repeated in subsequent meetings, while strong emphasis indicated by superlatives has a slightly negative effect on word recurrence in subsequent meetings.
We also observe interesting patterns in how these effects vary depending on social factors such as status and gender of the speaker.
For instance, the positive effects of hedging are more pronounced for female speakers than for male speakers.
The data underlying scientific papers should be accessible to researchers both now and in the future, but how best can we ensure that these data are available?
Here we examine the effectiveness of four approaches to data archiving: no stated archiving policy, recommending (but not requiring) archiving, and two versions of mandating data deposition at acceptance.
We control for differences between data types by trying to obtain data from papers that use a single, widespread population genetic analysis, STRUCTURE.
At one extreme, we found that mandated data archiving policies that require the inclusion of a data availability statement in the manuscript improve the odds of finding the data online almost a thousand-fold compared to having no policy.
However, archiving rates at journals with less stringent policies were only very slightly higher than those with no policy at all.
We also assessed the effectiveness of asking for data directly from authors and obtained over half of the requested datasets, albeit with about 8 days delay and some disagreement with authors.
Given the long term benefits of data accessibility to the academic community, we believe that journal based mandatory data archiving policies and mandatory data availability statements should be more widely adopted.
The Python--elsA user interface of the elsA cfd (Computational Fluid Dynamics) software has been developed to allow users to specify simulations with confidence, through a global context of description objects grouped inside scripts.
The software main features are generated documentation, context checking and completion, and helpful error management.
Further developments have used this foundation as a coupling framework, allowing (thanks to the descriptive approach) the coupling of external algorithms with the cfd solver in a simple and abstract way, leading to more success in complex simulations.
Along with the description of the technical part of the interface, we try to gather the salient points pertaining to the psychological viewpoint of user experience (ux).
We point out the differences between user interfaces and pure data management systems such as cgns.
Various moral conundrums plague population ethics: The Non-Identity Problem, The Procreation Asymmetry, The Repugnant Conclusion, and more.
I argue that the aforementioned moral conundrums have a structure neatly accounted for, and solved by, some ideas in computability theory.
I introduce a mathematical model based on computability theory and show how previous arguments pertaining to these conundrums fit into the model.
This paper proceeds as follows.
First, I do a very brief survey of the history of computability theory in moral philosophy.
Second, I follow various papers, and show how their arguments fit into, or don't fit into, our model.
Third, I discuss the implications of our model to the question why the human race should or should not continue to exist.
Finally, I show that our model ineluctably leads us to a Confucian moral principle.
We evaluated the effectiveness of an automated bird sound identification system in a situation that emulates a realistic, typical application.
We trained classification algorithms on a crowd-sourced collection of bird audio recording data and restricted our training methods to be completely free of manual intervention.
The approach is hence directly applicable to the analysis of multiple species collections, with labelling provided by crowd-sourced collection.
We evaluated the performance of the bird sound recognition system on a realistic number of candidate classes, corresponding to real conditions.
We investigated the use of two canonical classification methods, chosen due to their widespread use and ease of interpretation, namely a k-Nearest Neighbour (kNN) classifier with histogram-based features and a Support Vector Machine (SVM) with time-summarisation features.
We further investigated the use of a certainty measure, derived from the output probabilities of the classifiers, to enhance the interpretability and reliability of the class decisions.
Our results demonstrate that both identification methods achieved similar performance, but we argue that the use of the kNN classifier offers somewhat more flexibility.
Furthermore, we show that employing an outcome certainty measure provides a valuable and consistent indicator of the reliability of classification results.
Our use of generic training data and our investigation of probabilistic classification methodologies that can flexibly address the variable number of candidate species/classes that are expected to be encountered in the field, directly contribute to the development of a practical bird sound identification system with potentially global application.
Further, we show that certainty measures associated with identification outcomes can significantly contribute to the practical usability of the overall system.
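The kNN-plus-certainty idea can be sketched with a vote-fraction certainty measure (toy 2-D clusters stand in for audio features; the paper's histogram features and exact certainty derivation are not reproduced here):

```python
import numpy as np

def knn_with_certainty(X_train, y_train, x, k=5):
    """k-NN class decision plus a certainty score (winning vote fraction)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    classes, votes = np.unique(nearest, return_counts=True)
    i = np.argmax(votes)
    return classes[i], votes[i] / k        # certainty in (0, 1]

# Two well-separated toy "species" clusters in feature space
rng = np.random.default_rng(1)
X0 = rng.normal([0.0, 0.0], 0.3, (20, 2))
X1 = rng.normal([3.0, 3.0], 0.3, (20, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 20 + [1] * 20)

label_a, cert_a = knn_with_certainty(X, y, np.array([0.1, 0.0]))  # deep in class 0
label_b, cert_b = knn_with_certainty(X, y, np.array([1.5, 1.5]))  # between clusters
```

A query deep inside one cluster gets a unanimous vote (certainty 1.0), while an ambiguous query between clusters gets a lower score, giving the kind of reliability indicator the abstract advocates.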
We address the problem of activity detection in continuous, untrimmed video streams.
This is a difficult task that requires extracting meaningful spatio-temporal features to capture activities, accurately localizing the start and end times of each activity.
We introduce a new model, Region Convolutional 3D Network (R-C3D), which encodes the video streams using a three-dimensional fully convolutional network, then generates candidate temporal regions containing activities, and finally classifies selected regions into specific activities.
Computation is saved due to the sharing of convolutional features between the proposal and the classification pipelines.
The entire model is trained end-to-end with jointly optimized localization and classification losses.
R-C3D is faster than existing methods (569 frames per second on a single Titan X Maxwell GPU) and achieves state-of-the-art results on THUMOS'14.
We further demonstrate that our model is a general activity detection framework that does not rely on assumptions about particular dataset properties by evaluating our approach on ActivityNet and Charades.
Our code is available at http://ai.bu.edu/r-c3d/.
We propose fast probabilistic algorithms with low (i.e., sublinear in the input size) communication volume to check the correctness of operations in Big Data processing frameworks and distributed databases.
Our checkers cover many of the commonly used operations, including sum, average, median, and minimum aggregation, as well as sorting, union, merge, and zip.
An experimental evaluation of our implementation in Thrill (Bingmann et al., 2016) confirms the low overhead and high failure detection rate predicted by theoretical analysis.
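The flavour of such sublinear-communication checkers can be shown with a classic polynomial multiset fingerprint, here verifying a sorting operation (an illustration of the general technique, not Thrill's implementation):

```python
import random

P = (1 << 61) - 1      # large prime for fingerprint arithmetic

def fingerprint(values, r):
    """Multiset fingerprint: product of (r - v) modulo P."""
    fp = 1
    for v in values:
        fp = fp * (r - v) % P
    return fp

def check_sorted(inp, out):
    """Probabilistically verify that out is a sorted permutation of inp."""
    if any(out[i] > out[i + 1] for i in range(len(out) - 1)):
        return False
    r = random.randrange(P)             # both sides use the same random point
    return fingerprint(inp, r) == fingerprint(out, r)

data = [5, 3, 9, 1, 3]
ok = check_sorted(data, sorted(data))
bad = check_sorted(data, [1, 3, 3, 5, 8])   # sorted, but wrong multiset
```

In a distributed setting each worker would ship only its constant-size partial fingerprint, which is why the communication volume stays sublinear in the input; a wrong output slips through with probability at most about n/P.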
Nowadays, the impact factor is a significant indicator for journal evaluation.
The impact factor calculation uses the number of all citations to a journal regardless of the prestige of the citing journals; however, scientific units (papers, researchers, journals, or scientific organizations) cited by journals with a high impact factor or by researchers with a high Hirsch index are more important than objects cited by journals without an impact factor or by unknown researchers.
In this paper, a weighted impact factor is proposed to obtain more accurate rankings for journals, considering not only the quantity of citations but also the quality of the citing journals.
Correlation coefficients among different indicators for journal evaluation were analysed: impact factors by Thomson Scientific, weighted impact factors proposed by different researchers, averages and medians of the impact factors of all citing journals, and 5-year impact factors.
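The weighted variant can be stated in a few lines (the journal names and numbers below are hypothetical):

```python
def impact_factors(citations, citable_items):
    """Plain IF counts every citation equally; the weighted variant scales
    each citation by the citing journal's own impact factor."""
    plain = sum(c for c, _ in citations.values()) / citable_items
    weighted = sum(c * jif for c, jif in citations.values()) / citable_items
    return plain, weighted

# Hypothetical journal with 100 citable items, cited by three journals:
# {citing journal: (citation count, citing journal's impact factor)}
cites = {"J_high": (40, 5.0), "J_mid": (40, 1.0), "J_none": (20, 0.0)}
plain, weighted = impact_factors(cites, 100)
```

Every citation contributes equally to the plain IF, but the weighted score rewards the citations coming from the high-IF journal, which is exactly the quality-of-citing-journal effect argued for above.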
We present the first generative adversarial network (GAN) for natural image matting.
Our novel generator network is trained to predict visually appealing alphas with the addition of the adversarial loss from the discriminator that is trained to classify well-composited images.
Further, we improve existing encoder-decoder architectures to better deal with the spatial localization issues inherited in convolutional neural networks (CNN) by using dilated convolutions to capture global context information without downscaling feature maps and losing spatial information.
We present state-of-the-art results on the alphamatting online benchmark for the gradient error and give comparable results in others.
Our method is particularly well suited for fine structures like hair, which is of great importance in practical matting applications, e.g. in film/TV production.
In the following paper we present a new semantics for the well-known strategic logic ATL.
It is based on adding roles to concurrent game structures; that is, at every state, each agent belongs to exactly one role, and the role specifies which actions are available to the agent at that state.
We show advantages of the new semantics, provide motivating examples based on sensor networks, and analyze model checking complexity.
Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms.
Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible.
In this paper, we show that modern machine learning techniques can generate highly-efficient policies automatically, and present Decima, a system that does so.
Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time.
Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem.
To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals.
Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.
Convolutional Neural Networks, known as ConvNets, perform exceptionally well in many complex machine learning tasks.
The architecture of ConvNets demands a huge and rich amount of data and involves a vast number of parameters, which makes learning computationally expensive, slows convergence towards the global minimum, and risks trapping the model in local minima with poor predictions.
In some cases, the architecture overfits the data, making it difficult to generalise to new samples that were not in the training set.
To address these limitations, many regularization and optimization strategies have been developed over the past few years.
Studies have also suggested that these techniques significantly increase the performance of networks while reducing computational cost.
To implement these techniques, one must thoroughly understand how they increase the expressive power of the networks.
This article is intended to provide the theoretical concepts and mathematical formulation of the most commonly used strategies in developing a ConvNet architecture.
Typical spoken language understanding systems provide narrow semantic parses using a domain-specific ontology.
The parses contain intents and slots that are directly consumed by downstream domain applications.
In this work we discuss expanding such systems to handle compound entities and intents by introducing a domain-agnostic shallow parser that handles linguistic coordination.
We show that our model for parsing coordination learns domain-independent and slot-independent features and is able to segment conjunct boundaries of many different phrasal categories.
We also show that using adversarial training can be effective for improving generalization across different slot types for coordination parsing.
Deconvolutional layers have been widely used in a variety of deep models for up-sampling, including encoder-decoder networks for semantic segmentation and deep generative models for unsupervised learning.
One of the key limitations of deconvolutional operations is that they result in the so-called checkerboard problem.
This is caused by the fact that no direct relationship exists among adjacent pixels on the output feature map.
To address this problem, we propose the pixel deconvolutional layer (PixelDCL) to establish direct relationships among adjacent pixels on the up-sampled feature map.
Our method is based on a fresh interpretation of the regular deconvolution operation.
The resulting PixelDCL can be used to replace any deconvolutional layer in a plug-and-play manner without compromising the fully trainable capabilities of original models.
The proposed PixelDCL may result in a slight decrease in efficiency, but this can be overcome by an implementation trick.
Experimental results on semantic segmentation demonstrate that PixelDCL can consider spatial features such as edges and shapes and yields more accurate segmentation outputs than deconvolutional layers.
When used in image generation tasks, our PixelDCL can largely overcome the checkerboard problem suffered by regular deconvolution operations.
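The root cause PixelDCL targets is easy to reproduce: counting how many kernel taps reach each output position of a 1-D transposed convolution exposes the uneven overlap behind the checkerboard pattern (this demonstrates the problem, not PixelDCL itself):

```python
import numpy as np

def deconv_coverage(in_len=8, kernel=3, stride=2):
    """Count kernel taps contributing to each output position of a 1-D
    transposed convolution; uneven interior counts cause checkerboarding."""
    out = np.zeros(in_len * stride + kernel - stride)
    for i in range(in_len):
        out[i * stride : i * stride + kernel] += 1
    return out

uneven = deconv_coverage(kernel=3)   # interior alternates 2, 1, 2, 1, ...
even = deconv_coverage(kernel=4)     # kernel divisible by stride: uniform
```

With kernel size 3 and stride 2, interior positions alternate between one and two contributions, so adjacent output pixels are produced by unrelated sets of weights; a kernel size divisible by the stride makes the coverage uniform.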
Machine learning applications in medical imaging are frequently limited by the lack of quality labeled data.
In this paper, we explore the self training method, a form of semi-supervised learning, to address the labeling burden.
By integrating reinforcement learning, we were able to expand the application of self training to complex segmentation networks without any further human annotation.
The proposed approach, reinforced self training (ReST), fine-tunes a semantic segmentation network by introducing a policy network that learns to generate pseudolabels.
We incorporate an expert demonstration network, based on inverse reinforcement learning, to enhance clinical validity and convergence of the policy network.
The model was tested on a pulmonary nodule segmentation task in chest X-rays and achieved the performance of a standard U-Net while using only 50% of the labeled data, by exploiting unlabeled data.
When the same number of labeled data was used, a moderate to significant cross validation accuracy improvement was achieved depending on the absolute number of labels used.
This paper presents the Axon AI's solution to the 2nd YouTube-8M Video Understanding Challenge, achieving the final global average precision (GAP) of 88.733% on the private test set (ranked 3rd among 394 teams, not considering the model size constraint), and 87.287% using a model that meets size requirement.
Two sets of 7 individual models belonging to 3 different families were trained separately.
Then, the inference results on a training data were aggregated from these multiple models and fed to train a compact model that meets the model size requirement.
In order to further improve performance we explored and employed data over/sub-sampling in feature space, an additional regularization term during training exploiting label relationship, and learned weights for ensembling different individual models.
A consistent query answer in an inconsistent database is an answer obtained in every (minimal) repair.
The repairs are obtained by resolving all conflicts in all possible ways.
Often, however, the user is able to provide a preference on how conflicts should be resolved.
We investigate here the framework of preferred consistent query answers, in which user preferences are used to narrow down the set of repairs to a set of preferred repairs.
We axiomatize desirable properties of preferred repairs.
We present three different families of preferred repairs and study their mutual relationships.
Finally, we investigate the complexity of preferred repairing and computing preferred consistent query answers.
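The repair semantics described above can be illustrated on a toy relation with one key violation; the brute-force enumeration below is a minimal sketch of the definitions, not the paper's preferred-repair algorithms:

```python
# Illustrative sketch: a repair keeps exactly one fact from every
# conflicting pair; a consistent answer is one obtained in EVERY repair.
from itertools import product

def repairs(conflicts, facts):
    """conflicts: pairs of facts that cannot co-exist; enumerate all
    ways of resolving them (one kept fact per pair)."""
    fixed = facts - {f for pair in conflicts for f in pair}
    return [fixed | set(choice) for choice in product(*conflicts)]

facts = {("emp", "ann", 30), ("emp", "ann", 31), ("emp", "bob", 25)}
conflicts = [(("emp", "ann", 30), ("emp", "ann", 31))]  # key violation
reps = repairs(conflicts, facts)

# Query "which names appear?": intersect the answers over all repairs.
consistent = set.intersection(*[{f[1] for f in r} for r in reps])
```

A user preference (e.g. "trust the more recent age") would simply discard some of these repairs before the intersection is taken, which is the narrowing-down the abstract refers to.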
We address the problem of deploying a reinforcement learning (RL) agent on a physical system such as a datacenter cooling unit or robot, where critical constraints must never be violated.
We show how to exploit the typically smooth dynamics of these systems and enable RL algorithms to never violate constraints during learning.
Our technique is to directly add to the policy a safety layer that analytically solves an action correction formulation per each state.
The elegant closed-form solution is made possible by a linearized constraint model, learned from past trajectories consisting of arbitrary actions.
This is to mimic the real-world circumstances where data logs were generated with a behavior policy that is implausible to describe mathematically; such cases render the known safety-aware off-policy methods inapplicable.
We demonstrate the efficacy of our approach on new representative physics-based environments, and prevail where reward shaping fails by maintaining zero constraint violations.
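Under a linearized model, the per-state action correction admits a closed form of the kind described above. The sketch below assumes a single constraint of the form c(s) + g(s)·a ≤ C; the names and numbers are illustrative, not the paper's environments:

```python
# Minimal sketch of a safety-layer correction for ONE linearized
# constraint c + g.a <= C: project the policy's action onto the
# feasible half-space, with the multiplier available analytically.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def safe_action(a, g, c, limit):
    """a* = a - lam * g, lam = max(0, (c + g.a - limit) / (g.g))."""
    lam = max(0.0, (c + dot(g, a) - limit) / dot(g, g))
    return [ai - lam * gi for ai, gi in zip(a, g)]

a = [2.0, 1.0]   # raw action proposed by the RL policy (made up)
g = [1.0, 0.0]   # learned linear sensitivity of the constraint to a
corrected = safe_action(a, g, c=0.5, limit=1.0)  # 0.5 + 2.0 > 1.0: unsafe
```

After the correction, c + g·a* equals the limit exactly, and actions that were already safe pass through unchanged.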
Building on the Ethernet Passive Optical Network (EPON) and Gigabit PON (GPON) standards, Next-Generation (NG) PONs (i) provide increased data rates, split ratios, wavelength counts, and fiber lengths, as well as (ii) allow for all-optical integration of access and metro networks.
In this paper we provide a comprehensive probabilistic analysis of the capacity (maximum mean packet throughput) and packet delay of subnetworks that can be used to form NG-PONs.
Our analysis can cover a wide range of NG-PONs by taking the minimum capacity over the subnetworks making up the NG-PON and weighting the packet delays of the subnetworks.
Our numerical and simulation results indicate that our analysis quite accurately characterizes the throughput-delay performance of EPON/GPON tree networks, including networks upgraded with higher data rates and wavelength counts.
Our analysis also characterizes the trade-offs and bottlenecks when integrating EPON/GPON tree networks across a metro area with a ring, a Passive Star Coupler (PSC), or an Arrayed Waveguide Grating (AWG) for uniform and non-uniform traffic.
To the best of our knowledge, the presented analysis is the first to consider multiple PONs interconnected via a metro network.
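The composition rule just described is simple enough to state in a few lines; the function below is a hedged sketch of that rule only, with made-up figures, not the probabilistic analysis itself:

```python
# Sketch of the composition rule: the NG-PON's capacity is the minimum
# subnetwork capacity, and its mean packet delay is a traffic-weighted
# average of the per-subnetwork delays.

def ngpon_capacity(subnet_capacities):
    return min(subnet_capacities)

def ngpon_delay(subnet_delays, traffic_shares):
    total = sum(traffic_shares)
    return sum(d * w / total for d, w in zip(subnet_delays, traffic_shares))

cap = ngpon_capacity([10.0, 2.5, 5.0])       # Gb/s, illustrative
delay = ngpon_delay([1.0, 4.0], [3.0, 1.0])  # ms, illustrative
```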
We consider contractive systems whose trajectories evolve on a compact and convex state-space.
It is well-known that if the time-varying vector field of the system is periodic then the system admits a unique globally asymptotically stable periodic solution.
Obtaining explicit information on this periodic solution and its dependence on various parameters is important both theoretically and in numerous applications.
We develop an approach for approximating such a periodic trajectory using the periodic trajectory of a simpler system (e.g. an LTI system).
Our approximation includes an error bound that is based on the input-to-state stability property of contractive systems.
We show that in some cases this error bound can be computed explicitly.
We also use the bound to derive a new theoretical result, namely, that a contractive system with an additive periodic input behaves like a low pass filter.
We demonstrate our results using several examples from systems biology.
We provide code that produces beautiful poetry.
Our sonnet-generation algorithm includes several novel elements that improve over the state-of-the-art, leading to rhythmic and inspiring poems.
The work discussed here is the winner of the 2018 PoetiX Literary Turing Test Award for computer-generated poetry.
Non-availability of reliable and sustainable electric power is a major problem in the developing world.
Renewable energy sources like solar are not yet very attractive due to various uncertainties such as weather, storage, and land use.
There also exist various other issues, such as mis-commitment of power, the absence of intelligent fault analysis, and congestion.
In this paper, we propose a novel deep learning-based system for predicting faults and selecting power generators optimally so as to reduce costs and ensure higher reliability in solar power systems.
The results are highly encouraging and they suggest that the approaches proposed in this paper have the potential to be applied successfully in the developing world.
The goal of this paper is to analyze the behavior and intent of recent types of privacy invasive Android adware.
There are two recent trends in this area: more financial motives instead of ego motives, and the development of more dynamic analysis tools.
This paper starts with a review of Android mobile operating system security, and also addresses the pros and cons of open source operating system security.
Static analysis of malware provides high quality results and leads to a good understanding as shown in this paper.
However, as malware grows in number and complexity, there have been recent efforts to automate the detection mechanisms and many of the static tasks.
As Android's market share rapidly grows around the world, Android security will be a crucial area of research for IT security professionals and their academic counterparts.
The upside of the current situation is that malware is being quickly exposed, thanks to open source software development tools.
This cooperation is important in curbing the widespread theft of personal information with monetary value.
With the emergence of the Hospital Readmission Reduction Program of the Center for Medicare and Medicaid Services on October 1, 2012, forecasting unplanned patient readmission risk became crucial to the healthcare domain.
There is substantial work in the literature on developing readmission risk prediction models; however, these models are not accurate enough to be deployed in an actual clinical setting.
Our study considers patient readmission risk as the objective for optimization and develops a useful risk prediction model to address unplanned readmissions.
Furthermore, a Genetic Algorithm and a Greedy Ensemble are used to optimize the constraints of the developed model.
With the Internet of Things (IoT) becoming a major component of our daily life, understanding how to improve the quality of service (QoS) for IoT applications through fog computing is becoming an important problem.
In this paper, we introduce a general framework for IoT-fog-cloud applications, and propose a delay-minimizing collaboration and offloading policy for fog-capable devices that aims to reduce the service delay for IoT applications.
We then develop an analytical model to evaluate our policy and show how the proposed framework helps to reduce IoT service delay.
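A delay-minimizing offloading policy of the flavor described above can be sketched as a threshold rule; the decision structure, thresholds, and delay figures below are all assumptions made for illustration, not the paper's exact policy:

```python
# Illustrative sketch: a fog node serves a request itself when its
# estimated waiting time is acceptable, and otherwise offloads to the
# best fog neighbor or, failing that, to the cloud.

def choose_target(local_wait, neighbor_wait, link_delay,
                  cloud_delay, threshold):
    if local_wait <= threshold:
        return "local"
    offload = neighbor_wait + link_delay
    return "neighbor" if offload < cloud_delay else "cloud"

d_light = choose_target(2.0, 5.0, 1.0, 20.0, threshold=3.0)
d_busy = choose_target(9.0, 4.0, 1.0, 20.0, threshold=3.0)
d_overloaded = choose_target(9.0, 30.0, 5.0, 20.0, threshold=3.0)
```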
Hazy images are common in real scenarios and many dehazing methods have been developed to automatically remove the haze from images.
Typically, the goal of image dehazing is to produce clearer images from which human vision can better identify the object and structural details present in the images.
When the ground-truth haze-free image is available for a hazy image, quantitative evaluation of image dehazing is usually based on objective metrics, such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM).
However, in many applications, large-scale image collections are gathered not for visual examination by humans.
Instead, they are used for many high-level vision tasks, such as automatic classification, recognition and categorization.
One fundamental problem here is whether various dehazing methods can produce clearer images that can help improve the performance of the high-level tasks.
In this paper, we empirically study this problem in the important task of image classification by using both synthetic and real hazy image datasets.
From the experimental results, we find that existing image-dehazing methods do not substantially improve image-classification performance and sometimes even reduce it.
We investigate underwater acoustic (UWA) channel equalization and introduce hierarchical and adaptive nonlinear channel equalization algorithms that are highly efficient and provide significantly improved bit error rate (BER) performance.
Due to the high complexity of nonlinear equalizers and the poor performance of linear ones on highly difficult underwater acoustic channels, we employ piecewise linear equalizers.
In order to achieve the performance of the best piecewise linear model, we use a tree structure to hierarchically partition the space of the received signal.
Furthermore, the equalization algorithm should be completely adaptive, since due to the highly non-stationary nature of the underwater medium, the optimal MSE equalizer as well as the best piecewise linear equalizer changes in time.
To this end, we introduce an adaptive piecewise linear equalization algorithm that not only adapts the linear equalizer at each region but also learns the complete hierarchical structure with a computational complexity only polynomial in the number of nodes of the tree.
Furthermore, our algorithm is constructed to directly minimize the final squared error without introducing any ad-hoc parameters.
We demonstrate the performance of our algorithms through highly realistic experiments performed on accurately simulated underwater acoustic channels.
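The region-wise adaptation just described can be sketched with a depth-1 "tree": split the received-signal space in two and run an LMS-adapted linear filter per region. This is a toy stand-in (fixed partition, identity channel), whereas the paper's algorithm also learns the hierarchical partition itself:

```python
# Toy sketch of piecewise linear equalization with per-region LMS.

def lms_piecewise(received, desired, taps=2, mu=0.05):
    w = {False: [0.0] * taps, True: [0.0] * taps}  # one filter per region
    errs = []
    for i in range(taps - 1, len(received)):
        x = received[i - taps + 1:i + 1]
        region = x[-1] >= 0.0                      # crude fixed partition
        y = sum(wi * xi for wi, xi in zip(w[region], x))
        e = desired[i] - y
        w[region] = [wi + mu * e * xi for wi, xi in zip(w[region], x)]
        errs.append(e * e)
    return errs

# Identity "channel": the filters should learn to pass the input through.
rx = [1.0, -1.0, 1.0, 1.0, -1.0] * 40
err = lms_piecewise(rx, rx)
```

The squared error decays as each region's filter converges, which is the behavior the adaptive algorithm generalizes to learned partitions.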
The ever increasing adoption of mobile devices with limited energy storage capacity, on the one hand, and more awareness of the environmental impact of massive data centres and server pools, on the other hand, have both led to an increased interest in energy management algorithms.
The main contribution of this paper is to present several new constant-factor approximation algorithms for energy-aware scheduling problems where the objective is to minimize weighted completion time plus the cost of the energy consumed, in the one-machine non-preemptive setting, while allowing release dates and deadlines. Unlike previously known algorithms, these new algorithms can handle general job-dependent energy cost functions, extending their application to settings outside the typical CPU-energy one.
These new settings include problems where in addition, or instead, of energy costs we also have maintenance costs, wear and tear, replacement costs, etc., which in general depend on the speed at which the machine runs but also depend on the types of jobs processed.
Our algorithms also extend to approximating weighted tardiness plus energy cost, an inherently more difficult problem that has not been addressed in the literature.
The Internet is a ubiquitous and affordable communications network suited for e-commerce and medical image communications.
Security has become a major issue, as data communication channels can be intercepted during transmission.
Although different methods have been proposed and used to protect data from illegal and unauthorized access, code breakers have come up with various ways to crack them.
DNA-based cryptography brings new hope for unbreakable algorithms.
This paper outlines an encryption scheme with DNA technology and JPEG Zigzag Coding for Secure Transmission of Images.
Kinetic approaches, i.e., methods based on the lattice Boltzmann equations, have long been recognized as an appealing alternative for solving incompressible Navier-Stokes equations in computational fluid dynamics.
However, such approaches have not been widely adopted in graphics mainly due to the underlying inaccuracy, instability and inflexibility.
In this paper, we try to tackle these problems in order to make kinetic approaches practical for graphical applications.
To achieve more accurate and stable simulations, we propose to employ the non-orthogonal central-moment-relaxation model, where we develop a novel adaptive relaxation method to retain both stability and accuracy in turbulent flows.
To achieve flexibility, we propose a novel continuous-scale formulation that enables samples at arbitrary resolutions to easily communicate with each other in a more continuous sense and with loose geometrical constraints, which allows efficient and adaptive sample construction to better match the physical scale.
Such a capability directly leads to an automatic sample construction which generates static and dynamic scales at initialization and during simulation, respectively.
This effectively makes our method suitable for simulating turbulent flows with arbitrary geometrical boundaries.
Our simulation results with applications to smoke animations show the benefits of our method, with comparisons for justification and verification.
This paper describes how to obtain accurate 3D body models and texture of arbitrary people from a single, monocular video in which a person is moving.
Based on a parametric body model, we present a robust processing pipeline achieving 3D model fits with 5 mm accuracy, even for clothed people.
Our main contribution is a method to nonrigidly deform the silhouette cones corresponding to the dynamic human silhouettes, resulting in a visual hull in a common reference frame that enables surface reconstruction.
This enables efficient estimation of a consensus 3D shape, texture and implanted animation skeleton based on a large number of frames.
We present evaluation results for a number of test subjects and analyze overall performance.
Requiring only a smartphone or webcam, our method enables everyone to create their own fully animatable digital double, e.g., for social VR applications or virtual try-on for online fashion shopping.
In this paper, an attempt is made to solve a few problems using the Polynomial Point Collocation Method (PPCM), the Radial Point Collocation Method (RPCM), Smoothed Particle Hydrodynamics (SPH), and the Finite Point Method (FPM).
A few observations on the accuracy of these methods are recorded.
All the simulations in this paper are three dimensional linear elastostatic simulations, without accounting for body forces.
Analyzing database access logs is a key part of performance tuning, intrusion detection, benchmark development, and many other database administration tasks.
Unfortunately, it is common for production databases to handle millions of queries or more each day, so these logs must be summarized before they can be used.
Designing an appropriate summary encoding requires trading off between conciseness and information content.
For example: simple workload sampling may miss rare, but high impact queries.
In this paper, we present LogR, a lossy log-compression scheme suitable for use with many automated log-analytics tools, as well as for human inspection.
We formalize and analyze the space/fidelity trade-off in the context of a broader family of "pattern" and "pattern mixture" log encodings to which LogR belongs.
We show through a series of experiments that LogR compressed encodings can be created efficiently, come with provable information-theoretic bounds on their accuracy, and outperform state-of-art log summarization strategies.
This paper discusses the conceptual design and proof-of-concept flight demonstration of a novel variable pitch quadrotor biplane Unmanned Aerial Vehicle concept for payload delivery.
The proposed design combines vertical takeoff and landing (VTOL), precise hover capabilities of a quadrotor helicopter and high range, endurance and high forward cruise speed characteristics of a fixed wing aircraft.
The proposed UAV is designed for a mission requirement of carrying and delivering 6 kg payload to a destination at 16 km from the point of origin.
First, the design of proprotors is carried out using a physics based modified Blade Element Momentum Theory (BEMT) analysis, which is validated using experimental data generated for the purpose.
Proprotors have conflicting requirements for optimal hover and forward-flight performance.
Next, the biplane wings are designed using simple lifting line theory.
The airframe design is followed by power plant selection and transmission design.
Finally, weight estimation is carried out to complete the design process.
The proprotor, with a 24 deg preset angle and -24 deg twist, is designed with 70% weightage given to forward flight and 30% to hovering flight conditions.
The operating RPM of the proprotors is reduced from 3200 during hover to 2000 during forward flight to ensure optimal performance during cruise flight.
The estimated power consumption during forward flight mode is 64% less than that required for hover, establishing the benefit of this hybrid concept.
A proof-of-concept scaled prototype is fabricated using commercial-off-the-shelf parts.
A PID controller is developed and implemented on the PixHawk board to enable stable hovering flight and attitude tracking.
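A discrete PID loop of the kind run on autopilot boards for attitude stabilization can be sketched generically; the gains and the toy first-order plant below are illustrative, not the prototype's tuned values:

```python
# Generic discrete PID controller with an integrator and a
# finite-difference derivative term.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measured):
        err = setpoint - measured
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Toy plant: the angle integrates the commanded rate.
pid = PID(kp=2.0, ki=0.1, kd=0.05, dt=0.01)
angle = 0.0
for _ in range(1000):
    angle += pid.step(10.0, angle) * 0.01   # track a 10-degree setpoint
```

On a real board the same loop runs per axis (roll, pitch, yaw) against gyro/attitude estimates rather than this idealized plant.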
Nowadays, millimeter-wave communication centered at the 60 GHz radio frequency band is increasingly the preferred technology for near-field communication since it provides transmission bandwidth that is several GHz wide.
The IEEE 802.11ad standard has been developed for commercial wireless local area networks in the 60 GHz transmission environment.
Receivers designed to process IEEE 802.11ad waveforms employ very high rate analog-to-digital converters, and therefore, reducing the receiver sampling rate can be useful.
In this work, we study the problem of low-rate channel estimation over the IEEE 802.11ad 60 GHz communication link by harnessing sparsity in the channel impulse response.
In particular, we focus on single carrier modulation and exploit the special structure of the 802.11ad waveform embedded in the channel estimation field of its single carrier physical layer frame.
We examine various sub-Nyquist sampling methods for this problem and recover the channel using compressed sensing techniques.
Our numerical experiments show the feasibility of our procedures at up to one-seventh of the Nyquist rate with minimal performance deterioration.
This paper presents Verisig, a hybrid system approach to verifying safety properties of closed-loop systems using neural networks as controllers.
Although techniques exist for verifying input/output properties of the neural network itself, these methods cannot be used to verify properties of the closed-loop system (since they work with piecewise-linear constraints that do not capture non-linear plant dynamics).
To overcome this challenge, we focus on sigmoid-based networks and exploit the fact that the sigmoid is the solution to a quadratic differential equation, which allows us to transform the neural network into an equivalent hybrid system.
By composing the network's hybrid system with the plant's, we transform the problem into a hybrid system verification problem which can be solved using state-of-the-art reachability tools.
We show that reachability is decidable for networks with one hidden layer and decidable for general networks if Schanuel's conjecture is true.
We evaluate the applicability and scalability of Verisig in two case studies, one from reinforcement learning and one in which the neural network is used to approximate a model predictive controller.
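The transformation above rests on a simple identity: the logistic sigmoid satisfies the quadratic ODE s'(x) = s(x)(1 - s(x)). The snippet below just checks that identity numerically with a central difference:

```python
# Numerical check that d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)),
# the fact that lets a sigmoid network be recast as a hybrid system.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def deriv_numeric(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

max_gap = max(
    abs(deriv_numeric(sigmoid, x) - sigmoid(x) * (1 - sigmoid(x)))
    for x in [-3.0, -1.0, 0.0, 1.0, 3.0]
)
```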
Specialized hardware architectures promise a major step in performance and energy efficiency over the traditional load/store devices currently employed in large scale computing systems.
The adoption of high-level synthesis (HLS) from languages such as C/C++ and OpenCL has greatly increased programmer productivity when designing for such platforms.
While this has enabled a wider audience to target specialized hardware, the optimization principles known from software design are no longer sufficient to implement high-performance codes, due to fundamental differences between software and hardware architectures.
In this work, we propose a set of optimizing transformations for HLS, targeting scalable and efficient architectures for high-performance computing (HPC) applications.
We show how these can be used to efficiently exploit pipelining, on-chip distributed fast memory, and on-chip streaming dataflow, allowing for massively parallel architectures with little off-chip data movement.
To quantify the effect of our transformations, we use them to optimize a set of high-throughput FPGA kernels, demonstrating that they are sufficient to scale up parallelism within the hardware constraints of the target device.
With the transformations covered, we hope to establish a common framework for performance engineers, compiler developers, and hardware developers, to tap into the performance potential offered by specialized hardware architectures using HLS.
The polarization process of polar codes over a ternary alphabet is studied.
Recently it has been shown that the blocklength of polar codes with prime alphabet size scales polynomially with respect to the inverse of the gap between code rate and channel capacity.
However, except for the binary case, the degree of the polynomial in the bound is extremely large.
In this work, it is shown that a much lower degree polynomial can be computed numerically for the ternary case.
Similar results are conjectured for the general case of prime alphabet size.
The last-mile link is often a bottleneck for the end user.
However, users typically have multiple ways of accessing the Internet (cellular, ADSL, public Wi-Fi).
This observation led to the creation of protocols like mTCP or R-MTP.
Current bandwidth-aggregation protocols are packet based.
However, this is not always practical; for example, non-TCP protocols are often blocked by firewalls.
Moreover, a lot of effort has been devoted over the years to making single-path TCP work well over various types of links.
In this paper we introduce a protocol that uses multiple TCP streams to establish a single reliable connection, attempting to maximize bandwidth and minimize latency.
We present an approach for the verification and validation (V&V) of robot assistants in the context of human-robot interactions (HRI), to demonstrate their trustworthiness through corroborative evidence of their safety and functional correctness.
Key challenges include the complex and unpredictable nature of the real world in which assistant and service robots operate, the limitations on available V&V techniques when used individually, and the consequent lack of confidence in the V&V results.
Our approach, called corroborative V&V, addresses these challenges by combining several different V&V techniques; in this paper we use formal verification (model checking), simulation-based testing, and user validation in experiments with a real robot.
We demonstrate our corroborative V&V approach through a handover task, the most critical part of a complex cooperative manufacturing scenario, for which we propose some safety and liveness requirements to verify and validate.
We construct formal models, simulations and an experimental test rig for the HRI.
To capture requirements we use temporal logic properties, assertion checkers and textual descriptions.
This combination of approaches allows V&V of the HRI task at different levels of modelling detail and thoroughness of exploration, thus overcoming the individual limitations of each technique.
Should the resulting V&V evidence present discrepancies, an iterative process between the different V&V techniques takes place until corroboration between the V&V techniques is gained from refining and improving the assets (i.e., system and requirement models) to represent the HRI task in a more truthful manner.
Therefore, corroborative V&V affords a systematic approach to 'meta-V&V,' in which different V&V techniques can be used to corroborate and check one another, increasing the level of certainty in the results of V&V.
Multispectral image analysis is a relatively promising field of research with applications in several areas, such as medical imaging and satellite monitoring.
A considerable number of current methods of analysis are based on parametric statistics.
Alternatively, some methods in Computational Intelligence are inspired by biology and other sciences.
Here we claim that Philosophy can be also considered as a source of inspiration.
This work proposes the Objective Dialectical Method (ODM): a method for classification based on the Philosophy of Praxis.
ODM is instrumental in assembling evolvable mathematical tools to analyze multispectral images.
In the case study described in this paper, multispectral images are composed of diffusion-weighted (DW) magnetic resonance (MR) images.
The results are compared to ground-truth images produced by polynomial networks using a morphological similarity index.
The classification results are used to improve the usual analysis of the apparent diffusion coefficient map.
Such results proved that gray and white matter can be distinguished in DW-MR multispectral analysis and, consequently, DW-MR images can also be used to furnish anatomical information.
We tested 14 very different classification algorithms (random forest, gradient boosting machines, SVM - linear, polynomial, and RBF - 1-hidden-layer neural nets, extreme learning machines, k-nearest neighbors and a bagging of knn, naive Bayes, learning vector quantization, elastic net logistic regression, sparse linear discriminant analysis, and a boosting of linear classifiers) on 115 real life binary datasets.
We followed the Demsar analysis and found that the three best classifiers (random forest, gbm and RBF SVM) are not significantly different from each other.
We also argue that a change of less than 0.0112 in the error rate should be considered an irrelevant change, and use a Bayesian ANOVA analysis to conclude that, with high probability, the differences between these three classifiers are of no practical consequence.
We also verified the execution time of "standard implementations" of these algorithms and concluded that RBF SVM is the fastest (significantly so) both in training time and in training plus testing time.
Most existing video summarisation methods are based on either supervised or unsupervised learning.
In this paper, we propose a reinforcement learning-based weakly supervised method that exploits easy-to-obtain, video-level category labels and encourages summaries to contain category-related information and maintain category recognisability.
Specifically, we formulate video summarisation as a sequential decision-making process and train a summarisation network with deep Q-learning (DQSN).
A companion classification network is also trained to provide rewards for training the DQSN.
With the classification network, we develop a global recognisability reward based on the classification result.
Critically, a novel dense ranking-based reward is also proposed in order to cope with the temporally delayed and sparse reward problems for long sequence reinforcement learning.
Extensive experiments on two benchmark datasets show that the proposed approach achieves state-of-the-art performance.
In this paper we present a new routing paradigm that generalizes opportunistic routing for wireless multihop networks.
In multirate anypath routing, each node uses both a set of next hops and a selected transmission rate to reach a destination.
Using this rate, a packet is broadcast to the nodes in the set and one of them forwards the packet on to the destination.
To date, there has been no theory capable of jointly optimizing both the set of next hops and the transmission rate used by each node.
We solve this by introducing two polynomial-time routing algorithms and provide the proof of their optimality.
The proposed algorithms run in roughly the same running time as regular shortest-path algorithms, and are therefore suitable for deployment in routing protocols.
We conducted measurements in an 802.11b testbed network, and our trace-driven analysis shows that multirate anypath routing performs on average 80% and up to 6.4 times better than anypath routing with a fixed rate of 11 Mbps.
If the rate is fixed at 1 Mbps instead, performance improves by up to one order of magnitude.
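The cost being optimized can be sketched as the expected transmission time of a "hyperlink": a broadcast at rate r takes time T_r and reaches at least one node of the forwarding set with probability 1 - prod(1 - p_i). The code below illustrates only this rate-selection ingredient with made-up numbers, not the paper's shortest-anypath algorithms:

```python
# Expected time for one hyperlink: tx_time / P(at least one node hears it).

def hyperlink_cost(tx_time, delivery_probs):
    miss = 1.0
    for p in delivery_probs:
        miss *= (1.0 - p)
    return tx_time / (1.0 - miss)

def best_rate(options):
    """options: {rate: (tx_time, per-neighbor delivery probs)};
    multirate anypath picks the rate with the lowest hyperlink cost."""
    return min(options, key=lambda r: hyperlink_cost(*options[r]))

options = {
    1.0:  (1.0, [0.9, 0.8]),       # slow but reliable
    11.0: (1 / 11, [0.3, 0.2]),    # fast but lossy
}
rate = best_rate(options)
```

Here the fast, lossy rate still wins because its per-transmission time is so much smaller, which is the trade-off a fixed-rate scheme cannot exploit.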
In discussions hosted on discussion forums for MOOCs, references to online learning resources are often of central importance.
They contextualize the discussion, anchoring the discussion participants' presentation of the issues and their understanding.
However, they are usually mentioned in free text, without appropriate hyperlinking to the associated resource.
Automated learning resource mention hyperlinking and categorization will facilitate discussion and searching within MOOC forums, and also benefit the contextualization of such resources across disparate views.
We propose the novel problem of learning resource mention identification in MOOC forums.
As this is a novel task with no publicly available data, we first contribute a large-scale labeled dataset, dubbed the Forum Resource Mention (FoRM) dataset, to facilitate our current research and future research on this task.
We then formulate this task as a sequence tagging problem and investigate solution architectures to address the problem.
Importantly, we identify two major challenges that hinder the application of sequence tagging models to the task: (1) the diversity of resource mention expression, and (2) long-range contextual dependencies.
We address these challenges by incorporating character-level and thread context information into an LSTM-CRF model.
First, we incorporate a character encoder to address the out-of-vocabulary problem caused by the diversity of mention expressions.
Second, to address the context dependency challenge, we encode thread contexts using an RNN-based context encoder, and apply the attention mechanism to selectively leverage useful context information during sequence tagging.
Experiments on FoRM show that the proposed method improves the baseline deep sequence tagging models notably, significantly bettering performance on instances that exemplify the two challenges.
The practice of scientific research is often thought of as individuals and small teams striving for disciplinary advances.
Yet as a whole, this endeavor more closely resembles a complex system of natural computation, in which information is obtained, generated, and disseminated more effectively than would be possible by individuals acting in isolation.
Currently, the structure of this integrated and innovative landscape of scientific ideas is not well understood.
Here we use tools from network science to map the landscape of interconnected research topics covered in the multidisciplinary journal PNAS since 2000.
We construct networks in which nodes represent topics of study and edges give the degree to which topics occur in the same papers.
The network displays small-world architecture, with dense connectivity within scientific clusters and sparse connectivity between clusters.
Notably, clusters tend not to align with assigned article classifications, but instead contain topics from various disciplines.
Using a temporal graph, we find that small-worldness has increased over time, suggesting growing efficiency and integration of ideas.
Finally, we define a novel measure of interdisciplinarity, which is positively associated with PNAS's impact factor.
Broadly, this work suggests that complex and dynamic patterns of knowledge emerge from scientific research, and that structures reflecting intellectual integration may be beneficial for obtaining scientific insight.
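The network construction described above (nodes are topics, edge weights count co-occurrence in the same papers) is straightforward to sketch; the papers and topics below are invented examples, not the PNAS corpus:

```python
# Build a weighted topic co-occurrence network from paper topic sets.
from itertools import combinations
from collections import Counter

def cooccurrence_network(papers):
    edges = Counter()
    for topics in papers:
        for a, b in combinations(sorted(set(topics)), 2):
            edges[(a, b)] += 1
    return edges

papers = [
    {"networks", "neuroscience"},
    {"networks", "statistics"},
    {"networks", "neuroscience", "statistics"},
]
net = cooccurrence_network(papers)
```

Graph statistics such as clustering coefficients and path lengths, computed on networks built this way, are what underpin the small-world observations in the text.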
This paper introduces a method, based on deep reinforcement learning, for automatically generating a general purpose decision making function.
A Deep Q-Network agent was trained in a simulated environment to handle speed and lane change decisions for a truck-trailer combination.
In a highway driving case, it is shown that the method produced an agent that matched or surpassed the performance of a commonly used reference model.
To demonstrate the generality of the method, the exact same algorithm was also tested by training it for an overtaking case on a road with oncoming traffic.
Furthermore, a novel way of applying a convolutional neural network to high level input that represents interchangeable objects is also introduced.
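The update rule underlying a Deep Q-Network can be illustrated in its simplest tabular form (the paper replaces the table with a deep network, but the Bellman target is the same); the states, actions, and reward below are hypothetical driving-style examples.

```python
# Minimal tabular illustration of the Q-learning target used by a DQN.
# The paper uses a neural network as the function approximator; here a
# dictionary stands in for it. All values are hypothetical.
from collections import defaultdict

alpha, gamma = 0.1, 0.9          # learning rate, discount factor
Q = defaultdict(float)           # Q[(state, action)]

def q_update(s, a, r, s_next, actions):
    """One Q-learning step: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

actions = ["keep_lane", "change_lane"]
q_update("highway", "change_lane", r=1.0, s_next="highway", actions=actions)
print(round(Q[("highway", "change_lane")], 3))  # 0.1 after one update
```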
Missing data has a ubiquitous presence in real-life applications of machine learning techniques.
Imputation methods are algorithms conceived for restoring missing values in the data, based on other entries in the database.
The choice of the imputation method has an influence on the performance of the machine learning technique, e.g., it influences the accuracy of the classification algorithm applied to the data.
Therefore, selecting and applying the right imputation method is important and usually requires a substantial amount of human intervention.
In this paper we propose the use of genetic programming techniques to search for the right combination of imputation and classification algorithms.
We build our work on the recently introduced Python-based TPOT library, and incorporate a heterogeneous set of imputation algorithms as part of the machine learning pipeline search.
We show that genetic programming can automatically find increasingly better pipelines that include the most effective combinations of imputation methods, feature pre-processing, and classifiers for a variety of classification problems with missing data.
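One of the imputation operators such a pipeline search composes can be sketched as follows (mean imputation, shown standalone rather than through the TPOT API; the data values are hypothetical):

```python
# A minimal mean-imputation operator of the kind the pipeline search
# chooses among. Missing entries in each column are replaced by that
# column's mean over the present values. Data below is hypothetical.
def mean_impute(rows, missing=None):
    """Return a copy of `rows` with missing entries filled by column means."""
    cols = list(zip(*rows))
    means = []
    for col in cols:
        present = [v for v in col if v is not missing]
        means.append(sum(present) / len(present))
    return [
        [means[j] if v is missing else v for j, v in enumerate(row)]
        for row in rows
    ]

data = [[1.0, 4.0], [None, 6.0], [3.0, None]]
print(mean_impute(data))  # [[1.0, 4.0], [2.0, 6.0], [3.0, 5.0]]
```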
Voice disguise, purposeful modification of one's speaker identity with the aim of avoiding being identified as oneself, is a low-effort way to fool speaker recognition, whether performed by a human or an automatic speaker verification (ASV) system.
We present an evaluation of the effectiveness of age stereotypes as a voice disguise strategy, as a follow up to our recent work where 60 native Finnish speakers attempted to sound like an elderly and like a child.
In that study, we presented evidence that both ASV and human observers could easily miss the target speaker but we did not address how believable the presented vocal age stereotypes were; this study serves to fill that gap.
The interesting cases would be speakers who succeed in being missed by the ASV system, and whom a typical listener cannot detect as using a disguise.
We carry out a perceptual test to study the quality of the disguised speech samples.
The listening test was carried out both locally and with the help of Amazon's Mechanical Turk (MT) crowd-workers.
A total of 91 listeners participated in the test and were instructed to estimate both the speaker's chronological and intended age.
The results indicate that age estimations for the intended old and child voices for female speakers were towards the target age groups, while for male speakers, the age estimations corresponded to the direction of the target voice only for elderly voices.
In the case of the intended child's voice, listeners estimated most male speakers to be older than their chronological age, rather than the intended target age.
Text segmentation, the task of dividing a document into contiguous segments based on its semantic structure, is a longstanding challenge in language understanding.
Previous work on text segmentation focused on unsupervised methods such as clustering or graph search, due to the paucity of labeled data.
In this work, we formulate text segmentation as a supervised learning problem, and present a large new dataset for text segmentation that is automatically extracted and labeled from Wikipedia.
Moreover, we develop a segmentation model based on this dataset and show that it generalizes well to unseen natural text.
We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems.
Both techniques target the scenario where two thread teams are created/activated during the factorization, with each team in charge of performing an independent task/branch of execution.
The first technique promotes worker sharing (WS) between the two tasks, allowing the threads of the task that completes first to be reallocated for use by the costlier task.
The second technique allows a fast task to alert the slower task of completion, enforcing the early termination (ET) of the second task, and a smooth transition of the factorization procedure into the next iteration.
The two mechanisms are instantiated via a new malleable thread-level implementation of the Basic Linear Algebra Subprograms (BLAS), and their benefits are illustrated via an implementation of the LU factorization with partial pivoting enhanced with look-ahead.
Concretely, our experimental results on a six-core Intel Xeon processor show the benefits of combining WS+ET, reporting competitive performance in comparison with a task-parallel runtime-based solution.
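The early-termination (ET) idea can be illustrated with ordinary threads: the fast task signals completion so the slow task can stop early. This is only a conceptual sketch; the actual mechanism is implemented inside a malleable thread-level BLAS, not in Python.

```python
# Conceptual sketch of early termination (ET) between two thread teams:
# the task that finishes first raises a flag, and the slower task polls
# it and terminates early instead of running to completion.
import threading

done = threading.Event()
results = []

def fast_task():
    results.append("fast finished")
    done.set()                      # alert the slower task

def slow_task(iterations=1_000_000):
    for i in range(iterations):
        if done.is_set():           # observe the alert, terminate early
            results.append("slow stopped early at step %d" % i)
            return
    results.append("slow ran to completion")

t = threading.Thread(target=slow_task)
t.start()
fast_task()
t.join()
```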
The dominant object detection approaches treat the recognition of each region separately and overlook crucial semantic correlations between objects in one scene.
This paradigm leads to a substantial performance drop when facing heavy long-tail problems, where very few samples are available for rare classes and plenty of confusing categories exist.
We exploit diverse human commonsense knowledge for reasoning over large-scale object categories and reaching semantic coherency within one image.
Particularly, we present Hybrid Knowledge Routed Modules (HKRM) that incorporate reasoning routed by two kinds of knowledge forms: an explicit knowledge module for structured constraints that are summarized with linguistic knowledge (e.g. shared attributes, relationships) about concepts; and an implicit knowledge module that depicts some implicit constraints (e.g. common spatial layouts).
By functioning over a region-to-region graph, both modules can be individualized and adapted to coordinate with visual patterns in each image, guided by specific knowledge forms.
HKRM is lightweight, general-purpose, and extensible: it can easily incorporate multiple forms of knowledge to endow any detection network with the ability of global semantic reasoning.
Experiments on large-scale object detection benchmarks show HKRM obtains around 34.5% improvement on VisualGenome (1000 categories) and 30.4% on ADE in terms of mAP.
Code and trained models can be found at https://github.com/chanyn/HKRM.
We present a novel application of LSTM recurrent neural networks to multilabel classification of diagnoses given variable-length time series of clinical measurements.
Our method outperforms a strong baseline on a variety of metrics.
Neural Machine Translation (NMT) can be improved by including document-level contextual information.
For this purpose, we propose a hierarchical attention model to capture the context in a structured and dynamic manner.
The model is integrated in the original NMT architecture as another level of abstraction, conditioning on the NMT model's own previous hidden states.
Experiments show that hierarchical attention significantly improves the BLEU score over a strong NMT baseline with the state-of-the-art in context-aware methods, and that both the encoder and decoder benefit from context in complementary ways.
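The attention operation at the core of such models can be sketched generically (plain single-head dot-product attention, not the paper's hierarchical variant; all vectors are hypothetical toy values):

```python
# Generic dot-product attention: score each context vector against a
# query, normalize with softmax, and return the weighted combination.
# This illustrates the mechanism only; the paper stacks two such levels.
import math

def attention(query, contexts):
    """Softmax over query-context dot products, then a weighted sum."""
    scores = [sum(q * c for q, c in zip(query, ctx)) for ctx in contexts]
    m = max(scores)                       # max-shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    mixed = [
        sum(w * ctx[i] for w, ctx in zip(weights, contexts))
        for i in range(len(query))
    ]
    return weights, mixed

weights, mixed = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print([round(w, 3) for w in weights])  # [0.731, 0.269]
```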
Very often, features come with their own vectorial descriptions, which provide detailed information about their properties.
We refer to these vectorial descriptions as feature side-information.
In the standard learning scenario, input is represented as a vector of features and the feature side-information is most often ignored or used only for feature selection prior to model fitting.
We believe that feature side-information, which carries information about the intrinsic properties of features, can help improve model prediction if used properly during the learning process.
In this paper, we propose a framework that allows for the incorporation of the feature side-information during the learning of very general model families to improve the prediction performance.
We control the structures of the learned models so that they reflect features similarities as these are defined on the basis of the side-information.
We perform experiments on a number of benchmark datasets which show significant predictive performance gains, over a number of baselines, as a result of the exploitation of the side-information.
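One plausible way to make a model reflect feature similarities, sketched here as an assumption rather than the paper's exact objective, is a graph-Laplacian-style penalty that keeps the weights of similar features close:

```python
# Hypothetical similarity penalty (an assumption, not the paper's exact
# formulation): sum_ij S[i][j] * (w[i] - w[j])^2, where S is a feature
# similarity matrix derived from the side-information. Values are toy data.
def similarity_penalty(w, S):
    """Penalize weight differences between features with high similarity."""
    n = len(w)
    return sum(S[i][j] * (w[i] - w[j]) ** 2
               for i in range(n) for j in range(i + 1, n))

w = [0.8, 0.7, -0.2]                # learned feature weights (hypothetical)
S = [[0.0, 0.9, 0.1],               # side-information similarity of features
     [0.9, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
print(round(similarity_penalty(w, S), 3))  # 0.19
```

Adding such a term to the training loss pulls features 0 and 1 (similarity 0.9) toward similar weights while leaving the dissimilar feature 2 largely unconstrained.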
Traditional methods for assessing illness severity and predicting in-hospital mortality among critically ill patients require time-consuming, error-prone calculations using static variable thresholds.
These methods do not capitalize on the emerging availability of streaming electronic health record data or capture time-sensitive individual physiological patterns, a critical task in the intensive care unit.
We propose a novel acuity score framework (DeepSOFA) that leverages temporal measurements and interpretable deep learning models to assess illness severity at any point during an ICU stay.
We compare DeepSOFA with SOFA (Sequential Organ Failure Assessment) baseline models using the same model inputs and find that at any point during an ICU admission, DeepSOFA yields significantly more accurate predictions of in-hospital mortality.
A DeepSOFA model developed in a public database and validated in a single institutional cohort had a mean AUC for the entire ICU stay of 0.90 (95% CI 0.90-0.91) compared with baseline SOFA models with mean AUC 0.79 (95% CI 0.79-0.80) and 0.85 (95% CI 0.85-0.86).
Deep models are well-suited to identify ICU patients in need of life-saving interventions prior to the occurrence of an unexpected adverse event and inform shared decision-making processes among patients, providers, and families regarding goals of care and optimal resource utilization.
Controllers for autonomous robotic systems can be specified using state machines.
However, these are typically developed in an ad hoc manner without formal semantics, which makes it difficult to analyse the controller.
Simulations are often used during the development, but a rigorous connection between the designed controller and the implementation is often overlooked.
This paper presents a state-machine based notation, RoboChart, together with a tool to automatically create code from the state machines, establishing a rigorous connection between specification and implementation.
In RoboChart, a robot's controller is specified either graphically or using a textual description language.
The controller code for simulation is automatically generated through a direct mapping from the specification.
We demonstrate our approach using two case studies (self-organized aggregation and swarm taxis) in swarm robotics.
The simulations are presented using two different simulators showing the general applicability of our approach.
Predicting both the time and the location of human movements is valuable but challenging for a variety of applications.
To address this problem, we propose an approach considering both the periodicity and the sociality of human movements.
We first define a new concept, Social Spatial-Temporal Event (SSTE), to represent social interactions among people.
For the time prediction, we characterise the temporal dynamics of SSTEs with an ARMA (AutoRegressive Moving Average) model.
To dynamically capture the SSTE kinetics, we propose a Kalman Filter based learning algorithm to learn and incrementally update the ARMA model as a new observation becomes available.
For the location prediction, we propose a ranking model where the periodicity and the sociality of human movements are simultaneously taken into consideration for improving the prediction accuracy.
Extensive experiments conducted on real data sets validate our proposed approach.
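The incremental-update idea can be illustrated with a one-dimensional Kalman filter (the paper's actual state is the ARMA parameter vector; the random-walk model and all values below are hypothetical):

```python
# A one-dimensional Kalman filter update, sketched to show how an
# estimate is refreshed as each new observation arrives. The paper
# applies this idea to the ARMA coefficients; values here are toy data.
def kalman_step(x, P, z, Q=0.01, R=0.25):
    """Predict with a random-walk model, then correct with observation z."""
    P = P + Q                       # predict: uncertainty grows by Q
    K = P / (P + R)                 # gain: prior uncertainty vs noise R
    x = x + K * (z - x)             # correct toward the observation
    P = (1 - K) * P                 # posterior uncertainty shrinks
    return x, P

x, P = 0.0, 1.0
for z in [1.2, 0.9, 1.1]:           # incoming observations
    x, P = kalman_step(x, P, z)
print(round(x, 2))  # 0.99
```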
Knowledge workers face an ever increasing flood of information in their daily lives.
To counter this and provide better support for information management and knowledge work in general, we have been investigating solutions inspired by human forgetting since 2013.
These solutions are based on Semantic Desktop (SD) and Managed Forgetting (MF) technology.
A key concept of the latter is the so-called Memory Buoyancy (MB), which is intended to represent an information item's current value for the user and allows forgetting mechanisms to be employed.
The SD thus continuously performs information value assessment, updating MB and triggering the respective MF measures.
We extended an SD-based organizational memory system, which we have used in daily work for over seven years, with MF mechanisms, embedding them directly in daily activities and enabling us to test and optimize them in real-world scenarios.
In this paper, we first present our initial version of MB and discuss success and failure stories we have been experiencing with it during three years of practical usage.
We learned from cognitive psychology that our previous research on context can be beneficial for MF.
Thus, we created an advanced MB version especially taking user context, and in particular context switches, into account.
These enhancements as well as a first prototypical implementation are presented, too.
This paper studies the problem of passive grasp stability under an external disturbance, that is, the ability of a grasp to resist a disturbance through passive responses at the contacts.
To obtain physically consistent results, such a model must account for friction phenomena at each contact; the difficulty is that friction forces depend in non-linear fashion on contact behavior (stick or slip).
We develop the first polynomial-time algorithm which either solves such complex equilibrium constraints for two-dimensional grasps, or otherwise concludes that no solution exists.
To achieve this, we show that the number of possible `slip states' (where each contact is labeled as either sticking or slipping) that must be considered is polynomial (in fact quadratic) in the number of contacts, and not exponential as previously thought.
Our algorithm captures passive response behaviors at each contact, while accounting for constraints on friction forces such as the maximum dissipation principle.
The most commonly used weighted least square state estimator in power industry is nonlinear and formulated by using conventional measurements such as line flow and injection measurements.
Measurements from PMUs (Phasor Measurement Units) are gradually being added to improve the state estimation process.
In this paper, we investigate how to incorporate PMU data into the conventional measurement set, as well as a linear formulation of state estimation using only PMU-measured data.
Six cases are tested while gradually increasing the number of PMUs added to the measurement set; the effect of PMUs on the accuracy of the estimated variables is illustrated and compared on the IEEE 14- and 30-bus test systems.
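With phasor measurements, the measurement model z = Hx + e is linear, so the weighted-least-squares estimate has the closed form x̂ = (HᵀWH)⁻¹HᵀWz. A minimal two-unknown sketch (the measurement matrix, weights, and measured values below are hypothetical):

```python
# Linear weighted least squares for a 2-unknown state, illustrating the
# PMU-only formulation x_hat = (H^T W H)^{-1} H^T W z. The 2x2 normal
# equations are solved with an explicit inverse. All values are toy data.
def wls_2state(H, W, z):
    """Solve the 2-unknown normal equations (H^T W H) x = H^T W z."""
    A = [[sum(W[k] * H[k][i] * H[k][j] for k in range(len(z)))
          for j in range(2)] for i in range(2)]
    b = [sum(W[k] * H[k][i] * z[k] for k in range(len(z))) for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

H = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]   # measurement matrix
W = [1.0, 1.0, 2.0]                          # inverse-variance weights
z = [1.02, 0.98, 0.05]                       # measured values
print([round(v, 3) for v in wls_2state(H, W, z)])  # [1.024, 0.976]
```

Because the model is linear, the estimate is obtained in one shot, without the iterations the nonlinear conventional formulation requires.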
Driven by successes in deep learning, computer vision research has begun to move beyond object detection and image classification to more sophisticated tasks like image captioning or visual question answering.
Motivating such endeavors is the desire for models to capture not only objects present in an image, but more fine-grained aspects of a scene such as relationships between objects and their attributes.
Scene graphs provide a formal construct for capturing these aspects of an image.
Despite this, there have been only a few recent efforts to generate scene graphs from imagery.
Previous works limit themselves to settings where bounding box information is available at train time and do not attempt to generate scene graphs with attributes.
In this paper we propose a method, based on recent advancements in Generative Adversarial Networks, to overcome these deficiencies.
We take the approach of first generating small subgraphs, each describing a single statement about a scene from a specific region of the input image chosen using an attention mechanism.
By doing so, our method is able to produce portions of the scene graphs with attribute information without the need for bounding box labels.
Then, the complete scene graph is constructed from these subgraphs.
We show that our model improves upon prior work in scene graph generation on state-of-the-art data sets and accepted metrics.
Further, we demonstrate that our model is capable of handling a larger vocabulary size than prior work has attempted.
Many problems in machine learning and related application areas are fundamentally variants of conditional modeling and sampling across multi-aspect data, either multi-view, multi-modal, or simply multi-group.
For example, sampling from the distribution of English sentences conditioned on a given French sentence or sampling audio waveforms conditioned on a given piece of text.
Central to many of these problems is the issue of missing data: we can observe many English, French, or German sentences individually but only occasionally do we have data for a sentence pair.
Motivated by these applications and inspired by recent progress in variational autoencoders for grouped data, we develop factVAE, a deep generative model capable of handling multi-aspect data, robust to missing observations, and with a prior that encourages disentanglement between the groups and the latent dimensions.
The effectiveness of factVAE is demonstrated on a variety of rich real-world datasets, including motion capture poses and pictures of faces captured from varying poses and perspectives.
In this paper we present a unified framework for solving a general class of problems arising in the context of set-membership estimation/identification theory.
More precisely, the paper aims at providing an original approach for the computation of optimal conditional and robust projection estimates in a nonlinear estimation setting where the operator relating the data and the parameter to be estimated is assumed to be a generic multivariate polynomial function and the uncertainties affecting the data are assumed to belong to semialgebraic sets.
By noticing that the computation of both the conditional and the robust projection optimal estimators requires the solution to min-max optimization problems that share the same structure, we propose a unified two-stage approach based on semidefinite-relaxation techniques for solving such estimation problems.
The key idea of the proposed procedure is to recognize that the optimal functional of the inner optimization problems can be approximated to any desired precision by a multivariate polynomial function by suitably exploiting recently proposed results in the field of parametric optimization.
Two simulation examples are reported to show the effectiveness of the proposed approach.
We consider the single image super-resolution problem in a more general case that the low-/high-resolution pairs and the down-sampling process are unavailable.
Different from traditional super-resolution formulation, the low-resolution input is further degraded by noises and blurring.
This complicated setting makes supervised learning and accurate kernel estimation impossible.
To solve this problem, we resort to unsupervised learning without paired data, inspired by the recent successful image-to-image translation applications.
With generative adversarial networks (GAN) as the basic component, we propose a Cycle-in-Cycle network structure to tackle the problem within three steps.
First, the noisy and blurry input is mapped to a noise-free low-resolution space.
Then the intermediate image is up-sampled with a pre-trained deep model.
Finally, we fine-tune the two modules in an end-to-end manner to get the high-resolution output.
Experiments on NTIRE2018 datasets demonstrate that the proposed unsupervised method achieves comparable results as the state-of-the-art supervised models.
A broad class of software engineering problems can be generalized as the "total recall problem".
This short paper claims that identifying and exploring total recall language processing problems in software engineering is an important task with wide applicability.
To make that case, we show that applying and adapting state-of-the-art active learning and text mining solutions to the total recall problem can help solve two important software engineering tasks: (a) supporting large literature reviews and (b) identifying software security vulnerabilities.
Furthermore, we conjecture that (c) test case prioritization and (d) static warning identification can also be categorized as the total recall problem.
The widespread applicability of "total recall" to software engineering suggests that there exists some underlying framework that encompasses not just natural language processing, but a wide range of important software engineering tasks.
The next generation of PaaS technology fulfills the true promise of object-oriented and 4GL development with less effort.
Now PaaS is becoming one of the core technical services for application development organizations.
PaaS offers a resourceful and agile approach to develop, operate and deploy applications in a cost-effective manner.
It is now turning out to be one of the preferred choices throughout the world, especially for globally distributed development environments.
However, it still lacks the popularity and acceptance that Software-as-a-Service (SaaS) and Infrastructure-as-a-Service (IaaS) have attained.
PaaS offers a promising future with novel technology architecture and evolutionary development approach.
In this article, we identify the strengths, weaknesses, opportunities and threats for the PaaS industry.
We then identify the various issues that will affect the different stakeholders of PaaS industry.
This research will outline a set of recommendations for the PaaS practitioners to better manage this technology.
For PaaS technology researchers, we also outline a number of research areas that need attention in the near future.
Finally, we include an online survey to identify PaaS technology market leaders.
This will give PaaS technology practitioners a deeper insight into market trends and technologies.
This paper analyzes how the distortion created by hardware impairments in a multiple-antenna base station affects the uplink spectral efficiency (SE), with focus on Massive MIMO.
This distortion is correlated across the antennas, but has been often approximated as uncorrelated to facilitate (tractable) SE analysis.
To determine when this approximation is accurate, basic properties of distortion correlation are first uncovered.
Then, we separately analyze the distortion correlation caused by third-order non-linearities and by quantization.
Finally, we study the SE numerically and show that the distortion correlation can be safely neglected in Massive MIMO when there are sufficiently many users.
Under i.i.d. Rayleigh fading and equal signal-to-noise ratios (SNRs), this occurs when there are more than five transmitting users.
Other channel models and SNR variations have only minor impact on the accuracy.
We also demonstrate the importance of taking the distortion characteristics into account in the receive combining.
Multiple Sclerosis (MS) is an autoimmune disease that leads to lesions in the central nervous system.
Magnetic resonance (MR) images provide sufficient imaging contrast to visualize and detect lesions, particularly those in the white matter.
Quantitative measures based on various features of lesions have been shown to be useful in clinical trials for evaluating therapies.
Therefore robust and accurate segmentation of white matter lesions from MR images can provide important information about the disease status and progression.
In this paper, we propose a fully convolutional neural network (CNN) based method to segment white matter lesions from multi-contrast MR images.
The proposed CNN based method contains two convolutional pathways.
The first pathway consists of multiple parallel convolutional filter banks catering to multiple MR modalities.
In the second pathway, the outputs of the first one are concatenated and another set of convolutional filters are applied.
The output of this last pathway produces a membership function for lesions that may be thresholded to obtain a binary segmentation.
The proposed method is evaluated on a dataset of 100 MS patients, as well as the ISBI 2015 challenge data consisting of 14 patients.
The comparison is performed against four publicly available MS lesion segmentation methods.
Significant improvement in segmentation quality over the competing methods is demonstrated on various metrics, such as Dice and false positive ratio.
While evaluating on the ISBI 2015 challenge data, our method produces a score of 90.48, where a score of 90 is considered to be comparable to a human rater.
A computationally efficient classification system architecture is proposed.
It utilizes a fast tensor-vector multiplication algorithm to apply linear operators to input signals.
The approach is applicable to a wide variety of recognition system architectures, ranging from single-stage matched filter bank classifiers to complex neural networks with an unlimited number of hidden layers.
We introduce the class of synchronous subsequential relations, a subclass of the synchronous relations which embodies some properties of subsequential relations.
If we take relations of this class as forming the possible transitions of an infinite automaton, then most decision problems (apart from membership) still remain undecidable (as they are for synchronous and subsequential rational relations), but on the positive side, they can be approximated in a meaningful way we make precise in this paper.
This might make the class useful for some applications, and might serve to establish an intermediate position in the trade-off between issues of expressivity and (un)decidability.
In this paper we present and start analyzing the iCubWorld data-set, an object recognition data-set that we acquired using a Human-Robot Interaction (HRI) scheme and the iCub humanoid robot platform.
Our set up allows for rapid acquisition and annotation of data with corresponding ground truth.
While more constrained in its scope -- the iCub world is essentially a robotics research lab -- we demonstrate how the proposed data-set poses challenges to current recognition systems.
The iCubWorld data-set is publicly available.
The data-set can be downloaded from: http://www.iit.it/en/projects/data-sets.html.
We are concerned with robust and accurate forecasting of multiphase flow rates in wells and pipelines during oil and gas production.
In practice, the possibility to physically measure the rates is often limited; besides, it is desirable to estimate future values of multiphase rates based on the previous behavior of the system.
In this work, we demonstrate that a Long Short-Term Memory (LSTM) recurrent artificial network is able not only to accurately estimate the multiphase rates at current time (i.e., act as a virtual flow meter), but also to forecast the rates for a sequence of future time instants.
For a synthetic severe slugging case, LSTM forecasts compare favorably with the results of hydrodynamical modeling.
LSTM results for a realistic noisy dataset of a variable rate well test show that the model can also successfully forecast multiphase rates for a system with changing flow patterns.
Recently, it was shown that if multiplicative weights are assigned to the edges of a Tanner graph used in belief propagation decoding, it is possible to use deep learning techniques to find values for the weights which improve the error-correction performance of the decoder.
Unfortunately, this approach requires many multiplications, which are generally expensive operations.
In this paper, we suggest a more hardware-friendly approach in which offset min-sum decoding is augmented with learnable offset parameters.
Our method uses no multiplications and has a parameter count less than half that of the multiplicative algorithm.
This both speeds up training and provides a feasible path to hardware architectures.
After describing our method, we compare the performance of the two neural decoding algorithms and show that our method achieves error-correction performance within 0.1 dB of the multiplicative approach and as much as 1 dB better than traditional belief propagation for the codes under consideration.
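The offset min-sum check-node update that the learnable parameter augments can be sketched as follows; the paper trains offsets per edge and iteration, whereas this illustration uses a single hypothetical scalar offset.

```python
# One offset min-sum check-node update. For each outgoing edge, the
# message is the sign product of the other incoming messages times
# max(min |other messages| - beta, 0), where beta is the offset the
# paper makes learnable. Message values and beta here are hypothetical.
def check_node_update(msgs, beta):
    """Offset min-sum update for all edges of one check node."""
    out = []
    for i in range(len(msgs)):
        others = msgs[:i] + msgs[i + 1:]
        sign = 1
        for m in others:
            if m < 0:
                sign = -sign
        mag = max(min(abs(m) for m in others) - beta, 0.0)
        out.append(sign * mag)
    return out

msgs = [2.0, -0.5, 1.5]            # incoming LLR messages to a check node
print(check_node_update(msgs, beta=0.25))  # [-0.25, 1.25, -0.25]
```

Note that the update uses only comparisons, additions, and sign flips, which is why it avoids the multiplications of the weighted belief-propagation approach.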
In recent years, the number of Internet of Things (IoT) devices/sensors has increased to a great extent.
To support the computational demand of real-time latency-sensitive applications of largely geo-distributed IoT devices/sensors, a new computing paradigm named "Fog computing" has been introduced.
Generally, Fog computing resides closer to the IoT devices/sensors and extends the Cloud-based computing, storage and networking facilities.
In this chapter, we comprehensively analyse the challenges in Fogs acting as an intermediate layer between IoT devices/ sensors and Cloud datacentres and review the current developments in this field.
We present a taxonomy of Fog computing according to the identified challenges and its key features. We also map the existing works to the taxonomy in order to identify current research gaps in the area of Fog computing.
Moreover, based on the observations, we propose future directions for research.
We present Deeply Supervised Object Detector (DSOD), a framework that can learn object detectors from scratch.
State-of-the-art object detectors rely heavily on the off-the-shelf networks pre-trained on large-scale classification datasets like ImageNet, which incurs learning bias due to the difference on both the loss functions and the category distributions between classification and detection tasks.
Model fine-tuning for the detection task could alleviate this bias to some extent but not fundamentally.
Besides, transferring pre-trained models from classification to detection between discrepant domains is even more difficult (e.g., RGB to depth images).
A better solution to tackle these two critical problems is to train object detectors from scratch, which motivates our proposed DSOD.
Previous efforts in this direction mostly failed due to much more complicated loss functions and limited training data in object detection.
In DSOD, we contribute a set of design principles for training object detectors from scratch.
One of the key findings is that deep supervision, enabled by dense layer-wise connections, plays a critical role in learning a good detector.
Combining with several other principles, we develop DSOD following the single-shot detection (SSD) framework.
Experiments on PASCAL VOC 2007, 2012 and MS COCO datasets demonstrate that DSOD can achieve better results than the state-of-the-art solutions with much more compact models.
For instance, DSOD outperforms SSD on all three benchmarks with real-time detection speed, while requiring only 1/2 the parameters of SSD and 1/10 the parameters of Faster R-CNN.
Our code and models are available at: https://github.com/szq0214/DSOD .
Traditional information retrieval (such as that offered by web search engines) impedes users with information overload from extensive result pages and the need to manually locate the desired information therein.
Conversely, question-answering systems change how humans interact with information systems: users can now ask specific questions and obtain a tailored answer - both conveniently in natural language.
Despite obvious benefits, their use is often limited to an academic context, largely because of expensive domain customizations, which means that the performance in domain-specific applications often fails to meet expectations.
This paper proposes cost-efficient remedies: (i) we leverage metadata through a filtering mechanism, which increases the precision of document retrieval, and (ii) we develop a novel fuse-and-oversample approach for transfer learning in order to improve the performance of answer extraction.
Here knowledge is inductively transferred from a related, yet different, task to the domain-specific application, while accounting for potential differences in the sample sizes across both tasks.
The resulting performance is demonstrated with actual use cases from a finance company and the film industry, where fewer than 400 question-answer pairs had to be annotated in order to yield significant performance gains.
As a direct implication to management, this presents a promising path to better leveraging of knowledge stored in information systems.
In this paper, we propose a combination of pedestrian data collection, analysis, and modeling that may yield a higher competitive advantage in the business environment.
The data collection is based only on simple inventory and questionnaire surveys in a hypermarket, from which trajectory paths of pedestrian movement are obtained.
Though the data is limited to static trajectories, our techniques show that it is possible to obtain aggregate flow patterns and alley attractiveness similar to those obtained from dynamic trajectories.
A case study of a real hypermarket demonstrates that daily-necessity products are closely related to higher flow patterns.
Using the proposed method, we are also able to quantify pedestrian behavior, finding that shoppers tend to walk about 7 times farther than the ideal shortest path.
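The factor-of-7 comparison can be computed from a static trajectory as a simple detour ratio. The sketch below uses straight-line entrance-to-exit distance as a stand-in for the "ideal shortest path", and the sample polyline is illustrative, not the paper's data.

```python
def path_length(points):
    """Total length of a polyline trajectory given as (x, y) points."""
    return sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

def detour_ratio(trajectory):
    """Walked distance divided by the direct entrance-to-exit distance
    (a simplification of the 'ideal shortest path' in the study)."""
    direct = path_length([trajectory[0], trajectory[-1]])
    return path_length(trajectory) / direct

# A toy shopper trajectory through two aisles: walked 14 units, direct 6.
walk = [(0, 0), (0, 4), (3, 4), (3, 0), (6, 0)]
print(detour_ratio(walk))
```

In a real store layout the ideal path would be computed around shelves rather than as a straight line, but the ratio itself is measured the same way.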
This paper deals with a new type of warehousing system, Robotic Mobile Fulfillment Systems (RMFS).
In such systems, robots are sent to carry storage units, so-called "pods", from the inventory and bring them to human operators working at stations.
At the stations, the items are picked according to customers' orders.
There exist new decision problems in such systems, for example the reallocation of pods after their visits to workstations or the selection of pods to fulfill orders.
In order to analyze decision strategies for these decision problems and relations between them, we develop a simulation framework called "RAWSim-O" in this paper.
Moreover, we show a real-world application of our simulation framework by integrating simple robot prototypes based on vacuum cleaning robots.
Intelligent Transportation Systems (ITSs) require ultra-low end-to-end delays and multi-gigabit-per-second data transmission.
Millimetre Waves (mmWaves) communications can fulfil these requirements.
However, the increased mobility of Connected and Autonomous Vehicles (CAVs), requires frequent beamforming - thus introducing increased overhead.
In this paper, a new beamforming algorithm is proposed that achieves overhead-free beamforming training.
Leveraging the CAVs' sensory data, broadcast with Dedicated Short Range Communications (DSRC) beacons, the position and motion of a CAV can be estimated and the beam steered accordingly.
To minimise the position errors, an analysis of the distinct error components is presented.
The network performance is further enhanced by adapting the antenna beamwidth with respect to the position error.
Our algorithm outperforms the legacy IEEE 802.11ad approach, proving it to be a viable solution for future ITS applications and services.
A natural language interface (NLI) to structured queries is intriguing due to its wide industrial applications and high economic value.
In this work, we tackle the problem of domain adaptation for NLI with limited data on the target domain.
Two important approaches are considered: (a) effective general-knowledge-learning on source domain semantic parsing, and (b) data augmentation on target domain.
We present a Structured Query Inference Network (SQIN) to enhance learning for domain adaptation, by separating schema information from NL and decoding SQL in a more structural-aware manner; we also propose a GAN-based augmentation technique (AugmentGAN) to mitigate the issue of lacking target domain data.
We report solid results on GeoQuery, Overnight, and WikiSQL to demonstrate state-of-the-art performances for both in-domain and domain-transfer tasks.
This paper proposes models of learning process in teams of individuals who collectively execute a sequence of tasks and whose actions are determined by individual skill levels and networks of interpersonal appraisals and influence.
The closely-related proposed models have increasing complexity, starting with a centralized manager-based assignment and learning model, and finishing with a social model of interpersonal appraisal, assignments, learning, and influences.
We show how rational optimal behavior arises along the task sequence for each model, and discuss conditions of suboptimality.
Our models are grounded in replicator dynamics from evolutionary games, influence networks from mathematical sociology, and transactive memory systems from organization science.
Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation.
Unfortunately, there is a tradeoff between cheap but simple variational families (e.g., fully factorized) and expensive but complicated inference procedures.
We show that natural gradient ascent with adaptive weight noise implicitly fits a variational posterior to maximize the evidence lower bound (ELBO).
This insight allows us to train full-covariance, fully factorized, or matrix-variate Gaussian variational posteriors using noisy versions of natural gradient, Adam, and K-FAC, respectively, making it possible to scale up to modern-size ConvNets.
On standard regression benchmarks, our noisy K-FAC algorithm makes better predictions and matches Hamiltonian Monte Carlo's predictive variances better than existing methods.
Its improved uncertainty estimates lead to more efficient exploration in active learning, and intrinsic motivation for reinforcement learning.
In the last decade, an active area of research has been devoted to designing novel activation functions that help deep neural networks converge and achieve better performance.
The training procedure of these architectures usually involves optimizing only the layer weights, while non-linearities are generally pre-specified and their (possible) parameters are usually treated as hyper-parameters to be tuned manually.
In this paper, we introduce two approaches to automatically learn different combinations of base activation functions (such as the identity function, ReLU, and tanh) during the training phase.
We present a thorough comparison of our novel approaches with well-known architectures (such as LeNet-5, AlexNet, and ResNet-56) on three standard datasets (Fashion-MNIST, CIFAR-10, and ILSVRC-2012), showing substantial improvements in the overall performance, such as an increase in the top-1 accuracy for AlexNet on ILSVRC-2012 of 3.01 percentage points.
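As an illustration of the idea of learning combinations of base activations, the sketch below forms a convex combination of identity, ReLU, and tanh with softmax-normalized weights. The base set and the parameterization are illustrative assumptions, not the paper's exact scheme; in training, the logits would be updated by backpropagation alongside the layer weights.

```python
import math

# Base activation functions to be combined (an assumed, minimal set).
BASES = {"identity": lambda x: x,
         "relu":     lambda x: max(0.0, x),
         "tanh":     math.tanh}

def mixed_activation(x, logits):
    """Convex combination of base activations with softmax-normalized,
    trainable weights (one logit per base function)."""
    exps = [math.exp(l) for l in logits.values()]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * BASES[name](x) for w, name in zip(weights, logits))

# Equal logits give a plain average of identity, ReLU, and tanh:
print(mixed_activation(1.0, {"identity": 0.0, "relu": 0.0, "tanh": 0.0}))
```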
Natural disasters are a large threat for people especially in developing countries such as Laos.
ICT-based disaster management systems aim at supporting disaster warning and response efforts.
However, the ability to communicate directly in both directions between the local and administrative levels is often not supported, and tight integration into administrative workflows is missing.
In this paper, we present the smartphone-based disaster alerting and reporting system Mobile4D.
It allows for bi-directional communication while being fully involved in administrative processes.
We present the system setup and discuss integration into administrative structures in Lao PDR.
This work is dedicated to introducing, executing, and assessing a three-stage speaker verification framework to enhance the degraded speaker verification performance in emotional talking environments.
Our framework is comprised of three cascaded stages: gender identification stage followed by an emotion identification stage followed by a speaker verification stage.
The proposed framework has been assessed on two distinct and independent emotional speech datasets: our collected dataset and Emotional Prosody Speech and Transcripts dataset.
Our results demonstrate that speaker verification based on both gender and emotion cues outperforms speaker verification based on gender cues only, on emotion cues only, or on neither.
The achieved average speaker verification performance based on the suggested methodology is very similar to that attained in subjective assessment by human listeners.
Critical to evaluating the capacity, scalability, and availability of web systems are realistic web traffic generators.
Although web traffic generation is a classic research problem, no existing generator accounts for the characteristics of web robots or crawlers, which are now the dominant source of traffic to a web server.
Administrators are thus unable to test, stress, and evaluate how their systems perform in the face of ever increasing levels of web robot traffic.
To resolve this problem, this paper introduces a novel approach to generate synthetic web robot traffic with high fidelity.
It generates traffic that accounts for both the temporal and behavioral qualities of robot traffic by statistical and Bayesian models that are fitted to the properties of robot traffic seen in web logs from North America and Europe.
We evaluate our traffic generator by comparing the characteristics of generated traffic to those of the original data.
We look at session arrival rates, inter-arrival times and session lengths, comparing and contrasting them between generated and real traffic.
Finally, we show that our generated traffic affects cache performance similarly to actual traffic, using the common LRU and LFU eviction policies.
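The cache experiment in the last step can be reproduced in miniature: simulate LRU and LFU eviction over a request stream and compare hit ratios. The stream and capacity below are toy values, not the paper's workload.

```python
from collections import OrderedDict, Counter

def lru_hit_ratio(requests, capacity):
    """Simulate an LRU cache and report the hit ratio for a request stream."""
    cache, hits = OrderedDict(), 0
    for r in requests:
        if r in cache:
            hits += 1
            cache.move_to_end(r)           # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[r] = True
    return hits / len(requests)

def lfu_hit_ratio(requests, capacity):
    """Same stream under LFU eviction (least frequently used leaves first)."""
    cache, freq, hits = set(), Counter(), 0
    for r in requests:
        freq[r] += 1
        if r in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.remove(min(cache, key=lambda k: freq[k]))
            cache.add(r)
    return hits / len(requests)

stream = [1, 2, 1, 3, 1, 2, 4, 1, 2, 5]
print(lru_hit_ratio(stream, 3), lfu_hit_ratio(stream, 3))
```

Feeding generated and real robot traffic through simulators like these and comparing the resulting hit ratios is the kind of evaluation the abstract describes.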
We empirically study sorting in the evolving data model.
In this model, a sorting algorithm maintains an approximation to the sorted order of a list of data items while simultaneously, with each comparison made by the algorithm, an adversary randomly swaps the order of adjacent items in the true sorted order.
Previous work studies only two versions of quicksort and leaves a gap between the lower bound of Omega(n) and the best upper bound of O(n log log n).
The experiments we perform in this paper provide empirical evidence that some quadratic-time algorithms such as insertion sort and bubble sort are asymptotically optimal for any constant rate of random swaps.
In fact, these algorithms perform as well as or better than algorithms such as quicksort that are more efficient in the traditional algorithm analysis model.
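The model above can be simulated directly. The sketch below runs repeated insertion-sort passes while an adversary swaps one random adjacent pair of the true order after every comparison, then reports the number of inversions between the maintained and true orders; the parameters are illustrative, not the paper's experimental setup.

```python
import random

def evolving_insertion_sort(n=200, rounds=30, seed=0):
    """Maintain an approximately sorted list while the *true* order drifts:
    after every comparison, an adversary swaps two adjacent items in the
    true ranking.  Returns the Kendall tau distance (inversion count)
    between the maintained order and the final true order."""
    rng = random.Random(seed)
    order = list(range(n))            # order[r] = item with true rank r
    rank = list(range(n))             # rank[item] = its current true rank
    maintained = list(range(n))
    rng.shuffle(maintained)

    def compare(a, b):
        result = rank[a] < rank[b]
        i = rng.randrange(n - 1)      # adversary: random adjacent swap
        x, y = order[i], order[i + 1]
        order[i], order[i + 1] = y, x
        rank[x], rank[y] = i + 1, i
        return result

    for _ in range(rounds):           # repeated insertion-sort passes
        for i in range(1, n):
            j = i
            while j > 0 and not compare(maintained[j - 1], maintained[j]):
                maintained[j - 1], maintained[j] = maintained[j], maintained[j - 1]
                j -= 1

    return sum(1 for i in range(n) for j in range(i + 1, n)
               if rank[maintained[i]] > rank[maintained[j]])

print(evolving_insertion_sort())
```

With a random permutation one would expect about n^2/4 inversions (10,000 here); the simulation stays far below that, consistent with the claimed O(n) steady state.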
As microblogging services like Twitter become more and more influential in today's globalised world, facets of them such as sentiment analysis are being extensively studied.
We are no longer constrained by our own opinions; others' opinions and sentiments play a huge role in shaping our perspective.
In this paper, we build on previous works on Twitter sentiment analysis using Distant Supervision.
The existing approach requires huge computational resources for analysing large numbers of tweets.
In this paper, we propose techniques to speed up the computation process for sentiment analysis.
We use tweet subjectivity to select the right training samples.
We also introduce the concept of EFWS (Effective Word Score) of a tweet that is derived from polarity scores of frequently used words, which is an additional heuristic that can be used to speed up the sentiment classification with standard machine learning algorithms.
We performed our experiments using 1.6 million tweets.
Experimental evaluations show that our proposed technique is more efficient and has higher accuracy compared to previously proposed methods.
We achieve overall accuracies of around 80% (EFWS heuristic gives an accuracy around 85%) on a training dataset of 100K tweets, which is half the size of the dataset used for the baseline model.
The accuracy of our proposed model is 2-3% higher than the baseline model, and the model effectively trains at twice the speed of the baseline model.
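The EFWS heuristic can be sketched as follows: score a tweet by the mean polarity of its known words and classify it directly when the score is decisive, deferring to the trained classifier otherwise. The polarity table, threshold, and function names are hypothetical, not the paper's actual lexicon.

```python
# Hypothetical word-polarity table; the paper derives polarity scores
# of frequently used words from a labeled corpus.
POLARITY = {"good": 0.9, "great": 1.0, "love": 0.8,
            "bad": -0.9, "terrible": -1.0, "hate": -0.8}

def efws(tweet, threshold=0.5):
    """Effective Word Score: mean polarity of the known words.  A decisive
    score classifies the tweet directly; otherwise fall back to the
    standard machine learning classifier."""
    scores = [POLARITY[w] for w in tweet.lower().split() if w in POLARITY]
    score = sum(scores) / len(scores) if scores else 0.0
    if score >= threshold:
        return "positive", score
    if score <= -threshold:
        return "negative", score
    return "uncertain", score      # defer to the trained classifier

print(efws("I love this great phone"))
print(efws("service was terrible"))
```

Tweets resolved by the heuristic never reach the expensive classifier, which is where the claimed speed-up comes from.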
As a data-centric cache-enabled architecture, Named Data Networking (NDN) is considered to be an appropriate alternative to the current host-centric IP-based Internet infrastructure.
Leveraging in-network caching, name-based routing, and receiver-driven sessions, NDN can greatly enhance the way Internet resources are being used.
A critical issue in NDN is the procedure of cache allocation and management.
Our main contribution in this research is the analysis of memory requirements to allocate a suitable Content-Store size to NDN routers, with respect to the combined impact of a long-term centrality-based metric and the Exponential Weighted Moving Average (EWMA) of short-term parameters such as user behaviors and outgoing traffic.
To determine correlations in such large data sets, data mining methods can prove valuable to researchers.
In this paper, we apply a data-fusion approach, namely Principal Component Analysis (PCA), to discover relations from short- and long-term parameters of the router.
The output of PCA, exploited to mine out raw data sets, is used to allocate a proper cache size to the router.
Evaluation results show an increase in the hit ratio of Content-Stores in sources, and NDN routers.
Moreover, for the proposed cache size allocation scheme, the number of unsatisfied and pending Interests in NDN routers is smaller than the Degree-Centrality cache size scheme.
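The EWMA of short-term parameters used above can be sketched in a few lines; the smoothing factor and the sample traffic series are illustrative values only.

```python
def ewma(series, alpha=0.3):
    """Exponentially Weighted Moving Average of a short-term parameter
    (e.g., outgoing traffic per interval at an NDN router)."""
    smoothed = series[0]
    out = [smoothed]
    for x in series[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
        out.append(smoothed)
    return out

traffic = [10, 12, 50, 11, 9]          # a transient spike at t=2
print(ewma(traffic))
```

The spike is damped rather than passed through, which is why EWMA-smoothed short-term parameters are suitable inputs for cache-size allocation.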
Research into the classification of time series has made enormous progress in the last decade.
The UCR time series archive has played a significant role in challenging and guiding the development of new learners for time series classification.
The largest dataset in the UCR archive holds only 10 thousand time series, which may explain why the primary research focus has been on creating algorithms that have high accuracy on relatively small datasets.
This paper introduces Proximity Forest, an algorithm that learns accurate models from datasets with millions of time series, and classifies a time series in milliseconds.
The models are ensembles of highly randomized Proximity Trees.
Whereas conventional decision trees branch on attribute values (and usually perform poorly on time series), Proximity Trees branch on the proximity of a time series to one exemplar time series or another, allowing us to leverage decades of work on developing relevant measures for time series.
Proximity Forest gains both efficiency and accuracy by stochastic selection of both exemplars and similarity measures.
Our work is motivated by recent time series applications that provide orders of magnitude more time series than the UCR benchmarks.
Our experiments demonstrate that Proximity Forest is highly competitive on the UCR archive: it ranks among the most accurate classifiers while being significantly faster.
We demonstrate on a 1M time series Earth observation dataset that Proximity Forest retains this accuracy on datasets that are many orders of magnitude greater than those in the UCR repository, while learning its models at least 100,000 times faster than current state of the art models Elastic Ensemble and COTE.
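A single proximity-tree split can be sketched as routing a series to its nearest exemplar. Euclidean distance here is only a stand-in for the pool of elastic measures (such as DTW) that the algorithm actually draws from, and all names are illustrative.

```python
def euclidean(a, b):
    """Plain Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def proximity_branch(series, exemplars, dist=euclidean):
    """Route a time series down one split of a proximity tree: it follows
    the branch of its nearest exemplar under the chosen distance."""
    return min(range(len(exemplars)),
               key=lambda i: dist(series, exemplars[i]))

exemplars = [[0, 0, 0, 0], [5, 5, 5, 5]]   # one exemplar per branch
print(proximity_branch([4, 5, 6, 5], exemplars))
```

A Proximity Tree stacks such splits, choosing both the exemplars and the distance measure stochastically at each node; the forest then ensembles many such trees.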
Current applications have produced graphs on the order of hundreds of thousands of nodes and millions of edges.
To take advantage of such graphs, one must be able to find patterns, outliers and communities.
These tasks are better performed in an interactive environment, where human expertise can guide the process.
For large graphs, though, there are some challenges: the excessive processing requirements are prohibitive, and drawing hundreds of thousands of nodes results in cluttered images that are hard to comprehend.
To cope with these problems, we propose an innovative framework suited for any kind of tree-like graph visual design.
GMine integrates (a) a representation for graphs organized as hierarchies of partitions - the concepts of SuperGraph and Graph-Tree; and (b) a graph summarization methodology - CEPS.
Our graph representation deals with the problem of tracing the connection aspects of a graph hierarchy with sublinear complexity, allowing one to grasp the neighborhood of a single node or of a group of nodes in a single click.
As a proof of concept, the visual environment of GMine is instantiated as a system in which large graphs can be investigated globally and locally.
In this article we propose a method for measuring internet connection stability that is fast and incurs negligible overhead.
This method finds a relative value for representing the stability of internet connections and can also be extended for aggregated internet connections.
The method is documented with the help of a real-time implementation, and results are shared.
The proposed measurement scheme uses the HTTP GET method for each connection.
The normalized responses from known sites, such as ISP gateways and google.com, are used for calculating the current link stability.
The novelty of the approach is that historic values are used to calculate overall link stability.
In this discussion, we also document a method to use the calculated values as a dynamic threshold metric.
This is used in routing decisions and for load-balancing each of the connections in an aggregated bandwidth pipe.
Such load balancing is a very popular practice for aggregated internet connections.
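The scheme above can be sketched as one measurement update plus a dynamic-threshold link selection. The normalization against the best observed RTT, the blending weight, and all function names are illustrative assumptions; a real implementation would issue actual HTTP GET probes.

```python
def update_stability(prev_stability, probe_success, rtt, rtt_best, alpha=0.2):
    """One measurement step: normalize the probe round-trip time against
    the best observed RTT (0.0 on timeout), then blend the current value
    with the historic stability score."""
    current = (rtt_best / rtt) if probe_success else 0.0
    return (1 - alpha) * prev_stability + alpha * current

def choose_links(stabilities, margin=0.8):
    """Dynamic threshold: keep links within `margin` of the best link,
    for routing and load-balancing over an aggregated bandwidth pipe."""
    best = max(stabilities.values())
    return [link for link, s in stabilities.items() if s >= margin * best]

s = 0.9
s = update_stability(s, True, rtt=120.0, rtt_best=60.0)   # a slow reply
print(round(s, 3))
print(choose_links({"isp_a": 0.92, "isp_b": 0.45, "isp_c": 0.80}))
```

Because historic values dominate the blend, a single slow probe lowers the score only gradually, which matches the article's use of history for overall link stability.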
Synthetic biology is a rapidly emerging research area, with expected wide-ranging impact in biology, nanofabrication, and medicine.
A key technical challenge lies in embedding computation in molecular contexts where electronic micro-controllers cannot be inserted.
This necessitates effective representation of computation using molecular components.
While previous work established the Turing-completeness of chemical reactions, defining representations that are faithful, efficient, and practical remains challenging.
This paper introduces CRN++, a new language for programming deterministic (mass-action) chemical kinetics to perform computation.
We present its syntax and semantics, and build a compiler translating CRN++ programs into chemical reactions, thereby laying the foundation of a comprehensive framework for molecular programming.
Our language addresses the key challenge of embedding familiar imperative constructs into a set of chemical reactions happening simultaneously and manipulating real-valued concentrations.
Although some deviation from ideal output value cannot be avoided, we develop methods to minimize the error, and implement error analysis tools.
We demonstrate the feasibility of using CRN++ on a suite of well-known algorithms for discrete and real-valued computation.
CRN++ can be easily extended to support new commands or chemical reaction implementations, and thus provides a foundation for developing more robust and practical molecular programs.
The design of the precoder that maximizes the mutual information in linear vector Gaussian channels with an arbitrary input distribution is studied.
Precisely, the optimal left singular vectors and singular values of the precoder are derived.
The characterization of the right singular vectors is left, in general, as an open problem whose computational complexity is then studied in three cases: Gaussian signaling, low SNR, and high SNR.
For the Gaussian signaling case and the low SNR regime, the dependence of the mutual information on the right singular vectors vanishes, making the optimal precoder design problem easy to solve.
In the high SNR regime, however, the dependence on the right singular vectors cannot be avoided and we show the difficulty of computing the optimal precoder through an NP-hardness analysis.
This is the preprint version of our paper at the 2015 International Conference on Virtual Rehabilitation (ICVR 2015).
In this paper, we describe envisioned usage scenarios for a touch-less interaction technology for hemiplegia patients, which can support either hand or foot interaction with a smartphone or head-mounted device (HMD).
The computer vision interaction technology was implemented in our previous work, which provides core support for gesture interaction by accurately detecting and tracking hand or foot gestures.
The patients interact with the application using hand/foot gesture motion in the camera view.
The use of programming languages such as Java and C in Open Source Software (OSS) has been well studied.
However, many other popular languages such as XSL or XML have received only minor attention.
In this paper, we discuss some trends in OSS development that we observed when considering multiple programming language evolution of OSS.
Based on the revision data of 22 OSS projects, we tracked the evolution of language usage and other artefacts such as documentation files, binaries and graphics files.
In these systems several different languages and artefact types have been used, including C/C++, Java, XML, XSL, Makefile, Groovy, HTML, Shell scripts, CSS, graphics files, JavaScript, JSP, Ruby, Python, XQuery, OpenDocument files, and PHP.
We found that the amount of code written in different languages differs substantially.
Some of our findings can be summarized as follows: (1) JavaScript and CSS files most often co-evolve with XSL; (2) Most Java developers but only every second C/C++ developer work with XML; (3) and more generally, we observed a significant increase of usage of XML and XSL during recent years and found that Java or C are hardly ever the only language used by a developer.
In fact, a developer works with more than 5 different artefact types (or 4 different languages) in a project on average.
As the fundamental phase of collecting and analyzing data, data integration is used in many applications, such as data cleaning, bioinformatics, and pattern recognition.
In the big data era, one of the major problems of data integration is obtaining the global schema of the data sources, since the global schema can hardly be derived from massive data sources directly.
In this paper, we attempt to solve such schema integration problem.
For different scenarios, we develop batch and incremental schema integration algorithms.
We consider the representation difference of attribute names in various data sources and propose ED Join and Semantic Join algorithms to integrate attributes with different representations.
Extensive experimental results demonstrate that the proposed algorithms could integrate schemas efficiently and effectively.
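The ED Join idea, pairing attribute names whose edit distance falls below a threshold, can be sketched as a nested-loop join over two schemas. The quadratic loop and the sample schemas are illustrative; the paper's algorithm would add filtering to scale, and Semantic Join would compare meanings rather than spellings.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming (one row)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (ca != cb))  # substitution
    return dp[len(b)]

def ed_join(schema_a, schema_b, tau=2):
    """Pair attribute names whose edit distance is at most `tau`."""
    return [(x, y) for x in schema_a for y in schema_b
            if edit_distance(x, y) <= tau]

print(ed_join(["cust_name", "addr", "phone"],
              ["cust_nam", "address", "phone_no"]))
```

With `tau=2`, only `cust_name`/`cust_nam` match; attributes differing by whole words ("addr" vs "address") are left to semantic matching.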
Thanks to recent advances in CNNs, solid improvements have been made in semantic segmentation of high resolution remote sensing imagery.
However, most of the previous works have not fully taken into account the specific difficulties that exist in remote sensing tasks.
One such difficulty is that objects are small and crowded in remote sensing imagery.
To tackle this challenging task, we propose a novel architecture with a local feature extraction (LFE) module attached on top of a dilated front-end module.
The LFE module is based on our findings that aggressively increasing dilation factors fails to aggregate local features due to the sparsity of the kernel, and is detrimental to small objects.
The proposed LFE module solves this problem by aggregating local features with decreasing dilation factor.
We tested our network on three remote sensing datasets and obtained remarkably good results on all of them, especially for small objects.
This paper considers the problem of completing assemblies of passive objects in nonconvex environments, cluttered with convex obstacles of unknown position, shape and size that satisfy a specific separation assumption.
A differential drive robot equipped with a gripper and a LIDAR sensor, capable of perceiving its environment only locally, is used to position the passive objects in a desired configuration.
The method combines the virtues of a deliberative planner generating high-level, symbolic commands, with the formal guarantees of convergence and obstacle avoidance of a reactive planner that requires little onboard computation and is used online.
The validity of the proposed method is verified both with formal proofs and numerical simulations.
Reversible interactions model different scenarios, such as biochemical systems and both human and automated negotiations.
We abstract interactions via multiparty sessions enriched with named checkpoints.
Computations can either go forward or roll back to some checkpoints, where possibly different choices may be taken.
In this way communications can be undone and different conversations may be tried.
Interactions are typed with global types, which also control rollbacks.
Typeability of session participants in agreement with global types ensures session fidelity and progress of reversible communications.
We study the computational power of deciding whether a given truth-table can be described by a circuit of a given size (the Minimum Circuit Size Problem, or MCSP for short), and of the variant denoted as MKTP where circuit size is replaced by a polynomially-related Kolmogorov measure.
All prior reductions from supposedly-intractable problems to MCSP / MKTP hinged on the power of MCSP / MKTP to distinguish random distributions from distributions produced by hardness-based pseudorandom generator constructions.
We develop a fundamentally different approach inspired by the well-known interactive proof system for the complement of Graph Isomorphism (GI).
It yields a randomized reduction with zero-sided error from GI to MKTP.
We generalize the result and show that GI can be replaced by any isomorphism problem for which the underlying group satisfies some elementary properties.
Instantiations include Linear Code Equivalence, Permutation Group Conjugacy, and Matrix Subspace Conjugacy.
Along the way we develop encodings of isomorphism classes that are efficiently decodable and achieve compression that is at or near the information-theoretic optimum; those encodings may be of independent interest.
Digital Rights Management (DRM) prevents end-users from using content in a manner inconsistent with its creator's wishes.
The license describing these use-conditions typically accompanies the content as its metadata.
A resulting problem is that the license and the content can get separated and lose track of each other.
The best metadata have two distinct qualities: they are created automatically without user intervention, and they are embedded within the data that they describe.
If licenses are also created and transported this way, data will always have licenses, and the licenses will be readily examinable.
When two or more datasets are combined, a new dataset, and with it a new license, are created.
This new license is a function of the licenses of the component datasets and any additional conditions that the person combining the datasets might want to impose.
Following the notion of a data-purpose algebra, we model this phenomenon by interpreting the transfer and conjunction of data as inducing an algebraic operation on the corresponding licenses.
When a dataset passes from one source to the next its license is transformed in a deterministic way, and similarly when datasets are combined the associated licenses are combined in a non-trivial algebraic manner.
Modern, computer-savvy licensing regimes such as Creative Commons allow the license to be written in a special language called the Creative Commons Rights Expression Language (ccREL), which supports creating and embedding the license using RDFa within XHTML.
This is preferred over DRM which includes the rights in a binary file completely opaque to nearly all users.
The colocation of metadata with human-visible XHTML makes the license more transparent.
In this paper we describe a methodology for creating and embedding licenses in geographic data utilizing ccREL, and programmatically examining embedded licenses in component data...
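One way to model the license algebra described above is to treat each license as a set of granted permissions, with transfer as a deterministic transformation and conjunction of datasets as intersection of licenses. The permission names and the two CC-like licenses below are toy approximations, not ccREL semantics.

```python
# Toy license lattice: each license is the set of permissions it grants.
CC_BY    = {"share", "adapt", "commercial"}
CC_BY_NC = {"share", "adapt"}              # non-commercial variant

def transfer(license_, revoked=frozenset()):
    """A dataset passing from one source to the next: its license is
    transformed deterministically (here, by dropping revoked permissions)."""
    return set(license_) - revoked

def combine(*licenses, extra_revoked=frozenset()):
    """Conjunction of datasets induces an operation on licenses: the most
    restrictive terms win (set intersection), minus any additional
    conditions imposed by the person combining the datasets."""
    out = set.intersection(*map(set, licenses))
    return out - extra_revoked

print(sorted(combine(CC_BY, CC_BY_NC)))
print(sorted(combine(CC_BY, CC_BY_NC, extra_revoked={"adapt"})))
```

Real ccREL terms also include obligations (such as attribution), which would need a richer structure than plain sets, but the algebraic shape of transfer and conjunction is the same.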
We present an algorithmic framework for learning multiple related tasks.
Our framework exploits a form of prior knowledge that relates the output spaces of these tasks.
We present PAC learning results that analyze the conditions under which such learning is possible.
We present results on learning a shallow parser and named-entity recognition system that exploits our framework, showing consistent improvements over baseline methods.
Contemporary Deep Neural Networks (DNNs) contain millions of synaptic connections across tens to hundreds of layers.
The large computation and memory requirements pose a challenge to the hardware design.
In this work, we leverage the intrinsic activation sparsity of DNN to substantially reduce the execution cycles and the energy consumption.
An end-to-end training algorithm is proposed to develop a lightweight run-time predictor for the output activation sparsity on the fly.
From our experimental results, the computation overhead of the prediction phase can be reduced to less than 5% of the original feedforward phase with negligible accuracy loss.
Furthermore, an energy-efficient hardware architecture, SparseNN, is proposed to exploit both the input and output sparsity.
SparseNN is a scalable architecture with distributed memories and processing elements connected through a dedicated on-chip network.
Compared with the state-of-the-art accelerators which only exploit the input sparsity, SparseNN can achieve a 10%-70% improvement in throughput and a power reduction of around 50%.
A heuristic procedure based on a novel recursive formulation of sinusoids (RFS) and on regression with predictive least-squares (LS) enables the decomposition of both uniformly and nonuniformly sampled 1-d signals into a sparse set of sinusoids (SSS).
An optimal SSS is found by Levenberg-Marquardt (LM) optimization of RFS parameters of near-optimal sinusoids combined with common criteria for the estimation of the number of sinusoids embedded in noise.
The procedure estimates both the cardinality and the parameters of SSS.
The proposed algorithm can identify the RFS parameters of a sinusoid from a data sequence containing only a fraction of its cycle.
In extreme cases when the frequency of a sinusoid approaches zero the algorithm is able to detect a linear trend in data.
Also, an irregular sampling pattern enables the algorithm to correctly reconstruct the under-sampled sinusoid.
The parsimonious nature of the resulting models opens possibilities for using the proposed method in machine learning and in expert and intelligent systems that need analysis and simple representations of 1-d signals.
The properties of the proposed algorithm are evaluated on examples of irregularly sampled artificial signals in noise and are compared with high accuracy frequency estimation algorithms based on linear prediction (LP) approach, particularly with respect to Cramer-Rao Bound (CRB).
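The core fitting step can be illustrated with a brute-force stand-in for the LM-optimized RFS search: for each candidate frequency, solve the 2x2 least-squares system for the sine and cosine coefficients in closed form and keep the best fit. This works on irregularly sampled time stamps just the same; the data and frequency grid are illustrative.

```python
import math

def fit_single_sinusoid(t, y, freqs):
    """Fit y ~ a*sin(2*pi*f*t) + b*cos(2*pi*f*t) for each candidate
    frequency by closed-form least squares; return the best
    (residual, frequency, sin coeff, cos coeff)."""
    best = None
    for f in freqs:
        s = [math.sin(2 * math.pi * f * ti) for ti in t]
        c = [math.cos(2 * math.pi * f * ti) for ti in t]
        # Normal equations for the 2x2 system in (a, b).
        ss = sum(x * x for x in s)
        cc = sum(x * x for x in c)
        sc = sum(x1 * x2 for x1, x2 in zip(s, c))
        sy = sum(x1 * x2 for x1, x2 in zip(s, y))
        cy = sum(x1 * x2 for x1, x2 in zip(c, y))
        det = ss * cc - sc * sc
        if abs(det) < 1e-12:
            continue
        a = (sy * cc - cy * sc) / det
        b = (cy * ss - sy * sc) / det
        err = sum((yi - ai * si - b * ci) ** 2
                  for yi, si, ci, ai in zip(y, s, c, [a] * len(y)))
        if best is None or err < best[0]:
            best = (err, f, a, b)
    return best

# Irregular sampling of sin(2*pi*0.5*t): the 0.5 Hz candidate should win.
t = [0.0, 0.13, 0.41, 0.77, 1.02, 1.55, 1.9]
y = [math.sin(2 * math.pi * 0.5 * ti) for ti in t]
err, f, a, b = fit_single_sinusoid(t, y, [0.3, 0.4, 0.5, 0.6])
print(f, round(a, 3), round(b, 3))
```

The actual procedure refines near-optimal candidates with Levenberg-Marquardt and adds model-order criteria to decide how many sinusoids to keep.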
We consider problems originating in economics that may be solved automatically using mathematical software.
We present and make freely available a new benchmark set of such problems.
The problems have been shown to fall within the framework of non-linear real arithmetic, and so are in theory soluble via Quantifier Elimination (QE) technology as usually implemented in computer algebra systems.
Further, they can all be phrased in prenex normal form with only existential quantifiers, and so are also amenable to those Satisfiability Modulo Theories (SMT) solvers that support the QF_NRA logic.
There is a great body of work on QE and SMT applications in science and engineering, but we demonstrate here that this technology also has potential in the social sciences.
This paper explores the use of adversarial examples in training speech recognition systems to increase robustness of deep neural network acoustic models.
During training, the fast gradient sign method is used to generate adversarial examples augmenting the original training data.
Different from conventional data augmentation based on data transformations, the examples are dynamically generated based on current acoustic model parameters.
We assess the impact of adversarial data augmentation in experiments on the Aurora-4 and CHiME-4 single-channel tasks, showing improved robustness against noise and channel variation.
Further improvement is obtained when combining adversarial examples with teacher/student training, leading to a 23% relative word error rate reduction on Aurora-4.
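The augmentation step can be illustrated on a toy logistic model: perturb an input by epsilon in the sign of the loss gradient. The model, numbers, and function names are stand-ins; the paper applies the same fast gradient sign method to acoustic-model features and uses the perturbed examples to augment training data.

```python
import math

def fgsm_example(x, y, w, b, eps=0.1):
    """Fast gradient sign method on a single logistic-regression input:
    move x by eps in the direction that increases the cross-entropy loss."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))        # model's predicted probability
    # d(cross-entropy)/dx_i = (p - y) * w_i
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

x_adv = fgsm_example(x=[1.0, -2.0], y=1, w=[0.5, -0.3], b=0.0, eps=0.1)
print(x_adv)
```

Unlike fixed data transformations, the perturbation direction here depends on the current model parameters, which is exactly the distinction the abstract draws.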
We explore multitask models for neural translation of speech, augmenting them in order to reflect two intuitive notions.
First, we introduce a model where the second task decoder receives information from the decoder of the first task, since higher-level intermediate representations should provide useful information.
Second, we apply regularization that encourages transitivity and invertibility.
We show that the application of these notions on jointly trained models improves performance on the tasks of low-resource speech transcription and translation.
It also leads to better performance when using attention information for word discovery over unsegmented input.
For the problem of multi-class linear classification and feature selection, we propose approximate message passing approaches to sparse multinomial logistic regression (MLR).
First, we propose two algorithms based on the Hybrid Generalized Approximate Message Passing (HyGAMP) framework: one finds the maximum a posteriori (MAP) linear classifier and the other finds an approximation of the test-error-rate minimizing linear classifier.
Then we design computationally simplified variants of these two algorithms.
Next, we detail methods to tune the hyperparameters of their assumed statistical models using Stein's unbiased risk estimate (SURE) and expectation-maximization (EM), respectively.
Finally, using both synthetic and real-world datasets, we demonstrate improved error-rate and runtime performance relative to existing state-of-the-art approaches to sparse MLR.
Peridynamics is a non-local generalization of continuum mechanics tailored to address discontinuous displacement fields arising in fracture mechanics.
As many non-local approaches, peridynamics requires considerable computing resources to solve practical problems.
Several implementations of peridynamics utilizing CUDA, OpenCL, and MPI were developed to address this important issue.
On modern supercomputers, asynchronous many task systems are emerging to address the new architecture of computational nodes.
This paper presents a peridynamics EMU nodal discretization implementation with the C++ Standard Library for Concurrency and Parallelism (HPX), an open source asynchronous many task run time system.
The code is designed for modular expandability, making it simple to extend with new material models or discretizations.
The code is convergent for implicit time integration and recovers theoretical solutions.
For explicit time integration, convergence results are presented to showcase the agreement of results with theoretical claims in previous works.
Two benchmark tests of code scalability demonstrate agreement between the code's scalability and theoretical estimates.
In the Internet-of-Things, the number of connected devices is expected to be enormous, i.e., more than a few tens of billions.
It is, however, well known that security for the Internet-of-Things is still an open problem.
In particular, it is difficult to certify the identification of connected devices and to prevent illegal spoofing.
This is because conventional security technologies have advanced mainly to protect logical networks, not physical networks like the Internet-of-Things.
In order to protect the Internet-of-Things with advanced security technologies, we propose a new concept (datachain layer) which is a well-designed combination of physical chip identification and blockchain.
With a proposed solution of the physical chip identification, the physical addresses of connected devices are uniquely connected to the logical addresses to be protected by blockchain.
Determining the programming language of a source code file has been considered in the research community; it has been shown that Machine Learning (ML) and Natural Language Processing (NLP) algorithms can be effective in identifying the programming language of source code files.
However, determining the programming language of a code snippet or a few lines of source code is still a challenging task.
Online forums such as Stack Overflow and code repositories such as GitHub contain a large number of code snippets.
In this paper, we describe Source Code Classification (SCC), a classifier that can identify the programming language of code snippets written in 21 different programming languages.
A Multinomial Naive Bayes (MNB) classifier is employed which is trained using Stack Overflow posts.
It is shown to achieve an accuracy of 75%, which is higher than that of Programming Languages Identification (PLI, a proprietary online classifier of snippets), whose accuracy is only 55.5%.
The average precision, recall, and F1 scores with the proposed tool are 0.76, 0.75, and 0.75, respectively.
In addition, it can distinguish between code snippets from a family of programming languages such as C, C++ and C#, and can also identify the programming language version such as C# 3.0, C# 4.0 and C# 5.0.
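A Multinomial Naive Bayes classifier of the kind described can be sketched with nothing but the standard library; the toy corpus and whitespace tokenization below are illustrative, not the Stack Overflow training data used by SCC:

```python
import math
from collections import Counter, defaultdict

def train_mnb(snippets):
    """Train a Multinomial Naive Bayes model on (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in snippets:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def classify(tokens, model):
    """Return the label maximizing the smoothed log-posterior."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_lp = None, float("-inf")
    for label, n in class_counts.items():
        lp = math.log(n / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            lp += math.log((word_counts[label][tok] + 1) / denom)  # Laplace smoothing
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Toy corpus: whitespace-tokenized snippets labeled by language.
corpus = [
    ("def f ( x ) : return x".split(), "Python"),
    ("print ( x )".split(), "Python"),
    ("int main ( ) { return 0 ; }".split(), "C"),
    ("printf ( x ) ;".split(), "C"),
]
model = train_mnb(corpus)
```

Usage: `classify("def g ( y ) : return y".split(), model)` yields `"Python"` on this toy corpus.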
Today's algorithm-to-hardware high-level synthesis (HLS) tools are purported to produce hardware comparable in quality to handcrafted designs, particularly with user-directive-driven or domain-specific HLS.
However, HLS tools are not readily equipped for the case when an application/algorithm needs to scale.
We present a (work-in-progress) semi-automated framework to map applications over a packet-switched network of modules (single FPGA) and then to seamlessly partition such a network over multiple FPGAs over quasi-serial links.
We illustrate the framework through three application case studies: LDPC Decoding, Particle Filter based Object Tracking, and Matrix Vector Multiplication over GF(2).
Starting with high-level representations of each case application, we first express them in an intermediate message passing formulation, a model of communicating processing elements.
Once the processing elements are identified, these are either handcrafted or realized using HLS.
The rest of the flow is automated where the processing elements are plugged on to a configurable network-on-chip (CONNECT) topology of choice, followed by partitioning the 'on-chip' links to work seamlessly across chips/FPGAs.
One significant challenge in cognitive radio networks is to design a framework in which the selfish secondary users are obliged to interact with each other truthfully.
Moreover, due to the vulnerability of these networks against jamming attacks, designing anti-jamming defense mechanisms is equally important.
In this paper, we propose a truthful mechanism, robust against the jamming, for a dynamic stochastic cognitive radio network consisting of several selfish secondary users and a malicious user.
In this model, each secondary user participates in an auction and wishes to use the unjammed spectrum, while the malicious user aims at jamming a channel by corrupting the communication link.
A truthful auction mechanism is designed among the secondary users.
Furthermore, a zero-sum game is formulated between the set of secondary users and the malicious user.
This joint problem is then cast as a randomized two-level auction in which the first auction allocates the vacant channels and the second assigns the remaining unallocated channels.
We also adapt this solution into a truthful distributed scheme.
Simulation results show that the distributed algorithm can achieve a performance that is close to the centralized algorithm, without the added overhead and complexity.
Modern multiprocessor system-on-chips (SoCs) integrate multiple heterogeneous cores to achieve high energy efficiency.
The power consumption of each core contributes to an increase in the temperature across the chip floorplan.
In turn, higher temperature increases the leakage power exponentially, and leads to a positive feedback with nonlinear dynamics.
This paper presents a power-temperature stability and safety analysis technique for multiprocessor systems.
This analysis reveals the conditions under which the power-temperature trajectory converges to a stable fixed point.
We also present a simple formula to compute the stable fixed point and maximum thermally-safe power consumption at runtime.
Hardware measurements on a state-of-the-art mobile processor show that our analytical formulation can predict the stable fixed point with an average error of 2.6%.
Hence, our approach can be used at runtime to ensure thermally safe operation and guard against thermal threats.
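The power-temperature feedback described above can be illustrated by iterating the coupled map until it settles at its stable fixed point; the thermal constants below are made-up illustrative values, not measurements from the paper:

```python
import math

def thermal_fixed_point(t_amb, r_th, p_dyn, a, b, tol=1e-10, max_iter=1000):
    """Iterate the coupled power-temperature map T -> T_amb + R * P(T).

    Leakage power grows exponentially with temperature
    (P_leak = a * exp(b * (T - T_amb))), creating the positive feedback
    described in the text; iteration converges when the loop gain is < 1.
    """
    t = t_amb
    for _ in range(max_iter):
        p_total = p_dyn + a * math.exp(b * (t - t_amb))  # dynamic + leakage
        t_next = t_amb + r_th * p_total                   # steady-state temperature
        if abs(t_next - t) < tol:
            return t_next, p_total
        t = t_next
    raise RuntimeError("no stable fixed point within iteration budget")

# Illustrative numbers: 45 C ambient, 2 C/W thermal resistance,
# 1 W dynamic power, weak exponential leakage.
t_star, p_star = thermal_fixed_point(t_amb=45.0, r_th=2.0, p_dyn=1.0, a=0.1, b=0.02)
```

At the fixed point, temperature and total power are mutually consistent; with stronger leakage growth (larger `a` or `b`), the loop gain can exceed one and the iteration diverges, which is the thermally unsafe regime the analysis guards against.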
Facial attribute recognition is conventionally computed from a single image.
In practice, each subject may have multiple face images.
Taking eye size as an example: it should not change across images, yet it may be estimated differently in each of them, which negatively impacts face recognition.
Thus, computing these attributes per subject rather than per single image is an important problem.
To address this question, we deploy deep training for facial attributes prediction, and we explore the inconsistency issue among the attributes computed from each single image.
Then, we develop two approaches to address the inconsistency issue.
Experimental results show that the proposed methods can handle facial attribute estimation on either multiple still images or video frames, and can correct the incorrectly annotated labels.
The experiments are conducted on two large public databases with annotations of facial attributes.
In this paper, we propose and investigate a new neural network architecture called Neural Random Access Machine.
It can manipulate and dereference pointers to an external variable-size random-access memory.
The model is trained from pure input-output examples using backpropagation.
We evaluate the new model on a number of simple algorithmic tasks whose solutions require pointer manipulation and dereferencing.
Our results show that the proposed model can learn to solve algorithmic tasks of such type and is capable of operating on simple data structures like linked-lists and binary trees.
For easier tasks, the learned solutions generalize to sequences of arbitrary length.
Moreover, memory access during inference can be done in a constant time under some assumptions.
Although traditionally used in the machine translation field, the encoder-decoder framework has been recently applied for the generation of video and image descriptions.
The combination of Convolutional and Recurrent Neural Networks in these models has proven to outperform the previous state of the art, obtaining more accurate video descriptions.
In this work we propose pushing further this model by introducing two contributions into the encoding stage.
First, producing richer image representations by combining object and location information from Convolutional Neural Networks and second, introducing Bidirectional Recurrent Neural Networks for capturing both forward and backward temporal relationships in the input frames.
Logic Programming is a Turing complete language.
As a consequence, designing algorithms that decide termination and non-termination of programs or decide inductive/coinductive soundness of formulae is a challenging task.
For example, the existing state-of-the-art algorithms can only semi-decide coinductive soundness of queries in logic programming for regular formulae.
Another, less famous, but equally fundamental and important undecidable property is productivity.
If a derivation is infinite and coinductively sound, we may ask whether the computed answer it determines actually computes an infinite formula.
If it does, the infinite computation is productive.
This intuition was first expressed under the name of computations at infinity in the 80s.
In modern days of the Internet and stream processing, its importance lies in connection to infinite data structure processing.
Recently, an algorithm was presented that semi-decides a weaker property -- productivity of logic programs.
A logic program is productive if it can give rise to productive derivations.
In this paper we strengthen these recent results.
We propose a method that semi-decides productivity of individual derivations for regular formulae.
Thus we at last give an algorithmic counterpart to the notion of productivity of derivations in logic programming.
This is the first algorithmic solution to the problem since it was raised more than 30 years ago.
We also present an implementation of this algorithm.
Enabling fully automated testing of mobile applications has recently become an important topic of study for both researchers and practitioners.
A plethora of tools and approaches have been proposed to aid mobile developers both by augmenting manual testing practices and by automating various parts of the testing process.
However, current approaches for automated testing fall short in convincing developers about their benefits, leading to a majority of mobile testing being performed manually.
With the goal of helping researchers and practitioners - who design approaches supporting mobile testing - to understand developers' needs, we analyzed survey responses from 102 open source contributors to Android projects about their testing practices.
The survey focused on questions regarding practices and preferences of developers/testers in-the-wild for (i) designing and generating test cases, (ii) automated testing practices, and (iii) perceptions of quality metrics such as code coverage for determining test quality.
Analyzing the information gleaned from this survey, we compile a body of knowledge to help guide researchers and professionals toward tailoring new automated testing approaches to the need of a diverse set of open source developers.
This document is the first part of the author's habilitation thesis (HDR), defended on June 4, 2018 at the University of Bordeaux.
Given the nature of this document, the contributions that involve the author have been emphasized; however, these four chapters were specifically written for distribution to a larger audience.
We hope they can serve as a broad introduction to the domain of highly dynamic networks, with a focus on temporal graph concepts and their interaction with distributed computing.
High triangle density -- the graph property stating that a constant fraction of two-hop paths belong to a triangle -- is a common signature of social networks.
This paper studies triangle-dense graphs from a structural perspective.
We prove constructively that significant portions of a triangle-dense graph are contained in a disjoint union of dense, radius 2 subgraphs.
This result quantifies the extent to which triangle-dense graphs resemble unions of cliques.
We also show that our algorithm recovers planted clusterings in approximation-stable k-median instances.
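Triangle density (transitivity) as used above -- the fraction of two-hop paths that close into a triangle -- can be computed directly from an adjacency structure; a minimal sketch:

```python
from itertools import combinations

def triangle_density(adj):
    """Transitivity: fraction of two-hop paths (wedges) closed into triangles.

    adj maps each vertex to the set of its neighbours (undirected graph).
    """
    closed = wedges = 0
    for v, nbrs in adj.items():
        for u, w in combinations(sorted(nbrs), 2):  # each wedge centred at v
            wedges += 1
            if w in adj[u]:                          # wedge closed by edge (u, w)
                closed += 1
    return closed / wedges if wedges else 0.0

# A 4-clique has every wedge closed; a 4-path has none.
clique = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

On these examples, `triangle_density(clique)` is 1.0 and `triangle_density(path)` is 0.0, matching the intuition that triangle-dense graphs resemble unions of cliques.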
The behavior of heterogeneous multi-agent systems is studied when the coupling matrices are possibly all different and/or singular (that is, of rank less than the system dimension).
Rank-deficient coupling allows exchange of limited state information, which is suitable for study of output coupling in multi-agent systems.
We present a coordinate change that transforms the heterogeneous multi-agent system into a singularly perturbed form.
The slow dynamics is still a reduced-order multi-agent system consisting of a weighted average of the vector fields of all agents, and some sub-dynamics of agents.
The weighted average is an emergent dynamics, which we call a blended dynamics.
By analyzing or synthesizing the blended dynamics, one can predict or design the behavior of heterogeneous multi-agent system when the coupling gain is sufficiently large.
For this result, stability of the blended dynamics is required.
Since stability of individual agent is not asked, stability of the blended dynamics is the outcome of trading stability among the agents.
It can be seen that, under stability of the blended dynamics, the initial conditions of individual agents are forgotten as time goes on; thus, the behavior of the synthesized multi-agent system is initialization-free and suitable for plug-and-play operation.
As a showcase, we apply the proposed tool to two application problems; distributed state estimation for linear systems, and practical synchronization of heterogeneous Van der Pol oscillators (for which phase cohesiveness is achieved).
We also present underlying intuition for two more applications; estimation of the number of nodes in a network, and a problem of distributed optimization.
Brain mapping research in most neuroanatomical laboratories relies on conventional processing techniques, which often introduce histological artifacts such as tissue tears and tissue loss.
In this paper we present techniques and algorithms for automatic registration and 3D reconstruction of conventionally produced mouse brain slices in a standardized atlas space.
This is achieved first by constructing a virtual 3D mouse brain model from annotated slices of Allen Reference Atlas (ARA).
Virtual re-slicing of the reconstructed model generates ARA-based slice images corresponding to the microscopic images of histological brain sections.
These image pairs are aligned using a geometric approach through contour images.
Histological artifacts in the microscopic images are detected and removed using Constrained Delaunay Triangulation before performing global alignment.
Finally, non-linear registration is performed by solving Laplace's equation with Dirichlet boundary conditions.
Our methods provide significant improvements over previously reported registration techniques for the tested slices in 3D space, especially on slices with significant histological artifacts.
Further, as an application we count the number of neurons in various anatomical regions using a dataset of 51 microscopic slices from a single mouse brain.
This work represents a significant contribution to this subfield of neuroscience as it provides tools to neuroanatomist for analyzing and processing histological data.
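The final non-linear registration step solves Laplace's equation with Dirichlet boundary conditions; a minimal Jacobi-relaxation sketch of such a solver (the grid size and boundary function are illustrative, not the paper's registration setup):

```python
def solve_laplace(boundary, nx, ny, iters=500):
    """Jacobi relaxation for Laplace's equation with Dirichlet boundaries.

    boundary(i, j) gives the fixed value on boundary cells; interior cells
    are repeatedly averaged over their four neighbours until they settle.
    """
    grid = [[boundary(i, j) if i in (0, nx - 1) or j in (0, ny - 1) else 0.0
             for j in range(ny)] for i in range(nx)]
    for _ in range(iters):
        new = [row[:] for row in grid]
        for i in range(1, nx - 1):
            for j in range(1, ny - 1):
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
        grid = new
    return grid

# Hypothetical setup on a 5x5 grid: boundary values vary linearly with j,
# so the exact harmonic solution is u(i, j) = j / 4.
sol = solve_laplace(lambda i, j: j / 4.0, 5, 5)
```

With this linear boundary, the interior converges to the same linear field, e.g. `sol[2][2]` approaches 0.5; in the registration setting, the boundary conditions instead encode the matched contours of the slice pair.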
In cellular networks, the locations of the radio access network (RAN) elements are determined mainly based on the long-term traffic behaviour.
However, when the random and hard-to-predict spatio-temporal distribution of the traffic (load,demand) does not fully match the fixed locations of the RAN elements (supply), some performance degradation becomes inevitable.
The concept of multi-tier cells (heterogeneous networks, HetNets) has been introduced in 4G networks to alleviate this mismatch.
However, as the traffic distribution deviates more and more from the long-term average, even the HetNet architecture will have difficulty coping with the erratic supply-demand mismatch, unless the RAN is grossly over-engineered (which is a financially non-viable solution).
In this article, we study the opportunistic utilization of low-altitude unmanned aerial platforms equipped with base stations (BSs), i.e., drone-BSs, in 5G networks.
In particular, we envisage a multi-tier drone-cell network complementing the terrestrial HetNets.
The variety of equipment and non-rigid placement options allow multi-tier drone-cell networks to serve diversified demands.
Hence, drone-cells bring the supply to where the demand is, which sets new frontiers for the heterogeneity in 5G networks.
We investigate the advancements promised by drone-cells, and discuss the challenges associated with their operation and management.
We propose a drone-cell management framework (DMF) benefiting from the synergy among software defined networking (SDN), network functions virtualization (NFV), and cloud-computing.
We demonstrate DMF mechanisms via a case study, and numerically show that it can reduce the cost of utilizing drone-cells in multitenancy cellular networks.
Though the ability of human beings to deal with probabilities has been put into question, the assessment of rarity is a crucial competence underlying much of human decision-making and is pervasive in spontaneous narrative behaviour.
This paper proposes a new model of rarity and randomness assessment, designed to be cognitively plausible.
Intuitive randomness is defined as a function of structural complexity.
It is thus possible to assign probability to events without being obliged to consider the set of alternatives.
The model is tested on Lottery sequences and compared with subjects' preferences.
The steered response power phase transform (SRP-PHAT) is a beamformer method very attractive in acoustic localization applications due to its robustness in reverberant environments.
This paper presents a spatial grid design procedure, called the geometrically sampled grid (GSG), which aims at computing the spatial grid by taking into account the discrete sampling of time difference of arrival (TDOA) functions and the desired spatial resolution.
A new SRP-PHAT localization algorithm based on the GSG method is also introduced.
The proposed method exploits the intersections of the discrete hyperboloids representing the TDOA information domain of the sensor array, and projects the whole TDOA information on the space search grid.
The GSG method thus allows designing the sampled spatial grid that best represents the search grid for a given sensor array; it enables a sensitivity analysis of the array and a characterization of its spatial localization accuracy, and it may assist the system designer in reconfiguring the array.
Experimental results using both simulated data and real recordings show that the localization accuracy is substantially improved both for high and for low spatial resolution, and that it is closely related to the proposed power response sensitivity measure.
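The discrete sampling of TDOA functions that GSG accounts for can be illustrated with a two-microphone sketch: the continuous TDOA of a candidate grid point is quantized to a sample index of the cross-correlation (the geometry, sampling rate, and sound speed below are illustrative):

```python
import math

def tdoa(src, mic_a, mic_b, c=343.0):
    """Time difference of arrival of a source between two microphones (s)."""
    return (math.dist(src, mic_a) - math.dist(src, mic_b)) / c

def tdoa_sample_index(src, mic_a, mic_b, fs=16000, c=343.0):
    """Discrete TDOA in samples, i.e. the index into a sampled correlation."""
    return round(tdoa(src, mic_a, mic_b, c) * fs)

# Two mics 0.2 m apart on the x-axis; a far source on the axis (endfire)
# experiences close to the maximum possible delay of 0.2 / 343 s.
mics = ((-0.1, 0.0), (0.1, 0.0))
idx = tdoa_sample_index((5.0, 0.0), *mics)  # -> 9 samples at 16 kHz
```

Because many distinct spatial points map to the same sample index, a uniform spatial grid wastes candidates; GSG instead derives the grid from the intersections of the discrete TDOA hyperboloids.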
Small variance asymptotics is emerging as a useful technique for inference in large scale Bayesian non-parametric mixture models.
This paper analyses the online learning of robot manipulation tasks with Bayesian non-parametric mixture models under small variance asymptotics.
The analysis yields a scalable online sequence clustering (SOSC) algorithm that is non-parametric in the number of clusters and the subspace dimension of each cluster.
SOSC groups the new datapoint in its low dimensional subspace by online inference in a non-parametric mixture of probabilistic principal component analyzers (MPPCA) based on Dirichlet process, and captures the state transition and state duration information online in a hidden semi-Markov model (HSMM) based on hierarchical Dirichlet process.
A task-parameterized formulation of our approach autonomously adapts the model to changing environmental situations during manipulation.
We apply the algorithm in a teleoperation setting to recognize the intention of the operator and remotely adjust the movement of the robot using the learned model.
The generative model is used to synthesize both time-independent and time-dependent behaviours by relying on the principles of shared and autonomous control.
Experiments with the Baxter robot yield parsimonious clusters that adapt online with new demonstrations and assist the operator in performing remote manipulation tasks.
A binary tanglegram is a pair <S,T> of binary trees whose leaf sets are in one-to-one correspondence; matching leaves are connected by inter-tree edges.
For applications, for example in phylogenetics or software engineering, it is required that the individual trees are drawn crossing-free.
A natural optimization problem, denoted tanglegram layout problem, is thus to minimize the number of crossings between inter-tree edges.
The tanglegram layout problem is NP-hard and is currently considered both in application domains and theory.
In this paper we present an experimental comparison of a recursive algorithm of Buchin et al., our variant of their algorithm, the algorithm hierarchy sort of Holten and van Wijk, and an integer quadratic program that yields optimal solutions.
Even though it is unrealistic to expect citizens to pinpoint the policy implementation that they prefer from the set of alternatives, it is still possible to infer such information through an exercise of ranking the importance of policy objectives according to their opinion.
Assuming that the mapping between policy options and objective evaluations is a priori known (through models and simulations), this can be achieved either implicitly through appropriate analysis of social media content related to the policy objective in question or explicitly through the direct feedback provided in the frame of a game.
This document focuses on the presentation of a policy model, which reduces the policy to a multi-objective optimization problem and mitigates the shortcoming of the lack of social objective functions (public opinion models) with a black-box, games-for-crowds approach.
Music summarization allows for higher efficiency in processing, storage, and sharing of datasets.
Machine-oriented approaches, being agnostic to human consumption, optimize these aspects even further.
Such summaries have already been successfully validated in some MIR tasks.
We now generalize previous conclusions by evaluating the impact of generic summarization of music from a probabilistic perspective.
We estimate Gaussian distributions for original and summarized songs and compute their relative entropy, in order to measure information loss incurred by summarization.
Our results suggest that relative entropy is a good predictor of summarization performance in the context of tasks relying on a bag-of-features model.
Based on this observation, we further propose a straightforward yet expressive summarizer, which minimizes relative entropy with respect to the original song, that objectively outperforms previous methods and is better suited to avoid potential copyright issues.
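Relative entropy between the Gaussians fitted to an original and a summarized song has a closed form; a minimal univariate sketch (treating each feature dimension independently is an illustrative simplification of the paper's setup):

```python
import math

def fit_gaussian(xs):
    """Maximum-likelihood mean and standard deviation of a 1-D sample."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, math.sqrt(var)

def gaussian_kl(m1, s1, m2, s2):
    """Closed-form KL divergence D(N(m1, s1^2) || N(m2, s2^2))."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5
```

The divergence is zero only when summary and original induce identical Gaussians, e.g. `gaussian_kl(0.0, 1.0, 0.0, 1.0)` is 0.0; shifting the mean by one standard deviation costs 0.5 nats, so a summarizer minimizing this quantity preserves the original's feature statistics.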
We show how to combine Bayes nets and game theory to predict the behavior of hybrid systems involving both humans and automated components.
We call this novel framework "Semi Network-Form Games," and illustrate it by predicting aircraft pilot behavior in potential near mid-air collisions.
At present, at the beginning of such potential collisions, a collision avoidance system in the aircraft cockpit advises the pilots what to do to avoid the collision.
However, studies of mid-air encounters have found wide variability in pilot responses to avoidance system advisories.
In particular, pilots rarely perfectly execute the recommended maneuvers, despite the fact that the collision avoidance system's effectiveness relies on their doing so.
Rather, pilots decide their actions based on all information available to them (advisory, instrument readings, visual observations).
We show how to build this aspect into a semi network-form game model of the encounter and then present computational simulations of the resultant model.
Cyclic redundancy check (CRC) codes check if a codeword is correctly received.
This paper presents an algorithm to design CRC codes that are optimized for the code-specific error behavior of a specified feedforward convolutional code.
The algorithm utilizes two distinct approaches to computing undetected error probability of a CRC code used with a specific convolutional code.
The first approach enumerates the error patterns of the convolutional code and tests if each of them is detectable.
The second approach reduces complexity significantly by exploiting the equivalence of the undetected error probability to the frame error rate of an equivalent catastrophic convolutional code.
The error events of the equivalent convolutional code are exactly the undetectable errors for the original concatenation of CRC and convolutional codes.
This simplifies the computation because error patterns do not need to be individually checked for detectability.
As an example, we optimize CRC codes for a commonly used 64-state convolutional code with information length k=1024, demonstrating a significant reduction in undetected error probability compared to existing CRC codes of the same degree.
For a fixed target undetected error probability, the optimized CRC codes typically require two fewer bits.
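The detectability test at the heart of the first approach reduces to polynomial division over GF(2): an error pattern escapes the CRC exactly when the generator polynomial divides it. A minimal sketch (the degree-3 generator below is illustrative, not one of the optimized CRCs):

```python
def gf2_mod(a, poly):
    """Remainder of polynomial a modulo poly over GF(2) (bits as integers)."""
    deg = poly.bit_length() - 1
    while a.bit_length() - 1 >= deg:
        a ^= poly << (a.bit_length() - 1 - deg)  # XOR is GF(2) subtraction
    return a

def crc_bits(msg, poly):
    """Check bits appended by the CRC: remainder of msg * x^deg mod poly."""
    return gf2_mod(msg << (poly.bit_length() - 1), poly)

def undetectable(error_pattern, poly):
    """A channel error escapes the CRC iff the generator divides the pattern."""
    return gf2_mod(error_pattern, poly) == 0

# Generator x^3 + x + 1 (0b1011): message 1101 gets check bits 001, and the
# pattern (x^3 + x + 1)(x + 1) = 0b11101 is a multiple, hence undetectable.
```

Enumerating the convolutional code's error events and applying `undetectable` to each is exactly the first (brute-force) approach; the paper's second approach avoids this per-pattern check via an equivalent catastrophic convolutional code.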
With the growing usage of Bitcoin and other cryptocurrencies, many scalability challenges have emerged.
A promising scaling solution, exemplified by the Lightning Network, uses a network of bidirectional payment channels that allows fast transactions between two parties.
However, routing payments on these networks efficiently is non-trivial, since payments require finding paths with sufficient funds, and channels can become unidirectional over time blocking further transactions through them.
Today's payment channel networks exacerbate these problems by attempting to deliver all payments atomically.
In this paper, we present the Spider network, a new packet-switched architecture for payment channel networks.
Spider splits payments into transaction units and transmits them over time across different paths.
Spider uses congestion control, payment scheduling, and imbalance-aware routing to optimize delivery of payments.
Our results show that Spider improves the volume and number of successful payments on the network by 10-45% and 5-40% respectively compared to state-of-the-art approaches.
Shifting to a lexicalized grammar reduces the number of parsing errors and improves application results.
However, such an operation affects a syntactic parser in all its aspects.
One of our research objectives is to design a realistic model for grammar lexicalization.
We carried out experiments for which we used a grammar with a very simple content and formalism, and a very informative syntactic lexicon, the lexicon-grammar of French elaborated by the LADL.
Lexicalization was performed by applying the parameterized-graph approach.
Our results tend to show that most information in the lexicon-grammar can be transferred into a grammar and exploited successfully for the syntactic parsing of sentences.
One approach to achieving artificial general intelligence (AGI) is through the emergence of complex structures and dynamic properties arising from decentralized networks of interacting artificial intelligence (AI) agents.
Understanding the principles of consensus in societies and finding ways to make consensus more reliable becomes critically important as connectivity and interaction speed increase in modern distributed systems of hybrid collective intelligences, which include both humans and computer systems.
We propose a new form of reputation-based consensus with greater resistance to reputation gaming than current systems have.
We discuss options for its implementation, and provide initial practical results.
Prior investigations have offered contrasting results on a troubling question: whether the alphabetical ordering of bylines confers citation advantages on those authors whose surnames put them first in the list.
The previous studies analyzed the surname effect at publication level, i.e. whether papers with the first author early in the alphabet trigger more citations than papers with a first author late in the alphabet.
We adopt instead a different approach, analyzing the surname effect on citability at the individual level, i.e. whether authors with alphabetically earlier surnames turn out to be more cited.
Examining the question at both the overall and discipline levels, the analysis finds no evidence whatsoever that alphabetically earlier surnames gain advantage.
The same lack of evidence occurs for the subpopulation of scientists with very high publication rates, where alphabetical advantage might gain more ground.
The field of observation consists of 14,467 scientists in the sciences.
A plain well-trained deep learning model often does not have the ability to learn new knowledge without forgetting the previously learned knowledge, which is known as catastrophic forgetting.
Here we propose a novel method, SupportNet, to efficiently and effectively solve the catastrophic forgetting problem in the class incremental learning scenario.
SupportNet combines the strength of deep learning and support vector machine (SVM), where SVM is used to identify the support data from the old data, which are fed to the deep learning model together with the new data for further training so that the model can review the essential information of the old data when learning the new information.
Two powerful consolidation regularizers are applied to stabilize the learned representation and ensure the robustness of the learned model.
We validate our method with comprehensive experiments on various tasks, which show that SupportNet drastically outperforms the state-of-the-art incremental learning methods and even reaches similar performance as the deep learning model trained from scratch on both old and new data.
Our program is accessible at: https://github.com/lykaust15/SupportNet
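SupportNet itself pairs a deep network with an SVM; as a minimal numpy sketch of the rehearsal idea (with a simple logistic-regression model standing in for both the deep net and the SVM, an assumption made for brevity), one can keep the old-task samples nearest the decision boundary as "support data" and replay them alongside new data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "old task": two Gaussian classes in 2-D.
X_old = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y_old = np.array([0] * 100 + [1] * 100)

# Train a simple logistic-regression model by gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X_old @ w + b)))
    g = p - y_old
    w -= 0.1 * (X_old.T @ g) / len(y_old)
    b -= 0.1 * g.mean()

# "Support data": samples closest to the decision boundary, i.e. with the
# smallest absolute margin -- analogous to SVM support vectors.
margin = np.abs(X_old @ w + b)
support_idx = np.argsort(margin)[:20]        # keep 20 exemplars
X_support, y_support = X_old[support_idx], y_old[support_idx]

# When new-class data arrives, it would be concatenated with the support
# set for further training (rehearsal), e.g. X_train = vstack([X_new, X_support]).
print(X_support.shape)
```

The consolidation regularizers the abstract mentions would additionally penalize drift of the representation on these support samples; that part is not sketched here.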
In this paper, we consider the network power minimization problem in a downlink cloud radio access network (C-RAN), taking into account the power consumed at the baseband unit (BBU) for computation and the power consumed at the remote radio heads and fronthaul links for transmission.
The power minimization problem for transmission is a fast time-scale issue whereas the power minimization problem for computation is a slow time-scale issue.
Therefore, the joint network power minimization problem is a mixed time-scale problem.
To tackle the time-scale challenge, we introduce large system analysis to turn the original fast time-scale problem into a slow time-scale one that only depends on the statistical channel information.
In addition, we propose a bound improving branch-and-bound algorithm and a combinational algorithm to find the optimal and suboptimal solutions to the power minimization problem for computation, respectively, and propose an iterative coordinate descent algorithm to find the solutions to the power minimization problem for transmission.
Finally, a distributed algorithm based on hierarchical decomposition is proposed to solve the joint network power minimization problem.
In summary, this work provides a framework to investigate how execution efficiency and computing capability at BBU as well as delay constraint of tasks can affect the network power minimization problem in C-RANs.
Data mining techniques have been widely used to mine knowledge from medical databases.
In data mining, classification is a supervised learning task that can be used to build models describing important data classes, where the class attribute is involved in the construction of the classifier.
The k-nearest neighbor (KNN) algorithm is a simple, popular, and effective algorithm for pattern recognition. KNN is a straightforward classifier in which samples are classified based on the class of their nearest neighbors.
Medical databases are high-volume in nature.
If the data set contains redundant and irrelevant attributes, classification may produce less accurate result.
Heart disease is the leading cause of death in India.
In Andhra Pradesh, heart disease was the leading cause of mortality, accounting for 32% of all deaths, a rate as high as in Canada (35%) and the USA. Hence there is a need for a decision support system that helps clinicians decide when to take precautionary steps.
In this paper we propose a new algorithm that combines KNN with a genetic algorithm for effective classification.
Genetic algorithms perform a global search in complex, large, and multimodal landscapes and provide near-optimal solutions.
Experimental results show that our algorithm enhances the accuracy of heart disease diagnosis.
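The combination described above is essentially genetic feature selection wrapped around a KNN classifier. The following sketch illustrates it on synthetic data (the dataset, population size, and fitness function are illustrative assumptions, not the paper's setup), using leave-one-out KNN accuracy as the GA fitness:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy medical-style dataset: 8 attributes, only the first 3 informative.
n, d = 120, 8
X = rng.normal(size=(n, d))
y = (X[:, :3].sum(axis=1) > 0).astype(int)

def knn_loo_accuracy(X, y, mask, k=5):
    """Leave-one-out accuracy of k-NN restricted to the selected attributes."""
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    dist = np.linalg.norm(Xs[:, None] - Xs[None, :], axis=2)
    np.fill_diagonal(dist, np.inf)           # exclude the point itself
    nn = np.argsort(dist, axis=1)[:, :k]
    pred = (y[nn].mean(axis=1) > 0.5).astype(int)
    return (pred == y).mean()

# Simple genetic algorithm over binary attribute masks.
pop = rng.random((20, d)) < 0.5
for gen in range(15):
    fit = np.array([knn_loo_accuracy(X, y, m) for m in pop])
    pop = pop[np.argsort(fit)[::-1]]         # sort by fitness, best first
    children = []
    for _ in range(10):                      # one-point crossover of two elites
        a, b = pop[rng.integers(0, 5, 2)]
        cut = rng.integers(1, d)
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(d) < 0.05          # mutation
        children.append(np.where(flip, ~child, child))
    pop = np.vstack([pop[:10], *children])   # elitism + offspring

fit = np.array([knn_loo_accuracy(X, y, m) for m in pop])
best = pop[int(np.argmax(fit))]
print(best.astype(int), fit.max())
```

Dropping redundant and irrelevant attributes this way is what lets KNN reach higher accuracy than on the full attribute set.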
In this paper, we propose a probabilistic parsing model, which defines a proper conditional probability distribution over non-projective dependency trees for a given sentence, using neural representations as inputs.
The neural network architecture is based on bi-directional LSTM-CNNs, which automatically benefit from both word- and character-level representations by combining a bidirectional LSTM with a CNN.
On top of the neural network, we introduce a probabilistic structured layer, defining a conditional log-linear model over non-projective trees.
By exploiting Kirchhoff's Matrix-Tree Theorem (Tutte, 1984), the partition functions and marginals can be computed efficiently, leading to a straightforward end-to-end model training procedure via back-propagation.
We evaluate our model on 17 different datasets, across 14 different languages.
Our parser achieves state-of-the-art parsing performance on nine datasets.
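The paper's parser defines a conditional log-linear model over single-root trees; as an illustration of how the Matrix-Tree theorem turns a sum over exponentially many non-projective trees into a determinant, here is a numpy sketch of the multi-root variant (a simplification relative to the paper):

```python
import numpy as np

def log_partition(root_score, arc_score):
    """Log of the sum, over all (multi-root) non-projective dependency
    trees, of the product of arc weights, via the Matrix-Tree theorem.
    root_score[m]  : weight of attaching word m directly to the root.
    arc_score[h, m]: weight of the arc h -> m between words h and m."""
    A = arc_score.astype(float).copy()
    np.fill_diagonal(A, 0.0)
    L = -A                                        # off-diagonal Laplacian entries
    np.fill_diagonal(L, root_score + A.sum(axis=0))  # column sums + root weights
    sign, logdet = np.linalg.slogdet(L)
    return logdet

# Two-word sentence: trees are {both root-attached, 0->1, 1->0}.
r = np.array([1.0, 2.0])
A = np.array([[0.0, 3.0],
              [4.0, 0.0]])
Z = np.exp(log_partition(r, A))
print(Z)  # r0*r1 + r0*A[0,1] + r1*A[1,0] = 2 + 3 + 8 = 13
```

Gradients of this log-determinant give arc marginals, which is what makes end-to-end training by back-propagation straightforward.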
A point of a digital space is called simple if it can be deleted from the space without altering topology.
This paper introduces the notion of a simple set of points of a digital space.
The definition is based on contractible spaces and contractible transformations.
A set of points in a digital space is called simple if it can be contracted to a point without changing topology of the space.
It is shown that contracting a simple set of points does not change the homotopy type of a digital space, and that the number of points in a digital space without simple points can be reduced by contracting simple sets.
Using the process of contracting, we can substantially compress a digital space while preserving the topology.
The paper proposes a method for thinning a digital space, showing that this approach can contribute to areas of computer science such as medical imaging, computer graphics, and pattern analysis.
The concept of h-index has been proposed to easily assess a researcher's performance with a single number.
However, by using only this number, we lose significant information about the distribution of citations per article in an author's publication list.
In this article, we study an author's citation curve and we define two new areas related to this curve.
We call these "penalty areas", since the greater they are, the more an author's performance is penalized.
We exploit these areas to establish new indices, namely PI and XPI, aiming at categorizing researchers into two distinct categories: "influentials" and "mass producers"; the former category produces articles almost all of which have high impact, while the latter produces many articles with moderate or no impact at all.
Using data from Microsoft Academic Service, we evaluate the merits mainly of PI as a useful tool for scientometric studies.
We establish its effectiveness in separating scientists into influentials and mass producers; we demonstrate its robustness against self-citations, and its lack of correlation with traditional indices.
Finally, we apply PI to rank prominent scientists in the areas of databases, networks and multimedia, exhibiting the strength of the index in fulfilling its design goal.
libact is a Python package designed to make active learning easier for general users.
The package not only implements several popular active learning strategies, but also features the active-learning-by-learning meta-algorithm that assists users in automatically selecting the best strategy on the fly.
Furthermore, the package provides a unified interface for implementing more strategies, models and application-specific labelers.
The package is open-source on Github, and can be easily installed from Python Package Index repository.
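The pool-based loop that packages like libact organize (query strategy, labeler, model) can be illustrated without the library itself. The sketch below is a generic uncertainty-sampling loop in numpy, not libact's actual API; the data, seed set, and query budget are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Unlabeled pool: two Gaussian blobs; y_true plays the role of the oracle.
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y_true = np.array([0] * 100 + [1] * 100)

labeled = [0, 50, 100, 150]                  # small labeled seed set

def fit_logreg(X, y, steps=300, lr=0.5):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

for _ in range(10):                          # 10 label queries
    w, b = fit_logreg(X[labeled], y_true[labeled])
    p = 1 / (1 + np.exp(-(X @ w + b)))
    uncertainty = -np.abs(p - 0.5)           # closest to 0.5 = most uncertain
    uncertainty[labeled] = -np.inf           # never re-query a labeled point
    labeled.append(int(np.argmax(uncertainty)))  # ask the oracle for this one

w, b = fit_logreg(X[labeled], y_true[labeled])
acc = (((X @ w + b) > 0).astype(int) == y_true).mean()
print(len(labeled), acc)
```

A meta-algorithm such as active-learning-by-learning would sit above this loop, choosing among several such query strategies based on their observed reward.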
Disparity estimation is a difficult problem in stereo vision because the correspondence technique fails in images with textureless and repetitive regions.
A recent body of work using deep convolutional neural networks (CNNs) overcomes this problem with semantics.
Most CNN implementations use an autoencoder method; stereo images are encoded, merged and finally decoded to predict the disparity map.
In this paper, we present a CNN implementation inspired by dense networks to reduce the number of parameters.
Furthermore, our approach takes into account semantic reasoning in disparity estimation.
Our proposed network, called DenseMapNet, is compact, fast and can be trained end-to-end.
DenseMapNet requires 290k parameters only and runs at 30Hz or faster on color stereo images in full resolution.
Experimental results show that DenseMapNet accuracy is comparable with other significantly bigger CNN-based methods.
We present a new back propagation based training algorithm for discrete-time spiking neural networks (SNN).
Inspired by recent deep learning algorithms on binarized neural networks, binary activation with a straight-through gradient estimator is used to model the leaky integrate-and-fire spiking neuron, overcoming the difficulty of training SNNs using back propagation.
Two SNN training algorithms are proposed: (1) SNN with discontinuous integration, which is suitable for rate-coded input spikes, and (2) SNN with continuous integration, which is more general and can handle input spikes with temporal information.
Neuromorphic hardware designed in 40nm CMOS exploits the spike sparsity and demonstrates high classification accuracy (>98% on MNIST) and low energy (48.4-773 nJ/image).
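The core trick above is treating the spike as a binary activation whose gradient is approximated straight-through. A minimal numpy sketch of a leaky integrate-and-fire neuron with such a surrogate gradient mask follows (the gradient window width of 0.5 and the soft-reset rule are illustrative assumptions):

```python
import numpy as np

def lif_forward(x_seq, v_decay=0.9, v_th=1.0):
    """Leaky integrate-and-fire neuron over T time steps.
    Emits a spike when the membrane potential crosses v_th, then resets."""
    v, spikes, ste_mask = 0.0, [], []
    for x in x_seq:
        v = v_decay * v + x              # leaky integration of the input
        s = 1.0 if v >= v_th else 0.0    # binary (spike) activation
        # Straight-through estimator: in the backward pass the gradient is
        # passed where the potential is within a window of the threshold.
        ste_mask.append(1.0 if abs(v - v_th) < 0.5 else 0.0)
        v -= s * v_th                    # soft reset after a spike
        spikes.append(s)
    return np.array(spikes), np.array(ste_mask)

spikes, mask = lif_forward([0.6, 0.6, 0.1, 0.9, 0.2])
print(spikes)  # [0. 1. 0. 1. 0.]
```

During training, the loss gradient with respect to each spike would be multiplied by `ste_mask`, which is what lets standard back-propagation flow through the otherwise non-differentiable spike.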
This paper describes the Pressure Ulcers Online Website, which is a first step solution towards a new and innovative platform for helping people to detect, understand and manage pressure ulcers.
It outlines the reasons why the project has been developed and provides a central point of contact for pressure ulcer analysis and ongoing research.
Using state-of-the-art technologies in convolutional neural networks and transfer learning along with end-to-end web technologies, this platform allows pressure ulcers to be analysed and findings to be reported.
As the system evolves through collaborative partnerships, future versions will provide decision support functions to describe the complex characteristics of pressure ulcers along with information on wound care across multiple user boundaries.
This project is therefore intended to raise awareness and support for people suffering with or providing care for pressure ulcers.
Despite being a relatively new communication technology, Low-Power Wide Area Networks (LPWANs) have shown their suitability to empower a major part of Internet of Things applications.
Nonetheless, most LPWAN solutions are built on star topology (or single-hop) networks, often causing lifetime shortening in stations located far from the gateway.
In this respect, recent studies show that multi-hop routing for uplink communications can reduce LPWANs' energy consumption significantly.
However, it is a troublesome task to identify such energetically optimal routings through trial-and-error brute-force approaches because of time and, especially, energy consumption constraints.
In this work we show the benefits of facing this exploration/exploitation problem by running centralized variations of the multi-armed bandit epsilon-greedy algorithm, a well-known online decision-making method that combines best-known-action selection with knowledge expansion.
Important energy savings are achieved when proper randomness parameters are set, which are often improved when conveniently applying similarity, a concept introduced in this work that allows harnessing the gathered knowledge by sporadically selecting unexplored routing combinations akin to the best known one.
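The epsilon-greedy core of the approach can be sketched as a bandit over candidate routings, where each "arm" is a routing combination and the reward is (negative) energy consumption. The costs, noise level, and epsilon below are illustrative assumptions, and the similarity-based refinement is omitted:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical mean energy cost (lower is better) of 5 candidate routings;
# the learner only sees noisy per-round observations.
true_cost = np.array([5.0, 3.2, 4.1, 2.5, 6.0])

n_arms, epsilon = len(true_cost), 0.1
counts = np.zeros(n_arms)
est = np.zeros(n_arms)                 # optimistic init: untried arms look cheap

for t in range(2000):
    if rng.random() < epsilon:
        arm = int(rng.integers(n_arms))            # explore a random routing
    else:
        arm = int(np.argmin(est))                  # exploit best known routing
    cost = true_cost[arm] + rng.normal(0, 0.5)     # noisy energy reading
    counts[arm] += 1
    est[arm] += (cost - est[arm]) / counts[arm]    # running-mean update

print(int(np.argmin(est)))  # best routing found
```

The similarity idea from the abstract would bias the exploration step toward unexplored routings close to the current best one, rather than sampling uniformly.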
Frequent itemset mining leads to the discovery of associations and correlations among items in large transactional databases.
Apriori is a classical frequent itemset mining algorithm, which employs iterative passes over the database, combined with generation of candidate itemsets based on the frequent itemsets found in the previous iteration, and pruning of clearly infrequent itemsets.
The Dynamic Itemset Counting (DIC) algorithm is a variation of Apriori, which tries to reduce the number of passes made over a transactional database while keeping the number of itemsets counted in a pass relatively low.
In this paper, we address the problem of accelerating DIC on the Intel Xeon Phi many-core system for the case when the transactional database fits in main memory.
Intel Xeon Phi provides a large number of small compute cores with vector processing units.
The paper presents a parallel implementation of DIC based on OpenMP technology and thread-level parallelism.
We exploit the bit-based internal layout for transactions and itemsets.
This technique reduces the memory space for storing the transactional database, simplifies the support count via logical bitwise operation, and allows for vectorization of such a step.
Experimental evaluation on the platforms of the Intel Xeon CPU and the Intel Xeon Phi coprocessor with large synthetic and real databases showed good performance and scalability of the proposed algorithm.
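The bit-based layout described above reduces support counting to one bitwise AND and one comparison per transaction. A minimal sketch (with a tiny hand-made database as an illustrative assumption) looks like this:

```python
# Each transaction is a bitmask over items; an itemset is contained in a
# transaction iff the AND of their masks equals the itemset's mask.
transactions = [
    0b10110,   # items {1, 2, 4}
    0b10011,   # items {0, 1, 4}
    0b00111,   # items {0, 1, 2}
    0b10101,   # items {0, 2, 4}
]

def support(itemset_mask):
    # One AND + compare per transaction; with a word-packed layout this
    # inner loop is exactly the step that vectorizes on wide SIMD units.
    return sum((t & itemset_mask) == itemset_mask for t in transactions)

print(support(0b00110))   # itemset {1, 2} occurs in transactions 0 and 2
```

Packing many transactions per machine word (rather than one Python int each, as here) is what gives the memory savings and SIMD-friendly support counting on the Xeon Phi.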
This paper considers a cellular system with a full-duplex base station and half-duplex users.
The base station can activate one user in uplink or downlink (half-duplex mode), or two different users one in each direction simultaneously (full-duplex mode).
Simultaneous transmissions in uplink and downlink cause self-interference at the base station and uplink-to-downlink interference at the downlink user.
Although uplink-to-downlink interference is typically treated as noise, it is shown that successive interference decoding and cancellation (SIC mode) can lead to significant improvement in network utility, especially when user distribution is concentrated around a few hotspots.
The proposed temporal fair user scheduling algorithm and corresponding power optimization utilizes full-duplex and SIC modes as well as half-duplex transmissions based on their impact on network utility.
Simulation results reveal that the proposed strategy can achieve up to 95% average cell throughput improvement in typical indoor scenarios with respect to a conventional network in which the base station is half-duplex.
In this paper, we give a distributed joint source channel coding scheme for arbitrary correlated sources for arbitrary point in the Slepian-Wolf rate region, and arbitrary link capacities using LDPC codes.
We consider the Slepian-Wolf setting of two sources and one destination, with one of the sources derived from the other source by some correlation model known at the decoder.
Distributed encoding and separate decoding is used for the two sources.
We also give a distributed source coding scheme when the source correlation has memory to achieve any point in the Slepian-Wolf rate achievable region.
In this setting, we perform separate encoding but joint decoding.
Co-localization is the problem of localizing objects of the same class using only the set of images that contain them.
This is a challenging task because the object detector must be built without negative examples, which could otherwise provide more informative supervision signals.
The main idea of our method is to cluster the feature space of a generically pre-trained CNN, to find a set of CNN features that are consistently and highly activated for an object category, which we call category-consistent CNN features.
Then, we propagate their combined activation map using superpixel geodesic distances for co-localization.
In our first set of experiments, we show that the proposed method achieves state-of-the-art performance on three related benchmarks: PASCAL VOC 2007, PASCAL VOC 2012, and the Object Discovery dataset.
We also show that our method is able to detect and localize truly unseen categories, on six held-out ImageNet categories with accuracy that is significantly higher than previous state-of-the-art.
Our intuitive approach achieves this success without any region proposals or object detectors, and can be based on a CNN that was pre-trained purely on image classification tasks without further fine-tuning.
Designing high performance channel assignment schemes to harness the potential of multi-radio multi-channel deployments in wireless mesh networks (WMNs) is an active research domain.
A pragmatic channel assignment approach strives to maximize network capacity by restraining the endemic interference and mitigating its adverse impact on network performance.
Interference prevalent in WMNs is multi-faceted, radio co-location interference (RCI) being a crucial aspect that is seldom addressed in research endeavors.
In this effort, we propose a set of intelligent channel assignment algorithms, which focus primarily on alleviating the RCI.
These graph theoretic schemes are structurally inspired by the spatio-statistical characteristics of interference.
We present the theoretical design foundations for each of the proposed algorithms, and demonstrate their potential to significantly enhance network capacity in comparison to some well-known existing schemes.
We also demonstrate the adverse impact of radio co-location interference on the network, and the efficacy of the proposed schemes in successfully mitigating it.
The experimental results to validate the proposed theoretical notions were obtained by running an exhaustive set of ns-3 simulations in IEEE 802.11g/n environments.
The Remote-PHY (R-PHY) modular cable network for Data over Cable Service Interface Specification (DOCSIS) service conducts the physical layer processing for the transmissions over the broadcast cable in a remote node.
In contrast, the cloud radio access network (CRAN) for Long-Term Evolution (LTE) cellular wireless services conducts all baseband physical layer processing in a central baseband unit and the remaining physical layer processing steps towards radio frequency (RF) transmission in remote nodes.
Both DOCSIS and LTE are based on Orthogonal Frequency Division Multiplexing (OFDM) physical layer processing.
We propose to unify cable and wireless cellular access networks by utilizing the hybrid fiber-coax (HFC) cable network infrastructure as fiber fronthaul network for cellular wireless services.
For efficient operation of such a unified access network, we propose a novel Remote-FFT (R-FFT) node that conducts the physical layer processing from the Fast-Fourier Transform (FFT) module towards the RF transmission, whereby DOCSIS and LTE share a common FFT module.
The frequency domain in-phase and quadrature (I/Q) symbols for both DOCSIS and LTE are transmitted over the fiber between remote node and cable headend, where the remaining physical layer processing is conducted.
We further propose to cache repetitive quadrature amplitude modulation (QAM) symbols in the R-FFT node to reduce the fronthaul bitrate requirements and enable statistical multiplexing.
We evaluate the fronthaul bitrate reductions achieved by R-FFT node caching, the fronthaul transmission bitrates arising from the unified DOCSIS and LTE service, and illustrate the delay implications of moving part of the cable R-PHY remote node physical layer processing to the headend.
Automated brain lesions detection is an important and very challenging clinical diagnostic task because the lesions have different sizes, shapes, contrasts, and locations.
Deep learning has recently shown promising progress in many application fields, which motivates us to apply this technology to such an important problem.
In this paper, we propose a novel and end-to-end trainable approach for brain lesions classification and detection by using deep Convolutional Neural Network (CNN).
In order to investigate its applicability, we applied our approach to several brain diseases, including high- and low-grade glioma tumors, ischemic stroke, and Alzheimer's disease, using brain Magnetic Resonance Images (MRI) as the input for the analysis.
We propose a new operating unit which receives features from several projections of a subset of units of the bottom layer and computes a normalized l2-norm for the next layer.
We evaluated the proposed approach on two different CNN architectures and a number of popular benchmark datasets.
The experimental results demonstrate the superior ability of the proposed approach.
Fixed-point optimization of deep neural networks plays an important role in hardware based design and low-power implementations.
Many deep neural networks show fairly good performance even with 2- or 3-bit precision when quantized weights are fine-tuned by retraining.
We propose an improved fixed-point optimization algorithm that estimates the quantization step size dynamically during retraining.
In addition, a gradual quantization scheme is also tested, which sequentially applies fixed-point optimizations from high- to low-precision.
The experiments are conducted for feed-forward deep neural networks (FFDNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs).
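The abstract does not spell out the step-size estimation rule, so the sketch below uses a standard alternating (Lloyd-style) least-squares update as a stand-in: quantize with the current step, then re-fit the step to minimize the squared error. This is an illustrative assumption, not the paper's exact algorithm:

```python
import numpy as np

def quantize(w, step, bits=3):
    """Uniform symmetric quantizer: integer levels -m..m scaled by step."""
    m = 2 ** (bits - 1) - 1                      # e.g. levels -3..3 for 3 bits
    return np.clip(np.round(w / step), -m, m) * step

def optimal_step(w, bits=3, iters=20):
    """Alternate quantization and a closed-form least-squares step update,
    mirroring the idea of re-estimating the step size during retraining."""
    step = 2 * np.abs(w).mean() / (2 ** (bits - 1))  # initial guess
    for _ in range(iters):
        q = quantize(w, step, bits) / step           # integer level per weight
        denom = (q ** 2).sum()
        if denom == 0:
            break
        step = (w * q).sum() / denom                 # least-squares refit
    return step

rng = np.random.default_rng(4)
w = rng.normal(0, 0.1, 10000)                        # toy weight tensor
s = optimal_step(w)
err = np.mean((w - quantize(w, s)) ** 2)
print(s, err)
```

The gradual scheme mentioned above would repeat this at decreasing bit widths (e.g. 8 → 4 → 3 → 2 bits), retraining between stages.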
Avionics is a domain where prevention prevails.
Nonetheless, failures occur.
Sometimes they are due to a pilot misreacting, flooded with information.
Sometimes the information itself would be better verified than trusted.
To avoid certain kinds of failure, we propose adding, in the midst of the ARINC664 aircraft data network, a new kind of monitoring.
During the past years, psychological diseases related to unhealthy work environments, such as burnouts, have drawn more and more public attention.
One of the known causes of these affective problems is time pressure.
In order to form a theoretical background for time pressure detection in software repositories, this paper combines interdisciplinary knowledge by analyzing 1270 papers found on Scopus database and containing terms related to time pressure.
By clustering those papers based on their abstract, we show that time pressure has been widely studied across different fields, but relatively little in software engineering.
From a literature review of the most relevant papers, we infer a list of testable hypotheses that we aim to verify in future studies in order to assess the impact of time pressure on software developers' mental health.
Given the increasing number of devices that will be connected to wireless networks with the advent of the Internet of Things (IoT), spectrum scarcity will present a major challenge.
Applying opportunistic spectrum access mechanisms to IoT networks will become increasingly important to address this challenge.
In this paper, we present a cognitive radio network architecture which uses multi-stage online learning techniques for spectrum assignment to devices, with the aim of improving the throughput and energy efficiency of the IoT devices.
In the first stage, we use an AI technique to learn the quality of a user-channel pairing.
The next stage utilizes a non-parametric Bayesian learning algorithm to estimate the Primary User OFF time in each channel.
The third stage augments the Bayesian learner with implicit exploration to accelerate the learning procedure.
The proposed method leads to significant improvement in throughput and energy efficiency of the IoT devices while keeping the interference to the primary users minimal.
We provide comprehensive empirical validation of the method with other learning based approaches.
This paper provides a methodology to study the PHY layer vulnerability of wireless protocols in hostile radio environments.
Our approach is based on testing the vulnerabilities of a system by analyzing the individual subsystems.
By targeting an individual subsystem or a combination of subsystems at a time, we can infer the weakest part and revise it to improve the overall system performance.
We apply our methodology to 4G LTE downlink by considering each control channel as a subsystem.
We also develop open-source software enabling research and education using software-defined radios.
We present experimental results with open-source LTE systems and show how the different subsystems behave under targeted interference.
The analysis for the LTE downlink shows that the synchronization signals (PSS/SSS) are very resilient to interference, whereas the downlink pilots or Cell-Specific Reference signals (CRS) are the most susceptible to a synchronized protocol-aware interferer.
We also analyze the severity of control channel attacks for different LTE configurations.
Our methodology and tools allow rapid evaluation of the PHY layer reliability in harsh signaling environments, which is an asset to improve current standards and develop new robust wireless protocols.
This paper discusses the controllability problem of complex networks.
It is shown that almost any weighted complex network with noise on the strength of communication links is controllable in the sense of Kalman controllability.
The concept of almost controllability is elaborated by both theoretical discussions and experimental verifications.
In this paper, efficient resource allocation for the uplink transmission of wireless powered IoT networks is investigated.
We adopt LoRa technology as an example in the IoT network, but this work remains applicable to other communication technologies.
Allocating limited resources, like spectrum and energy resources, among a massive number of users faces critical challenges.
We consider grouping wireless powered IoT users into available channels first and then investigate power allocation for users grouped in the same channel to improve the network throughput.
Specifically, the user grouping problem is formulated as a many-to-one matching game.
It is achieved by considering IoT users and channels as selfish players which belong to two disjoint sets.
Both selfish players focus on maximizing their own utilities.
Then we propose an efficient channel allocation algorithm (ECAA) with low complexity for user grouping.
Additionally, a Markov Decision Process (MDP) is used to model unpredictable energy arrival and channel conditions uncertainty at each user, and a power allocation algorithm is proposed to maximize the accumulative network throughput over a finite-horizon of time slots.
By doing so, we can distribute the channel access and dynamic power allocation local to IoT users.
Numerical results demonstrate that our proposed ECAA achieves near-optimal performance and is superior to random channel assignment, while having much lower computational complexity.
Moreover, simulations show that the distributed power allocation policy for each user is obtained with better performance than a centralized offline scheme.
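A many-to-one matching between selfish users and channels, as formulated above, can be illustrated with the standard deferred-acceptance procedure (users propose, channels with a fixed quota evict their least-preferred occupant). The utilities and quota below are illustrative assumptions, and ECAA's specific rules are not reproduced:

```python
import numpy as np

rng = np.random.default_rng(5)
n_users, n_channels, quota = 6, 2, 3

u_pref = rng.random((n_users, n_channels))   # user's utility per channel
c_pref = rng.random((n_channels, n_users))   # channel's utility per user

unmatched = list(range(n_users))
proposed = [set() for _ in range(n_users)]   # channels each user has tried
matched = {c: [] for c in range(n_channels)}

while unmatched:
    u = unmatched.pop(0)
    remaining = [c for c in range(n_channels) if c not in proposed[u]]
    if not remaining:
        continue                             # user has exhausted all channels
    c = max(remaining, key=lambda ch: u_pref[u, ch])  # propose to favorite
    proposed[u].add(c)
    matched[c].append(u)
    if len(matched[c]) > quota:              # over quota: evict worst user
        worst = min(matched[c], key=lambda v: c_pref[c, v])
        matched[c].remove(worst)
        unmatched.append(worst)

print(matched)
```

The resulting assignment is stable in the matching-game sense: no user-channel pair would both prefer to deviate from it, which is the property such grouping algorithms target.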
The CFR+ algorithm for solving imperfect information games is a variant of the popular CFR algorithm, with faster empirical performance on a range of problems.
It was introduced with a theoretical upper bound on solution error, but subsequent work showed an error in one step of the proof.
We provide updated proofs to recover the original bound.
The Internet of Things (IoT) represents a comprehensive environment that consists of a large number of smart devices interconnecting heterogeneous physical objects to the Internet.
Many domains such as logistics, manufacturing, agriculture, urban computing, home automation, ambient assisted living and various ubiquitous computing applications have utilised IoT technologies.
Meanwhile, Business Process Management Systems (BPMS) have become a successful and efficient solution for coordinated management and optimised utilisation of resources/entities.
However, existing BPMS have not considered many of the issues they will face in managing large-scale connected heterogeneous IoT entities.
Without fully understanding the behaviour, capability and state of the IoT entities, the BPMS can fail to manage the IoT integrated information systems.
In this paper, we analyse existing BPMS for IoT and identify the limitations and their drawbacks based on Mobile Cloud Computing perspective.
Later, we discuss a number of open challenges in BPMS for IoT.
Deep learning hyper-parameter optimization is a tough task.
Finding an appropriate network configuration is key to success; however, most of the time this labor is done roughly.
In this work we introduce a novel library to tackle this problem, the Deep Learning Optimization Library: DLOPT.
We briefly describe its architecture and present a set of use examples.
This is an open source project developed under the GNU GPL v3 license and it is freely available at https://github.com/acamero/dlopt
This is the preprint version of our paper on 2015 International Conference on Virtual Rehabilitation (ICVR2015).
The purpose of this work is to design and implement rehabilitation software for dysphonic patients.
Constant training is a key factor for this type of therapy.
The patient can play the game as well as conduct the voice training simultaneously guided by therapists at clinic or exercise independently at home.
The voice information can be recorded and extracted for evaluating the long-time rehabilitation progress.
Data management has always been a multi-domain problem even in the simplest cases.
It involves quality of service, security, resource management, cost management, incident identification, and disaster avoidance and/or recovery, as well as many other concerns.
In our case, this situation gets ever more complicated because of the divergent nature of a cloud federation like BASMATI.
In this federation, the BASMATI Unified Data Management Framework (BUDaMaF), tries to create an automated uniform way of managing all the data transactions, as well as the data stores themselves, in a polyglot multi-cloud, consisting of a plethora of different machines and data store systems.
Classification of social media data is an important approach in understanding user behavior on the Web.
Although information on social media can be of different modalities such as texts, images, audio or videos, traditional approaches in classification usually leverage only one prominent modality.
Techniques that are able to leverage multiple modalities are often complex and susceptible to the absence of some modalities.
In this paper, we present simple models that combine information from different modalities to classify social media content and are able to handle the above problems with existing techniques.
Our models combine information from different modalities using a pooling layer and an auxiliary learning task is used to learn a common feature space.
We demonstrate the performance of our models and their robustness to missing modalities in the emotion classification domain.
Our approaches, although being simple, can not only achieve significantly higher accuracies than traditional fusion approaches but also have comparable results when only one modality is available.
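The pooling-based fusion described above is what makes the models robust to absent modalities: embeddings mapped into a common feature space are simply averaged over whichever modalities are present. A minimal numpy sketch (the embeddings are illustrative; the auxiliary learning task is not shown):

```python
import numpy as np

def fuse(modalities):
    """Mean-pool per-modality embeddings that live in a common feature
    space; modalities given as None are skipped, so the fused vector
    degrades gracefully when a modality is missing."""
    present = [m for m in modalities if m is not None]
    return np.mean(present, axis=0)

text_emb  = np.array([0.2, 0.8, 0.1, 0.5])   # hypothetical text embedding
image_emb = np.array([0.6, 0.4, 0.3, 0.1])   # hypothetical image embedding

both = fuse([text_emb, image_emb])
text_only = fuse([text_emb, None])           # image modality missing
print(both, text_only)
```

Because the fused vector has the same dimensionality regardless of how many modalities are present, the downstream classifier needs no change when one input is missing.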
Distributed parameter estimation for large-scale systems is an active research problem.
The goal is to derive a distributed algorithm in which each agent obtains a local estimate of its own subset of the global parameter vector, based on local measurements as well as information received from its neighbours.
A recent algorithm has been proposed, which yields the optimal solution (i.e., the one that would be obtained using a centralized method) in finite time, provided the communication network forms an acyclic graph.
If, instead, the graph is cyclic, the only available alternative algorithm that achieves the optimal solution is based on iterative matrix inversion and does so only asymptotically.
However, it is also known that, in the cyclic case, the algorithm designed for acyclic graphs produces a solution which, although non optimal, is highly accurate.
In this paper we do a theoretical study of the accuracy of this algorithm, in communication networks forming cyclic graphs.
To this end, we provide bounds for the sub-optimality of the estimation error and the estimation error covariance, for a class of systems whose topological sparsity and signal-to-noise ratio satisfy certain conditions.
Our results show that, at each node, the accuracy improves exponentially with the so-called loop-free depth.
Also, although the algorithm no longer converges in finite time in the case of cyclic graphs, simulation results show that the convergence is significantly faster than that of methods based on iterative matrix inversion.
Our results suggest that, depending on the loop-free depth, the studied algorithm may be the preferred option even in applications with cyclic communication graphs.
Cyber data attacks are the worst-case interacting bad data against power system state estimation and cannot be detected by existing bad data detectors.
In this paper, we for the first time analyze the likelihood of cyber data attacks by characterizing the actions of a malicious intruder.
We propose to use Markov decision process to model an intruder's strategy, where the objective is to maximize the cumulative reward across time.
Linear programming method is employed to find the optimal attack policy from the intruder's perspective.
Numerical experiments are conducted to study the intruder's attack strategy in test power systems.
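The intruder model above can be made concrete with a toy MDP. The abstract solves for the optimal policy via linear programming; the sketch below uses value iteration instead, which for small state spaces yields the same optimal policy. All states, actions, and rewards are invented for illustration and are not from the paper.

```python
# Hypothetical toy MDP for an intruder; value iteration stands in for the
# paper's linear-programming formulation (both solve the same problem).
GAMMA = 0.9  # discount factor for the cumulative reward

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "idle": {
        "probe": [(1.0, "probed", 0.0)],
        "wait":  [(1.0, "idle", 0.0)],
    },
    "probed": {
        "attack": [(0.7, "success", 10.0), (0.3, "idle", -5.0)],
        "wait":   [(1.0, "probed", 0.0)],
    },
    "success": {
        "wait": [(1.0, "idle", 0.0)],
    },
}

def value_iteration(transitions, gamma=GAMMA, tol=1e-8):
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # greedy policy with respect to the converged value function
    policy = {
        s: max(actions, key=lambda a: sum(p * (r + gamma * V[s2])
                                          for p, s2, r in actions[a]))
        for s, actions in transitions.items()
    }
    return V, policy

V, policy = value_iteration(transitions)
print(policy["probed"])  # "attack": the expected reward outweighs waiting
```

Here the optimal policy attacks once a probe has succeeded, since the discounted expected reward of attacking exceeds that of waiting indefinitely.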
Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing.
In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks.
We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors.
Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed.
Non-rigorous predictions for the optimal errors existed for special cases of GLMs, e.g. for the perceptron, in the field of statistical physics based on the so-called replica method.
Our present paper rigorously establishes those decades old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm.
Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.
We believe that this random version of GLMs can serve as a challenging benchmark for multi-purpose algorithms.
This paper is divided in two parts that can be read independently: The first part (main part) presents the model and main results, discusses some applications and sketches the main ideas of the proof.
The second part (supplementary information) is much more detailed and provides more examples as well as all the proofs.
Did the demise of the Soviet Union in 1991 influence the scientific performance of the researchers in Eastern European countries?
Did this historical event affect international collaboration by researchers from the Eastern European countries with those of Western countries?
Did it also change international collaboration among researchers from the Eastern European countries?
Trying to answer these questions, this study aims to shed light on international collaboration by researchers from the Eastern European countries (Russia, Ukraine, Belarus, Moldova, Bulgaria, the Czech Republic, Hungary, Poland, Romania and Slovakia).
The number of publications and normalized citation impact values are compared for these countries based on InCites (Thomson Reuters), from 1981 up to 2011.
The international collaboration by researchers affiliated to institutions in Eastern European countries at the time points of 1990, 2000 and 2011 was studied with the help of Pajek and VOSviewer software, based on data from the Science Citation Index (Thomson Reuters).
Our results show that the breakdown of the communist regime did not lead, on average, to a huge improvement in the publication performance of the Eastern European countries and that the increase in international co-authorship relations by the researchers affiliated to institutions in these countries was smaller than expected.
Most of the Eastern European countries are still subject to changes and are still awaiting their boost in scientific development.
In many image processing applications, such as segmentation and classification, the selection of robust features descriptors is crucial to improve the discrimination capabilities in real world scenarios.
In particular, it is well known that image textures constitute powerful visual cues for feature extraction and classification.
In the past few years, the local binary pattern (LBP) approach, a texture descriptor method proposed by Ojala et al., has gained increased acceptance due to its computational simplicity and, more importantly, for encoding a powerful signature for describing textures.
However, the original algorithm presents some limitations such as noise sensitivity and its lack of rotational invariance which have led to many proposals or extensions in order to overcome such limitations.
In this paper we performed a quantitative study of Ojala's original LBP proposal together with other recently proposed LBP extensions in the presence of rotation, illumination and noise changes.
In the experiments we have considered two different databases: Brodatz and CUReT for different sizes of LBP masks.
Experimental results demonstrated the effectiveness and robustness of the described texture descriptors for images that are subjected to geometric or radiometric changes.
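The basic 3x3 LBP operator studied above can be sketched in a few lines: each of the 8 neighbours is thresholded against the centre pixel and the resulting bits are packed into one byte, whose histogram over the image forms the texture signature. The tiny image and the neighbour ordering below are illustrative choices, not the paper's exact setup.

```python
# Minimal sketch of the original LBP descriptor (Ojala et al.).
def lbp_code(img, y, x):
    """8-bit LBP code for the pixel at (y, x) of a 2-D grayscale image."""
    center = img[y][x]
    # clockwise neighbour offsets, starting at the top-left pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels: the texture signature."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist

img = [[10, 10, 10],
       [10, 20, 10],
       [10, 10, 10]]
print(lbp_code(img, 1, 1))  # all neighbours below the centre -> code 0
```

The noise sensitivity criticized in the abstract is visible here: a one-level change in a single neighbour flips a bit of the code, which motivates the extensions the paper evaluates.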
The paper deals with the modelling and control of a double winded induction generator.
The controlled process is an induction generator with distinct excitation winding.
A load (an electrical consumer) is connected at the generator terminal.
We present the results obtained by using a minimum variance adaptive control system.
The main goal of the control structure is to keep the generator output (terminal voltage) constant by controlling the excitation voltage of the distinct winding.
The study cases in the paper validate the reduced-order model of the induction generator (a 5th-order model) used to design the adaptive controller.
The control structure is also validated, considering variations of the mechanical torque.
Deep neural networks have achieved increasingly accurate results on a wide variety of complex tasks.
However, much of this improvement is due to the growing use and availability of computational resources (e.g., GPUs, more layers, more parameters, etc.).
Most state-of-the-art deep networks, despite performing well, are over-parameterized and take a significant amount of time to train.
With increased focus on deploying deep neural networks on resource constrained devices like smart phones, there has been a push to evaluate why these models are so resource hungry and how they can be made more efficient.
This work evaluates and compares three distinct methods for deep model compression and acceleration: weight pruning, low rank factorization, and knowledge distillation.
Comparisons on VGG nets trained on CIFAR10 show that each of the methods is effective on its own, but that the true power lies in combining them.
We show that by combining pruning and knowledge distillation methods we can create a compressed network 85 times smaller than the original, all while retaining 96% of the original model's accuracy.
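One of the three compared methods, magnitude-based weight pruning, can be sketched directly: weights whose absolute value falls below a percentile threshold are zeroed out. The flat weight list and the sparsity level below are illustrative; in practice this is applied layer-wise to a trained network's tensors, usually followed by fine-tuning.

```python
# Minimal sketch of magnitude-based weight pruning on a flat weight list.
def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else float("-inf")
    return [0.0 if abs(w) <= threshold else w for w in weights]

weights = [0.01, -0.5, 0.03, 0.9, -0.02, 0.4, -0.05, 0.7]
pruned = prune_by_magnitude(weights, 0.5)
print(pruned)  # the four smallest-magnitude weights are zeroed
```

Zeroed weights can then be stored in sparse form, which is one source of the 85x size reduction reported when pruning is combined with knowledge distillation.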
Many network optimization problems can be formulated as stochastic network design problems in which edges are present or absent stochastically.
Furthermore, protective actions can guarantee that edges will remain present.
We consider the problem of finding the optimal protection strategy under a budget limit in order to maximize some connectivity measurements of the network.
Previous approaches rely on the assumption that edges are independent.
In this paper, we consider a more realistic setting where multiple edges are not independent due to natural disasters or regional events that make the states of multiple edges stochastically correlated.
We use Markov Random Fields to model the correlation and define a new stochastic network design framework.
We provide a novel algorithm based on Sample Average Approximation (SAA) coupled with a Gibbs or XOR sampler.
The experimental results on real road network data show that the policies produced by SAA with the XOR sampler have higher quality and lower variance compared to SAA with Gibbs sampler.
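The Sample Average Approximation step can be illustrated on a toy graph: sample edge realizations, evaluate s-t connectivity with union-find, and compare candidate protection sets. Edges are sampled independently here purely for brevity; the paper's contribution is precisely to replace this independence assumption with correlated samples drawn from a Markov Random Field via Gibbs or XOR sampling.

```python
# Minimal SAA sketch for stochastic network design (independent edges only;
# the paper handles correlated edges, which this toy example does not).
import random

def connected(n, edges, s, t):
    """Union-find check that s and t are in the same component."""
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    for u, v in edges:
        parent[find(u)] = find(v)
    return find(s) == find(t)

def saa_connectivity(n, edge_probs, protected, s, t, samples=2000, seed=0):
    """Estimated probability that s and t stay connected; protected edges
    are guaranteed to be present."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        present = [e for e, p in edge_probs.items()
                   if e in protected or rng.random() < p]
        hits += connected(n, present, s, t)
    return hits / samples

# path graph 0 -- 1 -- 2, each edge survives with probability 0.5
edge_probs = {(0, 1): 0.5, (1, 2): 0.5}
print(saa_connectivity(3, edge_probs, set(), 0, 2))     # near 0.25
print(saa_connectivity(3, edge_probs, {(0, 1)}, 0, 2))  # near 0.5
```

Enumerating candidate protection sets under the budget and keeping the one with the best sampled estimate is the SAA policy search the abstract refers to.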
In this paper, we report on our efforts for using Deep Learning for classifying artifacts and their features in digital visuals as a part of the Neoclassica framework.
It was conceived to provide scholars with new methods for analyzing and classifying artifacts and aesthetic forms from the era of Classicism.
The framework accommodates both traditional knowledge representation as a formal ontology and data-driven knowledge discovery, where cultural patterns will be identified by means of algorithms in statistical analysis and machine learning.
We created a Deep Learning approach trained on photographs to classify the objects inside these photographs.
As a next step, we will apply a different Deep Learning approach, one capable of locating multiple objects inside an image and classifying them with high accuracy.
Fence instructions are fundamental primitives that ensure consistency in a weakly consistent shared memory multi-core processor.
The execution cost of these instructions is significant and adds a non-trivial overhead to parallel programs.
In a naive architecture implementation, the ordering constraints imposed by a fence are tracked via its entry in the reorder buffer, and its execution overhead entails stalling the processor's pipeline until the store buffer is drained, as well as conservatively invalidating speculative loads.
These actions create a cascading effect of increased overhead on the execution of the following instructions in the program.
We find these actions to be overly restrictive and that they can be further relaxed thereby allowing aggressive optimizations.
The current work proposes a lightweight mechanism in which we assign ordering tags, called versions, to load and store instructions when they reside in the load/store queues and the write buffer.
The version assigned to a memory access allows us to fully exploit the relaxation allowed by the weak consistency model and restricts its execution in such a way that the ordering constraints by the model are satisfied.
We utilize the information captured through the assigned versions to reduce stalls caused by waiting for the store buffer to drain and to avoid unnecessary squashing of speculative loads, thereby minimizing the re-execution penalty.
This method is particularly effective for the release consistency model that employs uni-directional fence instructions.
We show that this mechanism reduces the ordering instruction latency by 39.6% and improves program performance by 11% on average over the baseline implementation.
Human nonverbal emotional communication in dyadic dialogs is a process of mutual influence and adaptation.
Identifying the direction of influence, or the cause-effect relation between participants, is a challenging task due to two main obstacles.
First, distinct emotions might not be clearly visible.
Second, the participants' cause-effect relation is transient and varies over time.
In this paper, we address these difficulties by using facial expressions that can be present even when strong distinct facial emotions are not visible.
We also propose to apply a relevant interval selection approach prior to causal inference to identify those transient intervals where the adaptation process occurs.
To identify the direction of influence, we apply the concept of Granger causality to the time series of facial expressions on the set of relevant intervals.
We tested our approach on synthetic data and then applied it to newly obtained experimental data.
Here, we were able to show that a more sensitive facial expression detection algorithm combined with a relevant interval detection approach is the most promising way to reveal the cause-effect pattern of dyadic communication in various instructed interaction conditions.
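The Granger causality idea used above can be sketched with a lag-1 test: x "Granger-causes" y if adding the lagged x to an autoregressive model of y appreciably reduces the residual sum of squares. The closed-form least-squares fit and synthetic series below are illustrative; a real analysis would select lag orders and apply a formal significance test.

```python
# Minimal lag-1 Granger causality sketch via closed-form least squares.
import random

def rss_restricted(y):
    """RSS of y_t = a*y_{t-1}, fitted by least squares."""
    num = sum(y[t - 1] * y[t] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y))) or 1.0
    a = num / den
    return sum((y[t] - a * y[t - 1]) ** 2 for t in range(1, len(y)))

def rss_full(y, x):
    """RSS of y_t = a*y_{t-1} + b*x_{t-1}, via 2x2 normal equations."""
    s_yy = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    s_xx = sum(x[t - 1] ** 2 for t in range(1, len(y)))
    s_xy = sum(x[t - 1] * y[t - 1] for t in range(1, len(y)))
    r_y = sum(y[t - 1] * y[t] for t in range(1, len(y)))
    r_x = sum(x[t - 1] * y[t] for t in range(1, len(y)))
    det = s_yy * s_xx - s_xy ** 2 or 1.0
    a = (r_y * s_xx - r_x * s_xy) / det
    b = (r_x * s_yy - r_y * s_xy) / det
    return sum((y[t] - a * y[t - 1] - b * x[t - 1]) ** 2
               for t in range(1, len(y)))

rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(500)]
# y is driven by the previous value of x, so x should Granger-cause y
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 500)]
print(rss_full(y, x) < 0.5 * rss_restricted(y))  # True: large RSS reduction
```

Applying such a comparison only on the selected relevant intervals, rather than the full recording, is what the interval selection step above enables.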
Automated Computer Aided diagnostic tools can be used for the early detection of glaucoma to prevent irreversible vision loss.
In this work, we present a Multi-task Convolutional Neural Network (CNN) that jointly segments the Optic Disc (OD), Optic Cup (OC) and predicts the presence of glaucoma in color fundus images.
The CNN utilizes a combination of image appearance features and structural features obtained from the OD-OC segmentation to obtain a robust prediction.
The use of fewer network parameters and the sharing of the CNN features for multiple related tasks ensures the good generalizability of the architecture, allowing it to be trained on small training sets.
The cross-testing performance of the proposed method on an independent validation set, acquired using a different camera and image resolution, was found to be good, with an average dice score of 0.92 for OD, 0.84 for OC, and an AUC of 0.95 on the task of glaucoma classification, illustrating its potential as a mass screening tool for the early detection of glaucoma.
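The dice scores reported above measure segmentation overlap between a predicted mask and the ground truth. A minimal sketch of the metric on flat binary masks (the masks here are illustrative, not the paper's data):

```python
# Dice coefficient: 2|A ∩ B| / (|A| + |B|) on flat binary masks.
def dice(mask_a, mask_b):
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2.0 * inter / total if total else 1.0

pred  = [1, 1, 1, 0, 0, 0]  # predicted optic disc pixels
truth = [1, 1, 0, 0, 0, 0]  # ground-truth optic disc pixels
print(dice(pred, truth))  # 0.8
```

A score of 0.92 for the optic disc therefore indicates near-complete overlap with expert annotations.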
During the last few years, there has been plenty of research on reducing energy consumption in telecommunication infrastructure.
However, many of the proposals remain unimplemented due to the lack of flexibility in legacy networks.
In this paper we demonstrate how the software defined networking (SDN) capabilities of current networking equipment can be used to implement some of these energy saving algorithms.
In particular, we developed an ONOS application that realizes an energy-aware traffic scheduler for a bundle link made up of Energy Efficient Ethernet (EEE) links between two SDN switches.
We show how our application is able to dynamically adapt to the traffic characteristics and save energy by concentrating the traffic on as few ports as possible.
This way, unused ports remain in Low Power Idle (LPI) state most of the time, saving energy.
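The consolidation idea behind the scheduler can be sketched as a bin-packing assignment: place the offered traffic on as few ports of the bundle as possible so the remaining ports can stay in LPI. The flow rates and port capacity below are invented for illustration; the real ONOS application works on live traffic statistics.

```python
# Minimal first-fit-decreasing sketch of energy-aware traffic consolidation.
def consolidate(flows, port_capacity, n_ports):
    """Assign flow rates to ports, filling the lowest-numbered ports first."""
    loads = [0.0] * n_ports
    assignment = {}
    for flow, rate in sorted(flows.items(), key=lambda kv: -kv[1]):
        for port, load in enumerate(loads):
            if load + rate <= port_capacity:
                loads[port] += rate
                assignment[flow] = port
                break
        else:
            raise ValueError("bundle capacity exceeded")
    active = sum(1 for load in loads if load > 0)
    return assignment, active

flows = {"f1": 400, "f2": 300, "f3": 200, "f4": 50}  # Mb/s, hypothetical
assignment, active = consolidate(flows, port_capacity=1000, n_ports=4)
print(active)  # 1: a single port carries all traffic, three can sleep in LPI
```

Re-running the assignment as traffic changes is what lets the scheduler adapt dynamically while keeping unused ports idle.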
Data-driven saliency has recently gained a lot of attention thanks to the use of Convolutional Neural Networks for predicting gaze fixations.
In this paper we go beyond standard approaches to saliency prediction, in which gaze maps are computed with a feed-forward network, and present a novel model which can predict accurate saliency maps by incorporating neural attentive mechanisms.
The core of our solution is a Convolutional LSTM that focuses on the most salient regions of the input image to iteratively refine the predicted saliency map.
Additionally, to tackle the center bias typical of human eye fixations, our model can learn a set of prior maps generated with Gaussian functions.
We show, through an extensive evaluation, that the proposed architecture outperforms the current state of the art on public saliency prediction datasets.
We further study the contribution of each key component to demonstrate their robustness in different scenarios.
This paper presents a real-time face recognition system using a Kinect sensor.
The algorithm is implemented on a GPU using OpenCL, and significant speed improvements are observed.
We use the Kinect depth image to increase the robustness and reduce the computational cost of conventional LBP-based face recognition.
The main objective of this paper is to perform robust, high-speed, fusion-based face recognition and tracking.
The algorithm is composed of three steps.
The first step is to detect all faces in the video using the Viola-Jones algorithm.
The second step is online database generation using a tracking window on the face.
A modified LBP feature vector is calculated on the GPU using fused information from the depth and grayscale images.
This feature vector is used to train an SVM classifier.
The third step involves recognition of multiple faces based on our modified feature vector.
Region-based memory management (RBMM) is a form of compile time memory management, well-known from the functional programming world.
In this paper we describe our work on implementing RBMM for the logic programming language Mercury.
One interesting point about Mercury is that it is designed with strong type, mode, and determinism systems.
These systems not only provide Mercury programmers with several direct software engineering benefits, such as self-documenting code and clear program logic, but also give language implementors a large amount of information that is useful for program analyses.
In this work, we make use of this information to develop program analyses that determine the distribution of data into regions and transform Mercury programs by inserting into them the necessary region operations.
We prove the correctness of our program analyses and transformation.
To execute the annotated programs, we have implemented runtime support that tackles the two main challenges posed by backtracking.
First, backtracking can require regions removed during forward execution to be "resurrected"; and second, any memory allocated during a computation that has been backtracked over must be recovered promptly and without waiting for the regions involved to come to the end of their life.
We describe in detail our solution of both these problems.
We study in detail how our RBMM system performs on a selection of benchmark programs, including some well-known difficult cases for RBMM.
Even with these difficult cases, our RBMM-enabled Mercury system obtains clearly faster runtimes for 15 out of 18 benchmarks compared to the base Mercury system with its Boehm runtime garbage collector, with an average runtime speedup of 24%, and an average reduction in memory requirements of 95%.
In fact, our system achieves optimal memory consumption in some programs.
We study one-head machines through symbolic and topological dynamics.
In particular, a subshift is associated to the machine, and we are interested in its complexity in terms of real-time recognition.
We emphasize the class of one-head machines whose subshift can be recognized by a deterministic pushdown automaton.
We prove that this class corresponds to particular restrictions on the head movement, and to equicontinuity in associated dynamical systems.
Vehicle safety depends on (a) the range of identified hazards and (b) the operational situations for which mitigations of these hazards are acceptably decreasing risk.
Moreover, with an increasing degree of autonomy, risk ownership is likely to shift towards vendors, up to the point of regulatory certification.
Hence, highly automated vehicles have to be equipped with verified controllers capable of reliably identifying and mitigating hazards in all possible operational situations.
To this end, available methods for the design and verification of automated vehicle controllers have to be supported by models for hazard analysis and mitigation.
In this paper, we describe (1) a framework for the analysis and design of planners (i.e., high-level controllers) capable of run-time hazard identification and mitigation, (2) an incremental algorithm for constructing planning models from hazard analysis, and (3) an exemplary application to the design of a fail-operational controller based on a given control system architecture.
Our approach equips the safety engineer with concepts and steps to (2a) elaborate scenarios of endangerment and (2b) design operational strategies for mitigating such scenarios.
Visual segmentation is a key perceptual function that partitions visual space and allows for detection, recognition and discrimination of objects in complex environments.
The processes underlying human segmentation of natural images are still poorly understood.
Existing datasets rely on manual labeling that conflate perceptual, motor, and cognitive factors.
In part, this is because we lack an ideal observer model of segmentation to guide constrained experiments.
On the other hand, despite recent progress in machine learning, modern algorithms still fall short of human segmentation performance.
Our goal here is two-fold: (i) to propose a model to probe human visual segmentation mechanisms, and (ii) to develop an efficient algorithm for image segmentation.
To this aim, we propose a novel probabilistic generative model of visual segmentation that, for the first time, combines 1) knowledge about the sensitivity of neurons in the visual cortex to statistical regularities in natural images; and 2) non-parametric Bayesian priors over segmentation maps (i.e., partitions of the visual space).
We provide an algorithm for learning and inference, validate it on synthetic data, and illustrate how the two components of our model improve segmentation of natural images.
We then show that the posterior distribution over segmentations captures well the variability across human subjects, indicating that our model provides a viable approach to probe human visual segmentation.
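One standard example of a non-parametric Bayesian prior over partitions, of the kind the model places on segmentation maps, is the Chinese Restaurant Process: it draws a random partition with an unbounded number of segments. This is a generic CRP sketch, not the paper's full generative model.

```python
# Minimal Chinese Restaurant Process sampler over partitions of n items.
import random

def sample_crp_partition(n_items, alpha, rng):
    """Each item joins an existing block with probability proportional to
    the block's size, or opens a new block with weight alpha."""
    labels = []
    counts = []  # size of each block created so far
    for _ in range(n_items):
        weights = counts + [alpha]
        r = rng.random() * sum(weights)
        acc = 0.0
        for block, w in enumerate(weights):
            acc += w
            if r < acc:
                break
        if block == len(counts):
            counts.append(1)      # open a new segment
        else:
            counts[block] += 1
        labels.append(block)
    return labels

rng = random.Random(0)
labels = sample_crp_partition(100, alpha=2.0, rng=rng)
print(len(set(labels)))  # number of segments grows roughly as alpha*log(n)
```

The rich-get-richer behaviour yields a few large segments plus a tail of small ones, a useful inductive bias for the variable number of regions in natural images.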
Natural language generation lies at the core of generative dialogue systems and conversational agents.
We describe an ensemble neural language generator, and present several novel methods for data representation and augmentation that yield improved results in our model.
We test the model on three datasets in the restaurant, TV and laptop domains, and report both objective and subjective evaluations of our best model.
Using a range of automatic metrics, as well as human evaluators, we show that our approach achieves better results than state-of-the-art models on the same datasets.
In many scientific fields, the order of coauthors on a paper conveys information about each individual's contribution to a piece of joint work.
We argue that in prior network analyses of coauthorship networks, the information on ordering has been insufficiently considered because ties between authors are typically symmetrized.
This is basically the same as assuming that each co-author has contributed equally to a paper.
We introduce a solution to this problem by adopting a coauthorship credit allocation model proposed by Kim and Diesner (2014), which in its core conceptualizes co-authoring as a directed, weighted, and self-looped network.
We test and validate our application of the adopted framework on sample data of 861 authors who have published in the journal Psychometrika.
Results suggest that this novel sociometric approach can complement traditional measures based on undirected networks and expand insights into coauthoring patterns such as the hierarchy of collaboration among scholars.
As another form of validation, we also show how our approach accurately detects prominent scholars in the Psychometric Society affiliated with the journal.
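The order-aware idea can be illustrated with harmonic credit counting, used here only as a stand-in (Kim and Diesner's allocation model differs in detail): the i-th listed author receives credit proportional to 1/i, which naturally induces a directed, weighted coauthorship network instead of a symmetric one.

```python
# Illustrative order-aware credit allocation (harmonic counting), NOT the
# exact Kim and Diesner (2014) model; author names are placeholders.
def harmonic_credit(authors):
    """Map each author to a credit share based on byline position."""
    weights = [1.0 / (i + 1) for i in range(len(authors))]
    total = sum(weights)
    return {a: w / total for a, w in zip(authors, weights)}

credit = harmonic_credit(["FirstAuthor", "SecondAuthor", "ThirdAuthor"])
print(credit)  # the first author receives the largest share
```

Summing such shares over an author's papers, and recording who cedes credit to whom, yields the directed weighted network the abstract contrasts with symmetrized ties.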
This paper deals with Low-Density Construction-A (LDA) lattices, which are obtained via Construction A from non-binary Low-Density Parity-Check codes.
More precisely, a proof is provided that Voronoi constellations of LDA lattices achieve the capacity of the AWGN channel under lattice encoding and decoding.
This is obtained after showing the same result for more general Construction-A lattice constellations.
The theoretical analysis is carried out in a way that allows one to describe how the prime number underlying Construction A behaves as a function of the lattice dimension.
Moreover, no dithering is required in the transmission scheme, simplifying some previous solutions of the problem.
Remarkably, capacity is achievable with LDA lattice codes whose parity-check matrices have constant row and column Hamming weights.
Some expansion properties of random bipartite graphs constitute an extremely important tool for dealing with sparse matrices and allow us to find a lower bound on the minimum Euclidean distance of LDA lattices in our ensemble.
Salient object detection is a fundamental problem and has received a great deal of attention in computer vision.
Recently, deep learning models have become powerful tools for image feature extraction.
In this paper, we propose a multi-scale deep neural network (MSDNN) for salient object detection.
The proposed model first extracts global high-level features and context information over the whole source image with recurrent convolutional neural network (RCNN).
Then several stacked deconvolutional layers are adopted to get the multi-scale feature representation and obtain a series of saliency maps.
Finally, we investigate a fusion convolution module (FCM) to build a final pixel level saliency map.
The proposed model is extensively evaluated on four salient object detection benchmark datasets.
Results show that our deep model significantly outperforms 12 other state-of-the-art approaches.
The past few years have seen a surge of interest in the field of probabilistic logic learning and statistical relational learning.
In this endeavor, many probabilistic logics have been developed.
ProbLog is a recent probabilistic extension of Prolog motivated by the mining of large biological networks.
In ProbLog, facts can be labeled with probabilities.
These facts are treated as mutually independent random variables that indicate whether these facts belong to a randomly sampled program.
Different kinds of queries can be posed to ProbLog programs.
We introduce algorithms that allow the efficient execution of these queries, discuss their implementation on top of the YAP-Prolog system, and evaluate their performance in the context of large networks of biological entities.
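ProbLog's distribution semantics can be made concrete: each probabilistic fact is an independent random variable, and the probability of a query is the total probability of the sampled programs in which the query is provable. The sketch below enumerates all worlds for a path query on a toy graph; the edge probabilities are invented, and real ProbLog uses far more efficient inference (the algorithms this abstract introduces) rather than brute-force enumeration.

```python
# Exact query probability under ProbLog-style distribution semantics,
# by brute-force enumeration of the 2^n truth assignments to the facts.
from itertools import product

# probabilistic facts, e.g. 0.8::edge(a,b).
prob_edges = {("a", "b"): 0.8, ("b", "c"): 0.6, ("a", "c"): 0.1}

def path_exists(edges, src, dst):
    """Depth-first reachability check over the sampled edge set."""
    frontier, seen = [src], {src}
    while frontier:
        node = frontier.pop()
        if node == dst:
            return True
        for u, v in edges:
            if u == node and v not in seen:
                seen.add(v)
                frontier.append(v)
    return False

def query_probability(prob_facts, src, dst):
    """Sum P(world) over all worlds in which path(src, dst) succeeds."""
    facts = list(prob_facts)
    total = 0.0
    for world in product([True, False], repeat=len(facts)):
        p = 1.0
        for present, fact in zip(world, facts):
            p *= prob_facts[fact] if present else 1.0 - prob_facts[fact]
        if path_exists([f for f, present in zip(facts, world) if present],
                       src, dst):
            total += p
    return total

print(query_probability(prob_edges, "a", "c"))  # 1 - 0.9*0.52 = 0.532
```

The exponential cost of this enumeration is exactly why efficient query execution matters for the large biological networks that motivated ProbLog.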
Detecting and evaluating regions of brain under various circumstances is one of the most interesting topics in computational neuroscience.
However, the majority of the studies on detecting communities in a functional connectivity network of the brain are done on networks obtained from coherency attributes, not from correlation.
This lack of studies, in part, is due to the fact that many common methods for clustering graphs require the nodes of the network to be `positively' linked together, a property that is guaranteed by a coherency matrix, by definition.
However, correlation matrices reveal more information regarding how each pair of nodes are linked together.
In this study, for the first time we simultaneously examine four inherently different network clustering methods (spectral, heuristic, and optimization methods) applied to the functional connectivity networks of the CA1 region of the hippocampus of an anaesthetized rat during pre-ictal and post-ictal states.
The networks are obtained from correlation matrices, and the results are compared with those obtained by applying the same methods to coherency matrices.
The correlation matrices show a much finer community structure compared to the coherency matrices.
Furthermore, we examine the potential smoothing effect of choosing various window sizes for computing the correlation/coherency matrices.
Latent tree learning models represent sentences by composing their words according to an induced parse tree, all based on a downstream task.
These models often outperform baselines which use (externally provided) syntax trees to drive the composition order.
This work contributes (a) a new latent tree learning model based on shift-reduce parsing, with competitive downstream performance and non-trivial induced trees, and (b) an analysis of the trees learned by our shift-reduce model and by a chart-based model.
There are now several large scale deployments of differential privacy used to collect statistical information about users.
However, these deployments periodically recollect the data and recompute the statistics using algorithms designed for a single use.
As a result, these systems do not provide meaningful privacy guarantees over long time scales.
Moreover, existing techniques to mitigate this effect do not apply in the "local model" of differential privacy that these systems use.
In this paper, we introduce a new technique for local differential privacy that makes it possible to maintain up-to-date statistics over time, with privacy guarantees that degrade only in the number of changes in the underlying distribution rather than the number of collection periods.
We use our technique for tracking a changing statistic in the setting where users are partitioned into an unknown collection of groups, and at every time period each user draws a single bit from a common (but changing) group-specific distribution.
We also provide an application to frequency and heavy-hitter estimation.
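The local-model primitive underlying such deployments is randomized response: each user perturbs their own bit before sending it, and the collector unbiases the aggregate. The sketch below gives the standard single-report guarantee; the paper's contribution is precisely making such guarantees degrade with distribution changes rather than with the number of collection periods, which this toy example does not capture.

```python
# Minimal randomized-response sketch for local differential privacy.
import math
import random

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps/(1+e^eps), else its flip."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep else 1 - bit

def estimate_mean(reports, epsilon):
    """Unbias the noisy reports to estimate the true fraction of 1s."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return (sum(reports) / len(reports) - (1 - keep)) / (2 * keep - 1)

rng = random.Random(0)
true_bits = [1] * 300 + [0] * 700  # true mean = 0.3
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
print(estimate_mean(reports, 1.0))  # close to 0.3
```

Naively repeating this every collection period multiplies the privacy cost by the number of periods, which is the long-time-scale problem the abstract addresses.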
Compositionality of semantic concepts in image synthesis and analysis is appealing as it can help in decomposing known and generatively recomposing unknown data.
For instance, we may learn concepts of changing illumination, geometry or albedo of a scene, and try to recombine them to generate physically meaningful, but unseen data for training and testing.
In practice however we often do not have samples from the joint concept space available: We may have data on illumination change in one data set and on geometric change in another one without complete overlap.
We pose the following question: How can we learn two or more concepts jointly from different data sets with mutual consistency where we do not have samples from the full joint space?
We present a novel answer in this paper based on cyclic consistency over multiple concepts, represented individually by generative adversarial networks (GANs).
Our method, ConceptGAN, can be understood as a drop-in method for data augmentation to improve resilience in real-world applications.
Qualitative and quantitative evaluations demonstrate its efficacy in generating semantically meaningful images, as well as one-shot face verification as an example application.
Authentication and authorization are two key elements of a software application.
In modern day, OAuth 2.0 framework and OpenID Connect protocol are widely adopted standards fulfilling these requirements.
These protocols are implemented into authorization servers.
It is common to refer to these authorization servers as identity servers or identity providers, since they hold user identity information.
Applications registered with an identity provider can use OpenID Connect to retrieve an ID token for authentication.
The access token obtained along with the ID token allows the application to consume OAuth 2.0 protected resources.
In this approach, the client application is bound to a single identity provider.
If the client needs to consume a protected resource from a different domain, which only accepts tokens of a defined identity provider, then the client must again follow OpenID Connect protocol to obtain new tokens.
This requires user identity details to be stored in the second identity provider as well.
This paper proposes an extension to OpenID Connect protocol to overcome this issue.
It proposes a client-centric mechanism to exchange identity information as token grants against a trusted identity provider.
Once a grant is accepted, the resulting token response contains an access token, which is sufficient to access protected resources in the token-issuing identity provider's domain.
In this paper, we present a spectral graph wavelet approach for shape analysis of the carpal bones of the human wrist.
We apply a metric called the global spectral graph wavelet (GSGW) signature to represent the cortical surface of a carpal bone, based on the eigensystem of the Laplace-Beltrami operator.
Furthermore, we propose a heuristic and efficient way of aggregating local descriptors of a carpal bone surface into a global descriptor.
The resultant global descriptor is not only isometry-invariant, but also much more efficient, requiring less memory storage.
We perform experiments on shape of the carpal bones of ten women and ten men from a publicly-available database.
Experimental results show the superiority of the proposed GSGW approach over the recently proposed GPS embedding approach for comparing shapes of the carpal bones across populations.
In the last decade, computer-aided early diagnostics of Alzheimer's Disease (AD) and its prodromal form, Mild Cognitive Impairment (MCI), has been the subject of extensive research.
Some recent studies have shown promising results in the AD and MCI determination using structural and functional Magnetic Resonance Imaging (sMRI, fMRI), Positron Emission Tomography (PET) and Diffusion Tensor Imaging (DTI) modalities.
Furthermore, fusion of imaging modalities in a supervised machine learning framework has shown promising direction of research.
In this paper we first review major trends in automatic classification methods such as feature extraction based methods as well as deep learning approaches in medical image analysis applied to the field of Alzheimer's Disease diagnostics.
Then we propose our own design of a 3D Inception-based Convolutional Neural Network (CNN) for Alzheimer's Disease diagnostics.
The network is designed with an emphasis on the interior resource utilization and uses sMRI and DTI modalities fusion on hippocampal ROI.
The comparison with the conventional AlexNet-based network using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset (http://adni.loni.usc.edu) demonstrates significantly better performance of the proposed 3D Inception-based CNN.
Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge-base, typically Wikipedia.
This task is closely related to word-sense disambiguation (WSD), where the supervised word-expert approach has prevailed.
In this work we present the results of the word-expert approach to NED, where one classifier is built for each target entity mention string.
The resources necessary to build the system, a dictionary and a set of training instances, have been automatically derived from Wikipedia.
We provide empirical evidence of the value of this approach, as well as a study of the differences between WSD and NED, including ambiguity and synonymy statistics.
MeshFace photos have been widely used in many Chinese business organizations to protect ID face photos from being misused.
The occlusions caused by random meshes severely degrade the performance of face verification systems, giving rise to the MeshFace verification problem between MeshFaces and daily photos.
Previous methods cast this problem as a typical low-level vision problem, i.e., blind inpainting: they recover perceptually pleasing clear ID photos from MeshFaces by enforcing pixel-level similarity between the recovered images and the ground-truth clear ID images, and then perform face verification on the results.
Essentially, face verification is conducted on a compact feature space rather than the image pixel space.
Therefore, this paper argues that pixel-level similarity and feature-level similarity jointly hold the key to improving verification performance.
Based on this insight, we propose a novel feature-oriented blind face inpainting framework.
Specifically, we implement this by establishing a novel DeMeshNet, which consists of three parts.
The first part addresses blind inpainting of the MeshFaces by implicitly exploiting extra supervision from the occlusion position to enforce pixel level similarity.
The second part explicitly enforces a feature level similarity in the compact feature space, which can explore informative supervision from the feature space to produce better inpainting results for verification.
The last part copes with face alignment within the net via a customized spatial transformer module when extracting deep facial features.
All three parts are implemented within an end-to-end network that facilitates efficient optimization.
Extensive experiments on two MeshFace datasets demonstrate the effectiveness of the proposed DeMeshNet as well as the insight of this paper.
In this paper we will attempt to classify Lindenmayer systems based on properties of sets of rules and the kind of strings those rules generate.
This classification will be referred to as a parametrization of the L-space: the L-space is the phase space in which all possible L-developments are represented.
This space is infinite, because there is no halting algorithm for L-grammars; but it is also subject to hard constraints, because there are grammars and developments which are not possible states of an L-system: a well-known example is the space of normal grammars.
Just as the space of normal grammars is parametrized into Regular, Context-Free, Context-Sensitive, and Unrestricted (with proper containment relations holding among them; see Chomsky, 1959: Theorem 1), we contend here that the L-space is a very rich landscape of grammars which cluster into kinds that are not mutually translatable.
With the increasing scale of deployment of Internet of Things (IoT), concerns about IoT security have become more urgent.
In particular, memory corruption attacks play a predominant role as they allow remote compromise of IoT devices.
Control-flow integrity (CFI) is a promising and generic defense technique against these attacks.
However, given the nature of IoT deployments, existing protection mechanisms for traditional computing environments (including CFI) need to be adapted to the IoT setting.
In this paper, we describe the challenges of enabling CFI on microcontroller (MCU) based IoT devices.
We then present CaRE, the first interrupt-aware CFI scheme for low-end MCUs.
CaRE uses a novel way of protecting the CFI metadata by leveraging TrustZone-M security extensions introduced in the ARMv8-M architecture.
Its binary instrumentation approach preserves the memory layout of the target MCU software, allowing pre-built bare-metal binary code to be protected by CaRE.
We describe our implementation on a Cortex-M Prototyping System and demonstrate that CaRE is secure while imposing acceptable performance and memory impact.
In the past few years, several case studies have illustrated that the use of occupancy information in buildings leads to energy-efficient and low-cost HVAC operation.
The widely presented techniques for occupancy estimation rely on temperature, humidity, CO2 concentration, image cameras, motion sensors and passive infrared (PIR) sensors.
So far, few studies reported in the literature utilize audio and speech processing as an indoor occupancy prediction technique.
With rapid advances in audio and speech processing technologies, it is now more feasible and attractive to integrate audio-based signal processing components into smart buildings.
In this work, we propose to utilize audio processing techniques (i.e., speaker recognition and background audio energy estimation) to estimate room occupancy (i.e., the number of people inside a room).
Theoretical analysis and simulation results demonstrate the accuracy and effectiveness of this proposed occupancy estimation technique.
Based on the occupancy estimates, smart buildings can adjust their thermostat setpoints and HVAC operations, thereby achieving greater quality of service and drastic cost savings.
For many biological image segmentation tasks, including topological knowledge, such as the nesting of classes, can greatly improve results.
However, most 'out-of-the-box' CNN models are still blind to such prior information.
In this paper, we propose a novel approach to encode this information, through a multi-level activation layer and three compatible losses.
We benchmark all of them on nuclei segmentation in bright-field microscopy cell images from the 2018 Data Science Bowl challenge, offering an exemplary segmentation task with cells and nested subcellular structures.
Our scheme greatly speeds up learning, and outperforms standard multi-class classification with soft-max activation and a previously proposed method stemming from it, improving the Dice score significantly (p-values<0.007).
Our approach is conceptually simple, easy to implement and can be integrated in any CNN architecture.
It can be generalized to a higher number of classes, with or without further relations of containment.
Resilience is widely recognized as an important design goal though it is one that seems to escape a general and consensual understanding.
Often mixed up with other system attributes, traditionally used with different meanings in many different disciplines, and sought or applied through diverse approaches in various application domains, resilience is in fact a multi-attribute property that implies a number of constitutive abilities.
To further complicate the matter, resilience is not an absolute property but rather it is the result of the match between a system, its current condition, and the environment it is set to operate in.
In this paper we discuss this problem and provide a definition of resilience as a property measurable as a system-environment fit.
This brings to the foreground the dynamic nature of resilience as well as its hard dependence on the context.
A major problem is then that, being a dynamic quantity, resilience cannot be assessed in absolute terms.
As a way to partially overcome this obstacle, in this paper we provide a number of indicators of the quality of resilience.
Our focus here is that of collective systems, namely those systems resulting from the union of multiple individual parts, sub-systems, or organs.
Through several examples of such systems we observe how our indicators provide insight, at least in the cases at hand, on design flaws potentially affecting the efficiency of the resilience strategies.
A number of conjectures are finally put forward to associate our indicators with factors affecting the quality of resilience.
Coreference resolution is one of the first stages in deep language understanding and its importance has been well recognized in the natural language processing community.
In this paper, we propose a generative, unsupervised ranking model for entity coreference resolution by introducing resolution mode variables.
Our unsupervised system achieves 58.44% F1 score of the CoNLL metric on the English data from the CoNLL-2012 shared task (Pradhan et al., 2012), outperforming the Stanford deterministic system (Lee et al., 2013) by 3.01%.
In this work we propose a new method for the rhythm classification of short single-lead ECG records, using a set of high-level and clinically meaningful features provided by the abductive interpretation of the records.
These features include morphological and rhythm-related features that are used to build two classifiers: one that evaluates the record globally, using aggregated values for each feature; and another one that evaluates the record as a sequence, using a Recurrent Neural Network fed with the individual features for each detected heartbeat.
The two classifiers are finally combined using the stacking technique, providing an answer by means of four target classes: Normal sinus rhythm, Atrial fibrillation, Other anomaly, and Noisy.
The approach has been validated against the 2017 Physionet/CinC Challenge dataset, obtaining a final score of 0.83 and ranking first in the competition.
A three-point monotone difference scheme is proposed for solving a one-dimensional non-stationary convection-diffusion-reaction equation with variable coefficients.
The scheme is based on a parabolic spline and allows the numerical solution of the boundary value problem to be reproduced over the integration interval as a function that is continuous together with its first derivative.
The constructed difference scheme provides a highly effective tool for solving problems with a small parameter multiplying the highest derivative, over a wide range of input data.
In the test case, numerical and exact solutions of the problem are compared when the convective term of the equation strongly dominates the diffusion term.
Numerous calculations showed the high efficiency of the newly developed monotone scheme.
In this paper, a solution to the problem of Active Authentication using trace histories is addressed.
Specifically, the task is to perform user verification on mobile devices using historical location traces of the user as a function of time.
Considering the movement of a human as a Markovian motion, a modified Hidden Markov Model (HMM)-based solution is proposed.
The proposed method, namely the Marginally Smoothed HMM (MSHMM), utilizes the marginal probabilities of location and timing information of the observations to smooth-out the emission probabilities while training.
Hence, it can efficiently handle unforeseen observations during the test phase.
The verification performance of this method is compared to a sequence matching (SM) method, a Markov Chain-based method (MC) and an HMM with basic Laplace smoothing (HMM-lap).
Experimental results using the location information of the UMD Active Authentication Dataset-02 (UMDAA02) and the GeoLife dataset are presented.
The proposed MSHMM method outperforms the compared methods in terms of equal error rate (EER).
Additionally, the effects of different parameters on the proposed method are discussed.
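The abstract does not spell out the exact smoothing rule, but the idea of blending HMM emission probabilities with marginal observation statistics can be sketched as follows; the mixing weight `lam` and the linear combination rule are illustrative assumptions, not the paper's MSHMM definition.

```python
import numpy as np

def marginal_smooth(emission, state_counts, lam=0.1):
    """Smooth HMM emission probabilities with the marginal observation
    distribution (an illustrative reading of the MSHMM idea; `lam` and
    the convex combination are assumptions).
    `emission` is a (num_states, num_obs) row-stochastic matrix."""
    # Marginal over observations, weighting states by their frequency.
    weights = np.asarray(state_counts, dtype=float)
    weights /= weights.sum()
    marginal = weights @ emission          # shape: (num_obs,)
    smoothed = (1.0 - lam) * emission + lam * marginal
    # Renormalize each row so it stays a probability distribution.
    return smoothed / smoothed.sum(axis=1, keepdims=True)
```

Because every observation receives some marginal mass, unforeseen observations at test time no longer get zero emission probability, which is the behavior the abstract attributes to MSHMM.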
The categorization of emotion names, i.e., the grouping of emotion words that have similar emotional connotations together, is a key tool of Social Psychology used to explore people's knowledge about emotions.
Without exception, the studies following that research line were based on gauging the perceived similarity between emotion names as judged by the participants of the experiments.
Here we propose and examine a new approach to study the categories of emotion names - the similarities between target emotion names are obtained by comparing the contexts in which they appear in texts retrieved from the World Wide Web.
This comparison does not account for any explicit semantic information; it simply counts the number of common words or lexical items used in the contexts.
This procedure allows us to write the entries of the similarity matrix as dot products in a linear vector space of contexts.
The properties of this matrix were then explored using Multidimensional Scaling Analysis and Hierarchical Clustering.
Our main findings, namely, the underlying dimension of the emotion space and the categories of emotion names, were consistent with those based on people's judgments of emotion names similarities.
Workflow graphs extend classical flow charts with concurrent fork and join nodes.
They constitute the core of business processing languages such as BPMN or UML Activity Diagrams.
The activities of a workflow graph are executed by humans or machines, generically called resources.
If concurrent activities cannot be executed in parallel by lack of resources, the time needed to execute the workflow increases.
We study the problem of computing the minimal number of resources necessary to fully exploit the concurrency of a given workflow, and execute it as fast as possible (i.e., as fast as with unlimited resources).
We model this problem using free-choice Petri nets, which are known to be equivalent to workflow graphs.
We analyze the computational complexity of two versions of the problem: computing the resource and concurrency thresholds.
We use the results to design an algorithm to approximate the concurrency threshold, and evaluate it on a benchmark suite of 642 industrial examples.
We show that it performs very well in practice: It always provides the exact value, and never takes more than 30 milliseconds for any workflow, even for those with a huge number of reachable markings.
Active communication between robots and humans is essential for effective human-robot interaction.
To accomplish this objective, Cloud Robotics (CR) was introduced to enhance the capabilities of robots.
It enables robots to offload extensive computations to the cloud and to share the outcomes.
Outcomes include maps, images, processing power, data, activities, and other robot resources.
However, due to the colossal growth of data and traffic, CR suffers from serious latency issues.
It is therefore unlikely to scale to a large number of robots, particularly in human-robot interaction scenarios, where responsiveness is paramount.
Furthermore, other issues related to security such as privacy breaches and ransomware attacks can increase.
To address these problems, in this paper, we have envisioned the next generation of social robotic architectures based on Fog Robotics (FR) that inherits the strengths of Fog Computing to augment the future social robotic systems.
These new architectures can enhance the dexterity of robots by moving the data closer to the robot.
Additionally, they can ensure that human-robot interaction is more responsive by resolving the problems of CR.
Moreover, experimental results are discussed for an FR scenario, with latency as the primary factor of comparison against CR models.
Choreographic Programming is a programming paradigm for building concurrent programs that are deadlock-free by construction, as a result of programming communications declaratively and then synthesising process implementations automatically.
Despite strong interest in choreographies, a foundational model that explains which computations can be performed with the hallmark constructs of choreographies is still missing.
In this work, we introduce Core Choreographies (CC), a model that includes only the core primitives of choreographic programming.
Every computable function can be implemented as a choreography in CC, from which we can synthesise a process implementation where independent computations run in parallel.
We discuss the design of CC and argue that it constitutes a canonical model for choreographic programming.
The neural network training process takes a long time when the training data is large, yet without a large set of training values the network is unable to learn features.
This dilemma between training time and data size is often addressed with fast GPUs, but we present a better solution for a subset of these problems.
To reduce the time for training a regression model using a neural network, we introduce a loss function called the Nth Absolute Root Mean Error (NARME).
It helps train regression models much faster than other existing loss functions.
Experiments show that in most use cases NARME reduces the required number of epochs to almost one-tenth of that required by other commonly used loss functions, while also achieving high accuracy in the short time in which the model is trained.
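The abstract does not give NARME's formula; one plausible reading of "Nth Absolute Root Mean Error", sketched below, is the mean of the n-th roots of the absolute errors. The function name and the default value of `n` are illustrative assumptions.

```python
import numpy as np

def narme(y_true, y_pred, n=5):
    """Nth Absolute Root Mean Error (illustrative definition).

    NOTE: the exact formula is not given in the abstract; this sketch
    assumes the mean of the n-th roots of the absolute errors.  Taking
    an n-th root flattens large errors, which is one way a loss could
    steepen gradients for small residuals early in training.
    """
    errors = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return float(np.mean(errors ** (1.0 / n)))
```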
We analyze the AI alignment problem.
This is the problem of aligning an AI's objective function with human preferences.
This problem has been argued to be critical to AI safety, especially in the long run.
But it has also been argued that solving it robustly is extremely challenging, especially in highly complex environments like the Internet.
It seems crucial to accelerate research in this direction.
To this end, we propose a preliminary research program.
Our roadmap aims to decompose alignment into numerous more tractable subproblems.
Our hope is that this will help scholars, engineers and decision-makers to better grasp the upcoming difficulties, and to foresee how they can best contribute to the global effort.
Defeasible logics provide several linguistic features to support the expression of defeasible knowledge.
There is also a wide variety of such logics, expressing different intuitions about defeasible reasoning.
However, the logics can only be combined in trivial ways.
This limits their usefulness in contexts where different intuitions are at play in different aspects of a problem.
In particular, in some legal settings, different actors have different burdens of proof, which might be expressed as reasoning in different defeasible logics.
In this paper, we introduce annotated defeasible logic as a flexible formalism permitting multiple forms of defeasibility, and establish some properties of the formalism.
This paper is under consideration for acceptance in Theory and Practice of Logic Programming.
In line with the sensorimotor contingency theory, we investigate the problem of the perception of space from a fundamental sensorimotor perspective.
Despite its pervasive nature in our perception of the world, the origin of the concept of space remains largely mysterious.
For example in the context of artificial perception, this issue is usually circumvented by having engineers pre-define the spatial structure of the problem the agent has to face.
We show here that the structure of space can be autonomously discovered by a naive agent in the form of sensorimotor regularities that correspond to so-called compensable sensory experiences: experiences that can be generated either by the agent or by its environment.
By detecting such compensable experiences the agent can infer the topological and metric structure of the external space in which its body is moving.
We propose a theoretical description of the nature of these regularities and illustrate the approach on a simulated robotic arm equipped with an eye-like sensor, and which interacts with an object.
Finally we show how these regularities can be used to build an internal representation of the sensor's external spatial configuration.
We resolve in the affirmative conjectures of Repovs and A. Skopenkov (1998), and M. Skopenkov (2003) generalizing the classical Hanani-Tutte theorem to the setting of approximating maps of graphs on 2-dimensional surfaces by embeddings.
Our proof of this result is constructive and almost immediately implies an efficient algorithm for testing if a given piecewise linear map of a graph in a surface is approximable by an embedding.
More precisely, an instance of this problem consists of (i) a graph G whose vertices are partitioned into clusters and whose inter-cluster edges are partitioned into bundles, and (ii) a region R of a 2-dimensional compact surface M given as the union of a set of pairwise disjoint discs corresponding to the clusters and a set of pairwise non-intersecting "pipes" corresponding to the bundles, connecting certain pairs of these discs.
We are to decide whether G can be embedded inside M so that the vertices in every cluster are drawn in the corresponding disc, the edges in every bundle pass only through its corresponding pipe, and every edge crosses the boundary of each disc at most once.
A proof of the theorem concerning the inverse cyclotomic Discrete Fourier Transform algorithm over a finite field is provided.
The discrimination and simplicity of features are very important for effective and efficient pedestrian detection.
However, most state-of-the-art methods are unable to achieve good tradeoff between accuracy and efficiency.
Inspired by some simple inherent attributes of pedestrians (i.e., appearance constancy and shape symmetry), we propose two new types of non-neighboring features (NNF): side-inner difference features (SIDF) and symmetrical similarity features (SSF).
SIDF can characterize the difference between the background and pedestrian and the difference between the pedestrian contour and its inner part.
SSF can capture the symmetrical similarity of pedestrian shape.
However, it is difficult for neighboring features to achieve such characterization abilities.
Finally, we propose to combine both non-neighboring and neighboring features for pedestrian detection.
It is found that non-neighboring features can further decrease the average miss rate by 4.44%.
Experimental results on INRIA and Caltech pedestrian datasets demonstrate the effectiveness and efficiency of the proposed method.
Compared to the state-of-the-art methods without using CNN, our method achieves the best detection performance on Caltech, outperforming the second best method (i.e., Checkboards) by 1.63%.
Finding the diameter of a dataset in multidimensional Euclidean space is a well-established problem, with well-known algorithms.
However, most of the algorithms in the literature do not scale well to high data dimensions; their time complexity grows exponentially in most cases, which makes them impractical.
Therefore, we implemented four simple greedy algorithms for approximating the diameter of a multidimensional dataset, based on minimum/maximum l2 norms, hill climbing search, Tabu search and Beam search, respectively.
The implemented algorithms have near-linear time complexity, as they scale near-linearly with the data size and its dimensionality.
The results of experiments conducted on different machine learning datasets demonstrate the efficiency of the implemented algorithms, which can therefore be recommended for finding the diameter in machine learning applications when needed.
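The hill-climbing variant among these greedy schemes can be sketched as a repeated farthest-point sweep; this minimal version is an assumption about the paper's exact procedure, not its actual implementation.

```python
import numpy as np

def approx_diameter(points, sweeps=2):
    """Approximate the diameter of a point set by repeated farthest-point
    sweeps (a simple hill-climbing heuristic).  Each sweep is a single
    pass over the data, so the total cost is near-linear in data size
    and dimension.  The number of sweeps is an illustrative choice."""
    pts = np.asarray(points, dtype=float)
    current = pts[0]          # arbitrary starting point
    best = 0.0
    for _ in range(sweeps):
        # Jump to the point farthest from the current one.
        dists = np.linalg.norm(pts - current, axis=1)
        idx = int(np.argmax(dists))
        best = max(best, float(dists[idx]))
        current = pts[idx]
    return best
```

The returned value is always a lower bound on the true diameter, since it is the distance between two actual points of the dataset.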
This paper introduces CuCoTrack, a cuckoo hash based data structure designed to efficiently implement connection tracking.
The proposed scheme exploits the fact that queries always match one existing connection to compress the 5-tuple that identifies the connection.
This significantly reduces the amount of memory needed to store the connections, as well as the memory bandwidth needed for lookups.
CuCoTrack uses a dynamic fingerprint to avoid collisions thus ensuring that queries are completed in at most two memory accesses and facilitating a hardware implementation.
The proposed scheme has been analyzed theoretically and validated by simulation.
The results show that using 16 bits for the fingerprint is enough to avoid collisions in practical configurations.
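A minimal sketch of fingerprint-based connection lookup in a two-table cuckoo structure, in the spirit of CuCoTrack, follows; the hash functions, the fixed (rather than dynamic) 16-bit fingerprint, and the table layout are illustrative assumptions rather than the actual design.

```python
import hashlib

def _h(key, seed):
    """Deterministic 64-bit hash of a connection 5-tuple (illustrative)."""
    digest = hashlib.blake2b(repr((seed, key)).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

class FingerprintCuckoo:
    """Two-choice cuckoo table keyed by a connection 5-tuple.
    Only a 16-bit fingerprint (not the full 5-tuple) is stored per slot,
    mirroring the memory-compression idea; lookups touch at most two
    slots, one per table."""
    def __init__(self, size=1024):
        self.size = size
        self.tables = [[None] * size, [None] * size]

    def _slot(self, t, key):
        return _h(key, t) % self.size

    def _fp(self, key):
        return _h(key, 99) & 0xFFFF    # fixed 16-bit fingerprint

    def insert(self, key, value, max_kicks=64):
        # An entry carries its fingerprint, both candidate slots, and value.
        entry = (self._fp(key), self._slot(0, key), self._slot(1, key), value)
        t = 0
        for _ in range(max_kicks):
            i = entry[1] if t == 0 else entry[2]
            if self.tables[t][i] is None:
                self.tables[t][i] = entry
                return True
            # Kick the occupant to its slot in the other table.
            self.tables[t][i], entry = entry, self.tables[t][i]
            t = 1 - t
        return False    # table too full

    def lookup(self, key):
        fp = self._fp(key)
        for t in (0, 1):
            e = self.tables[t][self._slot(t, key)]
            if e is not None and e[0] == fp:
                return e[3]
        return None
```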
To bridge the gap between the capabilities of the state-of-the-art in factoid question answering (QA) and what real users ask, we need large datasets of real user questions that capture the various question phenomena users are interested in, and the diverse ways in which these questions are formulated.
We introduce ComQA, a large dataset of real user questions that exhibit different challenging aspects such as temporal reasoning, compositionality, etc.
ComQA questions come from the WikiAnswers community QA platform.
Through a large crowdsourcing effort, we clean the question dataset, group questions into paraphrase clusters, and annotate clusters with their answers.
ComQA contains 11,214 questions grouped into 4,834 paraphrase clusters.
We detail the process of constructing ComQA, including the measures taken to ensure its high quality while making effective use of crowdsourcing.
We also present an extensive analysis of the dataset and the results achieved by state-of-the-art systems on ComQA, demonstrating that our dataset can be a driver of future research on QA.
We propose a novel and flexible rank-breaking-then-composite-marginal-likelihood (RBCML) framework for learning random utility models (RUMs), which include the Plackett-Luce model.
We characterize conditions for the objective function of RBCML to be strictly log-concave by proving that strict log-concavity is preserved under convolution and marginalization.
We characterize necessary and sufficient conditions for RBCML to satisfy consistency and asymptotic normality.
Experiments on synthetic data show that RBCML for Gaussian RUMs achieves better statistical and computational efficiency than the state-of-the-art algorithm, and that our RBCML for the Plackett-Luce model provides flexible tradeoffs between running time and statistical efficiency.
High altitude platform (HAP) drones can provide broadband wireless connectivity to ground users in rural areas by establishing line-of-sight (LoS) links and exploiting effective beamforming techniques.
However, at high altitudes, acquiring the channel state information (CSI) for HAPs, which is a key component to perform beamforming, is challenging.
In this paper, by exploiting an interference alignment (IA) technique, a novel method for achieving the maximum sum-rate in HAP-based communications without CSI is proposed.
In particular, to realize IA, a multiple-antenna tethered balloon is used as a relay between multiple HAP drones and ground stations (GSs).
Here, a multiple-input multiple-output X network system is considered.
The capacity of the considered M*N X network with a tethered balloon relay is derived in closed-form.
Simulation results corroborate the theoretical findings and show that the proposed approach yields the maximum sum-rate in multiple HAP-GS communications in the absence of CSI.
The results also show the existence of an optimal balloon altitude for which the sum-rate is maximized.
For autonomous agents to successfully operate in the real world, anticipation of future events and states of their environment is a key competence.
This problem has been formalized as a sequence extrapolation problem, where a number of observations are used to predict the sequence into the future.
Real-world scenarios demand a model of uncertainty for such predictions, as predictions become increasingly uncertain, in particular on long time horizons.
While impressive results have been shown on point estimates, scenarios that induce multi-modal distributions over future sequences remain challenging.
Our work addresses these challenges in a Gaussian Latent Variable model for sequence prediction.
Our core contribution is a "Best of Many" sample objective that leads to more accurate and more diverse predictions that better capture the true variations in real-world sequence data.
Beyond our analysis of improved model fit, our models also empirically outperform prior work on three diverse tasks ranging from traffic scenes to weather data.
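The "Best of Many" idea of penalizing only the closest of K predicted samples to the ground truth can be sketched as follows; this is a simplified stand-in for the paper's full variational objective, with mean squared error as an assumed distance.

```python
import numpy as np

def best_of_many_loss(samples, target):
    """'Best of Many' sample objective (sketch): of the K samples drawn
    from the latent-variable model, penalize only the one closest to the
    ground truth, instead of averaging over all samples as a standard
    CVAE reconstruction term would.  This lets the remaining samples
    stay diverse without being pulled toward the single observed future."""
    errs = [float(np.mean((np.asarray(s) - np.asarray(target)) ** 2))
            for s in samples]
    return min(errs)
```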
Grouping problems aim to partition a set of items into multiple mutually disjoint subsets according to some specific criterion and constraints.
Grouping problems cover a large class of important combinatorial optimization problems that are generally computationally difficult.
In this paper, we propose a general solution approach for grouping problems, i.e., reinforcement learning based local search (RLS), which combines reinforcement learning techniques with descent-based local search.
The viability of the proposed approach is verified on a well-known representative grouping problem (graph coloring) where a very simple descent-based coloring algorithm is applied.
Experimental studies on popular DIMACS and COLOR02 benchmark graphs indicate that RLS achieves competitive performances compared to a number of well-known coloring algorithms.
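A minimal sketch of the RLS idea on graph coloring follows: each vertex keeps a probability vector over colors, colors are sampled, a single descent pass repairs conflicts, and conflict-free choices are reinforced. The learning rate and the exact update rule are illustrative assumptions, not the paper's algorithm.

```python
import random

def rls_coloring(edges, k, iters=200, alpha=0.3, seed=0):
    """Reinforcement-learning based local search for graph k-coloring
    (sketch).  Returns (conflict_count, coloring) for the best solution
    found.  `alpha` is an assumed learning rate."""
    rng = random.Random(seed)
    nodes = sorted({v for e in edges for v in e})
    prob = {v: [1.0 / k] * k for v in nodes}
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    best = None
    for _ in range(iters):
        # Sample a coloring from the current per-vertex distributions.
        color = {v: rng.choices(range(k), prob[v])[0] for v in nodes}
        # One descent pass: move each vertex to its least-conflicting color.
        for v in nodes:
            conflicts = [sum(color[u] == c for u in adj[v]) for c in range(k)]
            color[v] = conflicts.index(min(conflicts))
        total = sum(color[u] == color[v] for u, v in edges)
        if best is None or total < best[0]:
            best = (total, dict(color))
        # Reinforce colors that ended up conflict-free.
        for v in nodes:
            if all(color[u] != color[v] for u in adj[v]):
                p = prob[v]
                p[:] = [(1 - alpha) * x for x in p]
                p[color[v]] += alpha
    return best
```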
A fundamental property of complex networks is the tendency for edges to cluster.
The extent of the clustering is typically quantified by the clustering coefficient, which is the probability that a length-2 path is closed, i.e., induces a triangle in the network.
However, higher-order cliques beyond triangles are crucial to understanding complex networks, and the clustering behavior with respect to such higher-order network structures is not well understood.
Here we introduce higher-order clustering coefficients that measure the closure probability of higher-order network cliques and provide a more comprehensive view of how the edges of complex networks cluster.
Our higher-order clustering coefficients are a natural generalization of the traditional clustering coefficient.
We derive several properties about higher-order clustering coefficients and analyze them under common random graph models.
Finally, we use higher-order clustering coefficients to gain new insights into the structure of real-world networks from several domains.
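The traditional (second-order) clustering coefficient that these higher-order coefficients generalize can be computed as below; per the abstract, the higher-order versions replace the edge at the center of a wedge with a larger clique, which this sketch does not implement.

```python
from itertools import combinations

def global_clustering(edges):
    """Global (second-order) clustering coefficient: the probability
    that a length-2 path (wedge) is closed into a triangle."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each vertex with d neighbors is the center of C(d, 2) wedges.
    wedges = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    closed = 0
    for u, nb in adj.items():
        for v, w in combinations(nb, 2):
            if w in adj[v]:
                closed += 1    # the wedge centered at u is closed
    return closed / wedges if wedges else 0.0
```

Each triangle contributes three closed wedges (one per center), so this ratio equals the familiar 3 x triangles / wedges formula.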
Over the last five years, Deep Neural Nets have offered more accurate solutions to many problems in speech recognition and computer vision, and these solutions have surpassed a threshold of acceptability for many applications.
As a result, Deep Neural Networks have supplanted other approaches to solving problems in these areas, and enabled many new applications.
While the design of Deep Neural Nets is still something of an art form, in our work we have found basic principles of design space exploration used to develop embedded microprocessor architectures to be highly applicable to the design of Deep Neural Net architectures.
In particular, we have used these design principles to create a novel Deep Neural Net called SqueezeNet that requires as little as 480KB of storage for its model parameters.
We have further integrated all these experiences to develop something of a playbook for creating small Deep Neural Nets for embedded systems.
The fact that results for 2-receiver broadcast channels (BCs) do not generalize to 3-receiver ones is of information-theoretic importance.
In this paper we study two classes of discrete memoryless BCs with non-causal side information (SI), i.e. multilevel BC (MBC) and 3-receiver less noisy BC.
First, we obtain an achievable rate region and a capacity outer bound for the MBC.
Second, we prove a special capacity region for the 3-receiver less noisy BC.
Third, the obtained special capacity region for the 3-receiver less noisy BC is extended to continuous alphabet fading Gaussian version.
It is worth mentioning that previous works are special cases of our results.
In diffusion-based molecular communications, messages can be conveyed via the variation in the concentration of molecules in the medium.
In this paper, we intend to analyze the achievable capacity in transmission of information from one node to another in a diffusion channel.
We observe that because of the molecular diffusion in the medium, the channel possesses memory.
We then model the memory of the channel by a two-step Markov chain and obtain the equations describing the capacity of the diffusion channel.
By performing a numerical analysis, we obtain the maximum achievable rate for different levels of the transmitter power, i.e., the molecule production rate.
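As a minimal illustration of channel memory modeled by a Markov chain (a generic sketch; the paper's two-step chain and capacity equations are not reproduced here), the entropy rate of a two-state chain follows from its stationary distribution:

```python
import numpy as np

def stationary(P):
    """Stationary distribution: left eigenvector of P for eigenvalue 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

def entropy_rate(P):
    """Entropy rate H = -sum_i pi_i sum_j P_ij log2 P_ij (bits/symbol)."""
    pi = stationary(P)
    H = 0.0
    for i in range(len(P)):
        for p in P[i]:
            if p > 0:
                H -= pi[i] * p * np.log2(p)
    return H

# Memoryless case: both rows uniform, one full bit per symbol.
P_iid = np.array([[0.5, 0.5], [0.5, 0.5]])
# "Sticky" chain: memory makes the output more predictable.
P_sticky = np.array([[0.9, 0.1], [0.1, 0.9]])
print(entropy_rate(P_iid), entropy_rate(P_sticky))
```

The sticky chain has a strictly lower entropy rate, which is the qualitative effect of the diffusion-induced memory described above.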
In this work, our objective is to find out how topological and algebraic properties of unrooted Gaussian tree models determine their security robustness, which is measured by our proposed max-min information (MaMI) metric.
This metric quantifies the amount of common randomness extractable through public discussion between two legitimate nodes under an eavesdropper attack.
We show some general topological properties that the desired max-min solutions must satisfy.
Under such properties, we develop conditions under which comparable trees are put together to form partially ordered sets (posets).
Each poset contains a most favorable structure, the poset leader, as well as a least favorable structure.
Then, we compute a Tutte-like polynomial for each tree, thereby assigning a polynomial to every tree in a poset.
Moreover, we propose a novel method, based on restricted integer partitions, to effectively enumerate all poset leaders.
The results not only help us understand the security strength of different Gaussian trees, which is critical when we evaluate the information leakage issues for various jointly Gaussian distributed measurements in networks, but also provide us both an algebraic and a topological perspective in grasping some fundamental properties of such models.
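The restricted integer partitions used for the enumeration are not defined in this abstract; as a generic illustration of the idea, partitions with a bounded largest part can be enumerated recursively:

```python
def partitions(n, max_part=None):
    """Enumerate partitions of n into parts of size <= max_part,
    as non-increasing tuples. max_part=None means unrestricted."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(max_part, n), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

print(len(list(partitions(5))))      # 7 partitions of 5
print(len(list(partitions(6, 3))))   # 7 partitions of 6 with parts <= 3
```

Other restrictions (on part sizes, counts, or multiplicities) slot into the same recursion by changing the range of `first`.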
Millimeter wave (mmWave) signals are highly sensitive to blockage, which results in a significant increase of the outage probability, especially for users at the edge of cells.
In this paper, we exploit the technique of base station (BS) cooperation to improve the performance of the cell-edge users in the downlink transmission of mmWave cellular networks.
We design two cooperative schemes, which are referred to as fixed-number BS cooperation (FNC) scheme and fixed-region BS cooperation (FRC) scheme, respectively.
In the FNC scheme, the cooperative BSs are the M nearest BSs around the served cell-edge user, while in the FRC scheme, the cooperative BSs include all BSs located within a given region.
We derive the expressions for the average rate and outage probability of a typical cell-edge user located at the origin based on the stochastic geometry framework.
To reduce the computational complexity of our analytical results for the outage probability, we further propose a Gamma-approximation-based method that provides approximations with satisfactory accuracy.
Our analytical results incorporate the critical characteristics of mmWave channels, i.e., the blockage effects, the different path loss of LOS and NLOS links and the highly directional antenna arrays.
Simulation results show that the performance of the cell-edge users is greatly improved when mmWave networks are combined with the technique of BS cooperation.
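Gamma approximations of this kind typically rest on moment matching; a minimal sketch (the paper's exact construction may differ) fits a Gamma distribution to a given mean and variance:

```python
def gamma_fit(mean, var):
    """Match Gamma(shape k, scale theta) moments: mean = k*theta and
    var = k*theta**2, so theta = var/mean and k = mean**2/var."""
    theta = var / mean
    k = mean * mean / var
    return k, theta

k, theta = gamma_fit(2.0, 0.5)  # k = 8.0, theta = 0.25
```

The fitted Gamma then stands in for an intractable interference or rate distribution in the outage expressions.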
Multipath transport protocols like MPTCP transfer data across multiple routes in parallel and deliver it in order at the receiver.
When the delay on one or more of the paths is variable, as is commonly the case, out of order arrivals are frequent and head of line blocking leads to high latency.
This is exacerbated when packet loss, which is also common with wireless links, is tackled using ARQ.
This paper introduces Stochastic Earliest Delivery Path First (S-EDPF), a resilient low delay packet scheduler for multipath transport protocols.
S-EDPF takes explicit account of the stochastic nature of paths and uses this to minimise in-order delivery delay.
S-EDPF also takes account of FEC, jointly scheduling transmission of information and coded packets and in this way allows lossy links to reduce delay and improve resiliency, rather than degrading performance as usually occurs with existing multipath systems.
We implement S-EDPF as a multi-platform application that does not require administration privileges nor modifications to the operating system and has negligible impact on energy consumption.
We present a thorough experimental evaluation in both controlled environments and in the wild, revealing dramatic gains in delay performance compared to existing approaches.
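The deterministic core of EDPF-style scheduling (a simplification that ignores the stochastic delay modeling and FEC coding that S-EDPF adds, with illustrative path parameters) assigns each packet to the path with the earliest estimated delivery time:

```python
def edpf_schedule(packets, paths):
    """paths: list of dicts with 'rate' (pkts/s) and 'delay' (one-way, s).
    A packet handed to path i when it frees up at time busy_until[i]
    arrives at busy_until[i] + 1/rate + delay. Returns path assignments."""
    busy_until = [0.0] * len(paths)
    assignment = []
    for _ in range(packets):
        # Estimated delivery time of the next packet on each path.
        etas = [busy_until[i] + 1.0 / p["rate"] + p["delay"]
                for i, p in enumerate(paths)]
        best = min(range(len(paths)), key=lambda i: etas[i])
        busy_until[best] += 1.0 / paths[best]["rate"]
        assignment.append(best)
    return assignment

# A fast path and a slower one: the scheduler occasionally spills
# packets onto the slow path once the fast path's queue builds up.
paths = [{"rate": 100.0, "delay": 0.01}, {"rate": 20.0, "delay": 0.005}]
print(edpf_schedule(6, paths))  # [0, 0, 0, 0, 1, 0]
```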
Traditional employment usually provides mechanisms for workers to improve their skills to access better opportunities.
However, crowd work platforms like Amazon Mechanical Turk (AMT) generally do not support skill development (i.e., becoming faster and better at work).
While researchers have started to tackle this problem, most solutions are dependent on experts or requesters willing to help.
However, requesters generally lack the necessary knowledge, and experts are rare and expensive.
To further facilitate crowd workers' skill growth, we present Crowd Coach, a system that enables workers to receive peer coaching while on the job.
We conduct a field experiment and real world deployment to study Crowd Coach in the wild.
Hundreds of workers used Crowd Coach in a variety of tasks, including writing, doing surveys, and labeling images.
We find that Crowd Coach enhances workers' speed without sacrificing their work quality, especially in audio transcription tasks.
We posit that peer coaching systems hold potential for better supporting crowd workers' skill development while on the job.
We finish with design implications from our research.
We consider secret key generation for a "pairwise independent network" model in which every pair of terminals observes correlated sources that are independent of sources observed by all other pairs of terminals.
The terminals are then allowed to communicate publicly with all such communication being observed by all the terminals.
The objective is to generate a secret key shared by a given subset of terminals at the largest rate possible, with the cooperation of any remaining terminals.
Secrecy is required from an eavesdropper that has access to the public interterminal communication.
A (single-letter) formula for secret key capacity brings out a natural connection between the problem of secret key generation and a combinatorial problem of maximal packing of Steiner trees in an associated multigraph.
An explicit algorithm is proposed for secret key generation based on a maximal packing of Steiner trees in a multigraph; the corresponding maximum rate of Steiner tree packing is thus a lower bound for the secret key capacity.
When only two of the terminals, or all of the terminals, seek to share a secret key, the proposed algorithm achieves secret key capacity, in which case the bound is tight.
This paper proposes a novel deep learning framework named bidirectional-convolutional long short term memory (Bi-CLSTM) network to automatically learn the spectral-spatial feature from hyperspectral images (HSIs).
In the network, the issue of spectral feature extraction is considered as a sequence learning problem, and a recurrent connection operator across the spectral domain is used to address it.
Meanwhile, inspired from the widely used convolutional neural network (CNN), a convolution operator across the spatial domain is incorporated into the network to extract the spatial feature.
Besides, to sufficiently capture the spectral information, a bidirectional recurrent connection is proposed.
In the classification phase, the learned features are concatenated into a vector and fed to a softmax classifier via a fully-connected operator.
To validate the effectiveness of the proposed Bi-CLSTM framework, we compare it with several state-of-the-art methods, including the CNN framework, on three widely used HSIs.
The obtained results show that Bi-CLSTM can improve the classification performance as compared to other methods.
Multi-group multicast beamforming in wireless systems with large antenna arrays and massive audience is investigated in this paper.
Multicast beamforming design is a well-known non-convex quadratically constrained quadratic programming (QCQP) problem.
A conventional method to tackle this problem is to approximate it as a semi-definite programming problem via semi-definite relaxation, whose performance, however, deteriorates considerably as the number of per-group users grows large.
A recent attempt is to apply convex-concave procedure (CCP) to find a stationary solution by treating it as a difference of convex programming problem, whose complexity, however, increases dramatically as the problem size increases.
In this paper, we propose a low-complexity high-performance algorithm for multi-group multicast beamforming design in large-scale wireless systems by leveraging the alternating direction method of multipliers (ADMM) together with CCP.
Specifically, the original non-convex QCQP problem is first approximated as a sequence of convex subproblems via CCP.
Each convex subproblem is then reformulated as a novel ADMM form.
Our ADMM reformulation enables each updating step to be performed by solving multiple small-size subproblems with closed-form solutions in parallel.
Numerical results show that our fast algorithm maintains the same favorable performance as state-of-the-art algorithms but reduces the complexity by orders of magnitude.
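The ADMM splitting pattern alluded to above (a closed-form x-update, z-update, and dual update per iteration) can be illustrated on a much simpler problem; the sketch below applies it to the lasso, not to the paper's beamforming subproblems:

```python
import numpy as np

def soft(v, k):
    """Soft-thresholding: the closed-form proximal operator of k*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def admm_lasso(A, b, lam, rho=1.0, iters=200):
    """ADMM for min 0.5*||Ax-b||^2 + lam*||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    M = np.linalg.inv(A.T @ A + rho * np.eye(n))
    for _ in range(iters):
        x = M @ (A.T @ b + rho * (z - u))   # x-update (closed form)
        z = soft(x + u, lam / rho)          # z-update (closed form)
        u = u + x - z                       # dual update
    return z
```

With A the identity, the lasso solution is simply soft-thresholding of b, which gives an easy sanity check on the iteration.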
Understanding emerging areas of a multidisciplinary research field is crucial for researchers, policymakers, and other stakeholders.
For them a knowledge structure based on longitudinal bibliographic data can be an effective instrument.
However, given the vast amount of information available online, it is often hard to derive such a knowledge structure from the data.
In this paper, we present a novel approach for retrieving online bibliographic data and propose a framework for exploring knowledge structure.
We also present several longitudinal analyses to interpret and visualize the last 20 years of published obesity research data.
In recent years, the mathematical and algorithmic aspects of the phase retrieval problem have received considerable attention.
Many papers in this area mention crystallography as a principal application.
In crystallography, the signal to be recovered is periodic and comprised of atomic distributions arranged homogeneously in the unit cell of the crystal.
The crystallographic problem is both the leading application and one of the hardest forms of phase retrieval.
We have constructed a graded set of benchmark problems for evaluating algorithms that perform this type of phase retrieval.
The data, publicly available online, is provided in an easily interpretable format.
We also propose a simple and unambiguous success/failure criterion based on the actual needs in crystallography.
Baseline runtimes were obtained with an iterative algorithm that is similar to, but more transparent than, those used in crystallography.
Empirically, the runtimes grow exponentially with respect to a new hardness parameter: the sparsity of the signal autocorrelation.
We also review the algorithms used by the leading software packages.
This set of benchmark problems, we hope, will encourage the development of new algorithms for the phase retrieval problem in general, and crystallography in particular.
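A minimal member of the alternating-projection family (error reduction; the benchmark's algorithm and the crystallographic periodic setting are more involved) alternates between the measured Fourier magnitudes and a support/positivity constraint:

```python
import numpy as np

rng = np.random.default_rng(0)

def error_reduction(mag, support, iters=500, y0=None):
    """Alternate between enforcing the measured Fourier magnitudes and
    the known support with non-negativity (error-reduction iteration)."""
    y = rng.random(len(mag)) if y0 is None else y0.astype(float).copy()
    for _ in range(iters):
        Y = np.fft.fft(y)
        Y = mag * np.exp(1j * np.angle(Y))              # magnitude projection
        y = np.real(np.fft.ifft(Y))
        y = np.where(support, np.maximum(y, 0.0), 0.0)  # support + positivity
    return y

n, s = 64, 8
x = np.zeros(n)
x[:s] = rng.random(s) + 0.1     # ground-truth non-negative sparse signal
mag = np.abs(np.fft.fft(x))     # the only measurement: Fourier magnitudes
support = np.arange(n) < s

rec = error_reduction(mag, support)
err = np.linalg.norm(np.abs(np.fft.fft(rec)) - mag) / np.linalg.norm(mag)
```

By construction, the true signal is a fixed point of the iteration, since it already satisfies both constraints.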
Scientific Computing typically requires large computational needs which have been addressed with High Performance Distributed Computing.
It is essential to efficiently deploy a number of complex scientific applications, which have different characteristics, and so require distinct computational resources too.
However, in many research laboratories, this high performance architecture is not dedicated.
So, the architecture must be shared among a set of scientific applications with widely different execution times and varying importance to the research.
Also, the high performance architectures have different characteristics and costs.
When a new infrastructure has to be acquired to meet the needs of this scenario, the decision-making is hard and complex.
In this work, we present a Gain Function, a utility-function model that supports confident decision-making.
With this function, it is possible to evaluate the best architectural option, taking into account aspects of both applications and architectures: execution times, the cost of the architecture, the relative importance of each application, and the relative weight of performance versus cost in the final evaluation.
This paper presents the Gain Function, examples, and a real case demonstrating its applicability.
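A toy version of such a gain function can be sketched as a weighted sum; the weights, normalizations, application names, and numbers below are illustrative assumptions, not the paper's formula:

```python
def gain(arch, apps, w_perf=0.7, w_cost=0.3, t_max=100.0, c_max=10000.0):
    """Hypothetical weighted-additive gain: an importance-weighted,
    normalized performance term minus a normalized cost term."""
    perf = sum(imp * (1.0 - arch["runtime"][a] / t_max)
               for a, imp in apps.items()) / sum(apps.values())
    cost = arch["cost"] / c_max
    return w_perf * perf - w_cost * cost

# Two candidate architectures and two applications with importances.
apps = {"sim": 0.6, "analysis": 0.4}
cluster = {"cost": 8000.0, "runtime": {"sim": 20.0, "analysis": 10.0}}
cloud = {"cost": 3000.0, "runtime": {"sim": 60.0, "analysis": 30.0}}
best = max([cluster, cloud], key=lambda a: gain(a, apps))
```

Shifting `w_perf` and `w_cost` encodes the stated trade-off between performance and acquisition cost in the final evaluation.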
Summaries of meetings are very important as they convey the essential content of discussions in a concise form.
Generally, it is time consuming to read and understand whole documents.
Therefore, summaries play an important role for readers interested only in the important content of the discussions.
In this work, we address the task of meeting document summarization.
Automatic summarization systems on meeting conversations developed so far have been primarily extractive, resulting in unacceptable summaries that are hard to read.
The extracted utterances contain disfluencies that affect the quality of the extractive summaries.
To make summaries much more readable, we propose an approach to generating abstractive summaries by fusing important content from several utterances.
We first separate meeting transcripts into various topic segments, and then identify the important utterances in each segment using a supervised learning approach.
The important utterances are then combined together to generate a one-sentence summary.
In the text generation step, the dependency parses of the utterances in each segment are combined together to create a directed graph.
The most informative and well-formed sub-graph obtained by integer linear programming (ILP) is selected to generate a one-sentence summary for each topic segment.
The ILP formulation reduces disfluencies by leveraging grammatical relations that are more prominent in non-conversational text, and therefore generates summaries that are comparable to human-written abstractive summaries.
Experimental results show that our method can generate more informative summaries than the baselines.
In addition, readability assessments by human judges, as well as log-likelihood estimates obtained from the dependency parser, show that our generated summaries are highly readable and well-formed.
We consider the problem of representing collective behavior of large populations and predicting the evolution of a population distribution over a discrete state space.
A discrete time mean field game (MFG) is motivated as an interpretable model founded on game theory for understanding the aggregate effect of individual actions and predicting the temporal evolution of population distributions.
We achieve a synthesis of MFG and Markov decision processes (MDP) by showing that a special MFG is reducible to an MDP.
This enables us to broaden the scope of mean field game theory and infer MFG models of large real-world systems via deep inverse reinforcement learning.
Our method learns both the reward function and forward dynamics of an MFG from real data, and we report the first empirical test of a mean field game model of a real-world social media population.
Resource usage data, collected using tools such as TACC Stats, capture the resource utilization by nodes within a high performance computing system.
We present methods to analyze the resource usage data to understand the system performance and identify performance anomalies.
The core idea is to model the data as a three-way tensor corresponding to the compute nodes, usage metrics, and time.
Using the reconstruction error between the original tensor and the tensor reconstructed from a low-rank tensor decomposition as a scalar performance metric enables us to monitor the performance of the system in an online fashion.
This error statistic is then used for anomaly detection that relies on the assumption that the normal/routine behavior of the system can be captured using a low-rank approximation of the original tensor.
We evaluate the performance of the algorithm using information gathered from system logs and show that the performance anomalies identified by the proposed method correlates with critical errors reported in the system logs.
Results are shown for data collected in 2013 from the Lonestar4 system at the Texas Advanced Computing Center (TACC).
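The core recipe, fitting a low-rank model to routine behavior and flagging large reconstruction error, can be sketched on a matrix slice, with a truncated SVD standing in for the tensor decomposition:

```python
import numpy as np

rng = np.random.default_rng(1)

def recon_error(M, rank):
    """Reconstruction error of the best rank-`rank` approximation (SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return np.linalg.norm(M - approx)

# Routine behavior: a nodes-by-metrics matrix that is exactly rank 1.
base = np.outer(rng.random(20), rng.random(5))
normal_err = recon_error(base, 1)        # ~0: the low-rank model fits

anomalous = base.copy()
anomalous[7] += rng.random(5) * 5.0      # one node deviates from routine
anomaly_err = recon_error(anomalous, 1)  # large: residual flags the anomaly
```

Thresholding this error statistic over time gives the online detector described above.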
Hyperspectral signature classification is a quantitative analysis approach for hyperspectral imagery which performs detection and classification of the constituent materials at the pixel level in the scene.
The classification procedure can be operated directly on hyperspectral data or performed by using some features extracted from the corresponding hyperspectral signatures containing information like the signature's energy or shape.
In this paper, we describe a technique that applies non-homogeneous hidden Markov chain (NHMC) models to hyperspectral signature classification.
The basic idea is to use statistical models (such as NHMC) to characterize wavelet coefficients which capture the spectrum semantics (i.e., structural information) at multiple levels.
Experimental results show that the NHMC-based approach outperforms relevant existing approaches in classification tasks.
Unsupervised neural machine translation (NMT) is a recently proposed approach for machine translation which aims to train the model without using any labeled data.
The models proposed for unsupervised NMT often use only one shared encoder to map the pairs of sentences from different languages to a shared-latent space, which is weak in keeping the unique and internal characteristics of each language, such as the style, terminology, and sentence structure.
To address this issue, we introduce an extension by utilizing two independent encoders but sharing some partial weights which are responsible for extracting high-level representations of the input sentences.
Besides, two different generative adversarial networks (GANs), namely the local GAN and global GAN, are proposed to enhance the cross-language translation.
With this new approach, we achieve significant improvements on English-German, English-French and Chinese-to-English translation tasks.
Machine learning algorithms have reached mainstream status and are widely deployed in many applications.
The accuracy of such algorithms depends significantly on the size of the underlying training dataset; in reality a small or medium sized organization often does not have the necessary data to train a reasonably accurate model.
For such organizations, a realistic solution is to train their machine learning models based on their joint dataset (which is a union of the individual ones).
Unfortunately, privacy concerns prevent them from straightforwardly doing so.
While a number of privacy-preserving solutions exist for collaborating organizations to securely aggregate the parameters in the process of training the models, we are not aware of any work that provides a rational framework for the participants to precisely balance the privacy loss and accuracy gain in their collaboration.
In this paper, by focusing on a two-player setting, we model the collaborative training process as a two-player game where each player aims to achieve higher accuracy while preserving the privacy of its own dataset.
We introduce the notion of Price of Privacy, a novel approach for measuring the impact of privacy protection on the accuracy in the proposed framework.
Furthermore, we develop a game-theoretical model for different player types, and then either find or prove the existence of a Nash Equilibrium with regard to the strength of privacy protection for each player.
Using recommendation systems as our main use case, we demonstrate how two players can make practical use of the proposed theoretical framework, including setting up the parameters and approximating the non-trivial Nash Equilibrium.
We investigate different approaches for dialect identification in Arabic broadcast speech, using phonetic, lexical features obtained from a speech recognition system, and acoustic features using the i-vector framework.
We studied both generative and discriminative classifiers, and we combined these features using a multi-class Support Vector Machine (SVM).
We validated our results on an Arabic/English language identification task, with an accuracy of 100%.
We used these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%.
We further report results using the proposed method to discriminate between the five most widely used dialects of Arabic: namely Egyptian, Gulf, Levantine, North African, and MSA, with an accuracy of 52%.
We discuss dialect identification errors in the context of dialect code-switching between Dialectal Arabic and MSA, and compare the error pattern between manually labeled data, and the output from our classifier.
We also release the train and test data as standard corpus for dialect identification.
Spectral inference provides fast algorithms and provable optimality for latent topic analysis.
But for real data these algorithms require additional ad-hoc heuristics, and even then often produce unusable results.
We explain this poor performance by casting the problem of topic inference in the framework of Joint Stochastic Matrix Factorization (JSMF) and showing that previous methods violate the theoretical conditions necessary for a good solution to exist.
We then propose a novel rectification method that learns high quality topics and their interactions even on small, noisy data.
This method achieves results comparable to probabilistic techniques in several domains while maintaining scalability and provable optimality.
In this paper we propose a new family of RRT based algorithms, named RRT+ , that are able to find faster solutions in high-dimensional configuration spaces compared to other existing RRT variants by finding paths in lower dimensional subspaces of the configuration space.
The method can be easily applied to complex hyper-redundant systems and can be adapted by other RRT based planners.
We introduce RRT+ and develop some variants, called PrioritizedRRT+ , PrioritizedRRT+-Connect, and PrioritizedBidirectionalT-RRT+ , that use the new sampling technique and we show that our method provides faster results than the corresponding original algorithms.
Experiments using the state-of-the-art planners available in OMPL show superior performance of RRT+ for high-dimensional motion planning problems.
Understanding the world around us and making decisions about the future is a critical component to human intelligence.
As autonomous systems continue to develop, their ability to reason about the future will be the key to their success.
Semantic anticipation is a relatively under-explored area that autonomous vehicles could take advantage of (e.g., forecasting pedestrian trajectories).
Motivated by the need for real-time prediction in autonomous systems, we propose to decompose the challenging semantic forecasting task into two subtasks: current frame segmentation and future optical flow prediction.
Through this decomposition, we built an efficient, effective, low overhead model with three main components: flow prediction network, feature-flow aggregation LSTM, and end-to-end learnable warp layer.
Our proposed method achieves state-of-the-art accuracy on short-term and moving objects semantic forecasting while simultaneously reducing model parameters by up to 95% and increasing efficiency by greater than 40x.
A new scheme to sample signals defined in the nodes of a graph is proposed.
The underlying assumption is that such signals admit a sparse representation in a frequency domain related to the structure of the graph, which is captured by the so-called graph-shift operator.
Most of the works that have looked at this problem have focused on using the value of the signal observed at a subset of nodes to recover the signal in the entire graph.
In contrast, the sampling scheme proposed here uses as input observations taken at a single node.
The observations correspond to sequential applications of the graph-shift operator, which are linear combinations of the information gathered by the neighbors of the node.
When the graph corresponds to a directed cycle (which is the support of time-varying signals), our method is equivalent to the classical sampling in the time domain.
When the graph is more general, we show that the Vandermonde structure of the sampling matrix, which is critical to guarantee recovery when sampling time-varying signals, is preserved.
Sampling and interpolation are analyzed first in the absence of noise and then noise is considered.
We then study the recovery of the sampled signal when the specific set of frequencies that is active is not known.
Moreover, we present a more general sampling scheme under which both our aggregation approach and the alternative approach of sampling a graph signal by observing its value at a subset of nodes can be viewed as particular cases.
The last part of the paper presents numerical experiments that illustrate the results developed through both synthetic graph signals and a real-world graph of the economy of the United States.
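The directed-cycle case is easy to verify numerically: successive applications of the shift operator, observed at a single node, reproduce classical time-domain samples. A small sketch (with one common orientation convention for the cycle):

```python
import numpy as np

N = 8
# Directed cycle: with this convention, (A x)[i] = x[(i + 1) % N],
# so each shift moves the signal one position toward the observer.
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = 1.0

x = np.arange(N, dtype=float)   # graph signal on the cycle
node = 0                        # the single observation node
obs = []
y = x.copy()
for _ in range(N):              # sequential shift applications
    obs.append(y[node])
    y = A @ y
print(obs)  # [0.0, 1.0, ..., 7.0]: classical time-domain sampling
```

Each observation is a linear combination of values previously held by the node's neighbors, exactly as described above.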
Motivated by applications in databases, this paper considers various fragments of the calculus of binary relations.
The fragments are obtained by leaving out, or keeping in, some of the standard operators, along with some derived operators such as set difference, projection, coprojection, and residuation.
For each considered fragment, a characterization is obtained for when two given binary relational structures are indistinguishable by expressions in that fragment.
The characterizations are based on appropriately adapted notions of simulation and bisimulation.
Darknet technology such as Tor has been used by various threat actors for organising illegal activities and data exfiltration.
As such, there is a case for organisations to block such traffic, or to try and identify when it is used and for what purposes.
However, anonymity in cyberspace has always been a domain of conflicting interests.
While it gives enough power to nefarious actors to masquerade their illegal activities, it is also the cornerstone to facilitate freedom of speech and privacy.
We present a proof of concept for a novel algorithm that could form the fundamental pillar of a darknet-capable Cyber Threat Intelligence platform.
The solution can reduce the anonymity of Tor users, and considers the existing visibility of network traffic before optionally initiating targeted or widespread BGP interception.
In combination with server HTTP response manipulation, the algorithm attempts to reduce the candidate data set to eliminate client-side traffic that is most unlikely to be responsible for server-side connections of interest.
Our test results show that MITM manipulated server responses lead to expected changes received by the Tor client.
Using simulation data generated by Shadow, we show that the detection scheme is effective with a false positive rate of 0.001, while the sensitivity for detecting non-targets was 0.016 ± 0.127.
Our algorithm could assist collaborating organisations willing to share their threat intelligence or cooperate during investigations.
There has been a growing interest in Wireless Distributed Computing (WDC), which leverages collaborative computing over multiple wireless devices.
WDC enables complex applications that a single device cannot support individually.
However, the problem of assigning tasks over multiple devices becomes challenging in the dynamic environments encountered in real-world settings, considering that the resource availability and channel conditions change over time in unpredictable ways due to mobility and other factors.
In this paper, we formulate a task assignment problem as an online learning problem using an adversarial multi-armed bandit framework.
We propose MABSTA, a novel online learning algorithm that learns the performance of unknown devices and channel qualities continually through exploratory probing and makes task assignment decisions by exploiting the gained knowledge.
For maximal adaptability, MABSTA is designed to make no stochastic assumption about the environment.
We analyze it mathematically and provide a worst-case performance guarantee for any dynamic environment.
We also compare it with the optimal offline policy as well as other baselines via emulations on trace-data obtained from a wireless IoT testbed, and show that it offers competitive and robust performance in all cases.
To the best of our knowledge, MABSTA is the first online algorithm in this domain of task assignment problems and provides provable performance guarantee.
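MABSTA's internals are not given here; as a point of reference, a minimal EXP3-style adversarial bandit (exponential weights with importance-weighted reward estimates, the standard tool for settings with no stochastic assumptions) looks like this:

```python
import math
import random

random.seed(0)

def exp3(rewards_fn, K, T, gamma=0.1):
    """Minimal EXP3: exponential weights with uniform exploration.
    rewards_fn(t, arm) -> reward in [0, 1], possibly adversarial."""
    w = [1.0] * K
    pulls = [0] * K
    for t in range(T):
        total = sum(w)
        probs = [(1 - gamma) * wi / total + gamma / K for wi in w]
        arm = random.choices(range(K), weights=probs)[0]
        r = rewards_fn(t, arm)
        # Importance weighting keeps the reward estimate unbiased.
        w[arm] *= math.exp(gamma * (r / probs[arm]) / K)
        pulls[arm] += 1
    return pulls

# Arm 1 always pays 1, the others 0: the learner concentrates on it.
pulls = exp3(lambda t, a: 1.0 if a == 1 else 0.0, K=3, T=2000)
```

The gamma term enforces the continual exploratory probing mentioned above, while the weights exploit the gained knowledge.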
The paper presents a deep learning-aided iterative detection algorithm for massive overloaded MIMO systems.
Since the proposed algorithm is based on the projected gradient descent method with trainable parameters, it is named the trainable projected gradient-detector (TPG-detector).
The trainable internal parameters can be optimized with standard deep learning techniques such as back propagation and stochastic gradient descent algorithms.
This approach, referred to as data-driven tuning, brings notable advantages to the proposed scheme, such as fast convergence.
Numerical experiments show that the TPG-detector achieves detection performance comparable to that of known algorithms for massive overloaded MIMO channels, at lower computational cost.
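A minimal, non-trainable sketch of a projected-gradient detector for BPSK symbols is given below; the step size and projection softness are hand-picked here, whereas in the paper's scheme such internal parameters are trained, and the real setting is massive overloaded MIMO rather than this small well-conditioned toy:

```python
import numpy as np

rng = np.random.default_rng(2)

def pg_detect(H, y, steps=50, gamma=0.5):
    """Detect s in {-1, +1}^n from y = H s: a gradient step on
    ||y - H s||^2 followed by a soft projection (tanh) toward the
    symbol set, with hard slicing via sign() at the end."""
    s = np.zeros(H.shape[1])
    for _ in range(steps):
        s = s + gamma * H.T @ (y - H @ s)  # gradient step
        s = np.tanh(2.0 * s)               # soft projection toward {-1,+1}
    return np.sign(s)

# Well-conditioned toy channel: orthonormal columns via QR.
H, _ = np.linalg.qr(rng.standard_normal((8, 6)))
s_true = rng.choice([-1.0, 1.0], size=6)
y = H @ s_true                             # noiseless observation
print(pg_detect(H, y))
```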
Experience replay is one of the most commonly used approaches to improve the sample efficiency of reinforcement learning algorithms.
In this work, we propose an approach to select and replay sequences of transitions in order to accelerate the learning of a reinforcement learning agent in an off-policy setting.
In addition to selecting appropriate sequences, we also artificially construct transition sequences using information gathered from previous agent-environment interactions.
These sequences, when replayed, allow value function information to trickle down to larger sections of the state/state-action space, thereby making the most of the agent's experience.
We demonstrate our approach on modified versions of standard reinforcement learning tasks such as the mountain car and puddle world problems and empirically show that it enables better learning of value functions as compared to other forms of experience replay.
Further, we briefly discuss some of the possible extensions to this work, as well as applications and situations where this approach could be particularly useful.
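A minimal buffer that replays contiguous transition sequences can be sketched as follows; the paper additionally constructs artificial sequences from past interactions, which this sketch omits:

```python
import random
from collections import deque

random.seed(3)

class SequenceReplay:
    """Replay buffer that samples contiguous sequences of transitions."""

    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)

    def add(self, transition):
        self.buf.append(transition)

    def sample_sequence(self, length):
        """Return `length` consecutive stored transitions."""
        if len(self.buf) < length:
            raise ValueError("not enough transitions stored")
        start = random.randrange(len(self.buf) - length + 1)
        return [self.buf[start + i] for i in range(length)]

replay = SequenceReplay()
for t in range(100):                 # toy (state, action, reward, next_state)
    replay.add((t, 0, 0.0, t + 1))
seq = replay.sample_sequence(5)
```

Replaying such sequences lets value information propagate backward along the stored trajectory rather than from isolated transitions.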
Designing a logo is a long, complicated, and expensive process for any designer.
However, recent advancements in generative algorithms provide models that could offer a possible solution.
Logos are multi-modal, have very few categorical properties, and do not have a continuous latent space.
Yet, conditional generative adversarial networks can be used to generate logos that could help designers in their creative process.
We propose LoGAN: an improved auxiliary classifier Wasserstein generative adversarial neural network (with gradient penalty) that is able to generate logos conditioned on twelve different colors.
In 768 generated instances (12 classes and 64 logos per class), when looking at the most prominent color, the conditional generation part of the model has an overall precision and recall of 0.8 and 0.7 respectively.
LoGAN's results offer a first glance at how artificial intelligence can be used to assist designers in their creative process and open promising future directions, such as including more descriptive labels which will provide a more exhaustive and easy-to-use system.
Network virtualization and the softwarization of network functions are trends aiming at higher network efficiency, cost reduction, and agility.
They are driven by the evolution in Software Defined Networking (SDN) and Network Function Virtualization (NFV).
This shows that software will play an increasingly important role within telecommunication services, which were previously dominated by hardware appliances.
Service providers can benefit from this, as it enables faster introduction of new telecom services, combined with an agile set of possibilities to optimize and fine-tune their operations.
However, the provided telecom services can only evolve if the adequate software tools are available.
In this article, we explain how the development, deployment and maintenance of such an SDN/NFV-based telecom service puts specific requirements on the platform providing it.
A Software Development Kit (SDK) is introduced, allowing service providers to adequately design, test and evaluate services before they are deployed in production and also update them during their lifetime.
This continuous cycle between development and operations, a concept known as DevOps, is a well-known strategy in software development.
To extend this concept to SDN/NFV-based services, however, the functionality provided by traditional cloud platforms is not yet sufficient.
By giving an overview of the currently available tools and their limitations, the gaps in DevOps for SDN/NFV services are highlighted.
The benefit of such an SDK is illustrated by a secure content delivery network service (enhanced with deep packet inspection and elastic routing capabilities).
With this use-case, the dynamics between developing and deploying a service are further illustrated.
Overfitting is one of the most critical challenges in deep neural networks, and there are various types of regularization methods to improve generalization performance.
Injecting noises to hidden units during training, e.g., dropout, is known as a successful regularizer, but it is still not clear enough why such training techniques work well in practice and how we can maximize their benefit in the presence of two conflicting objectives---optimizing to true data distribution and preventing overfitting by regularization.
This paper addresses the above issues by 1) interpreting that the conventional training methods with regularization by noise injection optimize the lower bound of the true objective and 2) proposing a technique to achieve a tighter lower bound using multiple noise samples per training example in a stochastic gradient descent iteration.
We demonstrate the effectiveness of our idea in several computer vision applications.
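The multi-sample bound can be sketched in a few lines of numpy. This is an illustrative stand-in, not the paper's code: the input-dropout noise, the function names, and the toy softmax classifier are our assumptions; the key point is averaging likelihoods before the log.

```python
import numpy as np

def multi_sample_loss(logits_fn, x, y, num_noise=5, rng=None):
    """Average the likelihood (not the log-likelihood) over several noise
    realizations before taking the log; by Jensen's inequality,
    -log E_z p(y|x,z) <= -E_z log p(y|x,z), so minimizing it optimizes a
    tighter lower bound on log p(y|x) than single-sample noisy training."""
    rng = np.random.default_rng(rng)
    likelihoods = []
    for _ in range(num_noise):
        mask = (rng.random(x.shape) > 0.5).astype(x.dtype)  # dropout noise
        logits = logits_fn(x * mask)
        log_probs = logits - np.logaddexp.reduce(logits)    # log-softmax
        likelihoods.append(np.exp(log_probs[y]))
    return -np.log(np.mean(likelihoods))
```

Setting `num_noise=1` recovers the conventional noisy-training objective.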
The reassembly of broken archaeological ceramic pottery is an open and complex problem, which remains a scientific process of extreme interest for the archaeological community.
Usually, the solutions suggested by various research groups and universities depend on aspects such as the matching of the broken surfaces, the outline of the sherds, their colors and geometric characteristics, their axis of symmetry, the corners of their contour, the theme portrayed on the surface, or the concentric circular rills left on the inner side of the pottery by the fingers of the potter during base construction.
In this work the reassembly process is based on a different and more robust idea, since it relies on the thickness profile, which is appropriately identified in every fragment.
Specifically, our approach is based on information encapsulated in the inner part of the sherd (i.e., its thickness), which is not, or at least not heavily, affected by harsh environmental conditions, but is safely kept within the sherd itself.
Our method is verified in various use-case experiments, using cutting-edge technologies such as 3D representations and precise measurements on surfaces from the acquired 3D models.
Motion planning problems have been studied by both the robotics and the controls research communities for a long time, and many algorithms have been developed for their solution.
Among them, incremental sampling-based motion planning algorithms, such as the Rapidly-exploring Random Trees (RRTs), and the Probabilistic Road Maps (PRMs) have become very popular recently, owing to their implementation simplicity and their advantages in handling high-dimensional problems.
Although these algorithms work very well in practice, the quality of the computed solution is often not good, i.e., the solution can be far from the optimal one.
A recent variation of RRT, namely the RRT* algorithm, bypasses this drawback of the traditional RRT algorithm, by ensuring asymptotic optimality as the number of samples tends to infinity.
Nonetheless, the convergence rate to the optimal solution may still be slow.
This paper presents a new incremental sampling-based motion planning algorithm based on Rapidly-exploring Random Graphs (RRG), denoted RRT# (RRT "sharp") which also guarantees asymptotic optimality but, in addition, it also ensures that the constructed spanning tree of the geometric graph is consistent after each iteration.
In consistent trees, the vertices which have the potential to be part of the optimal solution have the minimum cost-to-come value.
This implies that the best possible solution is readily computed if there are some vertices in the current graph that are already in the goal region.
Numerical results comparing the new algorithm with RRT* are presented.
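The consistency property above can be checked directly on a spanning tree. The following is a toy sketch with our own data layout (a child-to-parent map and per-edge costs), not the paper's data structures:

```python
def is_consistent(parent, cost, edge_cost, tol=1e-9):
    """A spanning tree is consistent when every vertex's cost-to-come
    equals its parent's cost-to-come plus the cost of the connecting
    edge, i.e. no vertex can be improved under the current parents."""
    return all(
        abs(cost[v] - (cost[p] + edge_cost[(p, v)])) < tol
        for v, p in parent.items()
    )
```

An inconsistent tree signals that some rewiring step (propagating a cheaper cost-to-come to descendants) is still pending.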
This paper explores the problem of sockpuppet detection in deceptive opinion spam using authorship attribution and verification approaches.
Two methods are explored.
The first is a feature subsampling scheme that uses the KL-Divergence on stylistic language models of an author to find discriminative features.
The second is a transduction scheme, spy induction, that leverages the diversity of authors in the unlabeled test set by sending a set of spies (positive samples) from the training set to retrieve hidden samples in the unlabeled test set using nearest and farthest neighbors.
Experiments using ground truth sockpuppet data show the effectiveness of the proposed schemes.
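The KL-divergence comparison of stylistic language models can be sketched with character n-gram distributions. This is a simplified stand-in for the authors' stylistic models; the n-gram order and epsilon smoothing are our choices:

```python
import math
from collections import Counter

def stylistic_kl(text_a, text_b, n=2, eps=1e-6):
    """KL divergence between the character n-gram distributions of two
    texts, used as a rough stylistic distance between authors."""
    def ngram_dist(text):
        grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
        total = sum(grams.values())
        return {g: c / total for g, c in grams.items()}
    p, q = ngram_dist(text_a), ngram_dist(text_b)
    # Unseen n-grams in q get a small epsilon mass to keep the sum finite.
    return sum(pv * math.log(pv / q.get(g, eps)) for g, pv in p.items())
```

Features whose removal most changes such divergences between an author's texts and others' texts are the discriminative ones the subsampling scheme seeks.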
The Apache Spark framework for distributed computation is popular in the data analytics community due to its ease of use, but its MapReduce-style programming model can incur significant overheads when performing computations that do not map directly onto this model.
One way to mitigate these costs is to off-load computations onto MPI codes.
In recent work, we introduced Alchemist, a system for the analysis of large-scale data sets.
Alchemist calls MPI-based libraries from within Spark applications, and it has minimal coding, communication, and memory overheads.
In particular, Alchemist allows users to retain the productivity benefits of working within the Spark software ecosystem without sacrificing performance efficiency in linear algebra, machine learning, and other related computations.
In this paper, we discuss the motivation behind the development of Alchemist, and we provide a detailed overview of its design and usage.
We also demonstrate the efficiency of our approach on medium-to-large data sets, using some standard linear algebra operations, namely matrix multiplication and the truncated singular value decomposition of a dense matrix, and we compare the performance of Spark with that of Spark+Alchemist.
These computations are run on the NERSC supercomputer Cori Phase 1, a Cray XC40.
We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections.
The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets.
The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres.
It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies.
We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition.
Code, data, and usage examples are available at https://github.com/mdeff/fma
Existing methods for interactive image retrieval have demonstrated the merit of integrating user feedback, improving retrieval results.
However, most current systems rely on restricted forms of user feedback, such as binary relevance responses, or feedback based on a fixed set of relative attributes, which limits their impact.
In this paper, we introduce a new approach to interactive image search that enables users to provide feedback via natural language, allowing for more natural and effective interaction.
We formulate the task of dialog-based interactive image retrieval as a reinforcement learning problem, and reward the dialog system for improving the rank of the target image during each dialog turn.
To mitigate the cumbersome and costly process of collecting human-machine conversations as the dialog system learns, we train our system with a user simulator, which is itself trained to describe the differences between target and candidate images.
The efficacy of our approach is demonstrated in a footwear retrieval application.
Experiments on both simulated and real-world data show that 1) our proposed learning framework achieves better accuracy than other supervised and reinforcement learning baselines and 2) user feedback based on natural language rather than pre-specified attributes leads to more effective retrieval results, and a more natural and expressive communication interface.
Interest in emergent communication has recently surged in Machine Learning.
The focus of this interest has largely been either on investigating the properties of the learned protocol or on utilizing emergent communication to better solve problems that already have a viable solution.
Here, we consider self-driving cars coordinating with each other and focus on how communication influences the agents' collective behavior.
Our main result is that communication helps (most) with adverse conditions.
Recent progress on many imaging and vision tasks has been driven by the use of deep feed-forward neural networks, which are trained by propagating gradients of a loss defined on the final output, back through the network up to the first layer that operates directly on the image.
We propose back-propagating one step further---to learn camera sensor designs jointly with networks that carry out inference on the images they capture.
In this paper, we specifically consider the design and inference problems in a typical color camera---where the sensor is able to measure only one color channel at each pixel location, and computational inference is required to reconstruct a full color image.
We learn the camera sensor's color multiplexing pattern by encoding it as a layer whose learnable weights determine which color channel, from among a fixed set, will be measured at each location.
These weights are jointly trained with those of a reconstruction network that operates on the corresponding sensor measurements to produce a full color image.
Our network achieves significant improvements in accuracy over the traditional Bayer pattern used in most color cameras.
It automatically learns to employ a sparse color measurement approach similar to that of a recent design, and moreover, improves upon that design by learning an optimal layout for these measurements.
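A soft relaxation of such a learnable multiplexing layer can be sketched as follows. The shapes, class name, and temperature mechanism are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class MultiplexLayer:
    """Learnable per-pixel color selection: a softmax over per-pixel
    channel logits relaxes the hard choice of which single color channel
    a sensor pixel measures."""

    def __init__(self, height, width, channels=3, seed=0):
        self.logits = np.random.default_rng(seed).normal(
            size=(height, width, channels))

    def forward(self, image, temperature=1.0):
        # image: (height, width, channels) -> measurement: (height, width)
        # Lowering the temperature pushes the soft selection toward one-hot.
        weights = softmax(self.logits / temperature)
        return (weights * image).sum(axis=-1)
```

Training the logits jointly with a reconstruction network, then annealing the temperature, yields a hard per-pixel channel assignment at test time.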
This paper presents the Computoser hybrid probability/rule based algorithm for music composition (http://computoser.com) and provides a reference implementation.
It addresses the unpleasantness and lack of variation exhibited by many existing approaches by combining the two methods, basing the parameters of the rules on data obtained from a preliminary analysis.
A sample of 500+ musical pieces was analyzed to derive probabilities for musical characteristics and events (e.g. scale, tempo, intervals).
The algorithm was constructed to produce musical pieces using the derived probabilities combined with a large set of composition rules, which were obtained and structured after studying established composition practices.
Generated pieces were published on the Computoser website where evaluation was performed by listeners.
The feedback was positive (58.4% approval), asserting the merits of the undertaken approach.
The paper compares this hybrid approach to other approaches to algorithmic composition and presents a survey of the pleasantness of the resulting music.
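Sampling a musical event from corpus-derived probabilities, the probabilistic half of such a hybrid, can be sketched as follows. The probability values shown in the test are illustrative, not the statistics actually derived from the 500+ analyzed pieces:

```python
import random

def sample_interval(interval_probs, rng=None):
    """Sample the next melodic interval (in semitones) according to
    corpus-derived probabilities; composition rules would then accept or
    reject the sampled event."""
    rng = rng or random.Random(0)
    intervals = list(interval_probs)
    weights = [interval_probs[i] for i in intervals]
    return rng.choices(intervals, weights=weights, k=1)[0]
```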
Although computation is becoming much more complex on data of unprecedented scale, we argue that computers and smart devices should, and will, consistently provide information and knowledge to human beings within a few tens of milliseconds.
We coin a new term, 10-millisecond computing, to call attention to this class of workloads.
10-millisecond computing raises many challenges for both software and hardware stacks.
In this paper, using a typical workload, memcached, on a 40-core server (a mainstream server in the near future), we quantitatively measure the challenges that 10-ms computing poses to conventional operating systems.
For better communication, we propose a simple metric, the outlier proportion, to measure quality of service: for N completed requests or jobs, if M of them have latencies exceeding the outlier threshold t, the outlier proportion is M/N.
For a 1K-scale system running Linux (version 2.6.32), LXC (version 0.7.5), or XEN (version 4.0.0), we surprisingly find that, to reduce the service-level outlier proportion to 10% (so that only 10% of users perceive QoS degradation), the outlier proportion of a single server has to be reduced by 871X, 2372X, and 2372X, respectively.
Also, we discuss the possible design spaces of 10-ms computing systems from perspectives of datacenter architectures, networking, OS and scheduling, and benchmarking.
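The outlier-proportion metric defined above is simple enough to state directly in code; this is our own sketch of the M/N definition, not the authors' measurement harness:

```python
def outlier_proportion(latencies_ms, threshold_ms):
    """Fraction of requests whose latency exceeds the outlier threshold t:
    for N completed requests with M over-threshold latencies, M / N."""
    n = len(latencies_ms)
    if n == 0:
        return 0.0
    m = sum(1 for x in latencies_ms if x > threshold_ms)
    return m / n
```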
The use of preferences in query answering, both in traditional databases and in ontology-based data access, has recently received much attention, due to its many real-world applications.
In this paper, we tackle the problem of top-k query answering in Datalog+/- ontologies subject to the querying user's preferences and a collection of (subjective) reports of other users.
Here, each report consists of scores for a list of features, its author's preferences among the features, as well as other information.
These pieces of information from every report are then combined with the querying user's preferences and his/her trust in each report to rank the query results.
We present two alternative such rankings, along with algorithms for top-k (atomic) query answering under these rankings.
We also show that, under suitable assumptions, these algorithms run in polynomial time in the data complexity.
We finally present more general reports, which are associated with sets of atoms rather than single atoms.
A key enabler for the emerging autonomous and cooperative driving services is high-throughput and reliable Vehicle-to-Network (V2N) communication.
In this respect, the millimeter wave (mmWave) frequencies hold great promises because of the large available bandwidth which may provide the required link capacity.
However, this potential is hindered by the challenging propagation characteristics of high-frequency channels and the dynamic topology of the vehicular scenarios, which affect the reliability of the connection.
Moreover, mmWave transmissions typically leverage beamforming gain to compensate for the increased path loss experienced at high frequencies.
This, however, requires fine alignment of the transmitting and receiving beams, which may be difficult in vehicular scenarios.
Those limitations may undermine the performance of V2N communications and pose new challenges for proper vehicular communication design.
In this paper, we study by simulation the practical feasibility of some mmWave-aware strategies to support V2N, in comparison to the traditional LTE connectivity below 6 GHz.
The results show that the orchestration among different radios represents a viable solution to enable both high-capacity and robust V2N communications.
We propose an innovative meteorological radar, which uses reduced number of spatiotemporal samples without compromising the accuracy of target information.
Our approach extends recent research on compressed sensing (CS) for radar remote sensing of hard point scatterers to volumetric targets.
The previously published CS-based radar techniques are not applicable for sampling weather since the precipitation echoes lack sparsity in both range-time and Doppler domains.
We propose an alternative approach by adopting the latest advances in matrix completion algorithms to demonstrate the sparse sensing of weather echoes.
We use Iowa X-band Polarimetric (XPOL) radar data to test and illustrate our algorithms.
Using a dataset of over 1.9 million messages posted on Twitter by about 25,000 ISIS members, we explore how ISIS makes use of social media to spread its propaganda and to recruit militants from the Arab world and across the globe.
By distinguishing between violence-driven, theological, and sectarian content, we trace the connection between online rhetoric and key events on the ground.
To the best of our knowledge, ours is one of the first studies to focus on Arabic content, while most literature focuses on English content.
Our findings yield new important insights about how social media is used by radical militant groups to target the Arab-speaking world, and reveal important patterns in their propaganda efforts.
Nowadays, providing higher data rates is a key goal for wireless communication systems.
Interference is one of the main obstacles to reaching this goal.
Interference alignment is a management technique that aligns the interference from other transmitters into the lowest-dimensional subspace possible at each receiver, thereby leaving the remaining dimensions free for the interference-free signal.
Uncoordinated interference is interference that cannot be aligned jointly with the interference from the coordinated part; consequently, it degrades the performance of interference alignment approaches.
In this paper, we propose two rank minimization methods to enhance the performance of interference alignment in the presence of uncoordinated interference sources.
First, a new objective function is chosen; then a new class of convex relaxation is proposed with respect to the uncoordinated interference, which decreases the optimal value of our optimization problem.
Moreover, we use the Schatten p-norm as a surrogate of the rank function and implement an iteratively reweighted algorithm to solve the optimization problem.
In addition, we apply our proposed methods to mitigate interference in the relay-aided MIMO interference channel, and propose a weighted-sum method, based on the rank minimization approach, to improve the performance of interference alignment in the amplify-and-forward relay-aided MIMO system.
Finally, our simulation results show that the proposed methods obtain considerably higher multiplexing gain and sum rate than other approaches in the interference alignment framework.
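The Schatten p-norm surrogate of the rank function mentioned above is straightforward to compute; this minimal sketch only evaluates the norm, not the full iteratively reweighted solver:

```python
import numpy as np

def schatten_p_norm(a, p=0.5):
    """Schatten p-norm: the l_p norm of the singular values. For p < 1 it
    is a nonconvex surrogate of rank, since sum(s_i**p) approaches the
    number of nonzero singular values as p -> 0."""
    s = np.linalg.svd(a, compute_uv=False)
    return float((s ** p).sum() ** (1.0 / p))
```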
This paper proposes a novel framework for the use of eye movement patterns for biometric applications.
Eye movements contain abundant information about cognitive brain functions, neural pathways, etc.
In the proposed method, eye movement data is classified into fixations and saccades.
Features extracted from fixations and saccades are used by a Gaussian Radial Basis Function Network (GRBFN) based method for biometric authentication.
A score fusion approach is adopted to classify the data in the output layer.
In the evaluation stage, the algorithm has been tested using two types of stimuli: random dot following on a screen and text reading.
The results indicate the strength of eye movement pattern as a biometric modality.
The algorithm has been evaluated on BioEye 2015 database and found to outperform all the other methods.
Eye movements are generated by a complex oculomotor plant which is very hard to spoof by mechanical replicas.
Use of eye movement dynamics along with iris recognition technology may lead to a robust counterfeit-resistant person identification system.
In this paper, we consider a time-optimal control problem with uncertainties.
The dynamics of the controlled object are expressed by a crisp linear system of differential equations with fuzzy initial and final states.
We introduce a notion of fuzzy optimal time and reduce its calculation to two crisp optimal control problems.
We illustrate the proposed approach with an example.
We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI).
Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution.
Yet, through experiments on ten distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority class baseline across a number of NLI datasets.
Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
In this paper, we present a joint compression and classification approach of EEG and EMG signals using a deep learning approach.
Specifically, we build our system based on the deep autoencoder architecture which is designed not only to extract discriminant features in the multimodal data representation but also to reconstruct the data from the latent representation using encoder-decoder layers.
Since autoencoder can be seen as a compression approach, we extend it to handle multimodal data at the encoder layer, reconstructed and retrieved at the decoder layer.
We show through experimental results that exploiting both the inter-correlation and intra-correlation of the multimodal data 1) significantly reduces signal distortion, particularly at high compression levels, and 2) achieves better accuracy in classifying EEG and EMG signals recorded and labeled according to the sentiments of the volunteer.
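The compression role of the encoder-decoder pair can be sketched with a minimal linear autoencoder. This is a toy stand-in for the deep multimodal architecture in the abstract; the dimensions, learning rate, and gradient-descent training loop are our illustrative choices:

```python
import numpy as np

def train_linear_autoencoder(x, latent_dim=4, lr=0.01, epochs=200, seed=0):
    """Minimal linear autoencoder: the latent code z plays the role of the
    compressed representation, and z @ w_dec reconstructs the signal."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = rng.normal(scale=0.1, size=(d, latent_dim))
    w_dec = rng.normal(scale=0.1, size=(latent_dim, d))
    for _ in range(epochs):
        z = x @ w_enc                      # encode (compress)
        err = z @ w_dec - x                # reconstruction error
        w_dec -= lr * z.T @ err / n
        w_enc -= lr * x.T @ (err @ w_dec.T) / n
    return w_enc, w_dec
```

In the multimodal setting, EEG and EMG channels would be concatenated at the encoder input so the latent code captures their inter-correlation.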
Several recent papers use image denoising with a Fields of Experts prior to benchmark discrete optimization methods.
We show that a non-linear least squares solver significantly outperforms all known discrete methods on this problem.
Dynamic ensemble selection (DES) techniques work by estimating the level of competence of each classifier from a pool of classifiers.
Only the most competent ones are selected to classify a given test sample.
Hence, the key issue in DES is the criterion used to estimate the level of competence of the classifiers in predicting the label of a given test sample.
In order to perform a more robust ensemble selection, we proposed the META-DES framework using meta-learning, where multiple criteria are encoded as meta-features and are passed down to a meta-classifier that is trained to estimate the competence level of a given classifier.
In this technical report, we present a step-by-step analysis of each phase of the framework during training and test.
We show how each set of meta-features is extracted as well as their impact on the estimation of the competence level of the base classifier.
Moreover, we analyze the impact of several factors on system performance, such as the number of classifiers in the pool, the use of different linear base classifiers, and the size of the validation data.
We show that using the dynamic selection of linear classifiers through the META-DES framework, we can solve complex non-linear classification problems where other combination techniques such as AdaBoost cannot.
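One of the classical competence criteria that META-DES encodes as a meta-feature is local accuracy; it can be sketched as follows. The function name and k value are illustrative, and the real framework combines several such criteria through a meta-classifier:

```python
import numpy as np

def local_competence(classifier, x_query, x_val, y_val, k=7):
    """Competence of one base classifier, estimated as its accuracy on
    the k validation samples nearest to the query point."""
    dists = np.linalg.norm(x_val - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    preds = classifier.predict(x_val[nearest])
    return float(np.mean(preds == y_val[nearest]))
```

Only the classifiers whose estimated competence passes a threshold are selected to vote on the test sample.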
Exploiting dependencies between labels is considered to be crucial for multi-label classification.
Rules are able to expose label dependencies such as implications, subsumptions or exclusions in a human-comprehensible and interpretable manner.
However, the induction of rules with multiple labels in the head is particularly challenging, as the number of label combinations which must be taken into account for each rule grows exponentially with the number of available labels.
To overcome this limitation, algorithms for exhaustive rule mining typically use properties such as anti-monotonicity or decomposability in order to prune the search space.
In the present paper, we examine whether commonly used multi-label evaluation metrics satisfy these properties and therefore are suited to prune the search space for multi-label heads.
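How anti-monotonicity prunes the exponential search space can be sketched with a level-wise search. This is a simplified apriori-style sketch over label sets, not a specific multi-label rule learner:

```python
from itertools import combinations

def frequent_labelsets(labels, rows, min_support):
    """Level-wise search exploiting anti-monotonicity of support: if a
    label set is infrequent, every superset is too, so it is pruned and
    never extended."""
    def support(s):
        return sum(1 for r in rows if s <= r) / len(rows)

    frequent = []
    level = [frozenset([l]) for l in labels]
    while level:
        kept = [s for s in level if support(s) >= min_support]
        frequent.extend(kept)
        # candidates one label larger, built only from surviving sets
        level = list({a | b for a, b in combinations(kept, 2)
                      if len(a | b) == len(a) + 1})
    return frequent
```

The question the paper studies is whether a given multi-label evaluation metric behaves like `support` here, i.e. whether it licenses the same pruning.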
Blockchains have recently been under the spotlight due to the boom of cryptocurrencies and decentralized applications.
There is an increasing demand for querying the data stored in a blockchain database.
To ensure query integrity, the user can maintain the entire blockchain database and query the data locally.
However, this approach is not economic, if not infeasible, because of the blockchain's huge data size and considerable maintenance costs.
In this paper, we take the first step toward investigating the problem of verifiable query processing over blockchain databases.
We propose a novel framework, called vChain, that alleviates the storage and computing costs of the user and employs verifiable queries to guarantee the results' integrity.
To support verifiable Boolean range queries, we propose an accumulator-based authenticated data structure that enables dynamic aggregation over arbitrary query attributes.
Two new indexes are further developed to aggregate intra-block and inter-block data records for efficient query verification.
We also propose an inverted prefix tree structure to accelerate the processing of a large number of subscription queries simultaneously.
Security analysis and empirical study validate the robustness and practicality of the proposed techniques.
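The generic building block behind such verifiable queries is recomputing a digest from a proof; a minimal Merkle-style inclusion check is sketched below. This is only the basic primitive, not vChain's accumulator-based structure, which additionally supports dynamic aggregation over query attributes:

```python
import hashlib

def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf: bytes, proof, root: bytes) -> bool:
    """Verify a Merkle-style inclusion proof by hashing the leaf together
    with the supplied sibling hashes up to the published root."""
    node = sha(leaf)
    for sibling, sibling_is_left in proof:
        node = sha(sibling + node) if sibling_is_left else sha(node + sibling)
    return node == root
```

A light client stores only the root (e.g. from a block header) and checks each query result against a proof of this shape.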
Software developers create and share code online to demonstrate programming language concepts and programming tasks.
Code snippets can be a useful way to explain and demonstrate a programming concept, but may not always be directly executable.
A code snippet can contain parse errors, or fail to execute if the environment contains unmet dependencies.
This paper presents an empirical analysis of the executable status of Python code snippets shared through the GitHub gist system, and the ability of developers familiar with software configuration to correctly configure and run them.
We find that 75.6% of gists require non-trivial configuration to overcome missing dependencies, configuration files, reliance on a specific operating system, or some other environment configuration.
Our study also suggests the natural assumption developers make about resource names when resolving configuration errors is correct less than half the time.
We also present Gistable, a database and extensible framework built on GitHub's gist system, which provides executable code snippets to enable reproducible studies in software engineering.
Gistable contains 10,259 code snippets, approximately 5,000 with a Dockerfile to configure and execute them without import error.
Gistable is publicly available at https://github.com/gistable/gistable.
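The dependency-resolution assumption the study evaluates can be sketched by extracting a snippet's top-level imports. This sketch is ours, not Gistable's code; it only illustrates the module names whose naive mapping to `pip install` targets holds less than half the time:

```python
import ast

def top_level_imports(source: str):
    """List the top-level module names a Python snippet imports."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return sorted(names)
```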
Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to.
We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise.
Specifically, we show that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI (Bowman et al., 2015) and 53% of MultiNLI (Williams et al., 2017).
Our analysis reveals that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes.
Our findings suggest that the success of natural language inference models to date has been overestimated, and that the task remains a hard open problem.
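One of the correlated cues mentioned above, negation, can be illustrated with a toy context-free heuristic. This is our illustration of the phenomenon; real hypothesis-only models learn many such statistical cues rather than a hand-written word list:

```python
def hypothesis_only_guess(hypothesis):
    """Toy context-free cue: negation words in the hypothesis correlate
    with the 'contradiction' label in crowd-sourced NLI data."""
    negations = {"not", "no", "never", "nobody", "nothing"}
    tokens = set(hypothesis.lower().split())
    return "contradiction" if negations & tokens else "entailment"
```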
This paper describes LIUM submissions to WMT17 News Translation Task for English-German, English-Turkish, English-Czech and English-Latvian language pairs.
We train BPE-based attentive Neural Machine Translation systems with and without factored outputs using the open source nmtpy framework.
Competitive scores were obtained by ensembling various systems and exploiting the availability of target monolingual corpora for back-translation.
The impact of back-translation quantity and quality is also analyzed for English-Turkish where our post-deadline submission surpassed the best entry by +1.6 BLEU.
Over the last decade, the process of automatic image colorization has been of significant interest for several application areas including restoration of aged or degraded images.
This problem is highly ill-posed due to the large degrees of freedom during the assignment of color information.
Many of the recent developments in automatic colorization involve images that contain a common theme or require highly processed data such as semantic maps as input.
In our approach, we attempt to fully generalize the colorization procedure using a conditional Deep Convolutional Generative Adversarial Network (DCGAN), extend current methods to high-resolution images and suggest training strategies that speed up the process and greatly stabilize it.
The network is trained over datasets that are publicly available such as CIFAR-10 and Places365.
The results of the generative model and traditional deep neural networks are compared.
We propose a data-driven framework for optimizing privacy-preserving data release mechanisms toward the information-theoretically optimal tradeoff between minimizing distortion of useful data and concealing sensitive information.
Our approach employs adversarially-trained neural networks to implement randomized mechanisms and to perform a variational approximation of mutual information privacy.
We empirically validate our Privacy-Preserving Adversarial Networks (PPAN) framework with experiments conducted on discrete and continuous synthetic data, as well as the MNIST handwritten digits dataset.
With the synthetic data, we find that our model-agnostic PPAN approach achieves tradeoff points very close to the optimal tradeoffs that are analytically-derived from model knowledge.
In experiments with the MNIST data, we visually demonstrate a learned tradeoff between minimizing the pixel-level distortion versus concealing the written digit.
Scheduling and managing queues with bounded buffers are among the most fundamental problems in computer networking.
Traditionally, it is often assumed that all the properties of each packet are known immediately upon arrival.
However, as traffic becomes increasingly heterogeneous and complex, such assumptions are in many cases invalid.
In particular, in various scenarios information about packet characteristics becomes available only after the packet has undergone some initial processing.
In this work, we study the problem of managing queues with limited knowledge.
We start by showing lower bounds on the competitive ratio of any algorithm in such settings.
Next, we use the insight obtained from these bounds to identify several algorithmic concepts appropriate for the problem, and use these guidelines to design a concrete algorithmic framework.
We analyze the performance of our proposed algorithm, and further show how it can be implemented in various settings, which differ by the type and nature of the unknown information.
We further validate our results and algorithmic approach by a simulation study that provides further insights as to our algorithmic design principles in face of limited knowledge.
The Cloud radio access network (C-RAN) offers a revolutionary approach to cellular network deployment, management and evolution.
Advances in software-defined radio (SDR) and networking technology, moreover, enable delivering software-defined everything through the Cloud.
Resources will be pooled and dynamically allocated leveraging abstraction, virtualization, and consolidation techniques; processes will be automated using common application programming interfaces; and network functions and services will be programmatically provided through an orchestrator.
OOCRAN, oocran.dynu.com, is a software framework that is based on the NFV MANO architecture proposed by ETSI.
It provides an orchestration layer for the entire wireless infrastructure, including hardware, software, spectrum, fronthaul and backhaul.
OOCRAN extends existing NFV management frameworks by incorporating the radio communications layers and their management dependencies.
The wireless infrastructure provider can then dynamically provision virtualized wireless networks to wireless service providers.
The testbed's physical infrastructure is built around a computing cluster that executes open-source SDR libraries and connects to SDR-based remote radio heads.
We demonstrate the operation of OOCRAN and discuss the temporal implications of dynamic LTE small cell network deployments.
Optical backbone networks carry a huge amount of bandwidth and serve as a key enabling technology to provide telecommunication connectivity across the world.
Hence, in events of network component (node/link) failures, communication networks may suffer from a huge amount of bandwidth loss and service disruption.
Natural disasters such as earthquakes, hurricanes, tornadoes, etc., occur at different places around the world, causing severe communication service disruptions due to network component failures.
Most of the previous works on optical network survivability assume that failures will occur in the future, and the network is made survivable to ensure connectivity in events of failures.
With the advancements in seismology, the predictions of earthquakes are becoming more accurate.
Earthquakes have been a major cause of telecommunication service disruption in the past.
Hence, the information provided by the meteorological departments and other similar agencies of different countries may be helpful in designing networks that are more robust against earthquakes.
In this work, we consider the actual information provided by the Indian Meteorological Department (IMD) on seismic zones and on earthquakes that have occurred in India in the past, and propose a scheme to improve the survivability of the existing Indian optical network through minute changes in network topology.
Simulations show that significant improvement in network survivability can be achieved using the proposed scheme in events of earthquakes.
The evolution of deep learning shows that some algorithmic tricks are more durable than others.
To the best of our knowledge, we are the first to summarize five durable and complete deep learning components for vision, namely WARSHIP.
Moreover, we give a biological overview of WARSHIP, emphasizing brain-inspired computing of WARSHIP.
As a step towards WARSHIP, our case study of image super resolution combines 3 components of RSH to deploy a CNN model of WARSHIP-XZNet, which strikes a happy medium between speed and performance.
In the modal mu-calculus, a formula is well-formed if each recursive variable occurs underneath an even number of negations.
By means of De Morgan's laws, it is easy to transform any well-formed formula into an equivalent formula without negations -- its negation normal form.
Moreover, if the formula is of size n, its negation normal form is of the same size, O(n).
The full modal mu-calculus and the negation normal form fragment are thus equally expressive and concise.
In this paper we extend this result to the higher-order modal fixed point logic (HFL), an extension of the modal mu-calculus with higher-order recursive predicate transformers.
We present a procedure that converts a formula into an equivalent formula without negations of quadratic size in the worst case and of linear size when the number of variables of the formula is fixed.
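The base case of this construction — pushing negations inward with De Morgan's laws — can be sketched for plain propositional connectives (HFL's higher-order predicate transformers and fixed points are deliberately not modeled here):

```python
# Minimal sketch: formulas are nested tuples
#   ('var', name) | ('not', f) | ('and', f, g) | ('or', f, g).
# nnf() pushes negations inward until they sit directly on variables.

def nnf(f):
    tag = f[0]
    if tag == 'var':
        return f
    if tag in ('and', 'or'):
        return (tag, nnf(f[1]), nnf(f[2]))
    # tag == 'not'
    g = f[1]
    if g[0] == 'var':
        return f                                  # literal: already in NNF
    if g[0] == 'not':
        return nnf(g[1])                          # double negation elimination
    dual = 'or' if g[0] == 'and' else 'and'       # De Morgan's laws
    return (dual, nnf(('not', g[1])), nnf(('not', g[2])))
```

In this first-order propositional setting the output size stays linear in the input size; the paper's contribution is that for HFL the analogous transformation is quadratic in the worst case, and linear only when the number of variables is fixed.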
The sensitivity of networks regarding the removal of vertices has been studied extensively within the last 15 years.
A common approach to measure this sensitivity is (i) removing successively vertices by following a specific removal strategy and (ii) comparing the original and the modified network using a specific comparison method.
In this paper we apply a wide range of removal strategies and comparison methods in order to study the sensitivity of medium-sized networks from real world and randomly generated networks.
In the first part of our study we observe that social networks and web graphs differ in sensitivity.
When removing vertices, social networks are robust, whereas web graphs are not.
This effect is consistent with the work of Boldi et al., who analyzed very large networks.
For similarly generated random graphs we find that the sensitivity highly depends on the comparison method.
The choice of the removal strategy has surprisingly marginal impact on the sensitivity as long as we consider removal strategies implied by common centrality measures.
However, it has a strong effect when removing the vertices in random order.
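The (i)/(ii) recipe above can be sketched in pure Python on a hypothetical toy graph: remove vertices following one removal strategy (here, descending degree, a common centrality-implied strategy) and compare the original and modified network by one comparison method (here, the size of the largest connected component).

```python
# Toy sensitivity experiment: degree-based removal strategy +
# largest-connected-component comparison method.

def largest_cc(adj, alive):
    """Size of the largest connected component among the vertices in `alive`."""
    best, seen = 0, set()
    for s in alive:
        if s in seen:
            continue
        comp, stack = 0, [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            comp += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, comp)
    return best

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}  # toy graph
alive = set(adj)
baseline = largest_cc(adj, alive)
order = sorted(adj, key=lambda u: -len(adj[u]))   # degree removal strategy
removed_one = largest_cc(adj, alive - {order[0]})  # compare after one removal
```

Swapping the removal order (betweenness, random, ...) or the comparison method (diameter, distance distributions, ...) yields the grid of strategy-by-method experiments the study runs on medium-sized networks.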
Variance-based logic (VBL) uses the fluctuations or the variance in the state of a particle or a physical quantity to represent different logic levels.
In this letter we show that compared to the traditional bi-stable logic representation the variance-based representation can theoretically achieve a superior performance trade-off (in terms of energy dissipation and information capacity) when operating at fundamental limits imposed by thermal-noise.
We show that for a bi-stable logic device the lower limit on energy dissipated per bit is 4.35 kT/bit, whereas under similar operating conditions, a VBL device could achieve a lower limit of sub-kT/bit.
These theoretical results are general enough to be applicable to different instantiations and variants of VBL, ranging from digital processors based on energy scavenging to processors based on the emerging valleytronic devices.
Most Semantic Role Labeling (SRL) approaches are supervised methods which require a significant amount of annotated corpus, and the annotation requires linguistic expertise.
In this paper, we propose a Multi-Task Active Learning framework for Semantic Role Labeling with Entity Recognition (ER) as the auxiliary task to alleviate the need for extensive data and use additional information from ER to help SRL.
We evaluate our approach on Indonesian conversational dataset.
Our experiments show that multi-task active learning can outperform single-task active learning method and standard multi-task learning.
According to our results, active learning is more efficient, using 12% less training data compared to passive learning in both the single-task and multi-task settings.
We also introduce a new dataset for SRL in Indonesian conversational domain to encourage further research in this area.
Iterative decoding and linear programming decoding are guaranteed to converge to the maximum-likelihood codeword when the underlying Tanner graph is cycle-free.
Therefore, cycles are usually seen as the culprit behind the suboptimal performance of low-density parity-check (LDPC) codes.
In this paper, we argue in the context of graph cover pseudocodewords that, for a code that permits a cycle-free Tanner graph, cycles have no effect on error performance as long as they are part of redundant rows.
Specifically, we characterize all parity-check matrices that are pseudocodeword-free for this class of codes.
Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks.
However, well-annotated datasets can be time-consuming and expensive to collect, lending increased interest to larger but noisy datasets that are more easily obtained.
In this paper, we show that deep neural networks are capable of generalizing from training data for which true labels are massively outnumbered by incorrect labels.
We demonstrate remarkably high test performance after training on corrupted data from MNIST, CIFAR, and ImageNet.
For example, on MNIST we obtain test accuracy above 90 percent even after each clean training example has been diluted with 100 randomly-labeled examples.
Such behavior holds across multiple patterns of label noise, even when erroneous labels are biased towards confusing classes.
We show that training in this regime requires a significant but manageable increase in dataset size that is related to the factor by which correct labels have been diluted.
Finally, we provide an analysis of our results that shows how increasing noise decreases the effective batch size.
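The dilution protocol described above can be sketched as follows (an illustrative numpy stand-in, not the authors' pipeline; the Gaussian noisy inputs and class count are assumptions):

```python
import numpy as np

# Each clean example is kept and accompanied by `n_noisy` randomly drawn
# inputs carrying uniformly random labels, so correct labels are
# outnumbered by a factor of n_noisy.

def dilute(X, y, n_noisy, num_classes, rng):
    noisy_X = rng.standard_normal((len(X) * n_noisy, X.shape[1]))
    noisy_y = rng.integers(0, num_classes, size=len(X) * n_noisy)
    X_out = np.vstack([X, noisy_X])
    y_out = np.concatenate([y, noisy_y])
    perm = rng.permutation(len(y_out))        # shuffle clean and noisy together
    return X_out[perm], y_out[perm]

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
y = rng.integers(0, 10, size=50)
Xd, yd = dilute(X, y, n_noisy=100, num_classes=10, rng=rng)
```

Note how the dataset grows by the dilution factor plus one — exactly the "significant but manageable increase in dataset size" the analysis relates to the dilution factor.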
Modern Graphics Processing Units (GPUs) are now considered accelerators for general purpose computation.
A tight interaction between the GPU and the interconnection network is the strategy for expressing the full capability-computing potential of a multi-GPU system on large HPC clusters; this is why an efficient and scalable interconnect is a key technology for finally delivering GPUs to scientific HPC.
In this paper we show the latest architectural and performance improvement of the APEnet+ network fabric, a FPGA-based PCIe board with 6 fully bidirectional off-board links with 34 Gbps of raw bandwidth per direction, and X8 Gen2 bandwidth towards the host PC.
The board implements a Remote Direct Memory Access (RDMA) protocol that leverages upon peer-to-peer (P2P) capabilities of Fermi- and Kepler-class NVIDIA GPUs to obtain real zero-copy, low-latency GPU-to-GPU transfers.
Finally, we report on the development activities for 2013 focusing on the adoption of the latest generation 28 nm FPGAs and the preliminary tests performed on this new platform.
People usually get involved in multiple social networks to enjoy new services or to fulfill their needs.
Many new social networks try to attract users of other existing networks to increase the number of their users.
Once a user (called source user) of a social network (called source network) joins a new social network (called target network), a new inter-network link (called anchor link) is formed between the source and target networks.
In this paper, we concentrate on predicting the formation of such anchor links between heterogeneous social networks.
Unlike conventional link prediction problems in which the formation of a link between two existing users within a single network is predicted, in anchor link prediction, the target user is missing and will be added to the target network once the anchor link is created.
To solve this problem, we use meta-paths as a powerful tool for utilizing heterogeneous information in both the source and target networks.
To this end, we propose an effective general meta-path-based approach called Connector and Recursive Meta-Paths (CRMP).
By using those two different categories of meta-paths, we model different aspects of social factors that may affect a source user to join the target network, resulting in the formation of a new anchor link.
Extensive experiments on real-world heterogeneous social networks demonstrate the effectiveness of the proposed method against the recent methods.
We consider Markov Decision Problems defined over continuous state and action spaces, where an autonomous agent seeks to learn a map from its states to actions so as to maximize its long-term discounted accumulation of rewards.
We address this problem by considering Bellman's optimality equation defined over action-value functions, which we reformulate into a nested non-convex stochastic optimization problem defined over a Reproducing Kernel Hilbert Space (RKHS).
We develop a functional generalization of stochastic quasi-gradient method to solve it, which, owing to the structure of the RKHS, admits a parameterization in terms of scalar weights and past state-action pairs which grows proportionately with the algorithm iteration index.
To ameliorate this complexity explosion, we apply Kernel Orthogonal Matching Pursuit to the sequence of kernel weights and dictionaries, which yields a controllable error in the descent direction of the underlying optimization method.
We prove that the resulting algorithm, called KQ-Learning, converges with probability 1 to a stationary point of this problem, yielding a fixed point of the Bellman optimality operator under the hypothesis that it belongs to the RKHS.
Under constant learning rates, we further obtain convergence to a small Bellman error that depends on the chosen learning rates.
Numerical evaluation on the Continuous Mountain Car and Inverted Pendulum tasks yields convergent parsimonious learned action-value functions, policies that are competitive with the state of the art, and exhibit reliable, reproducible learning behavior.
The present study applies a novel two-dimensional learning framework (2D-UPSO) based on particle swarms for structure selection of polynomial nonlinear auto-regressive with exogenous inputs (NARX) models.
This learning approach explicitly incorporates the information about the cardinality (i.e., the number of terms) into the structure selection process.
Initially, the effectiveness of the proposed approach was compared against the classical genetic algorithm (GA) based approach and it was demonstrated that the 2D-UPSO is superior.
Further, since the performance of any meta-heuristic search algorithm is critically dependent on the choice of the fitness function, the efficacy of the proposed approach was investigated using two distinct information-theoretic criteria, namely the Akaike and the Bayesian information criterion.
The robustness of this approach against various levels of measurement noise is also studied.
Simulation results on various nonlinear systems demonstrate that the proposed algorithm could accurately determine the structure of the polynomial NARX model even under the influence of measurement noise.
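A hedged sketch of the fitness functions only (the 2D-UPSO search itself is not reproduced; exhaustive enumeration over a tiny candidate set stands in for it): each candidate NARX term subset is scored by its least-squares residual variance penalized with AIC or BIC, and the subset minimizing the criterion is selected.

```python
import numpy as np
from itertools import combinations

def score(y, Phi, idx, criterion='aic'):
    """Phi: candidate-term regressor matrix; idx: selected columns."""
    P = Phi[:, idx]
    theta, *_ = np.linalg.lstsq(P, y, rcond=None)
    n, k = len(y), len(idx)
    rss = np.sum((y - P @ theta) ** 2)
    penalty = 2 * k if criterion == 'aic' else k * np.log(n)
    return n * np.log(rss / n) + penalty

rng = np.random.default_rng(1)
u = rng.standard_normal(200)
# Hypothetical true structure: y(k) = 0.8 u(k-1) + 0.3 u(k-1)^2 + noise
y = 0.8 * np.roll(u, 1) + 0.3 * np.roll(u, 1) ** 2 + 0.05 * rng.standard_normal(200)
Phi = np.column_stack([np.roll(u, 1), np.roll(u, 1) ** 2,
                       np.roll(u, 2), np.roll(u, 2) ** 3])

best = min((tuple(c) for r in (1, 2, 3) for c in combinations(range(4), r)),
           key=lambda c: score(y, Phi, list(c), 'bic'))
```

With BIC's stronger `k log n` penalty, spurious terms are cut more aggressively than with AIC — the kind of criterion-dependence whose effect on structure selection the paper investigates.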
Programming is a valuable skill in the labor market, making the underrepresentation of women in computing an increasingly important issue.
Online question and answer platforms serve a dual purpose in this field: they form a body of knowledge useful as a reference and learning tool, and they provide opportunities for individuals to demonstrate credible, verifiable expertise.
Issues such as male-oriented site design or the overrepresentation of men among the site's elite may therefore compound the issue of women's underrepresentation in IT.
In this paper we audit the differences in behavior and outcomes between men and women on Stack Overflow, the most popular of these Q&A sites.
We observe significant differences in how men and women participate in the platform and how successful they are.
For example, the average woman has roughly half of the reputation points, the primary measure of success on the site, of the average man.
Using an Oaxaca-Blinder decomposition, an econometric technique commonly applied to analyze differences in wages between groups, we find that most of the gap in success between men and women can be explained by differences in their activity on the site and differences in how these activities are rewarded.
Specifically, 1) men give more answers than women and 2) are rewarded more for their answers on average, even when controlling for possible confounders such as tenure or buy-in to the site.
Women ask more questions and gain more reward per question.
We conclude with a hypothetical redesign of the site's scoring system based on these behavioral differences, cutting the reputation gap in half.
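The decomposition technique used above can be sketched on synthetic data (a two-fold Oaxaca-Blinder decomposition in numpy; the group activity levels and returns are fabricated for illustration, not Stack Overflow estimates): the gap in mean outcome between groups A and B splits into a part explained by different activity levels (endowments) and a part due to different rewards for that activity (coefficients).

```python
import numpy as np

def ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(2)
n = 5000
Xa = np.column_stack([np.ones(n), rng.normal(4.0, 1.0, n)])  # group A activity
Xb = np.column_stack([np.ones(n), rng.normal(3.0, 1.0, n)])  # group B activity
ya = Xa @ np.array([1.0, 2.0]) + rng.normal(0, 0.1, n)       # A's returns
yb = Xb @ np.array([1.0, 1.5]) + rng.normal(0, 0.1, n)       # B's returns

ba, bb = ols(Xa, ya), ols(Xb, yb)
gap = ya.mean() - yb.mean()
explained = (Xa.mean(0) - Xb.mean(0)) @ bb   # endowments, valued at B's returns
unexplained = Xa.mean(0) @ (ba - bb)         # differing returns to activity
```

Because each OLS fit includes an intercept, the identity `gap = explained + unexplained` holds exactly — the property that lets the paper attribute the reputation gap to activity differences versus reward differences.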
We make an important connection to existing results in econometrics to describe an alternative formulation of inverse reinforcement learning (IRL).
In particular, we describe an algorithm using Conditional Choice Probabilities (CCP), which are maximum likelihood estimates of the policy estimated from expert demonstrations, to solve the IRL problem.
Using the language of structural econometrics, we re-frame the optimal decision problem and introduce an alternative representation of value functions due to (Hotz and Miller 1993).
In addition to presenting the theoretical connections that bridge the IRL literature between Economics and Robotics, the use of CCPs also has the practical benefit of reducing the computational cost of solving the IRL problem.
Specifically, under the CCP representation, we show how one can avoid repeated calls to the dynamic programming subroutine typically used in IRL.
We show via extensive experimentation on standard IRL benchmarks that CCP-IRL is able to outperform MaxEnt-IRL, with as much as a 5x speedup and without compromising on the quality of the recovered reward function.
Scientific collaborations are among the main enablers of development in small national science systems.
Although analysing scientific collaborations is a well-established subject in scientometrics, evaluations of scientific collaborations within a country remain speculative with studies based on a limited number of fields or using data too inadequate to be representative of collaborations at a national level.
This study represents a unique view on the collaborative aspect of scientific activities in New Zealand.
We perform a quantitative study based on all Scopus publications in all subjects for more than 1500 New Zealand institutions over a period of 6 years to generate an extensive mapping of scientific collaboration at a national level.
The comparative results reveal the level of collaboration between New Zealand institutions and business enterprises, government institutions, higher education providers, and private not for profit organisations in 2010-2015.
Constructing a collaboration network of institutions, we observe a power-law distribution indicating that a small number of New Zealand institutions account for a large proportion of national collaborations.
Network centrality concepts are deployed to identify the most central institutions of the country in terms of collaboration.
We also provide comparative results on 15 universities and Crown research institutes based on 27 subject classifications.
The focus of this paper is to quantify measures of aggregate fluctuations for a class of consensus-seeking multiagent networks subject to exogenous noise with alpha-stable distributions.
This type of noise is generated by a class of random measures with heavy-tailed probability distributions.
We define a cumulative scale parameter using scale parameters of probability distributions of the output variables, as a measure of aggregate fluctuation.
Although this class of measures can be characterized implicitly in closed-form in steady-state, finding their explicit forms in terms of network parameters is, in general, almost impossible.
We obtain several tractable upper bounds in terms of Laplacian spectrum and statistics of the input noise.
Our results suggest that relying on Gaussian-based optimal design algorithms will result in non-optimal solutions for networks that are driven by non-Gaussian noise inputs with alpha-stable distributions.
The manuscript has been submitted for publication to IEEE Transactions on Control of Network Systems.
It is the extended version of preliminary paper included in the proceedings of the 2018 American Control Conference.
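As a rough illustration of the setting (not the paper's model or bounds): a discrete-time consensus network driven by Cauchy noise, the alpha = 1 member of the alpha-stable family, generated here with numpy's `standard_cauchy`. The graph, step size, and noise scale are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, steps, eps = 10, 2000, 0.05
A = np.ones((n, n)) - np.eye(n)            # complete graph
L = np.diag(A.sum(1)) - A                  # graph Laplacian
x = np.zeros(n)
deviations = []
for _ in range(steps):
    noise = rng.standard_cauchy(n) * 0.01  # heavy-tailed alpha-stable input
    x = x - eps * (L @ x) + noise          # consensus update + exogenous noise
    deviations.append(x - x.mean())        # disagreement vector

# Variance is undefined for Cauchy-driven outputs, so a robust scale
# proxy (median absolute deviation) stands in for the cumulative
# scale parameter used as the aggregate-fluctuation measure.
spread = np.median(np.abs(np.concatenate(deviations)))
```

The occasional extreme noise draws dominate the disagreement trajectory — the qualitative reason Gaussian-based optimal design is non-optimal for such networks.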
Real-world machine learning applications may require functions that are fast-to-evaluate and interpretable.
In particular, guaranteed monotonicity of the learned function can be critical to user trust.
We propose meeting these goals for low-dimensional machine learning problems by learning flexible, monotonic functions using calibrated interpolated look-up tables.
We extend the structural risk minimization framework of lattice regression to train monotonic look-up tables by solving a convex problem with appropriate linear inequality constraints.
In addition, we propose jointly learning interpretable calibrations of each feature to normalize continuous features and handle categorical or missing data, at the cost of making the objective non-convex.
We address large-scale learning through parallelization and mini-batching, and propose random sampling of additive regularizer terms.
Case studies with real-world problems with five to sixteen features and thousands to millions of training samples demonstrate the proposed monotonic functions can achieve state-of-the-art accuracy on practical problems while providing greater transparency to users.
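A simplified sketch of the monotonic look-up table idea (one dimension only, and projected gradient descent standing in for the paper's constrained convex solver): fit lookup values at fixed knots by least squares, re-projecting onto the monotone (nondecreasing) cone after every step.

```python
import numpy as np

def fit_monotone_lut(x, y, knots, steps=2000, lr=0.1):
    v = np.zeros(len(knots))                        # lookup values at the knots
    for _ in range(steps):
        pred = np.interp(x, knots, v)               # piecewise-linear evaluation
        # gradient w.r.t. each knot value via the interpolation weights
        grad = np.zeros_like(v)
        idx = np.clip(np.searchsorted(knots, x) - 1, 0, len(knots) - 2)
        w = (x - knots[idx]) / (knots[idx + 1] - knots[idx])
        r = pred - y
        np.add.at(grad, idx, r * (1 - w))
        np.add.at(grad, idx + 1, r * w)
        v -= lr * grad / len(x)
        v = np.maximum.accumulate(v)                # project onto monotone cone
    return v

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 500)
y = np.sqrt(x) + rng.normal(0, 0.05, 500)           # monotone ground truth
knots = np.linspace(0, 1, 11)
v = fit_monotone_lut(x, y, knots)
```

The fitted table is guaranteed nondecreasing by construction, which is the interpretability property — guaranteed monotonicity of the learned function — that the paper targets for user trust.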
We study adaptive data-dependent dimensionality reduction in the context of supervised learning in general metric spaces.
Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling.
On the algorithmic front, we describe an analogue of PCA for metric spaces: namely an efficient procedure that approximates the data's intrinsic dimension, which is often much lower than the ambient dimension.
Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.
Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning.
Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network.
In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories.
In addition, we provide a unified probabilistic interpretation of related activation maximization methods and call the general class of models "Plug and Play Generative Networks".
PPGNs are composed of 1) a generator network G that is capable of drawing a wide range of image types and 2) a replaceable "condition" network C that tells the generator what to draw.
We demonstrate the generation of images conditioned on a class (when C is an ImageNet or MIT Places classification network) and also conditioned on a caption (when C is an image captioning network).
Our method also improves the state of the art of Multifaceted Feature Visualization, which generates the set of synthetic inputs that activate a neuron in order to better understand how deep neural networks operate.
Finally, we show that our model performs reasonably well at the task of image inpainting.
While image models are used in this paper, the approach is modality-agnostic and can be applied to many types of data.
Many recent works on knowledge distillation have provided ways to transfer the knowledge of a trained network for improving the learning process of a new one, but finding a good technique for knowledge distillation is still an open problem.
In this paper, we provide a new perspective based on a decision boundary, which is one of the most important components of a classifier.
The generalization performance of a classifier is closely related to the adequacy of its decision boundary, so a good classifier bears a good decision boundary.
Therefore, transferring information closely related to the decision boundary can be a good attempt for knowledge distillation.
To realize this goal, we utilize an adversarial attack to discover samples supporting a decision boundary.
Based on this idea, to transfer more accurate information about the decision boundary, the proposed algorithm trains a student classifier based on the adversarial samples supporting the decision boundary.
Experiments show that the proposed method indeed improves knowledge distillation and achieves state-of-the-art performance.
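A minimal sketch of what a "boundary-supporting sample" looks like, using a linear logistic teacher instead of a deep network (for a linear logit the adversarial step has a closed form; deep networks require iterative gradient attacks):

```python
import numpy as np

# Hypothetical teacher: decision boundary is the hyperplane w.x + b = 0.
w, b = np.array([2.0, -1.0]), 0.5
x = np.array([1.5, 0.2])                   # source sample, classified positive

def to_boundary(x, w, b):
    """Project x onto the teacher's decision boundary (exact for linear logit)."""
    return x - (w @ x + b) * w / (w @ w)

x_adv = to_boundary(x, w, b)               # boundary-supporting sample
```

Such samples sit exactly where the teacher's decision flips, so training the student on them transfers boundary information that ordinary soft-label distillation conveys only indirectly.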
Mammography is the most effective and available tool for breast cancer screening.
However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to approximately 70% of biopsies being unnecessary, with benign outcomes.
Data mining algorithms could be used to help physicians in their decisions to perform a breast biopsy on a suspicious lesion seen in a mammogram image or to perform a short term follow-up examination instead.
In this research paper, the data mining classification algorithms Decision Tree (DT), Artificial Neural Network (ANN), and Support Vector Machine (SVM) are analyzed on the mammographic masses data set.
The purpose of this study is to increase the ability of physicians to determine the severity (benign or malignant) of a mammographic mass lesion from BI-RADS attributes and the patient's age.
The whole data set is divided into training and test sets in a 70:30 ratio, and the performances of the classification algorithms are compared through three statistical measures: sensitivity, specificity, and classification accuracy.
The accuracies of DT, ANN, and SVM on the test samples are 78.12%, 80.56%, and 81.25%, respectively.
Our analysis shows that, of these three classification models, SVM predicts the severity of breast cancer with the lowest error rate and highest accuracy.
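The three statistical measures used above can be sketched in numpy (binary labels: 1 = malignant, 0 = benign; the predictions of any of the three classifiers can be plugged in — the vectors below are illustrative, not the paper's results):

```python
import numpy as np

def evaluate(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))   # malignant, correctly flagged
    tn = np.sum((y_true == 0) & (y_pred == 0))   # benign, correctly cleared
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {'sensitivity': tp / (tp + fn),       # true positive rate
            'specificity': tn / (tn + fp),       # true negative rate
            'accuracy': (tp + tn) / len(y_true)}

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])      # toy labels
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])      # toy predictions
m = evaluate(y_true, y_pred)
```

In this screening setting, sensitivity (catching malignant lesions) and specificity (avoiding unnecessary biopsies of benign ones) pull in opposite directions, which is why the study reports all three measures rather than accuracy alone.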
There is a vast body of theoretical research on lifted inference in probabilistic graphical models (PGMs).
However, few demonstrations exist where lifting is applied in conjunction with top-of-the-line applied algorithms.
We pursue the applicability of lifted inference for computer vision (CV), with the insight that a globally optimal (MAP) labeling will likely have the same label for two symmetric pixels.
The success of our approach lies in efficiently handling a distinct unary potential on every node (pixel), typical of CV applications.
This allows us to lift the large class of algorithms that model a CV problem via PGM inference.
We propose a generic template for coarse-to-fine (C2F) inference in CV, which progressively refines an initial coarsely lifted PGM for varying quality-time trade-offs.
We demonstrate the performance of C2F inference by developing lifted versions of two near state-of-the-art CV algorithms for stereo vision and interactive image segmentation.
We find that, against flat algorithms, the lifted versions have a much superior anytime performance, without any loss in final solution quality.
One of the most straightforward, direct and efficient approaches to Image Segmentation is Image Thresholding.
Multi-level Image Thresholding is an essential viewpoint in many image processing and Pattern Recognition based real-time applications which can effectively and efficiently classify the pixels into various groups denoting multiple regions in an Image.
Thresholding based Image Segmentation using fuzzy entropy combined with intelligent optimization approaches are commonly used direct methods to properly identify the thresholds so that they can be used to segment an Image accurately.
In this paper a novel approach for multi-level image thresholding is proposed using Type II Fuzzy sets combined with Adaptive Plant Propagation Algorithm (APPA).
Obtaining the optimal thresholds for an image by maximizing the entropy is extremely tedious and time consuming with increase in the number of thresholds.
Hence, Adaptive Plant Propagation Algorithm (APPA), a memetic algorithm based on plant intelligence, is used for fast and efficient selection of optimal thresholds.
The speed and efficiency of this approach are justified by comparing the accuracy of its outcomes and the computational time it consumes against other modern state-of-the-art algorithms such as Particle Swarm Optimization (PSO), the Gravitational Search Algorithm (GSA), and the Genetic Algorithm (GA).
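As background, here is a sketch of entropy-based bi-level thresholding (Kapur's criterion with ordinary Shannon entropy rather than the paper's Type II fuzzy entropy, and exhaustive search standing in for APPA): pick the threshold maximizing the summed entropies of the two pixel classes.

```python
import numpy as np

def kapur_threshold(hist):
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, len(p)):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        q0, q1 = p[:t] / w0, p[t:] / w1          # within-class distributions
        h = -sum(q * np.log(q) for q in q0 if q > 0) \
            - sum(q * np.log(q) for q in q1 if q > 0)
        if h > best_h:
            best_t, best_h = t, h
    return best_t

rng = np.random.default_rng(5)
# Synthetic bimodal image: dark region around 60, bright region around 180.
pix = np.concatenate([rng.normal(60, 10, 3000), rng.normal(180, 10, 3000)])
hist, _ = np.histogram(np.clip(pix, 0, 255), bins=256, range=(0, 255))
t = kapur_threshold(hist)
```

For K thresholds the exhaustive search above costs roughly O(256^K) criterion evaluations, which is exactly the combinatorial blow-up that motivates replacing it with a population-based optimizer such as APPA.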
Kernel methods play a critical role in many dimensionality reduction algorithms.
They are useful in manifold learning, classification, clustering and other machine learning tasks.
Setting the kernel's scale parameter, also referred to as the kernel's bandwidth, strongly affects the extracted low-dimensional representation.
We propose to set a scale parameter that is tailored to the desired application such as classification and manifold learning.
For the manifold learning task, the scale is computed so that the dimension of the extracted embedding matches the estimated intrinsic dimension.
Three methods are proposed for scale computation in a classification task.
The proposed frameworks are simulated on artificial and real datasets.
The results show a high correlation between optimal classification rates and the computed scaling.
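As a point of reference for the scale-setting problem discussed above, here is the common median-heuristic baseline in numpy (the paper's task-tailored estimators are not reproduced): set the Gaussian kernel bandwidth to the median pairwise distance of the data.

```python
import numpy as np

def median_heuristic_scale(X):
    """Median of pairwise Euclidean distances (off-diagonal pairs only)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.sqrt(np.median(d2[np.triu_indices(len(X), k=1)]))

def gaussian_kernel(X, scale):
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * scale ** 2))

rng = np.random.default_rng(6)
X = rng.standard_normal((100, 5))
sigma = median_heuristic_scale(X)
K = gaussian_kernel(X, sigma)
```

The median heuristic is application-agnostic; the paper's point is that tailoring the scale to the downstream task (classification versus manifold learning) instead yields better embeddings and classification rates.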
Finding heavy elements (heavy hitters) in streaming data is one of the central and well-understood tasks.
Despite the importance of this problem, when considering the sliding windows model of streaming (where elements eventually expire) the problem of finding L_2-heavy elements has remained completely open despite multiple papers and considerable success in finding L_1-heavy elements.
In this paper, we develop the first poly-logarithmic-memory algorithm for finding L_2-heavy elements in the sliding windows model.
Since L_2 heavy elements play a central role for many fundamental streaming problems (such as frequency moments), we believe our method would be extremely useful for many sliding-windows algorithms and applications.
For example, our technique allows us not only to find L_2-heavy elements, but also heavy elements with respect to any L_p for 0<p<2 on sliding windows.
Thus, our paper completely resolves the question of finding L_p-heavy elements for sliding windows with poly-logarithmic memory for all values of p since it is well known that for p>2 this task is impossible.
Our method may have other applications as well.
We demonstrate a broader applicability of our novel yet simple method on two additional examples: we show how to obtain a sliding window approximation of other properties such as the similarity of two streams, or the fraction of elements that appear exactly a specified number of times within the window (the rarity problem).
In these two illustrative examples of our method, we replace the current expected memory bounds with worst case bounds.
A novel tag completion algorithm is proposed in this paper, which is designed with the following features: 1) Low-rank and error sparsity: the incomplete initial tagging matrix D is decomposed into the complete tagging matrix A and a sparse error matrix E. However, instead of minimizing its nuclear norm, A is further factorized into a basis matrix U and a sparse coefficient matrix V, i.e., D = UV + E.
This low-rank formulation encapsulating sparse coding enables our algorithm to recover latent structures from noisy initial data and avoid performing too much denoising; 2) Local reconstruction structure consistency: to steer the completion of D, the local linear reconstruction structures in feature space and tag space are obtained and preserved by U and V respectively.
Such a scheme could alleviate the negative effect of distances measured by low-level features and incomplete tags.
Thus, we can seek a balance between exploiting as much information as possible and not being misled into suboptimal performance.
Experiments conducted on Corel5k dataset and the newly issued Flickr30Concepts dataset demonstrate the effectiveness and efficiency of the proposed method.
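A rough numerical sketch of the decomposition D ≈ UV + E (alternating least squares with a hard shrink on E standing in for the paper's full objective; the local reconstruction-consistency terms on U and V are omitted, and the dimensions and thresholds are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, r = 30, 20, 4
D = rng.random((m, r)) @ rng.random((r, n))       # low-rank "true" tag matrix
D_obs = D.copy()
D_obs[rng.random(D.shape) < 0.1] += 1.0           # sparse corruptions (noisy tags)

U = rng.standard_normal((m, r))                   # basis matrix
E = np.zeros_like(D_obs)                          # sparse error matrix
for _ in range(50):
    V, *_ = np.linalg.lstsq(U, D_obs - E, rcond=None)       # coefficient update
    U = np.linalg.lstsq(V.T, (D_obs - E).T, rcond=None)[0].T  # basis update
    R = D_obs - U @ V
    E = np.where(np.abs(R) > 0.3, R, 0.0)         # keep only large residuals

err = np.linalg.norm(U @ V - D, 'fro') / np.linalg.norm(D, 'fro')
```

Factorizing A into UV caps the rank at r directly (instead of penalizing the nuclear norm), while the hard shrink keeps E sparse — the two structural ingredients of feature 1) above.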
Correct operation of many critical systems is dependent on the data consistency and integrity properties of underlying databases.
Therefore, a verifiable and rigorous database design process is highly desirable.
This research aims to investigate and deliver a comprehensive and practical approach for modelling databases in formal methods through layered refinements.
The methodology is being guided by a number of case studies, using abstraction and refinement in UML-B and verification with the Rodin tool.
UML-B is a graphical representation of the Event-B formalism and the Rodin tool supports verification for Event-B and UML-B.
Our method guides developers to model relational databases in UML-B through layered refinement and to specify the necessary constraints and operations on the database.
We introduce the Visual Data Management System (VDMS), which enables faster access to big-visual-data and adds support to visual analytics.
This is achieved by searching for relevant visual data via metadata stored as a graph, and enabling faster access to visual data through new machine-friendly storage formats.
VDMS differs from existing large scale photo serving, video streaming, and textual big-data management systems due to its primary focus on supporting machine learning and data analytics pipelines that use visual data (images, videos, and feature vectors), treating these as first class entities.
We describe how to use VDMS via its user friendly interface and how it enables rich and efficient vision analytics through a machine learning pipeline for processing medical images.
We show a 2x performance improvement on complex queries over a comparable set-up.
Retaining players over an extended period of time is a long-standing challenge in the game industry.
Significant effort has been devoted to understanding what motivates players to enjoy games.
While individuals may have varying reasons to play or abandon a game at different stages within the game, previous studies have looked at the retention problem from a snapshot view.
This study, by analyzing in-game logs of 51,104 distinct individuals in an online multiplayer game, uniquely offers a multifaceted view of the retention problem over the players' virtual life phases.
We find that key indicators of longevity change with the game level.
Achievement features are important for players at the initial to the advanced phases, yet social features become the most predictive of longevity once players reach the highest level offered by the game.
These findings have theoretical and practical implications for designing online games that are adaptive to meeting the players' needs.
Facial aging and facial rejuvenation analyze a given face photograph to predict a future look or estimate a past look of the person.
To achieve this, it is critical to preserve human identity and the corresponding aging progression and regression with high accuracy.
However, existing methods cannot simultaneously handle these two objectives well.
We propose a novel generative adversarial network based approach, named the Conditional Multi-Adversarial AutoEncoder with Ordinal Regression (CMAAE-OR).
It utilizes an age estimation technique to control the aging accuracy and takes a high-level feature representation to preserve personalized identity.
Specifically, the face is first mapped to a latent vector through a convolutional encoder.
The latent vector is then projected onto the face manifold conditional on the age through a deconvolutional generator.
The latent vector preserves personalized face features and the age controls facial aging and rejuvenation.
A discriminator and an ordinal regression are imposed on the encoder and the generator in tandem, making the generated face images more photorealistic while simultaneously exhibiting desirable aging effects.
Besides, a high-level feature representation is utilized to preserve personalized identity of the generated face.
Experiments on two benchmark datasets demonstrate appealing performance of the proposed method over the state-of-the-art.
This paper investigates generation of a secret key from a reciprocal wireless channel.
In particular we consider wireless channels that exhibit sparse structure in the wideband regime and the impact of the sparsity on the secret key capacity.
We explore this problem in two steps.
First, we study key generation from a state-dependent discrete memoryless multiple source.
The state of the source captures the effect of channel sparsity.
Second, we consider a wireless channel model that captures channel sparsity and correlation between the legitimate users' channel and the eavesdropper's channel.
Such dependency can significantly reduce the secret key capacity.
According to system delay requirements, two performance measures are considered: (i) ergodic secret key capacity and (ii) outage probability.
We show that in the wideband regime when a white sounding sequence is adopted, a sparser channel can achieve a higher ergodic secret key rate than a richer channel can.
For outage performance, we show that if the users generate secret keys at a fraction of the ergodic capacity, the outage probability will decay exponentially in signal bandwidth.
Moreover, a larger exponent is achieved by a richer channel.
In this paper, the class of random irregular block-hierarchical networks is defined and algorithms for generation and calculation of network properties are described.
The algorithms presented for this class of networks are more efficient than known algorithms both in computation time and memory usage and can be used to analyze topological properties of such networks.
The algorithms are implemented in the system created by the authors for the study of topological and statistical properties of random networks.
Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences.
We propose to view text classification as a label-word joint embedding problem: each label is embedded in the same space with the word vectors.
We introduce an attention framework that measures the compatibility of embeddings between text sequences and labels.
The attention is learned on a training set of labeled samples to ensure that, given a text sequence, the relevant words are weighted higher than the irrelevant ones.
Our method maintains the interpretability of word embeddings, and enjoys a built-in ability to leverage alternative sources of information, in addition to input text sequences.
Extensive results on several large text datasets show that the proposed framework outperforms the state-of-the-art methods by a large margin, in terms of both accuracy and speed.
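The label-word joint embedding with attention described above might be sketched as follows. The cosine-similarity compatibility scoring, max-over-labels pooling, and final dot-product classifier are illustrative assumptions on our part, not necessarily the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def label_attentive_encode(W, C):
    """W: (seq_len, d) word vectors; C: (num_labels, d) label vectors.

    Compatibility = cosine similarity between words and labels.
    Each word position is scored by its best-matching label, the
    scores are softmax-normalized into attention weights, and the
    sequence is encoded as the attention-weighted sum of word vectors.
    """
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    Cn = C / np.linalg.norm(C, axis=1, keepdims=True)
    G = Wn @ Cn.T                  # (seq_len, num_labels) compatibility
    beta = softmax(G.max(axis=1))  # attention over word positions
    return beta @ W                # (d,) sequence embedding

def predict(W, C):
    """Predict the label whose embedding best matches the sequence."""
    z = label_attentive_encode(W, C)
    return int(np.argmax(C @ z))
```

Because labels live in the same space as words, relevant words receive higher attention than irrelevant ones, which is what keeps the learned attention interpretable.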
TCP is the most widely used transport protocol in the Internet.
However, it offers suboptimal performance when operating over high bandwidth mmWave links.
The main issues introduced by communications at such high frequencies are (i) the sensitivity to blockage and (ii) the high bandwidth fluctuations due to Line of Sight (LOS) to Non Line of Sight (NLOS) transitions and vice versa.
In particular, TCP has an abstract view of the end-to-end connection, which does not properly capture the dynamics of the wireless mmWave link.
The consequence is a suboptimal utilization of the available resources.
In this paper we propose a TCP proxy architecture that improves the performance of TCP flows without any modification at the remote sender side.
The proxy is installed in the Radio Access Network, and exploits information available at the gNB in order to maximize throughput and minimize latency.
We propose and prove a theorem that allows the calculation of a class of functionals on Poisson point processes that have the form of expected values of sum-products of functions.
In proving the theorem, we present a variant of the Campbell-Mecke theorem from stochastic geometry.
We proceed to apply our result in the calculation of expected values involving interference in wireless Poisson networks.
Based on this, we derive outage probabilities for transmissions in a Poisson network with Nakagami fading.
Our results extend the stochastic geometry toolbox used for the mathematical analysis of interference-limited wireless networks.
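For context, the classical Poisson-process identities that the paper's sum-product theorem generalizes are stated below for reference; these are the standard textbook forms, not the paper's own result.

```latex
% Campbell's theorem (expected sum functional) for a Poisson point
% process \Phi with intensity measure \Lambda:
\mathbb{E}\!\left[\sum_{x \in \Phi} f(x)\right]
  = \int f(x)\, \Lambda(\mathrm{d}x).

% Probability generating functional (expected product functional):
\mathbb{E}\!\left[\prod_{x \in \Phi} g(x)\right]
  = \exp\!\left(-\int \bigl(1 - g(x)\bigr)\, \Lambda(\mathrm{d}x)\right).
```

Functionals of the sum-product form mix both shapes, which is why neither identity alone suffices and a combined theorem is needed.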
Over the last years, many technological advances have been introduced in Internet television to meet user needs and expectations.
However, due to overwhelming bandwidth requirements, traditional IP-based television service based on a simple client-server approach remains restricted to a small group of clients.
In this situation, the use of the peer-to-peer overlay paradigm to deliver live television over the Internet is gaining increasing attention.
Unfortunately, the current Internet infrastructure provides only best-effort service for this kind of application and does not offer quality of service.
This paper is a research proposition which presents potential solutions for efficient IPTV streaming over P2P networks.
We assume that the solutions will not directly modify existing P2P IPTV protocols but rather will be dedicated to network engineers or Internet service providers, who will be able to introduce and configure the proposed mechanisms in network routers.
The League Championship Algorithm (LCA) is a sport-inspired optimization algorithm introduced by Ali Husseinzadeh Kashan in 2009.
It has since drawn enormous interest among researchers because of its potential efficiency in solving many optimization problems and real-world applications.
The LCA has also shown great potential in solving non-deterministic polynomial time (NP-complete) problems.
This survey presents a brief synopsis of the LCA literature in peer-reviewed journals, conferences and book chapters.
These research articles are then categorized according to indexing in the major academic databases (Web of Science, Scopus, IEEE Xplore and the Google Scholar).
The analysis was also done to explore the prospects and the challenges of the algorithm and its acceptability among researchers.
This systematic categorization can be used as a basis for future studies.
Network operators are reluctant to share traffic data due to security and privacy concerns.
Consequently, there is a lack of publicly available traces for validating and generalizing the latest results in network and security research.
Anonymization is a possible solution in this context; however, it is unclear how the sanitization of data preserves characteristics important for traffic analysis.
In addition, the privacy-preserving property of state-of-the-art IP address anonymization techniques has been called into question by recent attacks that successfully identified a large number of hosts in anonymized traces.
In this paper, we examine the tradeoff between data utility for anomaly detection and the risk of host identification for IP address truncation.
Specifically, we analyze three weeks of unsampled and non-anonymized network traces from a medium-sized backbone network to assess data utility.
The risk of de-anonymizing individual IP addresses is formally evaluated, using a metric based on conditional entropy.
Our results indicate that truncation effectively prevents host identification but degrades the utility of data for anomaly detection.
However, the degree of degradation depends on the metric used and whether network-internal or external addresses are considered.
Entropy metrics are more resistant to truncation than unique counts and the detection quality of anomalies degrades much faster in internal addresses than in external addresses.
In particular, the usefulness of internal address counts is lost even for truncation of only 4 bits whereas utility of external address entropy is virtually unchanged even for truncation of 20 bits.
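The truncation-versus-utility tradeoff can be illustrated with a small sketch: zeroing low-order bits collapses distinct hosts, which lowers both unique counts and entropy. The helper names are ours, and plain Shannon entropy is used here as a simplified stand-in for the paper's conditional-entropy risk metric.

```python
from collections import Counter
from math import log2

def truncate_ip(ip, bits):
    """Zero the `bits` least-significant bits of a dotted-quad IPv4 address."""
    a, b, c, d = (int(x) for x in ip.split('.'))
    v = (a << 24) | (b << 16) | (c << 8) | d
    return v & (~((1 << bits) - 1) & 0xFFFFFFFF)

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = sum(counts.values())
    return -sum((k / n) * log2(k / n) for k in counts.values())
```

For example, truncating 8 bits merges all hosts within each /24, so four distinct addresses in two /24s collapse to two values: unique counts drop from 4 to 2 and entropy from 2 bits to 1 bit, mirroring the reported resistance of entropy metrics relative to unique counts.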
ESA operates the Sentinel-1 satellites, which provide Synthetic Aperture Radar (SAR) data of Earth.
Recorded Sentinel-1 data have shown a potential for remotely observing and monitoring local conditions on broad acre fields.
Remote sensing using Sentinel-1 has the potential to provide daily updates on the current conditions in the individual fields and at the same time give an overview of the agricultural areas in the region.
Research depends on the ability of independent validation of the presented results.
In the case of the Sentinel-1 satellites, every researcher has access to the same base dataset, and therefore independent validation is possible.
Well documented research performed with Sentinel-1 allows other researchers to reproduce the experiments and either validate or falsify the presented findings.
Based on current state-of-art research we have chosen to provide a service for researchers in the agricultural domain.
The service allows researchers to monitor local conditions by using the Sentinel-1 information combined with a priori knowledge from broad acre fields.
Correlating processed Sentinel-1 to the actual conditions is still a task the individual researchers must perform to benefit from the service.
In this paper, we present our methodology for translating Sentinel-1 data to a level that is more accessible to researchers in the agricultural field.
The goal is to make the data more easily available, so the primary focus can be on correlating and comparing to measurements collected in the broad acre fields.
We illustrate the value of the service with three examples of the possible application areas.
The presented application examples are all based on Denmark, where we have processed all Sentinel-1 scans since 2016.
Electric vehicles play a key role in the sustainability of the Smart Cities as they contribute to the reduction of carbon emissions, the preservation of natural resources and the overall quality of life of citizens.
However, when the Smart Grid powers the charging of electric vehicles, high energy costs and power peaks challenge system reliability with risks of blackouts.
This is especially the case when the Smart Grid has to moderate additional uncertainties such as the penetration of renewable energy resources or energy market dynamics.
In addition, social dynamics such as participation in demand-response programs, the discomfort caused by an alternative suggested usage of the electric vehicles, and even the fairness in terms of how equally discomfort is experienced among the participating citizens further complicate the operation and regulation of the Smart Grid.
This paper introduces a fully decentralized and privacy-preserving learning mechanism for charging control of electric vehicles that regulates three Smart Grid socio-technical aspects: (i) reliability, (ii) discomfort and (iii) fairness.
By exclusively using local knowledge, an autonomous software agent generates energy demand plans for its vehicle that encode different charging regimes for the battery.
Agents interact to learn and make collective decisions of which plan to execute so that power peaks and energy cost are reduced.
The impact of improving reliability on discomfort and fairness is empirically shown using real-world data under a varied participation level of electric vehicles in the optimization process.
Recently, the demand for faster and more reliable data transmission has given rise to complex communications systems.
As a result, it has become more difficult to derive closed-form solutions that can provide insight about performance levels.
In this paper, different from the existing research, we study a cognitive radio system that employs hybrid-automatic-repeat-request (HARQ) protocols under quality-of-service (QoS) constraints.
We assume that the secondary users access the spectrum by utilizing a strategy that is a combination of underlay and interweave access techniques.
Considering that the secondary users imperfectly perform channel sensing in order to detect the active primary users and that there is a transmission deadline for each data packet at the secondary transmitter buffer, we formulate the state-transition model of the system.
Then, we obtain the state-transition probabilities when HARQ-chase combining is adopted.
Subsequently, we provide the packet-loss rate in the channel and achieve the effective capacity.
Finally, we substantiate our analytical derivations with numerical results.
Typical blur from camera shake often deviates from the standard uniform convolutional script, in part because of problematic rotations which create greater blurring away from some unknown center point.
Consequently, successful blind deconvolution requires the estimation of a spatially-varying or non-uniform blur operator.
Using ideas from Bayesian inference and convex analysis, this paper derives a non-uniform blind deblurring algorithm with several desirable, yet previously-unexplored attributes.
The underlying objective function includes a spatially adaptive penalty which couples the latent sharp image, non-uniform blur operator, and noise level together.
This coupling allows the penalty to automatically adjust its shape based on the estimated degree of local blur and image structure such that regions with large blur or few prominent edges are discounted.
Remaining regions with modest blur and revealing edges therefore dominate the overall estimation process without explicitly incorporating structure-selection heuristics.
The algorithm can be implemented using a majorization-minimization strategy that is virtually parameter free.
Detailed theoretical analysis and empirical validation on real images serve to validate the proposed method.
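For reference, a non-uniform blur observation model commonly used in this literature represents the blurry image as a weighted combination of warped copies of the sharp image; the paper's exact operator parameterization may differ.

```latex
% y: blurry image, x: latent sharp image, n: noise.
% H_j x denotes x warped by the j-th (e.g., rotational) camera pose;
% the nonnegative weights w_j describe the camera's motion path and
% induce blur that grows away from the rotation center.
y = \sum_{j} w_j\, H_j x + n, \qquad w_j \ge 0 .
```

Estimating the weights w_j jointly with x is what makes the blind, non-uniform case substantially harder than uniform deconvolution.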
Verification of concurrent data structures is one of the most challenging tasks in software verification.
The topic has received considerable attention over the course of the last decade.
Nevertheless, human-driven techniques remain cumbersome and notoriously difficult while automated approaches suffer from limited applicability.
The main obstacle for automation is the complexity of concurrent data structures.
This is particularly true in the absence of garbage collection.
The intricacy of lock-free memory management paired with the complexity of concurrent data structures makes automated verification prohibitive.
In this work we present a method for verifying concurrent data structures and their memory management separately.
We suggest two simpler verification tasks that imply the correctness of the data structure.
The first task establishes an over-approximation of the reclamation behavior of the memory management.
The second task exploits this over-approximation to verify the data structure without the need to consider the implementation of the memory management itself.
To make the resulting verification tasks tractable for automated techniques, we establish a second result.
We show that a verification tool needs to consider only executions where a single memory location is reused.
We implemented our approach and were able to verify linearizability of Michael&Scott's queue and the DGLM queue for both hazard pointers and epoch-based reclamation.
To the best of our knowledge, we are the first to verify such implementations fully automatically.
Stylistic variation is critical to render the utterances generated by conversational agents natural and engaging.
In this paper, we focus on sequence-to-sequence models for open-domain dialogue response generation and propose a new method to evaluate the extent to which such models are able to generate responses that reflect different personality traits.
Most density-based stream clustering algorithms separate the clustering process into an online and an offline component.
Exact summarized statistics are employed for defining micro-clusters or grid cells during the online stage, followed by macro-clustering during the offline stage.
This paper proposes a novel alternative to the traditional two phase stream clustering scheme, introducing sketch-based data structures for assessing both stream density and cluster membership with probabilistic accuracy guarantees.
A count-min sketch using a damped window model estimates stream density.
Bloom filters employing a variation of active-active buffering estimate cluster membership.
Instances of both types of sketches share the same set of hash functions.
The resulting stream clustering algorithm is capable of detecting arbitrarily shaped clusters while correctly handling outliers and making no assumption on the total number of clusters.
Experimental results over a number of real and synthetic datasets illustrate the quality and efficiency of the proposed algorithm.
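The count-min sketch component described above can be sketched minimally as follows. The multiplicative `decay` factor is an illustrative way to realize a damped window; the paper's exact damping scheme, hash-function sharing with the Bloom filters, and the Bloom filters themselves are not reproduced here.

```python
import random

class CountMinSketch:
    """Minimal count-min sketch with an exponentially damped window.

    Each item is hashed into one counter per row; the minimum over
    rows upper-bounds the item's (damped) frequency with probabilistic
    accuracy guarantees depending on width and depth.
    """
    def __init__(self, width=64, depth=4, decay=1.0, seed=0):
        rnd = random.Random(seed)
        self.width, self.depth, self.decay = width, depth, decay
        self.salts = [rnd.getrandbits(32) for _ in range(depth)]
        self.table = [[0.0] * width for _ in range(depth)]

    def _cols(self, item):
        return [hash((salt, item)) % self.width for salt in self.salts]

    def add(self, item, weight=1.0):
        # Fade all counters (damped window), then record the observation.
        for row in self.table:
            for j in range(self.width):
                row[j] *= self.decay
        for row, col in zip(self.table, self._cols(item)):
            row[col] += weight

    def estimate(self, item):
        # Minimum over rows never underestimates the true damped count.
        return min(row[col] for row, col in zip(self.table, self._cols(item)))
```

With `decay=1.0` this reduces to a standard count-min sketch; with `decay<1.0` old observations fade geometrically, so the density estimate tracks the recent stream.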
Software is everywhere, from mission-critical systems such as industrial power stations and pacemakers to household appliances.
This growing dependence on technology and the increasing complexity of software have serious security implications, as it means we are potentially surrounded by software that contains exploitable vulnerabilities.
These challenges have made binary analysis an important area of research in computer science and have emphasized the need for building automated analysis systems that can operate at scale with speed and efficacy, all while performing with the skill of a human expert.
Though great progress has been made in this area of research, there remain limitations and open challenges to be addressed.
Recognizing this need, DARPA sponsored the Cyber Grand Challenge (CGC), a competition to showcase the current state of the art in systems that perform automated vulnerability detection, exploit generation and software patching.
This paper is a survey of the vulnerability detection and exploit generation techniques, underlying technologies and related works of two of the winning systems, Mayhem and Mechanical Phish.
Programs with dynamic allocation are able to create and use an unbounded number of fresh resources, such as references, objects, files, etc.
We propose History-Register Automata (HRA), a new automata-theoretic formalism for modelling such programs.
HRAs extend the expressiveness of previous approaches and bring us to the limits of decidability for reachability checks.
The distinctive feature of our machines is their use of unbounded memory sets (histories) where input symbols can be selectively stored and compared with symbols to follow.
In addition, stored symbols can be consumed or deleted by reset.
We show that the combination of consumption and reset capabilities renders the automata powerful enough to imitate counter machines, and yields closure under all regular operations apart from complementation.
We moreover examine weaker notions of HRAs which strike different balances between expressiveness and effectiveness.
Gray Level Co-occurrence Matrices (GLCM) are one of the earliest techniques used for image texture analysis.
In this paper we define a new feature called trace, extracted from the GLCM, and discuss its implications for texture analysis in the context of Content Based Image Retrieval (CBIR).
The theoretical extension of the GLCM to n-dimensional gray scale images is also discussed.
The results indicate that trace features outperform Haralick features when applied to CBIR.
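A minimal illustration of the trace feature follows; this is our own sketch, and the offset and normalization conventions may differ from the paper's.

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for offset (dx, dy).

    Entry (i, j) is the probability that a pixel with gray level i
    has a neighbor at offset (dx, dy) with gray level j.
    """
    M = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            M[img[y, x], img[y + dy, x + dx]] += 1
    return M / M.sum()

def trace_feature(img, levels, dx=1, dy=0):
    """Trace of the GLCM: the probability that a pixel and its offset
    neighbor share the same gray level (high for smooth textures)."""
    return float(np.trace(glcm(img, levels, dx, dy)))
```

For a constant image every co-occurring pair shares a gray level, so the trace is 1; for a one-pixel checkerboard no horizontal neighbors match, so the trace is 0.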
In this paper, we aim at developing scalable neural network-type learning systems.
Motivated by the idea of "constructive neural networks" in approximation theory, we focus on "constructing" rather than "training" feed-forward neural networks (FNNs) for learning, and propose a novel FNNs learning system called the constructive feed-forward neural network (CFN).
Theoretically, we prove that the proposed method not only overcomes the classical saturation problem for FNN approximation, but also reaches the optimal learning rate when the regression function is smooth, while the state-of-the-art learning rates established for traditional FNNs are only near optimal (up to a logarithmic factor).
A series of numerical simulations are provided to show the efficiency and feasibility of CFN via comparing with the well-known regularized least squares (RLS) with Gaussian kernel and extreme learning machine (ELM).
Distributed representations of sentences have been developed recently to represent their meaning as real-valued vectors.
However, it is not clear how much information such representations retain about the polarity of sentences.
To study this question, we decode sentiment from unsupervised sentence representations learned with different architectures (sensitive to the order of words, the order of sentences, or none) in 9 typologically diverse languages.
Sentiment results from the (recursive) composition of lexical items and grammatical strategies such as negation and concession.
The results are manifold: we show that there is no `one-size-fits-all' representation architecture outperforming the others across the board.
Rather, the top-ranking architectures depend on the language and data at hand.
Moreover, we find that in several cases the additive composition model based on skip-gram word vectors may surpass supervised state-of-the-art architectures such as bidirectional LSTMs.
Finally, we provide a possible explanation of the observed variation based on the type of negative constructions in each language.
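The additive composition baseline mentioned above can be sketched as: sum the word vectors of a sentence, then fit a linear classifier on the composed vectors. The toy two-dimensional embeddings below are invented for illustration; in practice the vectors would come from a pretrained skip-gram model.

```python
import numpy as np

def sentence_vec(tokens, emb):
    """Additive composition: the sentence vector is the sum of its
    (known) word vectors."""
    return np.sum([emb[t] for t in tokens if t in emb], axis=0)

def train_logreg(X, y, lr=0.5, epochs=200):
    """Plain logistic regression on composed sentence vectors,
    trained with batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        g = p - y                               # gradient of the log-loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b
```

Despite its simplicity, this sum-then-classify pipeline is the kind of model the abstract reports as sometimes surpassing supervised recurrent architectures.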
Tagging is a popular feature that supports several collaborative tasks, including search, as tags produced by one user can help others find relevant content.
However, task performance depends on the existence of 'good' tags.
A first step towards creating incentives for users to produce 'good' tags is the quantification of their value in the first place.
This work fills this gap by combining qualitative and quantitative research methods.
In particular, using contextual interviews, we first determine aspects that influence users' perception of tags' value for exploratory search.
Next, we formalize some of the identified aspects and propose an information-theoretical method with provable properties that quantifies the two most important aspects (according to the qualitative analysis) that influence the perception of tag value: the ability of a tag to reduce the search space while retrieving relevant items to the user.
The evaluation on real data shows that our method is accurate: tags that users consider more important have higher value than tags in which users have not expressed interest.
Evaluating conjunctive queries and solving constraint satisfaction problems are fundamental problems in database theory and artificial intelligence, respectively.
These problems are NP-hard, so that several research efforts have been made in the literature for identifying tractable classes, known as islands of tractability, as well as for devising clever heuristics for solving efficiently real-world instances.
Many heuristic approaches are based on enforcing on the given instance a property called local consistency, where (in database terms) each tuple in every query atom matches at least one tuple in every other query atom.
Interestingly, it turns out that, for many well-known classes of queries, such as for the acyclic queries, enforcing local consistency is even sufficient to solve the given instance correctly.
However, the precise power of such a procedure was unclear, except for some very restricted cases.
The paper provides full answers to the long-standing questions about the precise power of algorithms based on enforcing local consistency.
The classes of instances where enforcing local consistency turns out to be a correct query-answering procedure are however not efficiently recognizable.
In fact, the paper finally focuses on certain subclasses defined in terms of the novel notion of greedy tree projections.
These latter classes are shown to be efficiently recognizable and strictly larger than most islands of tractability known so far, both in the general case of tree projections and for specific structural decomposition methods.
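In database terms, enforcing the local consistency property described above amounts to iterated semijoin reduction: repeatedly delete from each query atom the tuples that match no tuple of some other atom on their shared variables, until a fixpoint. The sketch below is our own minimal rendering, not the paper's algorithm.

```python
def semijoin_reduce(atoms):
    """Enforce local (pairwise) consistency on a query instance.

    atoms: dict mapping atom name -> (variables tuple, set of tuples).
    A tuple survives only if, for every other atom sharing variables,
    some tuple there agrees on all shared variables.
    """
    changed = True
    while changed:
        changed = False
        for a, (va, ta) in atoms.items():
            for b, (vb, tb) in atoms.items():
                if a == b:
                    continue
                # positions of shared variables in each atom's schema
                shared = [(va.index(v), vb.index(v)) for v in va if v in vb]
                if not shared:
                    continue
                keep = {t for t in ta
                        if any(all(t[i] == u[j] for i, j in shared)
                               for u in tb)}
                if keep != ta:
                    atoms[a] = (va, keep)
                    ta = keep
                    changed = True
    return atoms
```

For acyclic queries this procedure alone decides the instance (an empty atom after reduction means an empty answer); the paper characterizes exactly how far beyond acyclicity this correctness extends.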
Android applications are frequently plagiarized or repackaged, and software obfuscation is a recommended protection against these practices.
However, there is very little data on the overall rates of app obfuscation, the techniques used, or the factors that lead developers to choose to obfuscate their apps.
In this paper, we present the first comprehensive analysis of the use of and challenges to software obfuscation in Android applications.
We analyzed 1.7 million free Android apps from Google Play to detect various obfuscation techniques, finding that only 24.92% of apps are obfuscated by the developer.
To better understand this rate of obfuscation, we surveyed 308 Google Play developers about their experiences and attitudes about obfuscation.
We found that while developers feel that apps in general are at risk of plagiarism, they do not fear theft of their own apps.
Developers also self-report difficulties applying obfuscation for their own apps.
To better understand this, we conducted a follow-up study where the vast majority of 70 participants failed to obfuscate a realistic sample app even while many mistakenly believed they had been successful.
Our findings show that more work is needed to make obfuscation tools more usable, to educate developers about the risks of their apps being reverse engineered, their intellectual property stolen, and their apps being repackaged and redistributed as malware, and to improve the health of the overall Android ecosystem.
Human-robot collaboration, including close physical human-robot interaction (pHRI), is a current trend in both industry and science.
The safety guidelines prescribe two modes of safety: (i) power and force limitation and (ii) speed and separation monitoring.
We examine the potential of robots equipped with artificial sensitive skin and a protective safety zone around it (peripersonal space) for safe pHRI.
Automatically recognizing and localizing a wide range of human actions is of crucial importance for video understanding.
Towards this goal, the THUMOS challenge was introduced in 2013 to serve as a benchmark for action recognition.
Until then, video action recognition, including the THUMOS challenge, had focused primarily on the classification of pre-segmented (i.e., trimmed) videos, which is an artificial task.
In THUMOS 2014, we elevated action recognition to a more practical level by introducing temporally untrimmed videos.
These also include `background videos' which share similar scenes and backgrounds as action videos, but are devoid of the specific actions.
The three editions of the challenge organized in 2013--2015 have made THUMOS a common benchmark for action classification and detection and the annual challenge is widely attended by teams from around the world.
In this paper we describe the THUMOS benchmark in detail and give an overview of data collection and annotation procedures.
We present the evaluation protocols used to quantify results in the two THUMOS tasks of action classification and temporal detection.
We also present results of submissions to the THUMOS 2015 challenge and review the participating approaches.
Additionally, we include a comprehensive empirical study evaluating the differences in action recognition between trimmed and untrimmed videos, and how well methods trained on trimmed videos generalize to untrimmed videos.
We conclude by proposing several directions and improvements for future THUMOS challenges.
Sentiment analysis of reviews is typically carried out using manually built or automatically generated lexicon resources, in which terms are matched against the lexicon to compute term counts for positive and negative polarity.
SentiWordNet, on the other hand, is quite different from other lexicon resources in that it gives scores (weights) of the positive and negative polarity for each word.
Each polarity of a word, namely positive, negative and neutral, has a score ranging between 0 and 1 that indicates the strength/weight of the word with that sentiment orientation.
In this paper, we show how SentiWordNet can be used to enhance the performance of classification at both the sentence and document level.
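A minimal illustration of aggregating SentiWordNet-style scores follows. The miniature lexicon and its scores are invented for illustration, and the sum-and-compare rule is a simplification of the paper's classification approach.

```python
# Hypothetical miniature lexicon in the SentiWordNet style:
# word -> (positive score, negative score), each in [0, 1].
LEXICON = {
    'good':   (0.75, 0.00),
    'bad':    (0.00, 0.62),
    'not':    (0.00, 0.00),
    'plot':   (0.00, 0.00),
    'boring': (0.00, 0.88),
}

def sentence_polarity(tokens, lexicon=LEXICON):
    """Sum the positive and negative scores of known words and
    classify the sentence by whichever total is larger."""
    pos = sum(lexicon.get(t, (0.0, 0.0))[0] for t in tokens)
    neg = sum(lexicon.get(t, (0.0, 0.0))[1] for t in tokens)
    if pos > neg:
        return 'positive'
    if neg > pos:
        return 'negative'
    return 'neutral'
```

Unlike plain term counting, the graded scores let a strongly polar word (e.g., 'boring' at 0.88) outweigh several weakly polar ones, which is the advantage the abstract attributes to SentiWordNet.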
We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?").
In order to answer, the agent must first intelligently navigate to explore the environment, gather information through first-person (egocentric) vision, and then answer the question ("orange").
This challenging task requires a range of AI skills -- active perception, language understanding, goal-driven navigation, commonsense reasoning, and grounding of language into actions.
In this work, we develop the environments, end-to-end-trained reinforcement learning agents, and evaluation protocols for EmbodiedQA.
Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks.
There has been a surge in interest in discrete latent variable models, however, despite several recent improvements, the training of discrete latent variable models has remained challenging and their performance has mostly failed to match their continuous counterparts.
Recent work on vector quantized autoencoders (VQ-VAE) has made substantial progress in this direction, with its perplexity almost matching that of a VAE on datasets such as CIFAR-10.
In this work, we investigate an alternate training technique for VQ-VAE, inspired by its connection to the Expectation Maximization (EM) algorithm.
Training the discrete bottleneck with EM helps us achieve better image generation results on CIFAR-10, and together with knowledge distillation, allows us to develop a non-autoregressive machine translation model whose accuracy almost matches a strong greedy autoregressive baseline Transformer, while being 3.3 times faster at inference.
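The EM connection can be sketched as a soft clustering update on the VQ codebook: E-step responsibilities replace the hard nearest-neighbour assignment, and the M-step refits each code vector as a responsibility-weighted mean. This is a minimal NumPy illustration of the idea, not the paper's exact training procedure.

```python
import numpy as np

def em_codebook_update(z, codebook, temperature=1.0):
    """One soft-EM step on a VQ bottleneck (a sketch, not the paper's update).

    z:        (N, D) encoder outputs
    codebook: (K, D) code vectors
    """
    # E-step: responsibilities from (negative) squared distances.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)   # (N, K)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)                  # stabilize
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)                      # rows sum to 1
    # M-step: each code vector becomes the weighted mean of its soft cluster.
    weights = resp.sum(axis=0)                                    # (K,)
    new_codebook = (resp.T @ z) / np.maximum(weights[:, None], 1e-9)
    return new_codebook, resp

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
codebook = rng.normal(size=(16, 8))
for _ in range(10):
    codebook, resp = em_codebook_update(z, codebook)
```

With temperature driven to zero this recovers hard k-means-style vector quantization, which is the usual VQ-VAE assignment rule.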
We study the query complexity of Weak Parity: the problem of computing the parity of an n-bit input string, where one only has to succeed on a 1/2+eps fraction of input strings, but must do so with high probability on those inputs where one does succeed.
It is well-known that n randomized queries and n/2 quantum queries are needed to compute parity on all inputs.
But surprisingly, we give a randomized algorithm for Weak Parity that makes only O(n/log^0.246(1/eps)) queries, as well as a quantum algorithm that makes only O(n/sqrt(log(1/eps))) queries.
We also prove a lower bound of Omega(n/log(1/eps)) in both cases; and using extremal combinatorics, prove lower bounds of Omega(log n) in the randomized case and Omega(sqrt(log n)) in the quantum case for any eps>0.
We show that improving our lower bounds is intimately related to two longstanding open problems about Boolean functions: the Sensitivity Conjecture, and the relationships between query complexity and polynomial degree.
Decide Madrid is the civic technology of Madrid City Council which allows users to create and support online petitions.
Despite the initial success, the platform is encountering problems with the growth of petition signing, because most petitions fall far short of the minimum number of supporting votes they must gather.
Previous analyses have suggested that this problem is produced by the interface: a paginated list of petitions which applies a non-optimal ranking algorithm.
For this reason, we present an interactive system for the discovery of topics and petitions.
This approach leads us to reflect on the usefulness of data visualization techniques to address relevant societal challenges.
The availability of large amounts of clinical data is opening up new research avenues in a number of fields.
An exciting field in this respect is healthcare, where the secondary use of such data is beginning to revolutionize the domain.
Beyond the availability of Big Data itself, both medical data from healthcare institutions (such as EMR data) and data generated by health and wellbeing devices (such as personal trackers), a significant contribution to this trend is also being made by recent advances in machine learning, specifically deep learning algorithms.
Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals.
The success of many existing systems is therefore largely dependent on the choice of features used for training.
In this work, we introduce a novel multi-channel, multi-resolution convolutional auto-encoder neural network that works on raw time-domain signals to determine appropriate multi-resolution features for separating the singing-voice from stereo music.
Our experimental results show that the proposed method can achieve multi-channel audio source separation without the need for hand-crafted features or any pre- or post-processing.
The multivariate probit model (MVP) is a popular classic model for studying binary responses of multiple entities.
Nevertheless, the computational challenge of learning the MVP model, given that its likelihood involves integrating over a multidimensional constrained space of latent variables, significantly limits its application in practice.
We propose a flexible deep generalization of the classic MVP, the Deep Multivariate Probit Model (DMVP), which is an end-to-end learning scheme that uses an efficient parallel sampling process of the multivariate probit model to exploit GPU-boosted deep neural networks.
We present both theoretical and empirical analysis of the convergence behavior of DMVP's sampling process with respect to the resolution of the correlation structure.
We provide convergence guarantees for DMVP and our empirical analysis demonstrates the advantages of DMVP's sampling compared with standard MCMC-based methods.
We also show that when applied to multi-entity modelling problems, which are natural DMVP applications, DMVP trains faster than classical MVP, by at least an order of magnitude, captures rich correlations among entities, and further improves the joint likelihood of entities compared with several competitive models.
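The batched sampling that makes DMVP practical can be illustrated with a plain Monte Carlo estimate of a multivariate probit likelihood term, where all latent Gaussian draws happen in one vectorized batch; this is a sketch of the principle, not DMVP's learned sampler.

```python
import numpy as np

def mvp_joint_prob(mu, cov, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(Y_i = 1 for all entities) in a multivariate
    probit: Y_i = 1 iff the correlated latent Gaussian Z_i > 0. All samples
    are drawn in one vectorized batch, the kind of parallelism a GPU
    implementation exploits."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(mu, cov, size=n_samples)  # (n_samples, d)
    return np.mean((z > 0).all(axis=1))

# Bivariate case with correlation 0.8; the closed form here is
# 1/4 + arcsin(rho) / (2*pi), which the estimate should approach.
mu = np.zeros(2)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
p = mvp_joint_prob(mu, cov)
```

The likelihood involves exactly such orthant probabilities, which lack a closed form beyond low dimensions; sampling in parallel sidesteps the constrained integration that limits the classic MVP.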
The recovery type error estimators introduced by Zienkiewicz and Zhu use a recovered stress field evaluated from the Finite Element (FE) solution.
Their accuracy depends on the quality of the recovered field.
In this sense, accurate results are obtained using recovery procedures based on the Superconvergent Patch recovery technique (SPR).
These error estimators can be easily implemented and provide accurate estimates.
Another important feature is that the recovered solution is of a better quality than the FE solution and can therefore be used as an enhanced solution.
We have developed an SPR-type recovery technique that considers equilibrium and displacements constraints to obtain a very accurate recovered displacements field from which a recovered stress field can also be evaluated.
We propose the use of these recovered fields as the standard output of the FE code instead of the raw FE solution.
Techniques to quantify the error of the recovered solution are therefore needed.
In this report we present an error estimation technique that accurately evaluates the error of the recovered solution both at global and local levels in the FEM and XFEM frameworks.
We have also developed an h-adaptive mesh refinement strategy based on the error of the recovered solution.
As the converge rate of the error of the recovered solution is higher than that of the FE one, the computational cost required to obtain a solution with a prescribed accuracy is smaller than for traditional h-adaptive processes.
A software element defined in one place is typically used in many places.
When it is changed, all its occurrences may need to be changed too, which can severely hinder software evolution.
This has led to the support of encapsulation in modern programming languages.
Unfortunately, as is shown in this paper, this is not enough to express all the constraints that are needed to decouple programming elements that evolve at different paces.
In this paper we show that (i) a language can be defined to easily express very general coupling constraints, and (ii) violations of these constraints can be detected automatically.
We then demonstrate several places where the need for coupling constraints arose in open-source Java projects.
These constraints were expressed in comments when explicit constraints would have enabled automatic treatment.
Convolutional Siamese neural networks have been recently used to track objects using deep features.
Siamese architectures can achieve real-time speed; however, it is still difficult to find a Siamese architecture that maintains generalization capability, high accuracy, and speed while decreasing the number of shared parameters, especially when it is very deep.
Furthermore, a conventional Siamese architecture usually processes one local neighborhood at a time, which makes the appearance model local and non-robust to appearance changes.
To overcome these two problems, this paper proposes DensSiam, a novel convolutional Siamese architecture, which uses the concept of dense layers and connects each dense layer to all layers in a feed-forward fashion with a similarity-learning function.
DensSiam also includes a Self-Attention mechanism to force the network to pay more attention to the non-local features during offline training.
Extensive experiments are performed on five tracking benchmarks: OTB2013 and OTB2015 as the validation set, and VOT2015, VOT2016, and VOT2017 as the test set.
The obtained results show that DensSiam achieves superior results on these benchmarks compared to other current state-of-the-art methods.
Micro-blogging services such as Twitter allow anyone to publish anything, anytime.
Needless to say, much of the available content can be dismissed as babble or spam.
However, given the number and diversity of users, some valuable pieces of information should arise from the stream of tweets.
Thus, such services can develop into valuable sources of up-to-date information (the so-called real-time web) provided a way to find the most relevant/trustworthy/authoritative users is available.
This makes finding such users a highly pertinent question, one for which graph centrality methods can provide an answer.
In this paper, the author offers a comprehensive survey of feasible algorithms for ranking users in social networks, examines their vulnerabilities to linking malpractice in such networks, and suggests an objective criterion against which to compare such algorithms.
Additionally, he suggests a first step towards "desensitizing" prestige algorithms against cheating by spammers and other abusive users.
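One of the classic prestige algorithms such a survey covers is PageRank; a minimal power-iteration sketch over a small hypothetical follower graph looks like this:

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank over an endorsement graph:
    adj[u] lists the users that u endorses/follows."""
    nodes = list(adj)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        nxt = {u: (1.0 - damping) / n for u in nodes}   # teleport mass
        for u in nodes:
            out = adj[u]
            if not out:                                 # dangling user
                for v in nodes:
                    nxt[v] += damping * rank[u] / n
            else:
                for v in out:
                    nxt[v] += damping * rank[u] / len(out)
        rank = nxt
    return rank

# Tiny follower graph: both "a" and "b" endorse "c", so "c" ranks highest.
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
rank = pagerank(graph)
```

The vulnerability discussed above is visible even here: a spammer who fabricates accounts that all endorse him inflates his score, which motivates the desensitizing step the paper proposes.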
"Fake news" is a recent phenomenon, but misinformation and propaganda are not.
Our new communication technologies make it easy for us to be exposed to high volumes of true, false, irrelevant, and unprovable information.
Future AI is expected to amplify the problem even more.
At the same time, our brains are reaching their limits in handling information.
How should we respond to propaganda?
Technology can help, but relying on it alone will not suffice in the long term.
We also need ethical policies, laws, regulations, and trusted authorities, including fact-checkers.
However, we will not solve the problem without the active engagement of the educated citizen.
Epistemological education, recognition of self biases and protection of our channels of communication and trusted networks are all needed to overcome the problem and continue our progress as democratic societies.
Bridge is among the zero-sum games for which artificial intelligence has not yet outperformed expert human players.
The main difficulty lies in the bidding phase of bridge, which requires cooperative decision making under partial information.
Existing artificial intelligence systems for bridge bidding rely on and are thus restricted by human-designed bidding systems or features.
In this work, we propose a pioneering bridge bidding system without the aid of human domain knowledge.
The system is based on a novel deep reinforcement learning model, which extracts sophisticated features and learns to bid automatically based on raw card data.
The model includes an upper-confidence-bound algorithm and additional techniques to achieve a balance between exploration and exploitation.
Our experiments validate the promising performance of our proposed model.
In particular, the model advances from having no knowledge about bidding to achieving superior performance when compared with a champion-winning computer bridge program that implements a human-designed bidding system.
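The exploration-exploitation balance mentioned above is typically driven by an upper-confidence-bound score. A generic UCB1 sketch is shown below; the paper's exact variant, and the bid statistics used here, are illustrative assumptions.

```python
import math

def ucb1(wins, plays, total_plays, c=math.sqrt(2)):
    """UCB1 score: empirical win rate plus an exploration bonus that
    shrinks as an action is tried more often."""
    if plays == 0:
        return float("inf")            # unexplored actions are tried first
    return wins / plays + c * math.sqrt(math.log(total_plays) / plays)

# Pick the bid with the highest UCB score (hypothetical win/play counts).
stats = {"pass": (30, 60), "1NT": (12, 20), "2C": (0, 0)}
total = sum(plays for _, plays in stats.values())
best = max(stats, key=lambda b: ucb1(*stats[b], total))
```

The infinite score for untried bids is what forces the learner to sample every action at least once before exploiting the apparent best one.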
In metabolomics, small molecules are structurally elucidated using tandem mass spectrometry (MS/MS); this resulted in the computational Maximum Colorful Subtree problem, which is NP-hard.
Unfortunately, data from a single metabolite requires us to solve hundreds or thousands of instances of this problem; and in a single Liquid Chromatography MS/MS run, hundreds or thousands of metabolites are measured.
Here, we comprehensively evaluate the performance of several heuristic algorithms for the problem against an exact algorithm.
We put particular emphasis on whether a heuristic is able to rank candidates such that the correct solution is ranked highly.
We propose this "intermediate" evaluation because evaluating only the approximation quality of heuristics is misleading: even a slightly suboptimal solution can be structurally very different from the true solution.
On the other hand, we cannot structurally evaluate against the ground truth, as this is unknown.
We find that one particular heuristic consistently ranks the correct solution in a top position, allowing us to speed up computations about 100-fold.
We also find that scores of the best heuristic solutions are very close to the optimal score; in contrast, the structure of the solutions can deviate significantly from the optimal structures.
As a promising downlink multiple access scheme, Rate-Splitting Multiple Access (RSMA) has been shown to achieve superior spectral and energy efficiencies compared with Space-Division Multiple Access (SDMA) and Non-Orthogonal Multiple Access (NOMA) in downlink single-cell systems.
By relying on linearly precoded rate-splitting at the transmitter and successive interference cancellation at the receivers, RSMA has the capability of partially decoding the interference and partially treating the interference as noise, and therefore copes with a wide range of user deployments and network loads.
In this work, we further study RSMA in downlink Coordinated Multi-Point (CoMP) Joint Transmission (JT) networks by investigating the optimal beamformer design to maximize the Weighted Sum-Rate (WSR) of all users subject to individual Quality of Service (QoS) rate constraints and per base station power constraints.
Numerical results show that, in CoMP JT, RSMA achieves significant WSR improvement over SDMA and NOMA in a wide range of inter-user and inter-cell channel strength disparities.
Specifically, SDMA (resp. NOMA) is more suited to deployments with little (resp. large) inter-user channel strength disparity and large (resp. little) inter-cell channel disparity, while RSMA is suited to any deployment.
We conclude that RSMA provides rate, robustness and QoS enhancements over SDMA and NOMA in CoMP JT networks.
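The rate-splitting principle behind these results can be sketched in a deliberately simplified two-user, single-antenna setting; the paper itself optimizes precoded MISO transmission, and the gains and powers below are illustrative.

```python
import numpy as np

def rsma_sum_rate(g, p_common, p_private, noise=1.0):
    """Achievable sum rate of two-user single-antenna rate splitting
    (a simplified SISO sketch, not the paper's precoded MISO optimizer).

    g:         channel gains (g1, g2)
    p_common:  power on the common stream
    p_private: powers (p1, p2) on the private streams
    """
    g = np.asarray(g, float)
    p = np.asarray(p_private, float)
    # The common stream is decoded first, treating all private streams as
    # noise; its rate is limited by the weaker user, since both must decode it.
    sinr_c = p_common * g / (p.sum() * g + noise)
    r_common = np.log2(1.0 + sinr_c.min())
    # After SIC removes the common stream, each user decodes its own private
    # stream, treating the other user's private stream as noise.
    other = p[::-1]
    r_private = np.log2(1.0 + p * g / (other * g + noise))
    return r_common + r_private.sum()

rate = rsma_sum_rate(g=(1.0, 0.5), p_common=6.0, p_private=(2.0, 2.0))
```

Shifting power between the common and private streams is what lets the scheme slide between fully treating interference as noise (all power private) and fully decoding it (all power common).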
This paper seeks to combine differential game theory with the actor-critic-identifier architecture to determine forward-in-time, approximate optimal controllers for formation tracking in multi-agent systems, where the agents have uncertain heterogeneous nonlinear dynamics.
A continuous control strategy is proposed, using communication feedback from extended neighbors on a communication topology that has a spanning tree.
A model-based reinforcement learning technique is developed to cooperatively control a group of agents to track a trajectory in a desired formation.
Simulation results are presented to demonstrate the performance of the developed technique.
This paper presents a new approach for training artificial neural networks using techniques for solving the constraint satisfaction problem (CSP).
The quotient gradient system (QGS) is a trajectory-based method for solving the CSP.
This study converts the training set of a neural network into a CSP and uses the QGS to find its solutions.
The QGS finds the global minimum of the optimization problem by tracking trajectories of a nonlinear dynamical system and does not stop at a local minimum of the optimization problem.
Lyapunov theory is used to prove the asymptotic stability of the solutions with and without the presence of measurement errors.
Numerical examples illustrate the effectiveness of the proposed methodology and compare it to a genetic algorithm and error backpropagation.
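The trajectory-tracking idea can be illustrated with a toy gradient flow on the residual energy of a two-constraint CSP; this is a minimal sketch in the spirit of the QGS, not the QGS itself.

```python
import numpy as np

def F(x):
    # Constraint residuals for a toy CSP: x0 + x1 = 3 and x0 - x1 = 1.
    return np.array([x[0] + x[1] - 3.0, x[0] - x[1] - 1.0])

def J(x):
    # Jacobian of the residuals (constant for this linear toy system).
    return np.array([[1.0, 1.0], [1.0, -1.0]])

def trajectory_solve(x0, step=0.1, iters=500):
    """Euler-integrate the gradient flow x' = -J^T F, i.e. steepest descent
    on the residual energy ||F(x)||^2 / 2; the trajectory of the dynamical
    system ends at a point satisfying the constraints."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x -= step * J(x).T @ F(x)
    return x

x = trajectory_solve([10.0, -5.0])   # converges to the CSP solution (2, 1)
```

Casting network training as such a CSP means the weights are driven along a trajectory until every training equation's residual vanishes, rather than until a local descent direction disappears.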
Investigation of divisibility properties of natural numbers is one of the most important themes in the theory of numbers.
Various tools have been developed over the centuries to discover and study the various patterns in the sequence of natural numbers in the context of divisibility.
In the present paper, we study the divisibility of natural numbers using the framework of a growing complex network.
In particular, using tools from the field of statistical inference, we show that the network is scale-free but has a non-stationary degree distribution.
Along with this, we report a new kind of similarity pattern for the local clustering, which we call "stretching similarity", in this network.
We also show that the various characteristics like average degree, global clustering coefficient and assortativity coefficient of the network vary smoothly with the size of the network.
Using analytical arguments we estimate the asymptotic behavior of global clustering and average degree which is validated using numerical analysis.
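A minimal reconstruction of such a network, connecting i and j whenever i divides j, shows the slow growth of the average degree; the paper's exact growth rule may differ from this sketch.

```python
def divisibility_network(n):
    """Edges of the divisibility network on {1, ..., n}: connect i -- j
    (i < j) whenever i divides j."""
    return [(i, j) for j in range(2, n + 1)
                   for i in range(1, j) if j % i == 0]

def average_degree(n):
    # Each edge contributes degree to two nodes.
    return 2 * len(divisibility_network(n)) / n

# The edge count is sum over j <= n of (d(j) - 1), where d(j) is the number
# of divisors of j; this grows like n ln n, so the average degree grows
# roughly like 2 ln n, i.e. smoothly with network size.
print(average_degree(1000))
```

The smooth ln-like growth of the average degree is one of the characteristics the paper reports varying gradually with network size.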
Clinical medical data, especially in the intensive care unit (ICU), consist of multivariate time series of observations.
For each patient visit (or episode), sensor data and lab test results are recorded in the patient's Electronic Health Record (EHR).
While potentially containing a wealth of insights, the data is difficult to mine effectively, owing to varying length, irregular sampling and missing data.
Recurrent Neural Networks (RNNs), particularly those using Long Short-Term Memory (LSTM) hidden units, are powerful and increasingly popular models for learning from sequence data.
They effectively model varying length sequences and capture long range dependencies.
We present the first study to empirically evaluate the ability of LSTMs to recognize patterns in multivariate time series of clinical measurements.
Specifically, we consider multilabel classification of diagnoses, training a model to classify 128 diagnoses given 13 frequently but irregularly sampled clinical measurements.
First, we establish the effectiveness of a simple LSTM network for modeling clinical data.
Then we demonstrate a straightforward and effective training strategy in which we replicate targets at each sequence step.
Trained only on raw time series, our models outperform several strong baselines, including a multilayer perceptron trained on hand-engineered features.
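The target-replication strategy can be sketched as a loss that applies the episode label at every sequence step and blends the per-step average with the final-step loss. This is a single-label NumPy sketch; the paper's setting is multilabel and the blending weight is an assumption.

```python
import numpy as np

def replicated_target_loss(logits, target, alpha=0.5):
    """Target replication: the same episode label is used as the target at
    every sequence step, and the mean per-step loss is blended with the
    final-step loss.

    logits: (T, C) per-step class scores from the RNN
    target: integer class label for the whole episode
    """
    # Per-step softmax cross-entropy with the replicated target.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    step_losses = -log_probs[:, target]                       # (T,)
    # Blend the average per-step loss with the final-step loss.
    return alpha * step_losses.mean() + (1 - alpha) * step_losses[-1]

# Uniform scores over 4 classes give a loss of exactly log(4) at every step.
loss = replicated_target_loss(np.zeros((5, 4)), target=2)
```

Supervising every step this way gives the recurrent model a learning signal early in the episode instead of only at its end.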
In this contribution, we investigate a coarsely quantized Multi-User (MU)-Multiple Input Single Output (MISO) downlink communication system, where we assume 1-Bit Digital-to-Analog Converters (DACs) at the Base Station (BS) antennas.
First, we analyze the achievable sum rate lower-bound using the Bussgang decomposition.
In the presence of this non-linear quantization, our analysis indicates the potential merit of reconsidering traditional signal processing techniques in coarsely quantized systems, i.e., reconsidering transmit covariance matrices whose rank is equal to the rank of the channel.
Furthermore, in the second part of this paper, we propose a linear precoder design which achieves the predicted increase in performance compared with a state of the art linear precoder design.
Moreover, our linear signal processing algorithm allows for higher-order modulation schemes to be employed.
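The Bussgang decomposition used in the analysis splits the 1-bit quantizer output into a scaled replica of the input plus distortion that is uncorrelated with the input; the identity is easy to verify numerically for a Gaussian input.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0
x = rng.normal(0.0, sigma, size=1_000_000)

# 1-bit DAC: only the sign of the input survives.
y = np.sign(x)

# Bussgang decomposition: y = B*x + d with d uncorrelated with x.
# The linear gain is B = E[xy] / E[x^2]; for Gaussian x it is sqrt(2/pi)/sigma.
B = np.mean(x * y) / np.mean(x ** 2)
d = y - B * x

print(B, np.sqrt(2 / np.pi) / sigma)   # empirical vs. closed-form gain
print(np.mean(x * d))                  # ~0: distortion uncorrelated with input
```

Treating the quantizer as this linear gain plus an uncorrelated distortion term is what makes a tractable lower bound on the achievable sum rate possible.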
We analyze cooperative Cournot games with boundedly rational firms.
Due to cognitive constraints, the members of a coalition cannot accurately predict the coalitional structure of the non-members.
Thus, they compute their value using simple heuristics.
In particular, they assign various non-equilibrium probability distributions over the outsiders' set of partitions.
We construct the characteristic function of a coalition in such an environment and we analyze the core of the corresponding games.
We show that the core is non-empty provided the number of firms in the market is sufficiently large.
Moreover, we show that if two distributions over the set of partitions are related via first-order dominance, then the core of the game under the dominated distribution is a subset of the core under the dominant distribution.
This paper proposes a deep learning architecture based on Residual Network that dynamically adjusts the number of executed layers for the regions of the image.
This architecture is end-to-end trainable, deterministic and problem-agnostic.
It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and image segmentation.
We present experimental results showing that this model improves the computational efficiency of Residual Networks on the challenging ImageNet classification and COCO object detection datasets.
Additionally, we evaluate the computation time maps on the visual saliency dataset cat2000 and find that they correlate surprisingly well with human eye fixation positions.
The paper proposes an approach to modeling users of large Web sites based on combining different data sources: access logs and the content of the accessed pages are combined with semantic information about the Web pages, the users, and the users' accesses to the Web site.
The assumption is that we are dealing with a large Web site providing content to a large number of users accessing the site.
The proposed approach represents each user by a set of features derived from the different data sources, where some feature values may be missing for some users.
It further enables user modeling based on the provided characteristics of the targeted user subset.
The approach is evaluated on real-world data where we compare performance of the automatic assignment of a user to a predefined user segment when different data sources are used to represent the users.
Event recognition in still images is an intriguing problem and has potential for real applications.
This paper addresses the problem of event recognition by proposing a convolutional neural network that exploits knowledge of objects and scenes for event classification (OS2E-CNN).
Intuitively, there exists a correlation among the concepts of objects, scenes, and events.
We empirically demonstrate that the recognition of objects and scenes substantially contributes to the recognition of events.
Meanwhile, we propose an iterative selection method to identify a subset of object and scene classes, which help to more efficiently and effectively transfer their deep representations to event recognition.
Specifically, we develop three types of transferring techniques: (1) initialization-based transferring, (2) knowledge-based transferring, and (3) data-based transferring.
These newly designed transferring techniques exploit multi-task learning frameworks to incorporate extra knowledge from other networks and additional datasets into the training procedure of event CNNs.
These multi-task learning frameworks turn out to be effective in reducing the effect of over-fitting and improving the generalization ability of the learned CNNs.
With OS2E-CNN, we design a multi-ratio and multi-scale cropping strategy, and propose an end-to-end event recognition pipeline.
We perform experiments on three event recognition benchmarks: the ChaLearn Cultural Event Recognition dataset, the Web Image Dataset for Event Recognition (WIDER), and the UIUC Sports Event dataset.
The experimental results show that our proposed algorithm successfully adapts object and scene representations towards the event dataset and that it achieves the current state-of-the-art performance on these challenging datasets.
Exploiting the fact that natural languages are complex systems, the present exploratory article proposes a direct method based on frequency distributions that may be useful when making a decision on the status of problematic phonemes, an open problem in linguistics.
The main notion is that natural languages, which can be considered from a complex outlook as information processing machines, and which somehow manage to set appropriate levels of redundancy, already "made the choice" whether a linguistic unit is a phoneme or not, and this would be reflected in a greater smoothness in a frequency versus rank graph.
For the particular case we chose to study, we conclude that it is reasonable to consider the Spanish semiconsonant /w/ as a separate phoneme from its vowel counterpart /u/, on the one hand, and possibly also the semiconsonant /j/ as a separate phoneme from its vowel counterpart /i/, on the other.
As language has been so central a topic in the study of complexity, this discussion grants us, in addition, an opportunity to gain insight into emerging properties in the broader complex systems debate.
In this paper, we theoretically investigate a new technique for simultaneous wireless information and power transfer (SWIPT) in multiple-input multiple-output (MIMO) point-to-point systems with radio frequency energy harvesting capabilities.
The proposed technique exploits the spatial decomposition of the MIMO channel and uses the eigenchannels either to convey information or to transfer energy.
In order to generalize our study, we consider channel estimation error in the decomposition process and the interference between the eigenchannels.
An optimization problem that minimizes the total transmitted power subject to maximum power per eigenchannel, information and energy constraints is formulated as a mixed-integer nonlinear program and solved to optimality using mixed-integer second-order cone programming.
A near-optimal mixed-integer linear programming solution is also developed with robust computational performance.
A polynomial complexity algorithm is further proposed for the optimal solution of the problem when no maximum power per eigenchannel constraints are imposed.
In addition, a low polynomial complexity algorithm is developed for the power allocation problem with a given eigenchannel assignment, as well as a low-complexity heuristic for solving the eigenchannel assignment problem.
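The spatial decomposition underlying the scheme is the SVD of the MIMO channel: precoding along the right singular vectors and combining along the left ones turns the link into parallel eigenchannels, each of which can carry information or energy. The energy/information assignment below is illustrative, not the paper's optimizer.

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))   # MIMO channel

# H = U diag(s) V^H: the s_i are the eigenchannel gains (sorted descending).
U, s, Vh = np.linalg.svd(H)

# Toy assignment: send energy on the strongest eigenchannel,
# information on the rest.
energy_idx = int(np.argmax(s))
info_idx = [i for i in range(len(s)) if i != energy_idx]

# Transmitting along the first right singular vector and combining with U^H
# concentrates all received amplitude on eigenchannel 0.
x = Vh.conj().T[:, 0]                  # first column of V
rx_gain = np.abs(U.conj().T @ H @ x)   # ~ [s[0], 0, 0, 0]
```

Because the eigenchannels are decoupled (up to the estimation error the paper models), assigning each one to information or energy becomes the discrete part of the mixed-integer program.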
Many real world tasks such as reasoning and physical interaction require identification and manipulation of conceptual entities.
A first step towards solving these tasks is the automated discovery of distributed symbol-like representations.
In this paper, we explicitly formalize this problem as inference in a spatial mixture model where each component is parametrized by a neural network.
Based on the Expectation Maximization framework we then derive a differentiable clustering method that simultaneously learns how to group and represent individual entities.
We evaluate our method on the (sequential) perceptual grouping task and find that it is able to accurately recover the constituent objects.
We demonstrate that the learned representations are useful for next-step prediction.
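The underlying EM machinery can be seen in a vanilla isotropic Gaussian mixture, where the E-step responsibilities are exactly the soft grouping of inputs into entities; in the paper each component is parametrized by a neural network, whereas here it is a plain Gaussian.

```python
import numpy as np

def em_grouping(x, k=2, iters=50):
    """Vanilla EM for an isotropic Gaussian mixture: responsibilities from
    the E-step give the soft assignment of points to components."""
    n, dim = x.shape
    mu = x[np.linspace(0, n - 1, k).astype(int)].astype(float)
    var = np.ones(k)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        d2 = ((x[:, None, :] - mu[None]) ** 2).sum(-1)              # (n, k)
        log_p = np.log(pi) - 0.5 * dim * np.log(2 * np.pi * var) - d2 / (2 * var)
        log_p -= log_p.max(1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(1, keepdims=True)
        # M-step: refit each component to the points it is responsible for.
        nk = r.sum(0)
        mu = (r.T @ x) / nk[:, None]
        d2 = ((x[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * d2).sum(0) / (dim * nk)
        pi = nk / n
    return r, mu

# Two well-separated synthetic "entities"; EM recovers the grouping.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(-3, 0.5, (100, 2)), rng.normal(3, 0.5, (100, 2))])
r, mu = em_grouping(x)
labels = r.argmax(axis=1)
```

The differentiable clustering method described above replaces these closed-form component updates with gradient steps through the network parametrizing each component.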
The rapidly increasing number of mobile devices, voluminous data, and demand for higher data rates are pushing a rethink of the current generation of cellular mobile communication.
The next or fifth generation (5G) cellular networks are expected to meet high-end requirements.
The 5G networks are broadly characterized by three unique features: ubiquitous connectivity, extremely low latency, and very high-speed data transfer.
The 5G networks would provide novel architectures and technologies beyond state-of-the-art architectures and technologies.
In this paper, our intent is to find an answer to the question: "what will be done by 5G and how?"
We investigate and discuss serious limitations of the fourth generation (4G) cellular networks and corresponding new features of 5G networks.
We identify challenges in 5G networks, new technologies for 5G networks, and present a comparative study of the proposed architectures that can be categorized on the basis of energy-efficiency, network hierarchy, and network types.
Interestingly, implementation issues, e.g., interference, QoS, handoff, security and privacy, channel access, and load balancing, hugely affect the realization of 5G networks.
Furthermore, our illustrations highlight the feasibility of these models through an evaluation of existing real-experiments and testbeds.
Complex applications implemented as Systems on Chip (SoCs) demand extensive use of system level modeling and validation.
Their implementation gathers a large number of complex IP cores and advanced interconnection schemes, such as hierarchical bus architectures or networks on chip (NoCs).
Modeling an application involves capturing its computation and communication characteristics.
Previously proposed communication weighted models (CWM) consider only the application communication aspects.
This work proposes a communication dependence and computation model (CDCM) that can simultaneously consider both aspects of an application.
It presents a solution to the problem of mapping applications on regular NoCs while considering execution time and energy consumption.
The use of CDCM is shown to provide estimated average reductions of 40% in execution time, and 20% in energy consumption, for current technologies.
Network algorithms demand low memory cost and fast packet processing speed.
Forwarding information base (FIB), as a typical network processing component, requires a scalable and memory-efficient algorithm to support fast lookups.
In this paper, we present a new network algorithm, Othello Hashing, and its application of a FIB design called Concise, which uses very little memory to support ultra-fast lookups of network names.
Othello Hashing and Concise make use of minimal perfect hashing and rely on the programmable network framework to support dynamic updates.
Our conceptual contribution of Concise is to optimize the memory efficiency and query speed in the data plane and move the relatively complex construction and update components to the resource-rich control plane.
We implemented Concise on three platforms.
Experimental results show that Concise uses significantly smaller memory to achieve much faster query speed compared to existing solutions of network name lookups.
Partial differential equations are central to describing many physical phenomena.
In many applications these phenomena are observed through a sensor network, with the aim of inferring their underlying properties.
Leveraging from certain results in sampling and approximation theory, we present a new framework for solving a class of inverse source problems for physical fields governed by linear partial differential equations.
Specifically, we demonstrate that the unknown field sources can be recovered from a sequence of so-called generalised measurements by using multidimensional frequency estimation techniques.
Next we show that---for physics-driven fields---this sequence of generalised measurements can be estimated by computing a linear weighted-sum of the sensor measurements; whereby the exact weights (of the sums) correspond to those that reproduce multidimensional exponentials, when used to linearly combine translates of a particular prototype function related to the Green's function of our underlying field.
Explicit formulae are then derived for the sequence of weights, that map sensor samples to the exact sequence of generalised measurements when the Green's function satisfies the generalised Strang-Fix condition.
Otherwise, the same mapping yields a close approximation of the generalised measurements.
Based on this new framework we develop practical, noise robust, sensor network strategies for solving the inverse source problem, and then present numerical simulation results to verify their performance.
Cloud data centers commonly face unexpected loads such as request bursts, which may lead to overload and performance degradation.
Dynamic Voltage Frequency Scaling and VM consolidation have proven effective at managing overloads.
However, they cannot function when the whole data center is overloaded.
Brownout provides a promising direction to avoid overloads through configuring applications to temporarily degrade user experience.
Additionally, brownout can also be applied to reduce data center energy consumption.
As a complementary option for Dynamic Voltage Frequency Scaling and VM consolidation, our combined brownout approach reduces energy consumption through selectively and dynamically deactivating application optional components, which can also be applied to self-contained microservices.
The results show that our approach can save more than 20% energy consumption and there are trade-offs between energy saving and discount offered to users.
Many academics have called for increasing attention to theory in software engineering.
Consequently, this paper empirically evaluates two dissimilar software development process theories - one expressing a more traditional, methodical view (FBS) and one expressing an alternative, more improvisational view (SCI).
A primarily quantitative survey of more than 1300 software developers is combined with four qualitative case studies to achieve a simultaneously broad and deep empirical evaluation.
Case data analysis using a closed-ended, a priori coding scheme based on the two theories strongly supports SCI, as does analysis of questionnaire response distributions (p<0.001; chi-square goodness of fit test).
Furthermore, case-questionnaire triangulation found no evidence that support for SCI varied by participants' gender, education, experience, nationality or the size or nature of their projects.
This suggests that instead of iteration between weakly-coupled phases (analysis, design, coding, testing), it is more accurate and useful to conceptualize development as ad hoc oscillation between organizing perceptions of the project context (Sensemaking), simultaneously improving mental pictures of the context and design artifact (Coevolution) and constructing, debugging and deploying software artifacts (Implementation).
One way to reduce the power consumption in large-scale multiple-input multiple-output (MIMO) systems is to employ low-resolution analog-to-digital converters (ADCs).
In this paper, we investigate antenna selection for large-scale MIMO receivers with low-resolution ADCs, thereby providing more flexibility in resolution and number of ADCs.
To incorporate quantization effects, we generalize an existing objective function for a greedy capacity-maximization antenna selection approach.
The derived objective function offers an opportunity to select an antenna with the best tradeoff between the additional channel gain and increase in quantization error.
Using the generalized objective function, we propose an antenna selection algorithm based on a conventional antenna selection algorithm without an increase in overall complexity.
Simulation results show that the proposed algorithm outperforms the conventional algorithm in achievable capacity for the same number of antennas.
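The greedy capacity-maximization step described above can be sketched as follows. This is an illustrative toy for the conventional (unquantized) objective only; the paper's generalized objective, which trades channel gain against quantization error, is not reproduced here.

```python
import numpy as np

def capacity(H, snr=1.0):
    # log2 det(I + snr * H H^H) for the currently selected rows of H
    n = H.shape[0]
    M = np.eye(n) + snr * (H @ H.conj().T)
    return float(np.log2(np.linalg.det(M)).real)

def greedy_antenna_selection(H, k, snr=1.0):
    # Greedily add the receive antenna (row of H) that most increases capacity.
    remaining = list(range(H.shape[0]))
    selected = []
    for _ in range(k):
        best = max(remaining, key=lambda a: capacity(H[selected + [a]], snr))
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
# 8 receive antennas, 4 transmit streams, i.i.d. Rayleigh channel (toy data)
H = (rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))) / np.sqrt(2)
sel = greedy_antenna_selection(H, k=3)
print(sel, round(capacity(H[sel]), 3))
```

Since adding a receive antenna never decreases the log-det capacity, each greedy step is well defined and the selected subset is at least as good as the best single antenna.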
In this paper, we propose a new sorting algorithm, List Sort, which is based on dynamic memory allocation.
We also compare List Sort with several efficient sorting techniques.
Due to its dynamic nature, List Sort is considerably faster than some conventional comparison-based sorting techniques and comparable to Quick Sort and Merge Sort.
List Sort takes advantage of data that is already sorted in either ascending or descending order.
V1 is a declarative visual query language for schema-based property graphs.
V1 supports property graphs with mixed (both directed and undirected) edges and half-edges, with multivalued and composite properties, and with empty property values.
V1 supports temporal data types, operators, and functions, and can be extended to support additional data types, operators, and functions (one spatiotemporal model is presented).
V1 is generic, concise, has rich expressive power, and is highly receptive and productive.
Evidence in the literature from several business sectors shows that exploratory and exploitative innovation strategies are complementarily important for competitiveness.
Our empirical findings reinforce this evidence in the context of software development companies.
The innovative behaviour of individuals is an essential ingredient to success in both types of innovations strategies and leaders can have a big influence on this behaviour.
Adopting a leadership style that combines transactional and transformational practices is more likely to produce effective results in supporting innovative behaviour.
In software development, project managers and other group leaders should be stimulated and supported in adopting such practices to create the conditions for innovative behaviour to thrive.
We study the outcome of deferred acceptance when prospective medical residents can only apply to a limited set of hospitals.
This limitation requires residents to make a strategic choice about the quality of hospitals they apply to.
Through a mix of theoretical and experimental results, we study the effect of this strategic choice on the preferences submitted by participants, as well as on the overall welfare.
We find that residents' choices in our model mimic the behavior observed in real systems, where individuals apply to a mix of positions consisting mostly of places where they are reasonably likely to be accepted, along with a few "reach" applications to hospitals of very high quality and a few "safe" applications to hospitals below their expected level.
Surprisingly, the number of such "safe" applications is not monotone in the number of allowed applications.
We also find that selfish behavior can hurt social welfare, but the deterioration of overall welfare is very minimal.
This paper presents a new optimal filter, the past observation-based extended Kalman filter, for localizing an Internet-based mobile robot whose control input and feedback measurement suffer from communication delay.
The filter operates through two phases: the time update and the data correction.
The time update predicts the robot position by reformulating the kinematics model to be non-memoryless.
The correction step refines the prediction by extrapolating the delayed measurement to the present and then incorporating it into the current estimate as if there were no delay.
The optimality of the incorporation is ensured by the derivation of a multiplier that reflects the relevance of past observations to the present.
Simulations in MATLAB and experiments in a real networked robot system confirm the validity of the proposed approach.
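The extrapolate-then-correct idea can be illustrated with a minimal 1-D constant-velocity Kalman filter, where a position measurement delayed by a few steps is shifted to the present using the current velocity estimate before the usual correction. This is a toy illustration of the mechanism only, not the paper's optimal multiplier derivation; all parameter values are assumptions.

```python
import numpy as np

dt, d = 0.1, 3                          # time step; measurement delay (steps)
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition for (pos, vel)
Hm = np.array([[1.0, 0.0]])             # position-only measurement
Q = 1e-4 * np.eye(2)                    # process noise covariance (assumed)
R = np.array([[1e-2]])                  # measurement noise covariance (assumed)

x = np.array([0.0, 0.5])                # deliberately wrong initial velocity
P = np.eye(2)
true_v = 1.0
rng = np.random.default_rng(1)

for k in range(1, 101):
    # time update
    x = F @ x
    P = F @ P @ F.T + Q
    if k > d:
        # noisy position measured d steps ago, extrapolated to "now"
        z_delayed = true_v * (k - d) * dt + rng.normal(0.0, 0.1)
        z_now = z_delayed + x[1] * d * dt
        # standard Kalman correction with the extrapolated measurement
        y = np.array([z_now]) - Hm @ x
        S = Hm @ P @ Hm.T + R
        K = P @ Hm.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(2) - K @ Hm) @ P

print(x)
```

Despite the delay and the initially wrong velocity, the estimate converges toward the true position and velocity, since the extrapolation error shrinks as the velocity estimate improves.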
This paper concerns automated vehicles negotiating with other vehicles, typically human-driven, at road crossings, with the goal of finding a decision algorithm by learning the typical behaviors of other vehicles.
The vehicle observes the distance and speed of vehicles on the intersecting road and uses a policy that adapts its speed along its predefined trajectory to pass the crossing efficiently.
Deep Q-learning is used on simulated traffic with different predefined driver behaviors and intentions.
The results show a policy that is able to cross the intersection avoiding collision with other vehicles 98% of the time, while at the same time not being too passive.
Moreover, inferring information over time is important for distinguishing between different intentions, as shown by comparing the collision rates of a Deep Recurrent Q-Network (0.85%) and a Deep Q-Network (1.75%).
This paper presents a new dataset called HUMBI - a large corpus of high fidelity models of behavioral signals in 3D from a diverse population measured by a massive multi-camera system.
With our novel design of a portable imaging system (consisting of 107 HD cameras), we collect human behaviors from 164 subjects across gender, ethnicity, age, and physical condition at a public venue.
Using the multiview image streams, we reconstruct high fidelity models of five elementary parts: gaze, face, hands, body, and cloth.
As a byproduct, the 3D model provides geometrically consistent image annotation via 2D projection, e.g., body part segmentation.
This dataset is a significant departure from existing human datasets, which lack subject diversity.
We hope that HUMBI opens up new opportunities for the development of behavioral imaging.
We report results on benchmarking Open Information Extraction (OIE) systems using RelVis, a toolkit developed for this task.
Our comprehensive benchmark contains three data sets from the news domain and one data set from Wikipedia with overall 4522 labeled sentences and 11243 binary or n-ary OIE relations.
In our analysis on these data sets we compared the performance of four popular OIE systems, ClausIE, OpenIE 4.2, Stanford OpenIE and PredPatt.
In addition, we evaluated the impact of five common error classes on a subset of 749 n-ary tuples.
Our in-depth analysis reveals important research directions for a next generation of OIE systems.
A novel Mathematical Random Number Generator (MRNG) is presented here.
In this case, "mathematical" refers to the fact that to construct that generator it is not necessary to resort to a physical phenomenon, such as the thermal noise of an electronic device, but rather to a mathematical procedure.
The MRNG generates binary strings - in principle, as long as desired - which may be considered genuinely random in the sense that they pass the statistical tests currently accepted to evaluate the randomness of those strings.
From those strings, the MRNG also generates random numbers expressed in base 10.
An MRNG has been installed as a facility on the following web page: http://www.appliedmathgroup.org.
This generator may be used for applications in tasks in: a) computational simulation of probabilistic-type systems, and b) the random selection of samples of different populations.
Users interested in applications in cryptography can build another MRNG, but they would have to withhold information - specified in section 5 - from people who are not authorized to decode messages encrypted using that resource.
In recent years, deep convolutional neural networks (CNNs) have shown record-shattering performance in a variety of computer vision problems, such as visual object recognition, detection and segmentation.
These methods have also been utilised in medical image analysis domain for lesion segmentation, anatomical segmentation and classification.
We present an extensive literature review of CNN techniques applied in brain magnetic resonance imaging (MRI) analysis, focusing on the architectures, pre-processing, data-preparation and post-processing strategies available in these works.
The aim of this study is three-fold.
Our primary goal is to report how different CNN architectures have evolved, discuss state-of-the-art strategies, condense their results obtained using public datasets and examine their pros and cons.
Second, this paper is intended to be a detailed reference of the research activity in deep CNN for brain MRI analysis.
Finally, we present a perspective on the future of CNNs in which we hint some of the research directions in subsequent years.
This paper addresses the problem of video summarization.
Given an input video, the goal is to select a subset of the frames to create a summary video that optimally captures the important information of the input video.
With the large amount of videos available online, video summarization provides a useful tool that assists video search, retrieval, browsing, etc.
In this paper, we formulate video summarization as a sequence labeling problem.
Unlike existing approaches that use recurrent models, we propose fully convolutional sequence models to solve video summarization.
We firstly establish a novel connection between semantic segmentation and video summarization, and then adapt popular semantic segmentation networks for video summarization.
Extensive experiments and analysis on two benchmark datasets demonstrate the effectiveness of our models.
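The dense-labeling view behind this formulation can be sketched in a few lines: per-frame features pass through a temporal convolution and a sigmoid, yielding a per-frame importance score, and the top-scoring frames form the summary. The weights below are random and untrained; this only illustrates the fully convolutional sequence-labeling shape of the computation, not the paper's trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 20, 8
feats = rng.standard_normal((T, d))         # toy per-frame features

K = rng.standard_normal((3, d)) * 0.1       # temporal conv kernel (size 3)
pad = np.pad(feats, ((1, 1), (0, 0)))       # 'same' padding along time
scores = np.array([np.sum(pad[t:t + 3] * K) for t in range(T)])
scores = 1.0 / (1.0 + np.exp(-scores))      # sigmoid -> per-frame score

summary = np.sort(np.argsort(scores)[-5:])  # indices of 5 key frames
print(summary)
```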
Recently, the millimeter wave (mmWave) band has been investigated as a means to support the foreseen extreme data rate demands of emerging automotive applications, which go beyond the capabilities of existing technologies for vehicular communications.
However, this potential is hindered by the severe isotropic path loss and the harsh propagation of high-frequency channels.
Moreover, mmWave signals are typically directional, to benefit from beamforming gain, and require frequent realignment of the beams to maintain connectivity.
These limitations are particularly challenging when considering vehicle-to-vehicle (V2V) transmissions, because of the highly mobile nature of the vehicular scenarios, and pose new challenges for proper vehicular communication design.
In this paper, we conduct simulations to compare the performance of IEEE 802.11p and the mmWave technology to support V2V networking, aiming at providing insights on how both technologies can complement each other to meet the requirements of future automotive services.
The results show that mmWave-based strategies support ultra-high transmission speeds, and IEEE 802.11p systems have the ability to guarantee reliable and robust communications.
This paper describes the monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT17 Shared Task on Multimodal Translation.
We mainly explored two multimodal architectures where either global visual features or convolutional feature maps are integrated in order to benefit from visual context.
Our final systems ranked first for both En-De and En-Fr language pairs according to the automatic evaluation metrics METEOR and BLEU.
This paper presents a short and simple proof of the Four-Color Theorem, fully checkable by human mathematicians without computer assistance.
The new key idea that has permitted it is presented in the Introduction.
One promising trend in digital system integration consists of boosting on-chip communication performance by means of silicon photonics, thus materializing the so-called Optical Networks-on-Chip (ONoCs).
Among them, wavelength routing can be used to route a signal to destination by univocally associating a routing path to the wavelength of the optical carrier.
Such wavelengths should be chosen so as to minimize interference among optical channels and to avoid routing faults.
As a result, physical parameter selection of such networks requires the solution of complex constrained optimization problems.
In previous work, published in the proceedings of the International Conference on Computer-Aided Design, we proposed and solved the problem of computing the maximum parallelism obtainable in the communication between any two endpoints while avoiding misrouting of optical signals.
The underlying technology, only quickly mentioned in that paper, is Answer Set Programming (ASP).
In this work, we detail the ASP approach we used to solve this problem.
Another important design issue is to select the wavelengths of optical carriers such that they are spread across the available spectrum, in order to reduce the likelihood that, due to imperfections in the manufacturing process, unintended routing faults arise.
We show how to address this problem in Constraint Logic Programming over Finite Domains (CLP(FD)).
This paper is under consideration for publication in Theory and Practice of Logic Programming.
Due to the increasing number of mobile robots, including domestic robots for cleaning and maintenance in developed countries, human activity recognition is indispensable for congruent human-robot interaction.
Although this is a challenging task for robots, learning human activities is expedient for autonomous mobile robots (AMRs) navigating uncontrolled environments without guidance.
Building a correct classifier for complex human action is non-trivial since simple actions can be combined to recognize a complex human activity.
In this paper, we trained a model for human activity recognition using a convolutional neural network.
We trained and validated the model using the Vicon physical action dataset and also tested the model on our generated dataset (VMCUHK).
Our experiments show that our method performs the human activity recognition task with high accuracy on both the Vicon physical action dataset and the VMCUHK dataset.
In this work we continue the syntactic study of completeness that began with the works of Immerman and Medina.
In particular, we take up a conjecture raised by Medina in his dissertation: if a conjunction of a second-order sentence and a first-order sentence defines an NP-complete problem via fops, then the second-order conjunct alone must also define an NP-complete problem.
Although this claim looks very plausible and intuitive, currently we cannot provide a definite answer for it.
However, we can answer in the affirmative a weaker claim: all ``consistent'' universal first-order sentences can be safely eliminated without the fear of losing completeness.
Our methods are quite general and can be applied to complexity classes other than NP (in this paper: to NLSPACE, PTIME, and coNP), provided the class has a complete problem satisfying a certain combinatorial property.
Aspects of the properties, enumeration and construction of points on diagonal and Hermitian surfaces have been considered extensively in the literature and are further considered here.
The zeta function of diagonal surfaces is given as a direct result of the work of Wolfmann.
Recursive construction techniques for the set of rational points of Hermitian surfaces are of interest.
The relationship of these techniques here to the construction of codes on surfaces is briefly noted.
Online graph problems are considered in models where the irrevocability requirement is relaxed.
Motivated by practical examples where, for example, there is a cost associated with building a facility and no extra cost associated with doing it later, we consider the Late Accept model, where a request can be accepted at a later point, but any acceptance is irrevocable.
Similarly, we also consider a Late Reject model, where an accepted request can later be rejected, but any rejection is irrevocable (this is sometimes called preemption).
Finally, we consider the Late Accept/Reject model, where late accepts and rejects are both allowed, but any late reject is irrevocable.
For Independent Set, the Late Accept/Reject model is necessary to obtain a constant competitive ratio, but for Vertex Cover the Late Accept model is sufficient and for Minimum Spanning Forest the Late Reject model is sufficient.
The Matching problem has a competitive ratio of 2, but in the Late Accept/Reject model, its competitive ratio is 3/2.
Detection of protein-protein interactions (PPIs) plays a vital role in molecular biology.
Particularly, infections are caused by the interactions of host and pathogen proteins.
It is important to identify host-pathogen interactions (HPIs) to discover new drugs to counter infectious diseases.
Conventional wet lab PPI prediction techniques have limitations in terms of large scale application and budget.
Hence, computational approaches are developed to predict PPIs.
This study aims to develop large margin machine learning models to predict interspecies PPIs with a special interest in host-pathogen protein interactions (HPIs).
Especially, we focus on seeking answers to three queries that arise while developing an HPI predictor.
1) How should we select negative samples?
2) What should be the size of negative samples as compared to the positive samples?
3) What type of margin violation penalty should be used to train the predictor?
We compare two available methods for negative sampling.
Moreover, we propose a new method of assigning weights to each training example in weighted SVM depending on the distance of the negative examples from the positive examples.
We have also developed a web server for our HPI predictor called HoPItor (Host Pathogen Interaction predicTOR) that can predict interactions between human and viral proteins.
This webserver can be accessed at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#HoPItor.
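The distance-based weighting idea for negative examples can be sketched as follows. This is an assumption-laden toy, not the paper's exact scheme: each negative is weighted by its distance to the nearest positive, and a linear SVM stand-in is trained on the weighted hinge loss by plain subgradient descent rather than a production SVM solver.

```python
import numpy as np

def distance_weights(X, y):
    # Weight each negative by its distance to the nearest positive,
    # normalized to (0, 1]; positives keep weight 1.
    pos, neg = X[y == 1], X[y == -1]
    d = np.sqrt(((neg[:, None, :] - pos[None, :, :]) ** 2).sum(-1)).min(1)
    w = np.ones(len(y))
    w[y == -1] = d / d.max()
    return w

def weighted_linear_svm(X, y, w, lam=0.01, lr=0.1, epochs=300):
    # Subgradient descent on the per-example-weighted hinge loss.
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        viol = y * (X @ theta) < 1
        grad = lam * theta - ((w[viol] * y[viol])[:, None] * X[viol]).sum(0) / len(y)
        theta -= lr * grad
    return theta

rng = np.random.default_rng(0)
Xp = rng.normal([2.0, 2.0], 0.5, (40, 2))   # toy positive class
Xn = rng.normal([0.0, 0.0], 0.5, (60, 2))   # toy negative class
X = np.hstack([np.vstack([Xp, Xn]), np.ones((100, 1))])  # bias column
y = np.array([1] * 40 + [-1] * 60)

w = distance_weights(X[:, :2], y)
theta = weighted_linear_svm(X, y, w)
acc = np.mean(np.sign(X @ theta) == y)
print(acc)
```

Down-weighting negatives that lie close to positives reduces the penalty for margin violations by ambiguous examples, which is one plausible reading of distance-dependent weighting.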
In this project we outline a modularized, scalable system for comparing Amazon products in an interactive and informative way using efficient latent variable models and dynamic visualization.
We demonstrate how our system can build on the structure and rich review information of Amazon products in order to provide a fast, multifaceted, and intuitive comparison.
By providing a condensed per-topic comparison visualization to the user, we are able to display aggregate information from the entire set of reviews while providing an interface that is at least as compact as the "most helpful reviews" currently displayed by Amazon, yet far more informative.
Medical imaging is widely used in clinical practice for diagnosis and treatment.
Report-writing can be error-prone for inexperienced physicians, and time-consuming and tedious for experienced physicians.
To address these issues, we study the automatic generation of medical imaging reports.
This task presents several challenges.
First, a complete report contains multiple heterogeneous forms of information, including findings and tags.
Second, abnormal regions in medical images are difficult to identify.
Third, the reports are typically long, containing multiple sentences.
To cope with these challenges, we (1) build a multi-task learning framework which jointly performs the prediction of tags and the generation of paragraphs, (2) propose a co-attention mechanism to localize regions containing abnormalities and generate narrations for them, and (3) develop a hierarchical LSTM model to generate long paragraphs.
We demonstrate the effectiveness of the proposed methods on two publicly available datasets.
During the 1990s, one of us developed a series of freeware routines (http://www.leydesdorff.net/indicators) that enable the user to organize downloads from the Web-of-Science (Thomson Reuters) into a relational database, and then to export matrices for further analysis in various formats (for example, for co-author analysis).
The basic format of the matrices displays each document as a case in a row that can be attributed different variables in the columns.
One limitation to this approach was hitherto that relational databases typically have an upper limit for the number of variables, such as 256 or 1024.
In this brief communication, we report on a way to circumvent this limitation by using txt2Pajek.exe, available as freeware from http://www.pfeffer.at/txt2pajek/.
We present a formal measure-theoretical theory of neural networks (NN) built on probability coupling theory.
Our main contributions are summarized as follows.
* Built on the formalism of probability coupling theory, we derive an algorithm framework, named Hierarchical Measure Group and Approximate System (HMGAS), nicknamed S-System, that is designed to learn the complex hierarchical, statistical dependency in the physical world.
* We show that NNs are special cases of S-System when the probability kernels assume certain exponential family distributions.
Activation Functions are derived formally.
We further endow geometry on NNs through information geometry, show that intermediate feature spaces of NNs are stochastic manifolds, and prove that "distance" between samples is contracted as layers stack up.
* S-System shows NNs are inherently stochastic, and under a set of realistic boundedness and diversity conditions, it enables us to prove that for large size nonlinear deep NNs with a class of losses, including the hinge loss, all local minima are global minima with zero loss errors, and regions around the minima are flat basins where all eigenvalues of Hessians are concentrated around zero, using tools and ideas from mean field theory, random matrix theory, and nonlinear operator equations.
* S-System, the information-geometry structure, and the optimization behaviors together complete the analogy between the Renormalization Group (RG) and NNs.
It shows that a NN is a complex adaptive system that estimates the statistical dependency of microscopic objects, e.g., pixels, at multiple scales.
Unlike the clear-cut physical quantities produced by RG in physics, e.g., temperature, NNs renormalize/recompose manifolds emerging through learning/optimization that divide the sample space into highly semantically meaningful groups dictated by supervised labels (in supervised NNs).
We present a palette-based framework for color composition for visual applications.
Color composition is a critical aspect of visual applications in art, design, and visualization.
The color wheel is often used to explain pleasing color combinations in geometric terms, and, in digital design, to provide a user interface to visualize and manipulate colors.
We abstract relationships between palette colors as a compact set of axes describing harmonic templates over perceptually uniform color wheels.
Our framework provides a basis for a variety of color-aware image operations, such as color harmonization and color transfer, and can be applied to videos.
To enable our approach, we introduce an extremely scalable and efficient yet simple palette-based image decomposition algorithm.
Our approach is based on the geometry of images in RGBXY-space.
This new geometric approach is orders of magnitude more efficient than previous work and requires no numerical optimization.
We demonstrate a real-time layer decomposition tool.
After preprocessing, our algorithm can decompose 6 MP images into layers in 20 milliseconds.
We also conducted three large-scale, wide-ranging perceptual studies on the perception of harmonic colors and harmonization algorithms.
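The geometric core of the RGBXY decomposition can be sketched in a few lines: lift each pixel to a 5-D point (its RGB color plus normalized image coordinates) and take the convex hull of the point cloud, whose vertices serve as candidate extreme colors/positions for layers. This is a sketch of the geometric idea on toy data, not the authors' full decomposition pipeline.

```python
import numpy as np
from scipy.spatial import ConvexHull

h, w = 16, 16
rng = np.random.default_rng(0)
img = rng.random((h, w, 3))                      # toy RGB image in [0, 1]
ys, xs = np.mgrid[0:h, 0:w]
xy = np.stack([xs / (w - 1), ys / (h - 1)], -1)  # normalized coordinates
rgbxy = np.concatenate([img, xy], -1).reshape(-1, 5)

hull = ConvexHull(rgbxy)                         # convex hull in RGBXY-space
palette = rgbxy[hull.vertices]                   # extreme RGBXY points
print(palette.shape)
```

Because every pixel is a convex combination of the hull vertices, barycentric-style weights with respect to these vertices give per-pixel layer opacities, which is the intuition behind the efficiency of the geometric approach.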
Serious privacy and security problems related to online social networks (OSNs) motivated the two complementary studies in this thesis.
In the first study, we developed a general algorithm for the mining of data of targeted organizations by using Facebook (currently the most popular OSN) and socialbots.
By friending employees in a targeted organization, our active socialbots were able to find new employees and informal organizational links that we could not find by crawling with passive socialbots.
We evaluated our method on the Facebook OSN and were able to reconstruct the social networks of employees in three distinct, actual organizations.
Furthermore, in the crawling process with our active socialbots we discovered up to 13.55% more employees and 22.27% more informal organizational links in contrast to the crawling process that was performed by passive socialbots with no company associations as friends.
In our second study, we developed a general algorithm for reaching specific OSN users who declared themselves to be employees of targeted organizations, using the topologies of organizational social networks and utilizing socialbots.
We evaluated the proposed method on targeted users from three actual organizations on Facebook, and two actual organizations on the Xing OSN (another popular OSN platform).
Eventually, our socialbots were able to reach specific users with a success rate of up to 70% on Facebook, and up to 60% on Xing.
Determining semantic similarity between academic documents is crucial to many tasks such as plagiarism detection, automatic technical survey and semantic search.
Current studies mostly focus on semantic similarity between concepts, sentences and short text fragments.
However, document-level semantic matching is still based on surface-level statistical information, neglecting article structures and global semantic meaning, which may cause deviations in document understanding.
In this paper, we focus on the document-level semantic similarity issue for academic literatures with a novel method.
We represent academic articles with topic events that utilize multiple information profiles, such as research purposes, methodologies and domains to integrally describe the research work, and calculate the similarity between topic events based on the domain ontology to acquire the semantic similarity between articles.
Experiments show that our approach achieves significant improvements over state-of-the-art methods.
Lately, the problem of code-switching has gained a lot of attention and has emerged as an active area of research.
In bilingual communities, the speakers commonly embed the words and phrases of a non-native language into the syntax of a native language in their day-to-day communications.
Code-switching is a global phenomenon among multilingual communities, yet very limited acoustic and linguistic resources are available.
For developing effective speech-based applications, the ability of existing language technologies to deal with code-switched data cannot be overemphasized.
Code-switching is broadly classified into two modes: inter-sentential and intra-sentential code-switching.
In this work, we have studied the intra-sentential problem in the context of code-switching language modeling task.
The salient contributions of this paper include: (i) the creation of a Hindi-English code-switching text corpus by crawling a few blogging sites about Internet usage, (ii) the exploration of parts-of-speech features for more effective modeling of Hindi-English code-switched data by a monolingual language model (LM) trained on native (Hindi) language data, and (iii) the proposal of a novel textual factor, referred to as the code-switch factor (CS-factor), which allows the LM to predict code-switching instances.
In the context of recognition of the code-switching data, the substantial reduction in the PPL is achieved with the use of POS factors and also the proposed CS-factor provides independent as well as additive gain in the PPL.
A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent Bernoulli dimensions.
The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks.
In this paper, we have analyzed the information-theoretic PAC-learnability of BMMs, when the number of clusters is unknown.
In particular, we stipulate certain conditions on both sample complexity and the dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a given dataset.
To the best of our knowledge, these findings are the first non-asymptotic (PAC) bounds on the sample complexity of learning BMMs.
Quantitative extraction of high-dimensional mineable data from medical images is a process known as radiomics.
Radiomics is foreseen as an essential prognostic tool for cancer risk assessment and the quantification of intratumoural heterogeneity.
In this work, 1615 radiomic features (quantifying tumour image intensity, shape, texture) extracted from pre-treatment FDG-PET and CT images of 300 patients from four different cohorts were analyzed for the risk assessment of locoregional recurrences (LR) and distant metastases (DM) in head-and-neck cancer.
Prediction models combining radiomic and clinical variables were constructed via random forests and imbalance-adjustment strategies using two of the four cohorts.
Independent validation of the prediction and prognostic performance of the models was carried out on the other two cohorts (LR: AUC = 0.69 and CI = 0.67; DM: AUC = 0.86 and CI = 0.88).
Furthermore, the results obtained via Kaplan-Meier analysis demonstrated the potential of radiomics for assessing the risk of specific tumour outcomes using multiple stratification groups.
This could have important clinical impact, notably by allowing for a better personalization of chemo-radiation treatments for head-and-neck cancer patients from different risk groups.
Various hidden Markov model-based phoneme recognition methods for the Bengali language are reviewed.
Automatic phoneme recognition for Bengali using multilayer neural networks is reviewed.
The usefulness of multilayer neural networks over single-layer neural networks is discussed.
The construction and enhancement of a Bangla phonetic feature table for Bengali speech recognition are also discussed.
Finally, a comparison among these methods is presented.
Classification of multivariate time series (MTS) has been tackled with a large variety of methodologies and applied to a wide range of scenarios.
Among the existing approaches, reservoir computing (RC) techniques, which implement a fixed and high-dimensional recurrent network to process sequential data, are computationally efficient tools to generate a vectorial, fixed-size representation of the MTS that can be further processed by standard classifiers.
Despite their unrivaled training speed, MTS classifiers based on a standard RC architecture fail to achieve the same accuracy as other classifiers, such as those exploiting fully trainable recurrent networks.
In this paper we introduce the reservoir model space, an RC approach to learn vectorial representations of MTS in an unsupervised fashion.
Each MTS is encoded within the parameters of a linear model trained to predict a low-dimensional embedding of the reservoir dynamics.
Our model space yields a powerful representation of the MTS and, thanks to an intermediate dimensionality reduction procedure, attains computational performance comparable to other RC methods.
As a second contribution we propose a modular RC framework for MTS classification, with an associated open source Python library.
By combining the different modules it is possible to seamlessly implement advanced RC architectures, including our proposed unsupervised representation, bidirectional reservoirs, and non-linear readouts, such as deep neural networks with both fixed and flexible activation functions.
Results obtained on benchmark and real-world MTS datasets show that RC classifiers are dramatically faster and, when implemented using our proposed representation, also achieve superior classification accuracy.
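A minimal sketch of the reservoir-model-space idea, assuming a plain echo state reservoir and a ridge regression onto a PCA embedding of the reservoir dynamics; all sizes and hyperparameters below are illustrative choices, not the accompanying library's defaults:

```python
import numpy as np

def reservoir_states(X, n_res=30, rho=0.9, seed=1):
    """Run a fixed random reservoir (echo state network) over an MTS X
    of shape (T, V) and return the state trajectory of shape (T, n_res)."""
    rng = np.random.default_rng(seed)   # same reservoir for every MTS
    V = X.shape[1]
    W_in = rng.uniform(-1, 1, (n_res, V))
    W = rng.uniform(-1, 1, (n_res, n_res))
    W *= rho / np.abs(np.linalg.eigvals(W)).max()   # set spectral radius
    h = np.zeros(n_res)
    states = []
    for x in X:
        h = np.tanh(W_in @ x + W @ h)
        states.append(h)
    return np.array(states)

def model_space_repr(X, d=5, lam=1.0):
    """Encode an MTS as the ridge-regression weights of a linear model
    predicting a d-dimensional PCA embedding of the next reservoir state
    from the current one; the flattened weights are the representation."""
    H = reservoir_states(X)
    Hc = H - H.mean(0)
    _, _, Vt = np.linalg.svd(Hc, full_matrices=False)   # PCA directions
    Z = Hc @ Vt[:d].T                                   # embedded dynamics
    A, B = H[:-1], Z[1:]
    W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ B)
    return W.ravel()

rng = np.random.default_rng(0)
r = model_space_repr(rng.standard_normal((100, 3)))   # 30 * 5 = 150 weights
```

The fixed-size vector `r` can then be fed to any standard classifier, which is the key appeal of the model-space view.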
We evaluate 8 different word embedding models on their usefulness for predicting the neural activation patterns associated with concrete nouns.
The models we consider include an experiential model, based on crowd-sourced association data, several popular neural and distributional models, and a model that reflects the syntactic context of words (based on dependency parses).
Our goal is to assess the cognitive plausibility of these various embedding models, and understand how we can further improve our methods for interpreting brain imaging data.
We show that neural word embedding models exhibit superior performance on the tasks we consider, beating the experiential word representation model.
The syntactically informed model gives the overall best performance when predicting brain activation patterns from word embeddings, whereas the GloVe distributional method gives the overall best performance when predicting in the reverse direction (word vectors from brain images).
Interestingly, however, the error patterns of these different models are markedly different.
This may support the idea that the brain uses different systems for processing different kinds of words.
Moreover, we suggest that taking the relative strengths of different embedding models into account will lead to better models of the brain activity associated with words.
Simulation frameworks are important tools for the analysis and design of communication networks and protocols, but they can be extremely costly and/or complex (in the case of very specialized tools), or too naive and lacking proper features and support (in the case of ad-hoc tools).
In this paper, we present an analysis of three 5G scenarios using 'simmer', a recent R package for discrete-event simulation that sits between the above two paradigms.
As our results show, it provides a simple yet very powerful syntax, supporting the efficient simulation of relatively complex scenarios at a low implementation cost.
We elaborate on the recently proposed orthogonal time frequency space (OTFS) modulation technique, which provides significant advantages over orthogonal frequency division multiplexing (OFDM) in Doppler channels.
We first derive the input--output relation describing OTFS modulation and demodulation (mod/demod) for delay--Doppler channels with arbitrary number of paths, with given delay and Doppler values.
We then propose a low-complexity message passing (MP) detection algorithm, which is suitable for large-scale OTFS taking advantage of the inherent channel sparsity.
Since the fractional Doppler paths (i.e., not exactly aligned with the Doppler taps) produce the inter Doppler interference (IDI), we adapt the MP detection algorithm to compensate for the effect of IDI in order to further improve performance.
Simulation results illustrate the superior performance of OTFS over OFDM under various channel conditions.
Most current single image camera calibration methods rely on specific image features or user input, and cannot be applied to natural images captured in uncontrolled settings.
We propose directly inferring camera calibration parameters from a single image using a deep convolutional neural network.
This network is trained using automatically generated samples from a large-scale panorama dataset, and considerably outperforms other methods, including recent deep learning-based approaches, in terms of standard L2 error.
However, we argue that in many cases it is more important to consider how humans perceive errors in camera estimation.
To this end, we conduct a large-scale human perception study where we ask users to judge the realism of 3D objects composited with and without ground truth camera calibration.
Based on this study, we develop a new perceptual measure for camera calibration, and demonstrate that our deep calibration network outperforms other methods on this measure.
Finally, we demonstrate the use of our calibration network for a number of applications including virtual object insertion, image retrieval and compositing.
In a reversible language, any forward computation can be undone by a finite sequence of backward steps.
Reversible computing has been studied in the context of different programming languages and formalisms, where it has been used for debugging and for enforcing fault-tolerance, among others.
In this paper, we consider a subset of Erlang, a concurrent language based on the actor model.
We formally introduce a reversible semantics for this language.
To the best of our knowledge, this is the first attempt to define a reversible semantics for Erlang.
Whether teaching in a classroom or a Massive Online Open Course it is crucial to present the material in a way that benefits the audience as a whole.
We identify two important tasks to solve towards this objective: (1) group students so that they can maximally benefit from peer interaction, and (2) find an optimal schedule of the educational material for each group.
Thus, in this paper, we solve the problem of team formation and content scheduling for education.
Given a time frame d, a set of students S with their required needs to learn different activities T, and the number k of desired groups, we study the problem of partitioning the students into k groups.
The goal is to find the best schedule for each group so that students are taught within the time frame d and their potential for learning is maximized.
We show this problem to be NP-hard and develop a polynomial algorithm for it.
We show our algorithm to be effective both on synthetic as well as a real data set.
For our experiments, we use real data on students' grades in a Computer Science department.
As part of our contribution, we release a semi-synthetic dataset that mimics the properties of the real data.
This paper proposes CAESAR, a novel multi-leader Generalized Consensus protocol for geographically replicated sites.
The main goal of CAESAR is to overcome one of the major limitations of existing approaches, which is the significant performance degradation when application workload produces conflicting requests.
CAESAR does that by changing the way a fast decision is taken: its ordering protocol does not reject a fast decision for a client request even if a quorum of nodes reply with different dependency sets for that request.
The effectiveness of CAESAR is demonstrated through an evaluation study performed on Amazon's EC2 infrastructure using 5 geo-replicated sites.
CAESAR outperforms other multi-leader (e.g., EPaxos) competitors by as much as 1.7x in the presence of 30% conflicting requests, and single-leader (e.g., Multi-Paxos) by up to 3.5x.
While service-dominant logic proposes that all "Goods are a distribution mechanism for service provision" (FP3), there is a need to understand when and why a firm would utilise direct or indirect (goods) service provision, and the interactions between them, to co-create value with the customer.
Three longitudinal case studies in B2B equipment-based 'complex service' systems were analysed to gain an understanding of customers' co-creation activities to achieve outcomes.
We found the nature of value, the degree of contextual variety, and the firm's legacy to be threats to viability.
To counter these threats, the firm uses (a) direct service provision for scalability and replicability, (b) indirect service provision for variety absorption and for co-creating emotional value and customer experience, and (c) the joint design of direct and indirect provision for scalability and for the customer's absorptive resources.
The co-creation of complex multidimensional value could be delivered through different value propositions of the firm.
The research proposes a value-centric way of understanding the interactions between direct and indirect service provision in the design of the firm's value proposition and proposes a viable systems approach towards reorganising the firm.
The study provides a way for managers to understand the effectiveness (rather than efficiency) of the firm in co-creating value as a major issue in the design of complex socio-technical systems.
Goods are typically designed within the domain of engineering and product design, often placing human activity in a supporting role to the equipment.
Through an S-D logic lens, this study considers the design of both equipment and human activity on an equal footing for value co-creation with the customer, yielding interesting results on when direct provisioning (goods) should be redesigned, considering all activities equally.
Obstacle detection plays an important role in unmanned surface vehicles (USVs).
The USVs operate in highly diverse environments in which an obstacle may be a floating piece of wood, a scuba diver, a pier, or a part of a shoreline, which presents a significant challenge to continuous detection from images taken onboard.
This paper addresses the problem of online detection by constrained unsupervised segmentation.
To this end, a new graphical model is proposed that affords a fast and continuous obstacle image-map estimation from a single video stream captured onboard a USV.
The model accounts for the semantic structure of marine environment as observed from USV by imposing weak structural constraints.
A Markov random field framework is adopted and a highly efficient algorithm for simultaneous optimization of model parameters and segmentation mask estimation is derived.
Our approach does not require computationally intensive extraction of texture features and comfortably runs in real-time.
The algorithm is tested on a new, challenging, dataset for segmentation and obstacle detection in marine environments, which is the largest annotated dataset of its kind.
Results on this dataset show that our model outperforms the related approaches, while requiring a fraction of computational effort.
This paper develops a novel framework for sharing secret keys using the Automatic Repeat reQuest (ARQ) protocol.
We first characterize the underlying information theoretic limits, under different assumptions on the channel spatial and temporal correlation function.
Our analysis reveals a novel role of "dumb antennas" in overcoming the negative impact of spatial correlation on the achievable secrecy rates.
We further develop an adaptive rate allocation policy, which achieves higher secrecy rates in temporally correlated channels, and explicit constructions for ARQ secrecy coding that enjoy low implementation complexity.
Building on this theoretical foundation, we propose a unified framework for ARQ-based secrecy in Wi-Fi networks.
By exploiting the existing ARQ mechanism in the IEEE 802.11 standard, we develop security overlays that offer strong security guarantees at the expense of only minor modifications in the medium access layer.
Our numerical results establish the achievability of non-zero secrecy rates even when the eavesdropper's channel is less noisy, on average, than the legitimate channel, while our Linux-based prototype demonstrates the efficiency of our ARQ overlays in mitigating all known passive and active Wi-Fi attacks at the expense of a minimal increase in link setup time and a small loss in throughput.
Recurrent neural networks (RNNs) are an effective representation of control policies for a wide range of reinforcement and imitation learning problems.
RNN policies, however, are particularly difficult to explain, understand, and analyze due to their use of continuous-valued memory vectors and observation features.
In this paper, we introduce a new technique, Quantized Bottleneck Insertion, to learn finite representations of these vectors and features.
The result is a quantized representation of the RNN that can be analyzed to improve our understanding of memory use and general behavior.
We present results of this approach on synthetic environments and six Atari games.
The resulting finite representations are surprisingly small in some cases, using as few as 3 discrete memory states and 10 observations for a perfect Pong policy.
We also show that these finite policy representations lead to improved interpretability.
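The actual Quantized Bottleneck Insertion learns its discrete bottleneck end-to-end inside the RNN; as a much simpler hedged sketch, the post-hoc step of collapsing a continuous memory trace onto a small set of discrete states can be illustrated with nearest-codebook quantization (the codebook here is hand-picked, not learned):

```python
import numpy as np

def quantize(H, codebook):
    """Map each continuous memory vector in H (N, D) to the nearest
    codebook entry (K, D); returns discrete codes and quantized vectors."""
    d2 = ((H[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = d2.argmin(axis=1)
    return codes, codebook[codes]

rng = np.random.default_rng(0)
# Toy "RNN memory" trace that actually clusters around three states.
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
H = centers[rng.integers(0, 3, 500)] + 0.1 * rng.standard_normal((500, 2))

codes, Hq = quantize(H, centers)
n_states = len(np.unique(codes))   # distinct discrete memory states in use
```

Once memory is discrete, the policy can be read off as a finite-state machine over `codes`, which is what makes the analysis in the abstract tractable.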
In this work we propose a multi-task spatio-temporal network, called SUSiNet, that can jointly tackle the spatio-temporal problems of saliency estimation, action recognition and video summarization.
Our approach employs a single network that is jointly end-to-end trained for all tasks with multiple and diverse datasets related to the exploring tasks.
The proposed network uses a unified architecture that includes global and task-specific layers and produces multiple output types, i.e., saliency maps or classification labels, from the same video input.
Moreover, one additional contribution is that the proposed network can be deeply supervised through an attention module that is related to human attention as it is expressed by eye-tracking data.
From the extensive evaluation, on seven different datasets, we have observed that the multi-task network performs as well as the state-of-the-art single-task methods (or in some cases better), while it requires less computational budget than having one independent network per each task.
Convolutional neural networks (CNNs) have achieved great success on grid-like data such as images, but face tremendous challenges in learning from more generic data such as graphs.
In CNNs, the trainable local filters enable the automatic extraction of high-level features.
The computation with filters requires a fixed number of ordered units in the receptive fields.
However, the number of neighboring units is neither fixed nor are they ordered in generic graphs, thereby hindering the applications of convolutional operations.
Here, we address these challenges by proposing the learnable graph convolutional layer (LGCL).
LGCL automatically selects a fixed number of neighboring nodes for each feature based on value ranking in order to transform graph data into grid-like structures in 1-D format, thereby enabling the use of regular convolutional operations on generic graphs.
To enable model training on large-scale graphs, we propose a sub-graph training method to reduce the excessive memory and computational resource requirements suffered by prior methods on graph convolutions.
Our experimental results on node classification tasks in both transductive and inductive learning settings demonstrate that our methods can achieve consistently better performance on the Cora, Citeseer, Pubmed citation network, and protein-protein interaction network datasets.
Our results also indicate that the proposed methods using sub-graph training strategy are more efficient as compared to prior approaches.
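The neighbor-selection step of LGCL can be sketched as follows; this is a minimal illustrative implementation of per-feature top-k value ranking on a dense adjacency matrix, not the authors' code:

```python
import numpy as np

def lgcl_transform(X, adj, k):
    """For each node, build a (k+1, F) grid: the node's own features,
    followed by, independently per feature, the k largest values among
    its neighbours (zero-padded when fewer than k neighbours exist)."""
    N, F = X.shape
    out = np.zeros((N, k + 1, F))
    for i in range(N):
        out[i, 0] = X[i]
        nbrs = np.flatnonzero(adj[i])
        if len(nbrs):
            top = -np.sort(-X[nbrs], axis=0)[:k]   # k largest per feature
            out[i, 1:1 + top.shape[0]] = top
    return out   # grid-like data, ready for regular 1-D convolutions

# 4-node path graph (0-1-2-3) with 2 features per node.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
X = np.array([[1.0, 0.0], [2.0, 5.0], [3.0, 1.0], [4.0, 2.0]])
G = lgcl_transform(X, adj, k=2)
```

Each `G[i]` is a fixed-size ordered grid, so a standard 1-D convolution can slide over it exactly as over image rows.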
Soon after its introduction in 2009, Bitcoin has been adopted by cyber-criminals, which rely on its pseudonymity to implement virtually untraceable scams.
One of the typical scams that operate on Bitcoin are the so-called Ponzi schemes.
These are fraudulent investments which repay users with the funds invested by new users that join the scheme, and implode when it is no longer possible to find new investments.
Despite being illegal in many countries, Ponzi schemes are now proliferating on Bitcoin, and they keep alluring new victims, who are plundered of millions of dollars.
We apply data mining techniques to detect Bitcoin addresses related to Ponzi schemes.
Our starting point is a dataset of features of real-world Ponzi schemes, that we construct by analysing, on the Bitcoin blockchain, the transactions used to perform the scams.
We use this dataset to experiment with various machine learning algorithms, and we assess their effectiveness through standard validation protocols and performance metrics.
The best of the classifiers we experimented with can identify most of the Ponzi schemes in the dataset, with a low number of false positives.
Neural networks are known to be vulnerable to adversarial examples.
Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild.
To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model.
In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy.
In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense.
SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate.
We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness against adversarial examples.
Experiments demonstrate that SAP confers robustness against attacks, increasing accuracy and preserving calibration.
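The prune-and-rescale step described above can be sketched as follows, assuming the sampling-with-replacement and inverse-keep-probability rescaling formulation; the function name and sizes are illustrative:

```python
import numpy as np

def stochastic_activation_prune(h, keep, rng):
    """Sample `keep` activation indices with replacement, with probability
    proportional to |h|, drop everything not sampled, and rescale each
    survivor by the inverse of its probability of being kept."""
    p = np.abs(h) / np.abs(h).sum()
    picked = np.unique(rng.choice(len(h), size=keep, p=p))
    mask = np.zeros_like(h)
    mask[picked] = 1.0 / (1.0 - (1.0 - p[picked]) ** keep)  # unbiased in expectation
    return h * mask

rng = np.random.default_rng(0)
h = rng.standard_normal(64)            # stand-in for a layer's activations
h_sap = stochastic_activation_prune(h, keep=16, rng=rng)
sparsity = float(np.mean(h_sap == 0))
```

Because the pruning is resampled at every forward pass, an attacker computing gradients sees a different (stochastic) network each time, which is the mixed-strategy intuition.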
Incorporating graphs in the analysis of multivariate signals is becoming a standard way to understand the interdependency of activity recorded at different sites.
The new research frontier in this direction includes the important problem of how to assess dynamic changes of signal activity.
We address this problem in a novel way by defining the graph-variate signal alongside methods for its analysis.
Essentially, graph-variate signal analysis leverages graphs of reliable connectivity information to filter instantaneous bivariate functions of the multivariate signal.
This opens up a new and robust approach to analyse joint signal and network dynamics at sample resolution.
Furthermore, our method can be formulated as instantaneous networks on which standard network analysis can be implemented.
When graph connectivity is estimated from the multivariate signal itself, the appropriate consideration of instantaneous graph signal functions allows for a novel dynamic connectivity measure, graph-variate dynamic (GVD) connectivity, which is robust to spurious short-term dependencies.
In particular, we present appropriate functions for three pertinent connectivity metrics: correlation, coherence, and the phase-lag index.
We show that our approach can detect a single correlated pair of signals against wholly uncorrelated data of up to 128 nodes in signal size (1 out of 8128 weighted edges).
GVD connectivity is also shown to be more robust than i) other GSP approaches at detecting a randomly traveling spheroid on a 3D grid, and ii) standard dynamic connectivity at determining differences between EEG resting-state and task-related activity.
We also demonstrate its use in revealing hidden depth correlations from geophysical gamma ray data.
We expect that the methods and framework presented will provide new approaches to data analysis in a variety of applied settings.
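The core filtering idea can be sketched as follows, using the instantaneous product as the bivariate function (a GVD-correlation-style choice); the setup mirrors the single-correlated-couple experiment on a small scale, with all names and sizes being illustrative:

```python
import numpy as np

def gvd(X, A, f=lambda a, b: a * b):
    """Graph-variate filtering: at each sample t, weight the instantaneous
    bivariate function f(x_i(t), x_j(t)) by the connectivity A[i, j] and
    sum over j, giving a node-by-time map of shape (T, N)."""
    T, N = X.shape
    out = np.empty((T, N))
    for t in range(T):
        F = f(X[t][:, None], X[t][None, :])   # (N, N) instantaneous matrix
        out[t] = (A * F).sum(axis=1)
    return out

rng = np.random.default_rng(2)
T, N = 500, 8
X = rng.standard_normal((T, N))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(T)   # one correlated couple
A = np.abs(np.corrcoef(X.T))                       # connectivity from the signal
np.fill_diagonal(A, 0)
Xs = (X - X.mean(0)) / X.std(0)
D = gvd(Xs, A)                                     # node-by-time GVD map
```

The correlated couple stands out in the time-averaged node values of `D`, while spurious short-term co-fluctuations among the uncorrelated nodes are down-weighted by their weak edges.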
Air pollution poses a serious threat to human health as well as economic development around the world.
To meet the increasing demand for accurate air pollution predictions, we propose a Deep Inferential Spatial-Temporal Network to handle the complicated non-linear spatial and temporal correlations.
We forecast three air pollutants (i.e., PM2.5, PM10, and O3) at monitoring stations over the next 48 hours using a hybrid deep learning model consisting of an inferential predictor (inference for regions without air pollution readings), a spatial predictor (capturing spatial correlations using a CNN), and a temporal predictor (capturing temporal relationships using a sequence-to-sequence model with a simplified attention mechanism).
Our proposed model considers historical air pollution records and historical meteorological data.
We evaluate our model on a large-scale dataset containing air pollution records of 35 monitoring stations and grid meteorological data in Beijing, China.
Our model outperforms other state-of-the-art methods in terms of SMAPE and RMSE.
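The two evaluation metrics above can be sketched as follows; note that SMAPE denominator conventions vary across papers, and this sketch uses the common (|y| + |ŷ|)/2 form with hypothetical readings:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, using the common
    (|y| + |yhat|) / 2 denominator; conventions vary across papers."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    safe = np.where(denom == 0, 1.0, denom)
    ratio = np.where(denom == 0, 0.0, np.abs(y_true - y_pred) / safe)
    return float(np.mean(ratio))

obs = [35.0, 50.0, 80.0]    # hypothetical PM2.5 readings
pred = [30.0, 55.0, 70.0]
```

RMSE penalizes large absolute errors, while SMAPE normalizes per reading, which matters when pollutant levels span very different magnitudes across stations.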
We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using Topic Modeling methods.
Latent Dirichlet Allocation (LDA) is one of the basic methods that is predominantly used to generate topics.
One of the problems with methods like LDA is that users who apply them may not understand the topics that are generated.
Also, users may find it difficult to search correlated topics and correlated documents.
LDAExplore tries to alleviate these problems by visualizing the topic and word distributions generated from the document corpus and allowing the user to interact with them.
The system is designed for users, who have minimal knowledge of LDA or Topic Modelling methods.
To evaluate our design, we run a pilot study which uses the abstracts of 322 Information Visualization papers, where every abstract is considered a document.
The topics generated are then explored by users.
The results show that users are able to find correlated documents and group them based on topics that are similar.
Symmetry is an important factor in human perception in general, as well as in the visualization of graphs in particular.
There are three main types of symmetry: reflective, translational, and rotational.
We report the results of a human subjects experiment to determine what types of symmetries are more salient in drawings of graphs.
We found statistically significant evidence that vertical reflective symmetry is the most dominant (when selecting among vertical reflective, horizontal reflective, and translational).
We also found statistically significant evidence that rotational symmetry is affected by the number of radial axes (the more, the better), with a notable exception at four axes.
Material attributes have been shown to provide a discriminative intermediate representation for recognizing materials, especially for the challenging task of recognition from local material appearance (i.e., regardless of object and scene context).
In the past, however, material attributes have been recognized separately preceding category recognition.
In contrast, neuroscience studies on material perception and computer vision research on object and place recognition have shown that attributes are produced as a by-product during the category recognition process.
Does the same hold true for material attribute and category recognition?
In this paper, we introduce a novel material category recognition network architecture to show that perceptual attributes can, in fact, be automatically discovered inside a local material recognition framework.
The novel material-attribute-category convolutional neural network (MAC-CNN) produces perceptual material attributes from the intermediate pooling layers of an end-to-end trained category recognition network using an auxiliary loss function that encodes human material perception.
To train this model, we introduce a novel large-scale database of local material appearance organized under a canonical material category taxonomy and careful image patch extraction that avoids unwanted object and scene context.
We show that the discovered attributes correspond well with semantically-meaningful visual material traits via Boolean algebra, and enable recognition of previously unseen material categories given only a few examples.
These results have strong implications in how perceptually meaningful attributes can be learned in other recognition tasks.
Pain is a personal, subjective experience that is commonly evaluated through visual analog scales (VAS).
While this is often convenient and useful, automatic pain detection systems can reduce pain score acquisition efforts in large-scale studies by estimating it directly from the participants' facial expressions.
In this paper, we propose a novel two-stage learning approach for VAS estimation: first, our algorithm employs Recurrent Neural Networks (RNNs) to automatically estimate Prkachin and Solomon Pain Intensity (PSPI) levels from face images.
The estimated scores are then fed into personalized Hidden Conditional Random Fields (HCRFs), which are used to estimate the VAS provided by each person.
Personalization of the model is performed using a newly introduced facial expressiveness score, unique for each person.
To the best of our knowledge, this is the first approach to automatically estimate VAS from face images.
We show the benefits of the proposed personalized over traditional non-personalized approach on a benchmark dataset for pain analysis from face images.
Visual media are powerful means of expressing emotions and sentiments.
The constant generation of new content in social networks highlights the need of automated visual sentiment analysis tools.
While Convolutional Neural Networks (CNNs) have established a new state-of-the-art in several vision problems, their application to the task of sentiment analysis is mostly unexplored and there are few studies regarding how to design CNNs for this purpose.
In this work, we study the suitability of fine-tuning a CNN for visual sentiment prediction as well as explore performance boosting techniques within this deep learning setting.
Finally, we provide a deep-dive analysis into a benchmark, state-of-the-art network architecture to gain insight about how to design patterns for CNNs on the task of visual sentiment prediction.
We propose deterministic sampling strategies for compressive imaging based on Delsarte-Goethals frames.
We show that these sampling strategies result in multi-scale measurements which can be related to the 2D Haar wavelet transform.
We demonstrate the effectiveness of our proposed strategies through numerical experiments.
In this paper we present a new efficient algorithm for factoring the RSA and the Rabin moduli in the particular case when the difference between their two prime factors is bounded.
As an extension, we also give some theoretical results on factoring integers.
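The setting above (a bounded gap between the prime factors) is the classic territory of Fermat's factorization method, which can be sketched as follows; this is the textbook algorithm, not necessarily the paper's improved one:

```python
from math import isqrt

def fermat_factor(n):
    """Factor odd n = p * q by searching for a with a**2 - n a perfect
    square b**2, so n = (a - b)(a + b); the search starts at ceil(sqrt(n)),
    so it terminates quickly exactly when |p - q| is small."""
    a = isqrt(n)
    if a * a < n:
        a += 1
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:
            return a - b, a + b
        a += 1

p, q = fermat_factor(10403)   # 101 * 103: primes only 2 apart
```

The number of iterations grows with (p - q)² / (8 · sqrt(n)), which is why RSA key generation deliberately avoids close primes.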
We introduce a general framework for visual forecasting, which directly imitates visual sequences without additional supervision.
As a result, our model can be applied at several semantic levels and does not require any domain knowledge or handcrafted features.
We achieve this by formulating visual forecasting as an inverse reinforcement learning (IRL) problem, and directly imitate the dynamics in natural sequences from their raw pixel values.
The key challenge is the high-dimensional and continuous state-action space that prohibits the application of previous IRL algorithms.
We address this computational bottleneck by extending recent progress in model-free imitation with trainable deep feature representations, which (1) bypasses the exhaustive state-action pair visits in dynamic programming by using a dual formulation and (2) avoids explicit state sampling at gradient computation using a deep feature reparametrization.
This allows us to apply IRL at scale and directly imitate the dynamics in high-dimensional continuous visual sequences from the raw pixel values.
We evaluate our approach at three different levels of abstraction, from low-level pixels to higher-level semantics: future frame generation, action anticipation, and visual story forecasting.
At all levels, our approach outperforms existing methods.
This is the preprint version of our paper on IEEE Virtual Reality Conference 2015.
A touch-less interaction technology on vision based wearable device is designed and evaluated.
Users interact with the application with dynamic hands/feet gestures in front of the camera.
Several proof-of-concept prototypes with eleven dynamic gestures are developed based on the touch-less interaction.
Finally, a comparative user study is conducted to demonstrate the usability of the touch-less approach, as well as its impact on the user's emotion, running on a wearable framework or Google Glass.
During motor imagery tasks, the so-called mu and beta event-related desynchronization (ERD) and synchronization (ERS) take place, allowing us to determine a patient's imagined movement.
However, initial recordings of electroencephalography (EEG) signals contain system and environmental noise, as well as interference, that must be removed in order to separate the ERS/ERD events from the rest of the signal.
This paper presents a new technique based on a reworked Second Order Blind Identification (SOBI) algorithm for noise removal while imagery movement classification is implemented using Support Vector Machine (SVM) technique.
Efforts to automate the reconstruction of neural circuits from 3D electron microscopic (EM) brain images are critical for the field of connectomics.
An important computation for reconstruction is the detection of neuronal boundaries.
Images acquired by serial section EM, a leading 3D EM technique, are highly anisotropic, with inferior quality along the third dimension.
For such images, the 2D max-pooling convolutional network has set the standard for performance at boundary detection.
Here we achieve a substantial gain in accuracy through three innovations.
First, following the trend towards deeper networks for object recognition, we use a much deeper network than previously employed for boundary detection.
Second, we incorporate 3D as well as 2D filters, to enable computations that use 3D context.
Finally, we adopt a recursively trained architecture in which a first network generates a preliminary boundary map that is provided as input along with the original image to a second network that generates a final boundary map.
Backpropagation training is accelerated by ZNN, a new implementation of 3D convolutional networks that uses multicore CPU parallelism for speed.
Our hybrid 2D-3D architecture could be more generally applicable to other types of anisotropic 3D images, including video, and our recursive framework to any image labeling problem.
Action recognition and human pose estimation are closely related but both problems are generally handled as distinct tasks in the literature.
In this work, we propose a multitask framework for joint 2D and 3D pose estimation from still images and human action recognition from video sequences.
We show that a single architecture can be used to solve the two problems in an efficient way and still achieves state-of-the-art results.
Additionally, we demonstrate that end-to-end optimization leads to significantly higher accuracy than separate learning.
The proposed architecture can be trained with data from different categories simultaneously in a seamless way.
The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU) demonstrate the effectiveness of our method on the targeted tasks.
This paper introduces a fast algorithm for simultaneous inversion and determinant computation of small sized matrices in the context of fully Polarimetric Synthetic Aperture Radar (PolSAR) image processing and analysis.
The proposed fast algorithm is based on the computation of the adjoint matrix and the symmetry of the input matrix.
The algorithm is implemented in a general purpose graphical processing unit (GPGPU) and compared to the usual approach based on Cholesky factorization.
The assessment with simulated observations and data from an actual PolSAR sensor shows a speedup factor of about two when compared to the usual Cholesky factorization.
Moreover, the expressions provided here can be implemented in any platform.
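The adjugate idea is easy to demonstrate outside a GPGPU setting. Below is a minimal NumPy sketch (an illustration of the closed-form approach, not the authors' implementation) that returns the determinant and inverse of a 3x3 matrix via its cofactors:

```python
import numpy as np

def inv_det_3x3(m):
    """Closed-form determinant and inverse of a 3x3 matrix via its adjugate.

    Computing cofactors directly avoids a factorization step, which is the
    idea exploited (together with Hermitian symmetry) in the fast algorithm.
    """
    c = np.empty((3, 3), dtype=m.dtype)  # cofactor matrix
    for i in range(3):
        for j in range(3):
            minor = np.delete(np.delete(m, i, axis=0), j, axis=1)
            c[i, j] = (-1) ** (i + j) * (minor[0, 0] * minor[1, 1]
                                         - minor[0, 1] * minor[1, 0])
    # Laplace expansion along the first row, then inv = adj(m) / det
    det = m[0, 0] * c[0, 0] + m[0, 1] * c[0, 1] + m[0, 2] * c[0, 2]
    return det, c.T / det
```

For Hermitian inputs (as in PolSAR covariance matrices), roughly half of the cofactor computations could additionally be shared, which is part of the reported speedup.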
The research on hashing techniques for visual data is gaining increased attention in recent years due to the need for compact representations supporting efficient search/retrieval in large-scale databases such as online images.
Among many possibilities, Mean Average Precision (mAP) has emerged as the dominant performance metric for hashing-based retrieval.
One glaring shortcoming of mAP is its inability in balancing retrieval accuracy and utilization of hash codes: pushing a system to attain higher mAP will inevitably lead to poorer utilization of the hash codes.
Poor utilization of the hash codes hinders good retrieval because of increased collision of samples in the hash space.
This means that a model giving higher mAP values does not necessarily do a better job in retrieval.
In this paper, we introduce a new metric named Mean Local Group Average Precision (mLGAP) for better evaluation of the performance of hashing-based retrieval.
The new metric provides a retrieval performance measure that also reconciles the utilization of hash codes, leading to a more practically meaningful performance metric than conventional ones like mAP.
To this end, we start with a mathematical analysis of the deficiencies of mAP for hashing-based retrieval.
We then propose mLGAP and show why it is more appropriate for hashing-based retrieval.
Experiments on image retrieval are used to demonstrate the effectiveness of the proposed metric.
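For reference, mAP is built from per-query average precision. A minimal sketch of AP over a binary relevance ranking (illustrative only; the proposed mLGAP additionally reconciles hash-code utilization):

```python
def average_precision(ranked_relevance):
    """AP for one query: the mean of precision@k over the ranks k at which
    a relevant item appears. mAP is this value averaged over all queries."""
    hits, precisions = 0, []
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0
```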
We define a compilation scheme for a constructor-based, strongly-sequential, graph rewriting system which shortcuts some needed steps.
The object code is another constructor-based graph rewriting system.
This system is normalizing for the original system when using an innermost strategy.
Consequently, the object code can be easily implemented by eager functions in a variety of programming languages.
We modify this object code in a way that avoids total or partial construction of the contracta of some needed steps of a computation.
When computing normal forms in this way, both memory consumption and execution time are reduced compared to ordinary rewriting computations in the original system.
Partial mutual exclusion is the drinking philosophers problem for complete graphs.
It is the problem that a process may enter a critical section CS of its code only when some finite set nbh of other processes are not in their critical sections.
For each execution of CS, the set nbh can be given by the environment.
We present a starvation free solution of this problem in a setting with infinitely many processes, each with finite memory, that communicate by asynchronous messages.
The solution has the property of first-come first-served, in so far as this can be guaranteed by asynchronous messages.
For every execution of CS and every process in nbh, between three and six messages are needed.
The correctness of the solution is argued with invariants and temporal logic.
It has been verified with the proof assistant PVS.
This paper investigates exploration strategies of Deep Reinforcement Learning (DRL) methods to learn navigation policies for mobile robots.
In particular, we augment the normal external reward for training DRL algorithms with intrinsic reward signals measured by curiosity.
We test our approach in a mapless navigation setting, where the autonomous agent is required to navigate without the occupancy map of the environment, to targets whose relative locations can be easily acquired through low-cost solutions (e.g., visible light localization, Wi-Fi signal localization).
We validate that the intrinsic motivation is crucial for improving DRL performance in tasks with challenging exploration requirements.
Our experimental results show that our proposed method is able to more effectively learn navigation policies, and has better generalization capabilities in previously unseen environments.
A video of our experimental results can be found at https://goo.gl/pWbpcF.
This paper investigates the fully distributed cooperation scheme for networked nonholonomic mobile manipulators.
To achieve cooperative task allocation in a distributed way, an adaptation-based estimation law is established for each robotic agent to estimate the desired local trajectory.
In addition, wrench synthesis is analyzed in detail to lay a solid foundation for tight cooperation tasks.
Together with the estimated task, a set of distributed adaptive control laws is proposed to achieve motion synchronization of the mobile manipulator ensemble over a directed graph with a spanning tree, irrespective of the kinematic and dynamic uncertainties in both the mobile manipulators and the tightly grasped object.
The controlled synchronization alleviates the performance degradation caused by the estimation/tracking discrepancy during transient phase.
Persistent excitation condition and noisy Cartesian-space velocities are totally avoided.
Furthermore, the proposed scheme is independent of the object's center of mass by employing formation-based task allocation and a task-oriented strategy.
These attractive attributes facilitate its practical application.
It is theoretically proved that convergence of the cooperative task tracking error is guaranteed.
Simulation results validate the efficacy and demonstrate the expected performance of the proposed scheme.
For a given set of intervals on the real line, we consider the problem of ordering the intervals with the goal of minimizing an objective function that depends on the exposed interval pieces (that is, the pieces that are not covered by earlier intervals in the ordering).
This problem is motivated by an application in molecular biology that concerns the determination of the structure of the backbone of a protein.
We present polynomial-time algorithms for several natural special cases of the problem that cover the situation where the interval boundaries are agreeably ordered and the situation where the interval set is laminar.
Also the bottleneck variant of the problem is shown to be solvable in polynomial time.
Finally we prove that the general problem is NP-hard, and that the existence of a constant-factor-approximation algorithm is unlikely.
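To make the objective concrete, here is a small brute-force sketch: it computes the exposed length of each interval under an ordering, and exhaustively searches orderings for a weighted-sum objective. The weighted sum is one hypothetical instance of the general objective class; the paper's polynomial-time algorithms apply to special cases, not this exhaustive search.

```python
from itertools import permutations

def exposed_lengths(order, intervals):
    """For each interval, the total length not covered by intervals earlier
    in `order`. Covered regions are kept as disjoint, merged [lo, hi) pieces."""
    covered, result = [], {}
    for idx in order:
        lo, hi = intervals[idx]
        exposed = hi - lo
        for c_lo, c_hi in covered:  # pieces are disjoint, so overlaps add up
            exposed -= max(0.0, min(hi, c_hi) - max(lo, c_lo))
        result[idx] = exposed
        covered.append((lo, hi))    # merge the new interval into the cover
        covered.sort()
        merged = [covered[0]]
        for c_lo, c_hi in covered[1:]:
            if c_lo <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], c_hi))
            else:
                merged.append((c_lo, c_hi))
        covered = merged
    return result

def best_order(intervals, weights):
    """Exhaustive search minimizing a weighted sum of exposed lengths."""
    def cost(order):
        exp = exposed_lengths(order, intervals)
        return sum(weights[i] * exp[i] for i in exp)
    return min(permutations(range(len(intervals))), key=cost)
```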
Convolutional neural networks (CNN) have recently achieved remarkable performance in a wide range of applications.
In this research, we equip convolutional sequence-to-sequence (seq2seq) model with an efficient graph linearization technique for abstract meaning representation parsing.
Our linearization method is better than the prior method at signaling the turns of graph traversal.
Additionally, convolutional seq2seq model is more appropriate and considerably faster than the recurrent neural network models in this task.
Our method outperforms previous methods by a large margin on the standard dataset LDC2014T12.
Our results indicate that future work still has room to improve parsing models using graph linearization approaches.
This paper introduces an innovative approach for handling 2D compound hypotheses within the Belief Function Theory framework.
We propose a polygon-based generic representation which relies on polygon clipping operators.
This approach allows the computational cost to account for the precision of the representation, independently of the cardinality of the frame of discernment.
For the BBA combination and decision making, we propose efficient algorithms which rely on hashes for fast lookup, and on a topological ordering of the focal elements within a directed acyclic graph encoding their interconnections.
Additionally, an implementation of the functionalities proposed in this paper is provided as an open source library.
Experimental results on a pedestrian localization problem are reported.
The experiments show that the solution is accurate and that it fully benefits from the scalability of the 2D search space granularity provided by our representation.
For ergodic fading, a lattice coding and decoding strategy is proposed and its performance is analyzed for the single-input single-output (SISO) and multiple-input multiple-output (MIMO) point-to-point channel as well as the multiple-access channel (MAC), with channel state information available only at the receiver (CSIR).
At the decoder a novel strategy is proposed consisting of a time-varying equalization matrix followed by decision regions that depend only on channel statistics, not individual realizations.
Our encoder has a similar structure to that of Erez and Zamir.
For the SISO channel, the gap to capacity is bounded by a constant under a wide range of fading distributions.
For the MIMO channel under Rayleigh fading, the rate achieved is within a gap to capacity that does not depend on the signal-to-noise ratio (SNR), and diminishes with the number of receive antennas.
The analysis is extended to the K-user MAC where similar results hold.
Achieving a small gap to capacity while limiting the use of CSIR to the equalizer highlights the scope for efficient decoder implementations, since decision regions are fixed, i.e., independent of channel realizations.
In an organization, individuals prefer to form various formal and informal groups for mutual interactions.
Therefore, ubiquitous identification of such groups and understanding their dynamics are important to monitor activities, behaviours and well-being of the individuals.
In this paper, we develop a lightweight, yet near-accurate, methodology, called MeetSense, to identify various interacting groups based on collective sensing through users' smartphones.
Group detection from sensor signals is not straightforward because users in proximity may not always be under the same group.
Therefore, we use acoustic context extracted from audio signals to infer interaction pattern among the subjects in proximity.
We have developed an unsupervised and lightweight mechanism for user group detection by taking cues from network science and measuring the cohesivity of the detected groups in terms of modularity.
Taking modularity into consideration, MeetSense can efficiently eliminate incorrect groups, as well as adapt the mechanism depending on the role played by the proximity and the acoustic context in a specific scenario.
The proposed method has been implemented and tested under many real-life scenarios in an academic institute environment, and we observe that MeetSense can identify user groups with close to 90% accuracy even in a noisy environment.
Recognizing textual entailment (RTE) is a fundamental task in a variety of text mining and natural language processing applications.
This paper proposes a simple neural model for the RTE problem.
It first matches each word in the hypothesis with its most-similar word in the premise, producing an augmented representation of the hypothesis conditioned on the premise as a sequence of word pairs.
The LSTM model is then used to model this augmented sequence, and the final output from the LSTM is fed into a softmax layer to make the prediction.
Besides the base model, in order to enhance its performance, we also propose three techniques: the integration of multiple word-embedding libraries, bi-way integration, and ensembling based on model averaging.
Experimental results on the SNLI dataset have shown that the three techniques are effective in boosting the predictive accuracy and that our method outperforms several state-of-the-art ones.
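The word-matching step described above can be sketched as follows; the unit vectors in the test are hypothetical stand-ins for trained word embeddings:

```python
import numpy as np

def match_hypothesis(premise_vecs, hypothesis_vecs):
    """For each hypothesis word vector, return the index of the most
    cosine-similar premise word. These pairs would form the augmented
    (hypothesis word, matched premise word) sequence fed to the LSTM."""
    p = premise_vecs / np.linalg.norm(premise_vecs, axis=1, keepdims=True)
    h = hypothesis_vecs / np.linalg.norm(hypothesis_vecs, axis=1, keepdims=True)
    sim = h @ p.T                # cosine similarity matrix
    return sim.argmax(axis=1)    # best premise match per hypothesis word
```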
One of the most influential recent results in network analysis is that many natural networks exhibit a power-law or log-normal degree distribution.
This has inspired numerous generative models that match this property.
However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real life networks due to their differences on other important metrics like conductance.
We believe this is, in part, because many of these real-world networks have very different joint degree distributions, i.e. the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative.
We suggest that understanding the relationship between network structure and the joint degree distribution of graphs is an interesting avenue of further research.
Important tools for such studies are algorithms that can generate random instances of graphs with the same joint degree distribution.
This is the main topic of this paper and we study the problem from both a theoretical and practical perspective.
We provide an algorithm for constructing simple graphs from a given joint degree distribution, and a Monte Carlo Markov Chain method for sampling them.
We also show that the state space of simple graphs with a fixed joint degree distribution is connected via endpoint switches.
We empirically evaluate the mixing time of this Markov Chain by using experiments based on the autocorrelation of each edge.
These experiments show that our Markov Chain mixes quickly on real graphs, allowing for utilization of our techniques in practice.
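A single endpoint-switch proposal of the kind underlying such a Markov chain can be sketched as follows (an illustration of the move, not the paper's full sampler):

```python
import random

def endpoint_switch(edges, degree, rng=random):
    """One Markov-chain proposal: pick two edges (u, v) and (x, y) with
    deg(v) == deg(y) and rewire them to (u, y) and (x, v). All degrees and
    the joint degree distribution are unchanged; the move is rejected if it
    would create a self-loop or a multi-edge (to stay within simple graphs)."""
    edge_set = {frozenset(e) for e in edges}
    (u, v), (x, y) = rng.sample(edges, 2)
    if degree[v] != degree[y]:
        return edges                      # not a valid switch, reject
    new1, new2 = frozenset((u, y)), frozenset((x, v))
    if len(new1) < 2 or len(new2) < 2 or new1 in edge_set or new2 in edge_set:
        return edges                      # self-loop or multi-edge, reject
    rest = [e for e in edges if e not in ((u, v), (x, y))]
    return rest + [(u, y), (x, v)]
```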
Neglecting the effects of rolling-shutter cameras for visual odometry (VO) severely degrades accuracy and robustness.
In this paper, we propose a novel direct monocular VO method that incorporates a rolling-shutter model.
Our approach extends direct sparse odometry which performs direct bundle adjustment of a set of recent keyframe poses and the depths of a sparse set of image points.
We estimate the velocity at each keyframe and impose a constant-velocity prior for the optimization.
In this way, we obtain a near real-time, accurate direct VO method.
Our approach achieves improved results on challenging rolling-shutter sequences over state-of-the-art global-shutter VO.
The most efficient algorithms for finding maximum independent sets in both theory and practice use reduction rules to obtain a much smaller problem instance called a kernel.
The kernel can then be solved quickly using exact or heuristic algorithms - or by repeatedly kernelizing recursively in the branch-and-reduce paradigm.
It is of critical importance for these algorithms that kernelization is fast and returns a small kernel.
Current algorithms are either slow but produce a small kernel, or fast and give a large kernel.
We attempt to accomplish both of these goals simultaneously, by giving an efficient parallel kernelization algorithm based on graph partitioning and parallel bipartite maximum matching.
We combine our parallelization techniques with two techniques to accelerate kernelization further: dependency checking that prunes reductions that cannot be applied, and reduction tracking that allows us to stop kernelization when reductions become less fruitful.
Our algorithm produces kernels that are orders of magnitude smaller than the fastest kernelization methods, while having a similar execution time.
Furthermore, our algorithm is able to compute kernels with size comparable to the smallest known kernels, but up to two orders of magnitude faster than previously possible.
Finally, we show that our kernelization algorithm can be used to accelerate existing state-of-the-art heuristic algorithms, allowing us to find larger independent sets faster on large real-world networks and synthetic instances.
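As a flavor of kernelization, here is one classic sequential reduction rule (degree-1 removal) for maximum independent set; the paper's contribution is applying many such rules quickly and in parallel, which this sketch does not attempt:

```python
def pendant_reduction(adj):
    """Degree-1 reduction for maximum independent set: a vertex v with a
    single neighbor u can always be taken into the solution, so v and u
    are removed. Applied exhaustively, this shrinks the instance toward
    a kernel. `adj` maps each vertex to the set of its neighbors."""
    adj = {v: set(ns) for v, ns in adj.items()}
    chosen, changed = [], True
    while changed:
        changed = False
        for v in list(adj):
            if v in adj and len(adj[v]) == 1:
                (u,) = adj[v]
                chosen.append(v)
                for w in adj.pop(u):      # delete u and its incident edges
                    if w != v:
                        adj[w].discard(u)
                adj.pop(v, None)          # delete v itself
                changed = True
    return chosen, adj
```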
The classical control and management plane for computer networks addresses individual parameters of protocol layers within an individual wireless network device.
We argue that this is not sufficient in the face of the increasing deployment of highly re-configurable systems, as well as heterogeneous wireless systems co-existing in the same radio spectrum, which demand harmonized, frequently even coordinated, adaptation of multiple parameters in different protocol layers (cross-layer) and in multiple network devices (cross-node).
We propose UniFlex, a framework enabling unified and flexible radio and network control.
It provides an API enabling coordinated cross-layer control and management operation over multiple network nodes.
The controller logic may be implemented either in a centralized or distributed manner.
This allows placing time-sensitive control functions close to the controlled device (i.e., as local control applications), off-loading more resource-hungry network applications to compute servers, and making them work together to control the entire network.
The UniFlex framework was prototypically implemented and provided to the research community as open-source.
We evaluated the framework in a number of use cases, which demonstrated its usability.
This paper investigates the offline packet-delay-minimization problem for an energy harvesting transmitter.
To overcome the non-convexity of the problem, we propose a C2-diffeomorphic transformation and provide the necessary and sufficient condition for the transformed problem to be a standard convex optimization problem.
Based on this condition, a simple choice of the transformation is determined which allows an analytically tractable solution of the original non-convex problem to be easily obtained once the transformed convex problem is solved.
We further study the structure of the optimal transmission policy in a special case and find it to follow a weighted-directional-water-filling structure.
In particular, the optimal policy tends to allocate more power in earlier time slots and less power in later time slots.
Our analytical insight is verified by simulation results.
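The water-filling structure is easiest to see in its classic (unweighted, non-directional) form; the sketch below bisects on the water level for parallel channels and is only an illustration of the structure the paper generalizes to weighted, directional allocation over time slots:

```python
def water_filling(gains, power_budget, tol=1e-9):
    """Classic water-filling: allocate power p_i = max(0, mu - 1/g_i),
    choosing the water level mu so that sum(p_i) equals the budget.
    mu is found by bisection, since total used power grows with mu."""
    lo, hi = 0.0, power_budget + max(1.0 / g for g in gains)
    while hi - lo > tol:
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / g) for g in gains)
        if used > power_budget:
            hi = mu
        else:
            lo = mu
    mu = (lo + hi) / 2
    return [max(0.0, mu - 1.0 / g) for g in gains]
```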
Software-defined networking (SDN) is a new paradigm that allows developing more flexible network applications.
SDN controller, which represents a centralized controlling point, is responsible for running various network applications as well as maintaining different network services and functionalities.
Choosing an efficient intrusion detection system helps in reducing the overhead of the running controller and creates a more secure network.
In this study, we investigate the performance of the well-known anomaly-based intrusion detection approaches in terms of accuracy, false alarm rate, precision, recall, F1-measure, area under the ROC curve, execution time, and McNemar's test.
Precisely, we focus on supervised machine-learning approaches using the following classifiers: Decision Trees (DT), Extreme Learning Machine (ELM), Naive Bayes (NB), Linear Discriminant Analysis (LDA), Neural Networks (NN), Support Vector Machines (SVM), Random Forest (RF), K-Nearest-Neighbour (KNN), AdaBoost, RUSBoost, LogitBoost and BaggingTrees, and we employ the well-known NSL-KDD benchmark dataset to compare the performance of each of these classifiers.
A simulation model is presented to analyze and evaluate the performance of VoIP over an integrated wireless LAN/WAN, taking into account various voice encoding schemes.
The network model was simulated using OPNET Modeler software.
Different parameters that indicate QoS, like MOS, jitter, end-to-end delay, traffic sent and traffic received, are calculated and analyzed in wireless LAN/WAN scenarios.
Based on this evaluation, the G.729A codec is considered the best choice for VoIP.
A method based on classical principal component analysis demonstrates that the role of co-authors should give a group leader an h-index measure higher than usually accepted.
The method rather easily gives what is usually searched for, i.e., an estimate of the role (or "weight") of co-authors, as the additional value to an author's paper popularity.
The construction of the co-authorship popularity H-matrix is exemplified and the role of eigenvalues and the main eigenvector component are discussed.
An example illustrates the points and serves as the basis for suggesting a generally practical application of the concept.
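For reference, the standard h-index that the discussion above proposes to adjust can be computed directly from a list of citation counts:

```python
def h_index(citations):
    """h-index: the largest h such that at least h papers have
    at least h citations each."""
    citations = sorted(citations, reverse=True)
    h = 0
    while h < len(citations) and citations[h] >= h + 1:
        h += 1
    return h
```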
In this paper we present a biorealistic model for the first part of the early vision processing by incorporating memristive nanodevices.
The architecture of the proposed network is based on the organisation and functioning of the outer plexiform layer (OPL) in the vertebrate retina.
We demonstrate that memristive devices are indeed a valuable building block for neuromorphic architectures, as their highly non-linear and adaptive response could be exploited for establishing ultra-dense networks with similar dynamics to their biological counterparts.
We particularly show that hexagonal memristive grids can be employed for faithfully emulating the smoothing effect occurring at the OPL for enhancing the dynamic range of the system.
In addition, we employ a memristor-based thresholding scheme for detecting the edges of grayscale images, while the proposed system is also evaluated for its adaptation and fault tolerance capacity against different light or noise conditions as well as distinct device yields.
Extraction of local feature descriptors is a vital stage in the solution pipelines for numerous computer vision tasks.
Learning-based approaches improve performance in certain tasks, but still cannot replace handcrafted features in general.
In this paper, we improve the learning of local feature descriptors by optimizing the performance of descriptor matching, which is a common stage that follows descriptor extraction in local feature based pipelines, and can be formulated as nearest neighbor retrieval.
Specifically, we directly optimize a ranking-based retrieval performance metric, Average Precision, using deep neural networks.
This general-purpose solution can also be viewed as a listwise learning to rank approach, which is advantageous compared to recent local ranking approaches.
On standard benchmarks, descriptors learned with our formulation achieve state-of-the-art results in patch verification, patch retrieval, and image matching.
In this paper, we define the general framework to describe the diffusion operators associated to a positive matrix.
We define the equations associated to diffusion operators and present some general properties of their state vectors.
We show how this can be applied to prove and improve the convergence of a fixed-point problem associated with the matrix iteration scheme, including in distributed computation frameworks.
The approach can be understood as a decomposition of the matrix-vector product operation in elementary operations at the vector entry level.
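The entry-level decomposition can be illustrated with a Gauss-Seidel-style fixed-point iteration for x = Ax + b, where each update touches a single vector entry (a sketch under the assumption that the spectral radius of A is below one, so the iteration converges):

```python
import numpy as np

def entrywise_fixed_point(A, b, sweeps=200):
    """Solve x = A x + b by updating one entry at a time, i.e. the
    matrix-vector product decomposed into elementary per-entry operations.
    Each entry update can in principle run on a different worker, which is
    the distributed-computation angle of the decomposition."""
    A, b = np.asarray(A), np.asarray(b)
    x = np.zeros(len(b))
    for _ in range(sweeps):
        for i in range(len(b)):
            x[i] = A[i] @ x + b[i]   # elementary update of entry i
    return x
```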
Recently virtual platforms and virtual prototyping techniques have been widely applied for accelerating software development in electronics companies.
It has been proved that these techniques can greatly shorten time-to-market and improve product quality.
One challenge is how to test and validate a virtual prototype.
In this paper, we present how to conduct regression testing of virtual prototypes in different versions using symbolic execution.
Given old and new versions of a virtual prototype, we first apply symbolic execution to the new version and collect all path constraints.
Then the collected path constraints are used for guiding the symbolic execution of the old version.
For each path explored, we compare the device states between two versions to check if they behave the same.
We have applied this approach to a widely-used virtual prototype and detected numerous differences.
The experimental results show that our approach is useful and efficient.
We explore the concept of co-design in the context of neural network verification.
Specifically, we aim to train deep neural networks that not only are robust to adversarial perturbations but also whose robustness can be verified more easily.
To this end, we identify two properties of network models - weight sparsity and so-called ReLU stability - that turn out to significantly impact the complexity of the corresponding verification task.
We demonstrate that improving weight sparsity alone already enables us to turn computationally intractable verification problems into tractable ones.
Then, improving ReLU stability leads to an additional 4-13x speedup in verification times.
An important feature of our methodology is its "universality," in the sense that it can be used with a broad range of training procedures and verification approaches.
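ReLU stability is simple to check once pre-activation interval bounds are available (assumed here to come from, e.g., interval bound propagation); the sketch below counts stably active, stably inactive, and unstable units:

```python
import numpy as np

def relu_stability(lower, upper):
    """Given elementwise pre-activation bounds [lower, upper], a ReLU is
    *stable* if its sign is fixed over the whole input region: always
    active (lower >= 0) or always inactive (upper <= 0). Unstable ReLUs
    force exact verifiers to branch, which is why increasing stability
    speeds up verification."""
    active = lower >= 0
    inactive = upper <= 0
    unstable = ~(active | inactive)
    return int(active.sum()), int(inactive.sum()), int(unstable.sum())
```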
This paper presents a super-efficient spatially adaptive contrast enhancement algorithm for enhancing infrared (IR) radiation based superficial vein images in real-time.
The super-efficiency permits the algorithm to run in consumer-grade handheld devices, which ultimately reduces the cost of vein imaging equipment.
The proposed method utilizes the response from the low-frequency range of the IR image signal to adjust the boundaries of the reference dynamic range in a linear contrast stretching process with a tunable contrast enhancement parameter, as opposed to traditional approaches which use costly adaptive histogram equalization based methods.
The algorithm has been implemented and deployed in a consumer grade Android-based mobile device to evaluate the performance.
The results revealed that the proposed algorithm can process IR images of veins in real-time on low-performance computers.
It was compared with several well-performing traditional methods, and the results revealed that the new algorithm stands out with several beneficial features, namely, the fastest processing, the ability to enhance the desired details, excellent illumination normalization capability, and the ability to enhance details where the traditional methods failed.
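The core idea, a linear stretch whose reference range is derived from a low-frequency version of the signal with a tunable parameter, can be sketched as follows. This is an illustration, not the paper's implementation: a crude block-mean low-pass stands in for the actual low-frequency response, and the parameter name `alpha` is hypothetical.

```python
import numpy as np

def stretch(image, alpha=0.5):
    """Linear contrast stretch whose reference bounds come from a
    low-frequency (here: 4x4 block-mean) version of the image, with
    alpha in [0, 1] pulling the bounds toward the global min/max."""
    h, w = image.shape
    low = image[:h - h % 4, :w - w % 4].reshape(h // 4, 4, w // 4, 4).mean(axis=(1, 3))
    lo = (1 - alpha) * low.min() + alpha * image.min()
    hi = (1 - alpha) * low.max() + alpha * image.max()
    out = (image - lo) / max(hi - lo, 1e-9)
    return np.clip(out, 0.0, 1.0)
```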
For an artificial creative agent, an essential driver of the search for novelty is a value function which is often provided by the system designer or users.
We argue that an important barrier for progress in creativity research is the inability of these systems to develop their own notion of value for novelty.
We propose a notion of knowledge-driven creativity that circumvents the need for an externally imposed value function, allowing the system to explore based on what it has learned from a set of referential objects.
The concept is illustrated by a specific knowledge model provided by a deep generative autoencoder.
Using the described system, we train a knowledge model on a set of digit images and we use the same model to build coherent sets of new digits that do not belong to known digit types.
The piggybacking framework for designing erasure codes for distributed storage has empirically proven to be very useful, and has been used to design codes with desirable properties, such as low repair bandwidth and complexity.
However, the theoretical properties of this framework remain largely unexplored.
We address this by adapting a general characterization of repair schemes (previously used for Reed-Solomon codes) to analyze piggybacking codes with low substriping.
With this characterization, we establish a separation between piggybacking and general erasure codes, and several impossibility results for subcategories of piggybacking codes; for certain parameters, we also present explicit, optimal constructions of piggybacking codes.
The effort devoted to hand-crafting neural network image classifiers has motivated the use of architecture search to discover them automatically.
Although evolutionary algorithms have been repeatedly applied to neural network topologies, the image classifiers thus discovered have remained inferior to human-crafted ones.
Here, we evolve an image classifier---AmoebaNet-A---that surpasses hand-designs for the first time.
To do this, we modify the tournament selection evolutionary algorithm by introducing an age property to favor the younger genotypes.
Matching size, AmoebaNet-A has comparable accuracy to current state-of-the-art ImageNet models discovered with more complex architecture-search methods.
Scaled to larger size, AmoebaNet-A sets a new state-of-the-art 83.9% top-1 / 96.6% top-5 ImageNet accuracy.
In a controlled comparison against a well known reinforcement learning algorithm, we give evidence that evolution can obtain results faster with the same hardware, especially at the earlier stages of the search.
This is relevant when fewer compute resources are available.
Evolution is, thus, a simple method to effectively discover high-quality architectures.
Portmanteaus are a word formation phenomenon where two words are combined to form a new word.
We propose character-level neural sequence-to-sequence (S2S) methods for the task of portmanteau generation that are end-to-end-trainable, language independent, and do not explicitly use additional phonetic information.
We propose a noisy-channel-style model, which allows for the incorporation of unsupervised word lists, improving performance over a standard source-to-target model.
This model is made possible by an exhaustive candidate generation strategy specifically enabled by the features of the portmanteau task.
Experiments find our approach superior to a state-of-the-art FST-based baseline with respect to ground truth accuracy and human evaluation.
There is a common need, when searching for new drugs, to search molecular databases for compounds resembling some shape, as shape similarity suggests similar biological activity.
The large size of the databases requires fast methods for such initial screening, for example based on feature vectors constructed to fulfill the requirement that similar molecules should correspond to close vectors.
Ultrafast Shape Recognition (USR) is a popular approach of this type.
It uses vectors of 12 real numbers: the first 3 moments of the distances from 4 emphasized points.
These coordinates might contain unnecessary correlations and do not allow the approximated shape to be reconstructed.
In contrast, spherical harmonic (SH) decomposition uses orthogonal coordinates, suggesting their independence and thus a larger informational content of the feature vector.
Usually, rotationally invariant SH descriptors are considered, which means discarding some essential information.
This article discusses a framework for descriptors with normalized rotation, obtained for example by using principal component analysis (PCA-SH).
As ligands that have to slide into a protein are among the most interesting molecules, we introduce descriptors optimized for such flat elongated shapes.
Bent deformed cylinder (BDC) describes the molecule as a cylinder which was first bent, then deformed such that its cross-sections became ellipses of evolving shape.
Legendre polynomials are used to describe the central axis of such bent cylinder.
Additional polynomials are used to define the evolution of the elliptic cross-section along the main axis.
We also discuss bent cylindrical harmonics (BCH), which use cross-sections described by cylindrical harmonics instead of ellipses.
All these normalized-rotation descriptors allow reconstruction (decoding) of the approximated representation of the shape, and hence can also be used for lossy compression purposes.
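The rotation-normalization step underlying PCA-based descriptors can be sketched as follows. This is a minimal illustration, assuming a plain covariance eigendecomposition with axes sorted by variance; sign conventions (e.g., fixing orientation via skewness) used in practice are not shown, and `pca_normalize` is our own illustrative name:

```python
import numpy as np

def pca_normalize(points):
    """Rotation-normalize a molecular point cloud (illustrative sketch).

    Centers the points and rotates them so that the principal axes of
    the covariance matrix align with the coordinate axes, longest axis
    first, in the spirit of PCA-normalized descriptors."""
    centered = points - points.mean(axis=0)
    # Eigendecomposition of the 3x3 covariance matrix.
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort axes by decreasing variance (longest axis first).
    order = np.argsort(eigvals)[::-1]
    return centered @ eigvecs[:, order]
```

After this step, two rotated copies of the same molecule map to (nearly) the same coordinates, so non-invariant descriptors can be compared directly.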
Application-specific integrated circuit (ASIC) implementations for Deep Neural Networks (DNNs) have been adopted in many systems because of their higher classification speed.
However, although they may be characterized by better accuracy, larger DNNs require significant energy and area, thereby limiting their wide adoption.
The energy consumption of DNNs is driven by both memory accesses and computation.
Binarized Neural Networks (BNNs), as a trade-off between accuracy and energy consumption, can achieve great energy reduction, and retain good accuracy for large DNNs due to their regularization effect.
However, BNNs show poor accuracy when a smaller DNN configuration is adopted.
In this paper, we propose a new DNN model, LightNN, which replaces multiplications with a single shift or a constrained number of shifts and adds.
For a fixed DNN configuration, LightNNs have better accuracy than BNNs at a slight energy increase, yet are more energy efficient with only slightly less accuracy than conventional DNNs.
Therefore, LightNNs provide more options for hardware designers to make trade-offs between accuracy and energy.
Moreover, for large DNN configurations, LightNNs have a regularization effect, making them better in accuracy than conventional DNNs.
These conclusions are verified by experiments using the MNIST and CIFAR-10 datasets for different DNN configurations.
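The core multiplication-to-shift idea can be sketched as quantizing each weight to a sum of k signed powers of two, so a multiplication becomes k shift-and-add operations. This is our own minimal illustration of the quantization, not the paper's training-time procedure:

```python
import numpy as np

def quantize_to_shifts(w, k=1):
    """Approximate each weight by a sum of k signed power-of-two terms
    (greedy residual quantization).  Multiplying by such a weight then
    needs only k shifts and adds in hardware."""
    approx = np.zeros_like(w, dtype=float)
    residual = np.array(w, dtype=float)
    for _ in range(k):
        nonzero = residual != 0
        exp = np.zeros_like(residual)
        # Nearest power-of-two exponent for each nonzero residual.
        exp[nonzero] = np.round(np.log2(np.abs(residual[nonzero])))
        term = np.where(nonzero, np.sign(residual) * 2.0 ** exp, 0.0)
        approx += term
        residual -= term
    return approx
```

With k = 1 this collapses to a single shift per weight; larger k trades hardware cost for approximation accuracy, mirroring the accuracy/energy knob described above.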
Recently the influence maximization problem has received much attention for its applications on viral marketing and product promotions.
However, such influence maximization problems have not taken into account the monetary effect on the purchasing decision of individuals.
To fill this gap, in this paper we aim to maximize the revenue while considering the quantity constraint on the promoted commodity.
For this problem, we not only identify a proper small group of individuals as seeds for promotion but also determine the pricing of the commodity.
To tackle the revenue maximization problem, we first introduce a strategic searching algorithm, referred to as Algorithm PRUB, which is able to derive the optimal solutions.
After that, we further modify PRUB to propose a heuristic, Algorithm PRUB+IF, for obtaining feasible solutions more efficiently on larger instances.
Experiments on real social networks with different valuation distributions demonstrate the effectiveness of PRUB and PRUB+IF.
The Honey-Bee game is a two-player board game that is played on a connected hexagonal colored grid or (in a generalized setting) on a connected graph with colored nodes.
In a single move, a player calls a color and thereby conquers all the nodes of that color that are adjacent to his own current territory.
Both players want to conquer the majority of the nodes.
We show that winning the game is PSPACE-hard in general, NP-hard on series-parallel graphs, but easy on outerplanar graphs.
In the solitaire version, the goal of the single player is to conquer the entire graph with the minimum number of moves.
The solitaire version is NP-hard on trees and split graphs, but can be solved in polynomial time on co-comparability graphs.
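A single conquering move on a general graph amounts to a color-restricted flood fill: calling color c absorbs every node of color c reachable from the territory through same-colored nodes. The sketch below uses an adjacency-dict interface of our own choosing:

```python
from collections import deque

def call_color(graph, colors, territory, c):
    """One Honey-Bee move: conquer every node of color `c` reachable
    from the current territory through nodes of color `c`.
    `graph` maps node -> iterable of neighbours; `colors` maps
    node -> color; `territory` is the set of conquered nodes."""
    territory = set(territory)
    frontier = deque(n for t in territory for n in graph[t]
                     if n not in territory and colors[n] == c)
    while frontier:
        n = frontier.popleft()
        if n in territory:
            continue
        territory.add(n)
        # Newly conquered nodes may touch further same-colored nodes.
        frontier.extend(m for m in graph[n]
                        if m not in territory and colors[m] == c)
    return territory
```

The solitaire version then asks for the shortest sequence of such calls that grows the territory to the whole graph.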
Archetypal scenarios for change detection generally consider two images acquired through sensors of the same modality.
However, in some specific cases such as emergency situations, the only images available may be those acquired through different kinds of sensors.
More precisely, this paper addresses the problem of detecting changes between two multi-band optical images characterized by different spatial and spectral resolutions.
This sensor dissimilarity introduces additional issues in the context of operational change detection.
To alleviate these issues, classical change detection methods are applied after independent preprocessing steps (e.g., resampling) used to get the same spatial and spectral resolutions for the pair of observed images.
Nevertheless, these preprocessing steps tend to throw away relevant information.
Conversely, in this paper, we propose a method that more effectively uses the available information by modeling the two observed images as spatial and spectral versions of two (unobserved) latent images characterized by the same high spatial and high spectral resolutions.
As they cover the same scene, these latent images are expected to be globally similar except for possible changes in sparse spatial locations.
Thus, the change detection task is envisioned through a robust multi-band image fusion method which enforces the differences between the estimated latent images to be spatially sparse.
This robust fusion problem is formulated as an inverse problem which is iteratively solved using an efficient block-coordinate descent algorithm.
The proposed method is applied to real panchromatic/multispectral and hyperspectral images with simulated realistic changes.
A comparison with state-of-the-art change detection methods evidences the accuracy of the proposed strategy.
Learning speaker turn embeddings has shown considerable improvement in situations where conventional speaker modeling approaches fail.
However, this improvement is relatively limited when compared to the gain observed in face embedding learning, which has been proven very successful for face verification and clustering tasks.
Assuming that face and voices from the same identities share some latent properties (like age, gender, ethnicity), we propose three transfer learning approaches to leverage the knowledge from the face domain (learned from thousands of images and identities) for tasks in the speaker domain.
These approaches, namely target embedding transfer, relative distance transfer, and clustering structure transfer, utilize the structure of the source face embedding space at different granularities to regularize the target speaker turn embedding space as optimizing terms.
Our methods are evaluated on two public broadcast corpora and yield promising advances over competitive baselines in verification and audio clustering tasks, especially when dealing with short speaker utterances.
The analysis of the results also gives insight into characteristics of the embedding spaces and shows their potential applications.
Transmission of information reliably and efficiently across channels is one of the fundamental goals of coding and information theory.
In this respect, provably capacity-achieving deterministic coding schemes that are efficiently decodable remained elusive until as recently as 2008, even though schemes that come close in practice existed.
This survey tries to give the interested reader an overview of the area.
Erdal Arikan came up with his landmark polar coding schemes, which achieve capacity on symmetric channels subject to the constraint that the input codewords are equiprobable.
His idea is to convert any B-DMC into efficiently encodable-decodable channels which have rates 0 and 1, while conserving capacity in this transformation.
An exponentially decreasing probability of error, independent of the code rate, is achieved for all rates less than the symmetric capacity.
These codes perform well in practice since encoding and decoding complexity is O(N log N).
Guruswami et al. improved the above results by showing that error probability can be made to decrease doubly exponentially in the block length.
We also study recent results by Urbanke et al. which show that 2-transitive codes also achieve capacity on erasure channels under MAP decoding.
Urbanke and his group use complexity theoretic results in boolean function analysis to prove that EXIT functions, which capture the error probability, have a sharp threshold at 1-R, thus proving that capacity is achieved.
One of the oldest and most widely used families of codes, Reed-Muller codes, are 2-transitive.
Polar codes are 2-transitive too, and we thus have a different proof of the fact that they achieve capacity, though the rate of polarization found by Guruswami is better.
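Arikan's basic polarization transform combines channels pairwise via the kernel F = [[1,0],[1,1]] over GF(2); applied recursively it yields the O(N log N) encoding mentioned above. A minimal sketch (bit-reversal permutation omitted, block length a power of two):

```python
def polar_transform(u):
    """Compute x = u * F^{(tensor n)} over GF(2) for the Arikan kernel
    F = [[1,0],[1,1]], recursively: combine the two halves by XOR,
    transform each half, and concatenate."""
    if len(u) == 1:
        return list(u)
    half = len(u) // 2
    a, b = u[:half], u[half:]
    # Left half carries a XOR b, right half carries b (the F kernel).
    left = polar_transform([x ^ y for x, y in zip(a, b)])
    right = polar_transform(b)
    return left + right
```

Since F squared is the identity over GF(2), the transform is its own inverse, which makes successive-cancellation style processing convenient.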
For many years, we have observed industry struggling in defining a high quality requirements engineering (RE) and researchers trying to understand industrial expectations and problems.
Although we are investigating the discipline with a plethora of empirical studies, they still do not allow for empirical generalisations.
To lay an empirical and externally valid foundation about the state of the practice in RE, we aim at a series of open and reproducible surveys that allow us to steer future research in a problem-driven manner.
We designed a globally distributed family of surveys in joint collaborations with different researchers and completed the first run in Germany.
The instrument is based on a theory in the form of a set of hypotheses inferred from our experiences and available studies.
We test each hypothesis in our theory and identify further candidates to extend the theory by correlation and Grounded Theory analysis.
In this article, we report on the design of the family of surveys, its underlying theory, and the full results obtained from Germany with participants from 58 companies.
The results reveal, for example, a tendency to improve RE via internally defined qualitative methods rather than relying on normative approaches like CMMI.
We also discovered various RE problems that are statistically significant in practice.
For instance, we could corroborate communication flaws or moving targets as problems in practice.
Our results are not yet fully representative but already give first insights into current practices and problems in RE, and they allow us to draw lessons learnt for future replications.
Our results obtained from this first run in Germany make us confident that the survey design and instrument are well-suited to be replicated and, thereby, to create a generalisable empirical basis of RE in practice.
In this paper, we propose a novel progressive parameter pruning method for Convolutional Neural Network acceleration, named Structured Probabilistic Pruning (SPP), which effectively prunes weights of convolutional layers in a probabilistic manner.
Unlike existing deterministic pruning approaches, where unimportant weights are permanently eliminated, SPP introduces a pruning probability for each weight, and pruning is guided by sampling from the pruning probabilities.
A mechanism is designed to increase and decrease pruning probabilities based on importance criteria in the training process.
Experiments show that, with 4x speedup, SPP can accelerate AlexNet with only 0.3% loss of top-5 accuracy and VGG-16 with 0.8% loss of top-5 accuracy in ImageNet classification.
Moreover, SPP can be directly applied to accelerate multi-branch CNN networks, such as ResNet, without specific adaptations.
Our 2x speedup ResNet-50 only suffers 0.8% loss of top-5 accuracy on ImageNet.
We further show the effectiveness of SPP on transfer learning tasks.
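The probabilistic-pruning idea can be sketched in two parts: sampling a mask from per-weight pruning probabilities, and nudging those probabilities from an importance criterion. The median-magnitude criterion and the helper names below are our own illustrative assumptions, not the paper's exact update rule:

```python
import numpy as np

def sample_pruning_mask(weights, prune_prob, rng):
    """Sample a masked copy of `weights`: each weight is zeroed with
    its own pruning probability, so no weight is eliminated
    permanently."""
    drop = rng.random(weights.shape) < prune_prob
    return np.where(drop, 0.0, weights)

def update_prune_prob(weights, prune_prob, delta=0.05):
    """Nudge probabilities: weights below the median magnitude become
    more likely to be pruned, others less likely (an assumed
    importance criterion for illustration)."""
    thresh = np.median(np.abs(weights))
    prune_prob = np.where(np.abs(weights) < thresh,
                          prune_prob + delta, prune_prob - delta)
    return np.clip(prune_prob, 0.0, 1.0)
```

Because pruning is sampled rather than permanent, a weight zeroed early in training can recover if its importance grows, which is the key contrast with deterministic pruning.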
The recently proposed self-ensembling methods have achieved promising results in deep semi-supervised learning, which penalize inconsistent predictions of unlabeled data under different perturbations.
However, they only consider adding perturbations to each single data point, while ignoring the connections between data samples.
In this paper, we propose a novel method, called Smooth Neighbors on Teacher Graphs (SNTG).
In SNTG, a graph is constructed based on the predictions of the teacher model, i.e., the implicit self-ensemble of models.
Then the graph serves as a similarity measure with respect to which the representations of "similar" neighboring points are learned to be smooth on the low-dimensional manifold.
We achieve state-of-the-art results on semi-supervised learning benchmarks.
The error rates are 9.89%, 3.99% for CIFAR-10 with 4000 labels, SVHN with 500 labels, respectively.
In particular, the improvements are significant when the labels are fewer.
For the non-augmented MNIST with only 20 labels, the error rate is reduced from previous 4.81% to 1.36%.
Our method also shows robustness to noisy labels.
Virtualization is generally adopted in server and desktop environments to provide for fault tolerance, resource management, and energy efficiency.
Virtualization enables parallel execution of multiple operating systems (OSs) while sharing the hardware resources.
Virtualization was previously not deemed a feasible technology for mobile and embedded devices due to their limited processing and memory resources.
However, enterprises are now advocating Bring Your Own Device (BYOD) applications that enable the co-existence of heterogeneous OSs on a single mobile device.
Moreover, embedded devices require virtualization for logical isolation of secure and general-purpose OSs on a single device.
In this paper, we investigate the processor architectures in the mobile and embedded space and examine their formal virtualizability.
We also compare the virtualization solutions enabling coexistence of multiple OSs in Multicore Processor System-on-Chip (MPSoC) mobile and embedded systems.
We advocate that virtualization is necessary to manage resource in MPSoC designs and to enable BYOD, security, and logical isolation use cases.
We consider deletion correcting codes over a q-ary alphabet.
It is well known that any code capable of correcting s deletions can also correct any combination of s total insertions and deletions.
To obtain asymptotic upper bounds on code size, we apply a packing argument to channels that perform different mixtures of insertions and deletions.
Even though the set of codes is identical for all of these channels, the bounds that we obtain vary.
Prior to this work, only the bounds corresponding to the all insertion case and the all deletion case were known.
We recover these as special cases.
The bound from the all-deletion case, due to Levenshtein, has been the best known for more than forty-five years.
Our generalized bound is better than Levenshtein's bound whenever the number of deletions to be corrected is larger than the alphabet size.
The BLEBeacon dataset is a collection of Bluetooth Low Energy (BLE) advertisement packets/traces generated from BLE beacons carried by people following their daily routine inside a university building.
A network of Raspberry Pi 3 (RPi)-based edge devices was deployed inside a multi-floor facility, continuously gathering BLE advertisement packets and storing them in a cloud-based environment.
The data were collected during a one-month trial approved by an IRB (Institutional Review Board for the Protection of Human Subjects in Research).
Each facility occupant/participant was handed a BLE beacon to carry with them at all times.
The focus is on presenting a real-life realization of a location-aware sensing infrastructure, that can provide insights for smart sensing platforms, crowd-based applications, building management, and user-localization frameworks.
This work describes and documents the published BLEBeacon dataset.
Evolutionary algorithms (EAs), a large class of general purpose optimization algorithms inspired from the natural phenomena, are widely used in various industrial optimizations and often show excellent performance.
This paper presents an attempt towards revealing their general power from a statistical view of EAs.
By summarizing a large range of EAs into the sampling-and-learning framework, we show that the framework directly admits a general analysis on the probable-absolute-approximate (PAA) query complexity.
We particularly focus on the framework with the learning subroutine being restricted as a binary classification, which results in the sampling-and-classification (SAC) algorithms.
With the help of the learning theory, we obtain a general upper bound on the PAA query complexity of SAC algorithms.
We further compare SAC algorithms with the uniform search in different situations.
Under the error-target independence condition, we show that SAC algorithms can achieve polynomial speedup over the uniform search, but not super-polynomial speedup.
Under the one-side-error condition, we show that super-polynomial speedup can be achieved.
This work only touches the surface of the framework.
Its power under other conditions is still open.
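The sampling-and-classification loop can be sketched as follows. The interface (a pluggable classifier returning a positive region, a median good/bad split) is our own illustrative assumption, not the paper's formal framework:

```python
import random

def sac_minimize(f, classifier_fit, sample_region, uniform_sample,
                 n_iters=50, batch=20, seed=0):
    """Skeleton of a sampling-and-classification (SAC) algorithm:
    label sampled points as good/bad against a threshold, fit a
    binary classifier, and bias the next batch toward the learned
    positive region."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    model = None
    for _ in range(n_iters):
        xs = [sample_region(model, rng) if model else uniform_sample(rng)
              for _ in range(batch)]
        ys = [f(x) for x in xs]
        best = min(zip(ys, xs))
        if best[0] < best_y:
            best_y, best_x = best
        thresh = sorted(ys)[len(ys) // 2]      # median good/bad split
        labels = [y <= thresh for y in ys]
        model = classifier_fit(xs, labels)     # learn the "good" region
    return best_x, best_y
```

Restricting `classifier_fit` to binary classification is exactly what distinguishes SAC algorithms within the broader sampling-and-learning framework.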
Comments of online articles provide extended views and improve user engagement.
Automatically making comments thus becomes a valuable functionality for online forums, intelligent chatbots, etc.
This paper proposes the new task of automatic article commenting, and introduces a large-scale Chinese dataset with millions of real comments and a human-annotated subset characterizing the comments' varying quality.
Incorporating the human bias of comment quality, we further develop automatic metrics that generalize a broad set of popular reference-based metrics and exhibit greatly improved correlations with human evaluations.
We define a plane curve to be threadable if it can rigidly pass through a point-hole in a line L without otherwise touching L. Threadable curves are in a sense generalizations of monotone curves.
We have two main results.
The first is a linear-time algorithm for deciding whether a polygonal curve is threadable---O(n) for a curve of n vertices---and if threadable, finding a sequence of rigid motions to thread it through a hole.
We also sketch an argument that shows that the threadability of algebraic curves can be decided in time polynomial in the degree of the curve.
The second main result is an O(n polylog n)-time algorithm for deciding whether a 3D polygonal curve can thread through a hole in a plane in R^3, and if so, providing a description of the rigid motions that achieve the threading.
In this paper we introduce a mathematical model that captures some of the salient features of recommender systems that are based on popularity and that try to exploit social ties among the users.
We show that, under very general conditions, the market always converges to a steady state, for which we are able to give an explicit form.
Thanks to this we can tell rather precisely how much a market is altered by a recommendation system, and determine the power of users to influence others.
Our theoretical results are complemented by experiments with real world social networks showing that social graphs prevent large market distortions in spite of the presence of highly influential users.
Covert aspects of ongoing user mental states provide key context information for user-aware human computer interactions.
In this paper, we focus on the problem of estimating the vigilance of users using EEG and EOG signals.
To improve the feasibility and wearability of vigilance estimation devices for real-world applications, we adopt a novel electrode placement for forehead EOG and extract various eye movement features, which contain the principal information of traditional EOG.
We explore the effects of EEG from different brain areas and combine EEG and forehead EOG to leverage their complementary characteristics for vigilance estimation.
Considering that the vigilance of users is a dynamic changing process because the intrinsic mental states of users involve temporal evolution, we introduce continuous conditional neural field and continuous conditional random field models to capture dynamic temporal dependency.
We propose a multimodal approach to estimating vigilance by combining EEG and forehead EOG and incorporating the temporal dependency of vigilance into model training.
The experimental results demonstrate that modality fusion can improve the performance compared with a single modality, EOG and EEG contain complementary information for vigilance estimation, and the temporal dependency-based models can enhance the performance of vigilance estimation.
From the experimental results, we observe that theta and alpha frequency activities are increased, while gamma frequency activities are decreased in drowsy states in contrast to awake states.
The forehead setup allows for the simultaneous collection of EEG and EOG and achieves comparative performance using only four shared electrodes in comparison with the temporal and posterior sites.
An efficient speech to text converter for mobile application is presented in this work.
The prime motive is to formulate a system which would give optimum performance in terms of complexity, accuracy, delay and memory requirements for mobile environment.
The speech to text converter consists of two stages namely front-end analysis and pattern recognition.
The front end analysis involves preprocessing and feature extraction.
Traditional voice activity detection (VAD) algorithms that track only energy cannot reliably isolate speech, because unwanted parts of the input also carry energy and appear speech-like.
The proposed system therefore uses a VAD that calculates the energy of the high-frequency part separately, together with the zero-crossing rate, to differentiate noise from speech.
Mel Frequency Cepstral Coefficient (MFCC) is used as feature extraction method and Generalized Regression Neural Network is used as recognizer.
MFCC provides low word error rate and better feature extraction.
Neural Network improves the accuracy.
Thus a small database containing all possible syllable pronunciations of the user is sufficient to give recognition accuracy close to 100%.
The proposed technique therefore enables the realization of real-time, speaker-independent applications on devices such as mobile phones and PDAs.
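A frame-level decision combining short-time energy with the zero-crossing rate, in the spirit described above, can be sketched as follows. The thresholds are illustrative assumptions, not the system's tuned values:

```python
import numpy as np

def simple_vad(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Voice activity decision for one frame: voiced speech has
    enough energy and relatively few zero crossings, while noise
    tends to have low energy or a high zero-crossing rate."""
    energy = np.mean(frame ** 2)
    signs = np.sign(frame)
    zcr = np.mean(signs[1:] != signs[:-1])  # fraction of sign flips
    return energy > energy_thresh and zcr < zcr_thresh
```

Frames passing this gate would then go on to MFCC extraction and the neural-network recognizer.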
We present techniques to prove termination of cycle rewriting, that is, string rewriting on cycles, which are strings in which the start and end are connected.
Our main technique is to transform cycle rewriting into string rewriting and then apply state of the art techniques to prove termination of the string rewrite system.
We present three such transformations, and prove for all of them that they are sound and complete.
In this way, not only does termination of string rewriting of the transformed system imply termination of the original cycle rewrite system; a similar conclusion can be drawn for non-termination.
Apart from this transformational approach, we present a uniform framework of matrix interpretations, covering most of the earlier approaches to automatically proving termination of cycle rewriting.
All our techniques serve both for proving termination and relative termination.
We present several experiments showing the power of our techniques.
Capturability analysis of the linear inverted pendulum (LIP) model enabled walking with constrained height based on the capture point.
We generalize this analysis to the variable-height inverted pendulum (VHIP) and show how it enables 3D walking over uneven terrains based on capture inputs.
Thanks to a tailored optimization scheme, we can compute these inputs fast enough for real-time model predictive control.
We implement this approach as open-source software and demonstrate it in dynamic simulations.
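For reference, the quantity that capturability analysis of the constant-height LIP builds on is the instantaneous capture point xi = x + xdot/omega with omega = sqrt(g/h); the paper generalizes this to capture inputs for the variable-height model, which this sketch does not cover:

```python
import math

def capture_point(x, xd, z=0.8, g=9.81):
    """Instantaneous capture point of the linear inverted pendulum:
    xi = x + xd / omega, where omega = sqrt(g / z) is the natural
    frequency at constant pendulum height z.  Stepping to xi brings
    the pendulum to rest over the new foothold."""
    omega = math.sqrt(g / z)
    return x + xd / omega
```

A center of mass at x = 0.1 m moving at 0.2 m/s with omega = 1 rad/s gives a capture point at 0.3 m ahead of the origin.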
Background: Mobile phone sensor technology has great potential in providing behavioral markers of mental health.
However, this promise has not yet been brought to fruition.
Objective: The objective of our study was to examine challenges involved in developing an app to extract behavioral markers of mental health from passive sensor data.
Methods: Both technical challenges and acceptability of passive data collection for mental health research were assessed based on literature review and results obtained from a feasibility study.
Socialise, a mobile phone app developed at the Black Dog Institute, was used to collect sensor data (Bluetooth, global positioning system, and battery status) and investigate views and experiences of a group of people with lived experience of mental health challenges (N=32).
Results: On average, sensor data were obtained for 55% (Android) and 45% (iPhone OS) of scheduled scans.
Battery life was reduced from 21.3 hours to 18.8 hours when scanning every 5 minutes, a reduction of 2.5 hours, or 12%.
Despite this relatively small reduction, most participants reported that the app had a noticeable effect on their battery life.
In addition to battery life, the purpose of data collection, trust in the organization that collects data, and perceived impact on privacy were identified as main factors for acceptability.
Conclusions: Based on the findings of the feasibility study and literature review, we recommend a commitment to open science and transparent reporting and stronger partnerships and communication with users.
Sensing technology has the potential to greatly enhance the delivery and impact of mental health care.
Realizing this requires all aspects of mobile phone sensor technology to be rigorously assessed.
Normalized graph cut (NGC) has become a popular research topic due to its wide applications in a large variety of areas like machine learning and very large scale integration (VLSI) circuit design.
Most traditional NGC methods are based on pairwise relationships (similarities).
However, in real-world applications relationships among the vertices (objects) may be more complex than pairwise, which are typically represented as hyperedges in hypergraphs.
Thus, normalized hypergraph cut (NHC) has attracted more and more attention.
However, existing NHC methods cannot achieve satisfactory performance in real applications.
In this paper, we propose a novel relaxation approach, which is called relaxed NHC (RNHC), to solve the NHC problem.
Our model is defined as an optimization problem on the Stiefel manifold.
To solve this problem, we resort to the Cayley transformation to devise a feasible learning algorithm.
Experimental results on a set of large hypergraph benchmarks for clustering and partitioning in VLSI domain show that RNHC can outperform the state-of-the-art methods.
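The Cayley-transform device for staying on the Stiefel manifold can be sketched generically: skew-symmetrize the gradient direction and apply a Cayley rotation, which preserves the orthonormality constraint exactly. This is a generic update of this type, not the paper's exact solver:

```python
import numpy as np

def cayley_step(X, G, tau=0.1):
    """One feasible update on the Stiefel manifold {X : X^T X = I}.
    A = G X^T - X G^T is skew-symmetric, so the Cayley transform
    Q = (I + tau/2 A)^{-1} (I - tau/2 A) is orthogonal and
    X_new = Q X stays on the manifold."""
    A = G @ X.T - X @ G.T            # skew-symmetric by construction
    n = A.shape[0]
    I = np.eye(n)
    return np.linalg.solve(I + (tau / 2) * A, (I - (tau / 2) * A) @ X)
```

Because feasibility is maintained at every step, no projection or retraction back onto the constraint set is needed between iterations.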
Single image dehazing is a challenging ill-posed restoration problem.
Various prior-based and learning-based methods have been proposed.
Most of them follow a classic atmospheric scattering model which is an elegant simplified physical model based on the assumption of single-scattering and homogeneous atmospheric medium.
The formulation of haze in realistic environment is more complicated.
In this paper, we propose to take its essential mechanism as "black box", and focus on learning an input-adaptive trainable end-to-end dehazing model.
A U-Net-like encoder-decoder deep network with progressive feature fusions is proposed to directly learn the highly nonlinear transformation from an observed hazy image to the haze-free ground truth.
The proposed network is evaluated on two public image dehazing benchmarks.
The experiments demonstrate that it can achieve superior performance when compared with popular state-of-the-art methods.
With efficient GPU memory usage, it can satisfactorily recover ultra-high-definition hazy images up to 4K resolution, which is unaffordable for many deep learning based dehazing algorithms.
In the face of scarcity in detailed training annotations, the ability to perform object localization tasks in real-time with weak-supervision is very valuable.
However, the computational cost of generating and evaluating region proposals is heavy.
We adapt the concept of Class Activation Maps (CAM) into the very first weakly-supervised 'single-shot' detector that does not require the use of region proposals.
To facilitate this, we propose a novel global pooling technique called Spatial Pyramid Averaged Max (SPAM) pooling for training this CAM-based network for object extent localisation with only weak image-level supervision.
We show this global pooling layer possesses a near ideal flow of gradients for extent localization, that offers a good trade-off between the extremes of max and average pooling.
Our approach only requires a single network pass and uses a fast-backprojection technique, completely omitting any region proposal steps.
To the best of our knowledge, this is the first approach to do so.
Due to this, we are able to perform inference in real-time at 35fps, which is an order of magnitude faster than all previous weakly supervised object localization frameworks.
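One plausible reading of the Spatial Pyramid Averaged Max idea, shown purely for illustration (the paper's exact pooling layer may differ), is to max-pool the class activation map within each cell of several pyramid levels and then average all cell maxima, interpolating between global max and global average pooling:

```python
import numpy as np

def spam_pool(cam, levels=(1, 2, 4)):
    """Assumed sketch of SPAM pooling on a 2D class activation map:
    split the map into an l-by-l grid at each pyramid level, take
    the max inside every cell, and average all cell maxima."""
    h, w = cam.shape
    cell_maxes = []
    for l in levels:
        row_groups = np.array_split(np.arange(h), l)
        col_groups = np.array_split(np.arange(w), l)
        for ri in row_groups:
            for ci in col_groups:
                cell_maxes.append(cam[np.ix_(ri, ci)].max())
    return float(np.mean(cell_maxes))
```

Pure max pooling rewards only the single strongest activation, while pure average pooling rewards spreading activation everywhere; averaging cell-wise maxima encourages activation over the object's full extent without diluting it over the background.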
A programmable optical computer has remained an elusive concept.
To construct a practical computing primitive equivalent to an electronic Boolean logic, one should find a nonlinear phenomenon that overcomes weaknesses present in many optical processing schemes.
Ideally, the nonlinearity should provide a functionally complete set of logic operations, enable ultrafast all-optical programmability, and allow cascaded operations without a change in the operating wavelength or in the signal encoding format.
Here we demonstrate a programmable logic gate using an injection-locked Vertical-Cavity Surface-Emitting Laser (VCSEL).
The gate program is switched between the AND and the OR operations at a rate of 1 GHz with a Bit Error Ratio (BER) of 10^-6, without changes in the wavelength or in the signal encoding format.
The scheme is based on nonlinearity of normalization operations, which can be used to construct any continuous complex function or operation, Boolean or otherwise.
Bias is a common problem in today's media, appearing frequently in text and in visual imagery.
Users on social media websites such as Twitter need better methods for identifying bias.
Additionally, activists (those who are motivated to effect change related to some topic) need better methods to identify and counteract bias that is contrary to their mission.
With both of these use cases in mind, in this paper we propose a novel tool called UnbiasedCrowd that supports identification of, and action on bias in visual news media.
In particular, it addresses the following key challenges: (1) identification of bias; (2) aggregation and presentation of evidence to users; and (3) enabling activists to inform the public of bias and take action by engaging people in conversation with bots.
We describe a preliminary study on the Twitter platform that explores the impressions that activists had of our tool, and how people reacted and engaged with online bots that exposed visual bias.
We conclude by discussing design and implication of our findings for creating future systems to identify and counteract the effects of news bias.
The KE inference system is a tableau method developed by Marco Mondadori which was presented as an improvement, in the computational efficiency sense, over Analytic Tableaux.
In the literature, there is no description of a theorem prover based on the KE method for the C1 paraconsistent logic.
Paraconsistent logics have several applications, such as in robot control and medicine.
These applications could benefit from the existence of such a prover.
We present a sound and complete KE system for C1, an informal specification of a strategy for the C1 prover as well as problem families that can be used to evaluate provers for C1.
The C1 KE system and the strategy described in this paper will be used to implement a KE based prover for C1, which will be useful for those who study and apply paraconsistent logics.
Cross-correlation is a popular signal processing technique used in numerous location tracking systems for obtaining reliable range information.
However, its efficient design and practical implementation have not yet been achieved on the mote platforms typical of wireless sensor networks, due to resource constraints.
In this paper, we propose SparseS-XCorr: cross-correlation via structured sparse representation, a new computing framework for ranging based on L1-minimization and structured sparsity.
The key idea is to compress the ranging signal samples on the mote by efficient random projections and transfer them to a central device, where a convex optimization process estimates the range by exploiting the sparse signal structure in the proposed correlation dictionary.
Through theoretical validation, extensive empirical studies and experiments on an end-to-end acoustic ranging system implemented on resource limited off-the-shelf sensor nodes, we show that the proposed framework can achieve up to two orders of magnitude better performance compared to other approaches such as working on DCT domain and downsampling.
Compared to standard cross-correlation, it obtains range estimates with a bias of 2-6 cm using 30% compressed measurements, and of approximately 100 cm using 5%.
Its structured sparsity model is able to improve the ranging accuracy by 40% under challenging recovery conditions (such as high compression factor and low signal-to-noise ratio) by overcoming limitations due to dictionary coherence.
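As a point of reference, the standard cross-correlation baseline that SparseS-XCorr compresses can be sketched in a few lines of Python. This is a minimal illustration of peak-based delay estimation only, not the paper's L1-minimization recovery; the signal values are invented for illustration.

```python
def estimate_delay(x, y):
    """Estimate the lag of y relative to x via the cross-correlation peak."""
    n = len(x)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-(n - 1), n):
        # correlate x against y shifted by `lag`
        val = sum(x[i] * y[i + lag] for i in range(n) if 0 <= i + lag < n)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

# a pulse and the same pulse delayed by 3 samples
pulse = [0, 1, 3, 2, 4, 1, 0, 0, 0, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 3, 2, 4, 1, 0, 0, 0]
lag = estimate_delay(pulse, delayed)
# for acoustic ranging, range = lag / sample_rate * speed_of_sound
```

The O(n^2) loop is exactly the cost that makes this infeasible on resource-constrained motes, which motivates shifting the heavy computation to a central device.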
The segmentation of liver lesions is crucial for detection, diagnosis and monitoring progression of liver cancer.
However, designing accurate automated methods remains challenging due to high noise in CT scans, low contrast between liver and lesions, and large lesion variability.
We propose an automatic, unsupervised 3D method for liver lesion segmentation using a phase separation approach.
It is assumed that the liver is a mixture of two phases, healthy liver and lesions, represented by different image intensities polluted by noise.
The Cahn-Hilliard equation is used to remove the noise and separate the mixture into two distinct phases with well-defined interfaces.
This drastically simplifies the lesion detection and segmentation task and enables liver lesions to be segmented by thresholding the Cahn-Hilliard solution.
The method was tested on the 3Dircadb and LITS datasets.
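To make the phase-separation idea concrete, here is a toy 1D sketch of the pipeline described above: an explicit Euler step of the Cahn-Hilliard equation followed by thresholding. The time step, the interface parameter gamma, and the threshold are illustrative assumptions; the paper works on 3D CT volumes with a proper solver.

```python
def laplacian(u):
    """Periodic 1D discrete Laplacian (unit grid spacing)."""
    n = len(u)
    return [u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n] for i in range(n)]

def cahn_hilliard_step(u, dt=1e-4, gamma=0.5):
    """One explicit Euler step of u_t = lap(u^3 - u - gamma * lap(u))."""
    inner = [ui ** 3 - ui - gamma * li for ui, li in zip(u, laplacian(u))]
    return [ui + dt * li for ui, li in zip(u, laplacian(inner))]

def segment(u, threshold=0.0):
    """Label each cell lesion (1) or healthy (0) by thresholding the phase field."""
    return [1 if ui > threshold else 0 for ui in u]
```

A useful sanity check on any Cahn-Hilliard discretization is mass conservation: the periodic Laplacian sums to zero, so the total of the field is preserved step to step.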
Big data sets must be carefully partitioned into statistically similar data subsets that can be used as representative samples for big data analysis tasks.
In this paper, we propose the random sample partition (RSP) data model to represent a big data set as a set of non-overlapping data subsets, called RSP data blocks, where each RSP data block has a probability distribution similar to the whole big data set.
Under this data model, efficient block level sampling is used to randomly select RSP data blocks, replacing expensive record level sampling to select sample data from a big distributed data set on a computing cluster.
We show how RSP data blocks can be employed to estimate statistics of a big data set and build models which are equivalent to those built from the whole big data set.
In this approach, analysis of a big data set becomes analysis of a few RSP data blocks that have been generated in advance on the computing cluster.
Therefore, the new method for data analysis based on RSP data blocks is scalable to big data.
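The block-level sampling idea can be sketched as follows; the single-shuffle partitioning and the mean estimator are a minimal single-machine illustration, not the paper's distributed implementation.

```python
import random

def random_sample_partition(records, num_blocks, seed=0):
    """Shuffle once, then split into equal-size RSP blocks; each block is
    (approximately) distributed like the whole data set."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    size = len(shuffled) // num_blocks
    return [shuffled[i * size:(i + 1) * size] for i in range(num_blocks)]

def block_level_mean(blocks, k, seed=1):
    """Estimate the full-data mean from k randomly selected blocks,
    avoiding any record-level sampling over the whole data set."""
    rng = random.Random(seed)
    values = [v for block in rng.sample(blocks, k) for v in block]
    return sum(values) / len(values)

data = list(range(10000))
blocks = random_sample_partition(data, 100)
estimate = block_level_mean(blocks, 10)   # close to the true mean 4999.5
```

Because the shuffle happens once when the partition is built, every later analysis pays only the cost of reading a few blocks.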
Tool-assisted refactoring transformations must be trustworthy if programmers are to be confident in applying them on arbitrarily extensive and complex code in order to improve style or efficiency.
We propose a simple, high-level but rigorous, notation for defining refactoring transformations in Erlang, and show that this notation provides an extensible, verifiable and executable specification language for refactoring.
To demonstrate the applicability of our approach, we show how to define and verify a number of example refactorings in the system.
Annual Average Daily Traffic (AADT) is an important parameter used in traffic engineering analysis.
Departments of Transportation (DOTs) continually collect traffic counts using both permanent count stations (i.e., Automatic Traffic Recorders or ATRs) and temporary short-term count stations.
In South Carolina, 87% of the ATRs are located on interstates and arterial highways.
For most secondary highways (i.e., collectors and local roads), AADT is estimated based on short-term counts.
This paper develops AADT estimation models for different roadway functional classes with two machine learning techniques: Artificial Neural Network (ANN) and Support Vector Regression (SVR).
The models aim to predict AADT from short-term counts.
The results are first compared against each other to identify the best model.
Then, the results of the best model are compared against a regression method and factor-based method.
The comparison reveals the superiority of SVR for AADT estimation for different roadway functional classes over all other methods.
Among all developed models for different functional roadway classes, the SVR-based model shows a minimum root mean square error (RMSE) of 0.22 and a mean absolute percentage error (MAPE) of 11.3% for the interstate/expressway functional class.
This model also shows a higher R-squared value compared to the traditional factor-based model and regression model.
SVR models are validated for each roadway functional class using the 2016 ATR data and selected short-term count data collected by the South Carolina Department of Transportation (SCDOT).
The validation results show that the SVR-based AADT estimation models can be used by the SCDOT as a reliable option to predict AADT from the short-term counts.
The research described in this paper concerns automatic cyberbullying detection in social media.
There are two goals to achieve: building a gold standard cyberbullying detection dataset and measuring the performance of the Samurai cyberbullying detection system.
The Formspring dataset provided in a Kaggle competition was re-annotated as a part of the research.
The annotation procedure is described in detail and, unlike many other recent data annotation initiatives, does not use Mechanical Turk for finding people willing to perform the annotation.
The new annotation appears more coherent than the old one, since all tested cyberbullying detection systems performed better on it.
The performance of the Samurai system is compared with 5 commercial systems and one well-known machine learning algorithm for classifying textual content, namely Fasttext.
It turns out that Samurai scores the best in all measures (accuracy, precision and recall), while Fasttext is the second-best performing algorithm.
Research interest in rapid structured-light imaging has been growing for the modeling of moving objects, and a number of methods have been suggested for range capture in a single video frame.
The imaging area of a 3D object using a single projector is restricted since the structured light is projected only onto a limited area of the object surface.
Employing additional projectors to broaden the imaging area is a challenging problem since simultaneous projection of multiple patterns results in their superposition in the light-intersected areas and the recognition of original patterns is by no means trivial.
This paper presents a novel method of multi-projector color structured-light vision based on projector-camera triangulation.
By analyzing the behavior of superposed-light colors in a chromaticity domain, we show that the original light colors cannot be properly extracted by the conventional direct estimation.
We disambiguate multiple projectors by multiplexing the orientations of projector patterns so that the superposed patterns can be separated by explicit derivative computations.
Experimental studies are carried out to demonstrate the validity of the presented method.
The proposed method increases the efficiency of range acquisition compared to conventional active stereo using multiple projectors.
In an incoherent dictionary, most signals that admit a sparse representation admit a unique sparse representation.
In other words, any other way to express the signal uses strictly more atoms.
This work demonstrates that sparse signals typically enjoy a higher privilege: each nonoptimal representation of the signal requires far more atoms than the sparsest representation, unless it contains many of the same atoms as the sparsest representation.
One impact of this finding is to confer a certain degree of legitimacy on the particular atoms that appear in a sparse representation.
This result can also be viewed as an uncertainty principle for random sparse signals over an incoherent dictionary.
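For readers unfamiliar with incoherent dictionaries, the mutual coherence that controls such results can be computed directly; this small sketch assumes real-valued atoms given as plain lists.

```python
def mutual_coherence(atoms):
    """Largest absolute inner product between distinct unit-normalised atoms.
    Small coherence means the dictionary is 'incoherent'."""
    def normalise(a):
        n = sum(x * x for x in a) ** 0.5
        return [x / n for x in a]
    unit = [normalise(a) for a in atoms]
    return max(abs(sum(x * y for x, y in zip(unit[i], unit[j])))
               for i in range(len(unit)) for j in range(i + 1, len(unit)))

# an orthonormal pair has coherence 0; adding a diagonal atom raises it
print(mutual_coherence([[1, 0], [0, 1], [1, 1]]))
```

Orthonormal bases achieve coherence 0, while overcomplete dictionaries necessarily have positive coherence; the results above become meaningful precisely when that coherence stays small.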
Introduced by Dal Lago and Hofmann, quantitative realizability is a technique used to define models for logics based on Multiplicative Linear Logic.
A distinctive feature is that functions are interpreted as bounded-time computable functions.
It has been used to give new and uniform proofs of soundness of several type systems with respect to certain time complexity classes.
We propose a reformulation of their ideas in the setting of Krivine's classical realizability.
The framework obtained generalizes Dal Lago and Hofmann's realizability, and reveals deep connections between quantitative realizability and a linear variant of Cohen's forcing.
We introduce TextWorld, a sandbox learning environment for the training and evaluation of RL agents on text-based games.
TextWorld is a Python library that handles interactive play-through of text games, as well as backend functions like state tracking and reward assignment.
It comes with a curated list of games whose features and challenges we have analyzed.
More significantly, it enables users to handcraft or automatically generate new games.
Its generative mechanisms give precise control over the difficulty, scope, and language of constructed games, and can be used to relax challenges inherent to commercial text games like partial observability and sparse rewards.
By generating sets of varied but similar games, TextWorld can also be used to study generalization and transfer learning.
We cast text-based games in the Reinforcement Learning formalism, use our framework to develop a set of benchmark games, and evaluate several baseline agents on this set and the curated list.
Continuous-time signals are well known for not being perfectly localized in both time and frequency domains.
Conversely, a signal defined over the vertices of a graph can be perfectly localized in both vertex and frequency domains.
We derive the conditions ensuring the validity of this property and then, building on this theory, we provide the conditions for perfect reconstruction of a graph signal from its samples.
Next, we provide a finite-step algorithm for reconstructing a band-limited signal from its samples; we then show the effect of sampling a not perfectly band-limited signal and how to select the bandwidth that minimizes the mean square reconstruction error.
We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials.
Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the `PICO' elements).
These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary.
We acquired annotations from a diverse set of workers with varying levels of expertise and cost.
We describe our data collection process and the corpus itself in detail.
We then outline a set of challenging NLP tasks that would aid searching of the medical literature and the practice of evidence-based medicine.
Deep neural network models for Chinese zero pronoun resolution learn semantic information for zero pronoun and candidate antecedents, but tend to be short-sighted---they often make local decisions.
They typically predict coreference chains between the zero pronoun and one single candidate antecedent one link at a time, while overlooking their long-term influence on future decisions.
Ideally, modeling useful information of preceding potential antecedents is critical when later predicting zero pronoun-candidate antecedent pairs.
In this study, we show how to integrate local and global decision-making by exploiting deep reinforcement learning models.
With the help of the reinforcement learning agent, our model learns the policy of selecting antecedents in a sequential manner, where useful information provided by earlier predicted antecedents could be utilized for making later coreference decisions.
Experimental results on OntoNotes 5.0 dataset show that our technique surpasses the state-of-the-art models.
We introduce here a fully automated convolutional neural network-based method for brain image processing to Detect Neurons in different brain Regions during Development (DeNeRD).
Our method takes a developing mouse brain as input and i) registers the brain sections against a developing mouse reference atlas, ii) detects various types of neurons, and iii) quantifies the neural density in many unique brain regions at different postnatal (P) time points.
Our method is invariant to the shape, size and expression of neurons and by using DeNeRD, we compare the brain-wide neural density of all GABAergic neurons in developing brains of ages P4, P14 and P56.
We discover and report 6 different clusters of regions in the mouse brain in which GABAergic neurons develop in a differential manner from early age (P4) to adulthood (P56).
These clusters reveal key steps of GABAergic cell development that seem to track with the functional development of diverse brain regions as the mouse transitions from a passive receiver of sensory information (<P14) to an active seeker (>P14).
The X-problem of number 3 for one dimension and related observations are discussed.
Scholars and practitioners across domains are increasingly concerned with algorithmic transparency and opacity, interrogating the values and assumptions embedded in automated, black-boxed systems, particularly in user-generated content platforms.
I report from an ethnography of infrastructure in Wikipedia to discuss an often understudied aspect of this topic: the local, contextual, learned expertise involved in participating in a highly automated social-technical environment.
Today, the organizational culture of Wikipedia is deeply intertwined with various data-driven algorithmic systems, which Wikipedians rely on to help manage and govern the "anyone can edit" encyclopedia at a massive scale.
These bots, scripts, tools, plugins, and dashboards make Wikipedia more efficient for those who know how to work with them, but like all organizational culture, newcomers must learn them if they want to fully participate.
I illustrate how cultural and organizational expertise is enacted around algorithmic agents by discussing two autoethnographic vignettes, which relate my personal experience as a veteran in Wikipedia.
I present thick descriptions of how governance and gatekeeping practices are articulated through and in alignment with these automated infrastructures.
Over the past 15 years, Wikipedian veterans and administrators have made specific decisions to support administrative and editorial workflows with automation in particular ways and not others.
I use these cases of Wikipedia's bot-supported bureaucracy to discuss several issues in the fields of critical algorithms studies, critical data studies, and fairness, accountability, and transparency in machine learning -- most principally arguing that scholarship and practice must go beyond trying to "open up the black box" of such systems and also examine sociocultural processes like newcomer socialization.
If an interarea oscillatory mode has insufficient damping, generator redispatch can be used to improve its damping.
We explain and apply a new analytic formula for the modal sensitivity to rank the best pairs of generators to redispatch.
The formula requires some dynamic power system data and we show how to obtain that data from synchrophasor measurements.
The application of the formula to damp interarea modes is explained and illustrated with interarea modes of the New England 10-machine power system.
In this paper we propose a novel texture descriptor called Fractal Weighted Local Binary Pattern (FWLBP).
The fractal dimension (FD) measure is relatively invariant to scale changes and correlates well with human perception of surface roughness.
We have utilized this property to construct a scale-invariant descriptor.
Here, the input image is sampled using an augmented form of the local binary pattern (LBP) over three different radii, and an indexing operation then assigns FD weights to the collected samples.
The final histogram of the descriptor has its features calculated using LBP, and its weights computed from the FD image.
The proposed descriptor is scale invariant, robust to rotation and reflection, and partially tolerant to noise and illumination changes.
In addition, the local fractal dimension is relatively insensitive to bi-Lipschitz transformations, whereas its extension can precisely discriminate fundamental texture primitives.
Experimental results on standard texture databases show that the proposed descriptor achieves better classification rates than state-of-the-art descriptors.
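The basic local binary pattern that FWLBP augments can be sketched as follows; this is the standard single-radius 8-neighbour LBP only, without the multi-radius sampling and fractal-dimension weighting the paper adds.

```python
def lbp_code(img, r, c):
    """Standard 8-neighbour LBP code at pixel (r, c): each neighbour at least
    as bright as the centre contributes one bit of the 8-bit code."""
    centre = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code
```

A flat patch yields the all-ones code 255, while a bright centre surrounded by darker pixels yields 0; texture descriptors are then built from histograms of these codes.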
Datacenter applications demand both low latency and high throughput; while interactive applications (e.g., Web Search) demand low tail latency for their short messages due to their partition-aggregate software architecture, many data-intensive applications (e.g., Map-Reduce) require high throughput for long flows as they move vast amounts of data across the network.
Recent proposals improve latency of short flows and throughput of long flows by addressing the shortcomings of existing packet scheduling and congestion control algorithms, respectively.
We make the key observation that long tails in the Flow Completion Times (FCT) of short flows result from packets that suffer congestion at more than one switch along their paths in the network.
Our proposal, Slytherin, specifically targets packets that suffered from congestion at multiple points and prioritizes them in the network.
Slytherin leverages the ECN mechanism, which is widely used in existing datacenters, to identify such tail packets and dynamically prioritizes them using existing priority queues.
As compared to existing state-of-the-art packet scheduling proposals, Slytherin achieves 18.6% lower 99th percentile flow completion times for short flows without any loss of throughput.
Further, Slytherin drastically reduces 99th percentile queue length in switches by a factor of about 2x on average.
Nowadays, the ubiquity of various sensors enables the collection of voluminous datasets of car trajectories.
Such datasets enable analysts to make sense of driving patterns and behaviors: in order to understand the behavior of drivers, one approach is to break a trajectory into its underlying patterns and then analyze that trajectory in terms of derived patterns.
The process of trajectory segmentation is a function of various resources including a set of ground truth trajectories with their driving patterns.
To the best of our knowledge, no such ground-truth dataset exists in the literature.
In this paper, we describe a trajectory annotation framework and report our results to annotate a dataset of personal car trajectories.
Our annotation methodology consists of a crowd-sourcing task followed by a precise process of aggregation.
Our annotation process consists of two granularity levels, one to specify the annotation (segment border) and the other one to describe the type of the segment (e.g. speed-up, turn, merge, etc.).
The output of our project, Dataset of Annotated Car Trajectories (DACT), is available online at https://figshare.com/articles/dact_dataset_of_annotated_car_trajectories/5005289 .
Novelty attracts attention much as popularity does; hence predicting novelty is as important as predicting popularity.
Novelty is a side effect of competition and aging in evolving systems.
Recent behavior, or recent link gain in networks, plays an important role in the emergence of trends.
We exploit this insight in two models for different scenarios and systems: in the first, recent behavior dominates total behavior (total link gain); in the second, recent behavior is as important as total behavior for future link gain.
We suppose that a random walker moves on the network and can jump to any node; the probability of jumping or connecting to a node depends on which nodes have recently been more active or have received more links.
In our model, the walker can also jump to a node that is popular overall but not recently popular.
We are thereby able to predict rising novelties, i.e., popular nodes that are generally suppressed under the preferential attachment effect.
To evaluate our models, we conducted experiments on four real data sets: MovieLens, Netflix, Facebook, and the Arxiv High Energy Physics paper citation network.
We used four information retrieval indices, namely Precision, Novelty, Area Under the Receiver Operating Characteristic curve (AUC), and Kendall's rank correlation coefficient, and validated our proposed models against four benchmark models.
Although our model does not perform better in all cases, it is theoretically significant in that it works better for systems dominated by recent behavior.
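The recency-weighted attachment rule underlying such models can be sketched as follows; the mixing weight alpha, the +1 smoothing term, and the node names are illustrative assumptions, not the paper's exact formulation.

```python
import random

def choose_target(total_links, recent_links, alpha, rng):
    """Pick the next node to link to, weighting recent link gain (by alpha)
    against total link gain (by 1 - alpha); +1 smooths zero-degree nodes."""
    nodes = sorted(total_links)
    weights = [alpha * recent_links.get(n, 0)
               + (1 - alpha) * total_links[n] + 1 for n in nodes]
    return rng.choices(nodes, weights=weights)[0]

# a rising node with many recent links can beat an established node
total = {"old_star": 100, "rising": 2}
recent = {"old_star": 0, "rising": 60}
rng = random.Random(42)
picks = [choose_target(total, recent, 0.9, rng) for _ in range(2000)]
```

With alpha close to 1, the recently active node dominates the draw even though pure preferential attachment (alpha = 0) would overwhelmingly favour the high-degree node.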
Thanks to advances in low-cost digital camera technology and the popularity of the self-recording culture, the amount of visual data on the Internet far outpaces users' available time and patience.
Thus, most of the uploaded videos are doomed to be forgotten and unwatched in a computer folder or website.
In this work, we address the problem of creating smooth fast-forward videos without losing the relevant content.
We present a new adaptive frame selection formulated as a weighted minimum reconstruction problem, which combined with a smoothing frame transition method accelerates first-person videos emphasizing the relevant segments and avoids visual discontinuities.
The experiments show that our method is able to fast-forward videos to retain as much relevant information and smoothness as the state-of-the-art techniques in less time.
We also present a new 80-hour multimodal (RGB-D, IMU, and GPS) dataset of first-person videos with annotations for recorder profile, frame scene, activities, interaction, and attention.
Navigating safely in urban environments remains a challenging problem for autonomous vehicles.
Occlusion and limited sensor range can pose significant challenges to safely navigate among pedestrians and other vehicles in the environment.
Enabling vehicles to quantify the risk posed by unseen regions allows them to anticipate future possibilities, resulting in increased safety and ride comfort.
This paper proposes an algorithm that takes advantage of the known road layouts to forecast, quantify, and aggregate risk associated with occlusions and limited sensor range.
This allows us to make predictions of risk induced by unobserved vehicles even in heavily occluded urban environments.
The risk can then be used either by a low-level planning algorithm to generate better trajectories, or by a high-level one to plan a better route.
The proposed algorithm is evaluated on intersection layouts from real-world map data with up to five other vehicles in the scene, and is verified to reduce collision rates by 4.8x compared to a baseline method while improving driving comfort.
While imitation learning is becoming common practice in robotics, this approach often suffers from data mismatch and compounding errors.
DAgger is an iterative algorithm that addresses these issues by continually aggregating training data from both the expert and novice policies, but does not consider the impact of safety.
We present a probabilistic extension to DAgger, which uses the distribution over actions provided by the novice policy, for a given observation.
Our method, which we call DropoutDAgger, uses dropout to train the novice as a Bayesian neural network that provides insight to its confidence.
Using the distribution over the novice's actions, we estimate a probabilistic measure of safety with respect to the expert action, tuned to balance exploration and exploitation.
The utility of this approach is evaluated on the MuJoCo HalfCheetah and in a simple driving experiment, demonstrating improved performance and safety compared to other DAgger variants and classic imitation learning.
Collaborative filtering (CF) is a powerful recommender system that generates a list of recommended items for an active user based on the ratings of similar users.
This paper presents a novel approach to CF by first finding the set of users similar to the active user by adopting self-organizing maps (SOM), followed by k-means clustering.
Then, the ratings for each item in the cluster closest to the active user are mapped to the frequency domain using the Discrete Fourier Transform (DFT).
The power spectra of the mapped ratings are generated, and a new similarity measure based on the coherence of these power spectra is calculated.
The proposed similarity measure is more time efficient than current state-of-the-art measures.
Moreover, it can capture the global similarity between the profiles of users.
Experimental results show that the proposed approach overcomes the major problems in existing CF algorithms as follows: First, it mitigates the scalability problem by creating clusters of similar users and applying the time-efficient similarity measure.
Second, its frequency-based similarity measure is less sensitive to sparsity problems because the DFT performs efficiently even with sparse data.
Third, it outperforms standard similarity measures in terms of accuracy.
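The frequency-domain similarity idea can be illustrated with a direct DFT; the cosine similarity between power spectra below is a simplified stand-in for the paper's coherence-based measure, and the rating profiles are invented.

```python
import cmath

def power_spectrum(ratings):
    """Magnitude-squared DFT of a user's rating profile."""
    n = len(ratings)
    return [abs(sum(r * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, r in enumerate(ratings))) ** 2 for k in range(n)]

def spectral_similarity(a, b):
    """Cosine similarity between two power spectra; since spectra are
    nonnegative, the result lies in [0, 1]."""
    pa, pb = power_spectrum(a), power_spectrum(b)
    dot = sum(x * y for x, y in zip(pa, pb))
    return dot / ((sum(x * x for x in pa) ** 0.5)
                  * (sum(y * y for y in pb) ** 0.5))
```

One property worth noting is that the power spectrum discards phase, so profiles that differ only by a shift of their rating pattern map to identical spectra, which is one way such a measure captures global rather than item-by-item similarity.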
Consider the problem in which n jobs that are classified into k types are to be scheduled on m identical machines without preemption.
A machine requires a proper setup taking s time units before processing jobs of a given type.
The objective is to minimize the makespan of the resulting schedule.
We design and analyze an approximation algorithm that runs in time polynomial in n, m and k and computes a solution with an approximation factor that can be made arbitrarily close to 3/2.
Colours are everywhere.
They embody a significant part of human visual perception.
In this paper, we explore the paradigm of hallucinating colours from a given gray-scale image.
The colourization problem has been addressed in previous literature, but mostly in a supervised manner involving user intervention.
With the emergence of Deep Learning methods, numerous computer vision and pattern recognition tasks have been automated and carried out in an end-to-end fashion, thanks to the availability of large datasets and high-power computing systems.
We investigate and build upon the recent success of Conditional Generative Adversarial Networks (cGANs) for Image-to-Image translations.
In addition to using the training scheme in the basic cGAN, we propose an encoder-decoder generator network which utilizes the class-specific cross-entropy loss as well as the perceptual loss in addition to the original objective function of cGAN.
We train our model on a large-scale dataset and present illustrative qualitative and quantitative analysis of our results.
Our results vividly display the versatility and proficiency of our methods through life-like colourization outcomes.
Over the past decade, contextual bandit algorithms have been gaining in popularity due to their effectiveness and flexibility in solving sequential decision problems---from online advertising and finance to clinical trial design and personalized medicine.
At the same time, there are as yet surprisingly few options that enable researchers and practitioners to simulate and compare the wealth of new and existing bandit algorithms in a standardized way.
To help close this gap between analytical research and empirical evaluation, the current paper introduces the object-oriented R package "contextual": a user-friendly and, through its object-oriented structure, easily extensible framework that facilitates parallelized comparison of contextual and context-free bandit policies through both simulation and offline analysis.
We consider the general problem of matching a subspace to a signal in R^N that has been observed indirectly (compressed) through a random projection.
We are interested in the case where the collection of K-dimensional subspaces is continuously parameterized, i.e. naturally indexed by an interval from the real line, or more generally a region of R^D.
Our main results show that if the dimension of the random projection is on the order of K times a geometrical constant that describes the complexity of the collection, then the match obtained from the compressed observation is nearly as good as one obtained from a full observation of the signal.
We give multiple concrete examples of collections of subspaces for which this geometrical constant can be estimated, and discuss the relevance of the results to the general problems of template matching and source localization.
We address the problem of 3D human pose estimation from 2D input images using only weakly supervised training data.
Despite showing considerable success for 2D pose estimation, the application of supervised machine learning to 3D pose estimation in real world images is currently hampered by the lack of varied training images with corresponding 3D poses.
Most existing 3D pose estimation algorithms train on data that has either been collected in carefully controlled studio settings or has been generated synthetically.
Instead, we take a different approach, and propose a 3D human pose estimation algorithm that only requires relative estimates of depth at training time.
Such a training signal, although noisy, can be easily collected from crowd annotators and is of sufficient quality to enable successful training and evaluation of 3D pose algorithms.
Our results are competitive with fully supervised regression based approaches on the Human3.6M dataset, despite using significantly weaker training data.
Our proposed algorithm opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses.
The pedagogy of teaching and learning has changed with the proliferation of communication technology, and it is necessary to develop interactive learning materials for children that may improve their learning, comprehension, and memorization capabilities.
Perhaps, one of the most important innovations in the age of technology is multimedia and its application.
It is imperative to create high quality and realistic learning environment for children.
Interactive learning materials can be easier for children to understand and engage with during their first learning experiences.
We developed some interactive learning materials in the form of a video for Playgroup using multimedia application tools.
This study investigated the impact of interactive learning materials on students' ability to acquire new knowledge or skills.
We visited a kindergarten (nursery school) and interviewed class teachers about their teaching methods and their students' ability to recognize English alphabets, pictures, etc.
The course teachers were provided with interactive learning materials to show to their playgroups over a number of sessions.
The video included English alphabets with related words and pictures, and motivational fun.
We noticed that almost all children were very interested in interacting with their learning video.
The students were assessed individually and asked to recognize the alphabets, and pictures.
The students adapted to their first alphabets very quickly.
However, there were individual differences in their cognitive development.
This interactive multimedia can be an alternative to traditional pedagogy for teaching playgroups.
The Sentinel-2 satellite mission delivers multi-spectral imagery with 13 spectral bands, acquired at three different spatial resolutions.
The aim of this research is to super-resolve the lower-resolution (20 m and 60 m Ground Sampling Distance - GSD) bands to 10 m GSD, so as to obtain a complete data cube at the maximal sensor resolution.
We employ a state-of-the-art convolutional neural network (CNN) to perform end-to-end upsampling, trained with data at lower resolution, i.e., from 40 m to 20 m and from 360 m to 60 m GSD, respectively.
In this way, one has access to a virtually infinite amount of training data, by downsampling real Sentinel-2 images.
We use data sampled globally over a wide range of geographical locations, to obtain a network that generalises across different climate zones and land-cover types, and can super-resolve arbitrary Sentinel-2 images without the need of retraining.
In quantitative evaluations (at lower scale, where ground truth is available), our network, which we call DSen2, outperforms the best competing approach by almost 50% in RMSE, while better preserving the spectral characteristics.
It also delivers visually convincing results at the full 10 m GSD.
The code is available at https://github.com/lanha/DSen2.
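The self-supervised training-data construction described above — downsampling real Sentinel-2 bands so the network learns at a shifted scale and is applied one scale up — can be sketched as follows. This is a minimal illustration using plain block-averaging; the actual degradation model and preprocessing in DSen2 are more involved.

```python
import numpy as np

def average_pool(band, factor):
    """Downsample a 2-D band by block-averaging (a simple stand-in for the
    sensor-aware downsampling used in practice)."""
    h, w = band.shape
    h2, w2 = h // factor, w // factor
    return band[:h2 * factor, :w2 * factor].reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def make_training_pair(band_10m, band_20m, factor=2):
    """Create a synthetic training pair by shifting both bands one scale down:
    the network is trained to map 40 m -> 20 m, then applied at 20 m -> 10 m."""
    target = band_20m                               # ground truth at 20 m GSD
    low_res_input = average_pool(band_20m, factor)  # simulated 40 m GSD input
    guide = average_pool(band_10m, factor)          # 10 m band brought to 20 m
    return low_res_input, guide, target
```

In this way the downsampled 20 m band plays the role of the low-resolution input, while the original 20 m band serves as ground truth, giving a virtually unlimited supply of training pairs.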
An important way to resolve games of conflict (snowdrift, hawk-dove, chicken) involves adopting a convention: a correlated equilibrium that avoids any conflict between aggressive strategies.
Dynamic networks allow individuals to resolve conflict via their network connections rather than changing their strategy.
Exploring how behavioral strategies coevolve with social networks reveals new dynamics that can help explain the origins and robustness of conventions.
Here we model the emergence of conventions as correlated equilibria in dynamic networks.
Our results show that networks have the tendency to break the symmetry between the two conventional solutions in a strongly biased way.
Rather than the correlated equilibrium associated with ownership norms (play aggressive at home, not away), we usually see the opposite host-guest norm (play aggressive away, not at home) evolve on dynamic networks, a phenomenon common to human interaction.
We also show that learning to avoid conflict can produce realistic network structures in a way different than preferential attachment models.
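As a concrete check of the convention idea, the following sketch (with illustrative payoff values, not taken from the paper) verifies that both role-conditioned conventions — the ownership norm and the host-guest norm — are correlated equilibria of the hawk-dove game when the cost of fighting exceeds the resource value:

```python
# Illustrative payoffs for the hawk-dove (snowdrift) game; V is the resource
# value, C > V the cost of a fight. Entries give the row player's payoff.
V, C = 2.0, 6.0
payoff = {('H', 'H'): (V - C) / 2, ('H', 'D'): V,
          ('D', 'H'): 0.0,         ('D', 'D'): V / 2}

def is_convention_equilibrium(owner_move, guest_move):
    """A role-conditioned convention is a correlated equilibrium iff neither
    role can gain by unilaterally deviating from its prescribed move."""
    owner_best = max(payoff[(m, guest_move)] for m in 'HD')
    guest_best = max(payoff[(m, owner_move)] for m in 'HD')
    return (payoff[(owner_move, guest_move)] >= owner_best and
            payoff[(guest_move, owner_move)] >= guest_best)
```

Both `is_convention_equilibrium('H', 'D')` (aggressive at home) and `is_convention_equilibrium('D', 'H')` (aggressive away) hold, which is why the symmetry-breaking direction observed on dynamic networks is the interesting question.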
The motor control problem involves determining the time-varying muscle activation trajectories required to accomplish a given movement.
Muscle redundancy makes motor control a challenging task: there are many possible activation trajectories that accomplish the same movement.
Despite this redundancy, most movements are accomplished in highly stereotypical ways.
For example, point-to-point reaching movements are almost universally performed with very similar smooth trajectories.
Optimization methods are commonly used to predict muscle forces for measured movements.
However, these approaches require computationally expensive simulations and are sensitive to the chosen optimality criteria and regularization.
In this work, we investigate deep autoencoders for the prediction of muscle activation trajectories for point-to-point reaching movements.
We evaluate our DNN predictions with simulated reaches and two methods to generate the muscle activations: inverse dynamics (ID) and optimal control (OC) criteria.
We also investigate optimal network parameters and training criteria to improve the accuracy of the predictions.
Non-frontal lip views contain useful information which can be used to enhance the performance of frontal view lipreading.
However, the vast majority of recent lipreading works, including the deep learning approaches which significantly outperform traditional approaches, have focused on frontal mouth images.
As a consequence, research on joint learning of visual features and speech classification from multiple views is limited.
In this work, we present an end-to-end multi-view lipreading system based on Bidirectional Long-Short Memory (BLSTM) networks.
To the best of our knowledge, this is the first model which simultaneously learns to extract features directly from the pixels and performs visual speech classification from multiple views and also achieves state-of-the-art performance.
The model consists of multiple identical streams, one for each view, which extract features directly from different poses of mouth images.
The temporal dynamics in each stream/view are modelled by a BLSTM and the fusion of multiple streams/views takes place via another BLSTM.
An absolute average improvement of 3% and 3.8% over the frontal view performance is reported on the OuluVS2 database when the best two (frontal and profile) and three views (frontal, profile, 45) are combined, respectively.
The best three-view model results in a 10.5% absolute improvement over the current multi-view state-of-the-art performance on OuluVS2, without using external databases for training, achieving a maximum classification accuracy of 96.9%.
Pinterest is an image-based online social network which was launched in 2010 and has gained a lot of traction ever since.
Within 3 years, Pinterest attained 48.7 million unique users.
This stupendous growth makes it interesting to study Pinterest, and gives rise to multiple questions about its users and content.
We characterized Pinterest on the basis of large scale crawls of 3.3 million user profiles, and 58.8 million pins.
In particular, we explored various attributes of users, pins, boards, pin sources, and user locations, in detail and performed topical analysis of user generated textual content.
The characterization revealed most prominent topics among users and pins, top image sources, and geographical distribution of users on Pinterest.
We then investigated this social network from a privacy and security standpoint, and found traces of malware in the form of pin sources.
Instances of Personally Identifiable Information (PII) leakage were also discovered in the form of phone numbers, BBM (Blackberry Messenger) pins, and email addresses.
Further, our analysis demonstrated how Pinterest is a potential venue for copyright infringement, by showing that almost half of the images shared on Pinterest go uncredited.
To the best of our knowledge, this is the first attempt to characterize Pinterest at such a large scale.
We completely determine the complexity status of the dominating set problem for hereditary graph classes defined by forbidden induced subgraphs with at most five vertices.
Preventing traffic congestion by forecasting near time traffic flows is an important problem as it leads to effective use of transport resources.
Social networks provide information about human activities and social events.
Thus, with the help of a social network, we can extract which users will attend a particular event (in the near future) and estimate the traffic flow based on it.
This opens up a wide area of research and calls for a traffic-management framework that can capture essential parameters of real-life behaviour and provide a way to iterate upon and evaluate new ideas.
In this paper, we present building blocks of a framework and a system to simulate a city with its transport system, humans and their social network.
We emphasize the selection of relevant parameters and the modular design of the framework.
Our framework defines metrics to evaluate congestion avoidance strategies.
To show the utility of the framework, we present experimental studies of a few strategies on a public transport system.
The journal impact factor (IF), a gauge of the influence and impact of a particular journal relative to other journals in the same research area, reports the mean number of citations to the articles published in that journal.
Although IF attracts more attention and is used more frequently than other measures, it has been subject to criticisms that outweigh its advantages.
Critically, extensive use of IF may distort editorial and researcher behaviour, which could compromise the quality of scientific articles.
Therefore, it is timely and important to develop new journal ranking techniques beyond the journal impact factor.
What are the key-features that enable an information diffusion model to explain the inherent dynamic, and often competitive, nature of real-world propagation phenomena?
In this paper we aim to answer this question by proposing a novel class of diffusion models, inspired by the classic Linear Threshold model, and built around the following aspects: trust/distrust in the user relationships, which is leveraged to model different effects of social influence on the decisions taken by an individual; changes in adopting one or alternative information items; hesitation towards adopting an information item over time; latency in the propagation; time horizon for the unfolding of the diffusion process; and multiple cascades of information that might occur competitively.
To the best of our knowledge, the above aspects have never been unified into the same LT-based diffusion model.
We also define different strategies for the selection of the initial influencers to simulate non-competitive and competitive diffusion scenarios, particularly related to the problem of limitation of misinformation spread.
Results on publicly available networks have shown the meaningfulness and uniqueness of our models.
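For reference, the classic Linear Threshold model that the proposed class of diffusion models extends can be sketched as a minimal synchronous implementation: a node activates once the summed weight of its active in-neighbors reaches its threshold.

```python
def linear_threshold(neighbors, weights, thresholds, seeds, max_steps=100):
    """Classic Linear Threshold diffusion.
    neighbors[v]  : list of in-neighbors of node v
    weights[(u,v)]: influence weight of edge u -> v
    thresholds[v] : activation threshold of node v
    seeds         : initially active nodes
    Returns the final set of active nodes."""
    active = set(seeds)
    for _ in range(max_steps):
        newly = set()
        for v in neighbors:
            if v in active:
                continue
            influence = sum(weights[(u, v)] for u in neighbors[v] if u in active)
            if influence >= thresholds[v]:
                newly.add(v)
        if not newly:       # diffusion has stabilized
            break
        active |= newly
    return active
```

The aspects listed above (trust/distrust, hesitation, latency, time horizon, competing cascades) are extensions layered on top of this basic activation rule.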
We propose a novel system which can transform a recipe into any selected regional style (e.g., Japanese, Mediterranean, or Italian).
This system has two characteristics.
First, the system can identify the degree of regional cuisine style mixture of any selected recipe and visualize such mixtures using barycentric Newton diagrams.
Second, the system can suggest ingredient substitutions through an extended word2vec model, such that a recipe becomes more authentic for any selected regional cuisine style.
Drawing on a large number of recipes from Yummly, an example shows how the proposed system can transform a traditional Japanese recipe, Sukiyaki, into French style.
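The word2vec-style substitution step can be illustrated with an analogy query over ingredient embeddings. The embeddings and vocabulary below are invented toy values for illustration only; the paper's extended word2vec model is trained on real recipe data.

```python
import numpy as np

# Toy embeddings standing in for a trained (extended) word2vec model.
emb = {
    'soy_sauce': np.array([1.0, 0.9, 0.0]),
    'mirin':     np.array([0.9, 1.0, 0.1]),
    'red_wine':  np.array([0.1, 0.2, 1.0]),
    'butter':    np.array([0.0, 0.3, 0.9]),
    'japanese':  np.array([1.0, 1.0, 0.0]),
    'french':    np.array([0.0, 0.2, 1.0]),
}

def substitute(ingredient, src_style, dst_style):
    """Suggest a substitute via the analogy
    ingredient - src_style + dst_style, scored by cosine similarity."""
    query = emb[ingredient] - emb[src_style] + emb[dst_style]
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    candidates = [w for w in emb if w not in (ingredient, src_style, dst_style)]
    return max(candidates, key=lambda w: cos(emb[w], query))
```

With these toy vectors, `substitute('soy_sauce', 'japanese', 'french')` returns a French-leaning ingredient, mirroring how the Sukiyaki example swaps Japanese staples for French ones.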
Despite initiatives to improve the quality of scientific codes, there still is a large presence of legacy code.
Such code often needs to implement a lot of functionality under time constraints, sacrificing quality.
Additionally, quality is rarely improved by optimizations for new architectures.
This development model leads to code that is increasingly difficult to work with.
Our suggested solution includes complexity-reducing refactoring and hardware abstraction.
We focus on the AIREBO potential from LAMMPS, where the challenge is that any potential kernel is rather large and complex, hindering systematic optimization.
This issue is common to codes that model multiple physical phenomena.
We present our journey from the C++ port of a previous Fortran code to performance-portable, KNC-hybrid, vectorized, scalable, optimized code supporting full and reduced precision.
The journey includes extensive testing that fixed bugs in the original code.
Large-scale, full-precision runs sustain speedups of more than 4x (KNL) and 3x (Skylake).
Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty.
We propose two sets of satisficing objectives for the multi-armed bandit problem, where the objective is to achieve reward-based decision-making performance above a given threshold.
We show that these new problems are equivalent to various standard multi-armed bandit problems with maximizing objectives and use the equivalence to find bounds on performance.
The different objectives can result in qualitatively different behavior; for example, agents explore their options continually in one case and only a finite number of times in another.
For the case of Gaussian rewards we show an additional equivalence between the two sets of satisficing objectives that allows algorithms developed for one set to be applied to the other.
We then develop variants of the Upper Credible Limit (UCL) algorithm that solve the problems with satisficing objectives and show that these modified UCL algorithms achieve efficient satisficing performance.
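The satisficing idea — settling for any arm whose reward is confidently above the threshold rather than hunting for the single best arm — can be sketched with a UCB-style policy. This is a simplified stand-in for the paper's UCL variants; the commit rule below is an assumption made for illustration only.

```python
import math
import random

def satisficing_bandit(pull, n_arms, threshold, horizon, seed=0):
    """Explore via upper confidence bounds, but commit permanently to any arm
    whose lower confidence bound clears the reward threshold."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    committed = None
    total = 0.0
    for t in range(1, horizon + 1):
        if committed is not None:
            arm = committed
        elif t <= n_arms:
            arm = t - 1                      # pull each arm once to initialize
        else:
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t + 1) / counts[a]))
        r = pull(arm, rng)
        total += r
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
        if committed is None:
            radius = math.sqrt(2 * math.log(t + 1) / counts[arm])
            if means[arm] - radius >= threshold:
                committed = arm              # satisficed: stop exploring
    return total / horizon, committed
```

Note the qualitative difference from maximizing: once an arm is known to clear the threshold, exploration stops entirely, matching the finite-exploration behavior mentioned above.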
We prove that in the geometric complexity theory program the vanishing of rectangular Kronecker coefficients cannot be used to prove superpolynomial determinantal complexity lower bounds for the permanent polynomial.
Moreover, we prove the positivity of rectangular Kronecker coefficients for a large class of partitions where the side lengths of the rectangle are at least quadratic in the length of the partition.
We also compare rectangular Kronecker coefficients with their corresponding plethysm coefficients, which leads to a new lower bound for rectangular Kronecker coefficients.
Moreover, we prove that the saturation of the rectangular Kronecker semigroup is trivial, we show that the rectangular Kronecker positivity stretching factor is 2 for a long first row, and we completely classify the positivity of rectangular limit Kronecker coefficients that were introduced by Manivel in 2011.
High-energy physics experiments rely on reconstruction of the trajectories of particles produced at the interaction point.
This is a challenging task, especially in the high track multiplicity environment generated by p-p collisions at the LHC energies.
A typical event includes hundreds of signal examples (interesting decays) and a significant amount of noise (uninteresting examples).
This work describes a modification of the Artificial Retina algorithm for fast track finding: numerical optimization methods were adopted for fast local track search.
This approach allows for considerable reduction of the total computational time per event.
Test results on a simplified simulated model of the LHCb VELO (VErtex LOcator) detector are presented.
This approach is also well-suited for parallel implementations such as GPGPU, which look very attractive in the context of upcoming detector upgrades.
Starting from an unsolved problem of information retrieval this paper presents an ontology-based model for indexing and retrieval.
The model combines the methods and experiences of cognitive-to-interpret indexing languages with the strengths and possibilities of formal knowledge representation.
The core component of the model uses inferences along the paths of typed relations between the entities of a knowledge representation for enabling the determination of hit quantities in the context of retrieval processes.
The entities are arranged in aspect-oriented facets to ensure a consistent hierarchical structure.
The possible consequences for indexing and retrieval are discussed.
In this paper, we report on the practical application of a novel approach for validating the knowledge of WordNet using Adimen-SUMO.
In particular, this paper focuses on cross-checking the WordNet meronymy relations against the knowledge encoded in Adimen-SUMO.
Our validation approach tests a large set of competency questions (CQs), which are derived (semi)-automatically from the knowledge encoded in WordNet, SUMO and their mapping, by applying efficient first-order logic automated theorem provers.
Unfortunately, despite being created manually, these knowledge resources are not free of errors and discrepancies.
In consequence, some of the resulting CQs are not plausible according to the knowledge included in Adimen-SUMO.
Thus, first we focus on (semi)-automatically improving the alignment between these knowledge resources, and second, we perform a minimal set of corrections in the ontology.
Our aim is to minimize the manual effort required for an extensive validation process.
We report on the strategies followed, the changes made, the effort needed and its impact when validating the WordNet meronymy relations using improved versions of the mapping and the ontology.
Based on the new results, we discuss the implications of the appropriate corrections and the need of future enhancements.
Parking sensor networks are rapidly being deployed around the world and are regarded as one of the first implemented urban services in smart cities.
To provide the best network performance, the MAC protocol must be adaptive enough to accommodate the traffic intensity and variation of parking sensors.
In this paper, we study the heavy-tailed parking and vacant time models from SmartSantander, and then apply the traffic model in simulations with four kinds of MAC protocols: contention-based, schedule-based, and two hybrid versions of them.
The results show that the packet interarrival time is no longer heavy-tailed when aggregating over a group of parking sensors, and that choosing an appropriate MAC protocol highly depends on the network configuration.
Also, the information delay is bounded by traffic and MAC parameters, which are important criteria when timely messages are required.
Brandt et al. (2013) have recently disproved a conjecture by Schwartz (1990) by non-constructively showing the existence of a counterexample with about 10^136 alternatives.
We provide a concrete counterexample for Schwartz's conjecture with only 24 alternatives.
This paper addresses the problem of reducing the delivery time of data messages to cellular users using instantly decodable network coding (IDNC) with physical-layer rate awareness.
While most of the existing literature on IDNC does not consider any physical layer complications and abstract the model as equally slotted time for all users, this paper proposes a cross-layer scheme that incorporates the different channel rates of the various users in the decision process of both the transmitted message combinations and the rates with which they are transmitted.
The consideration of asymmetric rates for receivers reflects more practical application scenarios and introduces a new trade-off between the choice of coding combinations for various receivers and the broadcasting rate for achieving shorter completion time.
The completion time minimization problem in such a scenario is first shown to be intractable.
The problem is, thus, approximated by reducing, at each transmission, the increase of an anticipated version of the completion time.
The paper solves the problem by formulating it as a maximum weight clique problem over a newly designed rate aware IDNC (RA-IDNC) graph.
Since the maximum-weight clique in the constructed graph is potentially not unique, the paper further suggests a multi-layer version of the proposed solution to improve on the employed completion time approximation.
Simulation results indicate that the cross-layer design largely outperforms the uncoded transmissions strategies and the classical IDNC scheme.
This paper presents a procedural generation method that creates visually attractive levels for the Angry Birds game.
Besides being an immensely popular mobile game, Angry Birds has recently become a test bed for various artificial intelligence technologies.
We propose a new approach for procedurally generating Angry Birds levels using Chinese style and Japanese style building structures.
An experiment confirms the effectiveness of our approach with statistical significance.
In a labeling scheme the vertices of a given graph from a particular class are assigned short labels such that adjacency can be algorithmically determined from these labels.
A representation of a graph from that class is given by the set of its vertex labels.
Due to the shortness constraint on the labels such schemes provide space-efficient representations for various graph classes, such as planar or interval graphs.
We consider what graph classes cannot be represented by labeling schemes when the algorithm which determines adjacency is subjected to computational constraints.
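A standard example of such a scheme, for interval graphs, stores each vertex's interval endpoints as its label; adjacency is then decoded from the two labels alone, with no access to the rest of the graph:

```python
def interval_labels(intervals):
    """Labeling scheme for interval graphs: each vertex's label is just its
    interval's endpoints (O(log n) bits with integer endpoints)."""
    return {v: (lo, hi) for v, (lo, hi) in intervals.items()}

def adjacent(label_u, label_v):
    """Two vertices are adjacent iff their intervals intersect, i.e.
    neither interval ends before the other begins."""
    (a, b), (c, d) = label_u, label_v
    return a <= d and c <= b
```

The question studied in the paper is which classes admit such a decoder when it must additionally run under computational constraints.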
We consider the problem of answering queries about a sensitive dataset subject to differential privacy.
The queries may be chosen adversarially from a larger set Q of allowable queries in one of three ways, which we list in order from easiest to hardest to answer. Offline: the queries are chosen all at once and the differentially private mechanism answers the queries in a single batch.
Online: The queries are chosen all at once, but the mechanism only receives the queries in a streaming fashion and must answer each query before seeing the next query.
Adaptive: The queries are chosen one at a time and the mechanism must answer each query before the next query is chosen.
In particular, each query may depend on the answers given to previous queries.
Many differentially private mechanisms are just as efficient in the adaptive model as they are in the offline model.
Meanwhile, most lower bounds for differential privacy hold in the offline setting.
This suggests that the three models may be equivalent.
We prove that these models are all, in fact, distinct.
Specifically, we show that there is a family of statistical queries such that exponentially more queries from this family can be answered in the offline model than in the online model.
We also exhibit a family of search queries such that exponentially more queries from this family can be answered in the online model than in the adaptive model.
We also investigate whether such separations might hold for simple queries like threshold queries over the real line.
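A simple baseline that works in all three models is the Laplace mechanism with an evenly split privacy budget, answering each counting query as it arrives (basic composition; this is a textbook baseline, not one of the paper's constructions):

```python
import random

def online_laplace(data, queries, epsilon, seed=0):
    """Answer counting queries one at a time with the Laplace mechanism,
    splitting the epsilon budget evenly across the k queries."""
    rng = random.Random(seed)
    k = len(queries)
    scale = k / epsilon                  # sensitivity 1 per counting query
    answers = []
    for q in queries:
        true_answer = sum(1 for x in data if q(x))
        # A Laplace sample is the difference of two exponential samples.
        noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        answers.append(true_answer + noise)
    return answers
```

The separations established in the paper concern what mechanisms of this kind can and cannot achieve as the query model moves from offline to online to adaptive.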
We present a semantic vector space model for capturing complex polyphonic musical context.
A word2vec model based on a skip-gram representation with negative sampling was used to model slices of music from a dataset of Beethoven's piano sonatas.
A visualization of the reduced vector space using t-distributed stochastic neighbor embedding shows that the resulting embedded vector space captures tonal relationships, even without any explicit information about the musical contents of the slices.
Second, an excerpt of the Moonlight Sonata by Beethoven was altered by replacing slices based on context similarity.
The resulting music shows that the selected slice based on similar word2vec context also has a relatively short tonal distance from the original slice.
Immersive social interactions of mobile users are soon to be enabled within a virtual space, by means of virtual reality (VR) technologies and wireless cellular systems.
In a VR mobile social network, the states of all interacting users should be updated synchronously and with low latency via two-way communications with edge computing servers.
The resulting end-to-end latency depends on the relationship between the virtual and physical locations of the wireless VR users and of the edge servers.
In this work, the problem of analyzing and optimizing the end-to-end latency is investigated for a simple network topology, yielding important insights into the interplay between physical and virtual geometries.
Neural network models have shown promising results for text classification.
However, these solutions are limited by their dependence on the availability of annotated data.
The prospect of leveraging resource-rich languages to enhance the text classification of resource-poor languages is fascinating.
The performance on resource-poor languages can significantly improve if the resource availability constraints can be offset.
To this end, we present a twin Bidirectional Long Short-Term Memory (Bi-LSTM) network with shared parameters consolidated by a contrastive loss function (based on a similarity metric).
The model learns the representation of resource-poor and resource-rich sentences in a common space by using the similarity between their assigned annotation tags.
Hence, the model projects sentences with similar tags closer and those with different tags farther from each other.
We evaluated our model on the classification tasks of sentiment analysis and emoji prediction for resource-poor languages - Hindi and Telugu and resource-rich languages - English and Spanish.
Our model significantly outperforms the state-of-the-art approaches in both the tasks across all metrics.
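The contrastive objective that ties the twin streams together has the usual form, sketched here on fixed sentence encodings (the paper's exact similarity metric may differ): same-tag pairs are pulled together, different-tag pairs are pushed at least a margin apart.

```python
import numpy as np

def contrastive_loss(h1, h2, same_tag, margin=1.0):
    """Contrastive loss over a pair of encodings, e.g. the outputs of the
    two twin Bi-LSTM streams for a sentence pair."""
    d = np.linalg.norm(h1 - h2)          # Euclidean distance in common space
    if same_tag:
        return d ** 2                    # pull similar pairs together
    return max(0.0, margin - d) ** 2     # push dissimilar pairs apart
```

Minimizing this over resource-rich/resource-poor sentence pairs is what projects sentences with similar tags closer and those with different tags farther apart.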
The present work provides a new approach to evolving ligand structures, which represent possible drugs to be docked to the active site of the target protein.
The structure is represented as a tree where each non-empty node represents a functional group.
It is assumed that the active site configuration of the target protein is known with position of the essential residues.
In this paper, the interaction energy of the ligands with the protein target is minimized.
However, the appropriate size of the tree is difficult to determine in advance and differs between active sites.
To overcome this difficulty, a variable tree size configuration is used for designing ligands.
The optimization is done using a novel Neighbourhood Based Genetic Algorithm (NBGA) which uses dynamic neighbourhood topology.
To get variable tree size, a variable-length version of the above algorithm is devised.
To judge the merit of the algorithm, it is initially applied on the well known Travelling Salesman Problem (TSP).
Convolutional neural networks have achieved astonishing results in different application areas.
Various methods which allow us to use these models on mobile and embedded devices have been proposed.
Especially binary neural networks seem to be a promising approach for these devices with low computational power.
However, understanding binary neural networks and training accurate models for practical applications remains a challenge.
In our work, we focus on increasing our understanding of the training process and making it accessible to everyone.
We publish our code and models based on BMXNet for everyone to use.
Within this framework, we systematically evaluated different network architectures and hyperparameters to provide useful insights on how to train a binary neural network.
Further, we present how we improved accuracy by increasing the number of connections in the network.
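The core training trick for binary networks — a binarized forward pass combined with a straight-through gradient estimator — can be sketched as follows (XNOR-style mean-magnitude scaling is assumed here for illustration):

```python
import numpy as np

def binarize_forward(w):
    """Forward pass of weight binarization: keep only the sign, scaled by the
    mean absolute magnitude (XNOR-style)."""
    alpha = np.abs(w).mean()
    return alpha * np.sign(w)

def binarize_backward(w, grad_out, clip=1.0):
    """Straight-through estimator: pass the incoming gradient through
    unchanged where the real-valued weight lies within the clipping range,
    and zero it elsewhere."""
    return grad_out * (np.abs(w) <= clip)
```

The real-valued weights `w` are kept and updated during training; only the forward pass uses the binarized values.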
The rapid developments of Artificial Intelligence in the last decade are influencing Aerospace Engineering to a great extent and research in this context is proliferating.
We share our observations on the recent developments in the area of Spacecraft Guidance Dynamics and Control, giving selected examples on success stories that have been motivated by mission designs.
Our focus is on evolutionary optimisation, tree searches and machine learning, including deep learning and reinforcement learning as the key technologies and drivers for current and future research in the field.
From a high-level perspective, we survey various scenarios for which these approaches have been successfully applied or are under strong scientific investigation.
Whenever possible, we highlight the relations and synergies that can be obtained by combining different techniques, and point towards future domains for which newly emerging artificial intelligence techniques are expected to become game changers.
The fields of neural computation and artificial neural networks have developed much in the last decades.
Most of the works in these fields focus on implementing and/or learning discrete functions or behavior.
However, technical, physical, and also cognitive processes evolve continuously in time.
This cannot be described directly with standard architectures of artificial neural networks such as multi-layer feed-forward perceptrons.
Therefore, in this paper, we will argue that neural networks modeling continuous time are needed explicitly for this purpose, because with them the synthesis and analysis of continuous and possibly periodic processes in time are possible (e.g. for robot behavior) besides computing discrete classification functions (e.g. for logical reasoning).
We will relate possible neural network architectures with (hybrid) automata models that allow to express continuous processes.
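A minimal example of such a continuous-time model is the classic continuous-time recurrent neural network (CTRNN), integrated here with the Euler method; unlike a feed-forward perceptron, its state evolves continuously and can settle into fixed points or periodic orbits.

```python
import numpy as np

def ctrnn_step(y, W, I, tau, dt):
    """One Euler step of a continuous-time recurrent neural network:
    tau * dy/dt = -y + W @ sigma(y) + I, with logistic activation sigma."""
    sigma = 1.0 / (1.0 + np.exp(-y))
    return y + (dt / tau) * (-y + W @ sigma + I)

def simulate(y0, W, I, tau=1.0, dt=0.01, steps=1000):
    """Integrate the network state over steps * dt time units."""
    y = np.array(y0, dtype=float)
    for _ in range(steps):
        y = ctrnn_step(y, W, I, tau, dt)
    return y
```

With zero recurrent weights, a single neuron simply relaxes exponentially toward its input, illustrating the leaky-integrator dynamics underlying these models.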
This paper introduces some foundations of wavelets over Galois fields.
Standard orthogonal finite-field wavelets (FF-Wavelets) including FF-Haar and FF-Daubechies are derived.
Non-orthogonal FF-wavelets such as B-spline over GF(p) are also considered.
A few examples of multiresolution analysis over finite fields are presented, showing how to perform Laplacian pyramid filtering of finite-block-length sequences.
An application of FF-wavelets to design spread-spectrum sequences is presented.
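To illustrate the flavour of a finite-field wavelet, here is a single-level Haar-style analysis/synthesis over GF(p) for an odd prime p, with perfect reconstruction (a simplified sketch, not the paper's exact construction):

```python
def haar_gfp(seq, p):
    """One level of a Haar-style transform over GF(p), p an odd prime:
    averages s = (a+b)/2 and differences d = (a-b)/2, all mod p."""
    inv2 = pow(2, -1, p)  # modular inverse of 2 (Python 3.8+); exists since p is odd
    s = [((a + b) * inv2) % p for a, b in zip(seq[0::2], seq[1::2])]
    d = [((a - b) * inv2) % p for a, b in zip(seq[0::2], seq[1::2])]
    return s, d

def inverse_haar_gfp(s, d, p):
    """Perfect reconstruction: a = s + d, b = s - d (mod p)."""
    seq = []
    for si, di in zip(s, d):
        seq.extend([(si + di) % p, (si - di) % p])
    return seq
```

All arithmetic stays inside the field, so the transform is exactly invertible with no rounding, which is a key attraction of finite-field wavelets.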
Network pruning is widely used for reducing the heavy computational cost of deep models.
A typical pruning algorithm is a three-stage pipeline, i.e., training (a large model), pruning and fine-tuning.
During pruning, according to a certain criterion, redundant weights are pruned and important weights are kept to best preserve the accuracy.
In this work, we make several surprising observations which contradict common beliefs.
For all the six state-of-the-art pruning algorithms we examined, fine-tuning a pruned model only gives comparable or even worse performance than training that model with randomly initialized weights.
For pruning algorithms which assume a predefined target network architecture, one can get rid of the full pipeline and directly train the target network from scratch.
Our observations are consistent for a wide variety of pruning algorithms with multiple network architectures, datasets, and tasks.
Our results have several implications: 1) training a large, over-parameterized model is not necessary to obtain an efficient final model, 2) learned "important" weights of the large model are not necessarily useful for the small pruned model, 3) the pruned architecture itself, rather than a set of inherited "important" weights, is what leads to the efficiency benefit in the final model, which suggests that some pruning algorithms could be seen as performing network architecture search.
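The pruning step of the typical three-stage pipeline is usually magnitude-based; a minimal sketch of that criterion (the paper's point being that the surviving architecture, not the surviving values, is what matters):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest absolute
    value; return the pruned weights and the binary mask."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask, mask
```

Under the paper's observations, retraining `weights * mask` from random initialization with the same mask would match or beat fine-tuning the surviving values.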
Journal of the History of Biology provides a fifty-year long record for examining the evolution of the history of biology as a scholarly discipline.
In this paper, we present a new dataset and preliminary quantitative analysis of the thematic content of JHB from the perspectives of geography, organisms, and thematic fields.
The geographic diversity of authors whose work appears in JHB has increased steadily since 1968, but the geographic coverage of the content of JHB articles remains strongly lopsided toward the United States, United Kingdom, and western Europe and has diversified much less dramatically over time.
The taxonomic diversity of organisms discussed in JHB increased steadily between 1968 and the late 1990s but declined in later years, mirroring broader patterns of diversification previously reported in the biomedical research literature.
Finally, we used a combination of topic modeling and nonlinear dimensionality reduction techniques to develop a model of multi-article fields within JHB.
We found evidence for directional changes in the representation of fields on multiple scales.
The diversity of JHB with regard to the representation of thematic fields has increased overall, with most of that diversification occurring in recent years.
Drawing on the dataset generated in the course of this analysis, as well as web services in the emerging digital history and philosophy of science ecosystem, we have developed an interactive web platform for exploring the content of JHB, and we provide a brief overview of the platform in this article.
As a whole, the data and analyses presented here provide a starting-place for further critical reflection on the evolution of the history of biology over the past half-century.
Recently, deep learning has been playing a central role in machine learning research and applications.
Since AlexNet, increasingly more advanced networks have achieved state-of-the-art performance in computer vision, speech recognition, language processing, game playing, medical imaging, and so on.
In our previous studies, we proposed quadratic/second-order neurons and deep quadratic neural networks.
In a quadratic neuron, the inner product of a vector of data and the corresponding weights in a conventional neuron is replaced with a quadratic function.
The resultant second-order neuron enjoys an enhanced expressive capability over the conventional neuron.
However, how quadratic neurons improve the expressive capability of a deep quadratic network has not been studied so far, particularly in relation to that of a conventional neural network.
In this paper, we ask three basic questions regarding the expressive capability of a quadratic network: (1) for the one-hidden-layer network structure, is there any function that a quadratic network can approximate much more efficiently than a conventional network?
(2) for the same multi-layer network structure, is there any function that can be expressed by a quadratic network but cannot be expressed with conventional neurons in the same structure?
(3) Does a quadratic network give a new insight into universal approximation?
Our main contributions are the three theorems shedding light upon these three questions and demonstrating the merits of a quadratic network in terms of expressive efficiency, unique capability, and compact architecture respectively.
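As a concrete illustration of the replacement described above, the following sketch contrasts a conventional neuron with one common form of a second-order neuron, in which the single inner product is replaced by a product of two linear terms plus a power term. The exact parameterization here is an assumption for illustration; the cited papers define the precise variants.

```python
import numpy as np

def conventional_neuron(x, w, b):
    # Standard neuron: inner product of inputs and weights, plus bias.
    return np.tanh(w @ x + b)

def quadratic_neuron(x, wr, wg, wb, br, bg, c):
    # One possible form of a second-order neuron: the inner product is
    # replaced by a quadratic function of the inputs -- a product of two
    # linear terms plus a term in the squared inputs.
    return np.tanh((wr @ x + br) * (wg @ x + bg) + wb @ (x * x) + c)

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
print(conventional_neuron(x, rng.standard_normal(3), 0.1))
print(quadratic_neuron(x, rng.standard_normal(3), rng.standard_normal(3),
                       rng.standard_normal(3), 0.0, 0.0, 0.0))
```

Both neurons map the same input vector to a scalar activation; the quadratic one simply has more parameters per neuron and can represent, for instance, radially symmetric decision boundaries that a single conventional neuron cannot.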
Since the complexity of the heterogeneous minimum spanning forest problem has not been determined, we reduce 3-SAT, which is NP-complete, to the 2-heterogeneous minimum spanning forest problem to prove that the latter is NP-hard, and extend this result to the general problem, thereby settling its complexity.
It provides a theoretical basis for the future designing of approximation algorithms for the problem.
In online internet advertising, machine learning models are widely used to compute the likelihood of a user engaging with product related advertisements.
However, the performance of traditional machine learning models is often impacted due to variations in user and advertiser behavior.
For example, search engine traffic for florists usually tends to peak around Valentine's day, Mother's day, etc.
To overcome this challenge, in this manuscript we propose three models which are able to incorporate the effects arising due to variations in product demand.
The proposed models are a combination of product demand features, specialized data sampling methodologies and ensemble techniques.
We demonstrate the performance of our proposed models on datasets obtained from a real-world setting.
Our results show that the proposed models more accurately predict the outcome of users' interactions with product related advertisements while simultaneously being robust to fluctuations in user and advertiser behaviors.
Coaching technology, wearables and exergames can provide quantitative feedback based on measured activity, but there is little evidence of qualitative feedback to aid technique improvement.
To achieve personalised qualitative feedback, we demonstrated a proof-of-concept prototype combining kinesiology and computational intelligence that could help improve tennis swing technique, utilising three-dimensional tennis motion data acquired from multi-camera video.
Expert data labelling relied on virtual 3D stick figure replay.
Diverse assessment criteria for novice to intermediate skill levels and configurable coaching scenarios were matched with a variety of tennis swings (22 backhands and 21 forehands), including good technique and common errors.
A set of selected coaching rules was transferred to adaptive assessment modules able to learn from data, evolve their internal structures and produce autonomous personalised feedback including verbal cues over virtual camera 3D replay and an end-of-session progress report.
The prototype demonstrated autonomous assessment on future data based on learning from prior examples, aligned with skill level, flexible coaching scenarios and coaching rules.
The generated intuitive diagnostic feedback consisted of elements of safety and performance for tennis swing technique, where each swing sample was compared with the expert.
For safety aspects of the relative swing width, the prototype showed improved assessment ...
Within the framework of linear vector Gaussian channels with arbitrary signaling, closed-form expressions for the Jacobian of the minimum mean square error and Fisher information matrices with respect to arbitrary parameters of the system are calculated in this paper.
Capitalizing on prior research where the minimum mean square error and Fisher information matrices were linked to information-theoretic quantities through differentiation, closed-form expressions for the Hessian of the mutual information and the differential entropy are derived.
These expressions are then used to assess the concavity properties of mutual information and differential entropy under different channel conditions and also to derive a multivariate version of the entropy power inequality due to Costa.
In this paper, an underlay cognitive radio (CR) multicast network, consisting of a cognitive base station (CBS) and multiple multicast groups of secondary users (SUs), is considered.
All SUs, belonging to a particular multicast group, are served by the CBS using a common primary user (PU) channel.
The goal is to maximize the energy efficiency (EE) of the system, through dynamic adaptation of target rate and transmit power for each multicast group, under the PUs' individual interference constraints.
The optimization problem formulated for this is proved to be non quasi-concave with respect to the joint variation of the CBS's transmit power and target rate.
An efficient iterative algorithm for EE maximization is proposed along with its complexity analysis.
Simulation results illustrate the performance gain of our proposed scheme.
In this paper, we study a large-scale distributed coordination problem and propose efficient adaptive strategies to solve the problem.
The basic problem is to allocate a finite number of resources to individual agents such that there is as little congestion as possible and the fraction of unutilized resources is reduced as far as possible.
In the absence of a central planner and global information, agents can employ adaptive strategies that use only finite knowledge about their competitors.
In this paper, we show that a combination of finite information sets and reinforcement learning can increase the utilization rate of resources substantially.
The authors introduce a new vision for providing computing services for connected devices.
It is based on the key concept that future computing resources will be coupled with communication resources, for enhancing user experience of the connected users, and also for optimising resources in the providers' infrastructures.
Such coupling is achieved by Joint/Cooperative resource allocation algorithms, by integrating computing and communication services and by integrating hardware in networks.
This type of computing, in which computing services are delivered not independently but in dependence on networking services, is named Aqua Computing.
The authors see Aqua Computing as a novel approach for delivering computing resources to end devices, where the computing power of devices is enhanced automatically once they are connected to an Aqua Computing enabled network.
The process of resource coupling is named computation dissolving.
Then, an Aqua Computing architecture is proposed for mobile edge networks, in which computing and wireless networking resources are allocated jointly or cooperatively by a Mobile Cloud Controller, for the benefit of the end-users and/or for the benefit of the service providers.
Finally, a working prototype of the system is shown and the gathered results show the performance of the Aqua Computing prototype.
Drawing from research on computational models of argumentation (particularly the Carneades Argumentation System), we explore the graphical representation of arguments in a dispute; then, comparing two different traditions on the limits of the justification of decisions, and devising an intermediate, semi-formal, model, we also show that it can shed light on the theory of dispute resolution.
We conclude our paper with an observation on the usefulness of highly constrained reasoning for Online Dispute Resolution systems.
Restricting the search space of arguments exclusively to reasons proposed by the parties (vetoing the introduction of new arguments by the human or artificial arbitrator) is the only way to introduce some kind of decidability -- together with foreseeability -- in the argumentation system.
Local misalignment caused by global homography is a common issue in the image stitching task.
Content-Preserving Warping (CPW) is a typical method to deal with this issue, in which geometric and photometric constraints are imposed to guide the warping process.
One of its essential conditions, however, is colour consistency, which is an elusive goal in real-world applications.
In this paper, we propose a Generalized Content-Preserving Warping (GCPW) method to alleviate this problem.
GCPW extends the original CPW by applying a colour model that expresses the colour transformation between images locally, thus meeting the photometric constraint requirements for effective image stitching.
We combine the photometric and geometric constraints and jointly estimate the colour transformation and the warped mesh vertices simultaneously.
We align images locally with an optimal grid mesh generated by our GCPW method.
Experiments on both synthetic and real images demonstrate that our new method is robust to colour variations, outperforming other state-of-the-art CPW-based image stitching methods.
Advertising options have been recently studied as a special type of guaranteed contracts in online advertising, which are an alternative sales mechanism to real-time auctions.
An advertising option is a contract which gives its buyer a right but not obligation to enter into transactions to purchase page views or link clicks at one or multiple pre-specified prices in a specific future period.
Different from typical guaranteed contracts, the option buyer pays a lower upfront fee but can have greater flexibility and more control of advertising.
Many studies on advertising options so far have been restricted to the situations where the option payoff is determined by the underlying spot market price at a specific time point and the price evolution over time is assumed to be continuous.
The former leads to a biased calculation of option payoff and the latter is invalid empirically for many online advertising slots.
This paper addresses these two limitations by proposing a new advertising option pricing framework.
First, the option payoff is calculated based on an average price over a specific future period.
Therefore, the option becomes path-dependent.
The average price is measured by the power mean, which contains several existing option payoff functions as its special cases.
Second, jump-diffusion stochastic models are used to describe the movement of the underlying spot market price, which incorporate several important statistical properties including jumps and spikes, non-normality, and absence of autocorrelations.
A general option pricing algorithm is obtained based on Monte Carlo simulation.
In addition, an explicit pricing formula is derived for the case when the option payoff is based on the geometric mean.
This pricing formula is also a generalized version of several other option pricing models discussed in related studies.
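The power-mean payoff described above can be made concrete with a short sketch. The power mean contains the arithmetic mean (p = 1), the geometric mean (the limit p -> 0), and the harmonic mean (p = -1) as special cases; the path values and strike below are hypothetical, purely for illustration.

```python
import math

def power_mean(prices, p):
    # Power mean of a price path; p = 1 gives the arithmetic mean,
    # p -> 0 the geometric mean, p = -1 the harmonic mean.
    if p == 0:  # geometric-mean limit
        return math.exp(sum(math.log(s) for s in prices) / len(prices))
    return (sum(s ** p for s in prices) / len(prices)) ** (1.0 / p)

def call_payoff(prices, strike, p):
    # Path-dependent payoff: the averaged price over the period vs. the
    # pre-specified strike, floored at zero.
    return max(power_mean(prices, p) - strike, 0.0)

path = [1.8, 2.2, 2.0, 2.6]  # hypothetical spot prices for an ad slot
print(call_payoff(path, 2.0, 1))  # arithmetic-mean (Asian-style) payoff
print(call_payoff(path, 2.0, 0))  # geometric-mean payoff
```

Because the payoff depends on the whole path rather than a single terminal price, averaging over simulated jump-diffusion paths (Monte Carlo) is the natural pricing route, with the geometric-mean case admitting the closed form mentioned above.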
The Container Relocation Problem (CRP) is concerned with finding a sequence of moves of containers that minimizes the number of relocations needed to retrieve all containers, while respecting a given order of retrieval.
However, the assumption of knowing the full retrieval order of containers is particularly unrealistic in real operations.
This paper studies the stochastic CRP (SCRP), which relaxes this assumption.
A new multi-stage stochastic model, called the batch model, is introduced, motivated, and compared with an existing model (the online model).
The two main contributions are an optimal algorithm called Pruning-Best-First-Search (PBFS) and a randomized approximate algorithm called PBFS-Approximate with a bounded average error.
Both algorithms, applicable in the batch and online models, are based on a new family of lower bounds for which we show some theoretical properties.
Moreover, we introduce two new heuristics outperforming the best existing heuristics.
Algorithms, bounds and heuristics are tested in an extensive computational section.
Finally, based on strong computational evidence, we conjecture the optimality of the "Leveling" heuristic in a special "no information" case, where at any retrieval stage, any of the remaining containers is equally likely to be retrieved next.
We present a novel architectural enhancement of Channel Boosting in deep convolutional neural network (CNN).
This idea of Channel Boosting exploits both the channel dimension of CNN (learning from multiple input channels) and Transfer learning (TL).
TL is utilized at two different stages; channel generation and channel exploitation.
In the proposed methodology, a deep CNN is boosted by various channels available through TL from already trained Deep Neural Networks, in addition to its own original channel.
The deep architecture of CNN then exploits the original and boosted channels down the stream for learning discriminative patterns.
Churn prediction in telecom is a challenging task due to the high dimensionality and imbalanced nature of the data; it is therefore used to evaluate the performance of the proposed Channel Boosted CNN (CB CNN).
In the first phase, discriminative informative features are extracted using a stacked autoencoder, and then in the second phase, these features are combined with the original features to form Channel Boosted images.
Finally, the knowledge gained by a pre-trained CNN is exploited by employing TL.
The results are promising and show the ability of the Channel Boosting concept to learn a complex classification problem by discerning even minute differences between churners and non-churners.
The proposed work validates the concept observed from the evolution of recent CNN architectures that the innovative restructuring of a CNN architecture may increase the representative capacity of the network.
We present a new approach to rigid-body motion segmentation from two views.
We use a previously developed nonlinear embedding of two-view point correspondences into a 9-dimensional space and identify the different motions by segmenting lower-dimensional subspaces.
In order to overcome nonuniform distributions along the subspaces, whose dimensions are unknown, we suggest the novel concept of global dimension and its minimization for clustering subspaces with some theoretical motivation.
We propose a fast projected gradient algorithm for minimizing global dimension and thus segmenting motions from 2-views.
We develop an outlier detection framework around the proposed method, and we present state-of-the-art results on outlier-free and outlier-corrupted two-view data for segmenting motion.
Why did only we humans evolve Turing completeness?
Turing completeness is the maximum computing power, and we are Turing complete because we can calculate whatever any Turing machine can compute.
Thus we can learn any natural or artificial language, and it seems that no other species can, so we are the only Turing complete species.
The evolutionary advantage of Turing completeness is full problem solving, not syntactic proficiency; yet the expression of problems requires a syntax, because separate words are not enough. Only our ancestors evolved a protolanguage, then a syntax, and finally Turing completeness.
Besides these results, the introduction of Turing completeness and problem solving to explain the evolution of syntax should help us to fit the evolution of language within the evolution of cognition, giving us some new clues to understand the elusive relation between language and thinking.
Deep networks are successfully used as classification models yielding state-of-the-art results when trained on a large number of labeled samples.
These models, however, are usually much less suited for semi-supervised problems because of their tendency to overfit easily when trained on small amounts of data.
In this work we will explore a new training objective that is targeting a semi-supervised regime with only a small subset of labeled data.
This criterion is based on a deep metric embedding over distance relations within the set of labeled samples, together with constraints over the embeddings of the unlabeled set.
The final learned representations are discriminative in Euclidean space, and hence can be used with subsequent nearest-neighbor classification using the labeled samples.
Spectrum sensing is a fundamental operation in cognitive radio environment.
It gives information about spectrum availability by scanning the bands.
Usually a fixed amount of time is given to scan individual bands.
Most of the time, historical information about the traffic in the spectrum bands is not used.
Yet this information indicates how busy a specific band is.
Therefore, instead of scanning a band for a fixed amount of time, more time can be given to less occupied bands and less time to heavily occupied ones.
In this paper we have formulated the time assignment problem as integer linear programming and source coding problems.
The time assignment problem is solved using the associated stochastic optimization problem.
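The allocation idea in the abstract, giving more scan time to lightly occupied bands, can be sketched with a simple proportional heuristic. This is an illustrative simplification, not the paper's integer-linear-programming or source-coding formulation; the occupancy values are hypothetical.

```python
def allocate_scan_time(occupancy, total_slots):
    # Heuristic sketch: give each band scan slots proportional to its
    # historical idle probability (1 - occupancy), so lightly used bands
    # are scanned longer than heavily occupied ones.
    idle = [1.0 - o for o in occupancy]
    total_idle = sum(idle)
    return [max(1, round(total_slots * w / total_idle)) for w in idle]

# Hypothetical historical busy fractions for four bands.
occupancy = [0.9, 0.5, 0.2, 0.1]
print(allocate_scan_time(occupancy, 20))  # most slots go to the idlest bands
```

An ILP formulation would replace the proportional rule with an objective (e.g., detection probability) maximized subject to the integer slot budget.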
This paper considers the problem of finite dimensional output feedback H-infinity control for a class of nonlinear spatially distributed processes (SDPs) described by highly dissipative partial differential equations (PDEs), whose state is observed by a sensor network (SN) with a given topology.
A highly dissipative PDE system typically involves a spatial differential operator with eigenspectrum that can be partitioned into a finite-dimensional slow one and an infinite-dimensional stable fast complement.
Motivated by this fact, the modal decomposition and singular perturbation techniques are initially applied to the PDE system to derive a finite dimensional ordinary differential equation model, which accurately captures the dynamics of the slow modes of the PDE system.
Subsequently, based on the slow system and the topology of the SN, a set of finite dimensional distributed consensus observers are constructed to estimate the state of the slow system.
Then, a centralized control scheme, which only uses the available estimates from a specified group of SN nodes, is proposed for the PDE system.
An H-infinity control design method is developed in terms of bilinear matrix inequality (BMI), such that the original closed-loop PDE system is exponentially stable and a prescribed level of disturbance attenuation is satisfied for the slow system.
Furthermore, a suboptimal H-infinity controller is also provided to make the attenuation level as small as possible, which can be obtained via a local optimization algorithm that treats the BMI as double linear matrix inequality.
Finally, the proposed method is applied to the control of one dimensional Kuramoto-Sivashinsky equation (KSE) system.
The task of computing the voxel representation of a space curve in the video memory of a 3D display is posed.
Furthermore, an approach to solving this task for an arbitrary space curve given in parametric form is studied.
Numerous intensive experiments are conducted, and interesting results along with significant recommendations are presented.
We study the underlying structure of data (approximately) generated from a union of independent subspaces.
Traditional methods learn only one subspace, failing to discover the multi-subspace structure, while state-of-the-art methods analyze the multi-subspace structure using data themselves as the dictionary, which cannot offer the explicit basis to span each subspace and are sensitive to errors via an indirect representation.
Additionally, they also suffer from a high computational complexity, being quadratic or cubic to the sample size.
To tackle all these problems, we propose a method, called Matrix Factorization with Column L0-norm constraint (MFC0), that can simultaneously learn the basis for each subspace, generate a direct sparse representation for each data sample, and remove errors in the data in an efficient way.
Furthermore, we develop a first-order alternating direction algorithm, whose computational complexity is linear in the sample size, to stably and effectively solve the nonconvex objective function and non-smooth l0-norm constraint of MFC0.
Experimental results on both synthetic and real-world datasets demonstrate that besides the superiority over traditional and state-of-the-art methods for subspace clustering, data reconstruction, error correction, MFC0 also shows its uniqueness for multi-subspace basis learning and direct sparse representation.
Verification problems of programs written in various paradigms (such as imperative, logic, concurrent, functional, and object-oriented ones) can be reduced to problems of solving Horn clause constraints on predicate variables that represent unknown inductive invariants.
This paper presents a novel Horn constraint solving method based on inductive theorem proving: the method reduces Horn constraint solving to validity checking of first-order formulas with inductively defined predicates, which are then checked by induction on the derivation of the predicates.
To automate inductive proofs, we introduce a novel proof system tailored to Horn constraint solving and use an SMT solver to discharge proof obligations arising in the proof search.
The main advantage of the proposed method is that it can verify relational specifications across programs in various paradigms where multiple function calls need to be analyzed simultaneously.
The class of specifications includes practically important ones such as functional equivalence, associativity, commutativity, distributivity, monotonicity, idempotency, and non-interference.
Furthermore, our novel combination of Horn clause constraints with inductive theorem proving enables us to naturally and automatically axiomatize recursive functions that are possibly non-terminating, non-deterministic, higher-order, exception-raising, and over non-inductively defined data types.
We have implemented a relational verification tool for the OCaml functional language based on the proposed method and obtained promising results in preliminary experiments.
Many artificial intelligences (AIs) are randomized.
One can be lucky or unlucky with the random seed; we quantify this effect and show that, perhaps contrary to intuition, it is far from negligible.
Then, we apply two different existing algorithms for selecting good seeds and good probability distributions over seeds.
This mainly leads to learning an opening book.
We apply this to Phantom Go, which, as all phantom games, is hard for opening book learning.
We improve the winning rate from 50% to 70% in 5x5 against the same AI, and from approximately 0% to 40% in 5x5, 7x7 and 9x9 against a stronger (learning) opponent.
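A minimal sketch of seed selection follows: estimate each candidate seed's win rate empirically and keep the best. This is a deliberately simplified version of the idea; the cited algorithms spend evaluations more efficiently (bandit-style) and can also learn distributions over seeds. The toy game below is purely illustrative.

```python
import random

def pick_best_seed(play_game, candidate_seeds, games_per_seed):
    # Simplified sketch: evaluate every candidate seed the same number of
    # times and return the one with the highest empirical win rate.
    best_seed, best_rate = None, -1.0
    for seed in candidate_seeds:
        wins = sum(play_game(seed) for _ in range(games_per_seed))
        rate = wins / games_per_seed
        if rate > best_rate:
            best_seed, best_rate = seed, rate
    return best_seed, best_rate

# Toy stand-in for an AI-vs-AI match: seed 7 happens to win more often.
def toy_game(seed):
    rng = random.Random(seed * 1000 + random.randrange(10**6))
    return rng.random() < (0.7 if seed == 7 else 0.5)

print(pick_best_seed(toy_game, [3, 5, 7, 11], 200))
```

In the Phantom Go setting, the "game" is a full match of the seeded AI against an opponent, and the selected seeds effectively encode an opening book.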
Deep learning has become the state-of-art tool in many applications, but the evaluation and training of deep models can be time-consuming and computationally expensive.
The conditional computation approach has been proposed to tackle this problem (Bengio et al., 2013; Davis & Arel, 2013).
It operates by selectively activating only parts of the network at a time.
In this paper, we use reinforcement learning as a tool to optimize conditional computation policies.
More specifically, we cast the problem of learning activation-dependent policies for dropping out blocks of units as a reinforcement learning problem.
We propose a learning scheme motivated by computation speed, capturing the idea of wanting to have parsimonious activations while maintaining prediction accuracy.
We apply a policy gradient algorithm for learning policies that optimize this loss function and propose a regularization mechanism that encourages diversification of the dropout policy.
We present encouraging empirical results showing that this approach improves the speed of computation without impacting the quality of the approximation.
The self-organizational ability of ad-hoc Wireless Sensor Networks (WSNs) has led them to be the most popular choice in ubiquitous computing.
Clustering sensor nodes and organizing them hierarchically has proven to be an effective method to provide better data aggregation and scalability for the sensor network while conserving limited energy.
However, such networks face limitations in node energy and mobility.
In this paper we propose a mobility prediction technique that attempts to overcome the above-mentioned problems and improves the lifetime of the network.
The technique used here is the Exponential Moving Average for online updates of nodal contact probability in a cluster-based network.
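The exponential-moving-average update can be sketched in a few lines: each new contact observation is blended with the running probability using a smoothing factor. The value of the smoothing factor below is a hypothetical choice for illustration, not one taken from the paper.

```python
def update_contact_probability(prev_prob, contact_observed, alpha=0.3):
    # Exponential moving average update of nodal contact probability:
    # the new observation is weighted by alpha, the history by (1 - alpha).
    # alpha = 0.3 is a hypothetical smoothing factor.
    obs = 1.0 if contact_observed else 0.0
    return (1.0 - alpha) * prev_prob + alpha * obs

p = 0.5  # initial contact probability for a neighbor node
for seen in [True, True, False, True]:  # online contact observations
    p = update_contact_probability(p, seen)
print(round(p, 4))
```

Because each update touches only the previous estimate and the newest observation, a node can maintain contact probabilities for all its neighbors in constant memory per neighbor, which suits energy-constrained sensors.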
Flow-Aware Multi-Topology Adaptive Routing (FAMTAR) is a new approach to multipath and adaptive routing in IP networks which enables automatic use of alternative paths when the primary one becomes congested.
It provides more efficient network resource utilization and higher quality of transmission compared to standard IP routing.
However, thus far it has only been evaluated through simulations.
In this paper we share our experiences from building a real-time FAMTAR router and present results of its tests in a physical network.
The results are in line with those obtained previously through simulations and they open the way to implementation of a production grade FAMTAR router.
The original Pascaline was a mechanical calculator able to sum and subtract integers.
It encoded information in the angles of mechanical wheels and, through a set of gears aided by gravity, performed the calculations.
Here, we show that such a concept can be realized in electronics using memory elements such as memristive systems.
By using memristive emulators we have demonstrated experimentally the memcomputing version of the mechanical Pascaline, capable of processing and storing the numerical results in the multiple levels of each memristive element.
Our result is the first experimental demonstration of multidigit arithmetics with multi-level memory devices that further emphasizes the versatility and potential of memristive systems for future massively-parallel high-density computing architectures.
We study the existence of asymptotically stable periodic trajectories induced by reset feedback.
The analysis is developed for a planar system.
Casting the problem into the hybrid setting, we show that a periodic orbit arises from the balance between the energy dissipated during flows and the energy restored by resets, at jumps.
The stability of the periodic orbit is studied with hybrid Lyapunov tools.
The satisfaction of the so-called hybrid basic conditions ensures the robustness of the asymptotic stability.
Extensions of the approach to more general mechanical systems are discussed.
Analysis and prediction of stock market time series data has attracted considerable interest from the research community over the last decade.
Rapid development and evolution of sophisticated algorithms for statistical analysis of time series data, and availability of high-performance hardware has made it possible to process and analyze high volume stock market time series data effectively, in real-time.
Among many other important characteristics and behavior of such data, forecasting is an area which has witnessed considerable focus.
In this work, we have used the time series of the index values of the Auto sector in India during January 2010 to December 2015 for a deeper understanding of the behavior of its three constituent components, namely the trend, the seasonal component, and the random component.
Based on this structural analysis, we have also designed five approaches for forecasting and also computed their accuracy in prediction using suitably chosen training and test data sets.
Extensive results are presented to demonstrate the effectiveness of our proposed decomposition approaches of time series and the efficiency of our forecasting techniques, even in presence of a random component and a sharply changing trend component in the time-series.
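The three-way split into trend, seasonal, and random components can be sketched with a classical additive decomposition: a centered moving average for the trend and per-position means of the detrended series for the seasonal part. This is a generic textbook sketch, not necessarily the study's exact procedure, and the series below is synthetic.

```python
def decompose_additive(series, period):
    # Classical additive decomposition sketch:
    #   trend    = centered moving average over one (odd) period,
    #   seasonal = per-position mean of the detrended values,
    #   random   = whatever remains after subtracting both.
    n, half = len(series), period // 2
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / (2 * half + 1)
    detrended = [series[i] - trend[i] for i in range(n) if trend[i] is not None]
    offsets = [i % period for i in range(n) if trend[i] is not None]
    seasonal = []
    for k in range(period):
        vals = [d for d, o in zip(detrended, offsets) if o == k]
        seasonal.append(sum(vals) / len(vals) if vals else 0.0)
    return trend, seasonal

series = [10, 12, 14, 11, 13, 15, 12, 14, 16, 13, 15, 17]  # synthetic index
trend, seasonal = decompose_additive(series, period=3)
print(seasonal)
```

Once the components are separated, forecasts can extrapolate the trend and re-add the repeating seasonal pattern, with the residual's variance indicating how much the random component limits accuracy.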
This document summarizes the major milestones in mobile Augmented Reality between 1968 and 2014.
Major parts of the list were compiled by the members of the Christian Doppler Laboratory for Handheld Augmented Reality in 2010 (author list in alphabetical order) for the ISMAR society.
Later in 2013 it was updated, and more recent work was added during preparation of this report.
Permission is granted to copy and modify.
In recent years identity-vector (i-vector) based speaker verification (SV) systems have become very successful.
Nevertheless, environmental noise and speech duration variability still have a significant effect on degrading the performance of these systems.
In many real-life applications, the durations of recordings are very short; as a result, extracted i-vectors cannot reliably represent the attributes of the speaker.
Here, we investigate the effect of speech duration on the performance of three state-of-the-art speaker recognition systems.
In addition, using a variety of available score fusion methods, we investigate the effect of score fusion for those speaker verification techniques to benefit from the performance difference of different methods under different enrollment and test speech duration conditions.
The proposed fusion strategy performed significantly better than the baseline score fusion methods.
It is well known that for some tasks, labeled data sets may be hard to gather.
Therefore, we wished to tackle here the problem of having insufficient training data.
We examined learning methods from unlabeled data after an initial training on a limited labeled data set.
The suggested approach can be used as an online learning method on the unlabeled test set.
In the general classification task, whenever we predict a label with high enough confidence, we treat it as a true label and train the model accordingly.
For the semantic segmentation task, a classic example for an expensive data labeling process, we do so pixel-wise.
Our suggested approaches were applied on the MNIST data-set as a proof of concept for a vision classification task and on the ADE20K data-set in order to tackle the semi-supervised semantic segmentation problem.
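The confidence-based self-training step described above can be sketched as a simple selection rule over model outputs: any unlabeled sample whose top predicted class probability clears a threshold is adopted as a pseudo-label for further training. The threshold and the probability values below are hypothetical.

```python
def select_pseudo_labels(probabilities, threshold=0.95):
    # Sketch of the confidence-based self-training step: unlabeled samples
    # whose top class probability exceeds the threshold are treated as if
    # their predictions were true labels. threshold = 0.95 is hypothetical.
    pseudo = []
    for i, probs in enumerate(probabilities):
        conf = max(probs)
        if conf >= threshold:
            pseudo.append((i, probs.index(conf)))
    return pseudo  # list of (sample index, pseudo-label)

# Hypothetical softmax outputs for four unlabeled samples, three classes.
probs = [[0.98, 0.01, 0.01],
         [0.40, 0.35, 0.25],
         [0.02, 0.96, 0.02],
         [0.60, 0.30, 0.10]]
print(select_pseudo_labels(probs))  # only the confident samples survive
```

For semantic segmentation the same rule is applied per pixel, so a single image can contribute a mix of pseudo-labeled and ignored pixels to the next training round.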
We evaluate the secrecy performance of a multiple access cooperative network where the destination node is wiretapped by a malicious and passive eavesdropper.
We propose the application of the network coding technique as an alternative to increase the secrecy at the destination node, on the top of improving the error performance of the legitimate communication, already demonstrated in the literature.
Network coding is leveraged by assuming that the legitimate cooperative nodes are able to perform non-binary linear combinations of different frames before the transmission.
Different scenarios with and without channel state information (CSI) at the transmitter side are evaluated.
The effectiveness of the proposed schemes is evaluated in terms of secrecy outage probability through theoretic and numerical analyses.
It is shown that, even when the legitimate transmitters do not have any CSI, the secrecy can be increased through the use of network coding when compared to the direct transmission and traditional cooperative techniques.
The use of annotations, referred to as assertions or contracts, to describe program properties for which run-time tests are to be generated, has become frequent in dynamic programming languages.
However, the frameworks proposed to support such run-time testing generally incur high time and/or space overheads over standard program execution.
We present an approach for reducing this overhead that is based on the use of memoization to cache intermediate results of check evaluation, avoiding repeated checking of previously verified properties.
Compared to approaches that reduce checking frequency, our proposal has the advantage of being exhaustive (i.e., all tests are checked at all points) while still being much more efficient than standard run-time checking.
Compared to the limited previous work on memoization, it performs the task without requiring modifications to data structure representation or checking code.
While the approach is general and system-independent, we present it for concreteness in the context of the Ciao run-time checking framework, which allows us to provide an operational semantics with checks and caching.
We also report on a prototype implementation and provide some experimental results that support that using a relatively small cache leads to significant decreases in run-time checking overhead.
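The caching idea can be illustrated with a minimal sketch: remember which values have already passed a property check so that re-checking the same term is a cache hit rather than a full traversal. This toy version keys the cache on object identity and is therefore only sound for unmutated values; the Ciao implementation addresses this without changing data representations.

```python
def make_memoized_check(check):
    # Sketch of memoized run-time checking: cache results of a property
    # check keyed by object identity. Only safe while values are not
    # mutated between checks (a simplifying assumption of this sketch).
    cache = {}
    def checked(value):
        key = id(value)
        if key not in cache:
            cache[key] = check(value)
        return cache[key]
    checked.cache = cache
    return checked

calls = []
def is_sorted_list(xs):
    calls.append(1)  # count how often the full O(n) check actually runs
    return all(a <= b for a, b in zip(xs, xs[1:]))

check = make_memoized_check(is_sorted_list)
data = list(range(10_000))
assert check(data) and check(data) and check(data)
print(len(calls))  # the expensive traversal ran only once
```

Unlike reduced-frequency checking, every call site still observes a checked result; only the redundant re-evaluation is skipped, which is the exhaustiveness property claimed above.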
Message digest algorithms are one of the underlying building blocks of blockchain platforms such as Ethereum.
This paper analyses situations in which the message digest collision resistance property can be exploited by attackers.
Two mitigations for possible attacks are described: longer message digest sizes make attacks more difficult; and, including timeliness properties limits the amount of time an attacker has to determine a hash collision.
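The first mitigation can be illustrated with Python's hashlib: by the birthday bound, a collision against an n-bit digest costs on the order of 2^(n/2) evaluations, so longer digests directly raise the attacker's cost (the algorithm names below are illustrative stdlib choices, not the digests Ethereum itself uses).

```python
import hashlib

msg = b"transfer 100 tokens"   # hypothetical transaction payload
# A collision against an n-bit digest takes roughly 2**(n/2) evaluations
# (birthday bound), so longer digests directly raise the attacker's cost.
for algo in ("sha224", "sha256", "sha384", "sha512"):
    bits = hashlib.new(algo, msg).digest_size * 8
    print(f"{algo}: {bits}-bit digest, ~2**{bits // 2} work to collide")
```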
Many fundamental problems in natural language processing rely on determining what entities appear in a given text.
Commonly referred to as entity linking, this step is a fundamental component of many NLP tasks such as text understanding, automatic summarization, semantic search or machine translation.
Name ambiguity, word polysemy, context dependencies and a heavy-tailed distribution of entities contribute to the complexity of this problem.
Here we propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation.
Input mentions (i.e.,~linkable token spans) are disambiguated jointly across an entire document by combining a document-level prior of entity co-occurrences with local information captured from mentions and their surrounding context.
The model is based on simple sufficient statistics extracted from data, thus relying on few parameters to be learned.
Our method does not require extensive feature engineering, nor an expensive training procedure.
We use loopy belief propagation to perform approximate inference.
The low complexity of our model makes this step sufficiently fast for real-time usage.
We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods.
For many large undirected models that arise in real-world applications, exact maximum-likelihood training is intractable, because it requires computing marginal distributions of the model.
Conditional training is even more difficult, because the partition function depends not only on the parameters, but also on the observed input, requiring repeated inference over each training example.
An appealing idea for such models is to independently train a local undirected classifier over each clique, afterwards combining the learned weights into a single global model.
In this paper, we show that this piecewise method can be justified as minimizing a new family of upper bounds on the log partition function.
On three natural-language data sets, piecewise training is more accurate than pseudolikelihood, and often performs comparably to global training using belief propagation.
We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures.
We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient, running average of the gradient, etc.
The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs.
On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model.
We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google's neural machine translation system.
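The flavor of the discovered rules can be sketched in a few lines. The version below implements the AddSign update without the internal decay schedules used in the full method, so treat it as an illustration of the sign-agreement idea rather than the exact optimizer.

```python
import math

def addsign_update(params, grads, moments, lr=0.01, beta=0.9, alpha=1.0):
    """One AddSign step: scale each gradient by (alpha + sign agreement).

    When the gradient g and its running average m agree in sign, the step
    is amplified; when they disagree, it is damped. Decay schedules from
    the full method are omitted for brevity."""
    new_params, new_moments = [], []
    for p, g, m in zip(params, grads, moments):
        m = beta * m + (1 - beta) * g               # running average of gradient
        agree = math.copysign(1.0, g) * math.copysign(1.0, m)
        new_params.append(p - lr * (alpha + agree) * g)
        new_moments.append(m)
    return new_params, new_moments

# Minimize f(x) = x**2 starting from x = 5.0
x, mom = [5.0], [0.0]
for _ in range(200):
    grads = [2.0 * x[0]]
    x, mom = addsign_update(x, grads, mom, lr=0.05)
print(x[0])   # converges toward 0
```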
An essential part of building a data-driven organization is the ability to handle and process continuous streams of data to discover actionable insights.
The explosive growth of interconnected devices and the social Web has led to a large volume of data being generated on a continuous basis.
Streaming data sources such as stock quotes, credit card transactions, trending news, traffic conditions, and time-sensitive patient data are not only very common but can rapidly depreciate in value if not processed quickly.
The ever-increasing volume and highly irregular nature of data rates pose new challenges to data stream processing systems.
One such challenging but important task is how to accurately ingest and integrate data streams from various sources and locations into an analytics platform.
These challenges demand new strategies and systems that can offer the desired degree of scalability and robustness in handling failures.
This paper investigates the fundamental requirements and the state of the art of existing data stream ingestion systems, proposes a scalable and fault-tolerant data stream ingestion and integration framework that can serve as a reusable component across many feeds of structured and unstructured input data in a given platform, and demonstrates the utility of the framework in a real-world data stream processing case study that integrates Apache NiFi and Kafka for processing high-velocity news articles from across the globe.
The study also identifies best practices and gaps for future research in developing large-scale data stream processing infrastructure.
We introduce a weighted version of the ranking algorithm by Karp et al. (STOC 1990), and prove a competitive ratio of 0.6534 for the vertex-weighted online bipartite matching problem when online vertices arrive in random order.
Our result shows that random arrivals help beat the 1-1/e barrier even in the vertex-weighted case.
We build on the randomized primal-dual framework by Devanur et al. (SODA 2013) and design a two-dimensional gain-sharing function, which depends not only on the rank of the offline vertex, but also on the arrival time of the online vertex.
To our knowledge, this is the first competitive ratio strictly larger than 1-1/e for an online bipartite matching problem achieved under the randomized primal-dual framework.
Our algorithm has a natural interpretation that offline vertices offer a larger portion of their weights to the online vertices as time goes by, and each online vertex matches the neighbor with the highest offer at its arrival.
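A simplified sketch of the weighted Ranking idea conveys the "highest offer wins" interpretation. It omits the arrival-time dependence that the two-dimensional gain-sharing function adds, and uses the standard g(y) = e^(y-1) discount; names and the example instance are ours.

```python
import math, random

def weighted_ranking(offline_weights, arrivals, seed=0):
    """Simplified vertex-weighted Ranking (arrival-time dependence omitted).

    Each offline vertex u draws a rank y_u in [0, 1] and offers
    w_u * (1 - e^(y_u - 1)) to its online neighbors; every online vertex,
    on arrival, matches the unmatched neighbor with the highest offer."""
    rng = random.Random(seed)
    rank = {u: rng.random() for u in offline_weights}
    offer = {u: w * (1.0 - math.exp(rank[u] - 1.0))
             for u, w in offline_weights.items()}
    matched, total = set(), 0.0
    for v, nbrs in arrivals:                  # online vertices in arrival order
        free = [u for u in nbrs if u not in matched]
        if free:
            u = max(free, key=offer.get)      # highest offer wins
            matched.add(u)
            total += offline_weights[u]
    return total

weights = {"a": 3.0, "b": 1.0, "c": 2.0}
arrivals = [("v1", ["a", "b"]), ("v2", ["a", "c"]), ("v3", ["b"])]
print(weighted_ranking(weights, arrivals))
```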
This paper builds on altruistic locking (AL), an extension of two-phase locking (2PL) that allows more relaxed rules.
However, AL still enforces rules that prevent it from accepting some valid schedules (ones present in VSR and CSR).
This paper proposes a multiversion variant of AL which solves this problem.
The report also compares MAL with related protocols such as MV2PL, AL, and 2PL.
This paper also discusses the caveats involved in MAL and where it lies in the Venn diagram of multiversion serializable schedule protocols.
Finally, the possible use of MAL in hybrid protocols and the parameters involved in making MAL successful are discussed.
Andrew Tanenbaum and his textbooks -- e.g. on Operating Systems, Computer Networks, Structured Computer Organization and Distributed Systems, to name but a few -- have had a tremendous impact on generations of computer science students (and teachers at the same time).
Given this, it is striking to observe that this comprehensive body of work apparently does not provide a single line on a research topic that seems to be intimately related to his name (at least in German): Xmas Research (XR).
Hence, the goal of this paper is to fill this gap and provide insight into a number of paradigmatic XR research questions, for instance: Can we today still count on Santa Claus?
Or at least on Xmas trees?
And does this depend on basic tree structures, or can we rather find solutions on the level of programming languages?
By addressing such basic open issues, we aim at providing a solid technical foundation for future steps towards the imminent evolution of Xmas 4.0.
The proposed Earth observation (EO) based value adding system (EO VAS), hereafter identified as AutoCloud+, consists of an innovative EO image understanding system (EO IUS) design and implementation capable of automatic spatial context-sensitive cloud/cloud-shadow detection in multi-source multi-spectral (MS) EO imagery, whether or not radiometrically calibrated, acquired by multiple platforms, either spaceborne or airborne, including unmanned aerial vehicles (UAVs).
It is worth mentioning that the same EO IUS architecture is suitable for a large variety of EO based value adding products and services, including: (i) low-level image enhancement applications, such as automatic MS image topographic correction, co-registration, mosaicking and compositing, (ii) high-level MS image land cover (LC) and LC change (LCC) classification and (iii) content-based image storage/retrieval in massive multi-source EO image databases (big data mining).
Square grids are commonly used in robotics and game development as spatial models, and heuristic search algorithms well known in the AI community (such as A*, JPS, Theta*, etc.) are widely used for path planning on grids.
Much of this research concentrates on finding the shortest (in the geometrical sense) paths, while in many applications smooth paths, rather than shortest paths containing sharp turns, are preferable.
In this paper we study the problem of generating smooth paths and concentrate on angle-constrained path planning.
We state the angle-constrained path planning problem formally and present a new algorithm tailored to solve it: LIAN.
We examine LIAN both theoretically and empirically.
We show that it is sound and complete (under some restrictions).
We also show that LIAN outperforms the analogues when solving numerous path planning tasks within urban outdoor navigation scenarios.
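The core feasibility test behind angle-constrained planning, checking that no turn along a candidate path exceeds a given bound, can be sketched as follows (function names are ours, not from LIAN).

```python
import math

def turn_angle(p0, p1, p2):
    """Turn angle (degrees) between consecutive segments p0->p1 and p1->p2."""
    a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
    a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
    d = abs(a2 - a1) % (2 * math.pi)
    return math.degrees(min(d, 2 * math.pi - d))

def satisfies_angle_constraint(path, max_turn_deg):
    """True if every turn along the path stays within max_turn_deg."""
    return all(turn_angle(path[i], path[i + 1], path[i + 2]) <= max_turn_deg
               for i in range(len(path) - 2))

smooth = [(0, 0), (2, 0), (4, 1), (6, 3)]
sharp = [(0, 0), (2, 0), (2, 2)]               # contains a 90-degree turn
print(satisfies_angle_constraint(smooth, 30))  # True
print(satisfies_angle_constraint(sharp, 30))   # False
```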
We present a probabilistic model for learning from dynamic relational data, wherein the observed interactions among networked nodes are modeled via the Bernoulli-Poisson link function, and the underlying network structure is characterized by nonnegative latent node-group memberships, which are assumed to be gamma distributed.
The latent memberships evolve according to Markov processes.
The optimal number of latent groups can be determined by the data itself.
The computational complexity of our method scales with the number of non-zero links, which makes it scalable to large sparse dynamic relational data.
We present batch and online Gibbs sampling algorithms to perform model inference.
Finally, we demonstrate the model's performance on both synthetic and real-world datasets compared to state-of-the-art methods.
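The Bernoulli-Poisson link used above maps a nonnegative rate, built from the latent node-group memberships, to a link probability; a small sketch (the two-group membership vectors are made up for illustration) makes the mapping concrete.

```python
import math

def bernoulli_poisson_prob(rate):
    """P(link observed) under the Bernoulli-Poisson link:
    b = 1(x >= 1) with x ~ Poisson(rate), so P(b = 1) = 1 - exp(-rate)."""
    return 1.0 - math.exp(-rate)

# The rate aggregates products of nonnegative node-group memberships;
# phi_u and phi_v below are hypothetical 2-group membership vectors.
phi_u, phi_v = [0.9, 0.1], [0.8, 0.2]
rate = sum(a * b for a, b in zip(phi_u, phi_v))
print(round(bernoulli_poisson_prob(rate), 4))
```

A zero rate yields a zero link probability, which is why the likelihood only involves the non-zero links and the model scales with the number of observed edges.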
Remote job submission and execution is a fundamental requirement of distributed computing performed using cluster computing.
However, cluster computing limits usage to within a single organization.
A Grid computing environment allows the use of resources available in other organizations for remote job execution.
This paper discusses the concepts of batch-job execution using a local resource manager (LRM) and using Grid.
The paper discusses two ways of preparing a test Grid computing environment that we use for experimental testing of these concepts.
This paper presents experimental tests of remote job submission and execution mechanisms in both the LRM-specific way and the Grid computing ways.
Moreover, the paper also discusses various problems faced while working with Grid computing environments and their troubleshooting.
The understanding and experimental testing presented in this paper will be very useful to researchers who are new to the field of job management in Grid.
With the growing interest in Network Analysis, Relational Data Mining is becoming an emphasized domain of Data Mining.
This paper addresses the problem of extracting representative elements from a relational dataset.
After defining the notion of degree of representativeness, computed using the Borda aggregation procedure, we present the extraction of exemplars, which are the representative elements of the dataset.
We use these concepts to build a network on the dataset.
We describe the main properties of these notions and we propose two typical applications of our framework.
The first application consists in summarizing and structuring a set of binary images and the second in mining the co-authoring relation in a research team.
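A minimal sketch of the Borda aggregation step shows how an exemplar is chosen; the rankings below are hypothetical, whereas in the framework they come from the relational criteria on the dataset.

```python
def borda_scores(rankings):
    """Aggregate rankings with the Borda procedure: in a ranking of n
    items, the item in position i earns n - 1 - i points; the element
    with the highest total is the most representative (the exemplar)."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - 1 - pos)
    return scores

# Three hypothetical criteria each rank four elements of the dataset.
rankings = [["a", "b", "c", "d"],
            ["b", "a", "c", "d"],
            ["a", "c", "b", "d"]]
scores = borda_scores(rankings)
exemplar = max(scores, key=scores.get)
print(scores, exemplar)
```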
Detection of transitions between broad phonetic classes in a speech signal is an important problem which has applications such as landmark detection and segmentation.
The proposed hierarchical method detects silence to non-silence transitions, and high-amplitude (mostly sonorants) to low-amplitude (mostly fricatives/affricates/stop bursts) transitions and vice-versa.
A subset of the extremum (minimum or maximum) samples between every pair of successive zero-crossings is selected above a second pass threshold, from each bandpass filtered speech signal frame.
Relative to the mid-point (reference) of a frame, locations of the first and the last extrema lie on either side, if the speech signal belongs to a homogeneous segment; else, both these locations lie on the left or the right side of the reference, indicating a transition frame.
When tested on the entire TIMIT database, of the transitions detected, 93.6% are within a tolerance of 20 ms from the hand labeled boundaries.
Sonorant, unvoiced non-sonorant and silence classes and their respective onsets are detected with an accuracy of about 83.5% for the same tolerance.
The results are as good as, and in some respects better than the state-of-the-art methods for similar tasks.
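A simplified sketch of the per-frame decision rule, skipping the bandpass filter bank and using a single amplitude threshold, illustrates the extrema-versus-midpoint test described above (names and the synthetic frames are ours).

```python
def extrema_between_zero_crossings(frame, threshold):
    """Index of the largest-magnitude sample between each pair of
    successive zero-crossings, kept only if it exceeds the threshold."""
    crossings = [i for i in range(1, len(frame))
                 if frame[i - 1] * frame[i] < 0]
    extrema = []
    for lo, hi in zip(crossings, crossings[1:]):
        idx = max(range(lo, hi), key=lambda i: abs(frame[i]))
        if abs(frame[idx]) >= threshold:
            extrema.append(idx)
    return extrema

def is_transition_frame(frame, threshold):
    """Transition if the first and last selected extrema fall on the
    same side of the frame midpoint; homogeneous if they straddle it."""
    ext = extrema_between_zero_crossings(frame, threshold)
    if len(ext) < 2:
        return True   # degenerate case; treated as a transition in this sketch
    mid = len(frame) // 2
    return (ext[0] - mid) * (ext[-1] - mid) > 0

homog = [0.8, -0.8] * 50                 # activity throughout the frame
trans = [0.0] * 50 + [0.8, -0.8] * 25    # silence, then high activity
print(is_transition_frame(homog, 0.5))   # False: extrema straddle the midpoint
print(is_transition_frame(trans, 0.5))   # True: all extrema after the midpoint
```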
With the rapid advances of microarray technologies, large amounts of high-dimensional gene expression data are being generated, which poses significant computational challenges.
A first step towards addressing this challenge is the use of clustering techniques, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data.
A robust gene expression clustering approach to minimize undesirable clustering is proposed.
In this paper, Penalized Fuzzy C-Means (PFCM) Clustering algorithm is described and compared with the most representative off-line clustering techniques: K-Means Clustering, Rough K-Means Clustering and Fuzzy C-Means clustering.
These techniques are implemented and tested for a Brain Tumor gene expression Dataset.
Analysis of the performance of the proposed approach is presented through qualitative validation experiments.
From the experimental results, it can be observed that the Penalized Fuzzy C-Means algorithm performs much better than the other clustering algorithms used in our comparison study.
Significant and promising clustering results are presented using Brain Tumor Gene expression dataset.
Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes.
In these clustering results, we find that Penalized Fuzzy C-Means algorithm provides useful information as an aid to diagnosis in oncology.
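For orientation, here is a minimal plain Fuzzy C-Means on 1-D data; the penalized variant (PFCM) adds a penalty term to this objective, which the sketch omits, and the deterministic min/max initialization is our simplification.

```python
def fuzzy_c_means(data, c=2, m=2.0, iters=50):
    """Plain Fuzzy C-Means on 1-D data (PFCM adds a penalty term,
    omitted here). Returns the sorted cluster centers."""
    assert c == 2, "sketch hard-codes two clusters via min/max init"
    centers = [min(data), max(data)]   # deterministic initialization
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1))
        u = []
        for x in data:
            d = [max(abs(x - v), 1e-12) for v in centers]
            u.append([1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                for k in range(c))
                      for i in range(c)])
        # Center update: mean of the data weighted by memberships u^m
        centers = [sum((u[j][i] ** m) * data[j] for j in range(len(data)))
                   / sum(u[j][i] ** m for j in range(len(data)))
                   for i in range(c)]
    return sorted(centers)

data = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]   # two well-separated clusters
print(fuzzy_c_means(data))               # centers near 1.0 and 5.0
```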
Dense video captioning is a fine-grained video understanding task that involves two sub-problems: localizing distinct events in a long video stream, and generating captions for the localized events.
We propose the Joint Event Detection and Description Network (JEDDi-Net), which solves the dense video captioning task in an end-to-end fashion.
Our model continuously encodes the input video stream with three-dimensional convolutional layers, proposes variable-length temporal events based on pooled features, and generates their captions.
Proposal features are extracted within each proposal segment through 3D Segment-of-Interest pooling from shared video feature encoding.
In order to explicitly model temporal relationships between visual events and their captions in a single video, we also propose a two-level hierarchical captioning module that keeps track of context.
On the large-scale ActivityNet Captions dataset, JEDDi-Net demonstrates improved results as measured by standard metrics.
We also present the first dense captioning results on the TACoS-MultiLevel dataset.
Recent research on problem formulations based on decomposition into low-rank plus sparse matrices provides a suitable framework for separating moving objects from the background.
The most representative problem formulation is Robust Principal Component Analysis (RPCA) solved via Principal Component Pursuit (PCP), which decomposes a data matrix into a low-rank matrix and a sparse matrix.
However, similar robust implicit or explicit decompositions can be made in the following problem formulations: Robust Non-negative Matrix Factorization (RNMF), Robust Matrix Completion (RMC), Robust Subspace Recovery (RSR), Robust Subspace Tracking (RST) and Robust Low-Rank Minimization (RLRM).
The main goal of these similar problem formulations is to obtain explicitly or implicitly a decomposition into low-rank matrix plus additive matrices.
In this context, this work aims to initiate a rigorous and comprehensive review of the similar problem formulations in robust subspace learning and tracking based on decomposition into low-rank plus additive matrices for testing and ranking existing algorithms for background/foreground separation.
For this, we first provide a preliminary review of the recent developments in the different problem formulations which allows us to define a unified view that we called Decomposition into Low-rank plus Additive Matrices (DLAM).
Then, we examine carefully each method in each robust subspace learning/tracking frameworks with their decomposition, their loss functions, their optimization problem and their solvers.
Furthermore, we investigate if incremental algorithms and real-time implementations can be achieved for background/foreground separation.
Finally, experimental results on a large-scale dataset called Background Models Challenge (BMC 2012) show the comparative performance of 32 different robust subspace learning/tracking methods.
Hosting platforms for software projects can form collaborative social networks; a prime example is GitHub, arguably the most popular platform of this kind.
An open source project recommendation system could be a major feature for a platform like GitHub, enabling its users to find relevant projects in a fast and simple manner.
We perform network analysis on a constructed graph based on GitHub data and present a recommendation system that uses link prediction.
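As an illustration of link prediction on such a graph, a common-neighbors score (a standard baseline; the exact predictor used in the study may differ) can be computed directly from an edge list; the contribution edges below are hypothetical.

```python
def common_neighbors_scores(edges):
    """Score each unlinked node pair by its number of common neighbors,
    a standard link-prediction baseline."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)
    scores = {}
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if v not in adj[u]:                 # only score unlinked pairs
                scores[(u, v)] = len(adj[u] & adj[v])
    return scores

# Hypothetical user-project contribution edges from GitHub-like data
edges = [("alice", "proj1"), ("bob", "proj1"),
         ("bob", "proj2"), ("carol", "proj2")]
scores = common_neighbors_scores(edges)
print(scores[("alice", "bob")])   # alice and bob share proj1
```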
In video super-resolution, the spatio-temporal coherence among frames must be exploited appropriately for accurate prediction of the high-resolution frames.
Although 2D convolutional neural networks (CNNs) are powerful in modelling images, 3D-CNNs are more suitable for spatio-temporal feature extraction as they can preserve temporal information.
To this end, we propose an effective 3D-CNN for video super-resolution, called the 3DSRnet that does not require motion alignment as preprocessing.
Our 3DSRnet maintains the temporal depth of spatio-temporal feature maps to maximally capture the temporally nonlinear characteristics between low and high resolution frames, and adopts residual learning in conjunction with the sub-pixel outputs.
It outperforms the best state-of-the-art method by an average of 0.45 and 0.36 dB in PSNR for scales 3 and 4, respectively, on the Vidset4 benchmark.
Our 3DSRnet is also the first to deal with the performance drop due to scene change, an issue that is important in practice but has not previously been considered.
The computational complexity of solving nonlinear support vector machine (SVM) is prohibitive on large-scale data.
In particular, this issue becomes very sensitive when the data represents additional difficulties such as highly imbalanced class sizes.
Typically, nonlinear kernels produce significantly higher classification quality than linear kernels, but they introduce extra kernel and model parameters whose fitting is computationally expensive.
This increases the quality but also reduces the performance dramatically.
We introduce a generalized fast multilevel framework for regular and weighted SVM and discuss several versions of its algorithmic components that lead to a good trade-off between quality and time.
Our framework is implemented using PETSc which allows an easy integration with scientific computing tasks.
The experimental results demonstrate significant speedup compared to the state-of-the-art nonlinear SVM libraries.
We study Doob's martingale convergence theorem for computable continuous time martingales on Brownian motion, in the context of algorithmic randomness.
A characterization of the class of sample points for which the theorem holds is given.
Such points are given the name of Doob random points.
It is shown that a point is Doob random if its tail is computably random in a certain sense.
Moreover, Doob randomness is strictly weaker than computable randomness and is incomparable with Schnorr randomness.
We develop a new algorithm for fitting circles that does not have drawbacks commonly found in existing circle fits.
Our fit achieves ultimate accuracy (to machine precision), avoids divergence, and is numerically stable even when the fitted circles become arbitrarily large.
Lastly, our algorithm takes less than 10 iterations to converge, on average.
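For contrast with iterative geometric fits of this kind, the classical algebraic (Kasa) circle fit solves a single linear least-squares problem; this is a baseline sketch for orientation, not the algorithm proposed above.

```python
def fit_circle_kasa(points):
    """Algebraic (Kasa) circle fit: least-squares solution of
    x^2 + y^2 + D*x + E*y + F = 0, then center/radius recovery."""
    # Accumulate the 3x3 normal equations A^T A p = A^T b,
    # with rows [x, y, 1] and right-hand side b = -(x^2 + y^2).
    sxx = sxy = syy = sx = sy = n = bx = by = b1 = 0.0
    for x, y in points:
        z = -(x * x + y * y)
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        n += 1.0
        bx += x * z
        by += y * z
        b1 += z
    M = [[sxx, sxy, sx, bx],
         [sxy, syy, sy, by],
         [sx, sy, n, b1]]
    # Gaussian elimination with partial pivoting on the augmented matrix
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for cc in range(col, 4):
                M[r][cc] -= f * M[col][cc]
    p = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        p[r] = (M[r][3] - sum(M[r][cc] * p[cc]
                              for cc in range(r + 1, 3))) / M[r][r]
    D, E, F = p
    cx, cy = -D / 2.0, -E / 2.0
    return cx, cy, (cx * cx + cy * cy - F) ** 0.5

# Four points lying exactly on the circle centered at (1, 2) with radius 3
print(fit_circle_kasa([(4, 2), (1, 5), (-2, 2), (1, -1)]))
```

Algebraic fits like this are known to be biased toward smaller circles on noisy arcs, which is one of the drawbacks the proposed geometric fit avoids.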
The conventional model for online planning under uncertainty assumes that an agent can stop and plan without incurring costs for the time spent planning.
However, planning time is not free in most real-world settings.
For example, an autonomous drone is subject to nature's forces, like gravity, even while it thinks, and must either pay a price for counteracting these forces to stay in place, or grapple with the state change caused by acquiescing to them.
Policy optimization in these settings requires metareasoning---a process that trades off the cost of planning and the potential policy improvement that can be achieved.
We formalize and analyze the metareasoning problem for Markov Decision Processes (MDPs).
Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking.
For reasons we discuss, optimal general metareasoning turns out to be impractical, motivating approximations.
We present approximate metareasoning procedures which rely on special properties of the BRTDP planning algorithm and explore the effectiveness of our methods on a variety of problems.
General human action recognition requires understanding of various visual cues.
In this paper, we propose a network architecture that computes and integrates the most important visual cues for action recognition: pose, motion, and the raw images.
For the integration, we introduce a Markov chain model which adds cues successively.
The resulting approach is efficient and applicable to action classification as well as to spatial and temporal action localization.
The two contributions clearly improve the performance over respective baselines.
The overall approach achieves state-of-the-art action classification performance on HMDB51, J-HMDB and NTU RGB+D datasets.
Moreover, it yields state-of-the-art spatio-temporal action localization results on UCF101 and J-HMDB.
We construct custom regularization functions for use in supervised training of deep neural networks.
Our technique is applicable when the ground-truth labels themselves exhibit internal structure; we derive a regularizer by learning an autoencoder over the set of annotations.
Training thereby becomes a two-phase procedure.
The first phase models labels with an autoencoder.
The second phase trains the actual network of interest by attaching an auxiliary branch that must predict output via a hidden layer of the autoencoder.
After training, we discard this auxiliary branch.
We experiment in the context of semantic segmentation, demonstrating this regularization strategy leads to consistent accuracy boosts over baselines, both when training from scratch, or in combination with ImageNet pretraining.
Gains are also consistent over different choices of convolutional network architecture.
As our regularizer is discarded after training, our method has zero cost at test time; the performance improvements are essentially free.
We are simply able to learn better network weights by building an abstract model of the label space, and then training the network to understand this abstraction alongside the original task.
Learning by contrasting positive and negative samples is a general strategy adopted by many methods.
Noise contrastive estimation (NCE) for word embeddings and translating embeddings for knowledge graphs are examples in NLP employing this approach.
In this work, we view contrastive learning as an abstraction of all such methods and augment the negative sampler into a mixture distribution containing an adversarially learned sampler.
The resulting adaptive sampler finds harder negative examples, which forces the main model to learn a better representation of the data.
We evaluate our proposal on learning word embeddings, order embeddings and knowledge graph embeddings and observe both faster convergence and improved results on multiple metrics.
We propose a method for visual question answering which combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions.
This allows more complex questions to be answered using the predominant neural network-based approach than has previously been possible.
It particularly allows questions to be asked about the contents of an image, even when the image itself does not contain the whole answer.
The method constructs a textual representation of the semantic content of an image, and merges it with textual information sourced from a knowledge base, to develop a deeper understanding of the scene viewed.
Priming a recurrent neural network with this combined information, and the submitted question, leads to a very flexible visual question answering approach.
We are specifically able to answer questions posed in natural language, that refer to information not contained in the image.
We demonstrate the effectiveness of our model on two publicly available datasets, Toronto COCO-QA and MS COCO-VQA and show that it produces the best reported results in both cases.
A novel decentralised trajectory generation algorithm for multi-agent systems is presented.
Multi-robot systems have the capacity to transform lives in a variety of fields.
However, trajectory generation for multi-robot systems is still in its nascent stage and limited to heavily controlled environments.
To overcome that, an online trajectory optimization algorithm that generates collision-free trajectories for robots, when given initial state and desired end pose, is proposed.
It utilizes a simple method for obstacle detection, local shape-based maps for obstacles, and communication of the robots' current states.
Using the local maps, safe regions are formulated.
Based upon the communicated data, trajectories are predicted for other robots and incorporated for collision-avoidance by resizing the regions of free space that the robot can be in without colliding.
A trajectory is then optimized constraining the robot to remain within the safe region with the trajectories represented by piecewise polynomials parameterized by time.
The algorithm is implemented using a receding horizon principle.
The proposed algorithm is extensively tested in simulations on Gazebo using ROS with fourth order differentially flat aerial robots and non-holonomic second order wheeled robots in structured and unstructured environments.
Fire disasters are man-made disasters, which cause ecological, social, and economic damage.
To minimize these losses, early detection of fire and an autonomous response are important and helpful to disaster management systems.
Therefore, in this article, we propose an early fire detection framework using fine-tuned convolutional neural networks for CCTV surveillance cameras, which can detect fire in varying indoor and outdoor environments.
To ensure the autonomous response, we propose an adaptive prioritization mechanism for cameras in the surveillance system.
Finally, we propose a dynamic channel selection algorithm for cameras based on cognitive radio networks, ensuring reliable data dissemination.
Experimental results verify the higher accuracy of our fire detection scheme compared to state-of-the-art methods and validate the applicability of our framework for effective fire disaster management.
In recent years, there has been an increasing interest in extending traditional stream processing engines with logical, rule-based, reasoning capabilities.
This poses significant theoretical and practical challenges since rules can derive new information and propagate it both towards past and future time points; as a result, streamed query answers can depend on data that has not yet been received, as well as on data that arrived far in the past.
Stream reasoning algorithms, however, must be able to stream out query answers as soon as possible, and can only keep a limited number of previous input facts in memory.
In this paper, we propose novel reasoning problems to deal with these challenges, and study their computational properties on Datalog extended with a temporal sort and the successor function (a core rule-based language for stream reasoning applications).
PageRank is a fundamental link analysis algorithm that also functions as a key representative of the performance of Sparse Matrix-Vector (SpMV) multiplication.
The traditional PageRank implementation generates fine-granularity random memory accesses, resulting in a large amount of wasteful DRAM traffic and poor bandwidth utilization.
In this paper, we present a novel Partition-Centric Processing Methodology (PCPM) to compute PageRank, that drastically reduces the amount of DRAM communication while achieving high sustained memory bandwidth.
PCPM uses a Partition-centric abstraction coupled with the Gather-Apply-Scatter (GAS) programming model.
By carefully examining how a PCPM based implementation impacts communication characteristics of the algorithm, we propose several system optimizations that improve the execution time substantially.
More specifically, we develop (1) a new data layout that significantly reduces communication and random DRAM accesses, and (2) branch avoidance mechanisms to get rid of unpredictable data-dependent branches.
We perform detailed analytical and experimental evaluation of our approach using 6 large graphs and demonstrate an average 2.7x speedup in execution time and 1.7x reduction in communication volume, compared to the state-of-the-art.
We also show that unlike other GAS based implementations, PCPM is able to further reduce main memory traffic by taking advantage of intelligent node labeling that enhances locality.
Although we use PageRank as the target application in this paper, our approach can be applied to generic SpMV computation.
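For reference, the baseline pull-based PageRank that partition-centric processing accelerates amounts to a repeated sparse matrix-vector product; a minimal sketch (assuming no dangling nodes, and with no regard for memory layout) follows.

```python
def pagerank(edges, d=0.85, iters=100):
    """Baseline pull-based PageRank via repeated SpMV: each iteration
    gathers rank from in-neighbors, the fine-grained access pattern that
    partition-centric layouts reorganize. Assumes no dangling nodes."""
    nodes = sorted({n for e in edges for n in e})
    out_deg = {u: 0 for u in nodes}
    in_nbrs = {u: [] for u in nodes}
    for u, v in edges:
        out_deg[u] += 1
        in_nbrs[v].append(u)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        rank = {v: (1.0 - d) / n
                   + d * sum(rank[u] / out_deg[u] for u in in_nbrs[v])
                for v in nodes}
    return rank

edges = [(0, 1), (1, 2), (2, 0), (2, 1)]
ranks = pagerank(edges)
print(max(ranks, key=ranks.get))   # node 1 has the most in-link mass
```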
In the one-way quantum computation (1WQC) model, universal quantum computations are performed by applying measurements to designated qubits in a highly entangled state.
The choices of bases for these measurements as well as the structure of the entanglements specify a quantum algorithm.
As scalable and reliable quantum computers have not been implemented yet, quantum computation simulators are the only widely available tools to design and test quantum algorithms.
However, simulating the quantum computations on a standard classical computer in most cases requires exponential memory and time.
In this paper, a general direct simulator for 1WQC, called OWQS, is presented.
Some techniques such as qubit elimination, pattern reordering and implicit simulation of actions are used to considerably reduce the time and memory needed for the simulations.
Moreover, our simulator is adjusted to simulate measurement patterns with a generalized flow without calculating the measurement probabilities; this variant is called the extended one-way quantum computation simulator (EOWQS).
Experimental results validate the feasibility of the proposed simulators and that OWQS and EOWQS are faster as compared with the well-known quantum circuit simulators, i.e., QuIDDPro and libquantum for simulating 1WQC model.
Grid computing has attracted many researchers over the years, and as a result many new protocols have emerged and evolved since its inception a decade ago.
Grid protocols play a major role in implementing services that facilitate coordinated resource sharing across diverse organizations.
In this paper, we provide comprehensive coverage of different core Grid protocols that can be used in Global Grid Computing.
We establish the classification of core Grid protocols into i) Grid network communication and Grid data transfer protocols, ii) Grid information security protocols, iii) Grid resource information protocols, iv) Grid management protocols, and v) Grid interface protocols, depending upon the kind of activities handled by these protocols.
All the classified protocols are also organized into layers of the Hourglass model of Grid architecture to understand dependency among these protocols.
We also present the characteristics of each protocol.
For better understanding of these protocols, we also discuss applied protocols as examples, drawn either from the Globus Toolkit or from other popular Grid middleware projects.
We believe that our classification and characterization of Grid protocols will enable better understanding of core Grid protocols and will motivate further research in the area of Global Grid Computing.
Organizations and teams collect and acquire data from various sources, such as social interactions, financial transactions, sensor data, and genome sequencers.
Different teams in an organization as well as different data scientists within a team are interested in extracting a variety of insights which require combining and collaboratively analyzing datasets in diverse ways.
DataHub is a system that aims to provide robust version control and provenance management for such a scenario.
To be truly useful for collaborative data science, one also needs the ability to specify queries and analysis tasks over the versioning and the provenance information in a unified manner.
In this paper, we present an initial design of our query language, called VQuel, that aims to support such unified querying over both types of information, as well as the intermediate and final results of analyses.
We also discuss some of the key language design and implementation challenges moving forward.
This article reports on an exploratory case study conducted to examine the viability of Second Life (SL) as an environment for physical simulations and microworlds.
It begins by discussing specific features of the SL environment relevant to its use as a support for microworlds and simulations, as well as a few differences found between SL and traditional simulators such as Modellus and their implications for simulations, as groundwork for the subsequent analysis.
Afterwards, we use the criteria of Narayanasamy et al. and of Johnston and Whitehead to analyze the SL environment and determine into which of the categories of training simulators, games, simulation games, or serious games SL fits best.
We conclude that SL is a vast and sophisticated simulator of an entire Earth-like world, used by thousands of users to simulate real life in some sense, and a viable, flexible platform for microworlds and simulations.
Eigenvector localization refers to the situation when most of the components of an eigenvector are zero or near-zero.
This phenomenon has been observed on eigenvectors associated with extremal eigenvalues, and in many of those cases it can be meaningfully interpreted in terms of "structural heterogeneities" in the data.
For example, the largest eigenvectors of adjacency matrices of large complex networks often have most of their mass localized on high-degree nodes; and the smallest eigenvectors of the Laplacians of such networks are often localized on small but meaningful community-like sets of nodes.
Here, we describe localization associated with low-order eigenvectors, i.e., eigenvectors corresponding to eigenvalues that are not extremal but that are "buried" further down in the spectrum.
Although we have observed it in several unrelated applications, this phenomenon of low-order eigenvector localization defies common intuitions and simple explanations, and it creates serious difficulties for the applicability of popular eigenvector-based machine learning and data analysis tools.
After describing two examples where low-order eigenvector localization arises, we present a very simple model that qualitatively reproduces several of the empirically-observed results.
This model suggests certain coarse structural similarities among the seemingly-unrelated applications where we have observed low-order eigenvector localization, and it may be used as a diagnostic tool to help extract insight from data graphs when such low-order eigenvector localization is present.
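To make "localization" concrete, a standard quantitative proxy (not one prescribed by this article) is the inverse participation ratio (IPR) of a normalized eigenvector: it is near 1/n for a delocalized vector and near 1 when the mass is concentrated on a few components.

```python
# Inverse participation ratio: sum of fourth powers of a unit vector.
# Uniform vector of length n -> 1/n; one-hot vector -> 1.
import numpy as np

def ipr(v):
    v = v / np.linalg.norm(v)
    return float(np.sum(v ** 4))

delocalized = np.ones(100)                 # uniform: IPR = 1/100
localized = np.zeros(100)
localized[0] = 1.0                         # one-hot: IPR = 1
```

Scanning the IPR across the spectrum of a graph Laplacian or adjacency matrix is one simple way to spot the low-order localized eigenvectors the article describes.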
This paper introduces deep neural networks (DNNs) as add-on blocks to baseline feedback control systems to enhance tracking performance of arbitrary desired trajectories.
The DNNs are trained to adapt the reference signals to the feedback control loop.
The goal is to achieve a unity map between the desired and the actual outputs.
In previous work, the efficacy of this approach was demonstrated on quadrotors; on 30 unseen test trajectories, the proposed DNN approach achieved an average impromptu tracking error reduction of 43% as compared to the baseline feedback controller.
Motivated by these results, this work aims to provide platform-independent design guidelines for the proposed DNN-enhanced control architecture.
In particular, we provide specific guidelines for the DNN feature selection, derive conditions for when the proposed approach is effective, and show in which cases the training efficiency can be further increased.
An important disadvantage of the h-index is that typically it cannot take into account the specific field of research of a researcher.
Usually, sample point estimates of the average and median h-index values for the various fields are reported; these are highly variable and depend on the specific samples, so it would be useful to provide confidence intervals instead.
In this paper we apply the non-parametric bootstrap technique for constructing confidence intervals for the h-index for different fields of research.
In this way, no specific assumptions about the distribution of the empirical h-index are required, nor are large samples, since the methodology is based on resampling from the initial sample.
The results of the analysis showed important differences between the various fields.
The performance of the bootstrap intervals for the mean and median h-index for most fields seems to be rather satisfactory as revealed by the performed simulation.
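The non-parametric bootstrap described above is straightforward to sketch. The following uses a hypothetical h-index sample (not the paper's data) to build a percentile confidence interval for the field's median:

```python
# Non-parametric bootstrap percentile CI for the median h-index.
# h_indices is an illustrative, made-up sample for one field.
import random
import statistics

def bootstrap_ci(sample, stat=statistics.median, reps=2000, alpha=0.05):
    random.seed(0)                       # deterministic for the example
    n = len(sample)
    boots = sorted(
        stat([random.choice(sample) for _ in range(n)]) for _ in range(reps)
    )
    lo = boots[int((alpha / 2) * reps)]
    hi = boots[int((1 - alpha / 2) * reps) - 1]
    return lo, hi

h_indices = [3, 5, 7, 8, 10, 12, 15, 18, 21, 25]   # hypothetical field sample
low, high = bootstrap_ci(h_indices)
```

Because every bootstrap statistic is computed from a resample of the original data, no distributional assumption about the h-index is needed, which is the point made above.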
Random fields are useful mathematical objects in the characterization of non-deterministic complex systems.
A fundamental issue in the evolution of dynamical systems is how intrinsic properties of such structures change in time.
In this paper, we propose to quantify how changes in the spatial dependence structure affect the Riemannian metric tensor that equips the model's parametric space.
Defining Fisher curves, we measure the variations in each component of the metric tensor when visiting different entropic states of the system.
Simulations show that the geometric deformations induced by the metric tensor in case of a decrease in the inverse temperature are not reversible for an increase of the same amount, provided there is significant variation in the system entropy: the process of taking a system from a lower entropy state A to a higher entropy state B and then bringing it back to A, induces a natural intrinsic one-way direction of evolution.
In this context, Fisher curves resemble mathematical models of hysteresis in which the natural orientation is pointed by an arrow of time.
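The Riemannian metric tensor referred to above is the Fisher information metric; the abstract does not spell it out, so for reference this is the standard textbook definition of its components for a parametric model p(x; θ):

```latex
g_{ij}(\theta) \;=\; \mathbb{E}_{x \sim p(\cdot;\theta)}\!\left[
  \frac{\partial \log p(x;\theta)}{\partial \theta_i}\,
  \frac{\partial \log p(x;\theta)}{\partial \theta_j}
\right].
```

The Fisher curves of the paper trace how these components vary as the system visits different entropic states.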
We present 3DTouch, a novel 3D wearable input device worn on the fingertip for 3D manipulation tasks.
3DTouch is designed to fill the gap for a 3D input device that is self-contained, mobile, and works universally across various 3D platforms.
This paper presents a low-cost solution to designing and implementing such a device.
Our approach relies on relative positioning technique using an optical laser sensor and a 9-DOF inertial measurement unit.
3DTouch is self-contained, and designed to universally work on various 3D platforms.
The device employs touch input for the benefits of passive haptic feedback, and movement stability.
Moreover, with touch interaction, 3DTouch is conceptually less fatiguing to use over many hours than 3D spatial input devices.
We propose a set of 3D interaction techniques including selection, translation, and rotation using 3DTouch.
An evaluation also demonstrates the device's tracking accuracy of 1.10 mm and 2.33 degrees for subtle touch interaction in 3D space.
Modular solutions like 3DTouch open up a whole new design space for interaction techniques to build upon.
Deep Linking is the process of referring to a specific piece of web content.
Although users can browse their files in desktop environments, they are unable to directly traverse deeper into their content using deep links.
In order to solve this issue, we demonstrate "DeepLinker", a tool which generates and interprets deep links to desktop resources, thus enabling the reference to a certain location within a file using a simple hyperlink.
By default, the service responds with an HTML representation of the resource along with further links to follow.
Additionally, we allow the use of RDF to interlink our deep links with other resources.
Well known in the theory of network flows, Braess's paradox states that in a congested network, adding a new path between destinations can increase the level of congestion.
In transportation networks the phenomenon results from the decisions of network participants who selfishly seek to optimize their own performance metrics.
In an electric power distribution network, an analogous increase in congestion can arise as a consequence of Kirchhoff's laws.
Even for the simplest linear network of resistors and voltage sources, the sudden appearance of congestion due to an additional conductive line is a nonlinear phenomenon that results in a discontinuous change in the network state.
It is argued that the phenomenon can occur in almost any grid in which there are loops, and with the increasing penetration of small-scale distributed generation it suggests challenges ahead in the operation of microgrids.
This paper is devoted to the online dominating set problem and its variants.
We believe the paper represents the first systematic study of the effect of two limitations of online algorithms: making irrevocable decisions while not knowing the future, and being incremental, i.e., having to maintain solutions to all prefixes of the input.
This is quantified through competitive analyses of online algorithms against two optimal algorithms, both knowing the entire input, but only one having to be incremental.
We also consider the competitive ratio of the weaker of the two optimal algorithms against the other.
We consider important graph classes, distinguishing between connected and not necessarily connected graphs.
For the classic graph classes of trees, bipartite, planar, and general graphs, we obtain tight results in almost all cases.
We also derive upper and lower bounds for the class of bounded-degree graphs.
From these analyses, we get detailed information regarding the significance of the necessary requirement that online algorithms be incremental.
In some cases, having to be incremental fully accounts for the online algorithm's disadvantage.
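The incrementality and irrevocability studied above can be illustrated with the most natural online strategy (shown here only for illustration; the paper analyzes classes of such algorithms, not this exact code): when a vertex arrives with no neighbor already in the dominating set, it is added, and that decision can never be undone.

```python
# Natural incremental online dominating-set strategy: add an arriving
# vertex iff none of its already-seen neighbors dominates it.
def online_dominating_set(arrivals):
    """arrivals: list of (vertex, set-of-neighbors-seen-so-far) pairs."""
    D = set()
    for v, nbrs in arrivals:
        if v not in D and not (nbrs & D):
            D.add(v)                  # irrevocable decision
    return D

# Path 1-2-3 arriving in order: vertex 2 is already dominated by 1,
# so only 1 and 3 are taken.
result = online_dominating_set([(1, set()), (2, {1}), (3, {2})])
```

After every prefix of arrivals, `D` is a dominating set of the graph seen so far, which is exactly the incrementality requirement discussed above.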
In this paper we consider cryptographic applications of the arithmetic on the hyperoctahedral group.
On an appropriate subgroup of the latter, we propose in particular to construct public-key cryptosystems based on the discrete logarithm.
The fact that the group of signed permutations has rich properties provides fast and easy implementation and makes these systems resistant to attacks like the Pohlig-Hellman algorithm.
The only negative point is that storing and transmitting permutations requires large memory.
Using the hyperoctahedral enumeration system together with so-called subexceedant functions, we define a one-to-one correspondence between natural numbers and signed permutations, with which we label the message units.
Robot manipulation is increasingly poised to interact with humans in co-shared workspaces.
Despite increasingly robust manipulation and control algorithms, failure modes continue to exist whenever models do not capture the dynamics of the unstructured environment.
To obtain longer-term horizons in robot automation, robots must develop introspection and recovery abilities.
We contribute a set of recovery policies to deal with anomalies produced by external disturbances as well as anomaly classification through the use of non-parametric statistics with memoized variational inference with scalable adaptation.
A recovery critic stands atop of a tightly-integrated, graph-based online motion-generation and introspection system that resolves a wide range of anomalous situations.
Policies, skills, and introspection models are learned incrementally and contextually in a task.
Two task-level recovery policies, re-enactment and adaptation, resolve accidental and persistent anomalies, respectively.
The introspection system uses non-parametric priors along with Markov jump linear systems and memoized variational inference with scalable adaptation to learn a model from the data.
In extensive real-robot experimentation, various strenuous anomalous conditions are induced and resolved at different phases of a task and in different combinations.
The system executes around-the-clock introspection and recovery and even elicited self-recovery when misclassifications occurred.
This paper proposes three simple, compact yet effective representations of depth sequences, referred to respectively as Dynamic Depth Images (DDI), Dynamic Depth Normal Images (DDNI) and Dynamic Depth Motion Normal Images (DDMNI), for both isolated and continuous action recognition.
These dynamic images are constructed from a segmented sequence of depth maps using hierarchical bidirectional rank pooling to effectively capture the spatial-temporal information.
Specifically, DDI exploits the dynamics of postures over time and DDNI and DDMNI exploit the 3D structural information captured by depth maps.
Based on the proposed representations, a ConvNet-based method is developed for action recognition.
The image-based representations enable us to fine-tune the existing Convolutional Neural Network (ConvNet) models trained on image data without training a large number of parameters from scratch.
The proposed method achieved state-of-the-art results on three large datasets, namely, the Large-scale Continuous Gesture Recognition Dataset (mean Jaccard index 0.4109), the Large-scale Isolated Gesture Recognition Dataset (59.21%), and the NTU RGB+D Dataset (87.08% cross-subject and 84.22% cross-view), even though only the depth modality was used.
Within the Semantic Web community, SPARQL is one of the predominant languages to query and update RDF knowledge.
However, the complexity of SPARQL, the underlying graph structure and various encodings are common sources of confusion for Semantic Web novices.
In this paper we present a general-purpose approach to convert any given SPARQL endpoint into a simple-to-use REST API.
To lower the initial hurdle, we represent the underlying graph as an interlinked view of nested JSON objects that can be traversed by the API path.
Neural-network-based open-ended conversational agents automatically generate responses based on predictive models learned from a large number of pairs of utterances.
The generated responses are typically acceptable as a sentence but are often dull, generic, and certainly devoid of any emotion.
In this paper, we present neural models that learn to express a given emotion in the generated response.
We propose four models and evaluate them against 3 baselines.
An encoder-decoder framework-based model with multiple attention layers provides the best overall performance in terms of expressing the required emotion.
While it does not outperform other models on all emotions, it presents promising results in most cases.
A lot of progress has been made on the depth estimation problem in stereo vision.
In particular, very satisfactory performance is observed when deep learning is applied in a supervised manner.
However, this approach needs a huge amount of ground-truth depth maps for training, which are very laborious to prepare and often unavailable in real scenarios.
Thus, unsupervised depth estimation from binocular stereo images, which dispenses with ground-truth depth maps, is the recent trend.
In unsupervised depth computation, the disparity images are generated by training the CNN with an image reconstruction loss based on the epipolar geometry constraints.
Effective ways of using CNNs, as well as better loss functions for this problem, still need to be investigated.
In this paper, a dual-CNN-based model with six losses (DNM6) is presented for unsupervised depth estimation, with an individual CNN for each view generating the corresponding disparity map.
The proposed dual CNN model is also extended with 12 losses (DNM12) by utilizing the cross disparities.
The presented DNM6 and DNM12 models are evaluated on the KITTI driving and Cityscapes urban datasets and compared with recent state-of-the-art results for unsupervised depth estimation.
We explore recurrent encoder multi-decoder neural network architectures for semi-supervised sequence classification and reconstruction.
We find that the use of multiple reconstruction modules helps models generalize in a classification task when only a small amount of labeled data is available, which is often the case in practice.
Such models provide useful high-level representations of motions allowing clustering, searching and faster labeling of new sequences.
We also propose a new, realistic partitioning of a well-known, high quality motion-capture dataset for better evaluations.
We further explore a novel formulation for future-predicting decoders based on conditional recurrent generative adversarial networks, for which we propose both soft and hard constraints for transition generation derived from desired physical properties of synthesized future movements and desired animation goals.
We find that using such constraints allows us to stabilize the training of recurrent adversarial architectures for animation generation.
We study a general class of dynamic multi-agent decision problems with asymmetric information and non-strategic agents, which includes dynamic teams as a special case.
When agents are non-strategic, an agent's strategy is known to the other agents.
Nevertheless, the agents' strategy choices and beliefs are interdependent over time, a phenomenon known as signaling.
We introduce notions of sufficient private information that effectively compress the agents' information in a mutually consistent manner.
Based on the notions of sufficient information, we propose an information state for each agent that is sufficient for decision making purposes.
We present instances of dynamic multi-agent decision problems where we can determine an information state with a time-invariant domain for each agent.
Furthermore, we present a generalization of the policy-independence property of belief in Partially Observed Markov Decision Processes (POMDP) to dynamic multi-agent decision problems.
Within the context of dynamic teams with asymmetric information, the proposed set of information states leads to a sequential decomposition that decouples the interdependence between the agents' strategies and beliefs over time, and enables us to formulate a dynamic program to determine a globally optimal policy via backward induction.
An enduring issue in higher education is student retention to successful graduation.
National statistics indicate that most higher education institutions have four-year degree completion rates around 50 percent, or just half of their student populations.
While there are prediction models which illuminate what factors assist with college student success, interventions that support course selections on a semester-to-semester basis have yet to be deeply understood.
To further this goal, we develop a system to predict students' grades in the courses they will enroll in during the next enrollment term by learning patterns from historical transcript data coupled with additional information about students, courses and the instructors teaching them.
We explore a variety of classic and state-of-the-art techniques which have proven effective for recommendation tasks in the e-commerce domain.
In our experiments, Factorization Machines (FM), Random Forests (RF), and the Personalized Multi-Linear Regression model achieve the lowest prediction error.
Application of a novel feature selection technique is key to the predictive success and interpretability of the FM.
By comparing feature importance across populations and across models, we uncover strong connections between instructor characteristics and student performance.
We also discover key differences between transfer and non-transfer students.
Ultimately we find that a hybrid FM-RF method can be used to accurately predict grades for both new and returning students taking both new and existing courses.
Application of these techniques holds promise for student degree planning, instructor interventions, and personalized advising, all of which could improve retention and academic performance.
Sentences with gapping, such as Paul likes coffee and Mary tea, lack an overt predicate to indicate the relation between two or more arguments.
Surface syntax representations of such sentences are often produced poorly by parsers, and even if correct, not well suited to downstream natural language understanding tasks such as relation extraction that are typically designed to extract information from sentences with canonical clause structure.
In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges.
We find that both methods can reconstruct elided material from dependency trees with high accuracy when the parser correctly predicts the existence of a gap.
We further demonstrate that one of our methods can be applied to other languages based on a case study on Swedish.
Recently, deep convolutional neural network (DCNN) achieved increasingly remarkable success and rapidly developed in the field of natural image recognition.
Compared with natural images, remote sensing images are larger in scale, and the scenes and objects they represent are more macroscopic.
This study inquires whether remote sensing scene and natural scene recognitions differ and raises the following questions: What are the key factors in remote sensing scene recognition?
Is the DCNN recognition mechanism centered on object recognition still applicable to the scenarios of remote sensing scene understanding?
We performed several experiments to explore the influence of the DCNN structure and of image scale on remote sensing scene understanding, from the perspective of scene complexity.
Our experiments show that understanding a complex scene depends on a deep network and multiple-scale perception.
Using a visualization method, we qualitatively and quantitatively analyze the recognition mechanism in a complex remote sensing scene and demonstrate the importance of multi-objective joint semantic support.
This paper reports research in progress investigating the use of games technology to enhance the learning of a physical skill.
The Microsoft Kinect is a system designed for gaming with the capability to track the movement of users.
Our research explored whether such a system could be used to provide feedback when teaching sign vocabulary.
Whilst there are technologies available for teaching sign language, currently none provide feedback on the accuracy of the users' attempts at making signs.
In this paper we report how the three-dimensional display capability of the technology can enhance the users' experience.
We also consider how and when feedback should be given when tracking is used to identify errors in physical movements.
A design science approach was undertaken to find a solution to this real-world problem.
The design and implementation of the solution provides interesting insights into how technology can not only emulate but also improve upon traditional learning of physical skills.
With the success of deep learning techniques in a broad range of application domains, many deep learning software frameworks have been developed and are being updated frequently to adapt to new hardware features and software libraries, which bring a big challenge for end users and system administrators.
To address this problem, container techniques are widely used to simplify the deployment and management of deep learning software.
However, it remains unknown whether container techniques bring any performance penalty to deep learning applications.
The purpose of this work is to systematically evaluate the impact of Docker containers on the performance of deep learning applications.
We first benchmark the performance of system components (IO, CPU and GPU) in a Docker container and in the host system and compare the results to see if there is any difference.
According to our results, computationally intensive jobs, whether running on CPU or GPU, incur only a small overhead, indicating that Docker containers can be applied to deep learning programs.
Then we evaluate the performance of some popular deep learning tools deployed in a docker container and the host system.
It turns out that the Docker container does not cause any noticeable drawbacks while running those deep learning tools.
Encapsulating deep learning tools in a container is therefore a feasible solution.
Most learning algorithms require the practitioner to manually set the values of many hyperparameters before the learning process can begin.
However, with modern algorithms, the evaluation of a given hyperparameter setting can take a considerable amount of time and the search space is often very high-dimensional.
We suggest using a lower-dimensional representation of the original data to quickly identify promising areas in the hyperparameter space.
This information can then be used to initialize the optimization algorithm for the original, higher-dimensional data.
We compare this approach with the standard procedure of optimizing the hyperparameters only on the original input.
We perform experiments with various state-of-the-art hyperparameter optimization algorithms such as random search, the Tree of Parzen Estimators (TPE), sequential model-based algorithm configuration (SMAC), and a genetic algorithm (GA).
Our experiments indicate that it is possible to speed up the optimization process by using lower-dimensional data representations at the beginning, while increasing the dimensionality of the input later in the optimization process.
This is independent of the underlying optimization procedure, making the approach promising for many existing hyperparameter optimization algorithms.
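The two-stage idea above can be sketched with a toy, made-up objective (standing in for "validation error on data of a given dimensionality"; this is not the paper's experimental setup): run a cheap coarse search on the low-dimensional representation, then warm-start a local search on the full data from the best coarse candidate.

```python
# Two-stage hyperparameter search: coarse on low-dim data, then a
# warm-started local refinement on the full-dimensional data.
import random

def cost(lam, dim):
    # Hypothetical stand-in for the validation error of a model with
    # regularization strength lam, trained on dim-dimensional data.
    return (lam - 0.3) ** 2 + 0.01 * dim / (1 + lam)

def best_candidate(dim, candidates):
    return min(candidates, key=lambda lam: cost(lam, dim))

random.seed(0)
coarse = [random.uniform(0, 1) for _ in range(50)]
best_low = best_candidate(dim=5, candidates=coarse)        # cheap stage
refined = [min(1.0, max(0.0, best_low + random.gauss(0, 0.05)))
           for _ in range(20)]                             # local stage
best_full = best_candidate(dim=500, candidates=refined + [best_low])
```

Because the refinement set always includes the warm-start point, the second stage can only match or improve on the coarse result, mirroring the initialization benefit described above.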
Learning controllers for bipedal robots is a challenging problem, often requiring expert knowledge and extensive tuning of parameters that vary in different situations.
Recently, deep reinforcement learning has shown promise at automatically learning controllers for complex systems in simulation.
This has been followed by a push towards learning controllers that can be transferred between simulation and hardware, primarily with the use of domain randomization.
However, domain randomization can make the problem of finding stable controllers even more challenging, especially for underactuated bipedal robots.
In this work, we explore whether policies learned in simulation can be transferred to hardware with the use of high-fidelity simulators and structured controllers.
We learn a neural network policy which is a part of a more structured controller.
While the neural network is learned in simulation, the rest of the controller stays fixed, and can be tuned by the expert as needed.
We show that using this approach can greatly speed up the rate of learning in simulation, as well as enable transfer of policies between simulation and hardware.
We present our results on an ATRIAS robot and explore the effect of action spaces and cost functions on the rate of transfer between simulation and hardware.
Our results show that structured policies can indeed be learned in simulation and implemented on hardware successfully.
This has several advantages, as the structure preserves the intuitive nature of the policy, and the neural network improves the performance of the hand-designed policy.
In this way, we propose a way of using neural networks to improve expert designed controllers, while maintaining ease of understanding.
Spectrum sensing is a key challenge in cognitive radio design and implementation; it allows the secondary user to access the primary bands without interfering with primary users.
Cognitive radios should decide on the best spectrum band, among all available spectrum bands, to meet the quality-of-service requirements.
This paper investigates integrated centralized spectrum sensing techniques in a multipath fading environment; the performance is analyzed with energy detection and wavelet-based sensing techniques for unknown signals.
Keywords: Cognitive Radio, Spectrum Sensing, Signal Detection, Primary User, Secondary User
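The energy-detection technique mentioned above admits a very small sketch (illustrative parameters and synthetic samples, not the paper's fading setup): average the received energy over a window and compare it against a threshold to declare the primary user present or absent.

```python
# Energy detection for spectrum sensing: decide H1 (primary user
# present) iff the average sample energy exceeds a threshold.
import random

def energy_detect(samples, threshold):
    energy = sum(x * x for x in samples) / len(samples)
    return energy > threshold

random.seed(1)
N = 1000
noise_only = [random.gauss(0, 1) for _ in range(N)]    # H0: noise, power ~1
signal = [x + 2.0 for x in noise_only]                 # H1: strong PU signal
```

The threshold trades off false alarms under H0 against missed detections under H1; in fading environments that trade-off is exactly what the paper analyzes.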
We propose a regularized zero-forcing transmit precoding (RZF-TPC) aided and distance-based adaptive coding and modulation (ACM) scheme to support aeronautical communication applications, by exploiting the high spectral efficiency of large-scale antenna arrays and link adaption.
Our RZF-TPC aided and distance-based ACM scheme switches its mode according to the distance between the communicating aircraft.
We derive the closed-form asymptotic signal-to-interference-plus-noise ratio (SINR) expression of the RZF-TPC for the aeronautical channel, which is Rician, relying on a non-centered channel matrix that is dominated by the deterministic line-of-sight component.
The effects of both realistic channel estimation errors and of the co-channel interference are considered in the derivation of this approximate closed-form SINR formula.
Furthermore, we derive the analytical expression of the optimal regularization parameter that minimizes the mean square detection error.
The achievable throughput expression based on our asymptotic approximate SINR formula is then utilized as the design metric for the proposed RZF-TPC aided and distance-based ACM scheme.
Monte-Carlo simulation results are presented for validating our theoretical analysis as well as for investigating the impact of the key system parameters.
The simulation results closely match the theoretical results.
In the specific example where two communicating aircraft fly at a typical cruising speed of 920 km/h, heading in opposite directions over a distance of up to 740 km for a period of about 24 minutes, the RZF-TPC aided and distance-based ACM is capable of transmitting a total of 77 Gigabytes of data with the aid of 64 transmit antennas and 4 receive antennas, which is significantly higher than that of our previous eigen-beamforming transmit precoding aided and distance-based ACM benchmark.
Dense word embeddings, which encode the semantic meanings of words into low-dimensional vector spaces, have become very popular in natural language processing (NLP) research due to their state-of-the-art performance in many NLP tasks.
Word embeddings are substantially successful in capturing semantic relations among words, so a meaningful semantic structure must be present in the respective vector spaces.
However, in many cases, this semantic structure is broadly and heterogeneously distributed across the embedding dimensions, which makes interpretation a big challenge.
In this study, we propose a statistical method to uncover the latent semantic structure in the dense word embeddings.
To perform our analysis we introduce a new dataset (SEMCAT) that contains more than 6500 words semantically grouped under 110 categories.
We further propose a method to quantify the interpretability of the word embeddings; the proposed method is a practical alternative to the classical word intrusion test that requires human intervention.
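As a toy illustration of category-driven interpretability scoring (the function and data below are hypothetical simplifications, not the paper's proposed measure), a dimension can be scored by how many of its top-ranked words fall into one semantic category:

```python
import numpy as np

def dimension_interpretability(E, vocab, category, dim, k=2):
    # Hypothetical score: the fraction of the top-k words along
    # embedding dimension `dim` that belong to `category`.
    top = np.argsort(E[:, dim])[::-1][:k]
    return sum(vocab[i] in category for i in top) / k

vocab = ["cat", "dog", "car", "bus", "apple", "pear"]
E = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.1, 0.9],
              [0.2, 0.8],
              [0.0, 0.1],
              [0.1, 0.0]])
score = dimension_interpretability(E, vocab, {"cat", "dog"}, dim=0)
```

Here dimension 0 is maximally "interpretable" for the animal category because its two top words are exactly the category members; a SEMCAT-style dataset supplies such categories at scale.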
This paper introduces the Java Software Evolution Tracker, a visualization and analysis tool that provides practitioners the means to examine the evolution of a software system from a top to bottom perspective, starting with changes in the graphical user interface all the way to source code modifications.
Hierarchical Task Network (HTN) planning uses task decomposition to plan for an executable sequence of actions as a solution to a problem.
In order to reason effectively, an HTN planner needs expressive domain knowledge.
For instance, a simplified HTN planning system such as JSHOP2 relies on this expressivity and avoids handling some task interactions because of the complexity they add to the planning process.
We address the possibility of simplifying the domain representation needed for an HTN planner to find good solutions, especially in real-world domains describing home and building automation environments.
We extend the JSHOP2 planner to reason about the task interaction that occurs when a task's effects are already achieved by other tasks.
The planner then prunes some of the redundant searches that can occur due to the planning process's interleaving nature.
We evaluate the original and our improved planner on two benchmark domains.
We show that our planner behaves better by using simplified domain knowledge and outperforms JSHOP2 in a number of relevant cases.
Many species dream, yet there remain many open research questions in the study of dreams.
The symbolism of dreams and their interpretation is present in cultures throughout history.
Analysis of online data sources for dream interpretation using network science leads to understanding symbolism in dreams and their associated meaning.
In this study, we introduce dream interpretation networks for English, Chinese and Arabic that represent different cultures from various parts of the world.
We analyze communities in these networks, finding that symbols within a community are semantically related.
The central nodes in communities give insight about cultures and symbols in dreams.
The community structure of different networks highlights cultural similarities and differences.
Interconnections between different networks are also identified by translating symbols from different languages into English.
Structural correlations across networks point out relationships between cultures.
Similarities between network communities are also investigated by analysis of sentiment in symbol interpretations.
We find that interpretations within a community tend to have similar sentiment.
Furthermore, we cluster communities based on their sentiment, yielding three main categories of positive, negative, and neutral dream symbols.
We consider peer review in a conference setting where there is typically an overlap between the set of reviewers and the set of authors.
This overlap can incentivize strategic reviews to influence the final ranking of one's own papers.
In this work, we address this problem through the lens of social choice, and present a theoretical framework for strategyproof and efficient peer review.
We first present and analyze an algorithm for reviewer-assignment and aggregation that guarantees strategyproofness and a natural efficiency property called unanimity, when the authorship graph satisfies a simple property.
Our algorithm is based on the so-called partitioning method, and can be thought of as a generalization of this method to conference peer review settings.
We then empirically show that the requisite property on the authorship graph is indeed satisfied in the ICLR-17 submission data, and further demonstrate a simple trick to make the partitioning method more practically appealing for conference peer review.
Finally, we complement our positive results with negative theoretical results where we prove that under various ways of strengthening the requirements, it is impossible for any algorithm to be strategyproof and efficient.
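The core partitioning idea can be sketched as follows (a minimal illustration, not the paper's full reviewer-assignment and aggregation algorithm; names and the two-way split are assumptions): reviewers are split into two groups, and each paper may only be reviewed by members of the group containing none of its authors, so no reviewer can influence the ranking of papers from their own side.

```python
def partition_assign(papers, authors_of, reviewers, partition):
    # partition[r] is 0 or 1 for every reviewer/author r.
    # A paper is assignable only if all its authors sit on one side;
    # its eligible reviewers are everyone on the other side.
    assignment = {}
    for p in papers:
        sides = {partition[a] for a in authors_of[p]}
        if len(sides) == 1:
            side = sides.pop()
            assignment[p] = [r for r in reviewers if partition[r] != side]
    return assignment

partition = {"alice": 0, "bob": 0, "carol": 1, "dave": 1}
authors_of = {"p1": ["alice"], "p2": ["carol", "dave"]}
asg = partition_assign(["p1", "p2"], authors_of, list(partition), partition)
```

Papers whose authors straddle the partition are left unassigned in this sketch; the requisite property on the authorship graph mentioned above is what makes such cases manageable in practice.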
We consider the problem of inference in discrete probabilistic models, that is, distributions over subsets of a finite ground set.
These encompass a range of well-known models in machine learning, such as determinantal point processes and Ising models.
Locally-moving Markov chain Monte Carlo algorithms, such as the Gibbs sampler, are commonly used for inference in such models, but their convergence is, at times, prohibitively slow.
This is often caused by state-space bottlenecks that greatly hinder the movement of such samplers.
We propose a novel sampling strategy that uses a specific mixture of product distributions to propose global moves and, thus, accelerate convergence.
Furthermore, we show how to construct such a mixture using semigradient information.
We illustrate the effectiveness of combining our sampler with existing ones, both theoretically on an example model, as well as practically on three models learned from real-world data sets.
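A minimal sketch of the global-move idea (the mixture components here are hand-picked Bernoulli products, not the semigradient-constructed mixture of the paper): one Metropolized independence step proposes a whole new subset from a mixture of product distributions, which can jump across state-space bottlenecks that defeat single-site Gibbs moves.

```python
import math, random

def product_sample(q):
    # Draw a binary vector with independent Bernoulli(q_i) coordinates.
    return tuple(int(random.random() < qi) for qi in q)

def product_logp(x, q):
    return sum(math.log(qi if xi else 1.0 - qi) for xi, qi in zip(x, q))

def mixture_mh_step(x, log_target, mixture):
    # `mixture` is a list of (weight, bernoulli_probs) pairs.
    weights, comps = zip(*mixture)
    q = random.choices(comps, weights=weights)[0]
    y = product_sample(q)
    log_mix = lambda z: math.log(sum(w * math.exp(product_logp(z, qi))
                                     for w, qi in mixture))
    # Metropolis-Hastings ratio for an independence proposal.
    log_acc = (log_target(y) - log_target(x)) + (log_mix(x) - log_mix(y))
    return y if math.log(random.random() + 1e-300) < log_acc else x
```

With one component concentrated near each mode, the sampler can hop between modes in a single step instead of crossing the low-probability bottleneck coordinate by coordinate.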
A multiple instance dictionary learning method using functions of multiple instances (DL-FUMI) is proposed to address target detection and two-class classification problems with inaccurate training labels.
Given inaccurate training labels, DL-FUMI learns a set of target dictionary atoms that describe the most distinctive and representative features of the true positive class as well as a set of nontarget dictionary atoms that account for the shared information found in both the positive and negative instances.
Experimental results show that the estimated target dictionary atoms found by DL-FUMI are more representative prototypes and identify better discriminative features of the true positive class than existing methods in the literature.
DL-FUMI is shown to perform significantly better than other multiple instance learning (MIL) dictionary learning algorithms on a variety of MIL target detection and classification problems.
In this paper, technological solutions for improving the quality of video transfer over wireless networks are investigated.
Tools have been developed that allow packets containing key-frame data to be duplicated.
In the paper, we tested video streams with duplication of all frames, with duplication of key frames only, and without duplication.
The experiments showed that the best results are obtained by duplicating the packets that contain key frames.
The paper also provides an overview of the coefficients describing the dependence of video quality on packet loss and delay variation (network jitter).
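The duplication policy that performed best in the experiments can be sketched as follows (frame labels and packet representation are illustrative, not the paper's tooling):

```python
def packets_to_send(stream):
    # Duplicate only packets that carry key (I) frames, leaving
    # delta (P/B) frames unduplicated. `stream` is a list of
    # (frame_type, packet) pairs.
    out = []
    for frame_type, packet in stream:
        out.append(packet)
        if frame_type == "I":      # key frame: send a duplicate copy
            out.append(packet)
    return out

sent = packets_to_send([("I", "k0"), ("P", "d1"), ("P", "d2"), ("I", "k3")])
```

This spends extra bandwidth only on the frames whose loss corrupts every dependent frame until the next key frame.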
We propose a novel scene graph generation model called Graph R-CNN, which is both effective and efficient at detecting objects and their relations in images.
Our model contains a Relation Proposal Network (RePN) that efficiently deals with the quadratic number of potential relations between objects in an image.
We also propose an attentional Graph Convolutional Network (aGCN) that effectively captures contextual information between objects and relations.
Finally, we introduce a new evaluation metric that is more holistic and realistic than existing metrics.
We report state-of-the-art performance on scene graph generation as evaluated using both existing and our proposed metrics.
We consider the previously defined notion of finite-state independence and we focus specifically on normal words.
We characterize finite-state independence of normal words in three different ways, using three different kinds of asynchronous deterministic finite automata with two input tapes containing infinite words.
Based on one of the characterizations we give an algorithm to construct a pair of finite-state independent normal words.
A super point is a special host in a network that communicates with many other hosts in a certain time period.
The number of hosts contacting a super point is called its cardinality.
Cardinality estimation plays an important role in network management and security.
All existing works focus on how to estimate a super point's cardinality under a discrete time window.
But a discrete time window incurs a long delay, and the accuracy of the estimate depends on where the window starts. A sliding time window, which moves forward by a small slice at a time, offers a more accurate and timely scale at which to monitor a super point's cardinality.
On the other hand, estimating a super point's cardinality under a sliding time window is more difficult, because it requires an algorithm to record the cardinality incrementally and report it immediately at the end of the sliding duration.
This paper is the first to solve this problem, by devising SRLA, an algorithm that works under a sliding time window.
SRLA records host cardinalities in a novel structure that can be updated incrementally.
In order to reduce the cardinality-estimation time at the end of every sliding time window, SRLA generates a super point candidate list while scanning packets and calculates the cardinalities of the hosts in the candidate list only.
It can also run in parallel to handle high-speed networks at line speed.
This paper shows how to deploy SRLA on a common GPU.
Experiments on real-world traffic with 40 GB/s bandwidth show that SRLA successfully estimates super points' cardinalities within 100 milliseconds under a sliding time window when running on a low-cost Nvidia GPU, a GTX 650 with 1 GB of memory.
The estimation time of SRLA is much smaller than that of other algorithms, which consume more than 2000 milliseconds under a discrete time window.
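To make the sliding-window setting concrete, here is an exact (and deliberately naive) per-host sliding-window cardinality counter; SRLA's contribution is replacing this memory-heavy exact bookkeeping with a compact structure that updates incrementally and answers at line speed, which this sketch does not attempt:

```python
from collections import defaultdict

class SlidingCardinality:
    def __init__(self, window):
        self.window = window
        self.last_seen = defaultdict(dict)      # host -> {peer: last time}

    def observe(self, host, peer, t):
        self.last_seen[host][peer] = t

    def cardinality(self, host, now):
        peers = self.last_seen[host]
        expired = [p for p, t in peers.items() if t <= now - self.window]
        for p in expired:                       # slid out of the window
            del peers[p]
        return len(peers)

sc = SlidingCardinality(window=10)
sc.observe("h", "a", 1); sc.observe("h", "b", 5); sc.observe("h", "a", 8)
```

Note how re-observing peer "a" at time 8 keeps it in the window even after its first contact expires, which is exactly the incremental-update behavior a sliding-window estimator must reproduce.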
An important component of achieving language understanding is mastering the composition of sentence meaning, but an immediate challenge to solving this problem is the opacity of sentence vector representations produced by current neural sentence composition models.
We present a method to address this challenge, developing tasks that directly target compositional meaning information in sentence vector representations with a high degree of precision and control.
To enable the creation of these controlled tasks, we introduce a specialized sentence generation system that produces large, annotated sentence sets meeting specified syntactic, semantic and lexical constraints.
We describe the details of the method and generation system, and then present results of experiments applying our method to probe for compositional information in embeddings from a number of existing sentence composition models.
We find that the method is able to extract useful information about the differing capacities of these models, and we discuss the implications of our results with respect to these systems' capturing of sentence information.
We make available for public use the datasets used for these experiments, as well as the generation system.
In fairly elementary terms this paper presents how the theory of preordered fuzzy sets, more precisely quantale-valued preorders on quantale-valued fuzzy sets, is established under the guidance of enriched category theory.
Motivated by several key results from the theory of quantaloid-enriched categories, this paper develops all needed ingredients purely in order-theoretic languages for the readership of fuzzy set theorists, with particular attention paid to fuzzy Galois connections between preordered fuzzy sets.
This paper proposes a simple, yet very effective method to localize dominant foreground objects in an image, to pixel-level precision.
The proposed method 'MASON' (Model-AgnoStic ObjectNess) uses a deep convolutional network to generate category-independent and model-agnostic heat maps for any image.
The network is not explicitly trained for the task, and hence, can be used off-the-shelf in tandem with any other network or task.
We show that this framework scales to a wide variety of images, and illustrate the effectiveness of MASON in three varied application contexts.
The null vector method, based on a simple linear algebraic concept, is proposed as a solution to the phase retrieval problem.
In the case with complex Gaussian random measurement matrices, a non-asymptotic error bound is derived, yielding an asymptotic regime of accurate approximation comparable to that for the spectral vector method.
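A minimal numerical sketch of the null vector idea (toy dimensions and the weakest-half selection rule are illustrative choices, not the paper's analysis): rows with the smallest measurement magnitudes are nearly orthogonal to the signal, so the signal is estimated as the vector most orthogonal to those rows, i.e. the smallest right singular vector of the weak-measurement submatrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 400
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x /= np.linalg.norm(x)
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
b = np.abs(A @ x)                      # phaseless measurements

# Keep the half of the rows with the smallest magnitudes; the estimate
# is the right singular vector of that submatrix with smallest
# singular value (the "null vector").
weak = A[np.argsort(b)[: m // 2]]
x_hat = np.linalg.svd(weak)[2][-1].conj()
corr = np.abs(np.vdot(x, x_hat))       # 1 would mean perfect (up to phase)
```

Correlation well above chance with no spectral pre-processing is what makes the null vector attractive as an initializer for phase retrieval.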
The number of applications on online mobile application stores is increasing at a rapid rate.
Smart-phones are used by a wide range of people who vary in age as well as in their ability to use a smart-phone.
With the increasing dependency on smart-phones, the paper aims to determine whether the popular applications on Google Play, the official store for Android applications, can be used by people with vision impairment.
The accessibility of the applications was tested using an external keyboard, and TalkBack, an accessibility tool developed by Google.
It was found that several popular applications on the store were not designed keeping accessibility in mind.
It was observed that there exists a weak positive relationship between the popularity of the application and its accessibility.
A framework is proposed that can be used by developers to improve the accessibility of an application.
The paper also discusses the programming aspects to be considered while developing an Android application, so that the application can be used by sighted as well as visually impaired users.
The abundance of poorly optimized mobile applications coupled with their increasing centrality in our digital lives make a framework for mobile app optimization an imperative.
While tuning strategies for desktop and server applications have a long history, it is difficult to adapt them for use on mobile phones.
Reference inputs that trigger behavior similar to a mobile application's typical usage are hard to construct.
For many classes of applications, the very concept of typical behavior is nonexistent, with each user interacting with the application in very different ways.
In contexts like this, optimization strategies need to evaluate their effectiveness against real user input, but doing so online runs the risk of user dissatisfaction when suboptimal optimizations are evaluated.
In this paper we present an iterative compiler which employs a novel capture and replay technique in order to collect real user input and use it later to evaluate different transformations offline.
The proposed mechanism identifies and stores only the set of memory pages needed to replay the most heavily used functions of the application.
During idle periods, this minimal state is combined with different binaries of the application, each one built with different optimizations enabled.
Replaying the targeted functions allows us to evaluate the effectiveness of each set of optimizations for the actual way the user interacts with the application.
For the BEEBS benchmark suite, our approach was able to improve performance by up to 57%, while keeping the slowdown experienced by the user on average at 0.8%.
By focusing only on heavily used functions, we are able to conserve storage space by between two and three orders of magnitude compared to typical capture and replay implementations.
This paper develops efficient algorithms for distributed average consensus with quantized communication using the alternating direction method of multipliers (ADMM).
We first study the effects of probabilistic and deterministic quantizations on a distributed ADMM algorithm.
With probabilistic quantization, this algorithm yields linear convergence to the desired average in the mean sense with a bounded variance.
When deterministic quantization is employed, the distributed ADMM either converges to a consensus or cycles with a finite period after a finite-time iteration.
In the cyclic case, local quantized variables have the same mean over one period and hence each node can also reach a consensus.
We then obtain an upper bound on the consensus error which depends only on the quantization resolution and the average degree of the network.
Finally, we propose a two-stage algorithm which combines both probabilistic and deterministic quantizations.
Simulations show that the two-stage algorithm, without requiring a small algorithm parameter, achieves consensus errors that are typically less than one quantization resolution for all connected networks where agents' data can be of arbitrary magnitudes.
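The probabilistic quantizer at the heart of the mean-convergence result can be sketched in a few lines (a generic unbiased dither quantizer, not the paper's full ADMM iteration): each value is rounded up or down at random so that the quantized output is unbiased in expectation.

```python
import math, random

def prob_quantize(x, delta):
    # Round x to the grid {k * delta} by choosing the lower or upper
    # neighbor with probabilities that make E[Q(x)] = x exactly --
    # the unbiasedness behind linear convergence in the mean.
    lo = delta * math.floor(x / delta)
    return lo + delta if random.random() < (x - lo) / delta else lo
```

Averaging many quantizations of the same value recovers it, which is why the consensus iterates converge to the true average in the mean with only a bounded variance left over.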
This paper proposes a text summarization approach for factual reports using a deep learning model.
This approach consists of three phases: feature extraction, feature enhancement, and summary generation, which work together to assimilate core information and generate a coherent, understandable summary.
We are exploring various features to improve the set of sentences selected for the summary, and are using a Restricted Boltzmann Machine to enhance and abstract those features to improve resultant accuracy without losing any important information.
The sentences are scored based on those enhanced features and an extractive summary is constructed.
Experimentation carried out on several articles demonstrates the effectiveness of the proposed approach.
Source code available at: https://github.com/vagisha-nidhi/TextSummarizer
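The scoring-and-selection step can be sketched as follows (the feature values and weights are invented placeholders, and the RBM feature-enhancement phase is omitted entirely):

```python
def extractive_summary(sentences, features, weights, k=2):
    # Score each sentence as a weighted sum of its feature values,
    # then keep the top-k scoring sentences in document order.
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]

summary = extractive_summary(
    ["s0", "s1", "s2", "s3"],
    [[1, 0], [0, 2], [3, 1], [0, 0]],   # e.g. hypothetical [tf-isf, position]
    weights=[1, 1], k=2)
```

Keeping the selected sentences in document order is what preserves the coherence of the extractive summary.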
Sounds are essential to how humans perceive and interact with the world and are captured in recordings and shared on the Internet on a minute-by-minute basis.
These recordings, which are predominantly videos, constitute the largest archive of sounds we know.
However, most of these recordings have undescribed content, making methods for automatic sound analysis, indexing, and retrieval necessary.
These methods have to address multiple challenges, such as the relation between sounds and language, numerous and diverse sound classes, and large-scale evaluation.
We propose a system that continuously learns the relations between sounds and language from the web, improves sound recognition models over time, and evaluates its learning competency at large scale without references.
We introduce the Never-Ending Learner of Sounds (NELS), a project for the continuous learning of sounds and their associated knowledge, available online at nels.cs.cmu.edu.
Nowadays, speech-based biometric systems such as automatic speaker verification (ASV) are highly prone to spoofing attacks by an impostor.
With recent development in various voice conversion (VC) and speech synthesis (SS) algorithms, these spoofing attacks can pose a serious potential threat to the current state-of-the-art ASV systems.
To impede such attacks and enhance the security of the ASV systems, the development of efficient anti-spoofing algorithms is essential that can differentiate synthetic or converted speech from natural or human speech.
In this paper, we propose a set of novel speech features for detecting spoofing attacks.
The proposed features are computed using alternative frequency-warping technique and formant-specific block transformation of filter bank log energies.
We have evaluated existing and proposed features against several kinds of synthetic speech data from ASVspoof 2015 corpora.
The results show that the proposed techniques outperform existing approaches on various spoofing attack detection tasks.
The techniques investigated in this paper can also accurately classify natural and synthetic speech, as equal error rates (EERs) of 0% have been achieved.
As networks expand in size and complexity, they pose greater administrative and management challenges.
Software Defined Networks (SDN) offer a promising approach to meeting some of these challenges.
In this paper, we propose a policy driven security architecture for securing end to end services across multiple SDN domains.
We develop a language based approach to design security policies that are relevant for securing SDN services and communications.
We describe the policy language and its use in specifying security policies to control the flow of information in a multi-domain SDN.
We demonstrate the specification of fine grained security policies based on a variety of attributes such as parameters associated with users and devices/switches, context information such as location and routing information, and services accessed in SDN as well as security attributes associated with the switches and Controllers in different domains.
An important feature of our architecture is its ability to specify path and flow based security policies, which are significant for securing end to end services in SDNs.
We describe the design and the implementation of our proposed policy based security architecture and demonstrate its use in scenarios involving both intra and inter-domain communications with multiple SDN Controllers.
We analyse the performance characteristics of our architecture as well as discuss how our architecture is able to counteract various security attacks.
The dynamic security-policy-based approach, together with the intelligent distribution of the corresponding security capabilities as a service layer that enables flow-based security enforcement and protects a multitude of network devices against attacks, is an important contribution of this paper.
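As a toy illustration of attribute-based flow control (the rule schema, attribute names, and default-deny convention below are invented for illustration and are not the paper's policy language):

```python
# Hypothetical rules: match on flow attributes; fall through to deny.
POLICY = [
    {"src_domain": "A", "dst_domain": "B", "service": "video", "allow": True},
    {"src_domain": "A", "dst_domain": "C", "allow": False},
]

def flow_allowed(flow):
    # A rule matches when every attribute it names equals the flow's value.
    for rule in POLICY:
        if all(flow.get(k) == v for k, v in rule.items() if k != "allow"):
            return rule["allow"]
    return False                     # default deny
```

In a multi-domain SDN, each Controller would evaluate such rules against the attributes of flows crossing its domain, with path- and flow-level attributes extending the match keys.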
Macro-management is an important problem in StarCraft, which has been studied for a long time.
Various datasets together with assorted methods have been proposed in the last few years.
But these datasets have some defects that limit academic and industrial research: 1) Some datasets have neither standard preprocessing, parsing, and feature-extraction procedures nor predefined training, validation, and test sets.
2) Some datasets are specified only for certain tasks in macro-management.
3) Some datasets are either too small or do not have enough labeled data for modern machine learning algorithms such as deep neural networks.
As a result, most previous methods are trained with various features and evaluated on different test sets from the same or different datasets, making them difficult to compare directly.
To boost the research of macro-management in StarCraft, we release a new dataset MSC based on the platform SC2LE.
MSC consists of well-designed feature vectors, pre-defined high-level actions and final result of each match.
We also split MSC into training, validation and test set for the convenience of evaluation and comparison.
Besides the dataset, we propose a baseline model and present initial baseline results for global state evaluation and build order prediction, which are two of the key tasks in macro-management.
Various downstream tasks and analyses of the dataset are also described for the sake of research on macro-management in StarCraft II.
Homepage: https://github.com/wuhuikai/MSC.
Cyberbullying has emerged as an important and growing social problem, wherein people use online social networks and mobile phones to bully victims with offensive text, images, audio, and video on a 24/7 basis.
This paper studies negative user behavior in the Ask.fm social network, a popular new site that has led to many cases of cyberbullying, some leading to suicidal behavior. We examine the occurrence of negative words in Ask.fm's question+answer profiles, along with the social network of likes of questions+answers.
We also examine properties of users with cutting behavior in this social network.
Defining and measuring internationality as a function of influence diffusion of scientific journals is an open problem.
There exists no metric to rank journals based on the extent or scale of internationality.
Measuring internationality is qualitative, vague, open to interpretation and is limited by vested interests.
With the tremendous increase in the number of journals in various fields and the unflinching desire of academics across the globe to publish in "international" journals, it has become an absolute necessity to evaluate, rank and categorize journals based on internationality.
Authors, in the current work have defined internationality as a measure of influence that transcends across geographic boundaries.
The authors raise concerns about unethical practices in the journal publication process, whereby the scholarly influence of a select few is artificially boosted, primarily by resorting to editorial maneuvers.
To counter the impact of such tactics, authors have come up with a new method that defines and measures internationality by eliminating such local effects when computing the influence of journals.
A new metric, the Non-Local Influence Quotient (NLIQ), is proposed as one such parameter for internationality computation, along with another novel metric, the Other-Citation Quotient, defined as the complement of the ratio of self-citations to total citations.
In addition, SNIP and International Collaboration Ratio are used as two other parameters.
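The Other-Citation Quotient follows directly from its definition above (the function name is ours; the formula is the stated complement of the self-citation ratio):

```python
def other_citation_quotient(self_citations, total_citations):
    # OCQ = 1 - (self-citations / total citations): higher values mean
    # the journal's influence comes mostly from outside itself.
    return 1.0 - self_citations / total_citations
```

A journal with 20 self-citations out of 100 total scores 0.8, while one with no self-citations scores a maximal 1.0.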
We describe in this paper Hydra, an ensemble of convolutional neural networks (CNN) for geospatial land classification.
The idea behind Hydra is to create an initial CNN that is coarsely optimized but provides a good starting point for further optimization, and which serves as the Hydra's body.
Then, the obtained weights are fine-tuned multiple times to form an ensemble of CNNs that represent the Hydra's heads.
By doing so, we were able to reduce the training time while maintaining the classification performance of the ensemble.
We created ensembles using two state-of-the-art CNN architectures, ResNet and DenseNet, to participate in the Functional Map of the World challenge.
With this approach, we finished the competition in third place.
We also applied the proposed framework to the NWPU-RESISC45 database and achieved the best reported performance so far.
Code and CNN models are available at https://github.com/maups/hydra-fmow
In distributed detection, there does not exist an automatic way of generating optimal decision strategies for non-affine decision functions.
Consequently, in a detection problem based on a non-affine decision function, establishing optimality of a given decision strategy, such as a generalized likelihood ratio test, is often difficult or even impossible.
In this thesis we develop a novel detection network optimization technique that can be used to determine necessary and sufficient conditions for optimality in distributed detection for which the underlying objective function is monotonic and convex in probabilistic decision strategies.
Our developed approach leverages basic concepts of optimization and statistical inference, which are provided in sufficient detail.
These basic concepts are combined to form the basis of an optimal inference technique for signal detection.
We prove a central theorem that characterizes optimality in a variety of distributed detection architectures.
We discuss three applications of this result in distributed signal detection.
These applications include interactive distributed detection, optimal tandem fusion architecture, and distributed detection by acyclic graph networks.
In the conclusion we indicate several future research directions, which include possible generalizations of our optimization method and new research problems arising from each of the three applications considered.
Online social networks (OSN) contain extensive amount of information about the underlying society that is yet to be explored.
One of the most feasible techniques to fetch information from OSNs, crawling through Application Programming Interface (API) requests, poses serious concerns over the guarantees of the estimates.
In this work, we focus on making reliable statistical inference with limited API crawls.
Based on the regenerative properties of random walks, we propose an unbiased estimator for the aggregated sum of functions over edges and prove the connection between the variance of the estimator and the spectral gap.
In order to facilitate Bayesian inference on the true value of the estimator, we derive the approximate posterior distribution of the estimate.
Later the proposed ideas are validated with numerical experiments on inference problems in real-world networks.
For pattern recognition tasks such as image recognition, it has become clear that machine-learning dictionary data are in fact data in a probability space belonging to a Euclidean space.
However, distances in the Euclidean space and distances in the probability space remain separate and not unified when machine learning is introduced into pattern recognition.
There is still the problem that an accurate matching relation between the sampled data of a read image and the learned dictionary data cannot be calculated directly.
In this research, we focused on why and by how much the distance changes when passing through the probability space from the original Euclidean distance, among data belonging to multiple probability spaces containing a Euclidean space.
By finding the cause of the distance error and a formula expressing the error quantitatively, a possible distance formula that unifies the Euclidean space and the probability space is found.
Based on the results of this research, the relationship between machine-learning dictionary data and sampled data is clearly understood for pattern recognition.
As a result, the computation of matching between data, and of machine learning in which data compete with one another, is clarified, and complicated calculations become unnecessary.
Finally, using actual pattern recognition data, the possible distance formula unifying the Euclidean space and the probability space discovered by this research was experimentally demonstrated, and the effectiveness of the result was confirmed.
This paper analyzes irrelevance and independence relations in graphical models associated with convex sets of probability distributions (called Quasi-Bayesian networks).
The basic question in Quasi-Bayesian networks is: how can irrelevance/independence relations in Quasi-Bayesian networks be detected, enforced, and exploited?
This paper addresses this question through Walley's definitions of irrelevance and independence.
Novel algorithms and results are presented for inferences with the so-called natural extensions using fractional linear programming, and the properties of the so-called type-1 extensions are clarified through a new generalization of d-separation.
We present a real-time method for synthesizing highly complex human motions using a novel training regime we call the auto-conditioned Recurrent Neural Network (acRNN).
Recently, researchers have attempted to synthesize new motion by using autoregressive techniques, but existing methods tend to freeze or diverge after a couple of seconds due to an accumulation of errors that are fed back into the network.
Furthermore, such methods have only been shown to be reliable for relatively simple human motions, such as walking or running.
In contrast, our approach can synthesize arbitrary motions with highly complex styles, including dances or martial arts in addition to locomotion.
The acRNN is able to accomplish this by explicitly accommodating for autoregressive noise accumulation during training.
Our work is, to our knowledge, the first to demonstrate the ability to generate over 18,000 continuous frames (300 seconds) of new complex human motion across different styles.
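The auto-conditioning schedule can be sketched schematically (the block lengths, helper names, and dummy predictor below are illustrative; the actual acRNN interleaves ground truth with the recurrent network's own output during training):

```python
def auto_conditioned_inputs(frames, predict, gt_len=2, cond_len=2):
    # Alternate blocks: feed `gt_len` ground-truth frames, then
    # `cond_len` of the model's own predictions, so the network learns
    # to recover from its accumulated autoregressive noise.
    inputs, prev = [], frames[0]
    for t, frame in enumerate(frames):
        use_gt = (t % (gt_len + cond_len)) < gt_len
        x = frame if use_gt else prev
        inputs.append(x)
        prev = predict(x)   # stand-in for the RNN's next-frame output
    return inputs

# Dummy "network" that offsets its input, making self-fed frames visible.
seq = auto_conditioned_inputs(list(range(8)), predict=lambda x: x + 1000)
```

Because the network regularly sees its own (noisy) predictions as input during training, it does not freeze or diverge when run autoregressively at synthesis time.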
A regular Hilberg process is a stationary process that satisfies both a hyperlogarithmic growth of maximal repetition and a power-law growth of topological entropy, which are a kind of dual conditions.
The hyperlogarithmic growth of maximal repetition has been experimentally observed for texts in natural language, whereas the power-law growth of topological entropy implies a vanishing Shannon entropy rate and thus probably does not hold for natural language.
In this paper, we provide a constructive example of regular Hilberg processes, which we call random hierarchical association (RHA) processes.
Our construction does not apply the standard cutting and stacking method.
For the constructed RHA processes, we demonstrate that the expected length of any uniquely decodable code is orders of magnitude larger than the Shannon block entropy of the ergodic component of the RHA process.
Our proposition supplements the classical result by Shields concerning nonexistence of universal redundancy rates.
Euphonic conjunctions (sandhis) form a very important aspect of Sanskrit morphology and phonology.
The traditional and modern methods of studying euphonic conjunctions in Sanskrit follow different methodologies.
The former involves a rigorous study of the Paninian system embodied in Panini's Ashtadhyayi, while the latter usually involves the study of a few important sandhi rules with the use of examples.
The former is not suitable for beginners, and the latter, not sufficient to gain a comprehensive understanding of the operation of sandhi rules.
This is because there are not only numerous sandhi rules and exceptions but also complex precedence rules involved.
The need for a new ontology for sandhi-tutoring was hence felt.
This work presents a comprehensive ontology designed to enable a student-user to learn in stages all about euphonic conjunctions and the relevant aphorisms of Sanskrit grammar and to test and evaluate the progress of the student-user.
The ontology forms the basis of a multimedia sandhi tutor that was given to different categories of users including Sanskrit scholars for extensive and rigorous testing.
Peer assessment is an efficient and effective learning assessment method that has been used widely in diverse fields in higher education.
Despite its many benefits, a fundamental problem in peer assessment is that participants lack the motivation to assess others' work faithfully and fairly.
Non-consensus is a common challenge that makes the reliability of peer assessment a primary concern in practice.
This research proposes a motivation model that uses review deviation and radicalization to identify non-consensus in peer assessment.
The proposed model is implemented as a software module in a peer code review system called EduPCR4.
EduPCR4 monitors this measure and triggers a teacher's arbitration when it detects possible non-consensus.
An empirical study conducted in a university-level C programming course showed that the proposed model and its implementation helped to improve the peer assessment practices in many aspects.
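One simple way to operationalize a deviation-based non-consensus check is sketched below. The specific deviation measure (distance from the mean of the other reviewers) and the threshold are assumptions for illustration, not EduPCR4's actual implementation.

```python
from statistics import mean

def review_deviation(scores):
    """Deviation of each reviewer's score from the mean of the other
    reviewers' scores. Large deviations flag possible non-consensus."""
    devs = {}
    for reviewer, s in scores.items():
        others = [v for k, v in scores.items() if k != reviewer]
        devs[reviewer] = abs(s - mean(others))
    return devs

def needs_arbitration(scores, threshold=2.0):
    """Trigger arbitration when any reviewer deviates beyond the threshold
    (the threshold value here is a hypothetical choice)."""
    return max(review_deviation(scores).values()) > threshold
```

For example, scores of 9, 8 and 2 from three reviewers would flag the outlying reviewer, while 8, 8 and 7 would not.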
Detecting community structures in social networks has gained considerable attention in recent years.
However, the lack of prior knowledge about the number of communities, together with their overlapping nature, has made community detection a challenging problem.
Moreover, many existing methods only consider static networks, while most real-world networks are dynamic and evolve over time.
Hence, finding consistent overlapping communities in dynamic networks without any prior knowledge about the number of communities is still an interesting open research problem.
In this paper, we present an overlapping community detection method for dynamic networks called Dynamic Bayesian Overlapping Community Detector (DBOCD).
DBOCD assumes that, in every snapshot of the network, the overlapping parts of communities are dense areas, and it utilizes link communities instead of the more common node communities.
Using Recurrent Chinese Restaurant Process and community structure of the network in the last snapshot, DBOCD simultaneously extracts the number of communities and soft community memberships of nodes while maintaining the consistency of communities over time.
We evaluated DBOCD on both synthetic and real dynamic data-sets to assess its ability to find overlapping communities in different types of network evolution.
The results show that DBOCD outperforms recent state-of-the-art dynamic community detection methods.
ViDaExpert is a tool for visualization and analysis of multidimensional vectorial data.
ViDaExpert is able to work with data tables of "object-feature" type that might contain numerical feature values as well as textual labels for rows (objects) and columns (features).
ViDaExpert implements several statistical methods such as standard and weighted Principal Component Analysis (PCA) and the method of elastic maps (a non-linear version of PCA), Linear Discriminant Analysis (LDA), multilinear regression, K-Means clustering, and a variant of a decision-tree construction algorithm.
Equipped with several user-friendly dialogs for configuring data point representations (size, shape, color) and a fast 3D viewer, ViDaExpert is a handy tool for constructing an interactive 3D scene that represents a table of data in multidimensional space and for performing quick and insightful statistical analysis, from basic to advanced methods.
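As a rough illustration of the weighted PCA variant mentioned above, the following sketch projects an "object-feature" data table onto its leading principal components; it is a generic implementation, not ViDaExpert's own code, and with uniform weights it reduces to standard PCA.

```python
import numpy as np

def weighted_pca(X, weights=None, n_components=2):
    """Weighted PCA sketch: rows of X are objects, columns are features.
    Each row contributes to the covariance in proportion to its weight."""
    X = np.asarray(X, dtype=float)
    w = np.ones(len(X)) if weights is None else np.asarray(weights, float)
    w = w / w.sum()
    mu = w @ X                            # weighted mean
    Xc = (X - mu) * np.sqrt(w)[:, None]   # weight the centered rows
    cov = Xc.T @ Xc                       # weighted covariance (up to scale)
    vals, vecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]
    return (X - mu) @ vecs[:, order]      # projected coordinates
```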
Human-human joint action in short-cycle repetitive handover tasks was investigated for a bottle handover task using a three-fold approach: work-methods field studies in multiple supermarkets, simulation analysis using an ergonomics software package, and an in-house lab experiment on human-human collaboration that re-created the environment and conditions of a supermarket.
Evaluation included both objective and subjective measures.
Subjective evaluation took a psychological perspective and showcases, among other things, the differences in how a common joint action is perceived by individual team partners depending on their role (giver or receiver).
The proposed approach can provide a systematic method to analyze similar tasks.
Combining the results of all three analyses, this research gives insight into the science of joint action for short-cycle repetitive tasks and its implications for the design of human-robot collaborative systems.
The use of drug combinations, termed polypharmacy, is common to treat patients with complex diseases and co-existing conditions.
However, a major consequence of polypharmacy is a much higher risk of adverse side effects for the patient.
Polypharmacy side effects emerge because of drug-drug interactions, in which activity of one drug may change if taken with another drug.
The knowledge of drug interactions is limited because these complex relationships are rare, and are usually not observed in relatively small clinical testing.
Discovering polypharmacy side effects thus remains an important challenge with significant implications for patient mortality.
Here, we present Decagon, an approach for modeling polypharmacy side effects.
The approach constructs a multimodal graph of protein-protein interactions, drug-protein target interactions, and the polypharmacy side effects, which are represented as drug-drug interactions, where each side effect is an edge of a different type.
Decagon is developed specifically to handle such multimodal graphs with a large number of edge types.
Our approach develops a new graph convolutional neural network for multirelational link prediction in multimodal networks.
Decagon predicts the exact side effect, if any, through which a given drug combination manifests clinically.
Decagon accurately predicts polypharmacy side effects, outperforming baselines by up to 69%.
We find that it automatically learns representations of side effects indicative of co-occurrence of polypharmacy in patients.
Furthermore, Decagon models side effects with a strong molecular basis particularly well, while on predominantly non-molecular side effects it still achieves good performance thanks to effective sharing of model parameters across edge types.
Decagon creates opportunities to use large pharmacogenomic and patient data to flag and prioritize side effects for follow-up analysis.
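A minimal sketch of a per-edge-type decoder in the spirit described above: given node embeddings from the graph convolutional encoder, each edge type (side effect) scores a drug pair with its own diagonal factor around a shared interaction matrix. The exact parameterization and shapes here are illustrative assumptions, not Decagon's published equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def edge_score(z_u, z_v, D_r, R):
    """Score the likelihood of an edge of type r between nodes u and v:
    sigmoid(z_u^T D_r R D_r z_v), where z_u, z_v are node embeddings,
    D_r is a per-edge-type diagonal matrix, and R is a shared interaction
    matrix. Parameter sharing across edge types lives in R."""
    return sigmoid(z_u @ D_r @ R @ D_r @ z_v)
```

With a symmetric `R`, the score is symmetric in the two drugs, matching the undirected nature of drug-drug interaction edges.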
A major challenge in cyber-threat analysis is combining information from different sources to find the person or the group responsible for the cyber-attack.
It is one of the most important technical and policy challenges in cyber-security.
The lack of ground truth for an individual responsible for an attack has limited previous studies.
In this paper, we take a first step towards overcoming this limitation by building a dataset from the capture-the-flag event held at DEFCON, and propose an argumentation model based on a formal reasoning framework called DeLP (Defeasible Logic Programming) designed to aid an analyst in attributing a cyber-attack.
We build models from latent variables to reduce the search space of culprits (attackers), and show that this reduction significantly improves the performance of classification-based approaches from 37% to 62% in identifying the attacker.
We devise a new formulation for the vertex coloring problem.
Different from other formulations, decision variables are associated with the pairs of vertices.
Consequently, individual colors are not distinguished, which avoids the symmetry among color classes present in standard formulations.
Although the objective function is fractional, it can be replaced by a piece-wise linear convex function.
Numerical experiments show that our formulation performs particularly well on dense graphs.
We present an algorithm for graph based saliency computation that utilizes the underlying dense subgraphs in finding visually salient regions in an image.
To compute the salient regions, the model first obtains a saliency map using random walks on a Markov chain.
Next, k-dense subgraphs are detected to further enhance the salient regions in the image.
Dense subgraphs convey more information about local graph structure than simple centrality measures.
To generate the Markov chain, intensity and color features of the image, in addition to region compactness, are used.
To evaluate the proposed model, we conduct extensive experiments on benchmark image data sets.
The proposed method performs comparably to well-known salient region detection algorithms.
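The first stage described above, computing equilibrium probabilities of a random walk on the image graph, can be sketched as follows. The construction of the affinity matrix from intensity, color and compactness features is omitted; the power iteration below is a generic implementation, not the paper's exact procedure.

```python
import numpy as np

def stationary_distribution(W, n_iter=500):
    """Power iteration for the stationary distribution of a random walk whose
    transition matrix is the row-normalized affinity matrix W. In random-walk
    saliency models, regions the walk visits with unusually low or high
    equilibrium probability stand out against their surroundings."""
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    pi = np.full(len(W), 1.0 / len(W))    # start from the uniform distribution
    for _ in range(n_iter):
        pi = pi @ P
    return pi
```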
We study the task of image inpainting, which is to fill in the missing region of an incomplete image with plausible contents.
To this end, we propose a learning-based approach to generate visually coherent completion given a high-resolution image with missing components.
In order to overcome the difficulty to directly learn the distribution of high-dimensional image data, we divide the task into inference and translation as two separate steps and model each step with a deep neural network.
We also use simple heuristics to guide the propagation of local textures from the boundary to the hole.
We show that, by using such techniques, inpainting reduces to the problem of learning two image-feature translation functions in a much smaller space, which are hence easier to train.
We evaluate our method on several public datasets and show that we generate results of better visual quality than previous state-of-the-art methods.
Music genre classification is one example of content-based analysis of music signals.
Traditionally, human-engineered features were used to automate this task, and an accuracy of 61% has been achieved in 10-genre classification.
However, this is still below the roughly 70% accuracy that humans achieve on the same task.
Here, we propose a new method that combines knowledge of human perception study in music genre classification and the neurophysiology of the auditory system.
The method works by training a simple convolutional neural network (CNN) to classify a short segment of the music signal.
Then, the genre of a track is determined by splitting it into short segments and combining the CNN's predictions from all of them.
After training, this method achieves human-level (70%) accuracy and the filters learned in the CNN resemble the spectrotemporal receptive field (STRF) in the auditory system.
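The segment-and-vote scheme described above can be sketched as follows, with the CNN stood in for by an arbitrary per-segment classifier returning class probabilities; segment length, hop size and the averaging rule are illustrative assumptions.

```python
import numpy as np

def split_segments(signal, seg_len, hop):
    """Split a 1-D audio signal into fixed-length, possibly overlapping
    segments (illustrative framing parameters)."""
    return [signal[i:i + seg_len]
            for i in range(0, len(signal) - seg_len + 1, hop)]

def classify_track(segments, segment_classifier):
    """Average the per-segment class probability vectors and pick the argmax,
    i.e., combine the CNN's predictions from all short segments."""
    probs = np.mean([segment_classifier(s) for s in segments], axis=0)
    return int(np.argmax(probs))
```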
The preceding paper constructed tangle machines as diagrammatic models, and illustrated their utility with a number of examples.
The information content of a tangle machine is contained in characteristic quantities associated to equivalence classes of tangle machines, which are called invariants.
This paper constructs invariants of tangle machines.
Chief among these are the prime factorizations of a machine, which are essentially unique.
This is proven using low dimensional topology, through representing a colour-suppressed machine as a diagram for a network of jointly embedded spheres and intervals in 4-space.
The complexity of a tangle machine is defined as its number of prime factors.
Many natural language processing applications use language models to generate text.
These models are typically trained to predict the next word in a sequence, given the previous words and some context such as an image.
However, at test time the model is expected to generate the entire sequence from scratch.
This discrepancy makes generation brittle, as errors may accumulate along the way.
We address this issue by proposing a novel sequence level training algorithm that directly optimizes the metric used at test time, such as BLEU or ROUGE.
On three different tasks, our approach outperforms several strong baselines for greedy generation.
The method is also competitive when these baselines employ beam search, while being several times faster.
Detecting the occlusion from stereo images or video frames is important to many computer vision applications.
Previous efforts focus on bundling it with the computation of disparity or optical flow, leading to a chicken-and-egg problem.
In this paper, we leverage convolutional neural network to liberate the occlusion detection task from the interleaved, traditional calculation framework.
We propose a Symmetric Network (SymmNet) to directly exploit information from an image pair, without estimating disparity or motion in advance.
The proposed network is structurally left-right symmetric to learn the binocular occlusion simultaneously, aimed at jointly improving both results.
The comprehensive experiments show that our model achieves state-of-the-art results on detecting the stereo and motion occlusion.
Synthetic image translation has significant potential in autonomous transportation systems, due to the expense of data collection and annotation as well as the unmanageable diversity of real-world situations.
The main issue with unpaired image-to-image translation is the ill-posed nature of the problem.
In this work, we propose a novel method for constraining the output space of unpaired image-to-image translation.
We make the assumption that the environment of the source domain is known (e.g. synthetically generated), and we propose to explicitly enforce preservation of the ground-truth labels on the translated images.
We experiment on preserving ground-truth information such as semantic segmentation, disparity, and instance segmentation.
We show significant evidence that our method achieves improved performance over the state-of-the-art model of UNIT for translating images from SYNTHIA to Cityscapes.
The generated images are perceived as more realistic in human surveys and outperform UNIT when used in a domain adaptation scenario for semantic segmentation.
The problem of distributed rate maximization in multi-channel ALOHA networks is considered.
First, we study the problem of constrained distributed rate maximization, where user rates are subject to total transmission probability constraints.
We propose a best-response algorithm, where each user updates its strategy to increase its rate according to the channel state information and the current channel utilization.
We prove the convergence of the algorithm to a Nash equilibrium in both homogeneous and heterogeneous networks using the theory of potential games.
The performance of the best-response dynamic is analyzed and compared to a simple transmission scheme, where users transmit over the channel with the highest collision-free utility.
Then, we consider the case where users are not restricted by transmission probability constraints.
Distributed rate maximization under uncertainty is considered to achieve both efficiency and fairness among users.
We propose a distributed scheme where users adjust their transmission probability to maximize their rates according to the current network state, while maintaining the desired load on the channels.
We show that our approach plays an important role in achieving the Nash bargaining solution among users.
Sequential and parallel algorithms are proposed to achieve the target solution in a distributed manner.
The efficiencies of the algorithms are demonstrated through both theoretical and simulation results.
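A toy version of the best-response update can be sketched as follows: each user evaluates the collision-free utility of every channel given the other users' current transmission probabilities and moves its transmission budget to the best one. Real users would also respect total transmission probability constraints, which this simplified simultaneous-update step ignores.

```python
def success_rates(p):
    """p[u][c] is user u's transmission probability on channel c. The
    collision-free success rate of u on c is p[u][c] times the probability
    that every other user stays silent on c."""
    n_users, n_ch = len(p), len(p[0])
    rates = [[0.0] * n_ch for _ in range(n_users)]
    for u in range(n_users):
        for c in range(n_ch):
            r = p[u][c]
            for v in range(n_users):
                if v != u:
                    r *= (1.0 - p[v][c])
            rates[u][c] = r
    return rates

def best_response_step(p, budget=1.0):
    """Each user moves its whole budget to the channel with the highest
    idle probability given current utilization (a simplified best response)."""
    new_p = [row[:] for row in p]
    for u in range(len(p)):
        utils = []
        for c in range(len(p[0])):
            idle = 1.0
            for v in range(len(p)):
                if v != u:
                    idle *= (1.0 - p[v][c])
            utils.append(idle)
        best = utils.index(max(utils))
        new_p[u] = [0.0] * len(p[0])
        new_p[u][best] = budget
    return new_p
```

When users already occupy distinct channels, the step is a fixed point, consistent with convergence to an equilibrium.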
System development often involves decisions about how a high-level design is to be implemented using primitives from a low-level platform.
Certain decisions, however, may introduce undesirable behavior into the resulting implementation, possibly leading to a violation of a desired property that has already been established at the design level.
In this paper, we introduce the problem of synthesizing a property-preserving platform mapping: A set of implementation decisions ensuring that a desired property is preserved from a high-level design into a low-level platform implementation.
We provide a formalization of the synthesis problem and propose a technique for synthesizing a mapping based on symbolic constraint search.
We describe our prototype implementation, and a real-world case study demonstrating the application of our technique to synthesizing secure mappings for the popular web authorization protocols OAuth 1.0 and 2.0.
Data-driven workflows, of which IBM's Business Artifacts are a prime exponent, have been successfully deployed in practice, adopted in industrial standards, and have spawned a rich body of research in academia, focused primarily on static analysis.
In previous work, we obtained theoretical results on the verification of a rich model incorporating core elements of IBM's successful Guard-Stage-Milestone (GSM) artifact model.
The results showed decidability of verification of temporal properties of a large class of GSM workflows and established its complexity.
Following up on these results, the present paper reports on the implementation of SpinArt, a practical verifier based on the classical model-checking tool Spin.
The implementation includes nontrivial optimizations and achieves good performance on real-world business process examples.
Our results shed light on the capabilities and limitations of off-the-shelf verifiers in the context of data-driven workflows.
A growing number of the applications users interact with daily have to operate in (near) real time: chatbots, digital companions, knowledge work support systems, to name a few.
To perform the services desired by the user, these systems have to analyze user activity logs or explicit user input extremely fast.
In particular, text content (e.g., in the form of text snippets) needs to be processed in an information extraction task.
Given the aforementioned temporal requirements, this has to be accomplished within just a few milliseconds, which limits the number of methods that can be applied.
Practically, only very fast methods remain, which on the other hand deliver worse results than slower but more sophisticated Natural Language Processing (NLP) pipelines.
In this paper, we investigate and propose methods for real-time capable Named Entity Recognition (NER).
As a first improvement step, we address word variations induced by inflection, as present, for example, in the German language.
Our approach is ontology-based and makes use of several language information sources like Wiktionary.
We evaluated it using the German Wikipedia (about 9.4B characters), for which the whole NER process took considerably less than an hour.
Since precision and recall are higher than with comparably fast methods, we conclude that the quality gap between high speed methods and sophisticated NLP pipelines can be narrowed a bit more without losing too much runtime performance.
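The inflection handling idea can be sketched as a precomputed dictionary lookup: every known inflected form is mapped to its base entity ahead of time, so recognition at runtime is a constant-time match per token. The toy lexicon below stands in for the Wiktionary-derived resources used in the paper.

```python
def build_lookup(lexicon):
    """Map every known inflected form to its base entity. `lexicon` maps an
    entity name to its inflected variants (e.g., German genitive forms)."""
    lookup = {}
    for lemma, forms in lexicon.items():
        lookup[lemma.lower()] = lemma
        for form in forms:
            lookup[form.lower()] = lemma
    return lookup

def tag_tokens(tokens, lookup):
    """Dictionary-based NER: each token resolves after lowercasing, so
    inflected variants map to the same entity; unknown tokens get None."""
    return [(tok, lookup.get(tok.lower())) for tok in tokens]
```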
This paper studies the problem of reproducible research in remote photoplethysmography (rPPG).
Most of the work published in this domain is assessed on privately-owned databases, making it difficult to evaluate proposed algorithms in a standard and principled manner.
As a consequence, we present a new, publicly available database containing a relatively large number of subjects recorded under two different lighting conditions.
Also, three state-of-the-art rPPG algorithms from the literature were selected, implemented and released as open source free software.
After a thorough, unbiased experimental evaluation in various settings, it is shown that none of the selected algorithms is precise enough to be used in a real-world scenario.
Modern autonomous underwater vehicles (AUVs) have advanced sensing capabilities including sonar, cameras, acoustic communication, and diverse bio-sensors.
Instead of just sensing its environment and storing the data for post-mission inspection, an AUV could use the collected information to gain an understanding of its environment and, based on this understanding, autonomously adapt its behavior to enhance the overall effectiveness of its mission.
Many such tasks are highly computation intensive.
This paper presents the results of a case study that illustrates the effectiveness of an energy-aware, many-core computing architecture for performing on-board path planning within a battery-operated AUV.
A previously published path planning algorithm was ported onto the SCC, an experimental 48 core single-chip system developed by Intel.
The performance, power, and energy consumption of the application were measured for different numbers of cores and other system parameters.
This case study shows that computation intensive tasks can be executed within an AUV that relies mainly on battery power.
Future plans include the deployment and testing of an SCC system within a Teledyne Webb Research Slocum glider.
Extreme learning machine (ELM), proposed by Huang et al., has been shown to be a promising learning algorithm for single-hidden-layer feedforward neural networks (SLFNs).
Nevertheless, because of the random choice of input weights and biases, the ELM algorithm sometimes produces a hidden layer output matrix H of the SLFN that is not of full column rank, which lowers the effectiveness of ELM.
This paper discusses the effectiveness of ELM and proposes an improved algorithm called EELM, which properly selects the input weights and biases before calculating the output weights, ensuring the full column rank of H in theory.
This improves, to some extent, the learning performance (testing accuracy, prediction accuracy, learning time) and the robustness of the networks.
The experimental results based on both the benchmark function approximation and real-world problems including classification and regression applications show the good performances of EELM.
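For context, a basic ELM with the rank check that motivates EELM can be sketched as follows. This is the generic algorithm (random input weights, least-squares output weights), not the EELM weight selection procedure itself.

```python
import numpy as np

def elm_fit(X, T, n_hidden, seed=0):
    """Basic ELM: random input weights and biases, sigmoid hidden layer,
    output weights by Moore-Penrose pseudoinverse. Also reports whether the
    hidden output matrix H has full column rank, the condition whose failure
    degrades ELM and which EELM's input-weight selection guarantees."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    full_rank = np.linalg.matrix_rank(H) == min(H.shape)
    beta = np.linalg.pinv(H) @ T
    return (W, b, beta), full_rank

def elm_predict(model, X):
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```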
Reconstructing the sensing and control systems of skilled humans often leads to the development of robust control for robots.
We are developing an unscrewing robot for the automated disassembly which requires a comprehensive control system, but unscrewing experiments with robots are often limited to several conditions.
In contrast, humans typically accumulate a broad range of screwing experience and sensation throughout their lives, and we conducted an experiment to identify these haptic patterns.
Results show that people apply axial force to the screws to avoid screwdriver slippage (cam-outs), which is one of the key problems during screwing and unscrewing, and this axial force is proportional to the torque which is required for screwing.
We have found that the type of screw head influences the amount of axial force applied.
Using this knowledge, an unscrewing robot for the smart disassembly factory RecyBot was developed, and experiments confirm the optimality of the strategy used by humans.
Finally, a methodology for robust unscrewing algorithm design is presented as a generalization of the findings.
It can significantly speed up the development of screwing and unscrewing robots and tools.
Topic modeling enables exploration and compact representation of a corpus.
The CaringBridge (CB) dataset is a massive collection of journals written by patients and caregivers during a health crisis.
Topic modeling on the CB dataset, however, is challenging due to the asynchronous nature of multiple authors writing about their health journeys.
To overcome this challenge we introduce the Dynamic Author-Persona topic model (DAP), a probabilistic graphical model designed for temporal corpora with multiple authors.
The novelty of the DAP model lies in its representation of authors by a persona --- where personas capture the propensity to write about certain topics over time.
Further, we present a regularized variational inference algorithm, which we use to encourage the DAP model's personas to be distinct.
Our results show significant improvements over competing topic models --- particularly after regularization, and highlight the DAP model's unique ability to capture common journeys shared by different authors.
In this paper, we address the problem of inferring the layout of complex road scenes given a single camera as input.
To achieve that, we first propose a novel parameterized model of road layouts in a top-view representation, which is not only intuitive for human visualization but also provides an interpretable interface for higher-level decision making.
Moreover, the design of our top-view scene model allows for efficient sampling and thus generation of large-scale simulated data, which we leverage to train a deep neural network to infer our scene model's parameters.
Specifically, our proposed training procedure uses supervised domain-adaptation techniques to incorporate both simulated as well as manually annotated data.
Finally, we design a Conditional Random Field (CRF) that enforces coherent predictions for a single frame and encourages temporal smoothness among video frames.
Experiments on two public data sets show that: (1) our parametric top-view model is representative enough to describe complex road scenes; (2) the proposed method outperforms baselines trained on manually annotated or simulated data only, thus getting the best of both; and (3) our CRF produces temporally smooth yet semantically meaningful results.
Deep learning methods employ multiple processing layers to learn hierarchical representations of data.
They have already been deployed in a vast number of applications and have produced state-of-the-art results.
Recently with the growth in processing power of computers to be able to do high dimensional tensor calculations, Natural Language Processing (NLP) applications have been given a significant boost in terms of efficiency as well as accuracy.
In this paper, we review various signal processing techniques and their application in building a speech-to-text system using deep recurrent neural networks.
Anomaly detection problems (also called change-point detection problems) have been studied in data mining, statistics and computer science over the last several decades (mostly in non-network context) in applications such as medical condition monitoring, weather change detection and speech recognition.
In recent years, however, anomaly detection problems have become increasingly relevant in the context of network science, since useful insights for many complex systems in biology, finance and social science are often obtained by representing them via networks.
Notions of local and non-local curvatures of higher-dimensional geometric shapes and topological spaces play a fundamental role in physics and mathematics in characterizing anomalous behaviours of these higher dimensional entities.
However, using curvature measures to detect anomalies in networks is not yet very common.
To this end, a main goal of this paper is to formulate and analyze curvature analysis methods to provide the foundations of systematic approaches for finding critical components and detecting anomalies in networks.
For this purpose, we use two measures of network curvatures which depend on non-trivial global properties, such as distributions of geodesics and higher-order correlations among nodes, of the given network.
Based on these measures, we precisely formulate several computational problems related to anomaly detection in static or dynamic networks, and provide non-trivial computational complexity results for these problems.
This paper must not be viewed as delivering the final word on appropriateness and suitability of specific curvature measures.
Instead, it is our hope that this paper will stimulate and motivate further theoretical or empirical research concerning the exciting interplay between notions of curvatures from network and non-network domains, a much desired goal in our opinion.
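As a concrete, if much simpler, example of edge-based curvature screening, the combinatorial Forman-Ricci curvature of an unweighted, undirected graph can be computed as follows. The paper itself focuses on global, geodesic-based measures, so this local formula is only an illustration of the general idea that unusually curved edges mark structurally notable regions.

```python
def forman_curvature(edges):
    """Forman-Ricci curvature of each edge (u, v) in an unweighted,
    undirected graph: F(u, v) = 4 - deg(u) - deg(v). Strongly negative
    values mark edges between high-degree hubs."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return {(u, v): 4 - deg[u] - deg[v] for u, v in edges}
```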
Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications.
Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design.
In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization.
We present trends in DNN architectures and the resulting implications on parallelization strategies.
We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning.
We discuss asynchronous stochastic optimization, distributed system architectures, communication schemes, and neural architecture search.
Based on those approaches, we extrapolate potential directions for parallelism in deep learning.
The abundance of unlicensed spectrum in the 60 GHz band makes it an attractive alternative for future wireless communication systems.
Such systems are expected to provide data transmission rates in the order of multi-gigabits per second in order to satisfy the ever-increasing demand for high rate data communication.
Unfortunately, 60 GHz radio is subject to severe path loss which limits its usability for long-range outdoor communication.
In this work, we propose a multi-hop 60 GHz wireless network for outdoor communication where multiple full-duplex buffered relays are used to extend the communication range while providing end-to-end performance guarantees to the traffic traversing the network.
We provide a cumulative service process characterization for the 60 GHz outdoor propagation channel with self-interference in terms of the moment generating function (MGF) of its channel capacity.
We then use this characterization to compute probabilistic upper bounds on the overall network performance, i.e., total backlog and end-to-end delay.
Furthermore, we study the effect of self-interference on the network performance and propose an optimal power allocation scheme to mitigate its impact in order to enhance network performance.
Finally, we investigate the relation between relay density and network performance under a total power budget constraint.
We show that increasing relay density may have adverse effects on network performance unless self-interference can be kept sufficiently small.
Convolutional neural networks (CNNs) have become a standard approach to solving a wide array of computer vision problems.
Besides important theoretical and practical advances in their design, their success is built on the existence of manually labeled visual resources, such as ImageNet.
The creation of such datasets is cumbersome and here we focus on alternatives to manual labeling.
We hypothesize that new resources are of utmost importance in domains that are not, or only weakly, covered by ImageNet, such as tourism photographs.
We first collect noisy Flickr images for tourist points of interest and apply automatic or weakly-supervised reranking techniques to reduce noise.
Then, we learn domain adapted models with a standard CNN architecture and compare them to a generic model obtained from ImageNet.
Experimental validation is conducted with publicly available datasets, including Oxford5k, INRIA Holidays and Div150Cred.
Results show that low-cost domain adaptation improves results compared to the use of generic models but also compared to strong non-CNN baselines such as triangulation embedding.
While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data.
We investigate how the properties of natural language data affect an LSTM's ability to learn a nonlinguistic task: recalling elements from its input.
We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data.
Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input.
We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.
An instance of the maximum mixed graph orientation problem consists of a mixed graph and a collection of source-target vertex pairs.
The objective is to orient the undirected edges of the graph so as to maximize the number of pairs that admit a directed source-target path.
This problem has recently arisen in the study of biological networks, and it also has applications in communication networks.
In this paper, we identify an interesting local-to-global orientation property.
This property enables us to modify the best known algorithms for maximum mixed graph orientation and some of its special structured instances, due to Elberfeld et al. (CPM '11), and obtain improved approximation ratios.
We further proceed by developing an algorithm that achieves an even better approximation guarantee for the general setting of the problem.
Finally, we study several well-motivated variants of this orientation problem.
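For intuition about the orientation problem, here is a minimal sketch of an exhaustive reference solver for tiny instances; the function and graph names are illustrative, and the paper's approximation algorithms replace this exponential enumeration.

```python
from collections import defaultdict, deque
from itertools import product

def reaches(adj, s, t):
    """BFS reachability from s to t in a directed adjacency map."""
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False

def max_orientation(directed, undirected, pairs):
    """Try both directions for every undirected edge and count
    source-target pairs connected by a directed path."""
    best = 0
    for choice in product((0, 1), repeat=len(undirected)):
        adj = defaultdict(list)
        for u, v in directed:
            adj[u].append(v)
        for (u, v), c in zip(undirected, choice):
            if c == 0:
                adj[u].append(v)
            else:
                adj[v].append(u)
        best = max(best, sum(reaches(adj, s, t) for s, t in pairs))
    return best

# Undirected path a - b - c with opposing demands: any orientation
# can satisfy at most one of the two pairs.
print(max_orientation([], [("a", "b"), ("b", "c")], [("a", "c"), ("c", "a")]))  # 1
```

The example illustrates why the objective is a maximization: conflicting source-target demands generally cannot all be satisfied at once.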
Authentication of individuals via palmprint-based biometric systems is becoming very popular due to their reliability, as palmprints contain unique and stable features.
In this paper, we present a novel approach for palmprint recognition and its representation.
To extract the palm lines, the Niblack binarization algorithm, a local thresholding technique, is adopted.
The endpoints of these lines are determined and a connection is created among them using the Delaunay triangulation thereby generating a distinct topological structure of each palmprint.
Next, we extract different geometric as well as quantitative features from the triangles of the Delaunay triangulation that assist in identifying different individuals.
To ensure that the proposed approach is invariant to rotation and scaling, the features are made relative to the topological and geometrical structure of the palmprint.
The similarity of two palmprints is computed using the weighted-sum approach, and matching is performed with the k-nearest neighbor method.
The experimental results reflect the effectiveness of the proposed approach in discriminating between different palmprint images, achieving a recognition rate of 90% over large databases.
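A minimal sketch of the triangulation step, assuming the line endpoints are available as 2D points and using scipy.spatial.Delaunay; the specific per-triangle features here (side-length ratios and area) are illustrative stand-ins for the paper's feature set.

```python
import numpy as np
from scipy.spatial import Delaunay

def triangle_features(points):
    """Delaunay-triangulate 2D endpoints and return per-triangle
    geometric features: side-length ratios (rotation- and
    scale-invariant) plus the raw area."""
    tri = Delaunay(points)
    feats = []
    for simplex in tri.simplices:
        a, b, c = points[simplex]
        sides = np.sort([np.linalg.norm(a - b),
                         np.linalg.norm(b - c),
                         np.linalg.norm(c - a)])
        ab, ac = b - a, c - a
        area = 0.5 * abs(ab[0] * ac[1] - ab[1] * ac[0])
        feats.append((sides / sides[-1], area))
    return tri.simplices, feats

# Unit square plus its center: four congruent triangles.
pts = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], float)
simplices, feats = triangle_features(pts)
```

Each palmprint then yields a bag of such triangle descriptors that can be compared across images.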
In this paper, we investigate the use of discourse-aware rewards with reinforcement learning to guide a model to generate long, coherent text.
In particular, we propose to learn neural rewards to model cross-sentence ordering as a means to approximate desired discourse structure.
Empirical results demonstrate that a generator trained with the learned reward produces more coherent and less repetitive text than models trained with cross-entropy or with reinforcement learning with commonly used scores as rewards.
In spite of the theoretical and algorithmic developments for system synthesis in recent years, little effort has been dedicated to quantifying the quality of the specifications used for synthesis.
When dealing with unrealizable specifications, finding the weakest environment assumptions that would ensure realizability is typically a desirable property; in such context the weakness of the assumptions is a major quality parameter.
The question of whether one assumption is weaker than another is commonly interpreted using implication or, equivalently, language inclusion.
However, this interpretation does not provide any further insight into the weakness of assumptions when implication does not hold.
To our knowledge, the only measure that is capable of comparing two formulae in this case is entropy, but even it fails to provide a sufficiently refined notion of weakness in case of GR(1) formulae, a subset of linear temporal logic formulae which is of particular interest in controller synthesis.
In this paper we propose a more refined measure of weakness based on the Hausdorff dimension, a concept that captures the notion of size of the omega-language satisfying a linear temporal logic formula.
We identify the conditions under which this measure is guaranteed to distinguish between weaker and stronger GR(1) formulae.
We evaluate our proposed weakness measure in the context of computing GR(1) assumptions refinements.
Traditional frameworks for dynamic graphs have relied on processing only the stream of edges added into or deleted from an evolving graph, but not any additional related information such as the degrees or neighbor lists of nodes incident to the edges.
In this paper, we propose a new edge sampling framework for big-graph analytics in dynamic graphs which enhances the traditional model by enabling the use of additional related information.
To demonstrate the advantages of this framework, we present a new sampling algorithm, called Edge Sample and Discard (ESD).
It generates an unbiased estimate of the total number of triangles, which can be continuously updated in response to both edge additions and deletions.
We provide a comparative analysis of the performance of ESD against two current state-of-the-art algorithms in terms of accuracy and complexity.
The results of the experiments performed on real graphs show that, with the help of the neighborhood information of the sampled edges, the accuracy achieved by our algorithm is substantially better.
We also characterize the impact of properties of the graph on the performance of our algorithm by testing on several Barabasi-Albert graphs.
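A toy streaming estimator in the spirit of ESD, not the exact published algorithm: full adjacency is maintained (the enhanced model permits neighbor lists), and for a p-fraction of arriving edges the triangles they close are counted and reweighted by 1/p for unbiasedness; deletions are omitted for brevity.

```python
import random
from collections import defaultdict

def esd_style_estimate(edge_stream, p, seed=0):
    """Unbiased triangle-count estimate from an edge stream:
    each triangle is counted (with probability p, weight 1/p)
    when its last edge arrives and closes it."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    estimate = 0.0
    for u, v in edge_stream:
        if rng.random() < p:
            # Common neighbors of u and v = triangles closed by (u, v).
            estimate += len(adj[u] & adj[v]) / p
        adj[u].add(v)
        adj[v].add(u)
    return estimate

# Triangle 0-1-2 plus a pendant edge: exactly one triangle.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(esd_style_estimate(edges, p=1.0))  # 1.0
```

With p = 1 the count is exact; averaging over random seeds at p < 1 recovers the true count in expectation.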
When performing data classification over a stream of continuously occurring instances, a key challenge is to develop an open-world classifier that anticipates instances from an unknown class.
Studies addressing this problem, typically called novel class detection, have considered classification methods that reactively adapt to such changes along the stream.
Importantly, they rely on the property of cohesion and separation among instances in feature space.
Instances belonging to the same class are assumed to be closer to each other (cohesion) than those belonging to different classes (separation).
Unfortunately, this assumption may not hold when dealing with high-dimensional data such as images.
In this paper, we address this key challenge by proposing a semi-supervised multi-task learning framework called CSIM which aims to intrinsically search for a latent space suitable for detecting labels of instances from both known and unknown classes.
Particularly, we utilize a convolutional neural network layer that aids in the learning of a latent feature space suitable for novel class detection.
We empirically measure the performance of CSIM over multiple real-world image datasets and demonstrate its superiority by comparing its performance with existing semi-supervised methods.
We propose a new order preserving bilinear framework that exploits low-resolution video for person detection in a multi-modal setting using deep neural networks.
In this setting cameras are strategically placed such that less robust sensors, e.g. geophones that monitor seismic activity, are located within the field of views (FOVs) of cameras.
The primary challenge is being able to leverage sufficient information from videos where there are fewer than 40 pixels on targets, while also taking advantage of less discriminative information from other modalities, e.g. seismic.
Unlike state-of-the-art methods, our bilinear framework retains spatio-temporal order when computing the vector outer products between pairs of features.
Despite the high dimensionality of these outer products, we demonstrate that our order preserving bilinear framework yields better performance than recent orderless bilinear models and alternative fusion methods.
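A minimal numpy sketch of the distinction, with assumed feature shapes: an order-preserving bilinear map keeps the temporal axis of the per-timestep outer products, whereas orderless bilinear pooling sums it away.

```python
import numpy as np

def order_preserving_bilinear(x, y):
    """For per-timestep feature sequences x (T, d1) and y (T, d2),
    form the outer product at each timestep and keep the temporal
    axis, yielding a (T, d1, d2) tensor."""
    return np.einsum('td,te->tde', x, y)

def orderless_bilinear(x, y):
    """Conventional orderless pooling: sum outer products over time."""
    return np.einsum('td,te->de', x, y)

T, d1, d2 = 5, 3, 4
x, y = np.random.rand(T, d1), np.random.rand(T, d2)
ordered = order_preserving_bilinear(x, y)
```

Summing the ordered tensor over time recovers the orderless representation, which makes explicit what information the orderless model discards.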
Three-dimensional geometric data offer an excellent domain for studying representation learning and generative modeling.
In this paper, we look at geometric data represented as point clouds.
We introduce a deep AutoEncoder (AE) network with state-of-the-art reconstruction quality and generalization ability.
The learned representations outperform existing methods on 3D recognition tasks and enable shape editing via simple algebraic manipulations, such as semantic part editing, shape analogies and shape interpolation, as well as shape completion.
We perform a thorough study of different generative models including GANs operating on the raw point clouds, significantly improved GANs trained in the fixed latent space of our AEs, and Gaussian Mixture Models (GMMs).
To quantitatively evaluate generative models we introduce measures of sample fidelity and diversity based on matchings between sets of point clouds.
Interestingly, our evaluation of generalization, fidelity and diversity reveals that GMMs trained in the latent space of our AEs yield the best results overall.
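A simplified sketch of matching-based evaluation for point-cloud generators, under the assumption that Chamfer distance is the cloud-to-cloud metric: fidelity as the mean distance from each generated cloud to its nearest reference, and diversity as coverage, the fraction of references matched by some generated sample. Function names are illustrative.

```python
import numpy as np

def chamfer(a, b):
    """Symmetric Chamfer distance between point clouds a (n, 3) and b (m, 3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def fidelity_and_coverage(samples, references):
    """Fidelity: mean nearest-reference distance (lower is better).
    Coverage: fraction of references that are some sample's nearest match."""
    dist = np.array([[chamfer(s, r) for r in references] for s in samples])
    fidelity = dist.min(axis=1).mean()
    coverage = len(set(dist.argmin(axis=1))) / len(references)
    return fidelity, coverage

rng = np.random.default_rng(0)
references = [rng.normal(size=(32, 3)) + i for i in range(3)]
samples = [r.copy() for r in references]
fid, cov = fidelity_and_coverage(samples, references)
```

A generator that memorizes one reference scores well on fidelity but poorly on coverage, which is why both matching-based measures are needed.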
Mp3 is a very popular audio format and hence can be a good host for carrying hidden messages.
Accordingly, different steganography methods have been proposed for mp3 hosts.
However, the current literature has focused only on steganalysis of mp3stego.
In this paper we mention some of the limitations of mp3stego and argue that UnderMp3Cover (Ump3c) does not have those limitations.
Ump3c makes subtle changes only to the global gain field of the bitstream and keeps the rest of the bitstream intact.
Its detection is therefore much harder than that of mp3stego.
To address this, joint distributions between the global gain and other fields of the mp3 bitstream are used.
The changes are detected by measuring the mutual information from those joint distributions.
Furthermore, we show that different mp3 encoders have dissimilar performances.
Consequently, a novel multi-layer architecture for steganalysis of Ump3c is proposed.
In this manner, the first layer detects the encoder and the second layer performs the steganalysis job.
One advantage of this architecture is that feature extraction and feature selection can be optimized for each encoder separately.
We show this multi-layer architecture outperforms the conventional single-layer methods.
Comparing results of the proposed method with other works shows an improvement of 20.4% in the accuracy of steganalysis.
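The mutual-information measurement at the core of the detector can be sketched as follows, assuming the joint distribution of two discrete bitstream fields has been estimated as a probability table; the binary example is illustrative, not real mp3 data.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability
    table joint[i, j] = P(X=i, Y=j)."""
    joint = np.asarray(joint, float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    with np.errstate(divide='ignore', invalid='ignore'):
        terms = joint * np.log2(joint / (px * py))
    return float(np.nansum(terms))  # 0 * log 0 treated as 0

# Perfectly dependent binary fields share one bit of information;
# independent fields share none.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # 1.0
```

A drop in mutual information between the global gain and other fields, relative to clean covers, would then flag embedding changes.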
Natural language generation (NLG) is a critical component in spoken dialogue systems.
Classic NLG can be divided into two phases: (1) sentence planning: deciding on the overall sentence structure, (2) surface realization: determining specific word forms and flattening the sentence structure into a string.
Many simple NLG models are based on recurrent neural networks (RNN) and sequence-to-sequence (seq2seq) model, which basically contains an encoder-decoder structure; these NLG models generate sentences from scratch by jointly optimizing sentence planning and surface realization using a simple cross entropy loss training criterion.
However, the simple encoder-decoder architecture usually struggles to generate complex and long sentences, because the decoder has to learn all grammar and diction knowledge.
This paper introduces a hierarchical decoding NLG model based on linguistic patterns in different levels, and shows that the proposed method outperforms the traditional one with a smaller model size.
Furthermore, the design of the hierarchical decoding is flexible and easily-extensible in various NLG systems.
We motivate and give semantics to theory presentation combinators as the foundational building blocks for a scalable library of theories.
The key observation is that the category of contexts and fibered categories are the ideal theoretical tools for this purpose.
This paper uses a spatial Aloha model to describe a distributed autonomous wireless network in which a group of transmit-receive pairs (users) shares a common collision channel via slotted-Aloha-like random access.
The objective of this study is to develop an intelligent algorithm to be embedded into the transceivers so that all users know how to self-tune their medium access probability (MAP) to achieve overall Pareto optimality in terms of network throughput under spatial reuse while maintaining network stability.
While the optimal solution requires each user to have complete information about the network, our proposed algorithm only requires users to have local information.
The core idea of our algorithm is that the users first self-organize into a number of non-overlapping neighborhoods, and the user with the maximum node degree in each neighborhood is elected as the local leader (LL).
Each LL then adjusts its MAP according to a parameter R which indicates the radio intensity level in its neighboring region, whereas the remaining users in the neighborhood simply follow the same MAP value.
We show that by ensuring R less than or equal to 2 at the LLs, the stability of the entire network can be assured even when each user only has partial network information.
For practical implementation, we propose that each LL use R=2 as the constant reference signal to its built-in proportional-integral controller.
The settings of the control parameters are discussed and we validate through simulations that the proposed method is able to achieve close-to-Pareto-front throughput.
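A minimal sketch of the LL's control loop, with an assumed toy measurement model in place of real radio-intensity feedback: a discrete proportional-integral controller drives the medium access probability (MAP) so that the measured R tracks the reference R = 2. Gains and the linear response are illustrative.

```python
def pi_controlled_map(measure_R, kp=0.05, ki=0.01, R_ref=2.0,
                      map_init=0.5, steps=200):
    """Tune the MAP with a discrete PI controller so the measured
    radio-intensity level R tracks R_ref."""
    map_prob, integral = map_init, 0.0
    for _ in range(steps):
        error = R_ref - measure_R(map_prob)
        integral += error
        map_prob += kp * error + ki * integral
        map_prob = min(max(map_prob, 0.0), 1.0)  # keep a valid probability
    return map_prob

# Toy neighborhood model (assumption): R grows linearly with the MAP,
# so R = 2 is reached at MAP = 0.2.
final_map = pi_controlled_map(lambda m: 10.0 * m)
print(round(final_map, 3))  # 0.2
```

The integral term removes steady-state error, which is why a constant reference suffices even though each LL sees only local information.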
In this paper, we present our approach for the 2018 Medico Task classifying diseases in the gastrointestinal tract.
We have proposed a system based on global features and deep neural networks.
The best approach combines two neural networks, and the reproducible experimental results signify the efficiency of the proposed model with an accuracy rate of 95.80%, a precision of 95.87%, and an F1-score of 95.80%.
Recurrent neural networks (RNNs) are the state of the art in sequence modeling for natural language.
However, it remains poorly understood what grammatical characteristics of natural language they implicitly learn and represent as a consequence of optimizing the language modeling objective.
Here we deploy the methods of controlled psycholinguistic experimentation to shed light on the extent to which RNN behavior reflects incremental syntactic state and grammatical dependency representations known to characterize human linguistic behavior.
We broadly test two publicly available long short-term memory (LSTM) English sequence models, and learn and test a new Japanese LSTM.
We demonstrate that these models represent and maintain incremental syntactic state, but that they do not always generalize in the same way as humans.
Furthermore, none of our models learn the appropriate grammatical dependency configurations licensing reflexive pronouns or negative polarity items.
Sampling above the Nyquist rate is at the heart of sigma-delta modulation, where the increase in sampling rate is translated to a reduction in the overall (mean-squared-error) reconstruction distortion.
This is attained by using a feedback filter at the encoder, in conjunction with a low-pass filter at the decoder.
The goal of this work is to characterize the optimal trade-off between the per-sample quantization rate and the resulting mean-squared-error distortion, under various restrictions on the feedback filter.
To this end, we establish a duality relation between the performance of sigma-delta modulation, and that of differential pulse-code modulation when applied to (discrete-time) band-limited inputs.
As the optimal trade-off for the latter scheme is fully understood, the full characterization for sigma-delta modulation, as well as the optimal feedback filters, immediately follow.
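A minimal first-order sigma-delta modulator illustrates the feedback-filter structure discussed above; this is a textbook sketch with a 1-bit quantizer and one-tap error feedback, not the optimized filters derived in the paper.

```python
import numpy as np

def first_order_sigma_delta(x):
    """First-order sigma-delta modulation: feed the running
    quantization error back into each sample before the 1-bit
    quantizer, pushing quantization noise to high frequencies."""
    out = np.empty_like(x)
    error = 0.0
    for i, sample in enumerate(x):
        u = sample + error              # add fed-back error
        out[i] = 1.0 if u >= 0 else -1.0
        error = u - out[i]              # noise shaping via error feedback
    return out

# Oversampled low-frequency input: after low-pass filtering,
# the local average of the 1-bit stream tracks the input.
t = np.arange(4096)
x = 0.5 * np.sin(2 * np.pi * t / 1024)
bits = first_order_sigma_delta(x)
```

Low-pass filtering at the decoder (here, a simple moving average) recovers the input, mirroring the encoder-feedback / decoder-low-pass split described above.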
Learning intents and slot labels from user utterances is a fundamental step in all spoken language understanding (SLU) and dialog systems.
State-of-the-art neural network based methods, after deployment, often suffer from performance degradation on encountering paraphrased utterances, and out-of-vocabulary words, rarely observed in their training set.
We address this challenging problem by introducing a novel paraphrasing based SLU model which can be integrated with any existing SLU model in order to improve their overall performance.
We propose two new paraphrase generators using RNN and sequence-to-sequence based neural networks, which are suitable for our application.
Our experiments on existing benchmark and in-house datasets demonstrate the robustness of our models to rare and complex paraphrased utterances, even under adversarial test distributions.
We introduce a saliency-based distortion layer for convolutional neural networks that helps to improve the spatial sampling of input data for a given task.
Our differentiable layer can be added as a preprocessing block to existing task networks and trained altogether in an end-to-end fashion.
The effect of the layer is to efficiently estimate how to sample from the original data in order to boost task performance.
For example, for an image classification task in which the original data might range in size up to several megapixels, but where the desired input images to the task network are much smaller, our layer learns how best to sample from the underlying high resolution data in a manner which preserves task-relevant information better than uniform downsampling.
This has the effect of creating distorted, caricature-like intermediate images, in which idiosyncratic elements of the image that improve task performance are zoomed and exaggerated.
Unlike alternative approaches such as spatial transformer networks, our proposed layer is inspired by image saliency, computed efficiently from uniformly downsampled data, and degrades gracefully to a uniform sampling strategy under uncertainty.
We apply our layer to improve existing networks for the tasks of human gaze estimation and fine-grained object classification.
Code for our method is available at: http://github.com/recasens/Saliency-Sampler
While game theory is widely used to model strategic interactions, a natural question is where the game representations come from.
One answer is to learn the representations from data.
If one wants to learn both the payoffs and the players' strategies, a naive approach is to learn them both directly from the data.
This approach ignores the fact that the players might be playing reasonably good strategies, so there is a connection between the strategies and the data.
The main contribution of this paper is to make this connection while learning.
We formulate the learning problem as a weighted constraint satisfaction problem, including constraints both for the fit of the payoffs and strategies to the data and the fit of the strategies to the payoffs.
We use quantal response equilibrium as our notion of rationality for quantifying the latter fit.
Our results show that incorporating rationality constraints can improve learning when the amount of data is limited.
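The logit quantal response used as the rationality notion can be sketched as follows; the matching-pennies payoffs and the rationality parameter are illustrative.

```python
import numpy as np

def quantal_response(payoff, opponent_strategy, lam=1.0):
    """Logit quantal response: play actions with probability
    proportional to exp(lam * expected payoff). lam -> infinity
    recovers best response; lam = 0 gives uniform play."""
    expected = payoff @ opponent_strategy      # expected payoff per action
    z = lam * expected
    z -= z.max()                               # numerical stability
    probs = np.exp(z)
    return probs / probs.sum()

# Matching-pennies row player facing a slightly biased column player:
# the better action is played more often, but not exclusively.
payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])
resp = quantal_response(payoff, np.array([0.6, 0.4]), lam=2.0)
```

Fitting payoffs so that observed play looks like a quantal response is exactly the "fit of the strategies to the payoffs" constraint described above.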
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification.
In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance in order to achieve good performance.
Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner.
In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works.
We also emphasize and discuss the importance of feature normalization in the paper.
Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset.
Our code has also been made available at https://github.com/happynear/AMSoftmax
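The loss can be sketched in a few lines of numpy; this follows the description above (normalize features and class weights, subtract an additive margin m from the target-class cosine, scale by s), with illustrative values for s, m, and the random inputs.

```python
import numpy as np

def am_softmax_loss(features, weights, labels, s=30.0, m=0.35):
    """Additive margin softmax: cosine logits with margin m
    subtracted from the target class, scaled by s, then standard
    softmax cross-entropy."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = f @ w                                  # (batch, classes) cosines
    n = len(labels)
    logits = s * cos
    logits[np.arange(n), labels] = s * (cos[np.arange(n), labels] - m)
    logits -= logits.max(axis=1, keepdims=True)  # stable log-softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(n), labels].mean()

rng = np.random.default_rng(1)
feats = rng.normal(size=(8, 16))
W = rng.normal(size=(16, 4))
labels = np.array([0, 1, 2, 3, 0, 1, 2, 3])
loss_margin = am_softmax_loss(feats, W, labels)
loss_plain = am_softmax_loss(feats, W, labels, m=0.0)
```

Setting m = 0 recovers the normalized-softmax baseline; the margin strictly increases the loss for any fixed inputs, which is what forces intra-class compactness during training.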
The secrecy capacity of the type II wiretap channel (WTC II) with a noisy main channel is currently an open problem.
Herein its secrecy-capacity is derived and shown to be equal to its semantic-security (SS) capacity.
In this setting, the legitimate users communicate via a discrete-memoryless (DM) channel in the presence of an eavesdropper that has perfect access to a subset of its choosing of the transmitted symbols, constrained to a fixed fraction of the blocklength.
The secrecy criterion is achieved simultaneously for all possible eavesdropper subset choices.
The SS criterion demands negligible mutual information between the message and the eavesdropper's observations even when maximized over all message distributions.
A key tool for the achievability proof is a novel and stronger version of Wyner's soft covering lemma.
Specifically, a random codebook is shown to achieve the soft-covering phenomenon with high probability.
The probability of failure is doubly-exponentially small in the blocklength.
Since the combined number of messages and subsets grows only exponentially with the blocklength, SS for the WTC II is established by using the union bound and invoking the stronger soft-covering lemma.
The direct proof shows that rates up to the weak-secrecy capacity of the classic WTC with a DM erasure channel (EC) to the eavesdropper are achievable.
The converse follows by establishing the capacity of this DM wiretap EC as an upper bound for the WTC II.
From a broader perspective, the stronger soft-covering lemma constitutes a tool for showing the existence of codebooks that satisfy exponentially many constraints, a beneficial ability for many other applications in information theoretic security.
In this paper we propose a novel approach to tracking by detection that can exploit both cameras as well as LIDAR data to produce very accurate 3D trajectories.
Towards this goal, we formulate the problem as a linear program that can be solved exactly, and learn convolutional networks for detection as well as matching in an end-to-end manner.
We evaluate our model in the challenging KITTI dataset and show very competitive results.
Two-step predictor/corrector methods are provided to solve three classes of problems that present themselves as systems of ordinary differential equations (ODEs).
In the first class, velocities are given from which displacements are to be solved.
In the second class, velocities and accelerations are given from which displacements are to be solved.
And in the third class, accelerations are given from which velocities and displacements are to be solved.
Two-step methods are not self-starting, so compatible one-step methods are provided to take the first step.
An algorithm is presented for controlling the step size so that the local truncation error does not exceed a specified tolerance.
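A sketch of a two-step predictor/corrector pair for the first class of problems (velocities given, displacements solved), using an Adams-Bashforth predictor, an Adams-Moulton corrector, and a compatible one-step Heun starter; the paper's specific methods and step-size control may differ.

```python
import math

def pc2_solve(f, y0, t0, t1, n):
    """Two-step Adams-Bashforth predictor / Adams-Moulton corrector
    for y' = f(t, y), with a Heun step to start (two-step methods
    are not self-starting)."""
    h = (t1 - t0) / n
    t, y = t0, y0
    f_prev = f(t, y)
    # One-step starter (Heun / improved Euler).
    y = y + h / 2 * (f_prev + f(t + h, y + h * f_prev))
    t += h
    for _ in range(n - 1):
        f_curr = f(t, y)
        y_pred = y + h / 2 * (3 * f_curr - f_prev)                      # AB2 predictor
        y = y + h / 12 * (5 * f(t + h, y_pred) + 8 * f_curr - f_prev)   # AM corrector
        f_prev = f_curr
        t += h
    return y

# y' = -y, y(0) = 1: exact solution exp(-1) at t = 1.
approx = pc2_solve(lambda t, y: -y, 1.0, 0.0, 1.0, 100)
print(abs(approx - math.exp(-1)))  # small truncation error
```

The predictor-corrector discrepancy at each step also gives a cheap local-error estimate, which is the quantity a step-size controller of the kind mentioned above monitors against its tolerance.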
This paper introduces a novel deep learning framework for image animation.
Given an input image with a target object and a driving video sequence depicting a moving object, our framework generates a video in which the target object is animated according to the driving sequence.
This is achieved through a deep architecture that decouples appearance and motion information.
Our framework consists of three main modules: (i) a Keypoint Detector trained in an unsupervised manner to extract object keypoints, (ii) a Dense Motion prediction network for generating dense heatmaps from sparse keypoints, in order to better encode motion information, and (iii) a Motion Transfer Network, which uses the motion heatmaps and appearance information extracted from the input image to synthesize the output frames.
We demonstrate the effectiveness of our method on several benchmark datasets, spanning a wide variety of object appearances, and show that our approach outperforms state-of-the-art image animation and video generation methods.
Energy consumption is an important concern in modern multicore processors.
The energy consumed during the execution of an application can be minimized by tuning the hardware state utilizing knobs such as frequency, voltage etc.
The existing theoretical work on energy minimization using Global DVFS (Dynamic Voltage and Frequency Scaling), despite being thorough, ignores the energy consumed by the CPU on memory accesses and the dynamic energy consumed by the idle cores.
This article presents an analytical model for the performance and the overall energy consumed by the CPU chip on CPU instructions as well as the memory accesses without ignoring the dynamic energy consumed by the idle cores.
We present an analytical framework around our energy-performance model to predict the operating frequencies for global DVFS that minimize the overall CPU energy consumption within a performance budget.
Finally, we suggest a scheduling criterion for energy-aware scheduling of memory-intensive parallel applications.
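A toy version of the frequency-selection idea, with an assumed cubic dynamic-power law and a frequency-independent memory-access time; all parameters are illustrative, not the paper's model.

```python
def best_frequency(freqs, cycles, mem_time, time_budget,
                   dyn_coeff=1.0, static_power=0.5):
    """Pick the frequency minimizing energy within a performance
    budget: time = cycles/f + mem_time (memory part does not scale
    with f), power = dyn_coeff * f^3 + static_power."""
    best_f, best_e = None, float('inf')
    for f in freqs:
        t = cycles / f + mem_time
        if t > time_budget:
            continue  # violates the performance budget
        energy = (dyn_coeff * f ** 3 + static_power) * t
        if energy < best_e:
            best_f, best_e = f, energy
    return best_f, best_e

# The slowest feasible frequency wins here: higher f barely shortens
# the memory-bound runtime but cubes the dynamic power.
f_star, e_star = best_frequency([0.5, 1.0, 1.5, 2.0],
                                cycles=1.0, mem_time=0.5, time_budget=2.0)
print(f_star, round(e_star, 2))  # 1.0 2.25
```

This captures why ignoring memory-access energy and time skews the optimal operating point for memory-intensive applications.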
Changes in technology have resulted in new ways for bankers to deliver their services to customers.
Electronic banking systems in various forms are evidence of such advancement.
However, information security threats are also evolving along with this trend.
This paper proposes the application of Analytic Hierarchy Process (AHP) methodology to guide decision makers in banking industries to deal with information security policy.
The model is structured according to aspects of information security policy in conjunction with information security elements.
We found that the cultural aspect is given top priority among security aspects, while confidentiality is considered the most important factor in terms of information security elements.
Cooperative transmission of data fosters rapid accumulation of knowledge by efficiently combining experiences across learners.
Although well studied in human learning and increasingly in machine learning, we lack formal frameworks through which we may reason about the benefits and limitations of cooperative inference.
We present such a framework.
We introduce novel indices for measuring the effectiveness of probabilistic and cooperative information transmission.
We relate our indices to the well-known Teaching Dimension in deterministic settings.
We prove conditions under which optimal cooperative inference can be achieved, including a representation theorem that constrains the form of inductive biases for learners optimized for cooperative inference.
We conclude by demonstrating how these principles may inform the design of machine learning algorithms and discuss implications for human and machine learning.
Acquisition of labeled training samples for affective computing is usually costly and time-consuming, as affects are intrinsically subjective, subtle and uncertain, and hence multiple human assessors are needed to evaluate each affective sample.
Particularly, for affect estimation in the 3D space of valence, arousal and dominance, each assessor has to perform the evaluations in three dimensions, which makes the labeling problem even more challenging.
Many sophisticated machine learning approaches have been proposed to reduce the data labeling requirement in various other domains, but so far few have considered affective computing.
This paper proposes two multi-task active learning for regression approaches, which select the most beneficial samples to label, by considering the three affect primitives simultaneously.
Experimental results on the VAM corpus demonstrated that our optimal sample selection approaches can result in better estimation performance than random selection and several traditional single-task active learning approaches.
Thus, they can help alleviate the data labeling problem in affective computing, i.e., better estimation performance can be obtained from fewer labeling queries.
Symbolic and logic computation systems ranging from computer algebra systems to theorem provers are finding their way into science, technology, mathematics and engineering.
But such systems rely on explicitly or implicitly represented mathematical knowledge that must be managed for them to be used effectively.
While mathematical knowledge management (MKM) "in the small" is well-studied, scaling up to large, highly interconnected corpora remains difficult.
We hold that in order to realize MKM "in the large", we need representation languages and software architectures that are designed systematically with large-scale processing in mind.
Therefore, we have designed and implemented the MMT language -- a module system for mathematical theories.
MMT is designed as the simplest possible language that combines a module system, a foundationally uncommitted formal semantics, and web-scalable implementations.
Due to a careful choice of representational primitives, MMT allows us to integrate existing representation languages for formal mathematical knowledge in a simple, scalable formalism.
In particular, MMT abstracts from the underlying mathematical and logical foundations so that it can serve as a standardized representation format for a formal digital library.
Moreover, MMT systematically separates logic-dependent and logic-independent concerns so that it can serve as an interface layer between computation systems and MKM systems.
Neural Style Transfer based on Convolutional Neural Networks (CNN) aims to synthesize a new image that retains the high-level structure of a content image, rendered in the low-level texture of a style image.
This is achieved by constraining the new image to have high-level CNN features similar to the content image, and lower-level CNN features similar to the style image.
However in the traditional optimization objective, low-level features of the content image are absent, and the low-level features of the style image dominate the low-level detail structures of the new image.
Hence in the synthesized image, many details of the content image are lost, and a lot of inconsistent and unpleasing artifacts appear.
As a remedy, we propose to steer image synthesis with a novel loss function: the Laplacian loss.
The Laplacian matrix ("Laplacian" in short), produced by a Laplacian operator, is widely used in computer vision to detect edges and contours.
The Laplacian loss measures the difference of the Laplacians, and correspondingly the difference of the detail structures, between the content image and a new image.
It is flexible and compatible with the traditional style transfer constraints.
By incorporating the Laplacian loss, we obtain a new optimization objective for neural style transfer named Lapstyle.
Minimizing this objective will produce a stylized image that better preserves the detail structures of the content image and eliminates the artifacts.
Experiments show that Lapstyle produces more appealing stylized images with less artifacts, without compromising their "stylishness".
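The Laplacian loss itself is simple to state in code; the sketch below uses scipy's Laplacian filter on single-channel images as an assumed stand-in for the paper's exact operator and image pipeline.

```python
import numpy as np
from scipy.ndimage import laplace

def laplacian_loss(content, synthesized):
    """Penalize the squared difference between the Laplacian
    responses of the content image and the synthesized image,
    preserving edge and contour structure during style transfer."""
    return float(np.mean((laplace(content) - laplace(synthesized)) ** 2))

content = np.random.rand(64, 64)
# Identical detail structure gives zero loss; a uniform brightness
# shift also costs nothing, since the Laplacian ignores constants.
assert laplacian_loss(content, content) == 0.0
assert laplacian_loss(content, content + 0.3) < 1e-12
```

Because the Laplacian responds only to local second-order structure, this loss constrains detail placement without fighting the global color and texture changes that style transfer is supposed to make.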
How information spreads throughout a social network is valuable knowledge sought by many groups, such as marketing enterprises and political parties.
If they could somehow predict the impact of a given message, or manipulate it to amplify how far it will spread, it would give them a huge advantage over their competitors.
Intuitively, two factors are expected to contribute to making a piece of information go viral: how influential the person who spreads it is within the network, and the content of the message.
The former should have a more important role, since people will not just blindly share any content; or will they?
In this work it is found that the degree of a node alone is capable of accurately predicting how many followers of the seed user will spread the information through a simple linear regression.
The analysis was performed with five different messages from Twitter network that was shared with different degrees along the users.
The results show evidences that no matter the content, the number of affected neighbors is predictable.
The role of the content of the messages of a user is likely to influence the network formation and the path the message will follow through the network.
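The degree-based prediction described above can be illustrated with a simple least-squares fit. The data values below are hypothetical, for illustration only, not the paper's Twitter measurements:

```python
import numpy as np

# Hypothetical toy data: degree of the seed user vs. number of
# followers who re-shared the message (illustrative values only).
degree = np.array([10, 50, 120, 300, 800, 1500], dtype=float)
spread = np.array([3, 14, 33, 85, 230, 420], dtype=float)

# Fit spread ~ a * degree + b by ordinary least squares.
a, b = np.polyfit(degree, spread, deg=1)
predicted = a * degree + b

# Coefficient of determination R^2 as a rough accuracy measure.
ss_res = np.sum((spread - predicted) ** 2)
ss_tot = np.sum((spread - spread.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

When the relationship is close to linear, as the paper reports, a single-feature regression like this already yields a high R².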
As people become more concerned with the need to reduce their power consumption, we need to find ways to inform them of how electricity is being consumed within the home.
There are a number of devices that have been designed using different forms, sizes, and technologies.
We are interested in large ambient displays that can be read at a glance and from a distance as informative art.
However, from these objectives come a number of questions that need to be explored and answered.
To what degree might lifestyle factors influence the design of eco-visualizations?
To answer this we need to ask how people with varying lifestyle factors perceive the utility of such devices and their placement within a home.
We explore these questions by creating four ambient display prototypes.
We take our prototypes and subject them to a user study to gain insight as to the questions posed above.
This paper discusses our prototypes in detail and the results and findings of our user study.
In this note we study the connection between the existence of a projective reconstruction and the existence of a fundamental matrix satisfying the epipolar constraints.
Image cropping aims at improving the aesthetic quality of images by adjusting their composition.
Most weakly supervised cropping methods (without bounding box supervision) rely on the sliding window mechanism.
The sliding window mechanism requires fixed aspect ratios, which prevents cropping regions of arbitrary size.
Moreover, the sliding window method usually produces tens of thousands of windows on the input image, which is very time-consuming.
Motivated by these challenges, we first formulate aesthetic image cropping as a sequential decision-making process and propose a weakly supervised Aesthetics Aware Reinforcement Learning (A2-RL) framework to address this problem.
Particularly, the proposed method develops an aesthetics aware reward function which especially benefits image cropping.
Similar to human decision making, we use a comprehensive state representation that includes both the current observation and the historical experience.
We train the agent using the actor-critic architecture in an end-to-end manner.
The agent is evaluated on several popular unseen cropping datasets.
Experimental results show that our method achieves state-of-the-art performance with far fewer candidate windows and much less time compared with previous weakly supervised methods.
A new transform over finite fields, the finite field Hartley transform (FFHT), was recently introduced and a number of promising applications on the design of efficient multiple access systems and multilevel spread spectrum sequences were proposed.
The FFHT exhibits interesting symmetries, which are exploited to derive tailored fast transform algorithms.
The proposed fast algorithms are based on successive decompositions of the FFHT by means of Hadamard-Walsh transforms (HWT).
The introduced decompositions meet the lower bound on the multiplicative complexity for all the cases investigated.
The complexity of the new algorithms is compared with that of traditional algorithms.
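As a point of reference for the Hadamard-Walsh building block used in these decompositions, below is a standard O(n log n) fast Walsh-Hadamard transform. This is a textbook routine over the integers, not the paper's finite-field FFHT algorithm:

```python
def fwht(x):
    """Fast Walsh-Hadamard transform of a sequence whose length is a power
    of two, using O(n log n) butterfly stages instead of the O(n^2)
    matrix-vector product."""
    a = list(x)
    h = 1
    while h < len(a):
        for i in range(0, len(a), h * 2):
            for j in range(i, i + h):
                # Butterfly: (u, v) -> (u + v, u - v)
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a
```

For example, a constant input concentrates into the first coefficient, and an impulse spreads uniformly, mirroring the symmetry properties fast transform algorithms exploit.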
E-commerce users may expect different products even for the same query, due to their diverse personal preferences.
It is well-known that there are two types of preferences: long-term ones and short-term ones.
The former refers to users' inherent purchasing bias and evolves slowly.
By contrast, the latter reflects users' purchasing inclination in a relatively short period.
They both affect users' current purchasing intentions.
However, few research efforts have been dedicated to jointly modeling them for personalized product search.
To this end, we propose a novel Attentive Long Short-Term Preference model, dubbed ALSTP, for personalized product search.
Our model adopts a neural network approach to learn and integrate the long- and short-term user preferences with the current query for personalized product search.
In particular, two attention networks are designed to distinguish which factors in the short-term as well as long-term user preferences are more relevant to the current query.
This unique design enables our model to capture users' current search intentions more accurately.
Our work is the first to apply attention mechanisms to integrate both long- and short-term user preferences with the given query for personalized search.
Extensive experiments over four Amazon product datasets show that our model significantly outperforms several state-of-the-art product search methods in terms of different evaluation metrics.
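The core attention step, scoring preference factors against the current query and pooling them, can be sketched as follows. This is a generic dot-product attention illustration, not the exact ALSTP architecture:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def attend(query, prefs):
    """Weight preference vectors (rows of `prefs`) by their dot-product
    relevance to `query`, and return the attention weights together with
    the attention-pooled preference representation."""
    scores = prefs @ query          # one relevance score per factor
    weights = softmax(scores)       # normalized attention weights
    return weights, weights @ prefs # pooled preference vector
```

Factors more aligned with the current query receive larger weights, which is how the model emphasizes query-relevant parts of the long- and short-term preferences.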
The Multi-Carrier Code Division Multiple Access (MC-CDMA) is becoming a very significant downlink multiple access technique for high-rate data transmission in the fourth generation wireless communication systems.
By means of efficient resource allocation, a higher data rate, i.e., throughput, can be achieved.
This paper evaluates the performance of criteria used for group (subchannel) allocation employed in downlink transmission, which results in throughput maximization.
The proposed algorithm gives a modified technique for subchannel allocation in the downlink transmission of MC-CDMA systems.
Simulations are carried out for all three combining schemes; the results show that, for a given power and BER, the proposed algorithm gives far better results.
Complex systems are increasingly being viewed as distributed information processing systems, particularly in the domains of computational neuroscience, bioinformatics and Artificial Life.
This trend has resulted in a strong uptake in the use of (Shannon) information-theoretic measures to analyse the dynamics of complex systems in these fields.
We introduce the Java Information Dynamics Toolkit (JIDT): a Google code project which provides a standalone, (GNU GPL v3 licensed) open-source code implementation for empirical estimation of information-theoretic measures from time-series data.
While the toolkit provides classic information-theoretic measures (e.g. entropy, mutual information, conditional mutual information), it ultimately focusses on implementing higher-level measures for information dynamics.
That is, JIDT focusses on quantifying information storage, transfer and modification, and the dynamics of these operations in space and time.
For this purpose, it includes implementations of the transfer entropy and active information storage, their multivariate extensions and local or pointwise variants.
JIDT provides implementations for both discrete and continuous-valued data for each measure, including various types of estimator for continuous data (e.g. Gaussian, box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time due to Java's object-oriented polymorphism.
Furthermore, while written in Java, the toolkit can be used directly in MATLAB, GNU Octave, Python and other environments.
We present the principles behind the code design, and provide several examples to guide users.
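To give a flavour of the kind of measure JIDT estimates, here is a plain-Python computation of mutual information between two discrete time series via the identity I(X;Y) = H(X) + H(Y) - H(X,Y). This is a generic illustration of the discrete estimator, not JIDT's actual API:

```python
from collections import Counter
from math import log2

def entropy(seq):
    """Shannon entropy (in bits) of the empirical distribution of `seq`."""
    n = len(seq)
    return -sum(c / n * log2(c / n) for c in Counter(seq).values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned discrete series."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))
```

A series compared with itself yields its full entropy, while two independent series yield (approximately) zero, the two sanity checks any such estimator should pass.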
This article discusses how the automation of tensor algorithms, based on A Mathematics of Arrays and Psi Calculus, and a new way to represent numbers, Unum Arithmetic, enables mechanically provable, scalable, portable, and more numerically accurate software.
Recurrent Neural Networks (RNNs) continue to show outstanding performance in sequence modeling tasks.
However, training RNNs on long sequences often faces challenges such as slow inference, vanishing gradients, and difficulty in capturing long-term dependencies.
In backpropagation through time settings, these issues are tightly coupled with the large, sequential computational graph resulting from unfolding the RNN in time.
We introduce the Skip RNN model which extends existing RNN models by learning to skip state updates and shortens the effective size of the computational graph.
This model can also be encouraged to perform fewer state updates through a budget constraint.
We evaluate the proposed model on various tasks and show how it can reduce the number of required RNN updates while preserving, and sometimes even improving, the performance of the baseline RNN models.
Source code is publicly available at https://imatge-upc.github.io/skiprnn-2017-telecombcn/ .
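The skip mechanism can be sketched as a binarized update gate. The skeleton below is a toy illustration only; the real model parameterizes the update probability with a trained network and propagates gradients through the binarization with a straight-through estimator:

```python
def skip_rnn_steps(inputs, update_prob_fn, cell_fn, h0):
    """Run a recurrent cell over `inputs`, binarizing a per-step update
    probability: when it rounds to 0, the state update is skipped and the
    previous hidden state is copied forward, shortening the effective
    computational graph."""
    h, updates = h0, 0
    for x in inputs:
        if round(update_prob_fn(h, x)) == 1:
            h = cell_fn(h, x)   # ordinary RNN state update
            updates += 1
        # else: skip; h is carried over unchanged
    return h, updates
```

Counting `updates` is also how a budget constraint can be imposed: penalizing the number of updates encourages the gate to emit low probabilities.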
We address the problem of executing tool-using manipulation skills in scenarios where the objects to be used may vary.
We assume that point clouds of the tool and target object can be obtained, but no interpretation or further knowledge about these objects is provided.
The system must interpret the point clouds and decide how to use the tool to complete a manipulation task with a target object; this means it must adjust motion trajectories appropriately to complete the task.
We tackle three everyday manipulations: scraping material from a tool into a container, cutting, and scooping from a container.
Our solution encodes these manipulation skills in a generic way, with parameters that can be filled in at run-time via queries to a robot perception module; the perception module abstracts the functional parts for the tool and extracts key parameters that are needed for the task.
The approach is evaluated in simulation and with selected examples on a PR2 robot.
Modern websites include various types of third-party content such as JavaScript, images, stylesheets, and Flash objects in order to create interactive user interfaces.
In addition to explicit inclusion of third-party content by website publishers, ISPs and browser extensions are hijacking web browsing sessions with increasing frequency to inject third-party content (e.g., ads).
However, third-party content can also introduce security risks to users of these websites, unbeknownst to both website operators and users.
Because of the often highly dynamic nature of these inclusions as well as the use of advanced cloaking techniques in contemporary malware, it is exceedingly difficult to preemptively recognize and block inclusions of malicious third-party content before it has the chance to attack the user's system.
In this paper, we propose a novel approach to achieving the goal of preemptive blocking of malicious third-party content inclusion through an analysis of inclusion sequences on the Web.
We implemented our approach, called Excision, as a set of modifications to the Chromium browser that protects users from malicious inclusions while web pages load.
Our analysis suggests that by adopting our in-browser approach, users can avoid a significant portion of malicious third-party content on the Web.
Our evaluation shows that Excision effectively identifies malicious content while introducing a low false positive rate.
Our experiments also demonstrate that our approach does not negatively impact a user's browsing experience when browsing popular websites drawn from the Alexa Top 500.
The majority of existing color naming methods focuses on the eleven basic color terms of the English language.
However, in many applications, different sets of color names are used for the accurate description of objects.
Labeling data to learn these domain-specific color names is an expensive and laborious task.
Therefore, in this article we aim to learn color names from weakly labeled data.
For this purpose, we add an attention branch to the color naming network.
The attention branch is used to modulate the pixel-wise color naming predictions of the network.
In experiments, we illustrate that the attention branch correctly identifies the relevant regions.
Furthermore, we show that our method obtains state-of-the-art results for pixel-wise and image-wise classification on the EBAY dataset and is able to learn color names for various domains.
In light of the quick proliferation of Internet of things (IoT) devices and applications, fog radio access network (Fog-RAN) has been recently proposed for fifth generation (5G) wireless communications to assure the requirements of ultra-reliable low-latency communication (URLLC) for the IoT applications which cannot accommodate large delays.
Hence, fog nodes (FNs) are equipped with computing, signal processing and storage capabilities to extend the inherent operations and services of the cloud to the edge.
We consider the problem of sequentially allocating the FN's limited resources to the IoT applications of heterogeneous latency requirements.
For each access request from an IoT user, the FN needs to decide whether to serve it locally utilizing its own resources or to refer it to the cloud to conserve its valuable resources for future users of potentially higher utility to the system (i.e., lower latency requirement).
We formulate the Fog-RAN resource allocation problem in the form of a Markov decision process (MDP), and employ several reinforcement learning (RL) methods, namely Q-learning, SARSA, Expected SARSA, and Monte Carlo, for solving the MDP problem by learning the optimum decision-making policies.
We verify the performance and adaptivity of the RL methods and compare them with the performance of a fixed-threshold-based algorithm.
Extensive simulation results considering 19 IoT environments of heterogeneous latency requirements corroborate that RL methods always achieve the best possible performance regardless of the IoT environment.
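As an illustration of the tabular RL methods involved, below is a minimal Q-learning loop over a user-supplied environment. The serve-locally-versus-refer toy environment described in the note afterwards is hypothetical, not the paper's Fog-RAN model:

```python
import random

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy.
    `step(s, a)` must return (reward, next_state, done)."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < eps:                      # explore
                a = rng.randrange(n_actions)
            else:                                       # exploit
                a = max(range(n_actions), key=lambda i: Q[s][i])
            r, s2, done = step(s, a)
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])       # TD update
            s = s2
    return Q
```

For instance, in a one-state toy environment where action 1 ("serve locally") yields reward 1 and action 0 ("refer to the cloud") yields 0, the learned values end up with Q[0][1] above Q[0][0], so the greedy policy serves locally.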
In this work, we consider a 1-bit quantized massive MIMO channel with superimposed pilot (SP) scheme, dubbed QSP.
With linear minimum mean square error (LMMSE) channel estimator and maximum ratio combining (MRC) receiver at the BS, we derive an approximate lower bound on the achievable rate.
When optimizing the pilot and data powers, the power allocation that maximizes the data rate is obtained in closed form.
Although there is a performance gap between the quantized and unquantized systems, it is shown that this gap diminishes as the number of BS antennas is asymptotically large.
Moreover, we show that removing the pilot from the received signal using the channel estimate does not result in a significant increase in information, especially at low signal-to-noise ratio (SNR) and with a large number of users.
We present some numerical results to corroborate our analytical findings and insights are provided for further exploration of the quantized systems with SP.
In order to avoid the state space explosion problem encountered in the quantitative analysis of large scale PEPA models, a fluid approximation approach has recently been proposed, which results in a set of ordinary differential equations (ODEs) to approximate the underlying continuous time Markov chain (CTMC).
This paper presents a mapping semantics from PEPA to ODEs based on a numerical representation scheme, which extends the class of PEPA models that can be subjected to fluid approximation.
Furthermore, we have established the fundamental characteristics of the derived ODEs, such as the existence, uniqueness, boundedness and nonnegativeness of the solution.
The convergence of the solution as time tends to infinity for several classes of PEPA models, has been proved under some mild conditions.
For general PEPA models, the convergence is proved under a particular condition, which has been revealed to relate to some famous constants of Markov chains such as the spectral gap and the Log-Sobolev constant.
This thesis has established the consistency between the fluid approximation and the underlying CTMCs for PEPA, i.e. the limit of the solution is consistent with the equilibrium probability distribution corresponding to a family of underlying density dependent CTMCs.
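The flavour of such a fluid approximation can be shown with a toy two-state component population: forward-Euler integration of the mean-field ODEs. This is a generic stand-in for the ODEs a PEPA model compiles to, not the paper's mapping semantics:

```python
def fluid_approximation(rate_ab, rate_ba, n0, n1, t_end=50.0, dt=0.01):
    """Forward-Euler integration of the mean-field ODEs
        dx0/dt = -rate_ab * x0 + rate_ba * x1
        dx1/dt =  rate_ab * x0 - rate_ba * x1
    approximating a large population of identical two-state components
    (the deterministic limit of the underlying density-dependent CTMC)."""
    x0, x1 = float(n0), float(n1)
    t = 0.0
    while t < t_end:
        d0 = -rate_ab * x0 + rate_ba * x1
        x0, x1 = x0 + dt * d0, x1 - dt * d0   # total population conserved
        t += dt
    return x0, x1
```

As time tends to infinity the solution converges to the equilibrium split x0 : x1 = rate_ba : rate_ab, mirroring the consistency between the ODE limit and the CTMC's equilibrium distribution discussed above.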
We focus on sorting, which is the building block of many machine learning algorithms, and propose a novel distributed sorting algorithm, named Coded TeraSort, which substantially improves the execution time of the TeraSort benchmark in Hadoop MapReduce.
The key idea of Coded TeraSort is to impose structured redundancy in data, in order to enable in-network coding opportunities that overcome the data shuffling bottleneck of TeraSort.
We empirically evaluate the performance of the Coded TeraSort algorithm on Amazon EC2 clusters, and demonstrate that it achieves a 1.97x - 3.39x speedup over TeraSort for typical settings of interest.
Robots capable of participating in complex social interactions have shown great potential in a variety of applications.
As these robots grow more popular, it is essential to continuously evaluate the dynamics of the human-robot relationship.
One factor shown to have potential impacts on this critical relationship is the human projection of stereotypes onto social robots, a practice that is implicitly known to affect both developers and users of this technology.
As such, in this research, we wished to investigate the difference in participants' perceptions of the robot interaction if we removed stereotype priming.
This has not yet been a common practice in similar studies.
Given the stereotypes of emotions among ethnic groups, especially in the U.S., this study specifically sought to investigate the impact that robot "skin color" could potentially have on the human perception of a robot's emotional expressive behavior.
A between-subject experiment with 198 individuals was conducted.
The results showed no significant differences in the overall emotion classification or intensity ratings for the different robot skin colors.
These results lend credence to our hypothesis that when individuals are not primed with information related to human stereotypes, robots are evaluated based on functional attributes versus stereotypical attributes.
This provides some confidence that robots, if designed correctly, can potentially be used as a tool to override stereotype-based biases associated with human behavior.
Symmetry is an important composition feature characterized by similar sides within an image plane.
It plays a crucial role in recognizing man-made and natural objects in the world.
Recent symmetry detection approaches apply a smoothing kernel over different voting maps in the polar coordinate system to detect symmetry peaks, which splits the regions of symmetry-axis candidates in an inefficient way.
We propose a reliable voting representation based on weighted linear-directional kernel density estimation, to detect multiple symmetries over challenging real-world and synthetic images.
Experimental evaluation on two public datasets demonstrates the superior performance of the proposed algorithm in detecting global symmetry axes with respect to the major image shapes.
Enhancement and detection of 3D vessel-like structures has long been an open problem as most existing image processing methods fail in many aspects, including a lack of uniform enhancement between vessels of different radii and a lack of enhancement at the junctions.
Here, we propose a method based on mathematical morphology to enhance 3D vessel-like structures in biomedical images.
The proposed method, 3D bowler-hat transform, combines sphere and line structuring elements to enhance vessel-like structures.
The proposed method is validated on synthetic and real data and compared with state-of-the-art methods.
Our results show that the proposed method achieves a high-quality vessel-like structures enhancement in both synthetic and real biomedical images, and is able to cope with variations in vessels thickness throughout vascular networks while remaining robust at junctions.
Cooperative networking brings performance improvements to many problems in wireless networks, such as fading or delay due to slow stations.
However, when data is relayed via other nodes during cooperation, the network becomes more prone to attacks.
Since channel access is essential for cooperation, most attacks happen at the MAC layer.
One of the most critical attacks is denial of service (DoS), which causes cooperation failure.
Therefore, cooperative networks, like simple wireless LANs, must defend against DoS attacks.
In this article, we analyze the possible DoS attacks that can happen at the MAC layer of a WLAN.
Cooperative protocols must consider defenses against these attacks.
This article also provides a survey of available solutions to these attacks.
Finally, it describes their damage and cost, as well as how to handle these attacks while devising a cooperative MAC.
Salient object detection is a problem that has been considered in detail and many solutions proposed.
In this paper, we argue that work to date has addressed a problem that is relatively ill-posed.
Specifically, there is not universal agreement about what constitutes a salient object when multiple observers are queried.
This implies that some objects are more likely to be judged salient than others, and implies a relative rank exists on salient objects.
Initially, we present a novel deep learning solution based on a hierarchical representation of relative saliency and stage-wise refinement.
Furthermore, we present data, analysis and benchmark baseline results towards addressing the problem of salient object ranking.
Methods for deriving suitable ranked salient object instances are presented, along with metrics suitable to measuring algorithm performance.
In addition, we show how a derived dataset can be successively refined to provide cleaned results that correlate well with pristine ground truth.
Finally, we provide a comparison among prevailing algorithms that address salient object ranking or detection to establish initial baselines.
As digital goods and services become an integral part of modern day society, the demand for a standardized and ubiquitous form of digital currency increases.
And it is not just about digital goods; the adoption of electronic and mobile commerce has not reached the expected level in all parts of the globe.
One of the main reasons behind this is the lack of a universal digital and virtual currency.
Many countries in the world have failed to realize the potential of e-commerce, let alone m-commerce, because of rigid financial regulations and the apparent disorientation and gap between monetary stakeholders across borders and continents.
Digital currency, which is internet-based, issued by non-banks, and circulated within a certain range of networks, has brought a significant impact on the development of e-commerce.
The research and analysis of this paper focus on the feasibility of operating a digital currency and its economic implications.
Problems of designing parallel switching systems that provide spatial switching of packets arriving at random times are discussed.
Results of modeling the switching system as a queueing (mass service) system are presented.
One problem with load test quality, almost always overlooked, is the potential for the load generator's user thread pool to sync up and dispatch queries in bunches rather than independently from each other like real users initiate their requests.
A spiky launch pattern misrepresents workload flow as well as yields erroneous application response time statistics.
This paper describes what a real user request timing pattern looks like, illustrates how to identify it in the load generation environment, and exercises a free downloadable tool which measures how well the load generator is mimicking the timing pattern of real web user requests.
Virtual machine live migration in cloud environments aims at reducing energy costs and increasing resource utilization.
However, its potential has not been fully explored because of simultaneous migrations that may cause user application performance degradation and network congestion.
Research efforts on live migration orchestration policies still mostly rely on system level metrics.
This work introduces an Application-aware Live Migration Architecture (ALMA) that selects suitable moments for migrations using application characterization data.
This characterization consists of recognizing resource-usage cycles via the Fast Fourier Transform.
From our experiments, live migration times were reduced by up to 74% for benchmarks and by up to 67% for real applications, when compared to migration policies with no application workload analysis.
Network data transfer during the live migration was reduced by up to 62%.
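The cycle-recognition step via the Fast Fourier Transform can be sketched as follows. This minimal NumPy example finds the dominant period of a resource-usage series and is only an illustration of the idea, not ALMA's implementation:

```python
import numpy as np

def dominant_period(usage, dt=1.0):
    """Return the dominant cycle length of a resource-usage series by
    locating the largest non-DC peak of its FFT magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(usage - np.mean(usage)))
    freqs = np.fft.rfftfreq(len(usage), d=dt)
    k = int(np.argmax(spectrum[1:]) + 1)   # skip the DC bin
    return 1.0 / freqs[k]
```

Knowing the dominant cycle lets a migration policy schedule live migrations during the low-usage phase of the cycle instead of at arbitrary moments.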
The robustness and security of biometric watermarking approaches can be improved by using multiple watermarks.
Multiple watermarking is proposed here to improve the security of biometric features and data.
When an imposter tries to create a spoofed biometric feature, the invisible biometric watermark features can provide appropriate protection to the multimedia data.
In this paper, a biometric watermarking technique with multiple biometric watermarks is proposed, in which biometric features of the fingerprint, face, iris, and signature are embedded in the image.
Before embedding, the fingerprint, iris, face, and signature features are extracted using Shen-Castan edge detection and Principal Component Analysis.
All these biometric watermark features are embedded into various mid-band frequency curvelet coefficients of the host image.
The fingerprint, iris, facial, and signature features are biometric characteristics of the individual, and they are used for cross verification and copyright protection if any manipulation occurs.
The proposed technique is fragile: features cannot be extracted from the watermarked image when an imposter tries to remove the watermark features illegally.
It can be used for multiple copyright authentication and verification.
Teens are using mobile devices for an increasing number of activities.
Smartphones and a variety of mobile apps for communication, entertainment, and productivity have become an integral part of their lives.
This mobile phone use has evolved rapidly as technology has changed, and thus studies from even 2 or 3 years ago may not reflect new patterns and practices now that smartphones have become more sophisticated.
In order to understand current teens' practices around smartphone use, we conducted a two-week, mixed-methods study with 14 diverse teens.
Through voicemail diaries, interviews, and real world usage data from a logging application installed on their smartphones, we developed an understanding of the types of apps used by teens, when they use these apps, and their reasons for using specific apps in particular situations.
We found that the teens in our study used their smartphones for an average of almost 3 hours per day and that two-thirds of all app use involved interacting with an average of almost 10 distinct communications applications.
From our study data, we highlight key implications for the design of future mobile apps or services, specifically new social and communications-related applications that allow teens to maintain desired levels of privacy and permanence on the content that they share.
SAML assertions are becoming a popular method for passing authentication and authorisation information between identity providers and consumers using various single sign-on protocols.
However their practical security strongly depends on correct implementation, especially on the consumer side.
Somorovsky and others have demonstrated a number of XML signature related vulnerabilities in SAML assertion validation frameworks.
This article demonstrates how bad library documentation and examples can lead to vulnerable consumer code and how this can be avoided.
We propose a Visual-SLAM based localization and navigation system for service robots.
Our system is built on top of the ORB-SLAM monocular system but extended by the inclusion of wheel odometry in the estimation procedures.
As a case study, the proposed system is validated using the Pepper robot, whose short-range LIDARs and RGB-D camera do not allow the robot to self-localize in large environments.
The localization system is tested in navigation tasks using Pepper in two different environments: a medium-size laboratory, and a large-size hall.
Software-Defined Networking (SDN) is a novel networking paradigm that provides enhanced programming abilities, which can be used to solve traditional security challenges on the basis of more efficient approaches.
The most important element in the SDN paradigm is the controller, which is responsible for managing the flows of each corresponding forwarding element (switch or router).
Flow statistics provided by the controller are considered to be useful information that can be used to develop a network-based intrusion detection system.
Therefore, in this paper, we propose a 5-level hybrid classification system based on flow statistics in order to attain an improvement in the overall accuracy of the system.
For the first level, we employ the k-Nearest Neighbor approach (kNN); for the second level, we use the Extreme Learning Machine (ELM); and for the remaining levels, we utilize the Hierarchical Extreme Learning Machine (H-ELM) approach.
In comparison with conventional supervised machine learning algorithms based on the NSL-KDD benchmark dataset, the experimental study showed that our system achieves the highest level of accuracy (84.29%).
Therefore, our system presents an efficient approach for intrusion detection in SDNs.
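The first-level kNN classifier operating on flow-statistics vectors can be illustrated with a minimal NumPy implementation. This is a generic majority-vote kNN, not the paper's tuned 5-level pipeline:

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among the k training flows whose
    flow-statistics vectors are closest in Euclidean distance."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]
```

In a hybrid design like the one described, flows the first level classifies with low confidence would be passed on to the ELM and H-ELM levels for refinement.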
Machine Learning has been a big success story during the AI resurgence.
One particular stand out success relates to learning from a massive amount of data.
In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition for utilizing knowledge whenever it is available or can be created purposefully.
In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex, (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media.
What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques.
Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques.
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes.
The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions.
The model has been implemented, and it is able to understand a broad range of spatial referring expressions.
We describe our implementation of word level visually-grounded semantics and their embedding in a compositional parsing framework.
The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases.
In an analysis of the system's successes and failures we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.
We examine the supervised learning problem in its continuous setting and give a general optimality condition through techniques of functional analysis and the calculus of variations.
This enables us to solve the optimality condition for the desired function u numerically and make several comparisons with other widely utilized supervised learning models.
We employ the accuracy and area under the receiver operating characteristic curve as metrics of the performance.
Finally, three analyses are conducted based on these two metrics, in which we compare the models and conclude whether or not our method is competitive.
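The two metrics named above have simple closed forms; the following sketch (illustrative only, not the paper's code) computes accuracy directly and the area under the ROC curve via the Mann-Whitney rank statistic:

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of correctly predicted labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def auc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney rank statistic
    (ties broken arbitrarily; a full implementation averages tied ranks)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    ranks = scores.argsort().argsort() + 1.0   # ranks starting at 1
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    rank_sum = ranks[y_true == 1].sum()
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
print(accuracy(y, [0, 1, 0, 1]), auc(y, s))  # → 0.5 0.75
```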
As our ground transportation infrastructure modernizes, the large amount of data being measured, transmitted, and stored motivates an analysis of the privacy aspect of these emerging cyber-physical technologies.
In this paper, we consider privacy in the routing game, where the origins and destinations of drivers are considered private.
This is motivated by the fact that this spatiotemporal information can easily be used as the basis for inferences about a person's activities.
More specifically, we consider the differential privacy of the mapping from the amount of flow for each origin-destination pair to the traffic flow measurements on each link of a traffic network.
We use a stochastic online learning framework for the population dynamics, which is known to converge to the Nash equilibrium of the routing game.
We analyze the sensitivity of this process and provide theoretical guarantees on the convergence rates as well as differential privacy values for these models.
We confirm these with simulations on a small example.
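The abstract above analyzes the differential privacy of the learning dynamics themselves; for intuition, the standard way to make a single numeric query such as per-link flow counts ε-differentially private is the Laplace mechanism, sketched here (a generic textbook mechanism, not the paper's construction; the flow values are hypothetical):

```python
import numpy as np

def laplace_mechanism(values, sensitivity, epsilon, rng):
    """Add Laplace(sensitivity/epsilon) noise to each measurement, giving
    epsilon-differential privacy for a query with the given L1 sensitivity."""
    scale = sensitivity / epsilon
    return values + rng.laplace(0.0, scale, size=values.shape)

rng = np.random.default_rng(0)
link_flows = np.array([120.0, 80.0, 45.0])   # hypothetical per-link counts
private = laplace_mechanism(link_flows, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Smaller ε means a larger noise scale and hence stronger privacy at the cost of accuracy.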
Recognizing an object's material can inform a robot on the object's fragility or appropriate use.
To estimate an object's material during manipulation, many prior works have explored the use of haptic sensing.
In this paper, we explore a technique for robots to estimate the materials of objects using spectroscopy.
We demonstrate that spectrometers provide several benefits for material recognition, including fast response times and accurate measurements with low noise.
Furthermore, spectrometers do not require direct contact with an object.
To explore this, we collected a dataset of spectral measurements from two commercially available spectrometers during which a robotic platform interacted with 50 flat material objects, and we show that a neural network model can accurately analyze these measurements.
Due to the similarity between consecutive spectral measurements, our model achieved a material classification accuracy of 94.6% when given only one spectral sample per object.
Similar to prior works with haptic sensors, we found that generalizing material recognition to new objects posed a greater challenge, for which we achieved an accuracy of 79.1% via leave-one-object-out cross-validation.
Finally, we demonstrate how a PR2 robot can leverage spectrometers to estimate the materials of everyday objects found in the home.
From this work, we find that spectroscopy poses a promising approach for material classification during robotic manipulation.
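As a toy illustration of spectral material recognition, the sketch below classifies noisy synthetic "spectra" with a nearest-centroid rule; the reflectance curves, material names, and noise level are all invented stand-ins for the paper's spectrometer data and neural network model:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical synthetic spectra: each material has a characteristic
# reflectance curve; measurements are noisy copies of it.
wavelengths = np.linspace(400, 1000, 64)
prototypes = {
    "wood":    np.exp(-((wavelengths - 600) / 120.0) ** 2),
    "plastic": np.exp(-((wavelengths - 800) / 80.0) ** 2),
}

def sample(material, n):
    base = prototypes[material]
    return base + 0.05 * rng.standard_normal((n, base.size))

centroids = {m: sample(m, 20).mean(axis=0) for m in prototypes}

def classify(spectrum):
    """Nearest-centroid material classifier (a stand-in for the paper's
    neural network)."""
    return min(centroids, key=lambda m: np.linalg.norm(spectrum - centroids[m]))

print(classify(sample("plastic", 1)[0]))
```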
This paper presents a multidisciplinary task approach for assessing the impact of artificial intelligence on the future of work.
We provide definitions of a task from two main perspectives: socio-economic and computational.
We propose to explore ways in which we can integrate or map these perspectives, and link them with the skills or capabilities required by them, for humans and AI systems.
Finally, we argue that in order to understand the dynamics of tasks, we have to explore the relevance of autonomy and generality of AI systems for the automation or alteration of the workplace.
We describe computer algorithms that produce the complete set of isohedral tilings by n-omino or n-iamond tiles in which the tiles are fundamental domains and the tilings have 3-, 4-, or 6-fold rotational symmetry.
The symmetry groups of such tilings are of types p3, p31m, p4, p4g, and p6.
There are no isohedral tilings with symmetry groups p3m1, p4m, or p6m that have polyominoes or polyiamonds as fundamental domains.
We display the algorithms' output and give enumeration tables for small values of n. This expands on our earlier works (Fukuda et al. 2006, 2008).
Automatic storytelling is challenging since it requires generating long, coherent natural language to describe a sensible sequence of events.
Despite considerable efforts on automatic story generation in the past, prior work either is restricted in plot planning, or can only generate stories in a narrow domain.
In this paper, we explore open-domain story generation that writes stories given a title (topic) as input.
We propose a plan-and-write hierarchical generation framework that first plans a storyline, and then generates a story based on the storyline.
We compare two planning strategies.
The dynamic schema interweaves story planning and its surface realization in text, while the static schema plans out the entire storyline before generating stories.
Experiments show that with explicit storyline planning, the generated stories are more diverse, coherent, and on topic than those generated without creating a full plan, according to both automatic and human evaluations.
Identifying user's identity is a key problem in many data mining applications, such as product recommendation, customized content delivery and criminal identification.
Given a set of accounts from the same or different social network platforms, user identification attempts to identify all accounts belonging to the same person.
A commonly used solution is to build the relationship among different accounts by exploring their collective patterns, e.g., user profile, writing style, similar comments.
However, this kind of method does not work well in many practical scenarios, since the information users post explicitly may be false for various reasons.
In this paper, we re-inspect the user identification problem from a novel perspective, i.e., identifying user's identity by matching his/her cameras.
The underlying assumption is that multiple accounts belonging to the same person contain the same or similar camera fingerprint information.
The proposed framework, called User Camera Identification (UCI), is based on camera fingerprints, which takes fully into account the problems of multiple cameras and reposting behaviors.
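The camera-fingerprint assumption can be illustrated with a minimal sketch: extract a noise residual from each image and correlate residuals, so that images from the same (synthetic) sensor match even when their scene content differs. The box-blur residual and multiplicative fingerprints below are crude stand-ins for real PRNU extraction:

```python
import numpy as np

rng = np.random.default_rng(7)

def residual(img):
    """Noise residual: image minus a crude 3x3 box blur (real fingerprint
    pipelines use much stronger denoisers; this is illustrative)."""
    pad = np.pad(img, 1, mode="edge")
    blur = sum(pad[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0
    return img - blur

def ncc(a, b):
    """Normalized cross-correlation between two residuals."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two synthetic "cameras" = two fixed multiplicative sensor fingerprints.
fp1 = 1 + 0.02 * rng.standard_normal((64, 64))
fp2 = 1 + 0.02 * rng.standard_normal((64, 64))

g = np.linspace(0.0, 1.0, 64)
scene_a = 100 + 80 * np.outer(g, g)          # three smooth, distinct scenes
scene_b = 150 - 60 * np.outer(g[::-1], g)
scene_c = 120 + 50 * np.outer(g, g[::-1])

same = ncc(residual(scene_a * fp1), residual(scene_b * fp1))  # same camera
diff = ncc(residual(scene_a * fp1), residual(scene_c * fp2))  # other camera
print(same, diff)
```

A high correlation for the same-camera pair and near-zero for the cross-camera pair is the signal the UCI framework builds on.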
Dynamic software development organizations optimize the usage of resources to deliver products in the specified time with the requirements fulfilled.
This requires preventing or repairing faults as quickly as possible.
In this paper, an approach for predicting run-time errors in Java is introduced.
The paper is concerned with faults due to inheritance and violations of Java constraints.
The proposed fault prediction model is designed to separate the faulty classes in the field of software testing.
Separated faulty classes are classified according to the fault occurring in the specific class.
The results are presented by clustering the faults in the class.
This model can be used for predicting software reliability.
Reducing network latency in mobile applications is an effective way of improving the mobile user experience and has tangible economic benefits.
This paper presents PALOMA, a novel client-centric technique for reducing the network latency by prefetching HTTP requests in Android apps.
Our work leverages string analysis and callback control-flow analysis to automatically instrument apps using PALOMA's rigorous formulation of scenarios that address "what" and "when" to prefetch.
PALOMA has been shown to yield significant runtime savings (several hundred milliseconds per prefetchable HTTP request), both when applied on a reusable evaluation benchmark we have developed and on real applications.
This work studies the problem of content-based image retrieval, specifically, texture retrieval.
It focuses on feature extraction and similarity measure for texture images.
Our approach employs a recently developed method, the so-called Scattering transform, for the process of feature extraction in texture retrieval.
It shares a distinctive property of providing a robust representation, which is stable with respect to spatial deformations.
Recent work has demonstrated its capability for texture classification, and hence its promise for the problem of texture retrieval.
Moreover, we adopt a common approach of measuring the similarity of textures by comparing the subband histograms of a filterbank transform.
To this end we derive a similarity measure based on the popular Bhattacharyya Kernel.
Despite the popularity of describing histograms using parametrized probability density functions, such as the Generalized Gaussian Distribution, it is unfortunately not applicable for describing most of the Scattering transform subbands, due to the complex modulus performed on each one of them.
In this work, we propose to use the Weibull distribution to model the Scattering subbands of descendant layers.
Our numerical experiments demonstrate the effectiveness of the proposed approach in comparison with several state-of-the-art methods.
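The two main ingredients above, fitting a Weibull model to subband magnitudes and comparing histograms with a Bhattacharyya-type measure, can be sketched as follows. The method-of-moments Weibull fit and the synthetic "subband" data are illustrative simplifications, not the paper's exact procedure:

```python
import math
import numpy as np

def fit_weibull(x):
    """Method-of-moments Weibull fit: pick the shape k whose theoretical
    coefficient of variation matches the sample's, then solve the scale."""
    cv = x.std() / x.mean()
    ks = np.linspace(0.3, 10.0, 2000)
    cvs = np.array([math.sqrt(math.gamma(1 + 2 / k) / math.gamma(1 + 1 / k) ** 2 - 1)
                    for k in ks])
    k = float(ks[np.abs(cvs - cv).argmin()])
    scale = float(x.mean() / math.gamma(1 + 1 / k))
    return k, scale

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms."""
    return float(np.sqrt(p * q).sum())

rng = np.random.default_rng(0)
sub = 2.0 * rng.weibull(1.5, 5000)          # synthetic "subband" magnitudes
k, lam = fit_weibull(sub)                   # should recover k~1.5, scale~2

h1, _ = np.histogram(sub, bins=32, range=(0, 8))
h2, _ = np.histogram(2.0 * rng.weibull(1.5, 5000), bins=32, range=(0, 8))
p, q = h1 / h1.sum(), h2 / h2.sum()
sim = bhattacharyya(p, q)                   # near 1 for matching textures
```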
Grounding textual phrases in visual content is a meaningful yet challenging problem with various potential applications such as image-text inference or text-driven multimedia interaction.
Most existing methods adopt a supervised learning mechanism that requires pixel-level ground truth during training.
However, fine-grained level ground-truth annotation is quite time-consuming and severely narrows the scope for more general applications.
In this extended abstract, we explore methods to flexibly localize image regions from a top-down signal (in the form of a one-hot label or natural language) with a weakly supervised attention learning mechanism.
In our model, two types of modules are utilized: a backbone module for visual feature capturing, and an attentive module generating maps based on regularized bilinear pooling.
We construct the model in an end-to-end fashion which is trained by encouraging the spatial attentive map to shift and focus on the region that consists of the best matched visual features with the top-down signal.
We demonstrate the preliminary yet promising results on a testbed that is synthesized with multi-label MNIST data.
We introduce CCN-RAMP (Routing to Anchors Matching Prefixes), a new approach to content-centric networking.
CCN-RAMP offers all the advantages of Named Data Networking (NDN) and Content-Centric Networking (CCNx), but eliminates the need to either use Pending Interest Tables (PIT) or look up large Forwarding Information Bases (FIB) listing name prefixes in order to forward Interests.
CCN-RAMP uses small forwarding tables listing anonymous sources of Interests and the locations of name prefixes.
Such tables are immune to Interest-flooding attacks and are smaller than the FIBs used to list IP address ranges in the Internet.
We show that no forwarding loops can occur with CCN-RAMP, and that Interests flow over the same routes that NDN and CCNx would maintain using large FIBs.
The results of simulation experiments comparing NDN with CCN-RAMP based on ndnSIM show that CCN-RAMP requires forwarding state that is orders of magnitude smaller than what NDN requires, and attains even better performance.
We demonstrate an FPGA implementation of a parallel and reconfigurable architecture for sparse neural networks, capable of on-chip training and inference.
The network connectivity uses pre-determined, structured sparsity to significantly reduce complexity by lowering memory and computational requirements.
The architecture uses a notion of edge-processing, leading to efficient pipelining and parallelization.
Moreover, the device can be reconfigured to trade off resource utilization with training time to fit networks and datasets of varying sizes.
The combined effects of complexity reduction and easy reconfigurability enable significantly greater exploration of network hyperparameters and structures on-chip.
As proof of concept, we show implementation results on an Artix-7 FPGA.
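The memory saving from pre-determined, structured sparsity can be illustrated with a fixed fan-in layer: each output neuron stores only its own small set of input indices and weights, and a gather-multiply-sum over those edges reproduces the masked dense layer exactly. The sizes below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)

n_in, n_out, fan_in = 16, 8, 4   # each output connects to only 4 inputs

# Pre-determined, structured sparsity: connectivity fixed up front, so only
# fan_in weights per neuron are stored (here a 4x memory saving).
conn = np.stack([rng.choice(n_in, fan_in, replace=False) for _ in range(n_out)])
w = rng.standard_normal((n_out, fan_in))

def sparse_forward(x):
    """Edge-processing style forward pass: gather each neuron's inputs,
    multiply by its private weights, and sum along the edges."""
    return (w * x[conn]).sum(axis=1)

# Equivalence check against a dense layer carrying the same sparsity mask.
dense = np.zeros((n_out, n_in))
for i in range(n_out):
    dense[i, conn[i]] = w[i]

x = rng.standard_normal(n_in)
assert np.allclose(sparse_forward(x), dense @ x)
```

Storing `conn` and `w` instead of `dense` is what lowers the memory and computation requirements on-chip.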
This work focuses on building language models (LMs) for code-switched text.
We propose two techniques that significantly improve these LMs: 1) a novel recurrent neural network unit with dual components that focus on each language in the code-switched text separately; 2) pretraining the LM using synthetic text from a generative model estimated using the training data.
We demonstrate the effectiveness of our proposed techniques by reporting perplexities on a Mandarin-English task and derive significant reductions in perplexity.
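Perplexity, the evaluation metric reported above, is the exponential of the average negative log-probability the model assigns to each test token; a minimal computation:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the average negative log-probability assigned
    to each token of the test text."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# Sanity check: a model uniform over a 4-word vocabulary has perplexity 4.
uniform = [math.log(0.25)] * 10
print(perplexity(uniform))  # → 4.0
```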
Activity detection is a fundamental problem in computer vision.
Detecting activities of different temporal scales is particularly challenging.
In this paper, we propose the contextual multi-scale region convolutional 3D network (CMS-RC3D) for activity detection.
To deal with the inherent temporal scale variability of activity instances, the temporal feature pyramid is used to represent activities of different temporal scales.
On each level of the temporal feature pyramid, an activity proposal detector and an activity classifier are learned to detect activities of specific temporal scales.
Temporal contextual information is fused into activity classifiers for better recognition.
More importantly, the entire model at all levels can be trained end-to-end.
Our CMS-RC3D detector can deal with activities at all temporal scale ranges with only a single pass through the backbone network.
We test our detector on two public activity detection benchmarks, THUMOS14 and ActivityNet.
Extensive experiments show that the proposed CMS-RC3D detector outperforms state-of-the-art methods on THUMOS14 by a substantial margin and achieves comparable results on ActivityNet despite using a shallow feature extractor.
We address the problem of using hand-drawn sketches to create exaggerated deformations to faces in videos, such as enlarging the shape or modifying the position of eyes or mouth.
This task is formulated as a 3D face model reconstruction and deformation problem.
We first recover the facial identity and expressions from the video by fitting a face morphable model for each frame.
At the same time, user's editing intention is recognized from input sketches as a set of facial modifications.
Then a novel identity deformation algorithm is proposed to transfer these facial deformations from 2D space to the 3D facial identity directly while preserving the facial expressions.
After an optional stage for further refining the 3D face model, these changes are propagated to the whole video with the modified identity.
Both the user study and experimental results demonstrate that our sketching framework can help users effectively edit facial identities in videos, while high consistency and fidelity are ensured at the same time.
This article extends the Generalized Asymptotic Equipartition Property of Networked Data Structures to cover the Wireless Sensor Network modelled as a coloured geometric random graph (CGRG).
The main techniques used to prove this result remain large deviation principles for properly defined empirical measures on CGRGs.
As a motivation for this article, we apply our results to data from a wireless sensor network for monitoring the water quality of a lake.
We consider generation and comprehension of natural language referring expression for objects in an image.
Unlike generic "image captioning" which lacks natural standard evaluation criteria, quality of a referring expression may be measured by the receiver's ability to correctly infer which object is being described.
Following this intuition, we propose two approaches to utilize models trained for comprehension task to generate better expressions.
First, we use a comprehension module trained on human-generated expressions, as a "critic" of referring expression generator.
The comprehension module serves as a differentiable proxy of human evaluation, providing training signal to the generation module.
Second, we use the comprehension module in a generate-and-rerank pipeline, which chooses from candidate expressions generated by a model according to their performance on the comprehension task.
We show that both approaches lead to improved referring expression generation on multiple benchmark datasets.
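The second approach, generate-and-rerank, reduces to choosing the candidate expression that a comprehension model scores highest. The sketch below uses a hand-written toy scorer in place of a trained comprehension module; the attributes and candidates are invented for illustration:

```python
def comprehension_score(expression, target, distractors):
    """Toy comprehension proxy: +1 for each target attribute mentioned,
    -1 for each mentioned attribute shared with a distractor object
    (a real system uses a trained comprehension network instead)."""
    score = 0
    for word in expression.split():
        if word in target:
            score += 1
        if any(word in d for d in distractors):
            score -= 1
    return score

target = {"red", "mug", "left"}
distractors = [{"red", "plate"}, {"blue", "mug"}]
candidates = ["red mug", "mug on the left", "red mug on the left"]

# Rerank: keep the candidate the comprehension scorer ranks highest.
best = max(candidates,
           key=lambda e: comprehension_score(e, target, distractors))
print(best)
```

"red mug" scores 0 (both words are shared with distractors), so a discriminative candidate mentioning "left" is selected instead.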
The task of classifying videos of natural dynamic scenes into appropriate classes has gained a lot of attention in recent years.
The problem especially becomes challenging when the camera used to capture the video is dynamic.
In this paper, we analyse the performance of statistical aggregation (SA) techniques on various pre-trained convolutional neural network(CNN) models to address this problem.
The proposed approach works by extracting CNN activation features for a number of frames in a video and then uses an aggregation scheme in order to obtain a robust feature descriptor for the video.
We show through results that the proposed approach performs better than the state of the art on the Maryland and YUPenn datasets.
The final descriptor obtained is powerful enough to distinguish among dynamic scenes and is even capable of addressing the scenario where the camera motion is dominant and the scene dynamics are complex.
Further, this paper shows an extensive study on the performance of various aggregation methods and their combinations.
We compare the proposed approach with other dynamic scene classification algorithms on two publicly available datasets - Maryland and YUPenn to demonstrate the superior performance of the proposed approach.
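The statistical-aggregation step described above can be sketched as concatenating per-dimension statistics of the frame-level features into one fixed-length video descriptor; the feature dimensions and frame count here are arbitrary stand-ins for real CNN activations:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical per-frame CNN activations: 30 frames x 128-dim features.
frame_feats = rng.standard_normal((30, 128))

def aggregate(feats):
    """Statistical aggregation: concatenate the per-dimension mean, standard
    deviation, minimum and maximum over time into one video descriptor."""
    stats = [feats.mean(0), feats.std(0), feats.min(0), feats.max(0)]
    return np.concatenate(stats)

descriptor = aggregate(frame_feats)
print(descriptor.shape)  # (512,)
```

The resulting descriptor is independent of the number of frames, which is what makes it usable for videos of varying length.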
A domain adaptation method for urban scene segmentation is proposed in this work.
We develop a fully convolutional tri-branch network, where two branches assign pseudo labels to images in the unlabeled target domain while the third branch is trained with supervision based on images in the pseudo-labeled target domain.
The re-labeling and re-training processes alternate.
With this design, the tri-branch network learns target-specific discriminative representations progressively and, as a result, the cross-domain capability of the segmenter improves.
We evaluate the proposed network on large-scale domain adaptation experiments using both synthetic (GTA) and real (Cityscapes) images.
It is shown that our solution achieves the state-of-the-art performance and it outperforms previous methods by a significant margin.
Quality of service (QoS) provisioning in next-generation mobile communications systems entails a deep understanding of the delay performance.
The delay in wireless networks is strongly affected by the traffic arrival process and the service process, which in turn depends on the medium access protocol and the signal-to-interference-plus-noise ratio (SINR) distribution.
In this work, we characterize the conditional distribution of the service process given the point process in Poisson bipolar networks.
We then provide an upper bound on the delay violation probability combining tools from stochastic network calculus and stochastic geometry.
Furthermore, we analyze the delay performance under statistical queueing constraints using the effective capacity formulation.
The impact of QoS requirements, network geometry and link distance on the delay performance is identified.
Our results provide useful insights for guaranteeing stringent delay requirements in large wireless networks.
The web provides a rich, open-domain environment with textual, structural, and spatial properties.
We propose a new task for grounding language in this environment: given a natural language command (e.g., "click on the second article"), choose the correct element on the web page (e.g., a hyperlink or text box).
We collected a dataset of over 50,000 commands that capture various phenomena such as functional references (e.g., "find who made this site"), relational reasoning (e.g., "article by john"), and visual reasoning (e.g., "top-most article").
We also implemented and analyzed three baseline models that capture different phenomena present in the dataset.
Many real-world optimization problems require significant resources for objective function evaluations.
This is a challenge to evolutionary algorithms, as it limits the number of available evaluations.
One solution is surrogate models, which replace the expensive objective.
A particular issue in this context is hierarchical variables.
Hierarchical variables only influence the objective function if other variables satisfy some condition.
We study how this kind of hierarchical structure can be integrated into the model based optimization framework.
We discuss an existing kernel and propose alternatives.
An artificial test function is used to investigate how different kernels and assumptions affect model quality and search performance.
Active range sensing using structured-light is the most accurate and reliable method for obtaining 3D information.
However, most of the work has been limited to range sensing of static objects, and range sensing of dynamic (moving or deforming) objects has been investigated recently only by a few researchers.
Sinusoidal structured-light is one of the well-known optical methods for 3D measurement.
In this paper, we present a novel method for rapid high-resolution range imaging using color sinusoidal pattern.
We consider the real-world problem of nonlinearity and color-band crosstalk in the color light projector and color camera, and present methods for accurate recovery of color-phase.
For high-resolution ranging, we use high-frequency patterns and describe new unwrapping algorithms for reliable range recovery.
The experimental results demonstrate the effectiveness of our methods.
The upcoming big data era is likely to demand tremendous computation and storage resources for communications.
By pushing computation and storage to network edges, fog radio access networks (Fog-RAN) can effectively increase network throughput and reduce transmission latency.
Furthermore, we can exploit the benefits of cache enabled architecture in Fog-RAN to deliver contents with low latency.
Radio access units (RAUs) need content delivery from fog servers through wireline links whereas multiple mobile devices acquire contents from RAUs wirelessly.
This work proposes a unified low-rank matrix completion (LRMC) approach to solving the content delivery problem in both wireline and wireless parts of Fog-RAN.
To attain a low caching latency, we present a high precision approach with Riemannian trust-region method to solve the challenging LRMC problem by exploiting the quotient manifold geometry of fixed-rank matrices.
Numerical results show that the new approach has a faster convergence rate, is able to achieve optimal results, and outperforms other state-of-the-art algorithms.
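For intuition about the fixed-rank matrix-completion formulation, the sketch below fits a low-rank factorization to the observed entries of a partially known matrix by plain gradient descent; the paper instead uses a Riemannian trust-region method on the quotient manifold, so this is only a minimal stand-in, and the "popularity" matrix is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth rank-1 "content popularity" matrix with a partial observation.
a = rng.uniform(0.5, 1.5, 6)
b = rng.uniform(0.5, 1.5, 6)
M = np.outer(a, b)
mask = rng.random(M.shape) < 0.7            # ~70% of entries observed

# Fixed-rank factorization M ~ U V^T, fitted by gradient descent on the
# squared error over observed entries only.
r = 2
U = 0.1 * rng.standard_normal((6, r))
V = 0.1 * rng.standard_normal((6, r))
for _ in range(5000):
    E = mask * (U @ V.T - M)                # residual on observed entries
    U, V = U - 0.05 * E @ V, V - 0.05 * E.T @ U

fit_err = np.abs(mask * (U @ V.T - M)).max()
```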
This work presents an efficient method to solve a class of continuous-time, continuous-space stochastic optimal control problems of robot motion in a cluttered environment.
The method builds upon a path integral representation of the stochastic optimal control problem that allows computation of the optimal solution through sampling and estimation process.
As this sampling process often leads to a local minimum, especially when the state space is highly non-convex due to the obstacle field, we present an efficient method to alleviate this issue by devising a topological motion planning algorithm.
Combined with a receding-horizon scheme in execution of the optimal control solution, the proposed method can generate a dynamically feasible and collision-free trajectory while reducing concern about local optima.
Illustrative numerical examples are presented to demonstrate the applicability and validity of the proposed approach.
The Software Defined Networking (SDN) paradigm decouples control and data planes, offering high programmability and a global view of the network.
However, it is a challenge not only to provide security in these next-generation networks, but also to allow network attacks to be subjected to incident-response and forensic treatment procedures.
This paper proposes the implementation of flexible mechanisms of monitoring and treatment of security events categorized per type of attack and associated with whitelist and blacklist resources by means of the SDN controller programmability.
The resources to perform intrusion and attack analysis are validated by means of a real SDN/OpenFlow testbed.
In this paper, we propose a useful replacement for quicksort-style utility functions.
The replacement is called Symmetry Partition Sort, which has essentially the same principle as Proportion Extend Sort.
The maximal difference between them is that the new algorithm always places already partially sorted inputs (used as a basis for the proportional extension) on both ends when entering the partition routine.
This is advantageous to speeding up the partition routine.
The library function based on the new algorithm is more attractive than Psort, a library function introduced in 2004.
Its implementation mechanism is simple, its source code is clearer, and it is faster, with an O(n log n) performance guarantee.
Both its robustness and adaptivity are better, making it competitive as a library function.
Attacks against the control processor of a power-grid system, especially zero-day attacks, can be catastrophic.
Earlier detection of the attacks can prevent further damage.
However, detecting zero-day attacks can be challenging because they have no known code and have unknown behavior.
In order to address the zero-day attack problem, we propose a data-driven defense by training a temporal deep learning model, using only normal data from legitimate processes that run daily in these power-grid systems, to model the normal behavior of the power-grid controller.
Then, we can quickly find malicious codes running on the processor, by estimating deviations from the normal behavior with a statistical test.
Experimental results on a real power-grid controller show that we can detect anomalous behavior with over 99.9% accuracy and nearly zero false positives.
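The defense above, learning normal behavior and flagging statistical deviations, can be sketched with a deliberately simple stand-in: a linear autoregressive model of a "normal" control signal in place of the temporal deep learning model, and a threshold test on the one-step prediction error. The signal and the injected deviation are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training signal: a hypothetical stand-in for per-cycle
# measurements of a legitimate control process.
normal = 10.0 + np.sin(np.arange(500) * 0.2) + 0.1 * rng.standard_normal(500)

# Model of normal behavior: x[t] ~ w0 + w1*x[t-1] + w2*x[t-2], least squares.
X = np.column_stack([np.ones(498), normal[1:-1], normal[:-2]])
y = normal[2:]
w, *_ = np.linalg.lstsq(X, y, rcond=None)
resid_std = (y - X @ w).std()

def anomaly_flags(signal, k=5.0):
    """Statistical test: flag any step whose one-step prediction error
    exceeds k standard deviations of the training residual."""
    Xs = np.column_stack([np.ones(len(signal) - 2), signal[1:-1], signal[:-2]])
    err = np.abs(signal[2:] - Xs @ w)
    return err > k * resid_std

clean = 10.0 + np.sin(np.arange(100) * 0.2) + 0.1 * rng.standard_normal(100)
attacked = clean.copy()
attacked[50] += 5.0                          # injected malicious deviation
```

The attacked trace is flagged at the deviation while the clean trace passes, mirroring the "normal-data-only" training regime the abstract describes.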
Deep neural networks excel at function approximation, yet they are typically trained from scratch for each new function.
On the other hand, Bayesian methods, such as Gaussian Processes (GPs), exploit prior knowledge to quickly infer the shape of a new function at test time.
Yet GPs are computationally expensive, and it can be hard to design appropriate priors.
In this paper we propose a family of neural models, Conditional Neural Processes (CNPs), that combine the benefits of both.
CNPs are inspired by the flexibility of stochastic processes such as GPs, but are structured as neural networks and trained via gradient descent.
CNPs make accurate predictions after observing only a handful of training data points, yet scale to complex functions and large datasets.
We demonstrate the performance and versatility of the approach on a range of canonical machine learning tasks, including regression, classification and image completion.
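The CNP structure (encode each context pair, aggregate order-invariantly, decode per target input) can be sketched as an untrained forward pass; all layer sizes are arbitrary, and a real CNP trains these weights by gradient descent on log-likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random (untrained) MLP weights; illustrative only."""
    return [(0.5 * rng.standard_normal((a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

encoder = mlp([2, 16, 8])    # encodes each (x, y) context pair
decoder = mlp([9, 16, 2])    # maps (representation, x_target) to mean, log-var

def cnp(context_x, context_y, target_x):
    pairs = np.column_stack([context_x, context_y])
    r = forward(encoder, pairs).mean(axis=0)       # permutation-invariant mean
    inp = np.column_stack([np.tile(r, (len(target_x), 1)), target_x])
    out = forward(decoder, inp)
    return out[:, 0], out[:, 1]                    # predictive mean, log-var

cx, cy = rng.standard_normal(5), rng.standard_normal(5)
tx = np.linspace(-1, 1, 7)
mu, logvar = cnp(cx, cy, tx)
```

Because the aggregation is a mean, the prediction is invariant to the order of the context points, one of the stochastic-process-like properties CNPs inherit.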
Multiple-antenna-based transmitter (TX) cooperation has been established as a promising tool towards avoiding, aligning, or shaping the interference resulting from aggressive spectral reuse.
However, the price paid in the form of feedback and exchange of channel state information (CSI) between cooperating devices is often underestimated in most existing methods.
In reality, feedback and information overhead threatens the practicality and scalability of TX cooperation approaches in dense networks.
Here we address a "Who needs to know what?" problem when it comes to CSI at cooperating transmitters.
A comprehensive answer to this question remains beyond our reach and the scope of this paper.
Nevertheless, recent results in this area suggest that CSI overhead can be contained even for large networks, provided the allocation of feedback to TXs is made non-uniform and properly dependent on the network's topology.
This paper provides a few hints toward solving the problem.
Humans have an unparalleled visual intelligence and can overcome visual ambiguities that machines currently cannot.
Recent works have shown that incorporating guidance from humans during inference for monocular viewpoint-estimation can help overcome difficult cases in which the computer-alone would have otherwise failed.
These hybrid intelligence approaches are hence gaining traction.
However, deciding what question to ask the human at inference time remains an unknown for these problems.
We address this question by formulating it as an Adviser Problem: can we learn a mapping from the input to a specific question to ask the human to maximize the expected positive impact to the overall task?
We formulate a solution to the adviser problem for viewpoint estimation using a deep network where the question asks for the location of a keypoint in the input image.
We show that by using the Adviser Network's recommendations, the model and the human together outperform the previous hybrid-intelligence state-of-the-art by 3.7%, and the computer-only state-of-the-art by 5.28% absolute.
In many multirobot applications, planning trajectories in a way to guarantee that the collective behavior of the robots satisfies a certain high-level specification is crucial.
Motivated by this problem, we introduce counting temporal logics---formal languages that enable concise expression of multirobot task specifications over possibly infinite horizons.
We first introduce a general logic called counting linear temporal logic plus (cLTL+), and propose an optimization-based method that generates individual trajectories such that satisfaction of a given cLTL+ formula is guaranteed when these trajectories are synchronously executed.
We then introduce a fragment of cLTL+, called counting linear temporal logic (cLTL), and show that a solution to planning problem with cLTL constraints can be obtained more efficiently if all robots have identical dynamics.
In the second part of the paper, we relax the synchrony assumption and discuss how to generate trajectories that can be asynchronously executed, while preserving the satisfaction of the desired cLTL+ specification.
In particular, we show that when the asynchrony between robots is bounded, the method presented in this paper can be modified to generate robust trajectories.
We demonstrate these ideas with an experiment and provide numerical results that showcase the scalability of the method.
Discovering automatically the semantic structure of tagged visual data (e.g. web videos and images) is important for visual data analysis and interpretation, enabling the machine intelligence for effectively processing the fast-growing amount of multi-media data.
However, this is non-trivial due to the need for jointly learning underlying correlations between heterogeneous visual and tag data.
The task is made more challenging by inherently sparse and incomplete tags.
In this work, we develop a method for modelling the inherent visual data concept structures based on a novel Hierarchical-Multi-Label Random Forest model capable of correlating structured visual and tag information so as to more accurately interpret the visual semantics, e.g. disclosing meaningful visual groups with similar high-level concepts, and recovering missing tags for individual visual data samples.
Specifically, our model exploits hierarchically structured tags of different semantic abstractness and multiple tag statistical correlations in addition to modelling visual and tag interactions.
As a result, our model is able to discover more accurate semantic correlations between textual tags and visual features, finally providing a favourable visual semantics interpretation even with highly sparse and incomplete tags.
We demonstrate the advantages of our proposed approach in two fundamental applications, visual data clustering and missing tag completion, on benchmarking video (i.e., TRECVID MED 2011) and image (i.e., NUS-WIDE) datasets.
As ontologies proliferate and automatic reasoners become more powerful, the problem of protecting sensitive information becomes more serious.
In particular, as facts can be inferred from other facts, it becomes increasingly likely that information included in an ontology, while not itself deemed sensitive, can be used to infer other sensitive information.
We first consider the problem of testing an ontology for safeness, defined as the impossibility of deriving any sensitive facts from it using a given collection of inference rules.
We then consider the problem of optimizing an ontology based on the criterion of making as much useful information as possible available without revealing any sensitive facts.
CT protocol design and quality control would benefit from automated tools to estimate the quality of generated CT images.
These tools could be used to identify erroneous CT acquisitions or refine protocols to achieve certain signal to noise characteristics.
This paper investigates blind estimation methods to determine global signal strength and noise levels in chest CT images.
Methods: We propose novel performance metrics corresponding to the accuracy of noise and signal estimation.
We implement and evaluate the noise estimation performance of six spatial- and frequency-based methods derived from conventional image filtering algorithms.
Algorithms were tested on patient data sets from whole-body repeat CT acquisitions performed with a higher and lower dose technique over the same scan region.
Results: The proposed performance metrics can evaluate the relative tradeoff of filter parameters and noise estimation performance.
The proposed automated methods tend to underestimate CT image noise at low-flux levels.
Initial application of the methodology suggests that anisotropic diffusion and wavelet-transform based filters provide the best noise estimates.
Furthermore, while the methodology does not provide accurate estimates of absolute noise levels, it can estimate relative changes and trends in noise levels.
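As a concrete illustration of blind noise estimation, a common wavelet-domain approach estimates the noise standard deviation robustly from diagonal detail coefficients via the median absolute deviation (MAD). This is a generic sketch of the idea, not one of the six evaluated methods:

```python
import numpy as np

def estimate_noise_sigma(image):
    """Blind noise estimate: robust sigma from diagonal Haar detail
    coefficients, using the median absolute deviation (MAD)."""
    img = np.asarray(image, dtype=float)
    # Diagonal detail of a single-level Haar transform (2x2 blocks);
    # smooth structure cancels, leaving mostly noise with variance sigma^2.
    d = (img[0::2, 0::2] - img[0::2, 1::2]
         - img[1::2, 0::2] + img[1::2, 1::2]) / 2.0
    # MAD -> sigma for Gaussian noise (0.6745 is the 0.75 normal quantile).
    return np.median(np.abs(d)) / 0.6745

rng = np.random.default_rng(0)
noisy = 100.0 + rng.normal(0.0, 5.0, size=(256, 256))
print(round(estimate_noise_sigma(noisy), 1))
```

On a synthetic flat image with sigma = 5 the estimate comes out close to 5; on real CT data, anatomy leaking into the detail band is what makes such blind estimates harder.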
Ad hoc network is a collection of wireless mobile nodes that dynamically form a temporary network without the use of any existing network infrastructure or centralized administration.
A cognitive radio is a radio that can change its transmitter parameters based on interaction with the environment in which it operates.
The basic idea of cognitive radio networks is that the unlicensed devices (cognitive radio users or secondary users) need to vacate the spectrum band once the licensed device (primary user) is detected.
Cognitive capability and reconfigurability are the key characteristics of cognitive radio.
Routing is an important issue in Mobile Cognitive Radio Ad Hoc Networks (MCRAHNs).
In this paper, we present a survey of routing protocols for mobile cognitive radio ad hoc networks.
The continuing expansion of Internet media consumption has increased traffic volumes, and hence congestion, on access links.
In response, both mobile and wireline ISPs must either increase capacity or perform traffic engineering over existing resources.
Unfortunately, provisioning timescales are long, the process is costly, and single-homing means operators cannot balance across the last mile.
Inspired by energy and transport networks, we propose demand-side management of users to reduce the impact of consumption patterns outpacing edge-network provisioning.
By directly affecting user behaviour through a range of incentives, our techniques enable resource management over shorter timescales than is possible in conventional networks.
Using survey data from 100 participants we explore the feasibility of introducing the principles of demand-side management in today's networks.
The Good is Blondie, a wandering gunman with a strong personal sense of honor.
The Bad is Angel Eyes, a sadistic hitman who always hits his mark.
The Ugly is Tuco, a Mexican bandit who is always looking out only for himself.
Against the backdrop of the BOWS contest, they search for a watermark in gold buried in three images.
Each knows only a portion of the gold's exact location, so for the moment they're dependent on each other.
However, none are particularly inclined to share...
Writing concurrent programs for shared-memory multiprocessor systems is a nightmare, and this hinders users from exploiting the full potential of multiprocessors.
STM (Software Transactional Memory) is a promising concurrent programming paradigm which addresses woes of programming for multiprocessor systems.
In this paper, we implement the BTO (Basic Timestamp Ordering), SGT (Serialization Graph Testing), and MVTO (Multi-Version Timestamp Ordering) concurrency control protocols and build an STM (Software Transactional Memory) library to evaluate their performance.
The deferred write approach is followed to implement the STM.
A SET data structure is implemented using the transactions of our STM library, and this transactional SET is used as a test application to evaluate the STM.
The performance of the protocols is rigorously compared against the linked-list module of the Synchrobench benchmark.
The linked-list module implements the SET data structure using lazy-list, lock-free list, lock-coupling list, and ESTM (Elastic Software Transactional Memory) variants.
Our analysis shows that for more than 60 threads and a 70% update rate, BTO takes 17% to 29% and 6% to 24% less CPU time per thread than lazy-list and lock-coupling list, respectively.
MVTO takes 13% to 24% and 3% to 24% less CPU time per thread than lazy-list and lock-coupling list, respectively.
BTO and MVTO have similar per thread CPU time.
BTO and MVTO outperform SGT by 9% to 36%.
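For readers unfamiliar with BTO, its core timestamp checks can be sketched in a few lines. This is a minimal single-threaded illustration of the read/write rules, not the paper's deferred-write STM library:

```python
import itertools

class Abort(Exception):
    pass

class BTO:
    """Minimal Basic Timestamp Ordering (BTO) sketch: each transaction
    gets a monotonically increasing timestamp, and every read/write is
    validated against the item's read/write timestamps."""
    _clock = itertools.count(1)

    def __init__(self):
        self.val, self.rts, self.wts = {}, {}, {}

    def begin(self):
        return next(BTO._clock)          # transaction timestamp

    def read(self, ts, x):
        if ts < self.wts.get(x, 0):      # a younger txn already wrote x
            raise Abort
        self.rts[x] = max(self.rts.get(x, 0), ts)
        return self.val.get(x)

    def write(self, ts, x, v):
        if ts < self.rts.get(x, 0) or ts < self.wts.get(x, 0):
            raise Abort                  # would invalidate a younger txn
        self.val[x], self.wts[x] = v, ts

db = BTO()
t1, t2 = db.begin(), db.begin()
db.write(t2, "x", 42)                    # younger transaction writes first
try:
    db.write(t1, "x", 7)                 # older transaction must abort
except Abort:
    print("t1 aborted")
```

A transaction older than the write timestamp of an item it touches must abort, which is exactly what the example demonstrates.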
Multi-criteria decision making, made possible by the advent of skyline queries, has been applied in many areas.
Though most of the existing research is concerned with only a single relation, several real world applications require finding the skyline set of records over multiple relations.
Consequently, the join operation over skylines, where the preferences are local to each relation, has been proposed.
In many of those cases, however, the join often involves performing aggregate operations among some of the attributes from the different relations.
In this paper, we introduce such queries as "aggregate skyline join queries".
Since the naive algorithm is impractical, we propose three algorithms to efficiently process such queries.
The algorithms utilize certain properties of skyline sets and process the skylines as much as possible locally before computing the join.
Experiments with real and synthetic datasets exhibit the practicality and scalability of the algorithms with respect to the cardinality and dimensionality of the relations.
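The skyline itself rests on a simple dominance test, which the proposed algorithms exploit locally in each relation before joining. A minimal sketch (minimization in every dimension; the data is illustrative):

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and strictly
    better in at least one (all dimensions are minimized)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def skyline(points):
    # A point belongs to the skyline iff no other point dominates it.
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hotels as (price, distance): cheaper and closer is better.
hotels = [(50, 8), (60, 5), (70, 3), (80, 4), (90, 2)]
print(skyline(hotels))
```

Here (80, 4) is dropped because (70, 3) is both cheaper and closer; the remaining four points are mutually incomparable.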
We investigate the problem of inconsistency measurement on large knowledge bases by considering stream-based inconsistency measurement, i.e., we investigate inconsistency measures that cannot consider a knowledge base as a whole but process it within a stream.
For that, we present, first, a novel inconsistency measure that is suited to the streaming case and, second, stream-based approximations for the new and some existing inconsistency measures.
We conduct an extensive empirical analysis on the behavior of these inconsistency measures on large knowledge bases, in terms of runtime, accuracy, and scalability.
We conclude that for two of these measures, the approximation of the new inconsistency measure and an approximation of the contension inconsistency measure, large-scale inconsistency measurement is feasible.
In this paper, we establish the matroid structures corresponding to data-local and local maximally recoverable codes (MRC).
The matroid structures of these codes can be used to determine the associated Tutte polynomial.
Greene proved that the weight enumerator of any code can be determined from its associated Tutte polynomial.
We will use this result to derive explicit expressions for the weight enumerators of data-local and local MRC.
Also, Britz proved that the higher support weights of any code can be determined from its associated Tutte polynomial.
We will use this result to derive expressions for the higher support weights of data-local and local MRC with two local codes.
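Since the derivations above hinge on Greene's result, it may help to state it explicitly. For an [n, k] code C over GF(q) with vector matroid M (from the columns of a generator matrix), a standard form of Greene's identity is (reproduced here as a reference; conventions and normalizations vary across papers):

```latex
W_C(X, Y) \;=\; (X - Y)^{k}\, Y^{\,n-k}\;
  T_M\!\left( \frac{X + (q-1)Y}{X - Y},\; \frac{X}{Y} \right),
\qquad
W_C(X, Y) \;=\; \sum_{c \in C} X^{\,n - \mathrm{wt}(c)}\, Y^{\,\mathrm{wt}(c)} .
```

As sanity checks, the full code GF(q)^n (all coloops, T_M = x^n) yields (X + (q-1)Y)^n, and the zero code (all loops, T_M = y^n) yields X^n.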
We address the problem of bootstrapping language acquisition for an artificial system similarly to what is observed in experiments with human infants.
Our method works by associating meanings to words in manipulation tasks, as a robot interacts with objects and listens to verbal descriptions of the interactions.
The model is based on an affordance network, i.e., a mapping between robot actions, robot perceptions, and the perceived effects of these actions upon objects.
We extend the affordance model to incorporate spoken words, which allows us to ground the verbal symbols to the execution of actions and the perception of the environment.
The model takes verbal descriptions of a task as the input and uses temporal co-occurrence to create links between speech utterances and the involved objects, actions, and effects.
We show that the robot is able to form useful word-to-meaning associations, even without considering grammatical structure in the learning process and in the presence of recognition errors.
These word-to-meaning associations are embedded in the robot's own understanding of its actions.
Thus, they can be directly used to instruct the robot to perform tasks and also allow context to be incorporated into the speech recognition task.
We believe that the encouraging results of our approach may equip robots with the capacity to acquire language descriptors in their operating environment, as well as shed some light on how this challenging process develops in human infants.
In this paper, we propose a learning rule based on a back-propagation (BP) algorithm that can be applied to a hardware-based deep neural network (HW-DNN) using electronic devices that exhibit discrete and limited conductance characteristics.
This adaptive learning rule, which enables forward propagation, backward propagation, and weight updates in hardware, facilitates the implementation of power-efficient, high-speed deep neural networks.
In simulations using a three-layer perceptron network, we evaluate the learning performance according to various conductance responses of electronic synapse devices and weight-updating methods.
It is shown that the learning accuracy is comparable to that obtained when using a software-based BP algorithm when the electronic synapse device has a linear conductance response with a high dynamic range.
Furthermore, the proposed unidirectional weight-updating method is suitable for electronic synapse devices which have nonlinear and finite conductance responses.
Because this weight-updating method compensates for the drawback of asymmetric weight updates, it achieves better accuracy than other methods.
This adaptive learning rule, which can be applied to a full hardware implementation, can also compensate for the degradation of learning accuracy caused by probable device-to-device variation in actual electronic synapse devices.
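The unidirectional update idea can be sketched abstractly: represent each weight as the difference of two conductances and only ever potentiate one of them. The response curve, state count, and names below are illustrative assumptions, not the paper's device model:

```python
import numpy as np

class SynapsePair:
    """Hedged sketch of a unidirectional weight update: the weight is
    g_plus - g_minus, and an update only ever *increases* one of the two
    conductances, stepping through a finite, nonlinear response curve."""

    def __init__(self, levels=32, nonlinearity=3.0):
        # Saturating conductance response with `levels` discrete states.
        steps = np.exp(-nonlinearity * np.arange(levels) / levels)
        self.curve = np.cumsum(steps) / np.sum(steps)   # values in [0, 1]
        self.ip = 0          # state index for g_plus
        self.im = 0          # state index for g_minus
        self.levels = levels

    @property
    def weight(self):
        return self.curve[self.ip] - self.curve[self.im]

    def update(self, delta):
        # Unidirectional: positive delta potentiates g_plus, negative
        # delta potentiates g_minus; neither conductance is ever decreased.
        if delta > 0 and self.ip < self.levels - 1:
            self.ip += 1
        elif delta < 0 and self.im < self.levels - 1:
            self.im += 1

s = SynapsePair()
for d in [+1, +1, -1, +1]:
    s.update(d)
print(round(s.weight, 3))   # net potentiation after three ups and one down
```

Because both conductances only move up their own (identical) response curves, the asymmetry between potentiation and depression steps is sidestepped by construction.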
A simple proof is given for the monotonicity of entropy and Fisher information associated to sums of i.i.d. random variables.
The proof relies on a characterization of maximal correlation for partial sums due to Dembo, Kagan and Shepp.
Despite huge success of artificial intelligence, hardware systems running these algorithms consume orders of magnitude higher energy compared to the human brain, mainly due to heavy data movements between the memory unit and the computation cores.
Spiking neural networks (SNNs) built using bio-plausible neuron and synaptic models have emerged as the power-efficient choice for designing cognitive applications.
These algorithms involve several lookup-table (LUT) based function evaluations such as high-order polynomials and transcendental functions for solving complex neuro-synaptic models, that typically require additional storage.
To that effect, we propose `SPARE' - an in-memory, distributed processing architecture built on ROM-embedded RAM technology, for accelerating SNNs.
ROM-embedded RAMs allow storage of LUTs, embedded within a typical memory array, without additional area overhead.
Our proposed architecture consists of a 2-D array of Processing Elements (PEs).
Since most of the computations are done locally within each PE, unnecessary data transfers are restricted, thereby alleviating the von Neumann bottleneck.
We evaluate SPARE for two different ROM-Embedded RAM structures - CMOS based ROM-Embedded SRAMs (R-SRAMs) and STT-MRAM based ROM-Embedded MRAMs (R-MRAMs).
Moreover, we analyze trade-offs in terms of energy, area and performance, for using the two technologies on a range of image classification benchmarks.
Furthermore, we leverage the additional storage density to implement complex neuro-synaptic functionalities.
This enhances the utility of the proposed architecture by provisioning implementation of any neuron/synaptic behavior as necessitated by the application.
Our results show up to 1.75x, 1.95x, and 1.95x improvements in energy, iso-storage area, and iso-area performance, respectively, by using neural network accelerators built on ROM-embedded RAM primitives.
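The LUT-based function evaluation that such architectures accelerate can be illustrated in a few lines: precompute a transcendental function on a grid (the "ROM" contents) and evaluate it by indexing, avoiding a transcendental unit at run time. The table range and size here are illustrative:

```python
import numpy as np

# The "ROM" contents: exp(x) tabulated on a fixed grid.
LO, HI, ENTRIES = -5.0, 5.0, 256
TABLE = np.exp(np.linspace(LO, HI, ENTRIES))

def lut_exp(x):
    # Map input values to table indices and read the stored results.
    idx = np.clip(((x - LO) / (HI - LO) * (ENTRIES - 1)).astype(int),
                  0, ENTRIES - 1)
    return TABLE[idx]

x = np.array([-1.0, 0.0, 1.0])
print(np.max(np.abs(lut_exp(x) - np.exp(x))))   # quantization error
```

The residual error is pure quantization, set by the table resolution; interpolation between entries would reduce it further at the cost of extra arithmetic.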
Subgraph discovery in a single data graph---finding subsets of vertices and edges satisfying a user-specified criteria---is an essential and general graph analytics operation with a wide spectrum of applications.
Depending on the criteria, subgraphs of interest may correspond to cliques of friends in social networks, interconnected entities in RDF data, or frequent patterns in protein interaction networks to name a few.
Existing systems usually examine a large number of subgraphs while employing many computers and often produce an enormous result set of subgraphs.
How can we enable fast discovery of only the most relevant subgraphs while minimizing the computational requirements?
We present Nuri, a general subgraph discovery system that allows users to succinctly specify subgraphs of interest and criteria for ranking them.
Given such specifications, Nuri efficiently finds the k most relevant subgraphs using only a single computer.
It prioritizes (i.e., expands earlier than others) subgraphs that are more likely to expand into the desired subgraphs (prioritized subgraph expansion) and proactively discards irrelevant subgraphs from which the desired subgraphs cannot be constructed (pruning).
Nuri can also efficiently store and retrieve a large number of subgraphs on disk without being limited by the size of main memory.
We demonstrate using both real and synthetic datasets that Nuri on a single core outperforms the closest alternative distributed system consuming 40 times more computational resources by more than 2 orders of magnitude for clique discovery and 1 order of magnitude for subgraph isomorphism and pattern mining.
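Prioritized subgraph expansion with pruning can be sketched with a priority queue ordered by an optimistic score bound. This toy version (finding the k heaviest connected vertex sets up to a size limit) illustrates the principle, not Nuri's actual engine:

```python
import heapq

def top_k_subgraphs(adj, weight, k=2, max_size=3):
    """Best-first expansion: grow connected vertex sets, scoring each
    partial subgraph optimistically, and prune expansions whose bound
    cannot beat the current k-th best total weight."""
    w_max = max(weight.values())
    # Admissible bound: current weight plus best case for open slots.
    bound = lambda vs: (sum(weight[v] for v in vs)
                        + (max_size - len(vs)) * w_max)
    heap = [(-bound(frozenset([v])), frozenset([v])) for v in adj]
    heapq.heapify(heap)
    best, seen = [], set()                    # best: min-heap of (score, set)
    while heap:
        neg_b, vs = heapq.heappop(heap)
        if len(best) == k and -neg_b <= best[0][0]:
            continue                          # prune: cannot beat k-th best
        score = sum(weight[v] for v in vs)
        if len(best) < k:
            heapq.heappush(best, (score, vs))
        elif score > best[0][0]:
            heapq.heapreplace(best, (score, vs))
        if len(vs) < max_size:
            for u in {u for v in vs for u in adj[v]} - vs:
                nxt = vs | {u}                # expand by one neighbour
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-bound(nxt), nxt))
    return sorted(best, reverse=True)

adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
weight = {"a": 1, "b": 5, "c": 4, "d": 2}
for score, vs in top_k_subgraphs(adj, weight):
    print(score, sorted(vs))
```

Because the bound is monotonically non-increasing along expansions, pruned partial subgraphs can never grow into one of the k most relevant answers, mirroring the pruning argument above.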
We propose a simple, yet powerful regularization technique that can be used to significantly improve both the pairwise and triplet losses in learning local feature descriptors.
The idea is that in order to fully utilize the expressive power of the descriptor space, good local feature descriptors should be sufficiently "spread-out" over the space.
In this work, we propose a regularization term, inspired by the properties of the uniform distribution, that maximizes the spread of the learned descriptors over the descriptor space.
We show that the proposed regularization with triplet loss outperforms existing Euclidean distance based descriptor learning techniques by a large margin.
As an extension, the proposed regularization technique can also be used to improve image-level deep feature embedding.
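A hedged sketch of such a spread-out regularizer: for L2-normalized descriptors, inner products of random non-matching pairs should mimic points drawn uniformly from the unit sphere (mean 0, second moment 1/d). The exact form used in the paper may differ:

```python
import numpy as np

def spread_out_penalty(desc):
    """Penalty encouraging descriptors to be 'spread out': the mean of
    random-pair inner products is pushed toward 0 and their second
    moment is capped at 1/d, matching the uniform-on-sphere statistics."""
    n, d = desc.shape
    desc = desc / np.linalg.norm(desc, axis=1, keepdims=True)
    # Pair each descriptor with a randomly shuffled partner ("non-match").
    dots = np.einsum("ij,ij->i", desc, desc[np.random.permutation(n)])
    mean_term = np.mean(dots) ** 2
    moment_term = max(float(np.mean(dots ** 2)) - 1.0 / d, 0.0)
    return mean_term + moment_term

rng = np.random.default_rng(0)
uniform_like = rng.normal(size=(512, 64))        # ~uniform on the sphere
collapsed = np.abs(rng.normal(size=(512, 64)))   # stuck in one orthant
print(spread_out_penalty(uniform_like) < spread_out_penalty(collapsed))
```

Gaussian vectors normalize to near-uniform sphere samples and incur almost no penalty, while descriptors collapsed into one orthant are penalized heavily.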
While there are many approaches for automatically proving termination of term rewrite systems, up to now there exist only few techniques to disprove their termination automatically.
Almost all of these techniques try to find loops, where the existence of a loop implies non-termination of the rewrite system.
However, most programming languages use specific evaluation strategies, whereas loop detection techniques usually do not take strategies into account.
So even if a rewrite system has a loop, it may still be terminating under certain strategies.
Therefore, our goal is to develop decision procedures which can determine whether a given loop is also a loop under the respective evaluation strategy.
In earlier work, such procedures were presented for the strategies of innermost, outermost, and context-sensitive evaluation.
In the current paper, we build upon this work and develop such decision procedures for important strategies like leftmost-innermost, leftmost-outermost, (max-)parallel-innermost, (max-)parallel-outermost, and forbidden patterns (which generalize innermost, outermost, and context-sensitive strategies).
In this way, we obtain the first approach to disprove termination under these strategies automatically.
Generative adversarial networks (GANs) are a class of deep generative models which aim to learn a target distribution in an unsupervised fashion.
While they were successfully applied to many problems, training a GAN is a notoriously challenging task and requires a significant amount of hyperparameter tuning, neural architecture engineering, and a non-trivial amount of "tricks".
The success in many practical applications coupled with the lack of a measure to quantify the failure modes of GANs resulted in a plethora of proposed losses, regularization and normalization schemes, and neural architectures.
In this work we take a sober view of the current state of GANs from a practical perspective.
We reproduce the current state of the art and go beyond it, fairly exploring the GAN landscape.
We discuss common pitfalls and reproducibility issues, open-source our code on GitHub, and provide pre-trained models on TensorFlow Hub.
Spreadsheets provide a flexible and easy-to-use software development environment, but this flexibility makes them error-prone.
Work has been done to prevent errors in spreadsheets, including using models to specify distinct parts of a spreadsheet, as is done in model-driven software development.
Previous model languages for spreadsheets offer limited expressiveness and cannot model several features present in most real-world spreadsheets.
In this paper, the modeling language Tabula is introduced.
It extends previous spreadsheet models with features like type constraints and nested classes with repetitions.
Tabula is not only more expressive than other models but it can also be extended with more features.
Moreover, Tabula includes a bidirectional transformation engine that guarantees synchronization after an update either in the model or spreadsheet.
Deep learning has demonstrated the ability to learn complex structures, but it can be restricted by the available data.
Recently, Consensus Networks (CNs) were proposed to alleviate data sparsity by utilizing features from multiple modalities, but they too have been limited by the size of labeled data.
In this paper, we extend CN to Transductive Consensus Networks (TCNs), suitable for semi-supervised learning.
In TCNs, different modalities of input are compressed into latent representations, which we encourage to become indistinguishable during iterative adversarial training.
To understand TCNs' two mechanisms, consensus and classification, we put forward three variants in ablation studies on these mechanisms.
To further investigate TCN models, we treat the latent representations as probability distributions and measure their similarities as the negative relative Jensen-Shannon divergences.
We show that a consensus state beneficial for classification desires a stable but imperfect similarity between the representations.
Overall, TCNs outperform or align with the best benchmark algorithms given 20 to 200 labeled samples on the Bank Marketing and the DementiaBank datasets.
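For reference, the (base-2) Jensen-Shannon divergence between two distributions can be computed as follows; the "relative" variant used in the paper may normalize differently:

```python
import numpy as np

def kl(p, q):
    # Kullback-Leibler divergence in bits; assumes strictly positive q.
    return float(np.sum(p * np.log2(p / q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetrized, smoothed KL to the
    mixture m = (p + q) / 2; base-2 values lie in [0, 1]."""
    m = (p + q) / 2.0
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
print(round(js_divergence(p, q), 3))
```

Unlike KL, the JS divergence is symmetric and bounded, which makes it a convenient similarity measure between latent representations.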
A generalized Nonlinear Fourier Transform (GNFT), which includes eigenvalues of higher multiplicity, is considered for information transmission over fiber optic channels.
Numerical algorithms are developed to compute the direct and inverse GNFTs.
For closely-spaced eigenvalues, examples suggest that the GNFT is more robust than the NFT to the practical impairments of truncation, discretization, attenuation and noise.
Communication using a soliton with one double eigenvalue is numerically demonstrated, and its information rates are compared to solitons with one and two simple eigenvalues.
We present a novel solution for Channel Assignment Problem (CAP) in Device-to-Device (D2D) wireless networks that takes into account the throughput estimation noise.
CAP is known to be NP-hard, and there is no practical optimal learning algorithm that takes estimation noise into account.
In this paper, we first formulate the CAP as a stochastic optimization problem to maximize the expected sum data rate.
To capture the estimation noise, CAP is modeled as a noisy potential game, a novel notion we introduce in this paper.
Then, we propose a distributed Binary Log-linear Learning Algorithm (BLLA) that converges to the optimal channel assignments.
Convergence of BLLA is proved for both bounded and unbounded noise, and proofs are provided for both fixed and decreasing temperature parameters of BLLA.
We also give a sufficient number of estimation samples that guarantees convergence to the optimal state.
We assess the performance of BLLA by extensive simulations, which show that the sum data rate increases with the number of channels and users.
Contrary to the better-response algorithm, the proposed algorithm achieves the optimal channel assignment distributively even in the presence of estimation noise.
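The flavor of binary log-linear learning can be conveyed on a toy channel assignment game: one user revises per step and switches to a trial channel with a Boltzmann probability computed from noisy utility estimates. The utility model and parameters are illustrative, not the paper's system model:

```python
import math, random

def noisy_utility(user, ch, assign, rate, rng, noise):
    # Estimated data rate for `user` on channel `ch`: the channel rate is
    # split among its users, plus zero-mean estimation noise.
    trial = list(assign)
    trial[user] = ch
    return rate[ch] / trial.count(ch) + rng.gauss(0.0, noise)

def blla(n_users, rate, noise=0.02, tau=0.02, steps=2000, seed=1):
    rng = random.Random(seed)
    assign = [0] * n_users                  # everyone starts on channel 0
    for _ in range(steps):
        u = rng.randrange(n_users)          # one user revises per step
        trial = rng.randrange(len(rate))
        cur = noisy_utility(u, assign[u], assign, rate, rng, noise)
        alt = noisy_utility(u, trial, assign, rate, rng, noise)
        # Binary log-linear rule: switch with Boltzmann probability.
        if rng.random() < 1.0 / (1.0 + math.exp((cur - alt) / tau)):
            assign[u] = trial
    return assign

# Two users, two equal-rate channels: the optimum splits them apart.
print(blla(n_users=2, rate=[1.0, 1.0]))
```

At low temperature the chain spends almost all of its time near potential maximizers, which is the mechanism behind BLLA's convergence guarantees.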
This paper defines the Dead Cryptographers Society (DCS) problem: several great cryptographers created many polynomial-time deterministic Turing machines (DTMs) of a specific type, ran them on their own descriptions concatenated with arbitrary strings, deleted the machines, and, after they died, left only the results of those runs. If those DTMs only permute, and sometimes invert, the bits of their input, is it possible to decide the language formed by the resulting strings in polynomial time?
We prove some facts about the computational complexity of this problem and discuss possible applications in cryptography, such as distance key distribution, online reverse auctions, and secure communication.
Receiver-initiated medium access control protocols for wireless sensor networks are theoretically able to adapt to changing network conditions in a distributed manner.
However, existing algorithms rely on fixed beacon rates at each receiver.
We present a new receiver-initiated MAC protocol that adapts the beacon rate at each receiver to its actual traffic load.
Our proposal uses a computationally inexpensive formula to calculate the optimum beacon rate that minimizes network energy consumption, so it can be easily adopted by receivers.
Simulation results show that our proposal reduces collisions and shortens delivery time while maintaining a low duty cycle.
Online harassment has been a problem to a greater or lesser extent since the early days of the internet.
Previous work has applied anti-spam techniques like machine-learning based text classification (Reynolds, 2011) to detecting harassing messages.
However, existing public datasets are limited in size, with labels of varying quality.
The #HackHarassment initiative (an alliance of tech companies and NGOs devoted to fighting bullying on the internet) has begun to address this issue by creating a new dataset superior to its predecessors in terms of both size and quality.
As we (#HackHarassment) complete further rounds of labelling, later iterations of this dataset will increase the available samples by at least an order of magnitude, enabling corresponding improvements in the quality of machine learning models for harassment detection.
In this paper, we introduce the first models built on the #HackHarassment dataset v1.0 (a new open dataset, which we are delighted to share with any interested researchers) as a benchmark for future research.
Unambiguous non-deterministic finite automata have intermediate expressive power and succinctness between deterministic and non-deterministic automata.
It has been conjectured that every unambiguous non-deterministic one-way finite automaton (1UFA) recognizing some language L can be converted into a 1UFA recognizing the complement of the original language L with polynomial increase in the number of states.
We disprove this conjecture by presenting a family of 1UFAs on a single-letter alphabet such that recognizing the complements of the corresponding languages requires superpolynomial increase in the number of states even for generic non-deterministic one-way finite automata.
We also note that both the languages and their complements can be recognized by sweeping deterministic automata with a linear increase in the number of states.
Source code is rarely written in isolation.
It depends significantly on the programmatic context, such as the class that the code would reside in.
To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class.
This task is challenging because the desired code can vary greatly depending on the functionality the class provides (e.g., a sort function may or may not be available when we are asked to "return the smallest element" in a particular member variable list).
We introduce CONCODE, a new large dataset with over 100,000 examples consisting of Java classes from online code repositories, and develop a new encoder-decoder architecture that models the interaction between the method documentation and the class environment.
We also present a detailed error analysis suggesting that there is significant room for future work on this task.
In the field of digital image processing, the JPEG compression technique has been widely applied, and numerous image-processing applications support it.
Images that have undergone double JPEG compression are likely to have been tampered with.
Therefore, double JPEG compression detection schemes can provide an important clue for image forgery detection.
In this paper, we propose an effective algorithm to detect double JPEG compression with different quality factors.
Firstly, the quantized DCT coefficients with the same frequency are extracted to build new data matrices.
Then, considering the effect of direction on the correlation between adjacent positions in the DCT domain, twelve kinds of high-pass filter templates with different directions are applied, and the transition probability matrix is calculated for each filtered output.
Furthermore, principal component analysis and support vector machine technique are applied to reduce the feature dimension and train a classifier, respectively.
Experimental results demonstrate that the proposed method is effective and achieves performance comparable to existing methods.
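One step of this feature pipeline, truncating filtered coefficients and estimating a Markov transition probability matrix, can be sketched for a single filter direction (the paper uses twelve directions; the data here is a stand-in for quantized DCT coefficients):

```python
import numpy as np

def transition_matrix(data, T=3):
    """Horizontal first-difference filtering, truncation to [-T, T], and
    row-normalized transition counts between neighbouring values."""
    diff = np.clip(data[:, :-1] - data[:, 1:], -T, T)   # high-pass + truncate
    cur = diff[:, :-1].ravel() + T                      # shift to 0..2T
    nxt = diff[:, 1:].ravel() + T
    m = np.zeros((2 * T + 1, 2 * T + 1))
    np.add.at(m, (cur.astype(int), nxt.astype(int)), 1)  # joint counts
    row = m.sum(axis=1, keepdims=True)
    return m / np.where(row == 0, 1, row)                # row-normalize

rng = np.random.default_rng(0)
coeffs = rng.integers(-8, 9, size=(64, 64))   # stand-in for quantized DCT
P = transition_matrix(coeffs)
print(P.shape)
```

The flattened entries of such matrices, one per filter direction, form the feature vector that PCA then compresses before SVM classification.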
Semantic NLP applications often rely on dependency trees to recognize major elements of the proposition structure of sentences.
Yet, while much semantic structure is indeed expressed by syntax, many phenomena are not easily read out of dependency trees, often leading to further ad-hoc heuristic post-processing or to information loss.
To directly address the needs of semantic applications, we present PropS -- an output representation designed to explicitly and uniformly express much of the proposition structure which is implied from syntax, and an associated tool for extracting it from dependency trees.
Instance-level human parsing towards real-world human analysis scenarios is still under-explored due to the absence of sufficient data resources and technical difficulty in parsing multiple instances in a single pass.
Several related works follow the "parsing-by-detection" pipeline, which relies heavily on separately trained detection models to localize instances and then performs human parsing for each instance sequentially.
Nonetheless, the discrepant optimization targets of detection and parsing lead to suboptimal representation learning and error accumulation in the final results.
In this work, we make the first attempt to explore a detection-free Part Grouping Network (PGN) for efficiently parsing multiple people in an image in a single pass.
Our PGN reformulates instance-level human parsing as two twinned sub-tasks that can be jointly learned and mutually refined via a unified network: 1) semantic part segmentation for assigning each pixel as a human part (e.g., face, arms); 2) instance-aware edge detection to group semantic parts into distinct person instances.
Thus the shared intermediate representation would be endowed with capabilities in both characterizing fine-grained parts and inferring instance belongings of each part.
Finally, a simple instance partition process is employed to get final results during inference.
We conducted experiments on PASCAL-Person-Part dataset and our PGN outperforms all state-of-the-art methods.
Furthermore, we show its superiority on a newly collected multi-person parsing dataset (CIHP) including 38,280 diverse images, which is the largest dataset so far and can facilitate more advanced human analysis.
The CIHP benchmark and our source code are available at http://sysu-hcp.net/lip/.
We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences.
In contrast with prior work on tree-structured models in which the trees are either provided as input or predicted using supervision from explicit treebank annotations, the tree structures in this work are optimized to improve performance on a downstream task.
Experiments demonstrate the benefit of learning task-specific composition orders, outperforming both sequential encoders and recursive encoders based on treebank annotations.
We analyze the induced trees and show that while they discover some linguistically intuitive structures (e.g., noun phrases, simple verb phrases), they are different than conventional English syntactic structures.
Recurrent neural networks are a powerful means to cope with time series.
We show how a type of linearly activated recurrent neural networks can approximate any time-dependent function f(t) given by a number of function values.
The approximation can effectively be learned by simply solving a linear equation system; no backpropagation or similar methods are needed.
Furthermore, the network size can be reduced by taking only the most relevant components of the network.
Thus, in contrast to others, our approach not only learns network weights but also the network architecture.
The networks have interesting properties: they settle into elliptical trajectories in the long run and allow the prediction of further values as well as compact representations of functions.
We demonstrate this by several experiments, among them multiple superimposed oscillators (MSO) and robotic soccer.
Predictive neural networks outperform the previous state-of-the-art for the MSO task with a minimal number of units.
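The "one linear solve instead of backpropagation" idea can be illustrated with a closely related construction: fitting a linear autoregression to samples of a superimposed-oscillator signal with a single least-squares solve, then iterating the fitted recurrence to predict further values. This is a sketch of the principle, not the paper's exact network:

```python
import numpy as np

# A superimposed-oscillator signal sampled at 200 points.
t = np.arange(200) * 0.1
x = np.sin(t) + 0.5 * np.sin(3 * t)

# Order-p linear recurrence x_t = a_1 x_{t-1} + ... + a_p x_{t-p},
# learned with one least-squares solve -- no backpropagation needed.
p = 8
X = np.column_stack([x[i:len(x) - p + i] for i in range(p)])  # lagged inputs
y = x[p:]
a, *_ = np.linalg.lstsq(X, y, rcond=None)

# Roll the fitted recurrence forward to predict the next 20 samples.
hist = list(x[-p:])
preds = []
for _ in range(20):
    nxt = float(np.dot(a, hist[-p:]))
    preds.append(nxt)
    hist.append(nxt)

t_future = np.arange(200, 220) * 0.1
true = np.sin(t_future) + 0.5 * np.sin(3 * t_future)
print(float(np.max(np.abs(np.array(preds) - true))))  # prediction error
```

A sum of two sinusoids exactly satisfies an order-4 linear recurrence, so the order-8 least-squares fit reproduces and extrapolates the signal almost perfectly, mirroring the paper's observation that linear recurrent models can be learned by solving a linear system.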
A constantly growing amount of information is available through the web.
Unfortunately, extracting useful content from this massive amount of data still remains an open issue.
The lack of standard data models and structures forces developers to create ad hoc solutions from scratch.
An expert is still needed in many situations where developers do not have the correct background knowledge.
This forces developers to spend time acquiring the needed background from the expert.
In other directions, there are promising solutions employing machine learning techniques.
However, increasing accuracy requires an increase in system complexity that cannot be endured in many projects.
In this work, we approach the web knowledge extraction problem using an expert-centric methodology.
This methodology defines a set of configurable, extendible and independent components that permit the reutilisation of large pieces of code among projects.
Our methodology differs from similar solutions in its expert-driven design.
This design makes it possible for a subject-matter expert to drive the knowledge extraction for a given set of documents.
Additionally, we propose the utilization of machine assisted solutions that guide the expert during this process.
To demonstrate the capabilities of our methodology, we present a real use case scenario in which public procurement data is extracted from the web-based repositories of several public institutions across Europe.
We provide insightful details about the challenges we had to deal with in this use case and additional discussions about how to apply our methodology.
In this paper, we consider the problem of allocating cache resources among multiple content providers.
The cache can be partitioned into slices and each partition can be dedicated to a particular content provider, or shared among a number of them.
It is assumed that each partition employs the LRU policy for managing content.
We propose utility-driven partitioning, where we associate with each content provider a utility that is a function of the hit rate observed by the content provider.
We consider two scenarios: i)~content providers serve disjoint sets of files, ii)~there is some overlap in the content served by multiple content providers.
In the first case, we prove that cache partitioning outperforms cache sharing as the cache size and the number of contents served by providers go to infinity.
In the second case, it can be beneficial to have separate partitions for overlapped content.
In the case of two providers, it is usually beneficial to allocate a cache partition to serve all overlapped content and separate partitions to serve the non-overlapped contents of both providers.
We establish conditions when this is true asymptotically but also present an example where it is not true asymptotically.
We develop online algorithms that dynamically adjust partition sizes in order to maximize the overall utility and prove that they converge to optimal solutions, and through numerical evaluations, we show they are effective.
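The contrast between sharing and partitioning can be illustrated with a minimal LRU simulation (a hypothetical toy workload, not the paper's analytical model):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache that records its hit rate."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.requests = 0

    def request(self, item):
        self.requests += 1
        if item in self.store:
            self.hits += 1
            self.store.move_to_end(item)        # mark most recently used
        else:
            self.store[item] = True
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict least recently used

    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0

# Provider A cycles over 3 hot files, provider B over 20 cold files.
trace_a = ["a%d" % (i % 3) for i in range(60)]
trace_b = ["b%d" % (i % 20) for i in range(60)]

shared = LRUCache(4)                       # one cache shared by both providers
for x, y in zip(trace_a, trace_b):
    shared.request(x)
    shared.request(y)

part_a, part_b = LRUCache(3), LRUCache(1)  # same total capacity, partitioned
for x in trace_a:
    part_a.request(x)
for y in trace_b:
    part_b.request(y)
```

In the shared cache, B's cold traffic evicts A's hot files before they are reused, so partitioning the same total capacity yields a strictly higher overall hit rate.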
The Tehran Urban and Suburban Railway System (TUSRS) is planned to be completed with eight lines and 149 stations.
This complex transportation system contains 168 links between station pairs and 20 cross-section and Y-branch stations among all eight lines.
In this study, we considered TUSRS as a complex network and undertook several analyses based on graph theory.
Examining, for example, centrality measures, we identified central stations within TUSRS.
This analysis could be useful for devising redistribution strategies for overcrowded stations and for improving the organization of the maintenance system.
These findings are also promising for better designing the systems of tomorrow in other metropolitan areas in Iran.
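As an illustration of the graph-theoretic approach, degree and closeness centrality can be computed with plain breadth-first search (a hypothetical six-station toy graph; the real analysis would build the graph from the 168 TUSRS links):

```python
from collections import deque

# Hypothetical toy metro graph: station -> neighboring stations.
metro = {
    "A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D"],
    "D": ["C"], "E": ["B", "F"], "F": ["E"],
}

def degree_centrality(graph):
    """Fraction of other stations a station is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph, source):
    """(n - 1) divided by the sum of BFS shortest-path lengths from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(graph) - 1) / total if total else 0.0

most_central = max(metro, key=lambda v: closeness_centrality(metro, v))
```

Stations with the highest centrality scores are the natural candidates for crowd-redistribution and maintenance prioritization.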
A recent study reported development of Muscorian, a generic text processing tool for extracting protein-protein interactions from text that achieved comparable performance to biomedical-specific text processing tools.
This result was unexpected, since potential errors from a series of text analysis processes are likely to adversely affect the outcome of the entire process.
Most biomedical entity relationship extraction tools have used a biomedical-specific part-of-speech (POS) tagger, as errors in POS tagging are likely to affect subsequent semantic analysis of the text, such as shallow parsing.
This study aims to evaluate part-of-speech (POS) tagging accuracy and to explore whether comparable performance is obtained when a generic POS tagger, MontyTagger, is used in place of MedPost, a tagger trained on biomedical text.
Our results demonstrated that MontyTagger, Muscorian's POS tagger, has a POS tagging accuracy of 83.1% when tested on biomedical text.
Replacing MontyTagger with MedPost did not result in a significant improvement in entity relationship extraction from text; precision of 55.6% from MontyTagger versus 56.8% from MedPost on directional relationships and 86.1% from MontyTagger compared to 81.8% from MedPost on nondirectional relationships.
This is unexpected as the potential for poor POS tagging by MontyTagger is likely to affect the outcome of the information extraction.
An analysis of POS tagging errors demonstrated that 78.5% of tagging errors are compensated for by shallow parsing.
Thus, despite 83.1% tagging accuracy, MontyTagger has a functional tagging accuracy of 94.6%.
Millimeter wave (mm-wave) and massive MIMO have been proposed for next generation wireless systems.
However, there are many open problems for the implementation of those technologies.
In particular, beamforming is necessary in mm-wave systems in order to counter high propagation losses.
However, conventional beamsteering is not always appropriate in rich scattering multipath channels with frequency selective fading, such as those found in indoor environments.
In this context, time-reversal (TR) is considered a promising beamforming technique for such mm-wave massive MIMO systems.
In this paper, we analyze a baseband TR beamforming system for mm-wave multi-user massive MIMO.
We verify that, as the number of antennas increases, TR yields good equalization and interference mitigation properties, but inter-user interference (IUI) remains a main impairment.
Thus, we propose a novel technique called interference-nulling TR (INTR) to minimize IUI.
We evaluate numerically the performance of INTR and compare it with conventional TR and equalized TR beamforming.
We use a 60 GHz MIMO channel model with spatial correlation based on the IEEE 802.11ad SISO NLoS model.
We demonstrate that INTR outperforms conventional TR with respect to average BER per user and achievable sum rate under diverse conditions, providing both diversity and multiplexing gains simultaneously.
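The basic time-reversal operation behind this analysis can be sketched as follows (a hypothetical single-antenna baseband toy, not the 60 GHz massive-MIMO model): the transmitter pre-filters with the conjugated, time-reversed channel, so the received signal peaks at the channel's autocorrelation center.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rich-scattering multipath channel impulse response.
h = rng.standard_normal(16) + 1j * rng.standard_normal(16)

# Time-reversal prefilter: conjugated, time-reversed channel (normalized).
g = np.conj(h[::-1]) / np.linalg.norm(h)

# Effective channel = prefilter convolved with the channel.
eff = np.convolve(g, h)

# Energy focuses at the central tap, equal to ||h|| after normalization.
peak = np.abs(eff[len(h) - 1])
```

The off-peak taps are the channel's off-center autocorrelation values; in the multi-user case, the residual cross-correlations between users' channels are exactly the inter-user interference that INTR is designed to suppress.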
Virtualization technology currently allows any complex and computationally expensive application (scientific applications are a good example) to run on heterogeneous distributed systems that make regular use of Grid and Cloud technologies, enabling significant savings in computing time.
This model is particularly interesting for the mass execution of scientific simulations and calculations, allowing parallel execution of applications using the same execution environment (unchanged) used by the scientist as usual.
However, the use and distribution of large virtual images (up to tens of GBytes) can be a problem, which is aggravated when attempting mass distribution to a large number of distributed computers.
The main objective of this work is to present an analysis of the implementation and a proposal for improving (reducing the size of) virtual images, aiming to reduce distribution time in distributed systems.
This analysis is based on the very specific requirements that a guest operating system (guest OS) imposes on some aspects of its execution.
Moments capture a huge part of our lives.
Accurate recognition of these moments is challenging due to the diverse and complex interpretation of the moments.
Action recognition refers to the act of classifying the desired action/activity present in a given video.
In this work, we perform experiments on Moments in Time dataset to recognize accurately activities occurring in 3 second clips.
We use state-of-the-art techniques for visual, auditory and spatio-temporal localization and develop a method to accurately classify the activity in the Moments in Time dataset.
Our novel approach of using visual-based textual features and fusion techniques performs well, providing an overall 89.23% Top-5 accuracy on the 20 classes, a significant improvement over the baseline TRN model.
Estimating statistical models within sensor networks requires distributed algorithms, in which both data and computation are distributed across the nodes of the network.
We propose a general approach for distributed learning based on combining local estimators defined by pseudo-likelihood components, encompassing a number of combination methods, and provide both theoretical and experimental analysis.
We show that simple linear combination or max-voting methods, when combined with second-order information, are statistically competitive with more advanced and costly joint optimization.
Our algorithms have many attractive properties including low communication and computational cost and "any-time" behavior.
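A minimal sketch of one such combination rule, a linear combination of local estimates weighted by second-order information (the node values and curvatures below are hypothetical):

```python
import numpy as np

def combine_local_estimates(estimates, curvatures):
    """Linear combination of per-node estimates weighted by second-order
    (curvature / Fisher information) terms: nodes whose local likelihood
    is sharper get proportionally more weight."""
    w = np.asarray(curvatures, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, estimates))

# Three sensor nodes estimate the same scalar parameter locally.
theta_hat = combine_local_estimates([1.0, 1.2, 0.9], [10.0, 5.0, 5.0])
```

Each node only has to transmit its estimate and a curvature summary, which is what keeps the communication cost low compared to joint optimization.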
This work deals with non-native children's speech and investigates both multi-task and transfer learning approaches to adapt a multi-language Deep Neural Network (DNN) to speakers, specifically children, learning a foreign language.
The application scenario is characterized by young students learning English and German and reading sentences in these second languages, as well as in their mother language.
The paper analyzes and discusses techniques for training effective DNN-based acoustic models starting from children native speech and performing adaptation with limited non-native audio material.
A multi-lingual model is adopted as baseline, where a common phonetic lexicon, defined in terms of the units of the International Phonetic Alphabet (IPA), is shared across the three languages at hand (Italian, German and English); DNN adaptation methods based on transfer learning are evaluated on significant non-native evaluation sets.
Results show that the resulting non-native models allow a significant improvement with respect to a mono-lingual system adapted to speakers of the target language.
A body of literature has demonstrated that users' mental health conditions, such as depression and anxiety, can be predicted from their social media language.
There is still a gap in the scientific understanding of how psychological stress is expressed on social media.
Stress is one of the primary underlying causes and correlates of chronic physical illnesses and mental health conditions.
In this paper, we explore the language of psychological stress with a dataset of 601 social media users, who answered the Perceived Stress Scale questionnaire and also consented to share their Facebook and Twitter data.
Firstly, we find that stressed users post about exhaustion, losing control, increased self-focus and physical pain as compared to posts about breakfast, family-time, and travel by users who are not stressed.
Secondly, we find that Facebook language is more predictive of stress than Twitter language.
Thirdly, we demonstrate how the language based models thus developed can be adapted and be scaled to measure county-level trends.
Since county-level language is easily available on Twitter using the Streaming API, we explore multiple domain adaptation algorithms to adapt user-level Facebook models to Twitter language.
We find that domain-adapted and scaled social media-based measurements of stress outperform sociodemographic variables (age, gender, race, education, and income) against ground-truth survey-based stress measurements, both at the user and the county level in the U.S.
Twitter language that scores higher in stress is also predictive of poorer health, less access to facilities, and lower socioeconomic status in counties.
We conclude with a discussion of the implications of using social media as a new tool for monitoring stress levels of both individuals and counties.
Context: Surveys constitute a valuable tool to capture a large-scale snapshot of the state of the practice.
Although apparently trivial to adopt, surveys hide several pitfalls that might render the results invalid and, thus, useless.
Goal: We aim at providing an overview of main pitfalls in software engineering surveys and report on practical ways to deal with them.
Method: We build on the experiences we collected in conducting many studies and distill the main lessons learnt.
Results: The eight lessons learnt that we report cover different aspects of the survey process, ranging from the design of initial research objectives to the design of a questionnaire.
Conclusions: Our hope is that by sharing our lessons learnt, combined with a disciplined application of the general survey theory, we contribute to improving the quality of the research results achievable by employing software engineering surveys.
Infrastructures are not inherently durable or fragile, yet all are fragile over the long term.
Durability requires care and maintenance of individual components and the links between them.
Astronomy is an ideal domain in which to study knowledge infrastructures, due to its long history, transparency, and accumulation of observational data over a period of centuries.
Research reported here draws upon a long-term study of scientific data practices to ask questions about the durability and fragility of infrastructures for data in astronomy.
Methods include interviews, ethnography, and document analysis.
As astronomy has become a digital science, the community has invested in shared instruments, data standards, digital archives, metadata and discovery services, and other relatively durable infrastructure components.
Several features of data practices in astronomy contribute to the fragility of that infrastructure.
These include different archiving practices between ground- and space-based missions, between sky surveys and investigator-led projects, and between observational and simulated data.
Infrastructure components are tightly coupled, based on international agreements.
However, the durability of these infrastructures relies on much invisible work - cataloging, metadata, and other labor conducted by information professionals.
Continual investments in care and maintenance of the human and technical components of these infrastructures are necessary for sustainability.
Indices and materialized views are physical structures that accelerate data access in data warehouses.
However, these data structures generate some maintenance overhead.
They also share the same storage space.
The existing studies about index and materialized view selection consider these structures separately.
In this paper, we adopt the opposite stance and couple index and materialized view selection to take into account the interactions between them and achieve an efficient storage space sharing.
We develop cost models that evaluate the respective benefit of indexing and view materialization.
These cost models are then exploited by a greedy algorithm to select a relevant configuration of indices and materialized views.
Experimental results show that our strategy performs better than the independent selection of indices and materialized views.
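The greedy selection step can be sketched as follows (the candidate names, benefits, and sizes are hypothetical; in the paper, the benefits come from the cost models):

```python
def greedy_select(candidates, budget):
    """Pick structures by benefit-per-unit-storage until the budget is full."""
    chosen, used = [], 0
    for name, benefit, size in sorted(
            candidates, key=lambda c: c[1] / c[2], reverse=True):
        if used + size <= budget:
            chosen.append(name)
            used += size
    return chosen

# (name, estimated benefit, storage size) triples from the cost models.
candidates = [
    ("idx_customer", 40.0, 10),
    ("mv_sales_by_month", 90.0, 50),
    ("idx_date", 25.0, 5),
    ("mv_top_products", 30.0, 40),
]
picked = greedy_select(candidates, budget=60)
```

Because indices and materialized views compete in the same ranked list for the same storage budget, the selection naturally captures the interaction between the two kinds of structures.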
We present a generic and automated approach to re-identifying nodes in anonymized social networks which enables novel anonymization techniques to be quickly evaluated.
It uses machine learning (decision forests) to match pairs of nodes in disparate anonymized sub-graphs.
The technique uncovers artefacts and invariants of any black-box anonymization scheme from a small set of examples.
Despite a high degree of automation, classification succeeds with significant true positive rates even when small false positive rates are sought.
Our evaluation uses publicly available real world datasets to study the performance of our approach against real-world anonymization strategies, namely the schemes used to protect datasets of The Data for Development (D4D) Challenge.
We show that the technique is effective even when only small numbers of samples are used for training.
Further, since it detects weaknesses in the black-box anonymization scheme it can re-identify nodes in one social network when trained on another.
Generic word embeddings are trained on large-scale generic corpora; Domain Specific (DS) word embeddings are trained only on data from a domain of interest.
This paper proposes a method to combine the breadth of generic embeddings with the specificity of domain specific embeddings.
The resulting embeddings, called Domain Adapted (DA) word embeddings, are formed by aligning corresponding word vectors using Canonical Correlation Analysis (CCA) or the related nonlinear Kernel CCA.
Evaluation results on sentiment classification tasks show that the DA embeddings substantially outperform both generic and DS embeddings when used as input features to standard or state-of-the-art sentence encoding algorithms for classification.
This paper proposes a method based on signal injection to obtain the saturated current-flux relations of a PMSM from locked-rotor experiments.
With respect to the classical method based on time integration, it has the main advantage of being completely independent of the stator resistance; moreover, it is less sensitive to voltage biases due to the power inverter, as the injected signal may be fairly large.
Expressive efficiency refers to the relation between two architectures A and B, whereby any function realized by B could be replicated by A, but there exist functions realized by A which cannot be replicated by B unless its size grows significantly larger.
For example, it is known that deep networks are exponentially efficient with respect to shallow networks, in the sense that a shallow network must grow exponentially large in order to approximate the functions represented by a deep network of polynomial size.
In this work, we extend the study of expressive efficiency to the attribute of network connectivity and in particular to the effect of "overlaps" in the convolutional process, i.e., when the stride of the convolution is smaller than its filter size (receptive field).
To theoretically analyze this aspect of network design, we focus on a well-established surrogate for ConvNets called Convolutional Arithmetic Circuits (ConvACs), and then demonstrate empirically that our results hold for standard ConvNets as well.
Specifically, our analysis shows that having overlapping local receptive fields, and more broadly denser connectivity, results in an exponential increase in the expressive capacity of neural networks.
Moreover, while denser connectivity can increase the expressive capacity, we show that the most common types of modern architectures already exhibit exponential increase in expressivity, without relying on fully-connected layers.
Deep neural networks have shown promising results in image inpainting even if the missing area is relatively large.
However, most of the existing inpainting networks introduce undesired artifacts and noise to the repaired regions.
To solve this problem, we present a novel framework which consists of two stacked convolutional neural networks that inpaint the image and remove the artifacts, respectively.
The first network considers the global structure of the damaged image and coarsely fills the blank area.
Then the second network modifies the repaired image to cancel the noise introduced by the first network.
The proposed framework splits the problem into two distinct partitions that can be optimized separately, therefore it can be applied to any inpainting algorithm by changing the first network.
The second stage in our framework, which aims at polishing the inpainted images, can be treated as a denoising problem, where a wide range of algorithms can be employed.
Our results demonstrate that the proposed framework achieves significant improvement on both visual and quantitative evaluations.
Visual query systems (VQSs) empower users to interactively search for line charts with desired visual patterns typically specified using intuitive sketch-based interfaces.
Despite their potential in accelerating data exploration, more than a decade of past work on VQSs has not been translated to adoption in practice.
Through a year-long collaboration with experts from three diverse domains, we examine the role of VQSs in real data exploration workflows, enhance an existing VQS to support these workflows via a participatory design process, and evaluate how VQS components are used in practice.
Via these observations, we formalize a taxonomy of key capabilities for VQSs, organized by three sensemaking processes.
Perhaps somewhat surprisingly, we find that ad-hoc sketch-based querying is not commonly used during data exploration, since analysts are often unable to precisely articulate the patterns they are interested in.
We find that there is a spectrum of VQS-centric data exploration workflows, depending on the application domain, and that many of these workflows are not effectively supported in present-day VQSs.
Our insights can pave the way for next-generation VQSs to be adopted in a variety of real-world applications.
In this paper, we address the problem of unsupervised video summarization that automatically extracts key-shots from an input video.
Specifically, we tackle two critical issues based on our empirical observations: (i) Ineffective feature learning due to flat distributions of output importance scores for each frame, and (ii) training difficulty when dealing with long-length video inputs.
To alleviate the first problem, we propose a simple yet effective regularization loss term called variance loss.
The proposed variance loss allows a network to predict output scores for each frame with high discrepancy, which enables effective feature learning and significantly improves model performance.
For the second problem, we design a novel two-stream network named Chunk and Stride Network (CSNet) that utilizes local (chunk) and global (stride) temporal view on the video features.
Our CSNet gives better summarization results for long-length videos compared to the existing methods.
In addition, we introduce an attention mechanism to handle the dynamic information in videos.
We demonstrate the effectiveness of the proposed methods by conducting extensive ablation studies and show that our final model achieves new state-of-the-art results on two benchmark datasets.
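One simple way to realize such a regularizer (a hedged sketch, not necessarily the paper's exact formulation) is to penalize the reciprocal of the score variance:

```python
import numpy as np

def variance_loss(scores, eps=1e-8):
    """Penalizes flat frame-importance distributions: the flatter the
    predicted scores, the smaller their variance and the larger the loss."""
    return 1.0 / (np.var(scores) + eps)

flat = np.full(100, 0.5)              # flat scores -> very large loss
spread = np.linspace(0.0, 1.0, 100)   # discriminative scores -> small loss
```

Minimizing this term pushes the predicted importance scores apart, which is the behavior the variance loss is meant to encourage.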
Elastic distortion of fingerprints has a negative effect on the performance of fingerprint recognition systems.
This negative effect brings inconvenience to users in authentication applications.
However, in the negative recognition scenario where users may intentionally distort their fingerprints, this can be a serious problem since distortion will prevent recognition system from identifying malicious users.
Current methods aimed at addressing this problem still have limitations.
First, they are often not accurate, because they estimate distortion parameters based on the ridge frequency map and orientation map of input samples, which are not reliable due to distortion.
Second, they are not efficient, requiring significant computation time to rectify samples.
In this paper, we develop a rectification model based on a Deep Convolutional Neural Network (DCNN) to accurately estimate distortion parameters from the input image.
Using a comprehensive database of synthetic distorted samples, the DCNN learns to accurately estimate distortion bases ten times faster than the dictionary search methods used in the previous approaches.
Evaluating the proposed method on public databases of distorted samples shows that it can significantly improve the matching performance of distorted samples.
The training of deep neural nets is expensive.
We present a predictor-corrector method for the training of deep neural nets.
It alternates a predictor pass with a corrector pass using stochastic gradient descent with backpropagation such that there is no loss in validation accuracy.
No special modifications to SGD with backpropagation are required by this methodology.
Our experiments showed a time improvement of 9% on the CIFAR-10 dataset.
Concolic testing combines program execution and symbolic analysis to explore the execution paths of a software program.
This paper presents the first concolic testing approach for Deep Neural Networks (DNNs).
More specifically, we formalise coverage criteria for DNNs that have been studied in the literature, and then develop a coherent method for performing concolic testing to increase test coverage.
Our experimental results show the effectiveness of the concolic testing approach in both achieving high coverage and finding adversarial examples.
Photometric stereo is a method that seeks to reconstruct the normal vectors of an object from a set of images of the object illuminated under different light sources.
While effective in some situations, classical photometric stereo relies on a diffuse surface model that cannot handle objects with complex reflectance patterns, and it is sensitive to non-idealities in the images.
In this work, we propose a novel approach to photometric stereo that relies on dictionary learning to produce robust normal vector reconstructions.
Specifically, we develop two formulations for applying dictionary learning to photometric stereo.
We propose a model that applies dictionary learning to regularize and reconstruct the normal vectors from the images under the classic Lambertian reflectance model.
We then generalize this model to explicitly model non-Lambertian objects.
We investigate both approaches through extensive experimentation on synthetic and real benchmark datasets and observe state-of-the-art performance compared to existing robust photometric stereo methods.
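The classical Lambertian baseline that the dictionary-learning formulations build on reduces to a least-squares solve per pixel; a minimal sketch with hypothetical light directions and a hypothetical ground-truth normal:

```python
import numpy as np

# Five known light directions (unit vectors) and one pixel whose unit
# normal and albedo we want to recover.
L = np.array([[0.0, 0.0, 1.0],
              [0.8, 0.0, 0.6],
              [-0.8, 0.0, 0.6],
              [0.0, 0.8, 0.6],
              [0.0, -0.8, 0.6]])
n_true = np.array([0.0, 0.6, 0.8])
albedo = 0.9

# Lambertian model: intensity i_k = albedo * (l_k . n), assuming the pixel
# is lit by every source (no shadows, no specularities).
I = albedo * L @ n_true

# Classical photometric stereo: solve L g = I in the least-squares sense;
# the albedo is ||g|| and the normal is g / ||g||.
g, *_ = np.linalg.lstsq(L, I, rcond=None)
albedo_hat = np.linalg.norm(g)
n_hat = g / albedo_hat
```

The dictionary-learning formulations replace this unregularized per-pixel solve with one that is robust to non-idealities and, in the generalized model, to non-Lambertian reflectance.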
Preconditioned gradient methods are among the most general and powerful tools in optimization.
However, preconditioning requires storing and manipulating prohibitively large matrices.
We describe and analyze a new structure-aware preconditioning algorithm, called Shampoo, for stochastic optimization over tensor spaces.
Shampoo maintains a set of preconditioning matrices, each of which operates on a single dimension, contracting over the remaining dimensions.
We establish convergence guarantees in the stochastic convex setting, the proof of which builds upon matrix trace inequalities.
Our experiments with state-of-the-art deep learning models show that Shampoo is capable of converging considerably faster than commonly used optimizers.
Although it involves a more complex update rule, Shampoo's runtime per step is comparable to that of simple gradient methods such as SGD, AdaGrad, and Adam.
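The per-dimension preconditioning can be sketched for a single matrix parameter (a toy quadratic objective with a hand-tuned learning rate; the inverse fourth roots follow the Shampoo update L_t^{-1/4} G_t R_t^{-1/4}):

```python
import numpy as np

def inv_fourth_root(M, eps=1e-4):
    """M^{-1/4} for a symmetric PSD matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag((w + eps) ** -0.25) @ V.T

def shampoo_step(W, G, Lpre, Rpre, lr=1.0):
    """One Shampoo update for a matrix parameter W with gradient G."""
    Lpre += G @ G.T   # left preconditioner statistics (row dimension)
    Rpre += G.T @ G   # right preconditioner statistics (column dimension)
    W = W - lr * inv_fourth_root(Lpre) @ G @ inv_fourth_root(Rpre)
    return W, Lpre, Rpre

# Toy problem: minimize 0.5 * ||W - T||_F^2, whose gradient is W - T.
T = np.array([[1.0, 2.0], [3.0, 4.0]])
W = np.zeros((2, 2))
Lpre, Rpre = np.zeros((2, 2)), np.zeros((2, 2))
for _ in range(100):
    G = W - T
    W, Lpre, Rpre = shampoo_step(W, G, Lpre, Rpre)
```

Each preconditioner only has the size of a single tensor dimension, which is what avoids storing the prohibitively large full-matrix preconditioner.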
The structure, functionality, parameters, and organization of the computing Grid in Poland are described, mainly from the perspective of the high-energy particle physics community, currently its largest consumer and developer.
It represents distributed Tier-2 in the worldwide Grid infrastructure.
It also provides services and resources for data-intensive applications in other sciences.
We present a method to incorporate global orientation information from the sun into a visual odometry pipeline using only the existing image stream, where the sun is typically not visible.
We leverage recent advances in Bayesian Convolutional Neural Networks to train and implement a sun detection model that infers a three-dimensional sun direction vector from a single RGB image.
Crucially, our method also computes a principled uncertainty associated with each prediction, using a Monte Carlo dropout scheme.
We incorporate this uncertainty into a sliding window stereo visual odometry pipeline where accurate uncertainty estimates are critical for optimal data fusion.
Our Bayesian sun detection model achieves a median error of approximately 12 degrees on the KITTI odometry benchmark training set, and yields improvements of up to 42% in translational ARMSE and 32% in rotational ARMSE compared to standard VO.
An open source implementation of our Bayesian CNN sun estimator (Sun-BCNN) using Caffe is available at https://github.com/utiasSTARS/sun-bcnn-vo.
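The Monte Carlo dropout scheme can be sketched with a toy two-layer network standing in for the Bayesian CNN (the weights and input below are hypothetical): dropout stays active at test time, and the spread of repeated stochastic forward passes gives the uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(x, W1, W2, n_samples=200, p_drop=0.5):
    """Mean and std of predictions under random dropout masks."""
    preds = []
    for _ in range(n_samples):
        mask = rng.random(W1.shape[1]) >= p_drop         # drop hidden units
        h = np.maximum(x @ W1 * mask / (1 - p_drop), 0)  # ReLU hidden layer
        preds.append(h @ W2)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

W1 = 0.5 * rng.standard_normal((4, 32))
W2 = 0.5 * rng.standard_normal((32, 3))  # 3 outputs ~ sun direction vector
x = rng.standard_normal(4)
mean, std = mc_dropout_predict(x, W1, W2)
```

The per-output standard deviation is the principled uncertainty that the visual odometry pipeline then uses to weight each sun-direction measurement during data fusion.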
In this paper, we propose a novel technique for cache replacement in ad-hoc networks based on the mining of association rules.
Polar codes are the first class of error correcting codes that provably achieve the channel capacity at infinite code length.
They were selected for use in the fifth generation of cellular mobile communications (5G).
In practical scenarios such as 5G, a cyclic redundancy check (CRC) is concatenated with polar codes to improve their finite length performance.
This is mostly beneficial for sequential successive-cancellation list decoders.
However, for parallel iterative belief propagation (BP) decoders, CRC is only used as an early stopping criterion with incremental error-correction performance improvement.
In this paper, we first propose a CRC-polar BP (CPBP) decoder by exchanging the extrinsic information between the factor graph of the polar code and that of the CRC.
We then propose a neural CPBP (NCPBP) algorithm which improves the CPBP decoder by introducing trainable normalizing weights on the concatenated factor graph.
Our results on a 5G polar code of length 128 show that at the frame error rate of 10^(-5) and with a maximum of 30 iterations, the error-correction performance of CPBP and NCPBP are approximately 0.25 dB and 0.5 dB better than that of the conventional CRC-aided BP decoder, respectively, while introducing almost no latency overhead.
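The CRC concatenation itself is plain GF(2) polynomial division; a toy CRC-4 example for illustration (5G polar codes use longer CRCs, e.g. 11 or 24 bits):

```python
def poly_remainder(bits, poly):
    """Remainder of GF(2) polynomial division of `bits` by `poly`."""
    bits = list(bits)
    for i in range(len(bits) - len(poly) + 1):
        if bits[i]:
            for j, p in enumerate(poly):
                bits[i + j] ^= p
    return bits[-(len(poly) - 1):]

poly = [1, 0, 0, 1, 1]                          # CRC-4: x^4 + x + 1
msg = [1, 0, 1, 1, 0, 0, 1, 0]
crc = poly_remainder(msg + [0, 0, 0, 0], poly)  # shift left by 4, divide
codeword = msg + crc
# Any valid codeword leaves a zero remainder; a single bit flip does not,
# which is what a decoder's CRC check (or stopping criterion) exploits.
```

In the CPBP decoder, these parity constraints are represented as a factor graph and exchange extrinsic information with the polar code's factor graph instead of only serving as a final check.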
Learning representations of data, and in particular learning features for a subsequent prediction task, has been a fruitful area of research delivering impressive empirical results in recent years.
However, relatively little is understood about what makes a representation `good'.
We propose the idea of a risk gap induced by representation learning for a given prediction context, which measures the difference in the risk of some learner using the learned features as compared to the original inputs.
We describe a set of sufficient conditions for unsupervised representation learning to provide a benefit, as measured by this risk gap.
These conditions decompose the problem of when representation learning works into its constituent parts, which can be separately evaluated using an unlabeled sample, suitable domain-specific assumptions about the joint distribution, and analysis of the feature learner and subsequent supervised learner.
We provide two examples of such conditions in the context of specific properties of the unlabeled distribution, namely when the data lies close to a low-dimensional manifold and when it forms clusters.
We compare our approach to a recently proposed analysis of semi-supervised learning.
A simple model of MNIST handwritten digit recognition is presented here.
The model is an adaptation of a previous theory of face recognition.
It realizes translation and rotation invariance in a principled way instead of being based on extensive learning from large masses of sample data.
The presented recognition rates fall short of other publications, but due to its inspectability and conceptual and numerical simplicity, our system commends itself as a basis for further development.
A key problem of robotic environmental sensing and monitoring is that of active sensing: How can a team of robots plan the most informative observation paths to minimize the uncertainty in modeling and predicting an environmental phenomenon?
This paper presents two principled approaches to efficient information-theoretic path planning based on entropy and mutual information criteria for in situ active sensing of an important broad class of widely-occurring environmental phenomena called anisotropic fields.
Our proposed algorithms are novel in addressing a trade-off between active sensing performance and time efficiency.
An important practical consequence is that our algorithms can exploit the spatial correlation structure of Gaussian process-based anisotropic fields to improve time efficiency while preserving near-optimal active sensing performance.
We analyze the time complexity of our algorithms and prove analytically that they scale better than state-of-the-art algorithms with increasing planning horizon length.
We provide theoretical guarantees on the active sensing performance of our algorithms for a class of exploration tasks called transect sampling, which, in particular, can be improved with longer planning time and/or lower spatial correlation along the transect.
Empirical evaluation on real-world anisotropic field data shows that our algorithms can perform better or at least as well as the state-of-the-art algorithms while often incurring a few orders of magnitude less computational time, even when the field conditions are less favorable.
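The entropy criterion at the heart of such planning can be sketched numerically: for a Gaussian process model, the informativeness of a candidate observation path is the differential entropy of the field values at its sampling locations. A minimal sketch follows; the squared-exponential kernel, the lengthscales, and the test points are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def rbf_kernel(X, Y, lengthscales):
    # Anisotropic squared-exponential kernel with per-dimension lengthscales
    d = (X[:, None, :] - Y[None, :, :]) / lengthscales
    return np.exp(-0.5 * np.sum(d * d, axis=-1))

def path_entropy(path_points, lengthscales, noise=1e-6):
    # Differential entropy of the GP prior restricted to the path's
    # observation locations: H = 0.5 * log((2*pi*e)^k * det(K))
    k = len(path_points)
    K = rbf_kernel(path_points, path_points, lengthscales) + noise * np.eye(k)
    sign, logdet = np.linalg.slogdet(K)
    return 0.5 * (k * np.log(2 * np.pi * np.e) + logdet)

# Anisotropic lengthscales: correlation decays faster along y than along x
ls = np.array([2.0, 0.5])
pts_spread = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
pts_close = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
# Spread-out observations are less correlated, hence more informative
assert path_entropy(pts_spread, ls) > path_entropy(pts_close, ls)
```

An entropy-based planner would score candidate paths this way and prefer the higher-entropy one; exploiting the correlation structure is what lets the paper's algorithms prune such evaluations.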
This paper presents SgxPectre Attacks that exploit the recently disclosed CPU bugs to subvert the confidentiality and integrity of SGX enclaves.
Particularly, we show that when branch prediction of the enclave code can be influenced by programs outside the enclave, the control flow of the enclave program can be temporarily altered to execute instructions that lead to observable cache-state changes.
An adversary observing such changes can learn secrets inside the enclave memory or its internal registers, thus completely defeating the confidentiality guarantee offered by SGX.
To demonstrate the practicality of our SgxPectre Attacks, we have systematically explored the possible attack vectors of branch target injection, approaches to win the race condition during enclave's speculative execution, and techniques to automatically search for code patterns required for launching the attacks.
Our study suggests that any enclave program could be vulnerable to SgxPectre Attacks since the desired code patterns are available in most SGX runtimes (e.g., Intel SGX SDK, Rust-SGX, and Graphene-SGX).
Most importantly, we have applied SgxPectre Attacks to steal seal keys and attestation keys from Intel signed quoting enclaves.
The seal key can be used to decrypt sealed storage outside the enclaves and forge valid sealed data; the attestation key can be used to forge attestation signatures.
For these reasons, SgxPectre Attacks practically defeat SGX's security protection.
This paper also systematically evaluates Intel's existing countermeasures against SgxPectre Attacks and discusses the security implications.
Different from traditional action recognition based on video segments, online action recognition aims to recognize actions from unsegmented streams of data in a continuous manner.
One way to perform online recognition is to accumulate evidence over time in order to make predictions from streaming video.
This paper presents a fast yet effective method to recognize actions from a stream of noisy skeleton data, in which a novel weighted covariance descriptor is adopted to accumulate evidence.
In particular, a fast incremental updating method for the weighted covariance descriptor is developed for accumulation of temporal information and online prediction.
The weighted covariance descriptor is built on the following principles: frames from the distant past contribute less to recognition, while recent and informative frames, such as key frames, contribute more.
The online recognition is achieved using a simple nearest neighbor search against a set of offline trained action models.
Experimental results on the MSRC-12 Kinect Gesture dataset and our newly constructed online action recognition dataset demonstrate the efficacy of the proposed method.
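The incremental update of a weighted covariance descriptor can be sketched with a standard weighted form of Welford's algorithm; the linearly increasing frame weights below are an illustrative assumption standing in for the paper's weighting scheme:

```python
import numpy as np

class WeightedCovarianceDescriptor:
    """Incrementally maintained weighted mean and covariance over frames
    (hypothetical sketch: weights grow with frame index so recent frames
    contribute more)."""
    def __init__(self, dim):
        self.W = 0.0                   # total weight so far
        self.mean = np.zeros(dim)      # running weighted mean
        self.M = np.zeros((dim, dim))  # running weighted scatter matrix

    def update(self, x, w):
        # Standard incremental (West-style) weighted mean/scatter update
        self.W += w
        delta = x - self.mean
        self.mean += (w / self.W) * delta
        self.M += w * np.outer(delta, x - self.mean)

    @property
    def cov(self):
        return self.M / self.W

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 3))
desc = WeightedCovarianceDescriptor(3)
for t, f in enumerate(frames):
    desc.update(f, w=t + 1.0)          # linearly increasing weight

# Batch weighted covariance for comparison
w = np.arange(1, 51, dtype=float)
mu = (w[:, None] * frames).sum(0) / w.sum()
C = (w[:, None, None] * np.einsum('ni,nj->nij', frames - mu, frames - mu)).sum(0) / w.sum()
assert np.allclose(desc.cov, C)
```

The incremental form avoids recomputing the descriptor from scratch for every new frame, which is what makes the online prediction fast.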
Learning semantic attributes for person re-identification and description-based person search has gained increasing interest due to attributes' great potential as a pose and view-invariant representation.
However, existing attribute-centric approaches have thus far underperformed state-of-the-art conventional approaches.
This is due to their non-scalable need for extensive domain (camera) specific annotation.
In this paper we present a new semantic attribute learning approach for person re-identification and search.
Our model is trained on existing fashion photography datasets -- either weakly or strongly labelled.
It can then be transferred and adapted to provide a powerful semantic description of surveillance person detections, without requiring any surveillance domain supervision.
The resulting representation is useful for both unsupervised and supervised person re-identification, achieving state-of-the-art and near state-of-the-art performance respectively.
Furthermore, as a semantic representation it allows description-based person search to be integrated within the same framework.
Small wind projects encounter difficulties in being deployed efficiently, partly because data and information are managed in the wrong way.
Ontologies can overcome the drawbacks of partially available, noisy, inconsistent, and heterogeneous data sources, by providing a semantic middleware between low level data and more general knowledge.
In this paper, we engineer an ontology for the wind energy domain using description logic as technical instrumentation.
We aim to integrate a corpus of heterogeneous knowledge, both digital and human, in order to help interested users speed up the initialization of a small-scale wind project.
We exemplify one use case scenario of our ontology, which consists of automatically checking whether a planned wind project complies with the active regulations.
Field failures, that is, failures caused by faults that escape the testing phase and manifest themselves in the field, are unavoidable.
Improving verification and validation activities before deployment can identify and promptly remove many, but not all, faults, and users may still experience a number of annoying problems while using their software systems.
This paper investigates the nature of field failures, to understand to what extent further improving in-house verification and validation activities can reduce the number of failures in the field, and frames the need for new approaches that operate in the field.
We report the results of the analysis of the bug reports of five applications belonging to three different ecosystems, propose a taxonomy of field failures, and discuss the reasons why failures belonging to the identified classes cannot be detected at design time but shall be addressed at runtime.
We observe that many faults (70%) are intrinsically hard to detect at design-time.
We present a method for generating robust chaos.
It is based on a search algorithm for weak symmetry violation in the reconstructed attractor.
On this basis, smooth functions are constructed in the form of a system of finite-difference equations.
To ensure robust chaos, a piecewise-continuous member is introduced into the generator.
Simulation results are reported.
The problem of time-constrained multi-agent task scheduling and control synthesis is addressed.
We assume the existence of a high level plan which consists of a sequence of cooperative tasks, each of which is associated with a deadline and several Quality-of-Service levels.
By taking into account the reward and cost of satisfying each task, a novel scheduling problem is formulated and a path synthesis algorithm is proposed.
Based on the obtained plan, a distributed hybrid control law is further designed for each agent.
Under the condition that only a subset of the agents are aware of the high level plan, it is shown that the proposed controller guarantees the satisfaction of time constraints for each task.
A simulation example is given to verify the theoretical results.
Femtocells have been considered by the wireless industry as a cost-effective solution not only to improve indoor service providing, but also to unload traffic from already overburdened macro networks.
Due to spectrum availability and network infrastructure considerations, a macro network may have to share spectrum with overlaid femtocells.
In spectrum-sharing macro and femto networks, inter-cell interference caused by different transmission powers of macrocell base stations (MBS) and femtocell access points (FAP), in conjunction with potentially densely deployed femtocells, may create dead spots where reliable services cannot be guaranteed to either macro or femto users.
In this paper, based on a thorough analysis of downlink (DL) outage probabilities (OP) of collocated spectrum-sharing orthogonal frequency division multiple access (OFDMA) based macro and femto networks, we devise a decentralized strategy for an FAP to self-regulate its transmission power level and usage of radio resources depending on its distance from the closest MBS.
Simulation results show that the derived closed-form lower bounds of DL OPs are tight, and the proposed decentralized femtocell self-regulation strategy is able to guarantee reliable DL services in targeted macro and femto service areas while providing superior spatial reuse, for even a large number of spectrum-sharing femtocells deployed per cell site.
Querying graph databases has recently received much attention.
We propose a new approach to this problem, which balances competing goals of expressive power, language clarity and computational complexity.
A distinctive feature of our approach is the ability to express properties of minimal (e.g. shortest) and maximal (e.g. most valuable) paths satisfying given criteria.
To express complex properties in a modular way, we introduce labelling-generating ontologies.
The resulting formalism is computationally attractive -- queries can be answered in non-deterministic logarithmic space in the size of the database.
Camouflaging data by generating fake information is a well-known obfuscation technique for protecting data privacy.
In this paper, we focus on a very sensitive and increasingly exposed type of data: location data.
There are two main scenarios in which fake traces are of extreme value to preserve location privacy: publishing datasets of location trajectories, and using location-based services.
Despite advances in protecting (location) data privacy, there is no quantitative method to evaluate how realistic a synthetic trace is, and how much utility and privacy it provides in each scenario.
Also, the lack of a methodology to generate privacy-preserving fake traces is evident.
In this paper, we fill this gap and propose the first statistical metric and model to generate fake location traces such that both the utility of data and the privacy of users are preserved.
We build upon the fact that, although geographically they visit distinct locations, people have strongly semantically similar mobility patterns, for example, their transition pattern across activities (e.g., working, driving, staying at home) is similar.
We define a statistical metric and propose an algorithm that automatically discovers the hidden semantic similarities between locations from a bag of real location traces as seeds, without requiring any initial semantic annotations.
We guarantee that fake traces are geographically dissimilar to their seeds, so they do not leak sensitive location information.
We also protect contributors to seed traces against membership attacks.
Interleaving fake traces with mobile users' traces is a prominent location privacy defense mechanism.
We quantitatively show the effectiveness of our methodology in protecting against localization inference attacks while preserving utility of sharing/publishing traces.
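The intuition that transition patterns across activities are similar even when the visited locations differ can be illustrated with a small sketch; the fixed activity alphabet and the total-variation similarity below are assumptions for illustration only (the paper discovers semantic similarities without any initial annotations):

```python
import numpy as np

ACTS = ["home", "work", "drive", "shop"]

def transition_matrix(trace):
    # Row-normalized counts of activity -> next-activity transitions
    idx = {a: i for i, a in enumerate(ACTS)}
    T = np.zeros((len(ACTS), len(ACTS)))
    for a, b in zip(trace, trace[1:]):
        T[idx[a], idx[b]] += 1
    rs = T.sum(1, keepdims=True)
    return np.divide(T, rs, out=np.zeros_like(T), where=rs > 0)

def similarity(t1, t2):
    # 1 minus the mean total-variation distance between transition rows
    A, B = transition_matrix(t1), transition_matrix(t2)
    return 1.0 - 0.5 * np.abs(A - B).sum(1).mean()

# Two users with identical semantic routines, one with a different routine
u1 = ["home", "drive", "work", "drive", "home", "shop", "home"]
u2 = ["home", "drive", "work", "drive", "home", "shop", "home"]
u3 = ["shop", "shop", "shop", "work", "home", "work", "work"]
assert similarity(u1, u2) == 1.0
assert similarity(u1, u2) > similarity(u1, u3)
```

A fake-trace generator can then require high similarity at this semantic level while forcing geographic dissimilarity to the seed traces.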
The ability to model search in a constraint solver can be an essential asset for solving combinatorial problems.
However, existing infrastructure for defining search heuristics is often inadequate.
Either modeling capabilities are extremely limited or users are faced with a general-purpose programming language whose features are not tailored towards writing search heuristics.
As a result, major improvements in performance may remain unexplored.
This article introduces search combinators, a lightweight and solver-independent method that bridges the gap between a conceptually simple modeling language for search (high-level, functional and naturally compositional) and an efficient implementation (low-level, imperative and highly non-modular).
By allowing the user to define application-tailored search strategies from a small set of primitives, search combinators effectively provide a rich domain-specific language (DSL) for modeling search to the user.
Remarkably, this DSL comes at a low implementation cost to the developer of a constraint solver.
The article discusses two modular implementation approaches and shows, by empirical evaluation, that search combinators can be implemented without overhead compared to a native, direct implementation in a constraint solver.
A network covert channel is created that uses resource names such as addresses to convey information, and that approximates typical user behavior in order to blend in with its environment.
The channel correlates available resource names with a user defined code-space, and transmits its covert message by selectively accessing resources associated with the message codes.
In this paper we focus on an implementation of the channel using the Hypertext Transfer Protocol (HTTP) with Uniform Resource Locators (URLs) as the message names, though the system can be used in conjunction with a variety of protocols.
The covert channel does not modify expected protocol structure as might be detected by simple inspection, and our HTTP implementation emulates transaction level web user behavior in order to avoid detection by statistical or behavioral analysis.
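The core encoding idea, correlating available resource names with a code space and transmitting by selectively accessing resources, can be sketched as follows; the sorted, in-order codebook rule is a hypothetical stand-in for the user-defined code space described above:

```python
def build_codebook(urls, bits_per_access=2):
    """Correlate the available resource names with a code space: both
    endpoints sort the names and assign code words in order (a
    deterministic, hypothetical stand-in for a user-defined code space)."""
    ordered = sorted(urls)
    n_codes = 1 << bits_per_access
    assert len(ordered) >= n_codes, "need a resource name per code word"
    return {code: ordered[code] for code in range(n_codes)}

def encode(bits, book, bits_per_access=2):
    # Transmit the covert message by accessing, for each bit group,
    # the resource whose code word matches it
    return [book[int(bits[i:i + bits_per_access], 2)]
            for i in range(0, len(bits), bits_per_access)]

def decode(accesses, book, bits_per_access=2):
    rev = {url: code for code, url in book.items()}
    return "".join(format(rev[u], f"0{bits_per_access}b") for u in accesses)

urls = [f"http://example.com/page{i}" for i in range(40)]
book = build_codebook(urls)
msg = "1001110010"
assert decode(encode(msg, book), book) == msg
```

Because the transmitted traffic consists only of ordinary accesses to existing resources, nothing in the protocol structure itself is modified; blending in then hinges on access timing and ordering mimicking real users.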
We calculate asymptotic expansions for the moments of number of comparisons used by the randomized quick sort algorithm using the singularity analysis of certain generating functions.
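The first moment being expanded has the well-known closed form 2(n+1)H_n - 4n; a quick simulation that counts comparisons of a randomized quicksort illustrates it (the implementation details below are generic, not taken from the paper):

```python
import random

def quicksort_comparisons(a):
    """Count the key comparisons made by randomized quicksort
    (assumes distinct keys; each partition costs len(a) - 1 comparisons)."""
    if len(a) <= 1:
        return 0
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    greater = [x for x in a if x > pivot]
    return len(a) - 1 + quicksort_comparisons(less) + quicksort_comparisons(greater)

def expected_comparisons(n):
    # Exact first moment: 2(n+1)H_n - 4n, with H_n the n-th harmonic number
    H = sum(1.0 / k for k in range(1, n + 1))
    return 2 * (n + 1) * H - 4 * n

random.seed(0)
n, trials = 200, 2000
mean = sum(quicksort_comparisons(random.sample(range(10 ** 6), n))
           for _ in range(trials)) / trials
# The empirical mean should land close to the exact expectation
assert abs(mean - expected_comparisons(n)) / expected_comparisons(n) < 0.05
```

Singularity analysis recovers the asymptotics of this expectation, and of the higher moments, directly from the generating functions rather than by simulation.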
We propose graph-based predictable feature analysis (GPFA), a new method for unsupervised learning of predictable features from high-dimensional time series, where high predictability is understood very generically as low variance in the distribution of the next data point given the previous ones.
We show how this measure of predictability can be understood in terms of graph embedding as well as how it relates to the information-theoretic measure of predictive information in special cases.
We confirm the effectiveness of GPFA on different datasets, comparing it to three existing algorithms with similar objectives---namely slow feature analysis, forecastable component analysis, and predictable feature analysis---to which GPFA shows very competitive results.
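GPFA's generic notion of predictability, low variance of the next data point given the previous ones, can be illustrated with a one-dimensional sketch using a first-order linear predictor (an assumption for illustration; GPFA itself operates on high-dimensional series via graph embedding):

```python
import numpy as np

def predictability(z):
    """Variance of the next value given the previous one, estimated as
    the residual variance of a one-step linear predictor.
    Lower values mean a more predictable signal."""
    past, future = z[:-1], z[1:]
    w = np.dot(past, future) / np.dot(past, past)
    return (future - w * past).var()

rng = np.random.default_rng(1)
n = 5000
# A slowly mixing AR(1) signal is highly predictable ...
ar = np.empty(n)
ar[0] = 0.0
eps = rng.normal(scale=0.1, size=n)
for t in range(1, n):
    ar[t] = 0.99 * ar[t - 1] + eps[t]
# ... while white noise of comparable variance is not
noise = rng.normal(scale=ar.std(), size=n)
assert predictability(ar) < predictability(noise)
```

A feature-extraction method with this objective searches for projections of the high-dimensional data whose residual variance under such prediction is minimal.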
Neural models have become ubiquitous in automatic speech recognition systems.
While neural networks are typically used as acoustic models in more complex systems, recent studies have explored end-to-end speech recognition systems based on neural networks, which can be trained to directly predict text from input acoustic features.
Although such systems are conceptually elegant and simpler than traditional systems, it is less obvious how to interpret the trained models.
In this work, we analyze the speech representations learned by a deep end-to-end model that is based on convolutional and recurrent layers, and trained with a connectionist temporal classification (CTC) loss.
We use a pre-trained model to generate frame-level features which are given to a classifier that is trained on frame classification into phones.
We evaluate representations from different layers of the deep model and compare their quality for predicting phone labels.
Our experiments shed light on important aspects of the end-to-end model such as layer depth, model complexity, and other design choices.
Recent work has suggested reducing electricity generation cost by cutting the peak to average ratio (PAR) without reducing the total amount of the loads.
However, most of these proposals rely on consumers' willingness to act.
In this paper, we propose an approach to cut PAR explicitly from the supply side.
The resulting cut loads are then distributed among consumers by means of a multiunit auction, which is conducted by an intelligent agent on behalf of the consumer.
This approach is also in line with the future vision of the smart grid to have the demand side matched with the supply side.
Experiments suggest that our approach reduces overall system cost and gives benefit to both consumers and the energy provider.
This paper discusses opportunities to parallelize graph based path planning algorithms in a time varying environment.
Parallel architectures have become commonplace, requiring algorithms to be parallelized for efficient execution.
An additional focal point of this paper is the inclusion of inaccuracies in path planning as a result of forecast error variance, accuracy of calculation in the cost functions and a different observed vehicle speed in the real mission than planned.
In this context, robust path planning algorithms will be described.
These algorithms are equally applicable to land based, aerial, or underwater mobile autonomous systems.
The results presented here provide the basis for a future research project in which the parallelized algorithms will be evaluated on multi- and many-core systems such as the dual-core ARM Panda board and the 48-core Single-chip Cloud Computer (SCC).
Modern multi- and many-core processors support a wide range of performance vs. energy tradeoffs that can be exploited in energy-constrained environments such as battery-operated autonomous underwater vehicles.
For this evaluation, the boards will be deployed within the Slocum glider, a commercially available, buoyancy driven autonomous underwater vehicle (AUV).
This manuscript uses machine learning techniques to exploit baseball pitchers' decision making, so-called "Baseball IQ," by modeling the at-bat information, pitch selection and counts, as a Markov Decision Process (MDP).
Each state of the MDP models the pitcher's current pitch selection in a Markovian fashion, conditional on the information immediately prior to making the current pitch.
This includes the count prior to the previous pitch, his ensuing pitch selection, the batter's ensuing action and the result of the pitch.
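Estimating such a Markovian pitch-selection model from data reduces to counting state transitions; a minimal sketch with hypothetical (count, pitch-type) states follows (the paper's state encoding, which also folds in the batter's action and the pitch result, is richer):

```python
from collections import Counter, defaultdict

def estimate_transitions(at_bats):
    """Maximum-likelihood transition probabilities of a Markov chain
    over (count, pitch_type) states, estimated from pitch sequences."""
    counts = defaultdict(Counter)
    for seq in at_bats:
        for s, s_next in zip(seq, seq[1:]):
            counts[s][s_next] += 1
    return {s: {t: n / sum(c.values()) for t, n in c.items()}
            for s, c in counts.items()}

# States are (balls-strikes count, pitch thrown) pairs
at_bats = [
    [("0-0", "fastball"), ("0-1", "slider"), ("1-1", "fastball")],
    [("0-0", "fastball"), ("0-1", "slider"), ("0-2", "curve")],
    [("0-0", "curve"), ("1-0", "fastball")],
]
P = estimate_transitions(at_bats)
assert P[("0-0", "fastball")][("0-1", "slider")] == 1.0
assert abs(sum(P[("0-1", "slider")].values()) - 1.0) < 1e-12
```

Row distributions of this matrix then characterize a pitcher's tendencies in each game situation.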
Intent classification has been widely researched on English data with deep learning approaches that are based on neural networks and word embeddings.
The challenge for Chinese intent classification stems from the fact that, unlike English where most words are made up of 26 phonologic alphabet letters, Chinese is logographic, where a Chinese character is a more basic semantic unit that can be informative and its meaning does not vary too much in contexts.
Chinese word embeddings alone can be inadequate for representing words, and pre-trained embeddings can suffer from not aligning well with the task at hand.
To account for the inadequacy and leverage Chinese character information, we propose a low-effort and generic way to dynamically integrate character embedding based feature maps with word embedding based inputs, whose resulting word-character embeddings are stacked with a contextual information extraction module to further incorporate context information for predictions.
On top of the proposed model, we employ an ensemble method to combine single models and obtain the final result.
The approach is data-independent without relying on external sources like pre-trained word embeddings.
The proposed model outperforms baseline models and existing methods.
Multispectral (MS) image panchromatic (PAN) sharpening algorithms proposed to the remote sensing community are ever increasing in number and variety.
Their aim is to sharpen a coarse spatial resolution MS image with a fine spatial resolution PAN image acquired simultaneously by a spaceborne or airborne Earth observation (EO) optical imaging sensor pair.
Unfortunately, to date, no community-agreed standard procedure for evaluating the MS image PAN sharpening outcome and process exists, in contrast with the Quality Assurance Framework for Earth Observation (QA4EO) guidelines proposed by the intergovernmental Group on Earth Observations (GEO).
In general, process is easier to measure, but outcome is more important.
The original contribution of the present study is fourfold.
First, existing procedures for quantitative quality assessment (Q2A) of the (sole) PAN sharpened MS product are critically reviewed.
Their conceptual and implementation drawbacks are highlighted to be overcome for quality improvement.
Second, a novel (to the best of these authors' knowledge, the first) protocol for Q2A of MS image PAN sharpening product and process is designed, implemented and validated by independent means.
Third, within this protocol, an innovative categorization of spectral and spatial image quality indicators and metrics is presented.
Fourth, according to this new taxonomy, an original third-order isotropic multi-scale gray-level co-occurrence matrix (TIMS-GLCM) calculator and a TIMS-GLCM texture feature extractor are proposed to replace popular second-order GLCMs.
Visual recognition of material boundaries in transparent vessels is valuable for numerous applications.
Such recognition is essential for estimation of fill-level, volume and phase-boundaries as well as for tracking of such chemical processes as precipitation, crystallization, condensation, evaporation and phase-separation.
The problem of material boundary recognition in images is particularly complex for materials with non-flat surfaces, i.e., solids, powders and viscous fluids, in which the material interfaces have unpredictable shapes.
This work demonstrates a general method for finding the boundaries of materials inside transparent containers in images.
The method uses an image of the transparent vessel containing the material and the boundary of the vessel in this image.
The recognition is based on the assumption that the material boundary appears in the image in the form of a curve (with various constraints) whose endpoints are both positioned on the vessel contour.
The probability that a curve matches the material boundary in the image is evaluated using a cost function based on some image properties along this curve.
Several image properties were examined as indicators for the material boundary.
The optimal boundary curve was found using Dijkstra's algorithm.
The method was successfully examined for recognition of various types of phase-boundaries, including liquid-air, solid-air and solid-liquid interfaces, as well as for various types of glassware containers from everyday life and the chemistry laboratory (i.e., bottles, beakers, flasks, jars, columns, vials and separation-funnels).
In addition, the method can be easily extended to materials carried on top of carrier vessels (i.e., plates, spoons, spatulas).
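The search for the optimal boundary curve can be sketched as Dijkstra's algorithm over the pixel grid, with a per-pixel cost standing in for the image-property-based cost function along the curve (the toy cost map below is purely illustrative):

```python
import heapq

def cheapest_curve(cost, start, goal):
    """Dijkstra's algorithm over a pixel grid: the returned path
    minimizes the summed per-pixel cost between the two endpoints."""
    rows, cols = len(cost), len(cost[0])
    dist = {start: cost[start[0]][start[1]]}
    prev, pq = {}, [(dist[start], start)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if (r, c) == goal:
            break
        if d > dist[(r, c)]:
            continue  # stale queue entry
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(pq, (nd, (nr, nc)))
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1]

# Toy "image": low cost (strong boundary response) along row 1,
# with the endpoints playing the role of points on the vessel contour
cost = [[9, 9, 9, 9],
        [1, 1, 1, 1],
        [9, 9, 9, 9]]
path = cheapest_curve(cost, (1, 0), (1, 3))
assert path == [(1, 0), (1, 1), (1, 2), (1, 3)]
```

In the real method the per-pixel cost is derived from the examined image properties, and the endpoints are constrained to lie on the vessel contour.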
Pose-Graph optimization is a crucial component of many modern SLAM systems.
Most prominent state-of-the-art systems address this problem by iterative non-linear least squares.
Both number of iterations and convergence basin of these approaches depend on the error functions used to describe the problem.
The smoother and more convex the error function with respect to perturbations of the state variables, the better the least-squares solver will perform.
In this paper we propose an alternative error function obtained by removing some non-linearities from the standard one, i.e., the geodesic error function.
Comparative experiments conducted on common benchmarking datasets confirm that our function is more robust to noise that affects the rotational component of the pose measurements and, thus, exhibits a larger convergence basin than the geodesic.
Furthermore, its implementation is relatively easy compared to the geodesic distance.
This property leads to rather simple derivatives and nice numerical properties of the Jacobians resulting from the effective computation of the quadratic approximation used by Gauss-Newton algorithm.
We study the question of reconstructing a weighted, directed network up to isomorphism from its motifs.
In order to tackle this question we first relax the usual (strong) notion of graph isomorphism to obtain a relaxation that we call weak isomorphism.
Then we identify a definition of distance on the space of all networks that is compatible with weak isomorphism.
This global approach comes equipped with notions such as completeness, compactness, curves, and geodesics, which we explore throughout this paper.
Furthermore, it admits global-to-local inference in the following sense: we prove that two networks are weakly isomorphic if and only if all their motif sets are identical, thus answering the network reconstruction question.
Further exploiting the additional structure imposed by our network distance, we prove that two networks are weakly isomorphic if and only if certain essential associated structures---the skeleta of the respective networks---are strongly isomorphic.
In computer science, divide and conquer (D&C) is an algorithm design paradigm based on multi-branched recursion.
A D&C algorithm works by recursively and monotonically breaking down a problem into subproblems of the same (or a related) type, until these become simple enough to be solved directly.
The solutions to the subproblems are then combined to give a solution to the original problem.
The present work identifies D&C algorithms assumed within contemporary syntactic theory, and discusses the limits of their applicability in the realms of the syntax-semantics and syntax-morphophonology interfaces.
We will propose that D&C algorithms, while valid for some processes, fall short on flexibility given a mixed approach to the structure of linguistic phrase markers.
Arguments in favour of a computationally mixed approach to linguistic structure will be presented as an alternative that offers advantages to uniform D&C approaches.
We propose a stepsize adaptation scheme for stochastic gradient descent.
It operates directly with the loss function and rescales the gradient in order to make fixed predicted progress on the loss.
We demonstrate its capabilities by conclusively improving the performance of Adam and Momentum optimizers.
The enhanced optimizers with default hyperparameters consistently outperform their constant stepsize counterparts, even the best ones, without a measurable increase in computational cost.
The performance is validated on multiple architectures including dense nets, CNNs, ResNets, and the recurrent Differential Neural Computer on classical datasets MNIST, fashion MNIST, CIFAR10 and others.
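The idea of rescaling the gradient for fixed predicted progress can be sketched from the first-order model: a step -eta*g predicts a loss decrease of eta*||g||^2, so choosing eta = target/||g||^2 fixes that prediction (a simplified reading of the idea, not the authors' exact rule):

```python
import numpy as np

def rescaled_step(grad, target_progress=1e-2, eps=1e-12):
    """Scale the gradient so the first-order model predicts a fixed
    loss decrease: with step -eta*g the predicted progress is
    eta*||g||^2, hence eta = target / ||g||^2."""
    eta = target_progress / (np.dot(grad, grad) + eps)
    return -eta * grad

# Quadratic loss L(x) = 0.5*||x||^2, so the gradient is g = x
x = np.array([3.0, 4.0])
step = rescaled_step(x, target_progress=0.5)
predicted = -np.dot(x, step)   # first-order predicted decrease, g . (-step)
assert abs(predicted - 0.5) < 1e-9
```

Plugged in front of Adam or Momentum, such a rule replaces the fixed learning rate with one that adapts to the local gradient magnitude.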
This article is devoted to the stabilization of two underactuated planar systems, the well-known straight beam-and-ball system and an original circular beam-and-ball system.
The feedback control for each system is designed, using the Jordan form of its model, linearized near the unstable equilibrium.
The limits on the voltage fed to the motor are taken into account explicitly.
The straight beam-and-ball system has one unstable mode in the motion near the equilibrium point.
The proposed control law ensures that the basin of attraction coincides with the controllability domain.
The circular beam-and-ball system has two unstable modes near the equilibrium point.
Therefore, this device, never considered in the past, is much more difficult to control than the straight beam-and-ball system.
The main contribution is to propose a simple new control law which, by adjusting its gain parameters, ensures that the basin of attraction can approach the controllability domain arbitrarily closely in the linear case.
For both nonlinear systems, simulation results are presented to illustrate the efficiency of the designed nonlinear control laws and to determine the basin of attraction.
The possibility of flexibly assigning spectrum resources with channels of different sizes greatly improves the spectral efficiency of optical networks, but can also lead to unwanted spectrum fragmentation. We study this problem in a scenario where traffic demands are categorised in two types (low or high bit-rate) by assessing the performance of three allocation policies.
Our first contribution consists of exact Markov chain models for these allocation policies, which allow us to numerically compute the relevant performance measures.
However, these exact models do not scale to large systems, in the sense that the computations required to determine the blocking probabilities---which measure the performance of the allocation policies---become intractable.
In order to address this, we first extend an approximate reduced-state Markov chain model that is available in the literature to the three considered allocation policies.
These reduced-state Markov chain models allow us to tractably compute approximations of the blocking probabilities, but the accuracy of these approximations cannot be easily verified.
Our main contribution then is the introduction of reduced-state imprecise Markov chain models that allow us to derive guaranteed lower and upper bounds on blocking probabilities, for the three allocation policies separately or for all possible allocation policies simultaneously.
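For a small instance, the exact Markov chain analysis amounts to building the generator over feasible slot occupancies and solving for the stationary distribution; the sketch below uses a toy complete-sharing policy with 1-slot and 2-slot demands, an illustrative stand-in for the three policies studied:

```python
import numpy as np
from itertools import product

def blocking_probabilities(C, lam1, lam2, mu=1.0):
    """Exact stationary analysis of a small spectrum-sharing link:
    type-1 demands occupy 1 slot, type-2 demands occupy 2 of C slots,
    under complete sharing with unit-rate service."""
    states = [(n1, n2) for n1, n2 in product(range(C + 1), repeat=2)
              if n1 + 2 * n2 <= C]
    idx = {s: i for i, s in enumerate(states)}
    Q = np.zeros((len(states), len(states)))
    for (n1, n2), i in idx.items():
        used = n1 + 2 * n2
        if used + 1 <= C: Q[i, idx[(n1 + 1, n2)]] += lam1
        if used + 2 <= C: Q[i, idx[(n1, n2 + 1)]] += lam2
        if n1 > 0: Q[i, idx[(n1 - 1, n2)]] += n1 * mu
        if n2 > 0: Q[i, idx[(n1, n2 - 1)]] += n2 * mu
        Q[i, i] = -Q[i].sum()
    # Solve pi Q = 0 subject to sum(pi) = 1
    A = np.vstack([Q.T, np.ones(len(states))])
    b = np.zeros(len(states) + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    b1 = sum(p for (n1, n2), p in zip(states, pi) if n1 + 2 * n2 + 1 > C)
    b2 = sum(p for (n1, n2), p in zip(states, pi) if n1 + 2 * n2 + 2 > C)
    return b1, b2

b1, b2 = blocking_probabilities(C=6, lam1=1.0, lam2=1.0)
# Wider demands are blocked more often than narrow ones
assert 0 < b1 < b2 < 1
```

The state space of this exact model grows quickly with the number of slots, which is precisely why the reduced-state and imprecise-Markov-chain approximations are needed for large systems.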
The paper proposes a combination of the subdomain deflation method and local algebraic multigrid as a scalable distributed memory preconditioner that is able to solve large linear systems of equations.
The implementation of the algorithm is made available for the community as part of an open source AMGCL library.
The solution targets both homogeneous (CPU-only) and heterogeneous (CPU/GPU) systems, employing a hybrid MPI/OpenMP approach in the former case and a combination of MPI, OpenMP, and CUDA in the latter.
The use of OpenMP minimizes the number of MPI processes, thus reducing the communication overhead of the deflation method and improving both weak and strong scalability of the preconditioner.
Examples of scalar, Poisson-like systems as well as non-scalar problems, stemming from the discretization of the Navier-Stokes equations, are considered in order to assess the performance of the implemented algorithm.
A comparison with a traditional global AMG preconditioner based on a well-established Trilinos ML package is provided.
Generating natural questions from an image is a semantic task that requires using visual and language modality to learn multimodal representations.
Images can have multiple visual and language contexts that are relevant for generating questions namely places, captions, and tags.
In this paper, we propose the use of exemplars for obtaining the relevant context.
We obtain this by using a Multimodal Differential Network to produce natural and engaging questions.
The generated questions show a remarkable similarity to the natural questions as validated by a human study.
Further, we observe that the proposed approach substantially improves over state-of-the-art benchmarks on the quantitative metrics (BLEU, METEOR, ROUGE, and CIDEr).
In this paper, we propose and evaluate rate-maximizing pilot configurations for Unmanned Aerial Vehicle (UAV) communications employing OFDM waveforms.
OFDM relies on pilot symbols for effective communications.
We formulate a rate-maximization problem in which the pilot spacing (in the time-frequency resource grid) and power are varied as a function of the time-varying channel statistics.
The receiver solves this rate-maximization problem, and the optimal pilot spacing and power are explicitly fed back to the transmitter to adapt to the time-varying channel statistics in an air-to-ground (A2G) environment.
We show the enhanced throughput performance of this scheme for UAV communications in sub-6 GHz bands.
These performance gains are achieved at the cost of very low computational complexity and feedback requirements, making it attractive for A2G UAV communications in 5G.
We propose an efficient Stereographic Projection Neural Network (SPNet) for learning representations of 3D objects.
We first transform a 3D input volume into a 2D planar image using stereographic projection.
We then present a shallow 2D convolutional neural network (CNN) to estimate the object category followed by view ensemble, which combines the responses from multiple views of the object to further enhance the predictions.
Specifically, the proposed approach consists of four stages: (1) Stereographic projection of a 3D object, (2) view-specific feature learning, (3) view selection and (4) view ensemble.
The proposed approach performs comparably to the state-of-the-art methods while having substantially lower GPU memory as well as network parameters.
Despite its lightness, the experiments on 3D object classification and shape retrievals demonstrate the high performance of the proposed method.
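Stage (1) above, stereographic projection, can be sketched in a few lines. This is an illustrative Python sketch, not the authors' implementation; it assumes object points have been normalized onto the unit sphere and projects from the north pole onto the plane z = 0:

```python
import numpy as np

def stereographic_project(points):
    """Project 3D points on the unit sphere onto the plane z = 0,
    projecting from the north pole (0, 0, 1).
    Assumes no point sits exactly at the projection pole (z == 1)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    denom = 1.0 - z
    return np.stack([x / denom, y / denom], axis=1)

# Example: the south pole (0, 0, -1) maps to the origin.
south = np.array([[0.0, 0.0, -1.0]])
print(stereographic_project(south))  # [[0. 0.]]
```

The resulting 2D image can then be fed to the shallow CNN described in stage (2).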
Whenever multibeam satellite systems target very aggressive frequency reuse in their coverage area, inter-beam interference becomes the major obstacle for increasing the overall system throughput.
As a matter of fact, users located at the beam edges suffer from very large interference even under a moderately aggressive reuse-2 frequency plan.
Although solutions for inter-beam interference management have been investigated at the satellite terminal, it turns out that the performance improvement does not justify the increased terminal complexity and cost.
In this article, we pay attention to interference mitigation techniques that take place at the transmitter (i.e. the gateway).
Based on this understanding, we provide our vision on advanced precoding techniques and user clustering methods for multibeam broadband fixed satellite communications.
We also discuss practical challenges to deploy precoding schemes and the support introduced in the recently published DVB-S2X standard.
Future challenges for novel configurations employing precoding are also provided.
Identification of falls while performing normal activities of daily living (ADL) is important to ensure personal safety and well-being.
However, falling is a short term activity that occurs infrequently.
This poses a challenge to traditional classification algorithms, because there may be very little training data for falls (or none at all).
This paper proposes an approach for the identification of falls using a wearable device in the absence of training data for falls but with plentiful data for normal ADL.
We propose three `X-Factor' Hidden Markov Model (XHMM) approaches.
The XHMMs model unseen falls using "inflated" output covariances (observation models).
To estimate the inflated covariances, we propose a novel cross validation method to remove "outliers" from the normal ADL that serve as proxies for the unseen falls and allow learning the XHMMs using only normal activities.
We tested the proposed XHMM approaches on two activity recognition datasets and show high detection rates for falls in the absence of fall-specific training data.
We show that the traditional method of choosing a threshold based on the maximum of the negative log-likelihood to identify unseen falls is ill-posed for this problem.
We also show that supervised classification methods perform poorly when very limited fall data are available during the training phase.
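The "inflated" observation covariance at the core of the XHMM idea can be illustrated with a minimal sketch. The inflation factor and names below are assumptions for illustration; the paper estimates the inflation via its cross-validation procedure:

```python
import numpy as np

def inflate_covariance(cov, factor=2.0):
    """Return an 'inflated' covariance for the X-Factor state:
    the normal-activity covariance scaled up so the extra state can
    cover unseen, outlying observations (factor is illustrative)."""
    return factor * np.asarray(cov, dtype=float)

# Covariance estimated from normal ADL observations (toy values).
normal_cov = np.array([[1.0, 0.2],
                       [0.2, 0.5]])
xfactor_cov = inflate_covariance(normal_cov, factor=3.0)
```

An observation far from all normal-ADL states then attains higher likelihood under the inflated X-Factor state, flagging a potential fall.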
We present an information-theoretic interpretation of quantum formalism based on a Bayesian framework and free of any additional axiom or principle.
Quantum information is merely construed as a technique of statistical estimation for analyzing a logical system subject to classical constraints, regardless of the specific variables used.
The problem is initially formulated in a standard Boolean algebra involving a particular set of working variables.
Statistical estimation consists in expressing the truth table in terms of likelihood probabilities instead of the variables themselves.
The constraints are thus converted into a Bayesian prior.
This method leads to solving a linear programming problem in a real-valued probability space.
The complete set of alternative Boolean variables is introduced afterwards by transcribing the probability space into a Hilbert space, thanks to Gleason's theorem.
This allows us to completely recover standard quantum information and provides an information-theoretic rationale for its technical rules.
The model offers a natural answer to the major puzzles that underlie quantum mechanics: Why is the theory linear?
Why is the theory probabilistic?
Where does the Hilbert space come from?
Also, most of the paradoxes, such as entanglement, contextuality, nonsignaling correlation, measurement problem, etc., find a quite trivial explanation, while the concept of information conveyed by a wave vector is clarified.
We conclude that quantum information, although dramatically expanding the scope of classical information, is not different from the information itself and is therefore a universal tool of reasoning.
We introduce Graph-Structured Sum-Product Networks (GraphSPNs), a probabilistic approach to structured prediction for problems where dependencies between latent variables are expressed in terms of arbitrary, dynamic graphs.
While many approaches to structured prediction place strict constraints on the interactions between inferred variables, many real-world problems can only be characterized using complex graph structures of varying size, often contaminated with noise when obtained from real data.
Here, we focus on one such problem in the domain of robotics.
We demonstrate how GraphSPNs can be used to bolster inference about semantic, conceptual place descriptions using noisy topological relations discovered by a robot exploring large-scale office spaces.
Through experiments, we show that GraphSPNs consistently outperform the traditional approach based on undirected graphical models, successfully disambiguating information in global semantic maps built from uncertain, noisy local evidence.
We further exploit the probabilistic nature of the model to infer marginal distributions over semantic descriptions of as yet unexplored places and detect spatial environment configurations that are novel and incongruent with the known evidence.
Uncertainty of decisions in safety-critical engineering applications can be estimated on the basis of the Bayesian Markov Chain Monte Carlo (MCMC) technique of averaging over decision models.
The use of decision tree (DT) models assists experts to interpret causal relations and find factors of the uncertainty.
Bayesian averaging also allows experts to estimate the uncertainty accurately when a priori information on the favored structure of DTs is available.
Then an expert can select a single DT model, typically the Maximum a Posteriori model, for interpretation purposes.
Unfortunately, a priori information on favored structure of DTs is not always available.
For this reason, we suggest a new prior on DTs for the Bayesian MCMC technique.
We also suggest a new procedure of selecting a single DT and describe an application scenario.
In our experiments on the Short-Term Conflict Alert data our technique outperforms the existing Bayesian techniques in predictive accuracy of the selected single DTs.
Automatic skin lesion segmentation on dermoscopic images is an essential component in computer-aided diagnosis of melanoma.
Recently, many fully supervised deep learning based methods have been proposed for automatic skin lesion segmentation.
However, these approaches require massive pixel-wise annotation from experienced dermatologists, which is very costly and time-consuming.
In this paper, we present a novel semi-supervised method for skin lesion segmentation by leveraging both labeled and unlabeled data.
The network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data.
Our method encourages a consistent prediction for unlabeled images using the outputs of the network-in-training under different regularizations, so that it can utilize the unlabeled data.
Aiming at the semi-supervised segmentation problem, we enhance the effect of regularization for pixel-level predictions by introducing a transformation-consistent scheme (including rotation and flipping) in our self-ensembling model.
With only 300 labeled training samples, our method sets a new record on the benchmark of the International Skin Imaging Collaboration (ISIC) 2017 skin lesion segmentation challenge.
Such a result clearly surpasses fully supervised state-of-the-art methods trained with 2000 labeled samples.
As the practical use of answer set programming (ASP) has grown with the development of efficient solvers, we expect a growing interest in extensions of ASP as their semantics stabilize and solvers supporting them mature.
Epistemic Specifications, which adds modal operators K and M to the language of ASP, is one such extension.
We call a program in this language an epistemic logic program (ELP).
Solvers have thus far been practical for only the simplest ELPs due to exponential growth of the search space.
We describe a solver that is able to solve harder problems better (e.g., without exponentially growing memory needs w.r.t. K and M occurrences) and faster than any other known ELP solver.
In computer modeling of the training process, it is usually assumed that all elements of the learning material are forgotten at the same rate. In practice, however, knowledge that is actively used in the pupil's learning activity is remembered much more strongly and forgotten much more slowly than knowledge that is not used. For a more precise study of didactic systems, we propose a training model in which, as the number of uses of a given element of the learning material increases: 1) the time the pupil spends using it decreases; 2) its forgetting coefficient decreases. A computer model is described, programs in the Pascal language are presented, and the modeling results are given and analyzed.
Keywords: didactics, information and cybernetic approach, computer modeling of the training process.
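The qualitative behavior of such a model can be sketched as follows. The abstract refers to Pascal programs; this is a hedged Python sketch in which the parameters, the cyclic use schedule, and the exact update rules are illustrative assumptions, not taken from the paper:

```python
import math

def simulate_learning(n_elements=5, steps=100, gamma0=0.1, alpha=0.5):
    """Toy sketch: each use of an element lowers its forgetting
    coefficient; between uses, knowledge decays exponentially.
    gamma0 and alpha are illustrative parameters."""
    knowledge = [0.0] * n_elements
    uses = [0] * n_elements
    for t in range(steps):
        i = t % n_elements                 # elements are used cyclically
        uses[i] += 1
        knowledge[i] = 1.0                 # an element is refreshed when used
        for j in range(n_elements):
            if j != i:
                # Forgetting slows down as an element is used more often.
                gamma = gamma0 / (1 + alpha * uses[j])
                knowledge[j] *= math.exp(-gamma)
    return knowledge
```

Frequently used elements thus retain knowledge close to 1, while rarely used ones decay, matching the model's premise.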
"How much energy is consumed for an inference made by a convolutional neural network (CNN)?"
With the increased popularity of CNNs deployed on the wide-spectrum of platforms (from mobile devices to workstations), the answer to this question has drawn significant attention.
From lengthening battery life of mobile devices to reducing the energy bill of a datacenter, it is important to understand the energy efficiency of CNNs during serving for making an inference, before actually training the model.
In this work, we propose NeuralPower: a layer-wise predictive framework based on sparse polynomial regression, for predicting the serving energy consumption of a CNN deployed on any GPU platform.
Given the architecture of a CNN, NeuralPower provides an accurate prediction and breakdown for power and runtime across all layers in the whole network, helping machine learners quickly identify the power, runtime, or energy bottlenecks.
We also propose the "energy-precision ratio" (EPR) metric to guide machine learners in selecting an energy-efficient CNN architecture that better trades off the energy consumption and prediction accuracy.
The experimental results show that the prediction accuracy of the proposed NeuralPower outperforms the best published model to date, yielding an improvement in accuracy of up to 68.5%.
We also assess the accuracy of predictions at the network level, by predicting the runtime, power, and energy of state-of-the-art CNN architectures, achieving an average accuracy of 88.24% in runtime, 88.34% in power, and 97.21% in energy.
We comprehensively corroborate the effectiveness of NeuralPower as a powerful framework for machine learners by testing it on different GPU platforms and Deep Learning software tools.
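The layer-wise regression idea can be sketched with ordinary polynomial least squares. This is a simplification: NeuralPower uses sparse polynomial regression, and the feature set and degree here are illustrative assumptions:

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_features(F, degree=2):
    """All monomials of the feature columns up to `degree`, plus a bias term."""
    n, k = F.shape
    cols = [np.ones(n)]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(k), d):
            cols.append(np.prod(F[:, idx], axis=1))
    return np.stack(cols, axis=1)

def fit_power_model(F, power, degree=2):
    """Least-squares fit of a polynomial model to measured layer power;
    F holds layer hyperparameters (e.g. kernel size, channel counts)."""
    coef, *_ = np.linalg.lstsq(poly_features(F, degree), power, rcond=None)
    return coef

def predict_power(coef, F, degree=2):
    """Predicted power for new layer configurations F."""
    return poly_features(F, degree) @ coef
```

One such model per layer type, summed over layers, gives a network-level power and runtime estimate in the spirit of the framework.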
Autonomous robots need to interact with unknown, unstructured and changing environments, constantly facing novel challenges.
Therefore, continuous online adaptation for lifelong learning and sample-efficient mechanisms to adapt to changes in the environment, the constraints, the tasks, or the robot itself are crucial.
In this work, we propose a novel framework for probabilistic online motion planning with online adaptation based on a bio-inspired stochastic recurrent neural network.
By using learning signals that mimic the intrinsic motivation signal of cognitive dissonance, together with a mental replay strategy to intensify experiences, the stochastic recurrent network can learn from few physical interactions and adapt to novel environments in seconds.
We evaluate our online planning and adaptation framework on an anthropomorphic KUKA LWR arm.
The rapid online adaptation is shown by learning unknown workspace constraints sample-efficiently from few physical interactions while following given waypoints.
We initiate a general study of what we call orientation completion problems.
For a fixed class C of oriented graphs, the orientation completion problem asks whether a given partially oriented graph P can be completed to an oriented graph in C by orienting the (non-oriented) edges in P. Orientation completion problems commonly generalize several existing problems, including recognition of certain classes of graphs and digraphs as well as extending representations of certain geometrically representable graphs.
We study orientation completion problems for various classes of oriented graphs, including k-arc-strong oriented graphs, k-strong oriented graphs, quasi-transitive oriented graphs, local tournaments, acyclic local tournaments, locally transitive tournaments, locally transitive local tournaments, in-tournaments, and oriented graphs that have directed cycle factors.
We show that the orientation completion problem for each of these classes is either polynomial time solvable or NP-complete.
We also show that some of the NP-complete problems become polynomial time solvable when the input oriented graphs satisfy certain extra conditions.
Our results imply that the representation extension problems for proper interval graphs and for proper circular arc graphs are polynomial time solvable, which generalize a previous result.
As the volume of medical information stored electronically increases, so does the need to enhance how it is secured. Inability to access patient records at the right time can lead to loss of life and can also degrade the level of health care services rendered by medical professionals. Criminal attacks in health care have increased by 125% since 2010 and are now the leading cause of medical data breaches. This study therefore presents a combination of Triple DES (3DES) encryption and least significant bit (LSB) steganography to improve the security measures applied to medical data.
Java programming language was used to develop a simulation program for the experiment.
The result shows medical data can be stored, shared, and managed in a reliable and secure manner using the combined model.
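The LSB half of such a combined scheme can be sketched as follows. This is a minimal illustration, not the study's program; the payload is assumed to be a 3DES ciphertext produced separately, and the cover is assumed large enough to hold it:

```python
def embed_lsb(cover, data):
    """Hide `data` bytes in the least significant bits of `cover` bytes.
    Requires len(cover) >= 8 * len(data)."""
    bits = [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # overwrite only the lowest bit
    return bytes(stego)

def extract_lsb(stego, n_bytes):
    """Recover `n_bytes` hidden bytes from the LSBs of `stego`."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for j in range(8):
            byte = (byte << 1) | (stego[8 * i + j] & 1)
        out.append(byte)
    return bytes(out)
```

Because only the lowest bit of each cover byte changes, the cover (e.g. image pixel data) is visually unaffected while carrying the encrypted record.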
The use of satellite imagery has become increasingly popular for disaster monitoring and response.
After a disaster, it is important to prioritize rescue operations, disaster response and coordinate relief efforts.
These have to be carried out in a fast and efficient manner, since resources are often limited in disaster-affected areas and it is extremely important to identify the areas of maximum damage. However, most existing disaster mapping efforts are manual, which is time-consuming and often leads to erroneous results.
In order to address these issues, we propose a framework for change detection using Convolutional Neural Networks (CNN) on satellite images which can then be thresholded and clustered together into grids to find areas which have been most severely affected by a disaster.
We also present a novel metric called Disaster Impact Index (DII) and use it to quantify the impact of two natural disasters - the Hurricane Harvey flood and the Santa Rosa fire.
Our framework achieves a top F1 score of 81.2% on the gridded flood dataset and 83.5% on the gridded fire dataset.
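The threshold-and-grid step of such a pipeline can be sketched as below. The CNN change scores are replaced here by a plain pixel difference for illustration; the threshold and grid size are assumptions, and the DII metric itself is not reproduced:

```python
import numpy as np

def gridded_change(pre, post, threshold=0.2, grid=4):
    """Pixel-wise absolute change between co-registered pre/post images,
    thresholded to a binary change mask and aggregated into a coarse
    grid of per-cell change fractions."""
    change = np.abs(post.astype(float) - pre.astype(float)) > threshold
    h, w = change.shape
    gh, gw = h // grid, w // grid
    cells = change[:gh * grid, :gw * grid].reshape(gh, grid, gw, grid)
    return cells.mean(axis=(1, 3))   # fraction of changed pixels per cell
```

Cells with the highest change fractions then mark the candidate areas of maximum damage for prioritizing response.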
In this paper, we propose a novel fully convolutional two-stream fusion network (FCTSFN) for interactive image segmentation.
The proposed network includes two sub-networks: a two-stream late fusion network (TSLFN) that predicts the foreground at a reduced resolution, and a multi-scale refining network (MSRN) that refines the foreground at full resolution.
The TSLFN includes two distinct deep streams followed by a fusion network.
The intuition is that, since user interactions are more direct information on foreground/background than the image itself, the two-stream structure of the TSLFN reduces the number of layers between the pure user interaction features and the network output, allowing the user interactions to have a more direct impact on the segmentation result.
The MSRN fuses the features from different layers of TSLFN with different scales, in order to seek the local to global information on the foreground to refine the segmentation result at full resolution.
We conduct comprehensive experiments on four benchmark datasets.
The results show that the proposed network achieves competitive performance compared to current state-of-the-art interactive image segmentation methods.
We propose a novel way of computing surface folding maps via solving a linear PDE.
This framework is a generalization of existing quasiconformal methods and allows manipulation of the geometry of folding.
Moreover, the crucial quantity that characterizes the geometry occurs as the coefficient of the equation, namely the Beltrami coefficient.
This allows us to solve an inverse problem of parametrizing the folded surface given only partial data but with known folding topology.
Various interesting applications such as fold sculpting on 3D models and self-occlusion reasoning are demonstrated to show the effectiveness of our method.
The use of functional brain imaging for research and diagnosis has benefitted greatly from the recent advancements in neuroimaging technologies, as well as the explosive growth in size and availability of fMRI data.
While it has been shown in literature that using multiple and large scale fMRI datasets can improve reproducibility and lead to new discoveries, the computational and informatics systems supporting the analysis and visualization of such fMRI big data are extremely limited and largely under-discussed.
We propose to address these shortcomings in this work, building on previous success in using the dictionary learning method for functional network decomposition studies on fMRI data.
We present a distributed dictionary learning framework based on rank-1 matrix decomposition with a sparseness constraint (the D-r1DL framework).
The framework was implemented using the Spark distributed computing engine and deployed on three different processing units: an in-house server, in-house high performance clusters, and the Amazon Elastic Compute Cloud (EC2) service.
The whole analysis pipeline was integrated with our neuroinformatics system for data management, user input/output, and real-time visualization.
The performance and accuracy of D-r1DL on both individual and group-wise fMRI data from the Human Connectome Project (HCP) show that the proposed framework is highly scalable.
The resulting group-wise functional network decompositions are highly accurate, and the processing times are fast.
In addition, D-r1DL can provide real-time user feedback and results visualization which are vital for large-scale data analysis.
Most existing methods for automatic bilingual dictionary induction rely on prior alignments between the source and target languages, such as parallel corpora or seed dictionaries.
For many language pairs, such supervised alignments are not readily available.
We propose an unsupervised approach for learning a bilingual dictionary for a pair of languages given their independently-learned monolingual word embeddings.
The proposed method exploits local and global structures in monolingual vector spaces to align them such that similar words are mapped to each other.
We show empirically that the performance of bilingual correspondents learned using our proposed unsupervised method is comparable to that of using supervised bilingual correspondents from a seed dictionary.
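A common building block for mapping one monolingual vector space onto another, once candidate word correspondences have been found, is an orthogonal Procrustes fit, sketched here under the assumption that some matched pairs are available; the paper's unsupervised matching of local and global structure is not reproduced:

```python
import numpy as np

def procrustes_map(X, Y):
    """Orthogonal map W minimizing ||XW - Y||_F over orthogonal W, for
    row-aligned source/target embedding matrices X, Y (closed form via
    the SVD of X^T Y, the standard orthogonal Procrustes solution)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Synthetic check: the target space is a hidden rotation of the source.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 4))
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Y = X @ Q
W = procrustes_map(X, Y)
```

When the true relation is an orthogonal transform, W recovers it exactly; nearest-neighbor lookup in the mapped space then yields bilingual correspondents.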
We formulate problems of statistical recognition and learning in a common framework of complex hypothesis testing.
Based on arguments from multi-criteria optimization, we identify strategies that are improper for solving these problems and derive a common form of the remaining strategies.
We show that some widely used approaches to recognition and learning are improper in this sense.
We then propose a generalized formulation of the recognition and learning problem which embraces the whole range of sizes of the learning sample, including the zero size.
Learning becomes a special case of recognition without learning.
We define the concept of closest to optimal strategy, being a solution to the formulated problem, and describe a technique for finding such a strategy.
On several illustrative cases, the strategy is shown to be superior to the widely used learning methods based on maximal likelihood estimation.
Thanks to the low operational cost and large storage capacity of smartphones and wearable devices, people are recording many hours of daily activities, sport actions and home videos.
These videos, also known as egocentric videos, are generally long-running streams with unedited content, which makes them boring and visually unpalatable, raising the challenge of making egocentric videos more appealing.
In this work we propose a novel methodology to compose the new fast-forward video by selecting frames based on semantic information extracted from images.
The experiments show that our approach outperforms the state-of-the-art as far as semantic information is concerned and that it is also able to produce videos that are more pleasant to be watched.
A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives.
To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization.
Our model, trained in an unsupervised manner, yields stronger empirical predictive performance than a decoder based on Long Short-Term Memory (LSTM), with fewer parameters and considerably faster training.
Further, we apply it to text sequence-matching problems.
The proposed model significantly outperforms several strong sentence-encoding baselines, especially in the semi-supervised setting.
Deep learning has recently become one of the most popular sub-fields of machine learning owing to its distributed data representation with multiple levels of abstraction.
A diverse range of deep learning algorithms are being employed to solve conventional artificial intelligence problems.
This paper gives an overview of some of the most widely used deep learning algorithms applied in the field of computer vision.
It first inspects the various approaches of deep learning algorithms, followed by a description of their applications in image classification, object identification, image extraction and semantic segmentation in the presence of noise.
The paper concludes with the discussion of the future scope and challenges for construction and training of deep neural networks.
Detection of indoor and outdoor scenarios is an important resource for many types of activities such as multisensor navigation and location-based services.
This research presents the use of NMEA data provided by GPS receivers to characterize different types of scenarios automatically.
A set of static tests was performed to evaluate metrics such as number of satellites, positioning solution geometry and carrier-to-receiver noise-density ratio values to detect possible patterns to determine indoor and outdoor scenarios.
Subsequently, validation tests are applied to verify that the obtained parameters are adequate.
Network embedding (NE) is playing a principal role in network mining, due to its ability to map nodes into efficient low-dimensional embedding vectors.
However, two major limitations exist in state-of-the-art NE methods: structure preservation and uncertainty modeling.
Almost all previous methods represent a node as a point in space and focus on local structural information, i.e., neighborhood information.
However, neighborhood information does not capture global structural information, and point-vector representations fail to model the uncertainty of node representations.
In this paper, we propose a new NE framework, struc2gauss, which learns node representations in the space of Gaussian distributions and performs network embedding based on global structural information.
struc2gauss first employs a given node similarity metric to measure the global structural information, then generates structural context for nodes and finally learns node representations via Gaussian embedding.
Different structural similarity measures of networks and energy functions of Gaussian embedding are investigated.
Experiments conducted on both synthetic and real-world data sets demonstrate that struc2gauss effectively captures global structural information that state-of-the-art network embedding methods fail to, outperforms other methods on the structure-based clustering task, and provides more information on the uncertainties of node representations.
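One energy function commonly investigated for Gaussian embedding is the KL divergence between the node Gaussians, sketched here for diagonal covariances. This is an illustrative choice; the paper compares several energy functions:

```python
import numpy as np

def kl_gaussian_diag(mu1, var1, mu2, var2):
    """KL(N1 || N2) for diagonal-covariance Gaussians, usable as an
    asymmetric energy between two node embeddings (mu, var per node)."""
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    var1, var2 = np.asarray(var1, float), np.asarray(var2, float)
    d = len(mu1)
    return 0.5 * (np.sum(var1 / var2)
                  + np.sum((mu2 - mu1) ** 2 / var2)
                  - d
                  + np.sum(np.log(var2) - np.log(var1)))
```

Training pushes structurally similar nodes toward low energy, while the learned variances express the uncertainty of each representation.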
Zero-shot learning has received increasing interest as a means to alleviate the often prohibitive expense of annotating training data for large scale recognition problems.
These methods have achieved great success via learning intermediate semantic representations in the form of attributes and more recently, semantic word vectors.
However, they have thus far been constrained to the single-label case, in contrast to the growing popularity and importance of more realistic multi-label data.
In this paper, for the first time, we investigate and formalise a general framework for multi-label zero-shot learning, addressing the unique challenge therein: how to exploit multi-label correlation at test time with no training data for those classes?
In particular, we propose (1) a multi-output deep regression model to project an image into a semantic word space, which explicitly exploits the correlations in the intermediate semantic layer of word vectors; (2) a novel zero-shot learning algorithm for multi-label data that exploits the unique compositionality property of semantic word vector representations; and (3) a transductive learning strategy to enable the regression model learned from seen classes to generalise well to unseen classes.
Our zero-shot learning experiments on a number of standard multi-label datasets demonstrate that our method outperforms a variety of baselines.
The paper reports our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018).
Our team was ranked 2nd for the wiki-wiki dataset (containing mostly homonyms) and 5th for the bts-rnc and active-dict datasets (containing mostly polysemous words) among all 19 participants.
The method we employed was extremely naive.
It implied representing contexts of ambiguous words as averaged word embedding vectors, using off-the-shelf pre-trained distributional models.
Then, these vector representations were clustered with mainstream clustering techniques, thus producing the groups corresponding to the ambiguous word senses.
As a side result, we show that word embedding models trained on small but balanced corpora can be superior to those trained on large but noisy data - not only in intrinsic evaluation, but also in downstream tasks like word sense induction.
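The context-averaging-and-clustering method can be sketched as follows. A minimal NumPy k-means with deterministic farthest-point initialization stands in for the mainstream clustering techniques; the pre-trained embeddings and the number of senses k are assumed inputs:

```python
import numpy as np

def average_context(word_vectors):
    """Represent one context of an ambiguous word as the mean of its
    word embedding vectors."""
    return np.mean(word_vectors, axis=0)

def induce_senses(context_vectors, k=2, iters=20):
    """Group context vectors into k induced senses with a tiny k-means
    (a library implementation would normally be used instead)."""
    X = np.asarray(context_vectors, float)
    # Deterministic farthest-point initialization (illustrative).
    centers = [X[0]]
    while len(centers) < k:
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

Each cluster of contexts then corresponds to one induced sense of the ambiguous word.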
This paper studies change point detection on networks with community structures.
It proposes a framework that can detect both local and global changes in networks efficiently.
Importantly, it can clearly distinguish the two types of changes.
The framework design is generic and as such several state-of-the-art change point detection algorithms can fit in this design.
Experiments on both synthetic and real-world networks show that this framework can accurately detect changes while achieving up to 800X speedup.
The time-domain inter-cell interference coordination techniques specified in the LTE Rel. 10 standard improve the throughput of picocell-edge users by protecting them from macrocell interference.
On the other hand, it also degrades the aggregate capacity in macrocell because the macro base station (MBS) does not transmit data during certain subframes known as almost blank subframes.
The MBS data transmission using reduced power subframes was standardized in LTE Rel. 11, which can improve the capacity in the macrocell while not causing high interference to the nearby picocells.
In order to get maximum benefit from the reduced power subframes, setting the key system parameters, such as the amount of power reduction, carries critical importance.
Using stochastic geometry, this paper lays down a theoretical foundation for the performance evaluation of heterogeneous networks with reduced power subframes and range expansion bias.
The analytic expressions for average capacity and 5th percentile throughput are derived as a function of transmit powers, node densities, and interference coordination parameters in a heterogeneous network scenario, and are validated through Monte Carlo simulations.
Joint optimization of range expansion bias, power reduction factor, scheduling thresholds, and duty cycle of reduced power subframes are performed to study the trade-offs between aggregate capacity of a cell and fairness among the users.
To validate our analysis, we also compare the stochastic geometry based theoretical results with the real MBS deployment (in the city of London) and the hexagonal-grid model.
Our analysis shows that with optimum parameter settings, LTE Rel. 11 with reduced power subframes can provide substantially better performance than LTE Rel. 10 with almost blank subframes, in terms of both aggregate capacity and fairness.
Mining dense subgraphs on multi-layer graphs is an interesting problem with many practical applications.
To overcome the limitations of the quasi-clique-based approach, we propose d-coherent core (d-CC), a new notion of dense subgraph on multi-layer graphs, which has several elegant properties.
We formalize the diversified coherent core search (DCCS) problem, which finds k d-CCs that can cover the largest number of vertices.
We propose a greedy algorithm with an approximation ratio of 1 - 1/e and two search algorithms with an approximation ratio of 1/4.
The experiments verify that the search algorithms are faster than the greedy algorithm and produce comparably good results as the greedy algorithm in practice.
As opposed to the quasi-clique-based approach, our DCCS algorithms can fast detect larger dense subgraphs that cover most of the quasi-clique-based results.
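The greedy (1 - 1/e)-approximation for the coverage objective can be sketched as below. Candidate d-CCs are abstracted here as plain vertex sets; their computation on the multi-layer graph is not reproduced:

```python
def greedy_max_coverage(candidates, k):
    """Greedy (1 - 1/e)-approximation for choosing k vertex sets
    (candidate dense subgraphs) that together cover the most vertices:
    repeatedly take the set with the largest marginal gain."""
    chosen, covered = [], set()
    pool = list(candidates)
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda s: len(s - covered))
        if not (best - covered):   # no remaining set adds new vertices
            break
        chosen.append(best)
        covered |= best
        pool.remove(best)
    return chosen, covered
```

The marginal-gain rule is what yields the 1 - 1/e guarantee for this submodular coverage objective.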
The design of robotic systems is largely dictated by our purely human intuition about how we perceive the world.
This intuition has been proven incorrect with regard to a number of critical issues, such as visual change blindness.
In order to develop truly autonomous robots, we must step away from this intuition and let robotic agents develop their own way of perceiving.
The robot should start from scratch and gradually develop perceptual notions, under no prior assumptions, exclusively by looking into its sensorimotor experience and identifying repetitive patterns and invariants.
One of the most fundamental perceptual notions, space, cannot be an exception to this requirement.
In this paper we look into the prerequisites for the emergence of simplified spatial notions on the basis of a robot's sensorimotor flow.
We show that the notion of space as environment-independent cannot be deduced solely from exteroceptive information, which is highly variable and is mainly determined by the contents of the environment.
The environment-independent definition of space can be approached by looking into the functions that link the motor commands to changes in exteroceptive inputs.
In a sufficiently rich environment, the kernels of these functions correspond uniquely to the spatial configuration of the agent's exteroceptors.
We simulate a redundant robotic arm with a retina installed at its end-point and show how this agent can learn the configuration space of its retina.
The resulting manifold has the topology of the Cartesian product of a plane and a circle, and corresponds to the planar position and orientation of the retina.
We present a comprehensive study and evaluation of existing single image dehazing algorithms, using a new large-scale benchmark consisting of both synthetic and real-world hazy images, called REalistic Single Image DEhazing (RESIDE).
RESIDE highlights diverse data sources and image contents, and is divided into five subsets, each serving different training or evaluation purposes.
We further provide a rich variety of criteria for dehazing algorithm evaluation, ranging from full-reference metrics, to no-reference metrics, to subjective evaluation and the novel task-driven evaluation.
Experiments on RESIDE shed light on the comparisons and limitations of state-of-the-art dehazing algorithms, and suggest promising future directions.
Trajectory prediction (TP) is of great importance for a wide range of location-based applications in intelligent transport systems such as location-based advertising, route planning, traffic management, and early warning systems.
In the last few years, the widespread use of GPS navigation systems and wireless communication technology enabled vehicles has resulted in huge volumes of trajectory data.
Utilizing this data for efficient and accurate trajectory prediction with spatio-temporal techniques remains an ongoing research problem.
Existing TP approaches are limited to short-term predictions.
Moreover, they cannot handle a large volume of trajectory data for long-term prediction.
To address these limitations, we propose a scalable clustering and Markov chain based hybrid framework, called Traj-clusiVAT-based TP, for both short-term and long-term trajectory prediction, which can handle a large number of overlapping trajectories in a dense road network.
Traj-clusiVAT can also determine the number of clusters, which represent different movement behaviours in input trajectory data.
In our experiments, we compare our proposed approach with a mixed Markov model (MMM)-based scheme, and a trajectory clustering, NETSCAN-based TP method for both short- and long-term trajectory predictions.
We performed our experiments on two real, vehicle trajectory datasets, including a large-scale trajectory dataset consisting of 3.28 million trajectories obtained from 15,061 taxis in Singapore over a period of one month.
Experimental results on two real trajectory datasets show that our proposed approach outperforms the existing approaches in terms of both short- and long-term prediction performances, based on prediction accuracy and distance error (in km).
The human heart is enclosed in the pericardial cavity.
The pericardium consists of a layered thin sac and is separated from the myocardium by a thin film of fluid.
It fixes the heart in space and enables frictionless sliding of the myocardium.
The influence of the pericardium is essential for predictive mechanical simulations of the heart.
However, there is no consensus on physiologically correct and computationally tractable pericardial boundary conditions.
Here we propose to model the pericardial influence as a parallel spring and dashpot acting in normal direction to the epicardium.
Using a four-chamber geometry, we compare a model with pericardial boundary conditions to a model with fixated apex.
The influence of pericardial stiffness is demonstrated in a parametric study.
Comparing simulation results to measurements from cine magnetic resonance imaging reveals that adding pericardial boundary conditions yields a better approximation with respect to atrioventricular plane displacement, atrial filling, and overall spatial approximation error.
We demonstrate that this simple model of pericardial-myocardial interaction can correctly predict the pumping mechanisms of the heart as previously assessed in clinical studies.
Utilizing a pericardial model can not only provide much more realistic cardiac mechanics simulations but also allows new insights into pericardial-myocardial interaction which cannot be assessed in clinical measurements yet.
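The proposed boundary condition is a spring and dashpot acting in parallel along the epicardial normal; a minimal scalar sketch (illustrative parameter values, not the ones calibrated in the paper):

```python
def pericardial_traction(u_n, v_n, k=1.0e3, c=1.0e2):
    """Normal traction from a parallel spring-dashpot (Kelvin-Voigt) element.

    u_n : normal displacement of an epicardial surface point
    v_n : normal velocity of that point
    k, c: spring stiffness and dashpot viscosity (illustrative values)
    The traction opposes both the displacement and the motion along
    the epicardial normal, mimicking the pericardial restraint.
    """
    return -k * u_n - c * v_n
```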
The excessive use of digital devices such as cameras and smartphones in smart cities has produced huge data repositories that require automatic tools for efficient browsing, searching, and management.
Data prioritization (DP) is a technique that produces a condensed form of the original data by analyzing its contents.
Current DP studies are either concerned with data collected through stable capturing devices or focused on prioritization of data of a certain type such as surveillance, sports, or industry.
This creates a need for DP tools that intelligently and cost-effectively prioritize a large variety of data for detecting abnormal events and hence manage it effectively, thereby making current smart cities greener.
In this article, we first carry out an in-depth investigation of the approaches and trends of DP over the past two decades for data of different natures, genres, and domains in green smart cities.
Next, we propose an energy-efficient DP framework by intelligent integration of the Internet of Things, artificial intelligence, and big data analytics.
Experimental evaluation on real-world surveillance data verifies the energy efficiency and applicability of this framework in green smart cities.
Finally, this article highlights the key challenges of DP, its future requirements, and propositions for integration into green smart cities.
We present a rational analysis of curiosity, proposing that people's curiosity is driven by seeking stimuli that maximize their ability to make appropriate responses in the future.
This perspective offers a way to unify previous theories of curiosity into a single framework.
Experimental results confirm our model's predictions, showing how the relationship between curiosity and confidence can change significantly depending on the nature of the environment.
In this technical report we present novel results on dopamine-neuromodulation-inspired modulation of excitatory STDP learning in a polyaniline (PANI) memristive device.
The results cover two experimental setups: a computer simulation and a physical prototype.
We also present a physical prototype of inhibitory (iSTDP) learning, together with the corresponding iSTDP learning results.
Relational queries, and in particular join queries, often generate large output results when executed over a huge dataset.
In such cases, it is often infeasible to store the whole materialized output if we plan to reuse it further down a data processing pipeline.
Motivated by this problem, we study the construction of space-efficient compressed representations of the output of conjunctive queries, with the goal of supporting the efficient access of the intermediate compressed result for a given access pattern.
In particular, we initiate the study of an important tradeoff: minimizing the space necessary to store the compressed result, versus minimizing the answer time and delay for an access request over the result.
Our main contribution is a novel parameterized data structure, which can be tuned to trade off space for answer time.
The tradeoff allows us to control the space requirement of the data structure precisely, and depends both on the structure of the query and the access pattern.
We show how we can use the data structure in conjunction with query decomposition techniques, in order to efficiently represent the outputs for several classes of conjunctive queries.
We propose an end-to-end deep learning model for translating free-form natural language instructions to a high-level plan for behavioral robot navigation.
We use attention models to connect information from both the user instructions and a topological representation of the environment.
We evaluate our model's performance on a new dataset containing 10,050 pairs of navigation instructions.
Our model significantly outperforms baseline approaches.
Furthermore, our results suggest that it is possible to leverage the environment map as a relevant knowledge base to facilitate the translation of free-form navigational instructions.
We introduce the persistent homotopy type distance dHT to compare real valued functions defined on possibly different homotopy equivalent topological spaces.
The underlying idea in the definition of dHT is to measure the minimal shift that is necessary to apply to one of the two functions in order that the sublevel sets of the two functions become homotopically equivalent.
This distance is interesting in connection with persistent homology.
Indeed, our main result states that dHT still provides an upper bound for the bottleneck distance between the persistence diagrams of the intervening functions.
Moreover, because homotopy equivalences are weaker than homeomorphisms, this implies a lifting of the standard stability results provided by the L-infty distance and the natural pseudo-distance dNP.
From a different standpoint, we prove that dHT extends the L-infty distance and dNP in two ways.
First, we show that, appropriately restricting the category of objects to which dHT applies, it can be made to coincide with the other two distances.
Finally, we show that dHT has an interpretation in terms of interleavings that naturally places it in the family of distances used in persistence theory.
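Schematically, the "minimal shift" idea described above can be rendered as follows (an informal sketch, not the paper's precise definition):

```latex
% Informal: the sublevel sets of f and g must match up to an epsilon shift.
d_{HT}(f, g) \;=\; \inf \bigl\{ \varepsilon \ge 0 \;:\;
  \text{for all } t,\ f^{-1}(-\infty, t] \text{ and } g^{-1}(-\infty, t + \varepsilon]
  \text{ are linked by homotopy equivalences compatible with the } \varepsilon\text{-shift} \bigr\}
```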
In this paper, we demonstrate an end-to-end spatiotemporal gesture learning approach for 3D point cloud data using a new gestures dataset of point clouds acquired from a 3D sensor.
Nine classes of gestures were learned from gesture sample data.
We mapped the point cloud data into dense occupancy grids; time steps of these occupancy grids were then used as inputs to a 3D convolutional neural network, which learns the spatiotemporal features in the data without explicit modeling of gesture dynamics.
We also introduced a 3D region of interest jittering approach for point cloud data augmentation.
This resulted in an increase in classification accuracy of up to 10% when the augmented data was added to the original training data.
The developed model is able to classify gestures from the dataset with 84.44% accuracy.
We propose that point cloud data will be a more viable data type for scene understanding and motion recognition, as 3D sensors become ubiquitous in years to come.
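The mapping from raw points to a dense occupancy grid described above can be sketched in plain Python (hypothetical grid bounds and resolution; the paper's actual preprocessing may differ):

```python
def occupancy_grid(points, lo, hi, res):
    """Voxelize 3D points into a dense binary occupancy grid.

    points: iterable of (x, y, z) tuples
    lo, hi: opposite corners of the bounding box, e.g. (0, 0, 0) and (1, 1, 1)
    res   : number of voxels per axis
    Returns a res x res x res nested list; 1 marks an occupied voxel.
    """
    grid = [[[0] * res for _ in range(res)] for _ in range(res)]
    for x, y, z in points:
        idx = []
        for v, l, h in zip((x, y, z), lo, hi):
            if not (l <= v < h):        # drop points outside the box
                break
            idx.append(int((v - l) / (h - l) * res))
        else:
            i, j, k = idx
            grid[i][j][k] = 1
    return grid
```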
Fostering technological innovation is intimately related to knowledge creation and recombination.
Here, we map research efforts in Greece within the domain of renewable energy technology and its intersections with the domains of nanoscience and nanotechnology with focus on materials, and electrical engineering and computer science by means of a combined statistical and network-based approach to studying collaboration in scientific authorship.
We specifically examine the content, organizational make-up and geographic trace of scientific collaboration, how these have evolved over the sixteen-year period 2000-2015, and we attempt to illuminate the processes underlying knowledge creation and diversification.
Our findings collectively provide insights into the collaboration structure and evolution of energy-related research activity in Greece and can be used to inform research, development and innovation policy for energy technology.
We argue that hierarchical methods can become the key for modular robots achieving reconfigurability.
We present a hierarchical approach for modular robots that allows a robot to simultaneously learn multiple tasks.
Our evaluation uses an environment composed of two different modular robot configurations, namely 3 degrees-of-freedom (DoF) and 4DoF, with two corresponding targets.
During training, we switch between configurations and targets, aiming to evaluate whether a single neural network can learn to select appropriate motor primitives and the robot configuration needed to reach each target.
The trained neural network is then transferred and executed on a real robot with 3DoF and 4DoF configurations.
We demonstrate how this technique generalizes to robots with different configurations and tasks.
Inversion and PDE-constrained optimization problems often rely on solving the adjoint problem to calculate the gradient of the objective function.
This requires storing large amounts of intermediate data, setting a limit to the largest problem that might be solved with a given amount of memory available.
Checkpointing is an approach that can reduce the amount of memory required by redoing parts of the computation instead of storing intermediate results.
The Revolve checkpointing algorithm offers an optimal schedule that trades computational cost for smaller memory footprints.
Integrating Revolve into a modern Python HPC code and combining it with code generation is not straightforward.
We present an API that makes checkpointing accessible from a DSL-based code generation environment, along with some initial performance figures with a focus on seismic applications.
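The recompute-versus-store trade-off behind checkpointing can be illustrated with a deliberately simple uniform-spacing scheme (the underlying idea only, not Revolve's optimal binomial schedule or the paper's API):

```python
def reverse_states(u0, step, n, stride):
    """Yield forward states u_{n-1}, ..., u_0 for a reverse (adjoint) sweep,
    storing only every `stride`-th state and recomputing the rest.

    u0    : initial state
    step  : forward map, u_{i+1} = step(u_i)
    n     : number of forward states (u_0 .. u_{n-1})
    stride: checkpoint spacing; memory is ~n/stride states instead of n
    """
    checkpoints = {0: u0}
    u = u0
    for i in range(1, n):
        u = step(u)
        if i % stride == 0:
            checkpoints[i] = u          # store a sparse set of states
    for i in range(n - 1, -1, -1):
        base = (i // stride) * stride
        u = checkpoints[base]
        for _ in range(i - base):       # recompute from nearest checkpoint
            u = step(u)
        yield u
```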
Automatic melody generation has been a long-time aspiration for both AI researchers and musicians.
However, learning to generate euphonious melodies has turned out to be highly challenging.
This paper introduces 1) a new variant of variational autoencoder (VAE), where the model structure is designed in a modularized manner in order to model polyphonic and dynamic music with domain knowledge, and 2) a hierarchical encoding/decoding strategy, which explicitly models the dependency between melodic features.
The proposed framework is capable of generating distinct melodies that sound natural, and experiments evaluating the generated music clips show that the proposed model outperforms the baselines in human evaluation.
Due to computational and storage efficiencies of compact binary codes, hashing has been widely used for large-scale similarity search.
Unfortunately, many existing hashing methods based on observed keyword features are not effective for short texts because of their sparseness and brevity.
Recently, some researchers have tried to utilize latent topics of a certain granularity to preserve semantic similarity in hash codes beyond keyword matching.
However, topics of certain granularity are not adequate to represent the intrinsic semantic information.
In this paper, we present a novel unified approach for short text Hashing using Multi-granularity Topics and Tags, dubbed HMTT.
In particular, we propose a selection method to choose the optimal multi-granularity topics depending on the type of dataset, and design two distinct hashing strategies to incorporate multi-granularity topics.
We also propose a simple and effective method to exploit tags to enhance the similarity of related texts.
We carry out extensive experiments on one short text dataset as well as on one normal text dataset.
The results demonstrate that our approach is effective and significantly outperforms baselines on several evaluation metrics.
End-to-end task-oriented dialog systems usually suffer from the challenge of incorporating knowledge bases.
In this paper, we propose a novel yet simple end-to-end differentiable model called memory-to-sequence (Mem2Seq) to address this issue.
Mem2Seq is the first neural generative model that combines multi-hop attention over memories with the idea of pointer networks.
We empirically show how Mem2Seq controls each generation step, and how its multi-hop attention mechanism helps in learning correlations between memories.
In addition, our model is quite general without complicated task-specific designs.
As a result, we show that Mem2Seq can be trained faster and attain the state-of-the-art performance on three different task-oriented dialog datasets.
Millimeter-wave (mmWave) communications have been considered as a key technology for future 5G wireless networks because of the orders-of-magnitude wider bandwidth than current cellular bands.
In this paper, we consider the problem of codebook-based joint analog-digital hybrid precoder and combiner design for spatial multiplexing transmission in a mmWave multiple-input multiple-output (MIMO) system.
We propose to jointly select analog precoder and combiner pair for each data stream successively aiming at maximizing the channel gain while suppressing the interference between different data streams.
After all analog precoder/combiner pairs have been determined, we can obtain the effective baseband channel.
Then, the digital precoder and combiner are computed based on the obtained effective baseband channel to further mitigate the interference and maximize the sum-rate.
Simulation results demonstrate that our proposed algorithm exhibits prominent advantages in combating interference between different data streams and offers satisfactory performance improvement compared to existing codebook-based hybrid beamforming schemes.
This paper proposes a new scheduler that applies the concept of non-uniform laxity to the earliest-deadline-first (EDF) approach for aperiodic tasks.
This scheduler improves task utilisation (execution time / deadline) and also increases the number of tasks that can be scheduled.
Laxity is a measure of the spare time permitted for the task before it misses its deadline, and is computed using the expression (deadline - (current time + execution time)).
Weight decides the priority of the task and is defined by the expression (quantum slice time / allocated time)*total core time for the task.
Quantum slice time is the time actually used, allocated time is the time allocated by the scheduler, and total core time is the time actually reserved by the core for execution of one quantum of the task.
Non-uniform laxity enables scheduling of tasks that have higher priority before the normal execution of other tasks and is computed by multiplying the weight of the task with its laxity.
The algorithm presented in the paper has been simulated on Cheddar, a real-time scheduling tool, and also on SESC, an architectural simulator for multicore platforms, for up to 5000 random task sets and up to 5000 cores.
This scheduler improves task utilisation by 35% and the number of tasks being scheduled by 36%, compared to conventional EDF.
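The laxity and weight expressions quoted above combine directly into the non-uniform laxity priority; a minimal sketch with hypothetical task values (assuming, as in least-laxity-first scheduling, that a smaller value means higher urgency):

```python
def non_uniform_laxity(task, now):
    """Priority key from the paper's definitions: weight * laxity.

    task: dict with deadline, execution (time), quantum (slice time used),
          allocated (time allocated by the scheduler), core_time (total
          core time reserved for one quantum of the task).
    """
    laxity = task["deadline"] - (now + task["execution"])
    weight = (task["quantum"] / task["allocated"]) * task["core_time"]
    return weight * laxity

def schedule(tasks, now=0):
    """Order tasks so the smallest non-uniform laxity (most urgent) runs first."""
    return sorted(tasks, key=lambda t: non_uniform_laxity(t, now))
```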
The non-stationary nature of solar power renders traditional point forecasting methods less useful due to large prediction errors.
This results in increased uncertainty in grid operation, thereby reducing reliability and increasing the cost of operation.
This research paper proposes a unified architecture for multi-time-horizon predictions for short and long-term solar forecasting using Recurrent Neural Networks (RNN).
The paper describes an end-to-end pipeline to implement the architecture along with the methods to test and validate the performance of the prediction model.
The results demonstrate that the proposed method based on the unified architecture is effective for multi-horizon solar forecasting and achieves a lower root-mean-squared prediction error compared to the previous best-performing methods which use one model for each time-horizon.
The proposed method enables multi-horizon forecasts with real-time inputs, which have a high potential for practical applications in the evolving smart grid.
Supervised machine learning models boast remarkable predictive capabilities.
But can you trust your model?
Will it work in deployment?
What else can it tell you about the world?
We want models to be not only good, but interpretable.
And yet the task of interpretation appears underspecified.
Papers provide diverse and sometimes non-overlapping motivations for interpretability, and offer myriad notions of what attributes render models interpretable.
Despite this ambiguity, many papers proclaim interpretability axiomatically, absent further explanation.
In this paper, we seek to refine the discourse on interpretability.
First, we examine the motivations underlying interest in interpretability, finding them to be diverse and occasionally discordant.
Then, we address model properties and techniques thought to confer interpretability, identifying transparency to humans and post-hoc explanations as competing notions.
Throughout, we discuss the feasibility and desirability of different notions, and question the oft-made assertions that linear models are interpretable and that deep neural networks are not.
The need for higher agricultural productivity has demanded the intensive use of pesticides.
However, their correct use depends on assessment methods that can accurately predict how well the pesticides' spraying covered the intended crop region.
Some methods have been proposed in the literature, but their high cost and low portability harm their widespread use.
This paper proposes and experimentally evaluates a new methodology based on the use of a smartphone-based mobile application, named DropLeaf.
Experiments performed using DropLeaf showed that, in addition to its versatility, it can predict pesticide spraying coverage with high accuracy.
DropLeaf is a five-fold image-processing methodology based on: (i) color space conversion, (ii) threshold noise removal, (iii) convolutional operations of dilation and erosion, (iv) detection of contour markers in the water-sensitive card, and, (v) identification of droplets via the marker-controlled watershed transformation.
The authors performed successful experiments over two case studies, the first using a set of synthetic cards and the second using a real-world crop.
The proposed tool can be broadly used by farmers equipped with conventional mobile phones, improving the use of pesticides with health, environmental and financial benefits.
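Steps (ii) and (iii) of the pipeline, thresholding followed by dilation and erosion (a morphological closing), can be sketched in plain Python on a binary image represented as a set of pixel coordinates (a didactic sketch, not DropLeaf's actual implementation):

```python
# 3x3 structuring element: the pixel itself plus its 8 neighbours.
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def threshold(gray, t):
    """Step (ii): keep pixels darker than t (droplet stains on the card)."""
    return {(r, c) for (r, c), v in gray.items() if v < t}

def dilate(pixels):
    """Step (iii a): grow each foreground pixel by the 3x3 element."""
    return {(r + dr, c + dc) for r, c in pixels for dr, dc in OFFSETS}

def erode(pixels):
    """Step (iii b): keep pixels whose whole 3x3 neighbourhood is foreground."""
    return {(r, c) for r, c in pixels
            if all((r + dr, c + dc) in pixels for dr, dc in OFFSETS)}

def close_gaps(pixels):
    """Dilation followed by erosion fills small holes between droplet pixels."""
    return erode(dilate(pixels))
```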
Face deidentification is an active topic amongst privacy and security researchers.
Early deidentification methods relying on image blurring or pixelization were replaced in recent years with techniques based on formal anonymity models that provide privacy guarantees and at the same time aim at retaining certain characteristics of the data even after deidentification.
The latter aspect is particularly important, as it allows the deidentified data to be exploited in applications for which identity information is irrelevant.
In this work we present a novel face deidentification pipeline, which ensures anonymity by synthesizing artificial surrogate faces using generative neural networks (GNNs).
The generated faces are used to deidentify subjects in images or video, while preserving non-identity-related aspects of the data and consequently enabling data utilization.
Since generative networks are very adaptive and can utilize a diverse set of parameters (pertaining to the appearance of the generated output in terms of facial expressions, gender, race, etc.), they represent a natural choice for the problem of face deidentification.
To demonstrate the feasibility of our approach, we perform experiments using automated recognition tools and human annotators.
Our results show that the recognition performance on deidentified images is close to chance, suggesting that the deidentification process based on GNNs is highly effective.
Being a matter of cognition, user interests should lend themselves to classification independently of the users' language, the social network, and the content of interest itself.
To prove it, we analyze a collection of English and Russian Twitter and Vkontakte community pages by interests of their followers.
First, we create a model of Major Interests (MaIs) with the help of expert analysis and then classify a set of pages using machine learning algorithms (SVM, Neural Network, Naive Bayes, and others).
We take three interest domains that are typical of both English and Russian-speaking communities: football, rock music, vegetarianism.
The classification results show a greater correlation between Russian-Vkontakte and Russian-Twitter pages, while English-Twitter pages appear to yield the highest score.
The line-of-sight (LoS) air-to-ground channel brings both opportunities and challenges in cellular-connected unmanned aerial vehicle (UAV) communications.
On one hand, the LoS channels make more cellular base stations (BSs) visible to a UAV as compared to the ground users, which leads to a higher macro-diversity gain for UAV-BS communications.
On the other hand, they also cause the UAV to impose more severe uplink interference on, and suffer more severe downlink interference from, the BSs, thus requiring more sophisticated inter-cell interference coordination (ICIC) techniques involving more BSs.
In this paper, we consider the uplink transmission from a UAV to cellular BSs, under spectrum sharing with the existing ground users.
To investigate the optimal ICIC design and air-ground performance trade-off, we maximize the weighted sum-rate of the UAV and existing ground users by jointly optimizing the UAV's uplink cell associations and power allocations over multiple resource blocks.
However, this problem is non-convex and difficult to solve optimally.
We first propose a centralized ICIC design to obtain a locally optimal solution based on the successive convex approximation (SCA) method.
As the centralized ICIC requires global information of the network and substantial information exchange among an excessively large number of BSs, we further propose a decentralized ICIC scheme of significantly lower complexity and signaling overhead for implementation, by dividing the cellular BSs into small-size clusters and exploiting the LoS macro-diversity for exchanging information between the UAV and cluster-head BSs only.
Numerical results show that the proposed centralized and decentralized ICIC schemes both achieve near-optimal performance; we also draw important design insights based on practical system setups.
In this paper we provide a survey of various libraries for homomorphic encryption.
We describe key features and trade-offs that should be considered while choosing the right approach for secure computation.
We then present a comparison of six commonly available homomorphic encryption libraries - SEAL, HElib, TFHE, Paillier, ElGamal and RSA - across these identified features.
Support for different languages and real-life applications is also elucidated.
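As a concrete illustration of what such libraries provide, here is a toy, insecure pure-Python sketch of Paillier's additive homomorphism (tiny hard-coded primes for readability; real libraries use ~2048-bit moduli, and this is not any surveyed library's API):

```python
import math
import random

def paillier_keygen(p=17, q=19):
    """Toy Paillier keys from tiny, insecure primes."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)               # valid because gcd(lam, n) == 1 here
    return n, lam, mu

def encrypt(m, n):
    """E(m) = (1 + n)^m * r^n mod n^2, with random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2

def decrypt(c, n, lam, mu):
    """m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def add_encrypted(c1, c2, n):
    """Additive homomorphism: E(m1) * E(m2) mod n^2 decrypts to m1 + m2 mod n."""
    return (c1 * c2) % (n * n)
```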
We study a classification problem where each feature can be acquired for a cost and the goal is to optimize a trade-off between the expected classification error and the feature cost.
We revisit a former approach that has framed the problem as a sequential decision-making problem and solved it by Q-learning with a linear approximation, where individual actions are either requests for feature values or terminate the episode by providing a classification decision.
On a set of eight problems, we demonstrate that by replacing the linear approximation with neural networks the approach becomes comparable to the state-of-the-art algorithms developed specifically for this problem.
The approach is flexible, as it can be improved with any new reinforcement learning enhancement; it allows the inclusion of a pre-trained high-performance classifier; and unlike prior art, its performance is robust across all evaluated datasets.
We have developed a system that automatically detects online jihadist hate speech with over 80% accuracy, by using techniques from Natural Language Processing and Machine Learning.
The system is trained on a corpus of 45,000 subversive Twitter messages collected from October 2014 to December 2016.
We present a qualitative and quantitative analysis of the jihadist rhetoric in the corpus, examine the network of Twitter users, outline the technical procedure used to train the system, and discuss examples of use.
The Moral Foundations Dictionary (MFD) is a useful tool for applying the conceptual framework developed in Moral Foundations Theory and quantifying the moral meanings implicated in the linguistic information people convey.
However, the applicability of the MFD is limited because it is available only in English.
Translated versions of the MFD are therefore needed to study morality across various cultures, including non-Western cultures.
The contribution of this paper is two-fold.
We developed the first Japanese version of the MFD (referred to as the J-MFD) by introducing a semi-automated method---this serves as a reference when translating the MFD into other languages.
We next tested the validity of the J-MFD by analyzing open-ended written texts about the situations that Japanese participants thought followed and violated the five moral foundations.
We found that the J-MFD correctly categorized the Japanese participants' descriptions into the corresponding moral foundations, and that the Moral Foundations Questionnaire (MFQ) scores were correlated with the frequency of situations, of total words, and of J-MFD words in the participants' descriptions for the Harm and Fairness foundations.
The J-MFD can be used to study morality unique to the Japanese and cultural differences in moral behavior.
Bounded rationality investigates utility-optimizing decision-makers with limited information-processing power.
In particular, information theoretic bounded rationality models formalize resource constraints abstractly in terms of relative Shannon information, namely the Kullback-Leibler Divergence between the agents' prior and posterior policy.
Between prior and posterior lies an anytime deliberation process that can be instantiated by sample-based evaluations of the utility function through Markov Chain Monte Carlo (MCMC) optimization.
The simplest model assumes a fixed prior and can relate abstract information-theoretic processing costs to the number of sample evaluations.
However, more advanced models would also address the question of learning, that is, how the prior is adapted over time so that generated prior proposals become more efficient.
In this work we investigate generative neural networks as priors that are optimized concurrently with anytime sample-based decision-making processes such as MCMC.
We evaluate this approach on toy examples.
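The resource constraint referenced above is the Kullback-Leibler divergence between posterior and prior policy; a minimal sketch for discrete action distributions:

```python
import math

def kl_divergence(posterior, prior):
    """D_KL(posterior || prior) = sum_a posterior(a) * log(posterior(a) / prior(a)).

    posterior, prior: dicts mapping actions to probabilities; the prior must
    be positive wherever the posterior is (otherwise the divergence is infinite).
    """
    return sum(p * math.log(p / prior[a])
               for a, p in posterior.items() if p > 0)
```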
In this paper we present the Creative Invention Benchmark (CrIB), a 2000-problem benchmark for evaluating a particular facet of computational creativity.
Specifically, we address combinational p-creativity, the creativity at play when someone combines existing knowledge to achieve a solution novel to that individual.
We present generation strategies for the five problem categories of the benchmark and a set of initial baselines.
Computational research and data analytics increasingly relies on complex ecosystems of open source software (OSS) "libraries" -- curated collections of reusable code that programmers import to perform a specific task.
Software documentation for these libraries is crucial in helping programmers/analysts know what libraries are available and how to use them.
Yet documentation for open source software libraries is widely considered low-quality.
This article is a collaboration between CSCW researchers and contributors to data analytics OSS libraries, based on ethnographic fieldwork and qualitative interviews.
We examine several issues concerning the formats, practices, and challenges of documentation in these largely volunteer-based projects.
Many different kinds and formats of documentation exist around such libraries, playing a variety of educational, promotional, and organizational roles.
The work behind documentation is similarly multifaceted, including writing, reviewing, maintaining, and organizing documentation.
Different aspects of documentation work require contributors to have different sets of skills and overcome various social and technical barriers.
Finally, most of our interviewees do not report high levels of intrinsic enjoyment for doing documentation work (compared to writing code).
Their motivation is affected by personal and project-specific factors, such as the perceived level of credit for doing documentation work versus more "technical" tasks like adding new features or fixing bugs.
In studying documentation work for data analytics OSS libraries, we gain a new window into the changing practices of data-intensive research, as well as help practitioners better understand how to support this often invisible and infrastructural work in their projects.
To save time and money, businesses and individuals have begun outsourcing their data and computations to cloud computing services.
These entities would, however, like to ensure that the queries they request from the cloud services are being computed correctly.
In this paper, we use the principles of economics and competition to vastly reduce the complexity of query verification on outsourced data.
We consider two cases: First, we consider the scenario where multiple non-colluding data outsourcing services exist, and then we consider the case where only a single outsourcing service exists.
Using a game theoretic model, we show that given the proper incentive structure, we can effectively deter dishonest behavior on the part of the data outsourcing services with very few computational and monetary resources.
We prove that the incentive for an outsourcing service to cheat can be reduced to zero.
Finally, we show that a simple verification method can achieve this reduction through extensive experimental evaluation.
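The zero-incentive condition above can be sketched as a one-line expected-utility calculation. This is an illustrative model, not the paper's exact game: the spot-check probability, cheating gain, and fine below are hypothetical values, and the closed form assumes a cheater is always caught when checked.

```python
# Hedged sketch: with spot-check probability p and fine F, a server that
# saves cost c by cheating has expected payoff (1 - p) * c - p * F.
# Choosing F >= c * (1 - p) / p drives the incentive to cheat to zero.

def cheating_incentive(cheat_gain, check_prob, fine):
    """Expected payoff of cheating on one query (caught iff checked)."""
    return (1 - check_prob) * cheat_gain - check_prob * fine

def deterrent_fine(cheat_gain, check_prob):
    """Smallest fine that makes the expected payoff of cheating zero."""
    return cheat_gain * (1 - check_prob) / check_prob

# Even a 1% verification rate suffices if the fine is large enough.
f = deterrent_fine(cheat_gain=10.0, check_prob=0.01)
assert abs(cheating_incentive(10.0, 0.01, f)) < 1e-9
```

The point of the sketch is that verification can be cheap: the fine scales like `1/p`, so rare checks with proportionally larger penalties keep the cheating incentive at zero.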
Face alignment, which is the task of finding the locations of a set of facial landmark points in an image of a face, is useful in widespread application areas.
Face alignment is particularly challenging when there are large variations in pose (in-plane and out-of-plane rotations) and facial expression.
To address this issue, we propose a cascade in which each stage consists of a mixture of regression experts.
Each expert learns a customized regression model that is specialized to a different subset of the joint space of pose and expressions.
The system is invariant to a predefined class of transformations (e.g., affine), because the input is transformed to match each expert's prototype shape before the regression is applied.
We also present a method to include deformation constraints within the discriminative alignment framework, which makes our algorithm more robust.
Our algorithm significantly outperforms previous methods on publicly available face alignment datasets.
This paper presents a practical approach to rapidly introduce new dataplane functionality into networks: End-hosts embed tiny programs into packets to actively query and manipulate a network's internal state.
We show how this "tiny packet program" (TPP) interface gives end-hosts unprecedented visibility into network behavior, enabling them to work with the network to achieve a common goal.
Our design leverages what each component does best: (a) switches forward and execute tiny packet programs (at most 5 instructions) at line rate, and (b) end-hosts perform arbitrary computation on network state, which is easy to evolve.
Using a hardware prototype on a NetFPGA, we show our design is feasible, at a reasonable cost.
By implementing three different research proposals, we show that TPPs are also useful.
And finally, we present an architecture in which they can be made secure.
Person Re-identification (ReID) is to identify the same person across different cameras.
It is a challenging task due to the large variations in person pose, occlusion, background clutter, etc. How to extract powerful features is a fundamental problem in ReID and remains an open problem today.
In this paper, we design a Multi-Scale Context-Aware Network (MSCAN) to learn powerful features over full body and body parts, which can well capture the local context knowledge by stacking multi-scale convolutions in each layer.
Moreover, instead of using predefined rigid parts, we propose to learn and localize deformable pedestrian parts using Spatial Transformer Networks (STN) with novel spatial constraints.
The learned body parts can alleviate some difficulties, e.g., pose variations and background clutter, in part-based representation.
Finally, we integrate the representation learning processes of full body and body parts into a unified framework for person ReID through multi-class person identification tasks.
Extensive evaluations on current challenging large-scale person ReID datasets, including the image-based Market1501, CUHK03 and sequence-based MARS datasets, show that the proposed method achieves the state-of-the-art results.
The DLVHEX system implements the HEX-semantics, which integrates answer set programming (ASP) with arbitrary external sources.
Since its first release ten years ago, significant advancements have been achieved.
Most importantly, the exploitation of properties of external sources led to efficiency improvements and flexibility enhancements of the language, and technical improvements on the system side increased users' convenience.
In this paper, we present the current status of the system and point out the most important recent enhancements over early versions.
While existing literature focuses on theoretical aspects and specific components, a bird's eye view of the overall system is missing.
In order to promote the system for real-world applications, we further present applications which were already successfully realized on top of DLVHEX.
This paper is under consideration for acceptance in Theory and Practice of Logic Programming.
To avoid the foreseeable spectrum crunch, LTE operators have started to explore the option of directly using the 5 GHz unlicensed spectrum band already used by IEEE 802.11 (WiFi).
However, as LTE is not designed with shared spectrum access in mind, there is a major issue of coexistence with WiFi networks.
Current coexistence schemes to be deployed at the LTE-U BS create coexistence gaps only in one domain (e.g., time, frequency, or space) and can provide only incremental gains due to the lack of coordination among the coexisting WiFi and LTE-U networks.
Therefore, we propose a coordinated coexistence scheme which relies on cooperation between neighboring LTE-U and WiFi networks.
Our proposal suggests that LTE-U BSs equipped with multiple antennas can create coexistence gaps in space domain in addition to the time domain gaps by means of cross-technology interference nulling towards WiFi nodes in the interference range.
In return, LTE-U can increase its own airtime utilization while slightly trading off its antenna diversity.
The cooperation offers benefits to both LTE-U and WiFi in terms of improved throughput and decreased channel access delay.
More specifically, system-level simulations reveal a throughput gain of up to 221% for the LTE-U network and 44% for the WiFi network depending on the setting, e.g., the distance between the two cells, the number of LTE antennas, and the number of WiFi users in the LTE-U BS neighborhood.
Our proposal provides significant benefits especially for moderate separation distances between LTE-U/WiFi cells where interference from a neighboring network might be severe due to the hidden network problem.
We explore a collaborative multi-agent reinforcement learning setting where a team of agents attempts to solve cooperative tasks in partially-observable environments.
In this scenario, learning an effective communication protocol is key.
We propose a communication architecture that allows for targeted communication, where agents learn both what messages to send and who to send them to, solely from downstream task-specific reward without any communication supervision.
Additionally, we introduce a multi-stage communication approach where the agents coordinate via multiple rounds of communication before taking actions in the environment.
We evaluate our approach on a diverse set of cooperative multi-agent tasks, of varying difficulties, with varying numbers of agents, in a variety of environments ranging from 2D grid layouts of shapes and simulated traffic junctions to complex 3D indoor environments.
We demonstrate the benefits of targeted as well as multi-stage communication.
Moreover, we show that the targeted communication strategies learned by agents are both interpretable and intuitive.
In the last decade, data analytics has rapidly progressed from traditional disk-based processing to modern in-memory processing.
However, little effort has been devoted to enhancing performance at the micro-architecture level.
This paper characterizes the performance of in-memory data analytics using Apache Spark framework.
We use a single node NUMA machine and identify the bottlenecks hampering the scalability of workloads.
We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads.
Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread-level load imbalance.
Further, at the micro-architecture level, we observe memory bound latency to be the major cause of work time inflation.
Measuring science is based on comparing articles to similar others.
However, keyword-based groups of thematically similar articles are predominantly small.
These small sizes keep the statistical errors of comparisons high.
With the growing availability of bibliographic data such statistical errors can be reduced by merging methods of thematic grouping, citation networks and keyword co-usage.
Wireless Network-on-Chip (WNoC) appears as a promising alternative to conventional interconnect fabrics for chip-scale communications.
The WNoC paradigm has been extensively analyzed from the physical, network and architecture perspectives assuming mmWave band operation.
However, there has not been a comprehensive study at this band for realistic chip packages and, thus, the characteristics of such a wireless channel are not yet fully understood.
This work addresses this issue by accurately modeling a flip-chip package and investigating the wave propagation inside it.
Through parametric studies, a locally optimal configuration for 60 GHz WNoC is obtained, showing that chip-wide attenuation below 32.6 dB could be achieved with standard processes.
Finally, the applicability of the methodology is discussed for higher bands and other integrated environments such as a Software-Defined Metamaterial (SDM).
Although transfer learning has been shown to be successful for tasks like object and speech recognition, its applicability to question answering (QA) has yet to be well-studied.
In this paper, we conduct extensive experiments to investigate the transferability of knowledge learned from a source QA dataset to a target dataset using two QA models.
The performance of both models on a TOEFL listening comprehension test (Tseng et al., 2016) and MCTest (Richardson et al., 2013) is significantly improved via a simple transfer learning technique from MovieQA (Tapaswi et al., 2016).
In particular, one of the models achieves the state-of-the-art on all target datasets; for the TOEFL listening comprehension test, it outperforms the previous best model by 7%.
Finally, we show that transfer learning is helpful even in unsupervised scenarios when correct answers for target QA dataset examples are not available.
A new code for contours of plane images is proposed.
This code was applied to optical character recognition of printed and handwritten characters.
It can also be applied to the recognition of any visual image.
It has long been known that certain superquantum nonlocal correlations collapse communication complexity, and it is conjectured that a statement like "communication complexity is not trivial" may provide an intuitive information-theoretic axiom for quantum mechanics.
With the goal of addressing this conjecture, we take aim at collapsing communication complexity using weaker nonlocal correlations, and present a no-go theorem for a broad class of approaches.
To achieve this, we investigate fault-tolerant computation by noisy circuits in a new light.
Our main technical result is that, perhaps surprisingly, noiseless XOR gates are not more helpful than noisy ones in read-once formulas that have noisy AND gates for the task of building amplifiers.
We also formalize a connection between fault-tolerant computation and amplification, and highlight new directions and open questions in fault-tolerant computation with noisy circuits.
Our results inform the relationship between superquantum nonlocality and the collapse of communication complexity.
The Keystroke Level Model (KLM) and Fitts' Law constitute core teaching subjects in most HCI courses, as well as many courses on software design and evaluation.
The KLM Form Analyzer (KLM-FA) has been introduced as a practitioner's tool to facilitate web form design and evaluation, based on these established HCI predictive models.
It was also hypothesized that KLM-FA can be used for educational purposes, since it provides step-by-step tracing of the KLM modeling for any web form filling task, according to various interaction strategies or user characteristics.
In our previous work, we found that KLM-FA supports teaching and learning of HCI modeling in the context of distance education.
This paper reports a study investigating the learning effectiveness of KLM-FA in the context of campus-based higher education.
Students of a software quality course completed a knowledge test after the lecture-based instruction (pre-test condition) and after being involved in a KLM-FA-mediated learning activity (post-test condition).
They also provided post-test ratings for their educational experience and the tool's usability.
Results showed that KLM-FA can significantly improve the learning of HCI modeling.
In addition, participating students rated their perceived educational experience as very satisfactory and the perceived usability of KLM-FA as good to excellent.
Studies estimate that there will be 266,120 new cases of invasive breast cancer and 40,920 breast-cancer-induced deaths in 2018 alone.
Despite the pervasiveness of this affliction, the current process to obtain an accurate breast cancer prognosis is tedious and time consuming, requiring a trained pathologist to manually examine histopathological images in order to identify the features that characterize various cancer severity levels.
We propose MITOS-RCNN: a novel region based convolutional neural network (RCNN) geared for small object detection to accurately grade one of the three factors that characterize tumor belligerence described by the Nottingham Grading System: mitotic count.
Other computational approaches to mitotic figure counting and detection do not demonstrate ample recall or precision to be clinically viable.
Our models outperformed all previous participants in the ICPR 2012 challenge, the AMIDA 2013 challenge and the MITOS-ATYPIA-14 challenge along with recently published works.
Our model achieved an F-measure score of 0.955, a 6.11% improvement in accuracy from the most accurate of the previously proposed models.
This article presents an anatomy of PhD programmes in Hellenic universities' departments of computer science/engineering from the perspective of research productivity and impact.
The study aims at showing the dynamics of research conducted in computer science/engineering departments, and after recognizing weaknesses, to motivate the stakeholders to take actions that will improve competition and excellence.
Beneficiaries of this investigation are the following entities: a) the departments themselves can assess their performance relative to that of other departments and then set strategic goals and design procedures to achieve them, b) supervisors can assess the part of their research conducted with PhDs and set their own goals, c) former PhDs who can identify their relative success, and finally d) prospective PhD students who can consider the efficacy of departments and supervisors in conducting high-impact research as one more significant factor in designing the doctoral studies they will follow.
Highly regularized LSTMs achieve impressive results on several benchmark datasets in language modeling.
We propose a new regularization method based on decoding the last token in the context using the predicted distribution of the next token.
This biases the model towards retaining more contextual information, in turn improving its ability to predict the next token.
With negligible overhead in the number of parameters and training time, our Past Decode Regularization (PDR) method achieves a word level perplexity of 55.6 on the Penn Treebank and 63.5 on the WikiText-2 datasets using a single softmax.
We also show gains by using PDR in combination with a mixture-of-softmaxes, achieving a word level perplexity of 53.8 and 60.5 on these datasets.
In addition, our method achieves 1.169 bits-per-character on the Penn Treebank Character dataset for character level language modeling.
These results constitute a new state-of-the-art in their respective settings.
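The regularization idea above, decoding the last context token from the predicted next-token distribution, can be sketched numerically. This is a toy illustration under assumptions: the extra decoder matrix, the use of an expected embedding, and the regularization weight are all hypothetical, not the paper's exact formulation.

```python
import numpy as np

# Toy sketch of Past Decode Regularization (PDR). Shapes, W_dec, and the
# expected-embedding step are illustrative assumptions.
rng = np.random.default_rng(0)
V, H = 5, 4                        # vocab size, hidden size
E = rng.normal(size=(V, H))        # token embeddings
W_out = rng.normal(size=(V, H))    # softmax weights for next-token prediction
W_dec = rng.normal(size=(V, H))    # decoder for the last context token

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

h = rng.normal(size=H)             # hidden state after reading the context
next_dist = softmax(W_out @ h)     # p(next token | context)

# PDR term: decode the *last context token* from the predicted distribution,
# biasing the model towards retaining contextual information.
expected_embed = E.T @ next_dist   # expected embedding under next_dist
past_dist = softmax(W_dec @ expected_embed)

next_token, last_token = 3, 2      # toy targets
ce_loss = -np.log(next_dist[next_token])
pdr_loss = -np.log(past_dist[last_token])
total = ce_loss + 0.1 * pdr_loss   # 0.1 is an assumed regularization weight
```

Since the PDR term reuses the predicted distribution, the parameter overhead is limited to the extra decoder, consistent with the "negligible overhead" claim.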
Research on sound event detection (SED) with weak labeling has mostly focused on presence/absence labeling, which provides no temporal information at all about the event occurrences.
In this paper, we consider SED with sequential labeling, which specifies the temporal order of the event boundaries.
The conventional connectionist temporal classification (CTC) framework, when applied to SED with sequential labeling, does not localize long events well due to a "peak clustering" problem.
We adapt the CTC framework and propose connectionist temporal localization (CTL), which successfully solves the problem.
Evaluation on a subset of Audio Set shows that CTL closes a third of the gap between presence/ absence labeling and strong labeling, demonstrating the usefulness of the extra temporal information in sequential labeling.
CTL also makes it easy to combine sequential labeling with presence/absence labeling and strong labeling.
In this work, we investigate various methods to deal with semantic labeling of very high resolution multi-modal remote sensing data.
Especially, we study how deep fully convolutional networks can be adapted to deal with multi-modal and multi-scale remote sensing data for semantic labeling.
Our contributions are threefold: a) we present an efficient multi-scale approach to leverage both a large spatial context and the high resolution data, b) we investigate early and late fusion of Lidar and multispectral data, c) we validate our methods on two public datasets with state-of-the-art results.
Our results indicate that late fusion makes it possible to recover errors stemming from ambiguous data, while early fusion allows for better joint-feature learning but at the cost of higher sensitivity to missing data.
In the field of generic object tracking numerous attempts have been made to exploit deep features.
Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features.
In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking.
We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness.
We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking.
Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy.
Extensive experiments are performed on four challenging datasets.
On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of 17% in EAO.
The groundbreaking experiment of Travers and Milgram demonstrated the so-called "six degrees of separation" phenomenon, by which any individual in the world is able to contact an arbitrary, hitherto-unknown, individual by means of a short chain of social ties.
Despite the large number of empirical and theoretical studies to explain the Travers-Milgram experiment, some fundamental questions are still open: why are some individuals more likely than others to discover short friend-of-a-friend communication chains?
Can we rank individuals on the basis of their ability to discover short chains?
To answer these questions, we extend the concept of potential gain, originally defined in the context of Web analysis, to social networks and we define a novel index, called "the navigability score," that ranks nodes in a network on the basis of how their position facilitates the discovery of short chains connecting them to arbitrary target nodes in the network.
We define two variants of potential gain, called the geometric and the exponential potential gain, and present fast algorithms to compute them.
Our theoretical and experimental analysis shows that computing the geometric and exponential potential gains is affordable even on large real-life graphs.
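One plausible reading of a "geometric potential gain" is a walk count discounted geometrically by length. The sketch below is an assumption about the index's general shape, not the paper's exact definition: the decay factor, truncation depth, and walk-count form are all illustrative.

```python
import numpy as np

# Hedged sketch: p(v) = sum_{k=1..K} alpha^k * (#walks of length k from v),
# truncated at K steps. alpha, K, and the walk-count form are assumptions.
def geometric_potential_gain(A, alpha=0.5, K=10):
    n = A.shape[0]
    walks = np.ones(n)            # walks of length 0 from each node
    gain = np.zeros(n)
    for k in range(1, K + 1):
        walks = A @ walks         # walks[v] = number of length-k walks from v
        gain += (alpha ** k) * walks
    return gain

# 3-node path graph 0-1-2: the central node reaches more short walks,
# so it should score highest under any such index.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
g = geometric_potential_gain(A)
assert g[1] > g[0] and abs(g[0] - g[2]) < 1e-12
```

Each evaluation is a handful of sparse matrix-vector products, which is consistent with the claim that the index is affordable on large graphs.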
The rise of social media provides a great opportunity for people to reach out to their social connections to satisfy their information needs.
However, generic social media platforms are not explicitly designed to assist information seeking of users.
In this paper, we propose a novel framework to identify the social connections of a user able to satisfy his information needs.
The information need of a social media user is subjective and personal, and we investigate the utility of his social context to identify people able to satisfy it.
We present questions users post on Twitter as instances of information seeking activities in social media.
We infer soft community memberships of the asker and his social connections by integrating network and content information.
Drawing concepts from the social foci theory, we identify answerers who share communities with the asker w.r.t. the question.
Our experiments demonstrate that the framework is effective in identifying answerers to social media questions.
A worldwide movement towards the publication of Open Government Data is taking place, and budget data is one of the key elements pushing this trend.
Its importance is mostly related to transparency, but publishing budget data, combined with other actions, can also improve democratic participation, allow comparative analysis of governments and boost data-driven business.
However, the lack of standards and common evaluation criteria still hinders the development of appropriate tools and the materialization of the appointed benefits.
In this paper, we present a model to analyse government initiatives to publish budget data.
We identify the main features of these initiatives with a double objective: (i) to drive a structured analysis, relating some dimensions to their possible impacts, and (ii) to derive characterization attributes to compare initiatives based on each dimension.
We define use perspectives and analyse some initiatives using this model.
We conclude that, in order to favour use perspectives, special attention must be given to user feedback, semantics standards and linking possibilities.
Anticipating future actions is a key component of intelligence, specifically when it applies to real-time systems, such as robots or autonomous cars.
While recent works have addressed prediction of raw RGB pixel values, we focus on anticipating the motion evolution in future video frames.
To this end, we construct dynamic images (DIs) by summarising moving pixels through a sequence of future frames.
We train a convolutional LSTM to predict the next DIs based on an unsupervised learning process, and then recognise the activity associated with the predicted DI.
We demonstrate the effectiveness of our approach on 3 benchmark action datasets showing that despite running on videos with complex activities, our approach is able to anticipate the next human action with high accuracy and obtain better results than the state-of-the-art methods.
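Dynamic images are commonly built by approximate rank pooling, i.e. a weighted sum of frames with closed-form coefficients. Whether the paper uses exactly this variant is an assumption; the frame shapes and values below are toy data.

```python
import numpy as np

# Hedged sketch: approximate rank pooling builds a dynamic image as a
# weighted sum of T frames with coefficients alpha_t = 2t - T - 1, so
# later frames get positive weight and earlier frames negative weight,
# summarising motion evolution into a single image.
def dynamic_image(frames):
    T = len(frames)
    coeffs = np.array([2 * t - T - 1 for t in range(1, T + 1)], dtype=float)
    return np.tensordot(coeffs, np.asarray(frames, dtype=float), axes=1)

# 5 toy "frames": constant images with increasing intensity 1..5.
frames = [np.full((4, 4), t, dtype=float) for t in range(1, 6)]
di = dynamic_image(frames)  # a single 4x4 summary image
```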
Software analytics has been widely used in software engineering for many tasks such as generating effort estimates for software projects.
One of the "black arts" of software analytics is tuning the parameters controlling a data mining algorithm.
Such hyperparameter optimization has been widely studied in other software analytics domains (e.g., defect prediction and text mining) but, so far, has not been extensively explored for effort estimation.
Accordingly, this paper seeks simple, automatic, effective and fast methods for finding good tunings for automatic software effort estimation.
We introduce a hyperparameter optimization architecture called OIL (Optimized Inductive Learning).
We test OIL on a wide range of hyperparameter optimizers using data from 945 software projects.
After tuning, large improvements in effort estimation accuracy were observed (measured in terms of standardized accuracy).
From those results, we recommend using regression trees (CART) tuned by differential evolution, combined with a default analogy-based estimator.
This particular combination of learner and optimizers often achieves in a few hours what other optimizers need days to weeks of CPU time to accomplish.
An important part of this analysis is its reproducibility and refutability.
All our scripts and data are on-line.
It is hoped that this paper will prompt and enable much more research on better methods to tune software effort estimators.
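The differential-evolution tuner recommended above can be sketched in a few lines. This is a generic DE loop, not OIL's implementation: the population size, mutation and crossover settings, and the toy objective (standing in for an estimator's error on held-out projects) are all illustrative assumptions.

```python
import numpy as np

# Minimal differential evolution: mutate with a scaled difference of two
# population members, crossover with the current candidate, keep the better.
def differential_evolution(obj, bounds, pop=20, gens=50, F=0.8, CR=0.9, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    X = rng.uniform(lo, hi, size=(pop, len(bounds)))
    fit = np.array([obj(x) for x in X])
    for _ in range(gens):
        for i in range(pop):
            idx = rng.choice([j for j in range(pop) if j != i], 3, replace=False)
            a, b, c = X[idx]
            mutant = np.clip(a + F * (b - c), lo, hi)
            cross = rng.random(len(bounds)) < CR
            trial = np.where(cross, mutant, X[i])
            f = obj(trial)
            if f < fit[i]:          # greedy selection
                X[i], fit[i] = trial, f
    best = int(fit.argmin())
    return X[best], fit[best]

# Toy objective: pretend an estimator's error is minimized when its
# (hypothetical) depth-like hyperparameter is near 7.
x, f = differential_evolution(lambda x: (x[0] - 7.0) ** 2, bounds=[(1.0, 20.0)])
assert abs(x[0] - 7.0) < 0.5
```

In practice the objective would wrap a full train/validate cycle of the effort estimator, which is why cheap optimizers matter: each evaluation is expensive.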
The impressive success of Generative Adversarial Networks (GANs) is often overshadowed by the difficulties in their training.
Despite the continuous efforts and improvements, there are still open issues regarding their convergence properties.
In this paper, we propose a simple training variation where suitable weights are defined and assist the training of the Generator.
We provide theoretical arguments why the proposed algorithm is better than the baseline training in the sense of speeding up the training process and of creating a stronger Generator.
Performance results showed that the new algorithm is more accurate in both synthetic and image datasets resulting in improvements ranging between 5% and 50%.
Crowdsourcing relies on people's contributions to meet product- or system-level objectives.
Crowdsourcing-based methods have been implemented in various cyber-physical systems and realtime markets.
This paper explores a framework for Crowdsourced Energy Systems (CES), where small-scale energy generation or energy trading is crowdsourced from distributed energy resources, electric vehicles, and shapable loads.
The merits/pillars of energy crowdsourcing are discussed.
Then, an operational model for CESs in distribution networks with different types of crowdsourcees is proposed.
The model yields a market equilibrium depicting traditional and distributed generator and load setpoints.
Given these setpoints, crowdsourcing incentives are designed to steer crowdsourcees to the equilibrium.
As the number of crowdsourcees and energy trading transactions scales up, a secure energy trading platform is required.
To that end, the presented framework is integrated with a lightweight Blockchain implementation and smart contracts.
Numerical tests are provided to showcase the overall implementation.
Learning parameters from voluminous data can be prohibitive in terms of memory and computational requirements.
We propose a "compressive learning" framework where we estimate model parameters from a sketch of the training data.
This sketch is a collection of generalized moments of the underlying probability distribution of the data.
It can be computed in a single pass on the training set, and is easily computable on streams or distributed datasets.
The proposed framework shares similarities with compressive sensing, which aims at drastically reducing the dimension of high-dimensional signals while preserving the ability to reconstruct them.
To perform the estimation task, we derive an iterative algorithm analogous to sparse reconstruction algorithms in the context of linear inverse problems.
We exemplify our framework with the compressive estimation of a Gaussian Mixture Model (GMM), providing heuristics on the choice of the sketching procedure and theoretical guarantees of reconstruction.
We experimentally show on synthetic data that the proposed algorithm yields results comparable to the classical Expectation-Maximization (EM) technique while requiring significantly less memory and fewer computations when the number of database elements is large.
We further demonstrate the potential of the approach on real large-scale data (over 10^8 training samples) for the task of model-based speaker verification.
Finally, we draw some connections between the proposed framework and approximate Hilbert space embedding of probability distributions using random features.
We show that the proposed sketching operator can be seen as an innovative method to design translation-invariant kernels adapted to the analysis of GMMs.
We also use this theoretical framework to derive information preservation guarantees, in the spirit of infinite-dimensional compressive sensing.
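The single-pass, streamable sketch described above can be illustrated with random Fourier features as the generalized moments. The frequency matrix and its scale below are assumptions for illustration; the paper's heuristics for choosing the sketching procedure are more involved.

```python
import numpy as np

# Hedged sketch: a data "sketch" as empirical generalized moments,
# here averaged complex exponentials z(x) = exp(i * W @ x).
rng = np.random.default_rng(0)
d, m = 2, 64                      # data dimension, sketch size
W = rng.normal(size=(m, d))       # random frequencies (scale is an assumption)

def sketch(X):
    """Average of complex exponentials over the dataset; one pass."""
    return np.mean(np.exp(1j * (X @ W.T)), axis=0)

# Streamability: the sketch of the full set equals the size-weighted
# average of sketches of disjoint chunks.
X = rng.normal(size=(1000, d))
full = sketch(X)
parts = (400 * sketch(X[:400]) + 600 * sketch(X[400:])) / 1000
assert np.allclose(full, parts)
```

Because the sketch is a fixed-size average, memory is independent of the number of samples, which is the source of the savings over EM on large databases.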
Sparse Bayesian learning is a state-of-the-art supervised learning algorithm that can choose a subset of relevant samples from the input data and make reliable probabilistic predictions.
However, in the presence of high-dimensional data with irrelevant features, traditional sparse Bayesian classifiers suffer from performance degradation and low efficiency by failing to eliminate irrelevant features.
To tackle this problem, we propose a novel sparse Bayesian embedded feature selection method that adopts truncated Gaussian distributions as both sample and feature priors.
The proposed method, called probabilistic feature selection and classification vector machine (PFCVM_LP), is able to simultaneously select relevant features and samples for classification tasks.
In order to derive the analytical solutions, Laplace approximation is applied to compute approximate posteriors and marginal likelihoods.
Finally, parameters and hyperparameters are optimized by the type-II maximum likelihood method.
Experiments on three datasets validate the performance of PFCVM_LP along two dimensions: classification performance and effectiveness for feature selection.
Finally, we analyze the generalization performance and derive a generalization error bound for PFCVM_LP.
By tightening the bound, the importance of feature selection is demonstrated.
Discrete energy minimization is a ubiquitous task in computer vision, yet is NP-hard in most cases.
In this work we propose a multiscale framework for coping with the NP-hardness of discrete optimization.
Our approach utilizes algebraic multiscale principles to efficiently explore the discrete solution space, yielding improved results on challenging, non-submodular energies for which current methods provide unsatisfactory approximations.
In contrast to popular multiscale methods in computer vision, which build an image pyramid, our framework acts directly on the energy to construct an energy pyramid.
Deriving a multiscale scheme from the energy itself makes our framework application independent and widely applicable.
Our framework gives rise to two complementary energy coarsening strategies: one in which coarser scales involve fewer variables, and a more revolutionary one in which the coarser scales involve fewer discrete labels.
We empirically evaluated our unified framework on a variety of both non-submodular and submodular energies, including energies from Middlebury benchmark.
Channel estimation at millimeter wave (mmWave) is challenging when large antenna arrays are used.
Prior work has leveraged the sparse nature of mmWave channels via compressed sensing based algorithms for channel estimation.
Most of these algorithms, though, assume perfect synchronization and are vulnerable to phase errors that arise due to carrier frequency offset (CFO) and phase noise.
Recently, sparsity-aware, non-coherent beamforming algorithms that are robust to phase errors were proposed for narrowband phased array systems with full resolution analog-to-digital converters (ADCs).
Such energy based algorithms, however, are not robust to heavy quantization at the receiver.
In this paper, we develop a joint CFO and wideband channel estimation algorithm that is scalable across different mmWave architectures.
Our method exploits the sparsity of the mmWave MIMO channel in the angle-delay domain, in addition to the compressibility of the phase error vector.
We formulate the joint estimation as a sparse bilinear optimization problem and then use message passing for recovery.
We also give an efficient implementation of a generalized bilinear message passing algorithm for the joint estimation in mmWave systems with one-bit ADCs.
Simulation results show that our method is able to recover the CFO and the channel compressively, even in the presence of phase noise.
We describe opportunities and challenges with wireless robotic materials.
Robotic materials are multi-functional composites that tightly integrate sensing, actuation, computation and communication to create smart composites that can sense their environment and change their physical properties in an arbitrary programmable manner.
Computation and communication in such materials are based on miniature, possibly wireless, devices that are scattered in the material and interface with sensors and actuators inside the material.
Whereas routing and processing of information within the material build upon results from the field of sensor networks, robotic materials are pushing the limits of sensor networks in both size (down to the order of microns) and numbers of devices (up to the order of millions).
In order to solve the algorithmic and systems challenges of such an approach, which will involve not only computer scientists, but also roboticists, chemists and material scientists, the community requires a common platform - much like the "Mote" that bootstrapped the widespread adoption of the field of sensor networks - that is small, provides ample computation, is equipped with basic networking functionalities, and preferably can be powered wirelessly.
To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation.
We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed.
We also find that continued training does not move the model very far from the out-of-domain model, as measured by a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.
Plain recurrent networks greatly suffer from the vanishing gradient problem while Gated Neural Networks (GNNs) such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) deliver promising results in many sequence learning tasks through sophisticated network designs.
This paper shows how we can address this problem in a plain recurrent network by analyzing the gating mechanisms in GNNs.
We propose a novel network called the Recurrent Identity Network (RIN) which allows a plain recurrent network to overcome the vanishing gradient problem while training very deep models without the use of gates.
We compare this model with IRNNs and LSTMs on multiple sequence modeling benchmarks.
The RINs demonstrate competitive performance and converge faster in all tasks.
Notably, small RIN models produce 12%--67% higher accuracy on the Sequential and Permuted MNIST datasets and reach state-of-the-art performance on the bAbI question answering dataset.
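To make the gate-free idea above concrete, here is a minimal pure-Python sketch of a recurrent step whose recurrent weights are augmented with an identity connection; the exact update rule, the ReLU choice, and the toy weights are illustrative assumptions rather than the paper's exact formulation.

```python
# Sketch of a gate-free recurrent cell with an identity connection
# added to the recurrent transition (an illustrative reading of the
# "Recurrent Identity Network" idea; the update rule and sizes here
# are assumptions, not the paper's exact specification).

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

def rin_step(W, U, x, h):
    """One step: h' = ReLU(W x + U h + h).

    Adding h back (the identity term) keeps the recurrent Jacobian
    close to the identity, which is the intuition for why gradients
    vanish more slowly than in a plain tanh RNN.
    """
    return relu(vadd(vadd(matvec(W, x), matvec(U, h)), h))

# Tiny example: 2-d input, 2-d hidden state, hand-picked toy weights.
W = [[0.5, 0.0], [0.0, 0.5]]
U = [[0.1, 0.0], [0.0, 0.1]]
h = [0.0, 0.0]
for x in [[1.0, 0.0], [0.0, 1.0]]:
    h = rin_step(W, U, x, h)
```
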
Estimation of Distribution Algorithms (EDAs) require flexible probability models that can be efficiently learned and sampled.
Generative Adversarial Networks (GAN) are generative neural networks which can be trained to implicitly model the probability distribution of given data, and it is possible to sample this distribution.
We integrate a GAN into an EDA and evaluate the performance of this system when solving combinatorial optimization problems with a single objective.
We use several standard benchmark problems and compare the results to state-of-the-art multivariate EDAs.
GAN-EDA does not yield competitive results - the GAN lacks the ability to quickly learn a good approximation of the probability distribution.
A key reason seems to be the large amount of noise present in the first EDA generations.
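For context, the surrounding EDA loop looks like the following minimal univariate sketch (PBIL-style) on the OneMax problem; a GAN-based EDA would replace the Bernoulli probability vector with a generative network fitted to the elite set. The population sizes and learning rate are arbitrary illustrative choices.

```python
import random

def one_max(bits):
    return sum(bits)

def eda_one_max(n=20, pop=60, elite=15, lr=0.3, gens=60, seed=1):
    """Minimal univariate EDA (PBIL-style) for OneMax.

    A GAN-based EDA would replace the Bernoulli probability vector
    below with a generative network fitted to the elite set; the
    surrounding sample -> select -> re-fit loop stays the same.
    """
    rng = random.Random(seed)
    p = [0.5] * n                       # probability model
    best = None
    for _ in range(gens):
        pop_bits = [[1 if rng.random() < pi else 0 for pi in p]
                    for _ in range(pop)]
        pop_bits.sort(key=one_max, reverse=True)
        if best is None or one_max(pop_bits[0]) > one_max(best):
            best = pop_bits[0]
        # Re-fit the model on the elite individuals.
        for i in range(n):
            freq = sum(ind[i] for ind in pop_bits[:elite]) / elite
            p[i] = (1 - lr) * p[i] + lr * freq
    return best

best = eda_one_max()
```
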
We here study the behavior of political party members aiming at identifying how ideological communities are created and evolve over time in diverse (fragmented and non-fragmented) party systems.
Using public voting data of both Brazil and the US, we propose a methodology to identify and characterize ideological communities, their member polarization, and how such communities evolve over time, covering a 15-year period.
Our results reveal very distinct patterns across the two case studies, in terms of both structural and dynamic properties.
Machine translation (MT) has developed into one of the most active research topics in the natural language processing (NLP) literature.
One important issue in MT is how to evaluate an MT system reasonably and determine whether the translation system makes an improvement or not.
The traditional manual judgment methods are expensive, time-consuming, unrepeatable, and sometimes with low agreement.
On the other hand, the popular automatic MT evaluation methods have some weaknesses.
Firstly, they tend to perform well on language pairs with English as the target language, but weakly when English is used as the source.
Secondly, some methods rely on many additional linguistic features to achieve good performance, which makes the metric unable to replicate and apply to other language pairs easily.
Thirdly, some popular metrics rely on an incomplete set of factors, which results in low performance on some practical tasks.
In this thesis, to address the existing problems, we design novel MT evaluation methods and investigate their performances on different languages.
Firstly, we design augmented factors to yield highly accurate evaluation. Secondly, we design a tunable evaluation model in which the weighting of factors can be optimised according to the characteristics of languages.
Thirdly, in the enhanced version of our methods, we design concise linguistic features using POS to show that our methods can yield even higher performance when using some external linguistic resources.
Finally, we report the practical performance of our metrics in the ACL-WMT workshop shared tasks, which shows that the proposed methods are robust across different languages.
We give a mathematical formalization of `generalized data parallel' operations, a concept that covers such common scientific kernels as matrix-vector multiplication, multi-grid coarsening, load distribution, and many more.
We show that from a compact specification such computational aspects as MPI messages or task dependencies can be automatically derived.
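As a toy illustration of such a derivation (the function names and the block distribution here are hypothetical, not the paper's formalism), inter-process messages can be computed mechanically from a per-output dependency function:

```python
def owner(i, n, nprocs):
    """Block distribution: which process owns index i (hypothetical layout)."""
    block = -(-n // nprocs)             # ceiling division
    return i // block

def derive_messages(n, nprocs, deps):
    """Given deps(i) -> input indices needed to compute output i,
    derive the (sender, receiver) -> set-of-indices message table,
    i.e. the MPI traffic implied by the data-parallel specification."""
    msgs = {}
    for i in range(n):
        recv = owner(i, n, nprocs)
        for j in deps(i):
            send = owner(j, n, nprocs)
            if send != recv:
                msgs.setdefault((send, recv), set()).add(j)
    return msgs

# Three-point stencil: output i needs inputs i-1, i, i+1 (clipped).
stencil = lambda i, n=8: [j for j in (i - 1, i, i + 1) if 0 <= j < n]
table = derive_messages(8, 2, stencil)
```

With 8 indices split over 2 processes, only the halo indices 3 and 4 cross the block boundary, so only those appear in the derived message table.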
Predictive models of student success in Massive Open Online Courses (MOOCs) are a critical component of effective content personalization and adaptive interventions.
In this article we review the state of the art in predictive models of student success in MOOCs and present a categorization of MOOC research according to the predictors (features), prediction (outcomes), and underlying theoretical model.
We critically survey work across each category, providing data on the raw data source, feature engineering, statistical model, evaluation method, prediction architecture, and other aspects of these experiments.
Such a review is particularly useful given the rapid expansion of predictive modeling research in MOOCs since the emergence of major MOOC platforms in 2012.
This survey reveals several key methodological gaps, including extensive filtering of experimental subpopulations, ineffective student model evaluation, and the use of experimental data that would be unavailable for real-world student success prediction and intervention, the ultimate goal of such models.
Finally, we highlight opportunities for future research, which include temporal modeling, research bridging predictive and explanatory student models, work which contributes to learning theory, and evaluating long-term learner success in MOOCs.
We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules.
On the one hand, we exploit some of the main advantages of unsupervised algorithms: (1) the interpretability of their output, in contrast with most supervised models, which behave as a black box and (2) their robustness across different corpora and domains.
On the other hand, by introducing the concept of compositional operations and exploiting syntactic information in the form of universal dependencies, we tackle one of their main drawbacks: their rigidity on data that are structured differently depending on the language concerned.
Experiments show an improvement both over existing unsupervised methods, and over state-of-the-art supervised models when evaluating outside their corpus of origin.
Experiments also show how the same compositional operations can be shared across languages.
The system is available at http://www.grupolys.org/software/UUUSA/
Text Proposals have emerged as a class-dependent version of object proposals - efficient approaches to reduce the search space of possible text object locations in an image.
Combined with strong word classifiers, text proposals currently yield top state-of-the-art results in end-to-end scene text recognition.
In this paper we propose an improvement over the original Text Proposals algorithm of Gomez and Karatzas (2016), combining it with Fully Convolutional Networks to improve the ranking of proposals.
Results on the ICDAR RRC and the COCO-text datasets show superior performance over current state-of-the-art.
Good user experience with interactive cloud-based multimedia applications, such as cloud gaming and cloud-based VR, requires low end-to-end latency and large amounts of downstream network bandwidth at the same time.
In this paper, we present a foveated video streaming system for cloud gaming.
The system adapts video stream quality by adjusting the encoding parameters on the fly to match the player's gaze position.
We conduct measurements with a prototype that we developed for a cloud gaming system in conjunction with eye tracker hardware.
Evaluation results suggest that such foveated streaming can reduce bandwidth requirements by more than 50%, depending on the parametrization of the foveated video coding, and that it is feasible from a latency perspective.
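A toy sketch of gaze-adaptive quality control: a tile's extra quantization-parameter (QP) offset grows with its distance from the gaze point. The foveal radius, ramp length, and QP range below are illustrative assumptions, not the prototype's actual parametrization.

```python
import math

def qp_offset(tile_center, gaze, fovea_radius=100.0, ramp=800.0, max_offset=12):
    """Map distance (pixels) from the gaze point to an extra QP offset.

    Tiles inside the foveal radius keep full quality (offset 0); beyond
    it, quality degrades linearly up to max_offset. All constants here
    are illustrative, not the prototype's parameters.
    """
    d = math.dist(tile_center, gaze)
    if d <= fovea_radius:
        return 0
    return min(max_offset, int((d - fovea_radius) / ramp * max_offset))

# Tiles along a horizontal line, gaze at frame centre of a 720p stream.
gaze = (640, 360)
offsets = [qp_offset((x, 360), gaze) for x in (640, 700, 840, 1200)]
```

The encoder would re-evaluate these offsets each time a new gaze sample arrives, which is what "adjusting the encoding parameters on the fly" amounts to.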
In this paper, we conduct extensive simulations to understand the properties of the overlay generated by BitTorrent.
We start by analyzing how the overlay properties impact the efficiency of BitTorrent.
We focus on the average peer set size (i.e., average number of neighbors), the time for a peer to reach its maximum peer set size, and the diameter of the overlay.
In particular, we show that the later a peer arrives in a torrent, the longer it takes to reach its maximum peer set size.
Then, we evaluate the impact of the maximum peer set size, the maximum number of outgoing connections per peer, and the number of NATed peers on the overlay properties.
We show that BitTorrent generates a robust overlay, but that this overlay is not a random graph.
In particular, the connectivity of a peer to its neighbors depends on its arriving order in the torrent.
We also show that a large number of NATed peers significantly compromise the robustness of the overlay to attacks.
Finally, we evaluate the impact of peer exchange on the overlay properties, and we show that it generates a chain-like overlay with a large diameter, which will adversely impact the efficiency of large torrents.
The Gilbert type bound for codes in the title is reviewed, both for small and large alphabets.
Constructive lower bounds better than these existential bounds are derived from geometric codes, either over Fp or Fp2, or over even-degree extensions of Fp. In the latter case the approach is concatenation, with a good code for the Hamming metric as the outer code and a short code for the Lee metric as the inner code.
In the former case lower bounds on the minimum Lee distance are derived by algebraic geometric arguments inspired by results of Wu, Kuijper, Udaya (2007).
A recommender system is an information filtering technology which can be used to predict preference ratings of items (products, services, movies, etc.) and/or to output a ranking of items that are likely to be of interest to the user.
Context-aware recommender systems (CARS) learn and predict the tastes and preferences of users by incorporating available contextual information in the recommendation process.
One of the major challenges in context-aware recommender systems research is the lack of automatic methods to obtain contextual information for these systems.
Considering this scenario, in this paper, we propose to use contextual information from topic hierarchies of the items (web pages) to improve the performance of context-aware recommender systems.
The topic hierarchies are constructed by an extension of the LUPI-based Incremental Hierarchical Clustering method that considers three types of information: traditional bag-of-words (technical information), and the combination of named entities (privileged information I) with domain terms (privileged information II).
We evaluated the contextual information in four context-aware recommender systems.
Different weights were assigned to each type of information.
The empirical results demonstrated that topic hierarchies with the combination of the two kinds of privileged information can provide better recommendations.
We propose an algorithm to locate the most critical nodes to network robustness.
Such critical nodes may be thought of as those most related to the notion of network centrality.
Our proposal relies only on a localized spectral analysis of a limited subnetwork centered at each node in the network.
We also present a procedure allowing the navigation from any node towards a critical node following only local information computed by the proposed algorithm.
Experimental results confirm the effectiveness of our proposal considering networks of different scales and topological characteristics.
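One generic way to realize such a localized spectral score (a sketch under our own assumptions, not the paper's exact algorithm) is to rate each node by the spectral radius of its 1-hop subnetwork's adjacency matrix:

```python
def egonet(adj, v):
    """1-hop subnetwork: node v, its neighbours, and edges among them."""
    nodes = {v} | set(adj[v])
    return {u: [w for w in adj[u] if w in nodes] for u in nodes}

def spectral_radius(adj, iters=200):
    """Largest adjacency eigenvalue via power iteration on A + I.

    The identity shift avoids the sign oscillation that plain power
    iteration exhibits on bipartite subgraphs (eigenvalues +/- lambda).
    """
    nodes = list(adj)
    x = {u: 1.0 for u in nodes}
    lam = 1.0
    for _ in range(iters):
        y = {u: x[u] + sum(x[w] for w in adj[u]) for u in nodes}
        lam = max(y.values())
        x = {u: val / lam for u, val in y.items()}
    return lam - 1.0                    # undo the shift

def criticality(adj):
    """Score every node by a purely local spectral computation."""
    return {v: spectral_radius(egonet(adj, v)) for v in adj}

# Star graph: the hub's egonet is the whole star; a leaf sees one edge.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
scores = criticality(star)
```

On the star the hub scores 2.0 (spectral radius of K1,4) and each leaf scores 1.0, so the hub is correctly flagged as most critical while every computation stayed local to a node's neighbourhood.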
Visual recognition algorithms are required today to exhibit adaptive abilities.
Given a deep model trained on a specific, given task, it would be highly desirable to be able to adapt incrementally to new tasks, preserving scalability as the number of new tasks increases, while at the same time avoiding catastrophic forgetting issues.
Recent work has shown that masking the internal weights of a given original conv-net through learned binary variables is a promising strategy.
We build upon this intuition and take into account more elaborated affine transformations of the convolutional weights that include learned binary masks.
We show that with our generalization it is possible to achieve significantly higher levels of adaptation to new tasks, enabling the approach to compete with fine tuning strategies by requiring slightly more than 1 bit per network parameter per additional task.
Experiments on two popular benchmarks showcase the power of our approach, that achieves the new state of the art on the Visual Decathlon Challenge.
This paper introduces a method for predicting the likely behaviors of continuous nonlinear systems in equilibrium in which the input values can vary.
The method uses a parameterized equation model and a lower bound on the input joint density to bound the likelihood that some behavior will occur, such as a state variable being inside a given numeric range.
Using a bound on the density instead of the density itself is desirable because often the input density's parameters and shape are not exactly known.
The new method is called SAB after its basic operations: split the input value space into smaller regions, and then bound those regions' possible behaviors and the probability of being in them.
SAB finds rough bounds at first, and then refines them as more time is given.
In contrast to other researchers' methods, SAB (1) finds all the possible system behaviors and indicates how likely they are, (2) does not approximate the distribution of possible outcomes without some measure of the error magnitude, (3) does not use discretized variable values, which limit the events for which probability bounds can be found, (4) can handle density bounds, and (5) can handle criteria such as two state variables both being inside a numeric range.
In large-scale distributed learning, security issues have become increasingly important.
Particularly in a decentralized environment, some computing units may behave abnormally, or even exhibit Byzantine failures---arbitrary and potentially adversarial behavior.
In this paper, we develop distributed learning algorithms that are provably robust against such failures, with a focus on achieving optimal statistical performance.
A main result of this work is a sharp analysis of two robust distributed gradient descent algorithms based on median and trimmed mean operations, respectively.
We prove statistical error rates for three kinds of population loss functions: strongly convex, non-strongly convex, and smooth non-convex.
In particular, these algorithms are shown to achieve order-optimal statistical error rates for strongly convex losses.
To achieve better communication efficiency, we further propose a median-based distributed algorithm that is provably robust, and uses only one communication round.
For strongly convex quadratic loss, we show that this algorithm achieves the same optimal error rate as the robust distributed gradient descent algorithms.
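A minimal sketch of the median-based aggregation step: each worker reports a gradient and the server applies the coordinate-wise median, which a minority of Byzantine workers cannot drag arbitrarily far. The quadratic toy objective, step size, and worker setup are illustrative, not from the paper.

```python
from statistics import median

def aggregate_median(grads):
    """Coordinate-wise median of the workers' gradient vectors."""
    return [median(g[i] for g in grads) for i in range(len(grads[0]))]

def robust_gd(worker_grad_fns, x, lr=0.5, steps=50):
    """Gradient descent where the server aggregates with the median
    instead of the mean, making it robust to Byzantine workers."""
    for _ in range(steps):
        grads = [g(x) for g in worker_grad_fns]
        agg = aggregate_median(grads)
        x = [xi - lr * gi for xi, gi in zip(x, agg)]
    return x

# Honest workers minimise f(x) = ||x - 1||^2 / 2, i.e. grad = x - 1;
# one Byzantine worker reports a huge adversarial constant gradient.
honest = lambda x: [xi - 1.0 for xi in x]
byzantine = lambda x: [1e6, -1e6]
x = robust_gd([honest, honest, honest, byzantine], [0.0, 0.0])
```

With mean aggregation the single Byzantine worker would send the iterate to infinity; with the median, the iterate still converges to the honest optimum (1, 1).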
The ability to build a model on a source task and subsequently adapt such model on a new target task is a pervasive need in many astronomical applications.
The problem is generally known as transfer learning in machine learning, where domain adaptation is a popular scenario.
An example is to build a predictive model on spectroscopic data to identify Supernovae Ia, while subsequently trying to adapt such a model on photometric data.
In this paper we propose a new general approach to domain adaptation that does not rely on the proximity of source and target distributions.
Instead we simply assume a strong similarity in model complexity across domains, and use active learning to mitigate the dependency on source examples.
Our work leads to a new formulation for the likelihood as a function of empirical error using a theoretical learning bound; the result is a novel mapping from generalization error to a likelihood estimation.
Results using two real astronomical problems, Supernova Ia classification and identification of Mars landforms, show two main advantages with our approach: increased accuracy performance and substantial savings in computational cost.
The utilization of web mapping is becoming increasingly important in the domain of cartography.
Users want access to spatial data on the web specific to their needs.
For this reason, different approaches have appeared for generating the maps demanded by users on the fly, but these do not suffice to guide a flexible and efficient process.
Thus, a new approach must be developed to improve this process according to user needs.
This work focuses on defining a new strategy which improves on-the-fly map generalization process and resolves the spatial conflicts.
This approach uses the multiple representation and cartographic generalization.
The map generalization process is based on the implementation of a multi-agent system in which each agent is equipped with a genetic patrimony.
We consider the transmission of packets across a lossy end-to-end network path so as to achieve low in-order delivery delay.
This can be formulated as a decision problem, namely deciding whether the next packet to send should be an information packet or a coded packet.
Importantly, this decision is made based on delayed feedback from the receiver.
While an exact solution to this decision problem is challenging, we exploit ideas from queueing theory to derive scheduling policies based on prediction of a receiver queue length that, while suboptimal, can be efficiently implemented and offer substantially better performance than state of the art approaches.
We obtain a number of useful analytic bounds that help characterise design trade-offs and our analysis highlights that the use of prediction plays a key role in achieving good performance in the presence of significant feedback delay.
Our approach readily generalises to networks of paths and we illustrate this by application to multipath transport scheduler design.
Humans are able to understand and perform complex tasks by strategically structuring the tasks into incremental steps or subgoals.
For a robot attempting to learn to perform a sequential task with critical subgoal states, such states can provide a natural opportunity for interaction with a human expert.
This paper analyzes the benefit of incorporating a notion of subgoals into Inverse Reinforcement Learning (IRL) with a Human-In-The-Loop (HITL) framework.
The learning process is interactive, with a human expert first providing input in the form of full demonstrations along with some subgoal states.
These subgoal states define a set of subtasks for the learning agent to complete in order to achieve the final goal.
The learning agent queries for partial demonstrations corresponding to each subtask as needed when the agent struggles with the subtask.
The proposed Human Interactive IRL (HI-IRL) framework is evaluated on several discrete path-planning tasks.
We demonstrate that subgoal-based interactive structuring of the learning task results in significantly more efficient learning, requiring only a fraction of the demonstration data needed for learning the underlying reward function with the baseline IRL model.
The problem of finding a finite state symbolic model which is bisimilar to a hybrid dynamical system (HDS) and has the minimum number of states is considered.
The considered class of HDS allows for discrete-valued inputs that only affect the jumps (events) of the HDS.
Representation of the HDS in the form of a transition system is revisited in comparison with prior works.
An algorithm is proposed for solving the problem; it yields the bisimulation with the minimum number of states, provided such a bisimulation exists and a parameter of the algorithm is properly tuned.
There is no need for stability assumptions and no time discretization is applied.
The results are applied to an example.
The well-known dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are the basis of several universal lossless compression techniques.
These algorithms are asymmetric regarding encoding/decoding time and memory requirements, with the former being much more demanding.
In the past years, considerable attention has been devoted to the problem of finding efficient data structures to support these searches, aiming at optimizing the encoders in terms of speed and memory.
Hash tables, binary search trees and suffix trees have been widely used for this purpose, as they allow fast search at the expense of memory.
Some recent research has focused on suffix arrays (SA), due to their low memory requirements and linear construction algorithms.
Previous work has shown how the LZ77 decomposition can be computed using a single SA or an SA with an auxiliary array with the longest common prefix information.
The SA-based algorithms use less memory than the tree-based encoders, allocating the strictly necessary amount of memory, regardless of the contents of the text to search/encode.
In this paper, we improve on previous work by proposing faster SA-based algorithms for LZ77 encoding and sub-string search, keeping their low memory requirements.
For some compression settings, on a large set of benchmark files, our low-memory SA-based encoders are also faster than tree-based encoders.
This provides time and memory efficient LZ77 encoding, being a possible replacement for trees on well known encoders like LZMA.
Our algorithm is also suited for text classification, because it provides a compact way to describe text in a bag-of-words representation, as well as a fast indexing mechanism that allows to quickly find all the sets of words that start with a given symbol, over a static dictionary.
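For concreteness, here is a naive greedy LZ77 factorization and its decoder; this O(n^2) scan only illustrates the decomposition itself, which the SA-based encoders discussed above compute far more efficiently via longest-common-prefix information.

```python
def lz77_factorize(text):
    """Greedy LZ77 decomposition: each factor is (offset, length, next_char).

    Naive quadratic search over all earlier match starts; overlapping
    matches (source running into the current position) are allowed,
    as in standard LZ77.
    """
    i, factors = 0, []
    while i < len(text):
        best_len, best_off = 0, 0
        for j in range(i):                  # candidate match starts
            k = 0
            while i + k < len(text) - 1 and text[j + k] == text[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        factors.append((best_off, best_len, text[i + best_len]))
        i += best_len + 1
    return factors

def lz77_decode(factors):
    """Invert the decomposition; out[-off] handles overlapping copies."""
    out = []
    for off, length, nxt in factors:
        for _ in range(length):
            out.append(out[-off])
        out.append(nxt)
    return "".join(out)

facs = lz77_factorize("abababb")
```
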
We present Caffe con Troll (CcT), a fully compatible end-to-end version of the popular framework Caffe with rebuilt internals.
We built CcT to examine the performance characteristics of training and deploying general-purpose convolutional neural networks across different hardware architectures.
We find that, by employing standard batching optimizations for CPU training, we achieve a 4.5x throughput improvement over Caffe on popular networks like CaffeNet.
Moreover, with these improvements, the end-to-end training time for CNNs is directly proportional to the FLOPS delivered by the CPU, which enables us to efficiently train hybrid CPU-GPU systems for CNNs.
HIV/AIDS spread depends upon complex patterns of interaction among various sub-sets emerging at population level.
This added complexity makes it difficult to study and model AIDS and its dynamics.
AIDS is therefore a natural candidate to be modeled using agent-based modeling, a paradigm well-known for modeling Complex Adaptive Systems (CAS).
While agent-based models are well known to effectively model CAS, models can often be ambiguous, and the use of purely text-based specifications (such as ODD) can make models difficult to replicate.
Previous work has shown how formal specification may be used in conjunction with agent-based modeling to develop models of various CAS.
However, to the best of our knowledge, no such model has been developed for AIDS.
In this paper, we present a Formal Agent-Based Simulation modeling framework (FABS-AIDS) for an AIDS-based CAS.
FABS-AIDS employs the use of a formal specification model in conjunction with an agent-based model to reduce ambiguity as well as improve clarity in the model definition.
The proposed model demonstrates the effectiveness of using formal specification in conjunction with agent-based simulation for developing models of CAS in general and, social network-based agent-based models, in particular.
In this paper, we introduce the notion of Plausible Deniability in an information theoretic framework.
We consider a scenario where an entity that eavesdrops through a broadcast channel summons one of the parties in a communication protocol to reveal their message (or signal vector).
It is desirable that the summoned party have enough freedom to produce a fake output that is likely plausible given the eavesdropper's observation.
We examine three variants of this problem -- Message Deniability, Transmitter Deniability, and Receiver Deniability.
In the first setting, the message sender is summoned to produce the sent message.
Similarly, in the second and third settings, the transmitter and the receiver are required to produce the transmitted codeword, and the received vector respectively.
For each of these settings, we examine the maximum communication rate that allows a given minimum rate of plausible fake outputs.
For the Message and Transmitter Deniability problems, we fully characterise the capacity region for general broadcast channels, while for the Receiver Deniability problem, we give an achievable rate region for physically degraded broadcast channels.
This paper proposes a generic selective-candidate framework with similarity selection rule (SCSS) for performance enhancement of well-established evolutionary optimization algorithms.
It is done by using a more efficient selective searching direction.
In the SCSS framework, M (M > 1) candidates are generated from each current solution by M independent reproduction procedures.
The winner is then determined by employing a similarity selection rule that achieves a balance between exploitation and exploration.
This computationally light rule simultaneously considers the evolution status (fitness ranking information) of the current solution as well as its Euclidean distances to each of the M candidates.
The SCSS framework can be easily applied to evolutionary algorithms or swarm intelligences.
Experiments conducted with 60 benchmark functions show the superiority of SCSS in three classic, four state-of-the-art and four up-to-date algorithms.
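A sketch of one plausible instantiation of the framework; the concrete rule below (better-ranked solutions keep the candidate nearest to them for exploitation, worse-ranked ones keep the farthest for exploration) and the top_frac split are our own illustrative assumptions, not the exact published rule.

```python
import math
import random

def scss_select(population, fitness, reproduce, M=3, top_frac=0.5, rng=None):
    """Selective-candidate step with a similarity selection rule.

    For each current solution, M candidates are generated by M
    independent reproduction calls; the winner is chosen by comparing
    fitness rank against Euclidean distance to the parent. The exact
    rule here is an illustrative assumption.
    """
    rng = rng or random.Random(0)
    order = sorted(range(len(population)), key=lambda i: fitness(population[i]))
    rank = {i: r for r, i in enumerate(order)}      # 0 = best (minimisation)
    new_pop = []
    for i, sol in enumerate(population):
        cands = [reproduce(sol, rng) for _ in range(M)]
        dist = lambda c: math.dist(c, sol)
        exploit = rank[i] < top_frac * len(population)
        new_pop.append(min(cands, key=dist) if exploit else max(cands, key=dist))
    return new_pop

# Toy use: sphere function, Gaussian mutation as the reproduction operator.
sphere = lambda x: sum(v * v for v in x)
mutate = lambda x, rng: [v + rng.gauss(0, 0.1) for v in x]
pop = [[1.0, 1.0], [5.0, 5.0]]
new_pop = scss_select(pop, sphere, mutate, M=3)
```

The same wrapper applies unchanged to any reproduction operator (DE mutation, PSO velocity update, etc.), which is what makes the framework generic.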
The Industrial Internet market is targeted to grow by trillions of US dollars by the year 2030, driven by adoption, deployment and integration of billions of intelligent devices and their associated data.
This digital expansion faces a number of significant challenges, including reliable data management, security and privacy.
Realizing the benefits from this evolution is made more difficult because a typical industrial plant includes multiple vendors and legacy technology stacks.
Aggregating all the raw data to a single data center before performing analysis increases response times, raising performance concerns in traditional markets and requiring a compromise between data duplication and data access performance.
Similar to the way microservices can integrate disparate information technologies without imposing monolithic cross-cutting architecture impacts, we propose microdatabases to manage the data heterogeneity of the Industrial Internet while allowing records to be captured and secured close to the industrial processes, but also be made available near the applications that can benefit from the data.
A microdatabase is an abstraction of a data store that standardizes and protects the interactions between distributed data sources, providers and consumers.
It integrates an information model with discoverable object types that can be browsed interactively and programmatically, and supports repository instances that evolve with their own lifecycles.
The microdatabase abstraction is independent of technology choice and was designed based on solicitation and review of industry stakeholder concerns.
This paper studies the challenging problem of fingerprint image denoising and inpainting.
To tackle the challenge of suppressing complicated artifacts (blur, brightness, contrast, elastic transformation, occlusion, scratch, resolution, rotation, and so on) while preserving fine textures, we develop a multi-scale convolutional network, termed U-Finger.
Based on the domain expertise, we show that the usage of dilated convolutions as well as the removal of padding have important positive impacts on the final restoration performance, in addition to multi-scale cascaded feature modules.
Our model achieves the overall ranking of No.2 in the ECCV 2018 Chalearn LAP Inpainting Competition Track 3 (Fingerprint Denoising and Inpainting).
Among all participating teams, we obtain the MSE of 0.0231 (rank 2), PSNR 16.9688 dB (rank 2), and SSIM 0.8093 (rank 3) on the hold-out testing set.
Volatility is a quantity of measurement for the price movements of stocks or options which indicates the uncertainty within financial markets.
As an indicator of the level of risk or the degree of variation, volatility is important to analyse the financial market, and it is taken into consideration in various decision-making processes in financial activities.
On the other hand, recent advancement in deep learning techniques has shown strong capabilities in modelling sequential data, such as speech and natural language.
In this paper, we empirically study the applicability of the latest deep structures with respect to the volatility modelling problem, through which we aim to provide an empirical guidance for the theoretical analysis of the marriage between deep learning techniques and financial applications in the future.
We examine both the traditional approaches and the deep sequential models on the task of volatility prediction, including the most recent variants of convolutional and recurrent networks, such as the dilated architecture.
Accordingly, experiments with real-world stock price datasets are performed on a set of 1314 daily stock series for 2018 days of transaction.
The evaluation and comparison are based on the negative log likelihood (NLL) of real-world stock price time series.
The results show that the dilated neural models, including the dilated CNN and dilated RNN, produce the most accurate estimations and predictions, outperforming various widely-used deterministic models in the GARCH family and several recently proposed stochastic models.
In addition, the high flexibility and rich expressive power of these deep models are validated in this study.
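The NLL evaluation metric used above can be sketched for a per-step Gaussian predictive model; the function name and the Gaussian form are illustrative assumptions, not the paper's exact implementation:

```python
import math

def gaussian_nll(returns, mus, sigmas):
    """Average negative log-likelihood of observed returns under
    per-step Gaussian predictions (mean mu, volatility sigma)."""
    nll = 0.0
    for r, m, s in zip(returns, mus, sigmas):
        nll += 0.5 * math.log(2 * math.pi * s * s) + (r - m) ** 2 / (2 * s * s)
    return nll / len(returns)
```

A model that predicts volatility well assigns higher likelihood (lower NLL) to the realized return series.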
Tactile sensing is a key enabling technology to develop complex behaviours for robots interacting with humans or the environment.
This paper discusses computational aspects playing a significant role when extracting information about contact events.
Considering a large-scale, capacitance-based robot skin technology we developed in the past few years, we analyse the classical Boussinesq-Cerruti solution and Love's approach for solving a distributed inverse contact problem, both from a qualitative and a computational perspective.
Our contribution is the characterisation of the algorithms' performance using a freely available dataset and data originating from surfaces provided with robot skin.
Differentiating intrinsic language words from transliterable words is a key step aiding text processing tasks involving different natural languages.
We consider the problem of unsupervised separation of transliterable words from native words for text in Malayalam language.
Building on a key observation about the diversity of characters beyond the word stem, we develop an optimization method to score words based on their nativeness.
Our method relies on the usage of probability distributions over character n-grams that are refined in step with the nativeness scorings in an iterative optimization formulation.
Using an empirical evaluation, we illustrate that our method, DTIM, provides significant improvements in nativeness scoring for Malayalam, establishing DTIM as the preferred method for the task.
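A toy sketch of character n-gram nativeness scoring is shown below; it illustrates only the scoring idea, not DTIM's iterative refinement of the distributions, and all function names are hypothetical:

```python
import math
from collections import Counter

def char_ngrams(word, n=2):
    """Character n-grams of a word, with ^ and $ as boundary markers."""
    w = f"^{word}$"
    return [w[i:i + n] for i in range(len(w) - n + 1)]

def ngram_model(words, n=2):
    """Add-one-smoothed character n-gram probability function."""
    counts = Counter(g for w in words for g in char_ngrams(w, n))
    total = sum(counts.values())
    vocab = len(counts) + 1
    return lambda g: (counts[g] + 1) / (total + vocab)

def nativeness_score(word, p_native, p_translit, n=2):
    """Log-likelihood ratio of a word under native vs. transliterable models:
    positive scores favour nativeness, negative scores favour transliteration."""
    return sum(math.log(p_native(g)) - math.log(p_translit(g))
               for g in char_ngrams(word, n))
```

In the iterative formulation described above, the two n-gram distributions would be re-estimated from the current scorings until convergence.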
In recent years, Log-Structured Merge-trees (LSM-trees) have been widely adopted for use in the storage layer of modern NoSQL systems.
Because of this, there have been a large number of research efforts, from both the database community and the systems community, that try to improve various aspects of LSM-trees.
In this paper, we provide a survey of recent LSM efforts so that readers can learn the state of the art in LSM-based storage techniques.
We provide a general taxonomy to classify the literature of LSM improvements, survey the efforts in detail, and discuss their strengths and trade-offs.
We further survey several representative LSM-based open-source NoSQL systems and we discuss some potential future research directions resulting from the survey.
This comment recalls a previously proposed encoding scheme involving two synchronized random number generators (RNGs) to compress the transmission message.
It is also claimed that the recently proposed random number modulation (RNM) scheme suffers considerably from severe error propagation, and that, in general, the overall energy consumption is minimized when all information bits are transmitted as fast as possible with the minimum latency.
The aim of this article is to present an overview of the existing biomedical data warehouses and to discuss the issues and future trends in this area.
We illustrate this topic by presenting the design of an innovative, complex data warehouse for personal, anticipative medicine.
Ponzi schemes are financial frauds where, under the promise of high profits, users put their money, recovering their investment and interests only if enough users after them continue to invest money.
Originated in the offline world 150 years ago, Ponzi schemes have since migrated to the digital world, appearing first on the Web and more recently on cryptocurrencies like Bitcoin.
Smart contract platforms like Ethereum have provided a new opportunity for scammers, who now have the possibility of creating "trustworthy" frauds that still make users lose money, but at least are guaranteed to execute "correctly".
We present a comprehensive survey of Ponzi schemes on Ethereum, analysing their behaviour and their impact from various viewpoints.
Perhaps surprisingly, we identify a remarkably high number of Ponzi schemes, despite the fact that the hosting platform has been operating for less than two years.
The Jordan center of a graph is defined as a vertex whose maximum distance to other nodes in the graph is minimal, and it finds applications in facility location and source detection problems.
We study properties of the Jordan Center in the case of random growing trees.
In particular, we consider a regular tree graph on which an infection starts from a root node and then spreads along the edges of the graph according to various random spread models.
For the Independent Cascade (IC) model and the discrete Susceptible Infected (SI) model, both of which are discrete time models, we show that as the infected subgraph grows with time, the Jordan center persists on a single vertex after a finite number of timesteps.
Finally, we also study the continuous time version of the SI model and bound the maximum distance between the Jordan center and the root node at any time.
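The Jordan center can be computed directly from its definition with one BFS per vertex; this brute-force sketch (with hypothetical helper names) illustrates the defining property rather than the paper's analysis:

```python
from collections import deque

def jordan_center(adj):
    """Return the vertices minimizing eccentricity (max BFS distance)
    in an unweighted, connected graph given as an adjacency dict."""
    def eccentricity(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return max(dist.values())

    ecc = {v: eccentricity(v) for v in adj}
    best = min(ecc.values())
    return sorted(v for v in adj if ecc[v] == best)
```

On a path graph the center is the middle vertex, matching the intuition that the Jordan center tracks the "root" of a balanced spread.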
Let D be a set of n disks in the plane.
We present a data structure of size O(n) that can compute, for any query point q, the largest disk in D that contains q, in O(log n) time.
The structure can be constructed in O(n log^3 n) time.
The optimal storage and query time of the structure improve several recent solutions by Augustine et al. and by Kaplan and Sharir.
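For contrast with the O(log n)-query structure above, the naive O(n)-per-query baseline is straightforward; the tuple representation of a disk here is an assumption made for illustration:

```python
def largest_containing_disk(disks, q):
    """Brute-force baseline: scan all disks (cx, cy, r) and return the
    largest one containing the query point q, or None. O(n) per query."""
    qx, qy = q
    best = None
    for cx, cy, r in disks:
        if (qx - cx) ** 2 + (qy - cy) ** 2 <= r * r:
            if best is None or r > best[2]:
                best = (cx, cy, r)
    return best
```

The data structure described in the abstract answers the same query in O(log n) time with only O(n) space.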
In this paper, we study the interactions among interconnected autonomous microgrids, and propose a joint energy trading and scheduling strategy.
Each interconnected microgrid not only schedules its local power supply and demand, but also trades energy with other microgrids in a distribution network.
Specifically, microgrids with excessive renewable generations can trade with other microgrids in deficit of power supplies for mutual benefits.
Since interconnected microgrids operate autonomously, they aim to optimize their own performance and expect to gain benefits through energy trading.
We design an incentive mechanism using Nash bargaining theory to encourage proactive energy trading and fair benefit sharing.
We solve the bargaining problem by decomposing it into two sequential problems on social cost minimization and trading benefit sharing, respectively.
For practical implementation, we propose a decentralized solution method with minimum information exchange overhead.
Numerical studies based on realistic data demonstrate that the total cost of the interconnected-microgrids operation can be reduced by up to 13.2% through energy trading, and an individual participating microgrid can achieve up to 29.4% reduction in its cost through energy trading.
Cities across the United States are undergoing great transformation and urban growth.
Data and data analysis have become an essential element of urban planning as cities use data to plan land use and development.
One great challenge is to use the tools of data science to promote equity along with growth.
The city of Atlanta is an example site of large-scale urban renewal that aims to engage in development without displacement.
On the Westside of downtown Atlanta, the construction of the new Mercedes-Benz Stadium and the conversion of an underutilized rail-line into a multi-use trail may result in increased property values.
In response to community residents' concerns and a commitment to development without displacement, the city and philanthropic partners announced an Anti-Displacement Tax Fund to subsidize future property tax increases of owner occupants for the next twenty years.
To achieve greater transparency, accountability, and impact, residents expressed a desire for a tool that would help them determine eligibility and quantify this commitment.
In support of this goal, we use machine learning techniques to analyze historical tax assessment and predict future tax assessments.
We then apply eligibility estimates to our predictions to estimate the total cost for the first seven years of the program.
These forecasts are also incorporated into an interactive tool for community residents to determine their eligibility for the fund and the expected increase in their home value over the next seven years.
There is overwhelming evidence that human intelligence is a product of Darwinian evolution.
Investigating the consequences of self-modification, and more precisely, the consequences of utility function self-modification, leads to the stronger claim that not only human, but any form of intelligence is ultimately only possible within evolutionary processes.
Human-designed artificial intelligences can only remain stable until they discover how to manipulate their own utility function.
By definition, a human designer cannot prevent a superhuman intelligence from modifying itself, even if protection mechanisms against this action are put in place.
Without evolutionary pressure, sufficiently advanced artificial intelligences become inert by simplifying their own utility function.
Within evolutionary processes, the implicit utility function is always reducible to persistence, and the control of superhuman intelligences embedded in evolutionary processes is not possible.
Mechanisms against utility function self-modification are ultimately futile.
Instead, scientific effort toward the mitigation of existential risks from the development of superintelligences should be in two directions: understanding consciousness, and the complex dynamics of evolutionary systems.
One of the most important problems of data processing in high energy and nuclear physics is the event reconstruction.
Its main part is the track reconstruction procedure, which consists in finding, among a huge number of points (so-called hits, produced when flying particles fire detector coordinate planes), all the tracks that elementary particles leave as they pass through a detector.
Unfortunately, the tracking is seriously impeded by a well-known shortcoming of multiwire, strip, and GEM detectors: the appearance of many fake hits caused by spurious crossings of fired strips.
Since the number of these fakes is several orders of magnitude greater than the number of true hits, one faces the serious difficulty of unraveling possible track candidates from true hits while ignoring fakes.
We introduce a renewed method that significantly improves our previous two-stage approach based on hit preprocessing using a directed k-d tree search followed by a deep neural classifier.
We combine these two stages into one by applying a recurrent neural network that simultaneously determines whether a set of points belongs to a true track and predicts where to look for the next point of the track on the next coordinate plane of the detector.
We show that the proposed deep network is more accurate, faster, and does not require any special preprocessing stage.
Preliminary results of our approach for simulated events of the BM@N GEM detector are presented.
We study policy iteration for infinite-horizon Markov decision processes.
It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting.
We extend these lower bounds to Markov decision processes with the total reward and average-reward optimality criteria.
While current deep learning systems excel at tasks such as object classification, language processing, and gameplay, few can construct or modify a complex system such as a tower of blocks.
We hypothesize that what these systems lack is a "relational inductive bias": a capacity for reasoning about inter-object relations and making choices over a structured description of a scene.
To test this hypothesis, we focus on a task that involves gluing pairs of blocks together to stabilize a tower, and quantify how well humans perform.
We then introduce a deep reinforcement learning agent which uses object- and relation-centric scene and policy representations and apply it to the task.
Our results show that these structured representations allow the agent to outperform both humans and more naive approaches, suggesting that relational inductive bias is an important component in solving structured reasoning problems and for building more intelligent, flexible machines.
Many convolutional neural networks (CNNs) have a feed-forward structure.
In this paper, a linear program that estimates the Lipschitz bound of such CNNs is proposed.
Several CNNs, including the scattering networks, the AlexNet and the GoogleNet, are studied numerically and compared to the theoretical bounds.
Next, concentration inequalities of the output distribution to a stationary random input signal expressed in terms of the Lipschitz bound are established.
The Lipschitz bound is further used to establish a nonlinear discriminant analysis designed to measure the separation between features of different classes.
Diacritical marks play a crucial role in meeting the criteria of usability of typographic text, such as: homogeneity, clarity and legibility.
Changing the diacritic of a letter in a word can completely change its meaning.
The situation is very complicated with multilingual text.
Indeed, the problem of design becomes more difficult by the presence of diacritics that come from various scripts; they are used for different purposes, and are controlled by various typographic rules.
It is quite challenging to adapt rules from one script to another.
This paper aims to study the placement and sizing of diacritical marks in Arabic script, with a comparison to the Latin case.
The Arabic script is cursive and runs from right-to-left; its criteria and rules are quite distinct from those of the Latin script.
First, we compare the difficulty of processing diacritics in both scripts.
Then, we study the limits of Latin resolution strategies when applied to Arabic.
Finally, we propose an approach to resolve the problem of positioning and resizing diacritics.
This strategy includes creating an Arabic font, designed in OpenType format, along with suitable justification in TEX.
Most recent approaches use the sequence-to-sequence model for paraphrase generation.
The existing sequence-to-sequence model tends to memorize the words and the patterns in the training dataset instead of learning the meaning of the words.
Therefore, the generated sentences are often grammatically correct but semantically improper.
In this work, we introduce a novel model based on the encoder-decoder framework, called Word Embedding Attention Network (WEAN).
Our proposed model generates words by querying distributed word representations (i.e., neural word embeddings), hoping to capture the meaning of the corresponding words.
Following previous work, we evaluate our model on two paraphrase-oriented tasks, namely text simplification and short text abstractive summarization.
Experimental results show that our model outperforms the sequence-to-sequence baseline by BLEU scores of 6.3 and 5.5 on two English text simplification datasets, and by a ROUGE-2 F1 score of 5.7 on a Chinese summarization dataset.
Moreover, our model achieves state-of-the-art performances on these three benchmark datasets.
Cyber attacks and malware are now more prevalent than ever and the trend is ever upward.
There have been several approaches to attack detection including resident software applications at the root or user level, e.g., virus detection, and modifications to the OS, e.g., encryption, application signing, etc.
Some approaches have moved to lower-level detection and prevention, e.g., Data Execution Prevention.
An emerging approach in countermeasure development is the use of hardware performance counters existing in the micro-architecture of modern processors.
These are at the lowest level, implemented in processor hardware, and the wealth of data collected by these counters affords some very promising countermeasures with minimal overhead as well as protection from being sabotaged themselves by attackers.
Here, we conduct a survey of recent techniques in realizing effective countermeasures for cyber attack detection from these hardware performance counters.
Evacuation is one of the main disaster management solutions to reduce the impact of man-made and natural threats on building occupants.
To date, several modern technologies and gamification concepts, e.g. immersive virtual reality and serious games, have been used to enhance building evacuation preparedness and effectiveness.
Those tools have been used both to investigate human behavior during building emergencies and to train building occupants on how to cope with building evacuations.
Augmented Reality (AR) is a novel technology that can enhance this process by providing building occupants with virtual contents to improve their evacuation performance.
This work aims at reviewing existing AR applications developed for building evacuation.
This review identifies the disasters and types of buildings to which those tools have been applied.
Moreover, the application goals, hardware and evacuation stages affected by AR are also investigated in the review.
Finally, this review aims at identifying the challenges to face for further development of AR evacuation tools.
We formulate and solve the energy minimization problem for a clustered device-to-device (D2D) network with cache-enabled mobile devices.
Devices are distributed according to a Poisson cluster process (PCP) and are assumed to have a surplus memory which is exploited to proactively cache files from a library.
Devices can retrieve the requested files from their caches, from neighboring devices in their proximity (cluster), or from the base station as a last resort.
We minimize the energy consumption of the proposed network under a random probabilistic caching scheme, where files are independently cached according to a specific probability distribution.
A closed-form expression for the D2D coverage probability is obtained.
The energy consumption problem is then formulated as a function of the caching distribution, and the optimal probabilistic caching distribution is obtained.
Results reveal that the proposed caching distribution reduces energy consumption by up to 33% compared to a scheme that caches only the most popular files.
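The three-tier content delivery described above (local cache, cluster neighbour, base station) under independent probabilistic caching can be sketched as a Monte Carlo simulation; all names and the -per-request sampling form are illustrative assumptions:

```python
import random

def serve_request(popularity, caching_prob, n_neighbors, rng=random):
    """Simulate where one request is served under independent probabilistic
    caching: 'self' if cached locally, 'd2d' if some cluster neighbour
    caches the file, otherwise 'bs' (base station fallback)."""
    # Draw the requested file index from the popularity distribution.
    r, acc = rng.random(), 0.0
    f = len(popularity) - 1
    for i, p in enumerate(popularity):
        acc += p
        if r < acc:
            f = i
            break
    # Each device caches file f independently with probability caching_prob[f].
    if rng.random() < caching_prob[f]:
        return "self"
    if any(rng.random() < caching_prob[f] for _ in range(n_neighbors)):
        return "d2d"
    return "bs"
```

Averaging over many simulated requests estimates the coverage probability that the abstract derives in closed form.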
Modern large-scale computing deployments consist of complex applications running over machine clusters.
An important issue in such deployments is the offering of elasticity, i.e., the dynamic allocation of resources to applications to meet fluctuating workload demands.
Threshold based approaches are typically employed, yet they are difficult to configure and optimize.
Approaches based on reinforcement learning have been proposed, but they require a large number of states in order to model complex application behavior.
Methods that adaptively partition the state space have been proposed, but their partitioning criteria and strategies are sub-optimal.
In this work we present MDP_DT, a novel full-model based reinforcement learning algorithm for elastic resource management that employs adaptive state space partitioning.
We propose two novel statistical criteria and three strategies and we experimentally prove that they correctly decide both where and when to partition, outperforming existing approaches.
We experimentally evaluate MDP_DT in a real large-scale cluster over varying, previously unseen workloads and show that it takes more informed decisions compared to static and model-free approaches, while requiring a minimal amount of training data.
We introduce the Densely Segmented Supermarket (D2S) dataset, a novel benchmark for instance-aware semantic segmentation in an industrial domain.
It contains 21,000 high-resolution images with pixel-wise labels of all object instances.
The objects comprise groceries and everyday products from 60 categories.
The benchmark is designed such that it resembles the real-world setting of an automatic checkout, inventory, or warehouse system.
The training images only contain objects of a single class on a homogeneous background, while the validation and test sets are much more complex and diverse.
To further benchmark the robustness of instance segmentation methods, the scenes are acquired with different lightings, rotations, and backgrounds.
We ensure that there are no ambiguities in the labels and that every instance is labeled comprehensively.
The annotations are pixel-precise and allow using crops of single instances for artificial data augmentation.
The dataset covers several challenges highly relevant in the field, such as a limited amount of training data and a high diversity in the test and validation sets.
The evaluation of state-of-the-art object detection and instance segmentation methods on D2S reveals significant room for improvement.
Optimizing deep neural networks (DNNs) often suffers from the ill-conditioned problem.
We observe that the scaling-based weight-space symmetry property in rectified nonlinear networks causes this negative effect.
Therefore, we propose to constrain the incoming weights of each neuron to be unit-norm, which is formulated as an optimization problem over Oblique manifold.
A simple yet efficient method referred to as projection based weight normalization (PBWN) is also developed to solve this problem.
PBWN executes standard gradient updates, followed by projecting the updated weight back to Oblique manifold.
This proposed method has the property of regularization and collaborates well with the commonly used batch normalization technique.
We conduct comprehensive experiments on several widely-used image datasets including CIFAR-10, CIFAR-100, SVHN and ImageNet for supervised learning over the state-of-the-art convolutional neural networks, such as Inception, VGG and residual networks.
The results show that our method is able to improve the performance of DNNs with different architectures consistently.
We also apply our method to Ladder network for semi-supervised learning on permutation invariant MNIST dataset, and our method outperforms the state-of-the-art methods: we obtain test errors as 2.52%, 1.06%, and 0.91% with only 20, 50, and 100 labeled samples, respectively.
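The core PBWN update, a standard gradient step followed by projecting each neuron's incoming weight vector back to unit norm (the Oblique manifold), can be sketched as follows; the row-per-neuron weight layout is an assumption for illustration:

```python
import numpy as np

def pbwn_step(W, grad, lr=0.1, eps=1e-12):
    """One PBWN update: gradient descent on W, then projection of each
    row (a neuron's incoming weights) back onto the unit sphere."""
    W = W - lr * grad
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / np.maximum(norms, eps)  # eps guards against zero rows
```

Because the projection is a simple per-row rescaling, the method adds negligible cost to standard SGD and composes cleanly with batch normalization.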
In this paper, a new design of a wireless sensor network (WSN) node based on ultra-low-power components is discussed. We have developed a low-cost, low-power WSN node using the MSP430 and nRF24L01. The architectural circuit details are presented. This architecture fulfils requirements such as low cost, low power, compact size, and self-organization. Various tests are carried out to evaluate the performance of the nRF24L01 module. The packet loss, free-space loss (FSL), and battery lifetime calculations are described. These test results will help researchers build new applications using the above node and work efficiently with the nRF24L01.
Multiple Kernel Learning, or MKL, extends (kernelized) SVM by attempting to learn not only a classifier/regressor but also the best kernel for the training task, usually from a combination of existing kernel functions.
Most MKL methods seek the combined kernel that performs best over every training example, sacrificing performance in some areas to seek a global optimum.
Localized kernel learning (LKL) overcomes this limitation by allowing the training algorithm to match a component kernel to the examples that can exploit it best.
Several approaches to the localized kernel learning problem have been explored in the last several years.
We unify many of these approaches under one simple system and design a new algorithm with improved performance.
We also develop enhanced versions of existing algorithms, with an eye on scalability and performance.
The number of references per paper, perhaps the best single index of a journal's scholarliness, has been studied in different disciplines and periods.
In this paper we present a four decade study of eight engineering journals.
A data set of over 70000 references was generated after automatic data gathering and manual inspection for errors.
Results show a significant increase in the number of references per paper, the average rises from 8 in 1972 to 25 in 2013.
This growth presents an acceleration around the year 2000, consistent with a much easier access to search engines and documents produced by the generalization of the Internet.
Tissue texture is known to exhibit a heterogeneous or non-stationary nature; therefore, using a single-resolution approach for optimum classification might not suffice.
A clinical decision support system that exploits the subband textural fractal characteristics for best bases selection of meningioma brain histopathological image classification is proposed.
Each subband is analysed using its fractal dimension instead of energy, which has the advantage of being less sensitive to image intensity and abrupt changes in tissue texture.
The most significant subband that best identifies texture discontinuities will be chosen for further decomposition, and its fractal characteristics would represent the optimal feature vector for classification.
The performance was tested using the support vector machine (SVM), Bayesian and k-nearest neighbour (kNN) classifiers and a leave-one-patient-out method was employed for validation.
Our method outperformed the classical energy based selection approaches, achieving for SVM, Bayesian and kNN classifiers an overall classification accuracy of 94.12%, 92.50% and 79.70%, as compared to 86.31%, 83.19% and 51.63% for the co-occurrence matrix, and 76.01%, 73.50% and 50.69% for the energy texture signatures, respectively.
These results indicate the method's potential usefulness as a decision support system that could complement radiologists' diagnostic capability to discriminate higher-order statistical textural information that would otherwise be difficult to perceive via ordinary human vision.
We propose a novel part-based method for tracking an arbitrary object in challenging video sequences, focusing on robustly tracking under the effects of camera motion and object motion change.
Each of a group of tracked image patches on the target is represented by pairs of RGB pixel samples and counts of how many pixels in the patch are similar to them.
This empirically characterises the underlying colour distribution of the patches and allows for matching using the Bhattacharyya distance.
Candidate patch locations are generated by applying non-shearing affine transformations to the patches' previous locations, followed by local optimisation.
Experiments using the VOT2016 dataset show that our tracker out-performs all other part-based trackers in terms of robustness to camera motion and object motion change.
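The patch matching step relies on the Bhattacharyya distance between colour distributions; a minimal sketch over normalized histograms is given below (the paper's RGB pixel-sample representation is not reproduced here):

```python
import math

def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance between two discrete distributions,
    e.g. normalized colour histograms of two image patches.
    Returns 0 for identical distributions; grows as overlap shrinks."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))  # Bhattacharyya coefficient
    return -math.log(max(bc, eps))
```

Candidate patch locations whose colour distribution minimizes this distance to the stored model are preferred during tracking.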
In this paper, we present the design process of a novel solution for enabling the collaboration between OpenStack cloud systems in SAML federations with standalone attribute authorities, such as national research and education federations or eduGAIN.
The software solution that realizes the integration of systems serves as a case study to show how abstract desirable engineering properties fixed at the beginning of the design process can be implemented during the development phase.
An analysis of earlier generations of OpenStack-related developments trying to tackle the same problem is given.
Many aspects of this software integration can be generalized to serve as a template for federative cloud access.
Judgment aggregation is a general framework for collective decision making that can be used to model many different settings.
Due to its general nature, the worst case complexity of essentially all relevant problems in this framework is very high.
However, these intractability results are mainly due to the fact that the language to represent the aggregation domain is overly expressive.
We initiate an investigation of representation languages for judgment aggregation that strike a balance between (1) being limited enough to yield computational tractability results and (2) being expressive enough to model relevant applications.
In particular, we consider the languages of Krom formulas, (definite) Horn formulas, and Boolean circuits in decomposable negation normal form (DNNF).
We illustrate the use of the positive complexity results that we obtain for these languages with a concrete application: voting on how to spend a budget (i.e., participatory budgeting).
Wrist-wearables such as smartwatches and fitness bands are equipped with a variety of high-precision sensors that support novel contextual and activity-based applications.
The presence of a diverse set of on-board sensors, however, also exposes an additional attack surface which, if not adequately protected, could potentially be exploited to leak private user information.
In this paper, we investigate the feasibility of a new attack that takes advantage of a wrist-wearable's motion sensors to infer input on mechanical devices typically used to secure physical access, for example, combination locks.
We outline an inference framework that attempts to infer a lock's unlock combination from the wrist motion captured by a smartwatch's gyroscope sensor, and uses a probabilistic model to produce a ranked list of likely unlock combinations.
We conduct a thorough empirical evaluation of the proposed framework by employing unlocking-related motion data collected from human subject participants in a variety of controlled and realistic settings.
Evaluation results from these experiments demonstrate that motion data from wrist-wearables can be effectively employed as a side-channel to significantly reduce the unlock combination search-space of commonly found combination locks, thus compromising the physical security provided by these locks.
The relationship of scientific knowledge development to technological development is widely recognized as one of the most important and complex aspects of technological evolution.
This paper adds to our understanding of the relationship through use of a more rigorous structure for differentiating among technologies based upon technological domains (defined as consisting of the artifacts over time that fulfill a specific generic function using a specific body of technical knowledge).
The main contribution of this paper is a simple semi-supervised pipeline that only uses the original training set without collecting extra data.
The challenges are 1) how to obtain more training data from the training set alone and 2) how to use the newly generated data.
In this work, the generative adversarial network (GAN) is used to generate unlabeled samples.
We propose the label smoothing regularization for outliers (LSRO).
This method assigns a uniform label distribution to the unlabeled images, which regularizes the supervised model and improves the baseline.
We verify the proposed method on a practical problem: person re-identification (re-ID).
This task aims to retrieve a query person from other cameras.
We adopt the deep convolutional generative adversarial network (DCGAN) for sample generation, and a baseline convolutional neural network (CNN) for representation learning.
Experiments show that adding the GAN-generated data effectively improves the discriminative ability of learned CNN embeddings.
On three large-scale datasets, Market-1501, CUHK03 and DukeMTMC-reID, we obtain +4.37%, +1.6% and +2.46% improvement in rank-1 precision over the baseline CNN, respectively.
We additionally apply the proposed method to fine-grained bird recognition and achieve a +0.6% improvement over a strong baseline.
The code is available at https://github.com/layumi/Person-reID_GAN.
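The LSRO target construction described above can be sketched in a few lines: real images keep one-hot targets, while GAN-generated images receive a uniform label distribution inside the cross-entropy loss. This is an illustrative NumPy sketch with toy values, not the paper's implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def lsro_loss(logits, labels, is_generated):
    """Cross-entropy where real images use one-hot targets and
    GAN-generated images use a uniform label distribution (LSRO)."""
    n, k = logits.shape
    probs = softmax(logits)
    targets = np.zeros((n, k))
    for i in range(n):
        if is_generated[i]:
            targets[i] = 1.0 / k          # uniform distribution over all K classes
        else:
            targets[i, labels[i]] = 1.0   # standard one-hot target
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1))

# Toy batch: 2 real images, 1 generated image, 4 identity classes.
logits = np.array([[4.0, 0.0, 0.0, 0.0],
                   [0.0, 4.0, 0.0, 0.0],
                   [1.0, 1.0, 1.0, 1.0]])
loss = lsro_loss(logits, labels=[0, 1, -1], is_generated=[False, False, True])
```

The uniform target regularizes the classifier: a generated sample only attains minimal loss when the network is maximally uncertain about it.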
In this paper, we introduce a new uplink visible light indoor positioning system that estimates user positions on the network side of a visible light communications (VLC) system.
This technique exploits the diffuse components of the uplink channel impulse response for positioning, which have been treated as destructive noise in the existing visible light positioning literature.
Exploiting the line of sight (LOS) component, the most significant diffusive component of the channel (the second power peak (SPP)), and the delay time between LOS and SPP, we present a proof of concept analysis for positioning using fixed reference points, i.e. uplink photodetectors (PDs).
Simulation results show root mean square (RMS) positioning accuracies of 25 cm and 5 cm for the one-PD and four-PD scenarios, respectively.
Cyber-Physical Systems (CPS) are systems composed of a physical component that is controlled or monitored by a cyber-component, a computer-based algorithm.
Advances in CPS technologies and science are enabling capability, adaptability, scalability, resiliency, safety, security, and usability that will far exceed the simple embedded systems of today.
CPS technologies are transforming the way people interact with engineered systems.
New smart CPS are driving innovation in various sectors such as agriculture, energy, transportation, healthcare, and manufacturing.
They are driving the Fourth Industrial Revolution (Industry 4.0), whose benefits stem from the high flexibility of production.
The Industry 4.0 production paradigm is characterized by high intercommunicating properties of its production elements in all the manufacturing processes.
A core question is therefore how such systems should be structurally optimized to have an adequate level of redundancy and thus be satisfactorily resilient.
This goal can benefit from formal methods well known in various scientific domains such as artificial intelligence.
This work therefore proposes a CPS meta-model and its instantiation, which enumerates all kinds of relationships that may occur between the CPSs themselves and between their (cyber- and physical-) components.
Using the CPS meta-model formalization, with an adaptation of the Formal Concept Analysis (FCA) formal approach, this paper presents a way to optimize the modelling of CPS systems emphasizing their redundancy and their resiliency.
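Formal Concept Analysis, mentioned above, derives all formal concepts (maximal extent/intent pairs closed under the Galois connection) from a binary object-attribute context. A minimal brute-force sketch on a toy CPS-flavored context (the names are illustrative, not from the paper):

```python
from itertools import combinations

# A toy formal context: CPS components (objects) x capabilities (attributes).
objects = ["sensorA", "sensorB", "controller", "actuator"]
attrs = ["senses", "computes", "acts"]
incidence = {
    "sensorA": {"senses"},
    "sensorB": {"senses", "computes"},
    "controller": {"computes"},
    "actuator": {"acts"},
}

def extent(attr_set):
    """Objects that have every attribute in attr_set."""
    return frozenset(o for o in objects if attr_set <= incidence[o])

def intent(obj_set):
    """Attributes shared by every object in obj_set."""
    shared = set(attrs)
    for o in obj_set:
        shared &= incidence[o]
    return frozenset(shared)

def concepts():
    """All formal concepts (extent, intent), found by closing every attribute subset."""
    found = set()
    for r in range(len(attrs) + 1):
        for combo in combinations(attrs, r):
            e = extent(set(combo))
            found.add((e, intent(e)))
    return found

cs = concepts()
```

Components sharing the same intent are structurally redundant with respect to those capabilities, which is the kind of information the optimization exploits.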
Can health entities collaboratively train deep learning models without sharing sensitive raw data?
This paper proposes several configurations of a distributed deep learning method called SplitNN to facilitate such collaborations.
SplitNN does not share raw data or model details with collaborating institutions.
The proposed configurations of SplitNN cater to practical settings of i) entities holding different modalities of patient data, ii) centralized and local health entities collaborating on multiple tasks, and iii) learning without sharing labels.
We compare the performance and resource-efficiency trade-offs of SplitNN and other distributed deep learning methods, such as federated learning and large-batch synchronous stochastic gradient descent, and show highly encouraging results for SplitNN.
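The core data flow of split learning can be sketched as follows: the network is cut at an intermediate layer, the client computes up to the cut and transmits only those activations, and the server completes the forward pass. This is a minimal illustrative sketch (random weights, no training loop); a real SplitNN deployment also back-propagates gradients across the cut.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw patient features: these never leave the health entity.
x_raw = rng.normal(size=(8, 20))

# Client-side layers, up to the "cut layer".
W1 = rng.normal(size=(20, 16)) * 0.1

def client_forward(x):
    return np.maximum(x @ W1, 0.0)      # ReLU activations at the cut layer

# Only the cut-layer activations are transmitted, not x_raw or W1.
smashed = client_forward(x_raw)

# Server-side layers complete the forward pass.
W2 = rng.normal(size=(16, 2)) * 0.1

def server_forward(h):
    z = h @ W2
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

probs = server_forward(smashed)
```

Because the server only ever sees the 16-dimensional cut-layer activations, neither raw data nor the client's model details are shared.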
We provide a formula for the number of edges of the Hasse diagram of the independent subsets of the h-th power of a path ordered by inclusion.
For h=1 such a value is the number of edges of a Fibonacci cube.
We show that, in general, the number of edges of the diagram is obtained by convolution of a Fibonacci-like sequence with itself.
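The claim can be checked by brute force for small cases: count Hasse edges directly (each edge joins an independent set to an independent superset with one extra vertex) and compare with the known edge counts of Fibonacci cubes, which form the self-convolution of the Fibonacci sequence. The code below is an illustrative verification, not the paper's proof.

```python
from itertools import combinations

def hasse_edges(n, h=1):
    """Count edges of the Hasse diagram of independent subsets of P_n^h,
    ordered by inclusion (vertices 0..n-1; i, j adjacent iff |i - j| <= h)."""
    def independent(s):
        s = sorted(s)
        return all(s[k + 1] - s[k] > h for k in range(len(s) - 1))
    count = 0
    # A Hasse edge adds exactly one vertex to an independent set.
    for r in range(n + 1):
        for s in combinations(range(n), r):
            if independent(s):
                for v in range(n):
                    if v not in s and independent(s + (v,)):
                        count += 1
    return count

# For h = 1 these are the edge counts of Fibonacci cubes: 1, 2, 5, 10, 20, ...
edges = [hasse_edges(n) for n in range(1, 6)]
```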
The impact of soiling on solar panels is an important and well-studied problem in the renewable energy sector.
In this paper, we present the first convolutional neural network (CNN) based approach for solar panel soiling and defect analysis.
Our approach takes an RGB image of a solar panel and environmental factors as inputs to predict power loss, soiling localization, and soiling type.
In computer vision, localization is a complex task which typically requires manually labeled training data such as bounding boxes or segmentation masks.
Our proposed approach consists of four specialized stages that completely avoid localization ground truth and need only panel images with power loss labels for training.
The regions of impact obtained from the predicted localization masks are classified into soiling types using webly supervised learning.
For improving localization capabilities of CNNs, we introduce a novel bi-directional input-aware fusion (BiDIAF) block that reinforces the input at different levels of CNN to learn input-specific feature maps.
Our empirical study shows that BiDIAF improves the power loss prediction accuracy by about 3% and localization accuracy by about 4%.
Our end-to-end model yields further improvement of about 24% on localization when learned in a weakly supervised manner.
Our approach is generalizable and showed promising results on web crawled solar panel images.
Our system has a frame rate of 22 fps (including all steps) on a NVIDIA TitanX GPU.
Additionally, we collected a first-of-its-kind dataset for solar panel image analysis consisting of 45,000+ images.
We present the MAC network, a novel fully differentiable neural network architecture, designed to facilitate explicit and expressive reasoning.
MAC moves away from monolithic black-box neural architectures towards a design that encourages both transparency and versatility.
The model approaches problems by decomposing them into a series of attention-based reasoning steps, each performed by a novel recurrent Memory, Attention, and Composition (MAC) cell that maintains a separation between control and memory.
By stringing the cells together and imposing structural constraints that regulate their interaction, MAC effectively learns to perform iterative reasoning processes that are directly inferred from the data in an end-to-end approach.
We demonstrate the model's strength, robustness and interpretability on the challenging CLEVR dataset for visual reasoning, achieving a new state-of-the-art 98.9% accuracy, halving the error rate of the previous best model.
More importantly, we show that the model is computationally-efficient and data-efficient, in particular requiring 5x less data than existing models to achieve strong results.
Replacing a portion of current light duty vehicles (LDV) with plug-in hybrid electric vehicles (PHEVs) offers the possibility to reduce the dependence on petroleum fuels together with environmental and economic benefits.
The charging activity of PHEVs will certainly introduce new load to the power grid.
In the framework of the development of a smarter grid, the primary focus of the present study is to propose a model for the daily electrical demand in the presence of PHEV charging.
Expected PHEV demand is modeled by the PHEV charging time and the starting time of charge according to real-world data.
A normal distribution for starting time of charge is assumed.
Several distributions for charging time are considered: uniform distribution, Gaussian with positive support, Rician distribution and a non-uniform distribution coming from driving patterns in real-world data.
We generate daily demand profiles by using real-world residential profiles throughout 2014 in the presence of different expected PHEV demand models.
Support vector machines (SVMs), a set of supervised machine learning models, are employed in order to find the best model to fit the data.
SVMs with radial basis function (RBF) and polynomial kernels were tested.
Model performances are evaluated by means of mean squared error (MSE) and mean absolute percentage error (MAPE).
Best results are obtained with the RBF kernel: maximum (worst) values for MSE and MAPE were about 2.89 × 10^-8 and 0.023, respectively.
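The demand model above (a normal distribution for the start of charge combined with a charging-duration distribution) can be sketched by simulating an aggregate load profile. All parameter values below are illustrative stand-ins, not the paper's fitted values.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical parameters (illustrative only):
n_vehicles = 500
start_mu, start_sigma = 19.0, 2.0     # start-of-charge time ~ Normal (hours)
charge_mu, charge_sigma = 3.0, 1.0    # charging duration ~ truncated Normal (hours)
power_kw = 3.3                        # per-vehicle charging power

starts = rng.normal(start_mu, start_sigma, n_vehicles) % 24
durations = np.clip(rng.normal(charge_mu, charge_sigma, n_vehicles), 0.5, 8.0)

# Aggregate PHEV load on a 15-minute grid, wrapping past midnight.
grid = np.arange(0, 24, 0.25)
load = np.zeros_like(grid)
for s, d in zip(starts, durations):
    active = ((grid - s) % 24) < d    # slots during which this PHEV is charging
    load[active] += power_kw

peak_hour = grid[np.argmax(load)]
```

Profiles generated this way can then be added to residential base load before fitting the SVM regression models.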
This study investigates wireless information and energy transfer for dual-hop amplify-and-forward full-duplex relaying systems.
By formulating the energy efficiency (EE) maximization problem as a concave fractional program of the transmission power, three relay control schemes are separately designed to enable energy harvesting and full-duplex information relaying.
With the residual self-interference channel modeled as Rician fading, analytical expressions of outage probability and ergodic capacity are presented for the maximum relay, the signal-to-interference-plus-noise-ratio (SINR) relay, and the target relay.
It is shown that the EE maximization problem of the maximum relay is concave in the time-switching factor, so the bisection method can be applied to obtain the optimal value.
By incorporating instantaneous channel information, the SINR relay with collateral time switching factor achieves an improved EE over the maximum relay in delay-limited and delay-tolerant transmissions.
Without requiring channel information for the second-hop, the target relay ensures a competitive performance for outage probability, ergodic capacity, and EE.
Compared to direct source-destination transmission, numerical results show that the proposed relaying scheme is beneficial in achieving a comparable EE for low-rate delay-limited transmission.
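Because the objective is concave in the time-switching factor, bisection on the sign of the derivative locates the maximizer. The sketch below uses a generic concave stand-in for the EE expression (the paper's actual formula involves the harvested-energy and rate terms); the toy objective and all constants are assumptions for illustration.

```python
import math

def maximize_concave(f, lo=1e-3, hi=1.0 - 1e-3, tol=1e-6, eps=1e-7):
    """Bisection on the derivative sign to maximize a concave f on (lo, hi)."""
    def df(x):
        return (f(x + eps) - f(x - eps)) / (2 * eps)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if df(mid) > 0:
            lo = mid        # maximizer lies to the right
        else:
            hi = mid        # maximizer lies to the left
    return 0.5 * (lo + hi)

# Toy stand-in for the EE objective: rate grows with the information-relaying
# fraction while harvested energy shrinks, giving a concave trade-off in alpha.
ee = lambda a: a * math.log(1 + 10 * (1 - a))   # illustrative, not the paper's formula
alpha_star = maximize_concave(ee)
```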
Automated synthesis of reactive systems from specifications has been a topic of research for decades.
Recently, a variety of approaches have been proposed to extend synthesis of reactive systems from propositional specifications towards specifications over rich theories.
We propose a novel, completely automated approach to program synthesis which reduces the problem to deciding the validity of a set of forall-exists formulas.
In the spirit of IC3/PDR, our problem space is recursively refined by blocking out regions of unsafe states, aiming to discover a fixpoint that describes safe reactions.
If such a fixpoint is found, we construct a witness that is directly translated into an implementation.
We implemented the algorithm on top of the JKind model checker, and exercised it against contracts written using the Lustre specification language.
Experimental results show how the new algorithm outperforms JKind's existing synthesis procedure based on k-induction and addresses soundness issues in the k-inductive approach with respect to unrealizable results.
Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems.
However, these deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning.
There has been a significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction.
Building on a recently proposed method called Grad-CAM, we propose a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image, when compared to state-of-the-art.
We provide a mathematical derivation for the proposed method, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the corresponding class label.
Our extensive experiments and evaluations, both subjective and objective, on standard datasets showed that Grad-CAM++ provides promising human-interpretable visual explanations for a given CNN architecture across multiple tasks including classification, image caption generation and 3D action recognition; as well as in new settings such as knowledge distillation.
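The weighted combination described above can be sketched numerically. In this rough NumPy sketch, random arrays stand in for a real network's last-layer activations and class-score gradients, and the closed-form pixel-wise weights assume an exponential of the class score (as in the Grad-CAM++ derivation); it is an illustration of the construction, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins: K feature maps of size H x W from the last conv layer, and the
# gradient of the class score with respect to those activations.
K, H, W = 4, 7, 7
A = rng.uniform(0, 1, size=(K, H, W))
grads = rng.normal(size=(K, H, W))

# Grad-CAM++-style pixel-wise weights (higher-order derivatives reduce to
# powers of the first-order gradient under the exponential assumption).
g2, g3 = grads**2, grads**3
denom = 2.0 * g2 + A.sum(axis=(1, 2), keepdims=True) * g3
alpha = np.where(denom != 0, g2 / np.where(denom != 0, denom, 1.0), 0.0)

# Channel weights: alpha-weighted sum of the *positive* partial derivatives.
w = (alpha * np.maximum(grads, 0.0)).sum(axis=(1, 2))

# Saliency map: ReLU of the weighted combination of feature maps.
cam = np.maximum((w[:, None, None] * A).sum(axis=0), 0.0)
cam = cam / (cam.max() + 1e-12)       # normalize to [0, 1] for visualization
```

The resulting map is upsampled to the input resolution and overlaid on the image as the visual explanation.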
Low-rank modeling has a lot of important applications in machine learning, computer vision and social network analysis.
While the matrix rank is often approximated by the convex nuclear norm, the use of nonconvex low-rank regularizers has demonstrated better recovery performance.
However, the resultant optimization problem is much more challenging.
A recent state-of-the-art approach is based on the proximal gradient algorithm.
However, it requires an expensive full SVD in each proximal step.
In this paper, we show that for many commonly-used nonconvex low-rank regularizers, a cutoff can be derived to automatically threshold the singular values obtained from the proximal operator.
This allows the use of the power method to approximate the SVD efficiently.
Besides, the proximal operator can be reduced to that of a much smaller matrix projected onto this leading subspace.
Convergence, with a rate of O(1/T) where T is the number of iterations, can be guaranteed.
Extensive experiments are performed on matrix completion and robust principal component analysis.
The proposed method achieves significant speedup over the state-of-the-art.
Moreover, the matrix solution obtained is more accurate and has a lower rank than that of the traditional nuclear norm regularizer.
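The key computational idea above (approximate the leading subspace with the power method, then apply the shrinkage only to the leading singular values) can be sketched as follows. For concreteness this sketch uses nuclear-norm soft-thresholding as the shrinkage; the paper's nonconvex regularizers would substitute their own shrinkage and the derived cutoff.

```python
import numpy as np

rng = np.random.default_rng(0)

def power_method_topk(X, k, iters=50):
    """Approximate the top-k singular triplets of X by block power iteration."""
    m, n = X.shape
    Q = rng.normal(size=(n, k))
    for _ in range(iters):
        Q, _ = np.linalg.qr(X.T @ (X @ Q))
    B = X @ Q                      # m x k; the SVD of this small matrix is cheap
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return U, s, Vt @ Q.T

def prox_nuclear(X, lam, k):
    """Proximal step restricted to the leading subspace: soft-threshold the
    approximate top-k singular values (nonconvex regularizers swap `soft`)."""
    U, s, Vt = power_method_topk(X, k)
    soft = np.maximum(s - lam, 0.0)
    return (U * soft) @ Vt

X = rng.normal(size=(30, 20))
Y = prox_nuclear(X, lam=1.0, k=5)
```

Avoiding the full SVD in each proximal step is what yields the reported speedup.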
When using active learning, smaller batch sizes are typically more efficient from a learning efficiency perspective.
However, in practice due to speed and human annotator considerations, the use of larger batch sizes is necessary.
While past work has shown that larger batch sizes decrease learning efficiency from a learning curve perspective, it remains an open question how batch size impacts methods for stopping active learning.
We find that large batch sizes degrade the performance of a leading stopping method over and above the degradation that results from reduced learning efficiency.
We analyze this degradation and find that it can be mitigated by changing the window size parameter of how many past iterations of learning are taken into account when making the stopping decision.
We find that when using larger batch sizes, stopping methods are more effective when smaller window sizes are used.
There is an increasing demand for goal-oriented conversation systems which can assist users in various day-to-day activities such as booking tickets, restaurant reservations, shopping, etc.
Most of the existing datasets for building such conversation systems focus on monolingual conversations and there is hardly any work on multilingual and/or code-mixed conversations.
Such datasets and systems thus do not cater to the multilingual regions of the world, such as India, where it is very common for people to speak more than one language and seamlessly switch between them resulting in code-mixed conversations.
For example, a Hindi speaking user looking to book a restaurant would typically ask, "Kya tum is restaurant mein ek table book karne mein meri help karoge?"
("Can you help me in booking a table at this restaurant?").
To facilitate the development of such code-mixed conversation models, we build a goal-oriented dialog dataset containing code-mixed conversations.
Specifically, we take the text from the DSTC2 restaurant reservation dataset and create code-mixed versions of it in Hindi-English, Bengali-English, Gujarati-English and Tamil-English.
We also establish initial baselines on this dataset using existing state of the art models.
This dataset along with our baseline implementations is made publicly available for research purposes.
Islamophobic hate speech on social media inflicts considerable harm on both targeted individuals and wider society, and also risks reputational damage for the host platforms.
Accordingly, there is a pressing need for robust tools to detect and classify Islamophobic hate speech at scale.
Previous research has largely approached the detection of Islamophobic hate speech on social media as a binary task.
However, the varied nature of Islamophobia means that this is often inappropriate for both theoretically-informed social science and effectively monitoring social media.
Drawing on in-depth conceptual work, we build a multi-class classifier which distinguishes between non-Islamophobic, weak Islamophobic and strong Islamophobic content.
Accuracy is 77.6% and balanced accuracy is 83%.
We apply the classifier to a dataset of 109,488 tweets produced by far-right Twitter accounts during 2017.
Whilst most tweets are not Islamophobic, weak Islamophobia is considerably more prevalent (36,963 tweets) than strong (14,895 tweets).
Our main input feature is a GloVe word embeddings model trained on a newly collected corpus of 140 million tweets.
It outperforms a generic word embeddings model by 5.9 percentage points, demonstrating the importance of context.
Unexpectedly, we also find that a one-against-one multi-class SVM outperforms a deep learning algorithm.
The emergence of low-cost personal mobile devices and wearable cameras and the increasing storage capacity of video-sharing websites have spurred a growing interest in first-person videos.
Since most recorded videos comprise long-running streams of unedited content, they are tedious and unpleasant to watch.
State-of-the-art fast-forward methods face the challenge of balancing video smoothness against emphasis on the relevant frames for a given speed-up rate.
In this work, we present a methodology capable of summarizing and stabilizing egocentric videos by extracting the semantic information from the frames.
This paper also describes a dataset collection with several semantically labeled videos and introduces a new smoothness evaluation metric for egocentric videos that is used to test our method.
Graph algorithms used in many applications, including social networks, communication networks, VLSI design, graphics, and several others, require dynamic modifications of the graph -- addition and removal of vertices and/or edges.
This paper presents a novel concurrent non-blocking algorithm to implement a dynamic unbounded directed graph in a shared-memory machine.
The addition and removal operations of vertices and edges are lock-free.
For a finite sized graph, the lookup operations are wait-free.
The most significant component of the presented algorithm is the reachability query in a concurrent graph.
The reachability queries in our algorithm are obstruction-free and thus impose minimal additional synchronization cost over other operations.
We prove that each of the data structure operations is linearizable.
We extensively evaluate a sample C/C++ implementation of the algorithm through a number of micro-benchmarks.
The experimental results show that the proposed algorithm scales well with the number of threads and on average provides a 5-7x performance improvement over a concurrent graph implementation using coarse-grained locking.
We propose a conceptual model of software development that encompasses all approaches: traditional or agile, light and heavy, for large and small development efforts.
The model identifies both the common aspects in all software development, i.e., elements found in some form or another in each and every software development project (Intent, Product, People, Work, Time, Quality, Risk, Cost, Value), as well as the variable part, i.e., the main factors that cause the very wide variations we can find in the software development world (Size, Age, Criticality, Architecture stability, Business model, Governance, Rate of change, Geographic distribution).
We show how the model can be used as an explanatory theory of software development, as a tool for analysis of practices, techniques, processes, as the basis for curriculum design or for software process adoption and improvement, and to support empirical research on software development methods.
This model is also proposed as a way to depolarize the debate on agile methods versus the rest-of-the-world: a unified model.
This paper introduces analogical and deductive methodologies for the design of medical processor units (MPUs).
From the study of evolution of numerous earlier processors, we derive the basis for the architecture of MPUs.
These specialized processors perform unique medical functions encoded as medical operational codes (mopcs).
From a pragmatic perspective, MPUs function much like CPUs.
Both processors have unique operation codes that command the hardware to perform a distinct chain of subprocesses upon operands and generate a specific result unique to the opcode and the operand(s).
In medical environments, the MPU decodes the mopcs, executes a series of medical sub-processes, and sends out secondary commands to the medical machine.
Whereas operands in a typical computer system are numerical and logical entities, the operands in a medical machine are objects such as patients, blood samples, tissues, operating rooms, medical staff, medical bills, patient payments, etc.
We follow the functional overlap between the two processors and evolve the design of medical computer systems and networks.
One of the most attractive features of untyped languages is the flexibility in term creation and manipulation.
However, with such power comes the responsibility of ensuring the correctness of these operations.
A solution is adding run-time checks to the program via assertions, but this can introduce overheads that are in many cases impractical.
While static analysis can greatly reduce such overheads, the gains depend strongly on the quality of the information inferred.
Reusable libraries, i.e., library modules that are pre-compiled independently of the client, pose special challenges in this context.
We propose a technique that takes advantage of module systems which can hide a selected set of functor symbols, significantly enriching the shape information that can be inferred for reusable libraries, together with an improved run-time checking approach that leverages the proposed mechanisms to achieve large reductions in overhead, closer to those of static languages, even in the reusable-library context.
While the approach is general and system-independent, we present it for concreteness in the context of the Ciao assertion language and combined static/dynamic checking framework.
Our method maintains the full expressiveness of the assertion language in this context.
In contrast to other approaches it does not introduce the need to switch the language to a (static) type system, which is known to change the semantics in languages like Prolog.
We also study the approach experimentally and evaluate the overhead reduction achieved in the run-time checks.
The paper presents a parallel implementation of existing image fusion methods on a graphical cluster.
Parallel implementations of methods based on discrete wavelet transformation (the Haar and Daubechies discrete wavelet transforms) are developed.
Experiments were performed on a cluster using GPUs and CPUs, and performance gains were estimated for the use of the developed parallel implementations to process satellite images from the Landsat 7 satellite.
The implementation on a graphic cluster provides performance improvement from 2 to 18 times.
The quality of the considered methods was evaluated by ERGAS and QNR metrics.
The results show performance gains with retained quality for the GPU cluster compared to the results obtained by the authors and other researchers on a CPU and a single GPU.
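The wavelet-fusion scheme parallelized above can be sketched serially: decompose both images with a one-level 2-D Haar transform, merge the coefficients, and invert. The fusion rule below (average the approximation band, keep the larger-magnitude detail coefficients) is a common choice and an assumption here; the paper's exact rule may differ.

```python
import numpy as np

def haar2d(x):
    """One level of the 2-D Haar transform: returns (LL, LH, HL, HH)."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0
    d = (x[0::2, :] - x[1::2, :]) / 2.0
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    a = np.empty((ll.shape[0], 2 * ll.shape[1]))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((2 * a.shape[0], a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x

def fuse(img1, img2):
    """Wavelet fusion: average approximations, keep larger-magnitude details."""
    c1, c2 = haar2d(img1), haar2d(img2)
    ll = (c1[0] + c2[0]) / 2.0
    details = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(c1[1:], c2[1:])]
    return ihaar2d(ll, *details)
```

Each band's computation is independent per output pixel, which is what makes the method amenable to GPU parallelization.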
3D models provide a common ground for different representations of human bodies.
In turn, robust 2D estimation has proven to be a powerful tool to obtain 3D fits "in-the-wild".
However, depending on the level of detail, it can be hard or impossible to acquire labeled data for training 2D estimators at large scale.
We propose a hybrid approach to this problem: with an extended version of the recently introduced SMPLify method, we obtain high quality 3D body model fits for multiple human pose datasets.
Human annotators solely sort good and bad fits.
This procedure leads to an initial dataset, UP-3D, with rich annotations.
With a comprehensive set of experiments, we show how this data can be used to train discriminative models that produce results with an unprecedented level of detail: our models predict 31 segments and 91 landmark locations on the body.
Using the 91-landmark pose estimator, we present state-of-the-art results for 3D human pose and shape estimation using an order of magnitude less training data and without assumptions about gender or pose in the fitting procedure.
We show that UP-3D can be enhanced with these improved fits to grow in quantity and quality, which makes the system deployable on large scale.
The data, code and models are available for research purposes.
In this paper, we propose a novel sparse learning based feature selection method that directly optimizes the sparsity of a large-margin linear classification model with the l_{2,p}-norm (0 < p < 1), subject to data-fitting constraints, rather than using sparsity as a regularization term.
To solve the direct sparsity optimization problem that is non-smooth and non-convex when 0<p<1, we provide an efficient iterative algorithm with proved convergence by converting it to a convex and smooth optimization problem at every iteration step.
The proposed algorithm has been evaluated based on publicly available datasets, and extensive comparison experiments have demonstrated that our algorithm could achieve feature selection performance competitive to state-of-the-art algorithms.
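The l_{2,p} quantity used above (a quasi-norm for 0 < p < 1) couples the entries of each weight-matrix row, so row-sparsity directly selects features. A small illustrative sketch with a hypothetical weight matrix:

```python
import numpy as np

def l2p_norm(W, p):
    """The l_{2,p} (quasi-)norm: p-th root of the sum of p-th powers
    of the l2 norms of the rows of W."""
    row_norms = np.sqrt((W**2).sum(axis=1))
    return (row_norms**p).sum() ** (1.0 / p)

def select_features(W, k):
    """Rank features by the l2 norm of the corresponding row of W:
    zero rows mean the feature is discarded by the classifier."""
    row_norms = np.sqrt((W**2).sum(axis=1))
    return np.argsort(row_norms)[::-1][:k]

# Toy weight matrix: 5 features x 3 classes; rows 0 and 3 dominate.
W = np.array([[3.0, 0.0, 0.0],
              [0.1, 0.1, 0.0],
              [0.0, 0.2, 0.0],
              [0.0, 0.0, 4.0],
              [0.1, 0.0, 0.1]])
top2 = select_features(W, 2)
```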
We study the benefits of opportunistic routing in a large wireless ad hoc network by examining how the power, delay, and total throughput scale as the number of source-destination pairs increases up to the operating maximum.
Our opportunistic routing is novel in the sense that it is massively parallel, i.e., it is performed by many nodes simultaneously to maximize the opportunistic gain while controlling the inter-user interference.
The scaling behavior of conventional multi-hop transmission that does not employ opportunistic routing is also examined for comparison.
Our results indicate that our opportunistic routing can exhibit a net improvement in overall power--delay trade-off over the conventional routing by providing up to a logarithmic boost in the scaling law.
Such a gain is possible since the receivers can tolerate more interference due to the increased received signal power provided by the multi-user diversity gain, which means that having more simultaneous transmissions is possible.
Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017).
This paper presents an overview of our visual discovery engine powering these services, and shares the rationales behind our technical and product decisions such as the use of object detection and interactive user interfaces.
We conclude that this visual discovery engine significantly improves engagement in both search and recommendation tasks.
Being able to soundly estimate roundoff errors of finite-precision computations is important for many applications in embedded systems and scientific computing.
Due to the discrepancy between continuous reals and discrete finite-precision values, automated static analysis tools are highly valuable to estimate roundoff errors.
The results, however, are only as correct as the implementations of the static analysis tools.
This paper presents a formally verified and modular tool which fully automatically checks the correctness of finite-precision roundoff error bounds encoded in a certificate.
We present implementations of certificate generation and checking for both Coq and HOL4 and evaluate it on a number of examples from the literature.
The experiments use both in-logic evaluation of Coq and HOL4, and execution of extracted code outside of the logics: we benchmark Coq extracted unverified OCaml code and a CakeML-generated verified binary.
Online advertising is progressively moving towards a programmatic model in which ads are matched to actual interests of individuals collected as they browse the web.
Leaving the huge debate around privacy aside, a very important question in this area, for which little is known, is: How much do advertisers pay to reach an individual?
In this study, we develop a first of its kind methodology for computing exactly that -- the price paid for a web user by the ad ecosystem -- and we do that in real time.
Our approach is based on tapping on the Real Time Bidding (RTB) protocol to collect cleartext and encrypted prices for winning bids paid by advertisers in order to place targeted ads.
Our main technical contribution is a method for tallying winning bids even when they are encrypted.
We achieve this by training a model using as ground truth prices obtained by running our own "probe" ad-campaigns.
We design our methodology through a browser extension and a back-end server that provides it with fresh models for encrypted bids.
We validate our methodology using a one year long trace of 1600 mobile users and demonstrate that it can estimate a user's advertising worth with more than 82% accuracy.
In this paper, we propose an inertial forward backward splitting algorithm to compute a zero of the sum of two monotone operators, with one of the two operators being co-coercive.
The algorithm is inspired by the accelerated gradient method of Nesterov, but can be applied to a much larger class of problems including convex-concave saddle point problems and general monotone inclusions.
We prove convergence of the algorithm in a Hilbert space setting and show that several recently proposed first-order methods can be obtained as special cases of the general algorithm.
Numerical results show that the proposed algorithm converges faster than existing methods, while keeping the computational cost of each iteration basically unchanged.
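The inertial forward-backward scheme can be sketched on a concrete monotone-inclusion instance: lasso, where the gradient of the smooth term is co-coercive and the prox of the l1 term is soft-thresholding. The Nesterov/FISTA-style extrapolation sequence below is one standard choice of inertial parameters, used here for illustration; all problem data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def inertial_fb(grad, prox, L, x0, iters=200):
    """Inertial forward-backward splitting:
    y_k = x_k + beta_k (x_k - x_{k-1});  x_{k+1} = prox(y_k - grad(y_k)/L)."""
    x_prev = x = x0.copy()
    t = 1.0
    for _ in range(iters):
        t_next = 0.5 * (1 + np.sqrt(1 + 4 * t * t))
        beta = (t - 1) / t_next            # Nesterov-style inertial weight
        y = x + beta * (x - x_prev)
        x_prev, x = x, prox(y - grad(y) / L)
        t = t_next
    return x

# Example: lasso  min 0.5||Ax - b||^2 + lam ||x||_1.
A = rng.normal(size=(40, 60))
b = rng.normal(size=40)
lam = 0.5
L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
grad = lambda x: A.T @ (A @ x - b)
prox = lambda v: np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)
x_star = inertial_fb(grad, prox, L, np.zeros(60))
```

Per iteration the cost is one gradient and one prox evaluation, the same as the non-inertial method; only the cheap extrapolation step is added.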
Map construction in large-scale outdoor environments is important for robots to robustly fulfill their tasks.
Massive sessions of data must be merged to distinguish low-dynamic objects in the map, which might otherwise degrade the performance of localization and navigation algorithms.
In this paper we propose a method for multi-session map construction in large-scale outdoor environments using a 3D LiDAR.
To efficiently align the maps from different sessions, a laser-based loop closure detection method is integrated and the sequential information within the submaps is utilized for higher robustness.
Furthermore, a dynamic detection method is proposed to detect dynamics in the overlapping areas among sessions of maps.
We test the method in the real-world environment with a VLP-16 Velodyne LiDAR and the experimental results prove the validity and robustness of the proposed method.
We propose three private information retrieval (PIR) protocols for distributed storage systems (DSSs) where data is stored using an arbitrary linear code.
The first two protocols, named Protocol 1 and Protocol 2, achieve privacy for the scenario with noncolluding nodes.
Protocol 1 requires a file size that is exponential in the number of files in the system, while Protocol 2 requires a file size that is independent of the number of files and is hence simpler.
We prove that, for certain linear codes, Protocol 1 achieves the maximum distance separable (MDS) PIR capacity, i.e., the maximum PIR rate (the ratio of the amount of retrieved stored data per unit of downloaded data) for a DSS that uses an MDS code to store any given (finite and infinite) number of files, and Protocol 2 achieves the asymptotic MDS-PIR capacity (with infinitely large number of files in the DSS).
In particular, we provide a necessary and a sufficient condition for a code to achieve the MDS-PIR capacity with Protocols 1 and 2 and prove that cyclic codes, Reed-Muller (RM) codes, and a class of distance-optimal local reconstruction codes achieve both the finite MDS-PIR capacity (i.e., with any given number of files) and the asymptotic MDS-PIR capacity with Protocols 1 and 2, respectively.
Furthermore, we present a third protocol, Protocol 3, for the scenario with multiple colluding nodes, which can be seen as an improvement of a protocol recently introduced by Freij-Hollanti et al.
Similar to the noncolluding case, we provide a necessary and a sufficient condition to achieve the maximum possible PIR rate of Protocol 3.
Moreover, we provide a particular class of codes that is suitable for this protocol and show that RM codes achieve the maximum possible PIR rate for the protocol.
For all three protocols, we present an algorithm to optimize their PIR rates.
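As background for the PIR rate notion above, the classical two-server scheme for replicated (rather than coded) storage can be sketched in a few lines; it is not one of the paper's protocols, and the function names are illustrative:

```python
import secrets

def pir_query(num_files, want):
    """Client: build queries for two non-colluding replicated servers.

    Server 1 gets a uniformly random subset S of file indices; server 2
    gets S XOR {want}.  Each query alone is uniformly distributed, so
    neither server learns which file is wanted.
    """
    s1 = {i for i in range(num_files) if secrets.randbits(1)}
    s2 = s1 ^ {want}  # symmetric difference flips membership of `want`
    return s1, s2

def server_answer(files, subset):
    """Server: XOR together the requested files (bytes of equal length)."""
    ans = bytes(len(files[0]))
    for i in subset:
        ans = bytes(a ^ b for a, b in zip(ans, files[i]))
    return ans

def pir_retrieve(files, want):
    s1, s2 = pir_query(len(files), want)
    a1 = server_answer(files, s1)
    a2 = server_answer(files, s2)
    # XOR of the two answers cancels every file except the wanted one.
    return bytes(x ^ y for x, y in zip(a1, a2))
```

The client downloads two answers of one file size each to retrieve one file, i.e. a PIR rate of 1/2 in the sense defined above.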
We introduce a theory-driven mechanism for learning a neural network model that performs generative topology design in one shot given a problem setting, circumventing the conventional iterative process that computational design tasks usually entail.
The proposed mechanism can lead to machines that quickly respond to new design requirements based on knowledge accumulated through past experiences of design generation.
Achieving such a mechanism through supervised learning would require an impractically large amount of problem-solution pairs for training, due to the known limitation of deep neural networks in knowledge generalization.
To this end, we introduce an interaction between a student (the neural network) and a teacher (the optimality conditions underlying topology optimization): The student learns from existing data and is tested on unseen problems.
Deviation of the student's solutions from the optimality conditions is quantified, and used for choosing new data points to learn from.
We call this learning mechanism "theory-driven", as it explicitly uses domain-specific theories to guide the learning, thus distinguishing itself from purely data-driven supervised learning.
We show through a compliance minimization problem that the proposed learning mechanism leads to topology generation with near-optimal structural compliance, much improved from standard supervised learning under the same computational budget.
Automated story generation is the problem of automatically selecting a sequence of events, actions, or words that can be told as a story.
We seek to develop a system that can generate stories by learning everything it needs to know from textual story corpora.
To date, recurrent neural networks that learn language models at character, word, or sentence levels have had little success generating coherent stories.
We explore the question of event representations that provide a mid-level of abstraction between words and sentences in order to retain the semantic information of the original data while minimizing event sparsity.
We present a technique for preprocessing textual story data into event sequences.
We then present a technique for automated story generation whereby we decompose the problem into the generation of successive events (event2event) and the generation of natural language sentences from events (event2sentence).
We give empirical results comparing different event representations and their effects on event successor generation and the translation of events to natural language.
The aim of fine-grained recognition is to identify sub-ordinate categories in images like different species of birds.
Existing works have confirmed that, in order to capture the subtle differences across the categories, automatic localization of objects and parts is critical.
Most approaches for object and part localization relied on the bottom-up pipeline, where thousands of region proposals are generated and then filtered by pre-trained object/part models.
This is computationally expensive and not scalable once the number of objects/parts becomes large.
In this paper, we propose a nonparametric data-driven method for object and part localization.
Given an unlabeled test image, our approach transfers annotations from a few similar images retrieved in the training set.
In particular, we propose an iterative transfer strategy that gradually refines the predicted bounding boxes.
Based on the located objects and parts, deep convolutional features are extracted for recognition.
We evaluate our approach on the widely-used CUB200-2011 dataset and a new and large dataset called Birdsnap.
On both datasets, we achieve better results than many state-of-the-art approaches, including a few using oracle (manually annotated) bounding boxes in the test images.
Data augmentation is usually used by supervised learning approaches for offline writer identification, but such approaches require extra training data and potentially lead to overfitting errors.
In this study, a semi-supervised feature learning pipeline was proposed to improve the performance of writer identification by training with extra unlabeled data and the original labeled data simultaneously.
Specifically, we proposed a weighted label smoothing regularization (WLSR) method for data augmentation, which assigned the weighted uniform label distribution to the extra unlabeled data.
The WLSR method could regularize the convolutional neural network (CNN) baseline to allow more discriminative features to be learned to represent the properties of different writing styles.
The experimental results on well-known benchmark datasets (ICDAR2013 and CVL) showed that our proposed semi-supervised feature learning approach could significantly improve the baseline measurement and perform competitively with existing writer identification approaches.
Our findings provide new insights into offline writer identification.
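A minimal sketch of the WLSR target construction described above, assuming one plausible way the smoothing weight enters (the paper's exact weighting and CNN training loop are not reproduced):

```python
import numpy as np

def wlsr_targets(num_classes, label=None, weight=0.1):
    """Target distribution for one training sample.

    Labeled samples keep a one-hot target; extra unlabeled samples get a
    uniform distribution scaled by `weight` (a hypothetical smoothing
    weight), regularizing the network toward less confident predictions
    on data whose writer is unknown.
    """
    if label is not None:
        t = np.zeros(num_classes)
        t[label] = 1.0
        return t
    return np.full(num_classes, weight / num_classes)

def soft_cross_entropy(logits, target):
    """Cross-entropy loss against a (possibly unnormalized) soft target."""
    log_p = logits - np.log(np.sum(np.exp(logits)))
    return -np.dot(target, log_p)
```

Training would then mix labeled and unlabeled batches, with the weighted uniform targets acting as the regularizer.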
This research introduces a new constraint domain for reasoning about data with uncertainty.
It extends convex modeling with the notion of p-box to gain additional quantifiable information on the data whereabouts.
Unlike existing approaches, the p-box envelops an unknown probability instead of approximating its representation.
The p-box bounds are uniform cumulative distribution functions (cdf) in order to employ linear computations in the probabilistic domain.
Reasoning with p-box cdf-intervals is an interval computation performed on the real domain and then projected onto the cdf domain.
This operation conveys additional knowledge represented by the obtained probabilistic bounds.
Empirical evaluation shows that, with minimal overhead, the output solution set realizes a full enclosure of the data along with tighter bounds on its probabilistic distributions.
This paper discusses online algorithms for inverse dynamics modelling in robotics.
Several model classes including rigid body dynamics (RBD) models, data-driven models and semiparametric models (which are a combination of the previous two classes) are placed in a common framework.
While model classes used in the literature typically exploit joint velocities and accelerations, which need to be approximated resorting to numerical differentiation schemes, in this paper a new `derivative-free' framework is proposed that does not require this preprocessing step.
An extensive experimental study with real data from the right arm of the iCub robot is presented, comparing different model classes and estimation procedures, showing that the proposed `derivative-free' methods outperform existing methodologies.
The detection of overlapping communities is a challenging problem which has gained increasing interest in recent years because of the natural tendency of individuals, observed in real-world networks, to participate in multiple groups at the same time.
This review gives a description of the main proposals in the field.
Besides the methods designed for static networks, some new approaches that deal with the detection of overlapping communities in networks that change over time, are described.
Methods are classified with respect to the underlying principles guiding them to obtain a network division in groups sharing part of their nodes.
For each of them we also report, when available, computational complexity and web site address from which it is possible to download the software implementing the method.
In the presence of great social diversity in India, it is difficult to change the social background of students, parents and their economical conditions.
Therefore, the only option left is to provide uniform, standardized teaching and learning resources and methods.
For high quality education throughout India there must be some nation-wide network, which provides equal quality education to all students, including the student from the rural areas and villages.
The one and only simple solution to this is Web Based e-Learning.
In this paper, we present innovative ideas for spreading the Web Based e-Learning (WBeL) concept among young Indians, along with the various approaches taken or yet to be taken, including instructional design models, course development models, the role of technical writing, and the merits and demerits of WBeL to date.
We study the problem of building models that disentangle independent factors of variation.
Such models could be used to encode features that can efficiently be used for classification and to transfer attributes between different images in image synthesis.
As data we use a weakly labeled training set.
Our weak labels indicate what single factor has changed between two data samples, although the relative value of the change is unknown.
This labeling is of particular interest as it may be readily available without annotation costs.
To make use of weak labels we introduce an autoencoder model and train it through constraints on image pairs and triplets.
We formally prove that without additional knowledge there is no guarantee that two images with the same factor of variation will be mapped to the same feature.
We call this issue the reference ambiguity.
Moreover, we show the role of the feature dimensionality and adversarial training.
We demonstrate experimentally that the proposed model can successfully transfer attributes on several datasets, but show also cases when the reference ambiguity occurs.
Meta-learning is a powerful tool that builds on multi-task learning to learn how to quickly adapt a model to new tasks.
In the context of reinforcement learning, meta-learning algorithms can acquire reinforcement learning procedures to solve new problems more efficiently by meta-learning prior tasks.
The performance of meta-learning algorithms critically depends on the tasks available for meta-training: in the same way that supervised learning algorithms generalize best to test points drawn from the same distribution as the training points, meta-learning methods generalize best to tasks from the same distribution as the meta-training tasks.
In effect, meta-reinforcement learning offloads the design burden from algorithm design to task design.
If we can automate the process of task design as well, we can devise a meta-learning algorithm that is truly automated.
In this work, we take a step in this direction, proposing a family of unsupervised meta-learning algorithms for reinforcement learning.
We describe a general recipe for unsupervised meta-reinforcement learning, and describe an effective instantiation of this approach based on a recently proposed unsupervised exploration technique and model-agnostic meta-learning.
We also discuss practical and conceptual considerations for developing unsupervised meta-learning methods.
Our experimental results demonstrate that unsupervised meta-reinforcement learning effectively acquires accelerated reinforcement learning procedures without the need for manual task design, significantly exceeds the performance of learning from scratch, and even matches performance of meta-learning methods that use hand-specified task distributions.
This paper studies the distributed state estimation problem for a class of discrete-time stochastic systems with nonlinear uncertain dynamics over time-varying topologies of sensor networks.
An extended state vector consisting of the original state and the nonlinear dynamics is constructed.
By analyzing the extended system, we provide a design method for the filtering gain and fusion matrices, leading to the extended state distributed Kalman filter.
It is shown that the proposed filter provides an upper bound on the estimation covariance in real time, which means the estimation accuracy can be evaluated online. It is proven that the estimation covariance of the filter is bounded under rather mild assumptions, namely collective observability of the system and jointly strong connectedness of the network topologies.
Numerical simulation shows the effectiveness of the proposed filter.
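The local building block of such a filter is the standard Kalman predict/update step, sketched below; the extended-state construction and the fusion across time-varying neighbor sets are omitted, and all matrices are illustrative:

```python
import numpy as np

def kf_step(x, P, A, C, Q, R, y):
    """One predict/update step of a standard Kalman filter.

    P is propagated alongside x, which is why a bound on the estimation
    covariance is available online, as in the abstract above.
    """
    # Predict
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update with measurement y
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```

Repeated measurements of a constant scalar state drive the estimate toward the truth while the covariance shrinks.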
We propose a methodology for clustering financial time series of stocks' returns, and a graphical set-up to quantify and visualise the evolution of these clusters through time.
The proposed graphical representation allows for the application of well known algorithms for solving classical combinatorial graph problems, which can be interpreted as problems relevant to portfolio design and investment strategies.
We illustrate this graph representation of the evolution of clusters in time and its use on real data from the Madrid Stock Exchange market.
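One minimal way to turn stock returns into the kind of graph and cluster structure described above is to threshold the return-correlation matrix and take connected components; the threshold and clustering rule here are illustrative assumptions, not the paper's procedure:

```python
import numpy as np

def correlation_clusters(returns, threshold=0.5):
    """Cluster stocks (rows of `returns`) by thresholding the correlation
    matrix into a graph and labelling its connected components."""
    C = np.corrcoef(returns)                      # rows = stocks
    n = C.shape[0]
    adj = (C > threshold) & ~np.eye(n, dtype=bool)
    # Connected components via a simple flood fill.
    labels, cur = [-1] * n, 0
    for s in range(n):
        if labels[s] != -1:
            continue
        stack = [s]
        while stack:
            u = stack.pop()
            if labels[u] == -1:
                labels[u] = cur
                stack.extend(v for v in range(n) if adj[u, v])
        cur += 1
    return labels
```

Tracking these labels over successive time windows gives one simple way to visualize cluster evolution.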
Inserting an end of a rope through a loop is a common and important action that is required for creating most types of knots.
To perform this action, we need to pass the end of the rope through an area that is enclosed by another segment of rope.
As with all knotting actions, the robot must exercise control over a semi-compliant and flexible body whose complex 3D shape is difficult to perceive and follow.
Additionally, the target loop often deforms during the insertion.
We address this problem by defining a virtual magnetic field through the loop's interior and using the Biot-Savart law to guide the robotic manipulator that holds the end of the rope.
This approach directly defines, for any manipulator position, a motion vector that results in a path that passes through the loop.
The motion vector is directly derived from the position of the loop and changes as soon as it moves or deforms.
In simulation, we test the insertion action against dynamic loop deformation of different intensity.
We also combine insertion with grasp and release actions, coordinated by a hybrid control system, to tie knots in simulation and with a NAO robot.
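The guidance field above can be sketched as a discretized Biot-Savart sum; the physical constants are dropped and the loop discretization is an illustrative assumption:

```python
import numpy as np

def biot_savart_field(point, loop_pts):
    """Field vector at `point` from a closed loop given as an (n, 3)
    array of vertices.  Sums dl x r / |r|^3 over the loop's segments
    (midpoint rule); following this vector yields a path that passes
    through the loop, which is the guidance idea of the abstract."""
    B = np.zeros(3)
    n = len(loop_pts)
    for i in range(n):
        a, b = loop_pts[i], loop_pts[(i + 1) % n]
        dl = b - a
        r = point - (a + b) / 2.0
        B += np.cross(dl, r) / np.linalg.norm(r) ** 3
    return B
```

Normalizing the returned vector gives a motion direction that updates as soon as the loop moves or deforms. For a unit circle, the field at the center points along the loop axis with magnitude 2*pi (in these units).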
By executing jobs serially rather than in parallel, size-based scheduling policies can shorten the time needed to complete jobs; however, major obstacles to their applicability are the lack of fairness guarantees and the fact that job sizes are rarely known exactly a priori.
Here, we introduce the Pri family of size-based scheduling policies; Pri simulates any reference scheduler and executes jobs in the order of their simulated completion: we show that these schedulers give strong fairness guarantees, since no job completes later in Pri than in the reference policy.
In addition, we introduce PSBS, a practical implementation of such a scheduler: it works online (i.e., without needing knowledge of jobs submitted in the future), it has an efficient O(log n) implementation and it allows setting priorities to jobs.
Most importantly, unlike earlier size-based policies, the performance of PSBS degrades gracefully with errors, leading to performances that are close to optimal in a variety of realistic use cases.
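The core Pri idea, serial execution in simulated-completion order, can be sketched with processor sharing as the reference scheduler (an illustrative choice; PSBS itself is online, supports priorities, and tolerates size errors):

```python
def ps_completion_times(sizes):
    """Completion times under processor sharing for jobs arriving at t=0."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    done, t, prev = {}, 0.0, 0.0
    remaining = len(sizes)
    for i in order:
        t += (sizes[i] - prev) * remaining  # all remaining jobs share the CPU
        done[i] = t
        prev = sizes[i]
        remaining -= 1
    return done

def serial_schedule(sizes, reference=ps_completion_times):
    """Run jobs serially in the order they finish under the reference
    scheduler, the defining idea of the Pri family sketched above."""
    ref = reference(sizes)
    t, done = 0.0, {}
    for i in sorted(ref, key=ref.get):
        t += sizes[i]
        done[i] = t
    return done, ref
```

In this toy setting no job finishes later serially than under the reference scheduler, matching the fairness guarantee stated above.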
The introduction of lung cancer screening programs will produce an unprecedented amount of chest CT scans in the near future, which radiologists will have to read in order to decide on a patient follow-up strategy.
According to the current guidelines, the workup of screen-detected nodules strongly relies on nodule size and nodule type.
In this paper, we present a deep learning system based on multi-stream multi-scale convolutional networks, which automatically classifies all nodule types relevant for nodule workup.
The system processes raw CT data containing a nodule without the need for any additional information such as nodule segmentation or nodule size and learns a representation of 3D data by analyzing an arbitrary number of 2D views of a given nodule.
The deep learning system was trained with data from the Italian MILD screening trial and validated on an independent set of data from the Danish DLCST screening trial.
We analyze the advantage of processing nodules at multiple scales with a multi-stream convolutional network architecture, and we show that the proposed deep learning system achieves performance at classifying nodule type that surpasses the one of classical machine learning approaches and is within the inter-observer variability among four experienced human observers.
Most geometric approaches to monocular Visual Odometry (VO) provide robust pose estimates, but sparse or semi-dense depth estimates.
Recently, deep methods have shown good performance in generating dense depths and VO from monocular images by optimizing the photometric consistency between images.
Despite being intuitive, a naive photometric loss does not ensure proper pixel correspondences between two views, which is the key factor for accurate depth and relative pose estimations.
It is a well known fact that simply minimizing such an error is prone to failures.
We propose a method using Epipolar constraints to make the learning more geometrically sound.
We use the Essential matrix, obtained using Nister's Five Point Algorithm, for enforcing meaningful geometric constraints on the loss, rather than using it as labels for training.
Our method, though simple, is more geometrically meaningful and uses fewer parameters, yet gives performance comparable to state-of-the-art methods that use complex losses and large networks, showing the effectiveness of epipolar constraints.
Such a geometrically constrained learning method performs successfully even in cases where simply minimizing the photometric error would fail.
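The epipolar constraint underlying such a loss can be sketched as the algebraic residual x2^T E x1 with E = [t]_x R; the toy two-view setup below is illustrative, not the paper's network or loss:

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v = t x v."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def epipolar_residuals(E, pts1, pts2):
    """|x2^T E x1| for normalized homogeneous correspondences; a loss of
    this form penalizes geometrically inconsistent matches instead of
    relying on raw photometric error alone."""
    return np.abs(np.einsum('ni,ij,nj->n', pts2, E, pts1))
```

For true correspondences generated by a rigid motion (R, t), the residuals vanish up to floating-point error.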
The identification of authorship in disputed documents still requires human expertise, which is now unfeasible for many tasks owing to the large volumes of text and authors in practical applications.
In this study, we introduce a methodology based on the dynamics of word co-occurrence networks representing written texts to classify a corpus of 80 texts by 8 authors.
The texts were divided into sections with equal number of linguistic tokens, from which time series were created for 12 topological metrics.
The series were proven to be stationary (p-value > 0.05), which permits the use of distribution moments as learning attributes.
With an optimized supervised learning procedure using a Radial Basis Function Network, 68 out of 80 texts were correctly classified, i.e. a remarkable 85% author matching success rate.
Therefore, fluctuations in purely dynamic network metrics were found to characterize authorship, thus opening the way for the description of texts in terms of small evolving networks.
Moreover, the approach introduced allows for comparison of texts with diverse characteristics in a simple, fast fashion.
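A toy version of this pipeline, using only average degree in place of the paper's 12 topological metrics; the section length and co-occurrence window are illustrative parameters:

```python
import numpy as np

def cooccurrence_degree_series(tokens, section_len=100, window=2):
    """Average degree of the word co-occurrence network of each
    equal-sized section of a text, yielding one time series per text."""
    series = []
    for s in range(0, len(tokens) - section_len + 1, section_len):
        sec = tokens[s:s + section_len]
        edges = set()
        for i, w in enumerate(sec):
            for j in range(i + 1, min(i + window + 1, len(sec))):
                if w != sec[j]:
                    edges.add(frozenset((w, sec[j])))
        nodes = set(sec)
        series.append(2 * len(edges) / max(len(nodes), 1))
    return np.array(series)

def moment_features(series):
    """Mean, standard deviation, and third central moment, the kind of
    distribution moments usable as learning attributes."""
    mu = series.mean()
    return np.array([mu, series.std(), ((series - mu) ** 3).mean()])
```

Concatenating such moment vectors across several metrics gives the feature vector fed to the classifier.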
This article explores a relationship between inconsistency in the pairwise comparisons method and conditions of order preservation.
A pairwise comparisons matrix with elements from an alo-group is investigated.
This approach allows for a generalization of previous results.
Sufficient conditions for order preservation based on the properties of elements of pairwise comparisons matrix are derived.
A numerical example is presented.
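For the standard multiplicative (real-valued) case, which the alo-group setting generalizes, the order-preservation condition can be sketched as follows; the geometric-mean prioritization is one common choice, not necessarily the paper's:

```python
import math

def gm_weights(A):
    """Geometric-mean weights of a multiplicative pairwise comparisons
    matrix (rows as lists of positive reals)."""
    n = len(A)
    w = [math.prod(A[i]) ** (1.0 / n) for i in range(n)]
    s = sum(w)
    return [x / s for x in w]

def preserves_order(A, w):
    """Preservation of order: a_ij > 1 (i preferred to j) must imply
    w_i > w_j for every pair of alternatives."""
    n = len(A)
    for i in range(n):
        for j in range(n):
            if i != j and A[i][j] > 1 and w[i] <= w[j]:
                return False
    return True
```

A fully consistent matrix always preserves order, while sufficiently inconsistent matrices can violate it, which is the relationship the abstract investigates.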
While Monte Carlo Tree Search and closely related methods have dominated General Video Game Playing, recent research has demonstrated the promise of Rolling Horizon Evolutionary Algorithms as an interesting alternative.
However, there is little attention paid to population initialization techniques in the setting of general real-time video games.
Therefore, this paper proposes the use of population seeding to improve the performance of Rolling Horizon Evolution and presents the results of two methods, One Step Look Ahead and Monte Carlo Tree Search, tested on 20 games of the General Video Game AI corpus with multiple evolution parameter values (population size and individual length).
An in-depth analysis is carried out between the results of the seeding methods and the vanilla Rolling Horizon Evolution.
In addition, the paper presents a comparison to a Monte Carlo Tree Search algorithm.
The results are promising, with seeding able to boost performance significantly over baseline evolution and even match the high level of play obtained by the Monte Carlo Tree Search.
Prior efforts to create an autonomous computer system capable of predicting what a human being is thinking or feeling from facial expression data have been largely based on outdated, inaccurate models of how emotions work that rely on many scientifically questionable assumptions.
In our research, we are creating an empathetic system that incorporates the latest provable scientific understanding of emotions: that they are constructs of the human mind, rather than universal expressions of distinct internal states.
Thus, our system uses a user-dependent method of analysis and relies heavily on contextual information to make predictions about what subjects are experiencing.
Our system's accuracy and therefore usefulness are built on provable ground truths that prohibit the drawing of inaccurate conclusions that other systems could too easily make.
"Concentrated differential privacy" was recently introduced by Dwork and Rothblum as a relaxation of differential privacy, which permits sharper analyses of many privacy-preserving computations.
We present an alternative formulation of the concept of concentrated differential privacy in terms of the Renyi divergence between the distributions obtained by running an algorithm on neighboring inputs.
With this reformulation in hand, we prove sharper quantitative results, establish lower bounds, and raise a few new questions.
We also unify this approach with approximate differential privacy by giving an appropriate definition of "approximate concentrated differential privacy."
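For the Gaussian case the Renyi-divergence reformulation is concrete: between N(mu1, sigma^2) and N(mu2, sigma^2) one has D_alpha = alpha (mu1 - mu2)^2 / (2 sigma^2), which is why the Gaussian mechanism with sensitivity D satisfies (D^2 / (2 sigma^2))-concentrated DP in this framework. A sketch with a numerical cross-check (grid parameters are illustrative):

```python
import numpy as np

def renyi_gauss(alpha, mu1, mu2, sigma):
    """Closed form: D_alpha(N(mu1, s^2) || N(mu2, s^2)) = a (mu1-mu2)^2 / (2 s^2)."""
    return alpha * (mu1 - mu2) ** 2 / (2 * sigma ** 2)

def renyi_numeric(alpha, mu1, mu2, sigma):
    """Numerical check: D_alpha = log(int p^a q^(1-a) dx) / (a - 1)."""
    lo = min(mu1, mu2) - 12 * sigma
    hi = max(mu1, mu2) + 12 * sigma
    x, dx = np.linspace(lo, hi, 200001, retstep=True)
    p = np.exp(-(x - mu1) ** 2 / (2 * sigma ** 2))
    q = np.exp(-(x - mu2) ** 2 / (2 * sigma ** 2))
    p /= p.sum() * dx        # normalize the densities on the grid
    q /= q.sum() * dx
    return np.log((p ** alpha * q ** (1 - alpha)).sum() * dx) / (alpha - 1)
```

The linear growth in alpha is exactly the shape of the divergence bound that concentrated differential privacy requires.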
Public Good Software's products match journalistic articles and other narrative content to relevant charitable causes and nonprofit organizations so that readers can take action on the issues raised by the articles' publishers.
Previously an expensive and labor-intensive process, application of machine learning and other automated textual analyses now allow us to scale this matching process to the volume of content produced daily by multiple large national media outlets.
This paper describes the development of a layered system of tactics working across a general news model that minimizes the need for human curation while maintaining the particular focus of concern for each individual publication.
We present a number of general strategies for categorizing heterogeneous texts, and suggest editorial and operational tactics for publishers to make their publications and individual content items more efficiently analyzed by automated systems.
Interference alignment is a signaling technique that provides high multiplexing gain in the interference channel.
It can be extended to multi-hop interference channels, where relays aid transmission between sources and destinations.
In addition to coverage extension and capacity enhancement, relays increase the multiplexing gain in the interference channel.
In this paper, three cooperative algorithms are proposed for a multiple-antenna amplify-and-forward (AF) relay interference channel.
The algorithms design the transmitters and relays so that interference at the receivers can be aligned and canceled.
The first algorithm minimizes the sum power of enhanced noise from the relays and interference at the receivers.
The second and third algorithms rely on a connection between mean square error and mutual information to solve the end-to-end sum-rate maximization problem with either equality or inequality power constraints via matrix-weighted sum mean square error minimization.
The resulting iterative algorithms converge to stationary points of the corresponding optimization problems.
Simulations show that the proposed algorithms achieve higher end-to-end sum-rates and multiplexing gains than existing strategies for AF relays, decode-and-forward relays, and direct transmission.
The first algorithm outperforms the other algorithms at high signal-to-noise ratio (SNR) but performs worse than them at low SNR.
Thanks to power control, the third algorithm outperforms the second algorithm at the cost of overhead.
This paper is mainly a semi-tutorial introduction to elementary algebraic topology and its applications to Ising-type models of statistical physics, using graphical models of linear and group codes.
It contains new material on systematic (n,k) group codes and their information sets; normal realizations of homology and cohomology spaces; dual and hybrid models; and connections with system-theoretic concepts such as observability, controllability, and input/output realizations.
Sustainable and economical generation of electrical power is an essential and mandatory component of infrastructure in today's world.
Optimal generation (generator subset selection) of power requires careful evaluation of various factors such as source type, generation, transmission and storage capacities, and congestion, among others, which makes this a difficult task.
We created a grid to simulate various conditions, including stimuli such as generator supply, weather, and load demand, using Siemens PSS/E software; the resulting data is used to train and subsequently test deep learning models.
The results are highly encouraging.
As per our knowledge, this is the first paper to propose a working and scalable deep learning model for this problem.
In this paper we present a blind deconvolution scheme based on statistical wavelet estimation.
We assume no prior knowledge of the wavelet, and do not select a reflector from the signal.
Instead, the wavelet (ultrasound pulse) is statistically estimated from the signal itself by a kurtosis-based metric.
This wavelet is then used to deconvolve the RF (radiofrequency) signal through Wiener filtering, and the resultant zero phase trace is subjected to spectral broadening by Autoregressive Spectral Extrapolation (ASE).
These steps increase the time resolution of diffraction techniques.
Results on synthetic and real cases show the robustness of the proposed method.
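The Wiener filtering step can be sketched in the frequency domain; the wavelet, noise parameter, and trace below are illustrative, and the kurtosis-based wavelet estimation and ASE broadening steps are not reproduced:

```python
import numpy as np

def wiener_deconvolve(trace, wavelet, noise=1e-3):
    """Frequency-domain Wiener deconvolution of a 1-D RF trace.

    The filter conj(W) / (|W|^2 + noise) approximates the inverse of the
    wavelet spectrum W while damping frequencies where the wavelet has
    little energy, avoiding noise blow-up.
    """
    n = len(trace)
    T = np.fft.rfft(trace, n)
    W = np.fft.rfft(wavelet, n)
    F = np.conj(W) / (np.abs(W) ** 2 + noise)
    return np.fft.irfft(T * F, n)
```

Applied to a trace built from a sparse reflectivity convolved with a known short wavelet, the filter recovers the reflectivity spikes, which is the resolution gain the abstract refers to.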
This paper investigates the problem of efficient displacement of random sensors such that good communication within the network is provided, there is no interference between sensors, and the movement cost is minimized in expectation.
There have been extensive efforts in government, academia, and industry to anticipate, forecast, and mitigate cyber attacks.
A common approach is time-series forecasting of cyber attacks based on data from network telescopes, honeypots, and automated intrusion detection/prevention systems.
This research has uncovered key insights such as systematicity in cyber attacks.
Here, we propose an alternate perspective of this problem by performing forecasting of attacks that are analyst-detected and -verified occurrences of malware.
We call these instances of malware cyber event data.
Specifically, our dataset was analyst-detected incidents from a large operational Computer Security Service Provider (CSSP) for the U.S. Department of Defense, which rarely relies only on automated systems.
Our data set consists of weekly counts of cyber events over approximately seven years.
Since all cyber events were validated by analysts, our dataset is unlikely to have false positives which are often endemic in other sources of data.
Further, the higher-quality data could be used for a number of purposes, such as resource allocation, estimation of security resources, and the development of effective risk-management strategies.
We used a Bayesian State Space Model for forecasting and found that events one week ahead could be predicted.
To quantify bursts, we used a Markov model.
Our findings of systematicity in analyst-detected cyber attacks are consistent with previous work using other sources.
The advanced information provided by a forecast may help with threat awareness by providing a probable value and range for future cyber events one week ahead.
Other potential applications for cyber event forecasting include proactive allocation of resources and capabilities for cyber defense (e.g., analyst staffing and sensor configuration) in CSSPs.
Enhanced threat awareness may improve cybersecurity.
GitHub is the largest source code repository in the world.
It provides a git-based source code management platform and also many features inspired by social networks.
For example, GitHub users can show appreciation to projects by adding stars to them.
Therefore, the number of stars of a repository is a direct measure of its popularity.
In this paper, we use multiple linear regressions to predict the number of stars of GitHub repositories.
These predictions are useful both to repository owners and clients, who usually want to know how their projects are performing in a competitive open source development market.
In a large-scale analysis, we show that the proposed models start to provide accurate predictions after being trained with the number of stars received in the last six months.
Furthermore, specific models---generated using data from repositories that share the same growth trends---are recommended for repositories with slow growth and/or fewer stars.
Finally, we evaluate the ability to predict not the number of stars of a repository but its rank among the GitHub repositories.
We found a very strong correlation between predicted and real rankings (Spearman's rho greater than 0.95).
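A single-repository sketch of the regression idea, fitting a linear trend to the recent cumulative star history by ordinary least squares (the paper's models are multiple regressions trained across many repositories, so this is a simplified stand-in):

```python
import numpy as np

def fit_star_model(stars_history, horizon):
    """Fit cumulative stars as a linear function of the week index and
    predict the count `horizon` weeks after the last observation."""
    t = np.arange(len(stars_history), dtype=float)
    X = np.column_stack([np.ones_like(t), t])          # intercept + slope
    coef, *_ = np.linalg.lstsq(X, np.asarray(stars_history, float),
                               rcond=None)
    future_week = len(stars_history) + horizon - 1
    return coef, coef[0] + coef[1] * future_week
```

With roughly six months (26 weeks) of history, as the abstract suggests, the fitted trend extrapolates the future star count.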
In this work, we introduce pose interpreter networks for 6-DoF object pose estimation.
In contrast to other CNN-based approaches to pose estimation that require expensively annotated object pose data, our pose interpreter network is trained entirely on synthetic pose data.
We use object masks as an intermediate representation to bridge the real and synthetic domains.
We show that when combined with a segmentation model trained on RGB images, our synthetically trained pose interpreter network is able to generalize to real data.
Our end-to-end system for object pose estimation runs in real-time (20 Hz) on live RGB data, without using depth information or ICP refinement.
Due to recent technical and scientific advances, we have a wealth of information hidden in unstructured text data such as offline/online narratives, research articles, and clinical reports.
To mine these data properly, given their innate ambiguity, a Word Sense Disambiguation (WSD) algorithm can avoid numerous difficulties in the Natural Language Processing (NLP) pipeline.
However, considering a large number of ambiguous words in one language or technical domain, we may encounter limiting constraints for proper deployment of existing WSD models.
This paper attempts to address the problem of one-classifier-per-one-word WSD algorithms by proposing a single Bidirectional Long Short-Term Memory (BLSTM) network which by considering senses and context sequences works on all ambiguous words collectively.
Evaluated on SensEval-3 benchmark, we show the result of our model is comparable with top-performing WSD algorithms.
We also discuss how applying additional modifications alleviates the model fault and the need for more training data.
The 55th Design Automation Conference (DAC) held its first System Design Contest (SDC) in 2018.
SDC'18 features a low-power object detection challenge (LPODC) on designing and implementing novel algorithms for object detection in images taken from unmanned aerial vehicles (UAVs).
The dataset includes 95 categories and 150k images, and the hardware platforms include Nvidia's TX2 and Xilinx's PYNQ Z1.
DAC-SDC'18 attracted more than 110 entries from 12 countries.
This paper presents in detail the dataset and evaluation procedure.
It further discusses the methods developed by some of the entries as well as representative results.
The paper concludes with directions for future improvements.
We study the problem of distributed maximum computation in an open multi-agent system, where agents can leave and arrive during the execution of the algorithm.
The main challenge comes from the possibility that the agent holding the largest value leaves the system, which changes the value to be computed.
The algorithms must therefore be endowed with mechanisms that allow them to forget outdated information.
The focus is on systems in which interactions are pairwise gossips between randomly selected agents.
We consider situations where leaving agents can send a last message, and situations where they cannot.
For both cases, we provide algorithms able to eventually compute the maximum of the values held by agents.
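As an illustration of the forgetting mechanism described above, the following toy sketch implements pairwise-gossip maximum computation with age-based expiry (the names, parameters, and expiry rule are ours; this is not the paper's exact algorithm):

```python
import random

def open_gossip_max(own, leave_idx, leave_at, rounds=400, max_age=20, seed=7):
    """Toy pairwise-gossip maximum computation in an open system.
    Each agent keeps an [estimate, age] pair; information that its
    owner has not re-confirmed within `max_age` rounds is forgotten
    and re-seeded from the agent's own value, so the departure of
    the maximum's holder is eventually noticed."""
    rng = random.Random(seed)
    alive = set(range(len(own)))
    est = {k: [own[k], 0] for k in alive}
    for r in range(rounds):
        if r == leave_at:                      # an agent leaves silently
            alive.discard(leave_idx)
            del est[leave_idx]
        i, j = rng.sample(sorted(alive), 2)    # random pairwise gossip
        best = max(est[i], est[j], key=lambda e: (e[0], -e[1]))
        est[i], est[j] = list(best), list(best)
        for k in alive:
            est[k][1] += 1                     # all information ages
            if est[k][0] == own[k]:
                est[k][1] = 0                  # own value is always fresh
            elif est[k][1] > max_age:
                est[k] = [own[k], 0]           # forget the outdated maximum
    return {k: est[k][0] for k in alive}
```

Because a value's holder re-confirms it every round, live maxima stay fresh and keep propagating, while the maximum of a departed agent ages uniformly and is eventually discarded by every remaining agent.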
In the problem of matrix compressed sensing we aim to recover a low-rank matrix from few of its element-wise linear projections.
In this contribution we analyze the asymptotic performance of a Bayes-optimal inference procedure for a model where the matrix to be recovered is a product of random matrices.
The results that we obtain using the replica method describe the state evolution of the recently introduced P-BiG-AMP algorithm.
We show the existence of different types of phase transitions, their implications for the solvability of the problem, and we compare the results of the theoretical analysis to the performance reached by P-BiG-AMP.
Remarkably the asymptotic replica equations for matrix compressed sensing are the same as those for a related but formally different problem of matrix factorization.
The banking industry is vital to each country's economic cycle and provides a wide range of services.
With advances in technology and the rapidly increasing complexity of the business environment, banking has become more competitive than in the past, so efficiency analysis in the banking industry has attracted much attention in recent years.
From many aspects, such analyses at the branch level are more desirable.
Evaluating branch performance with the aim of eliminating deficiencies is therefore a crucial issue for branch managers.
This work can not only lead to a better understanding of bank branch performance but also provide further information to enhance managerial decisions and recognize problematic areas.
To achieve this purpose, this study presents an integrated approach based on Data Envelopment Analysis (DEA), Clustering algorithms and Polynomial Pattern Classifier for constructing a classifier to identify a class of bank branches.
First, the efficiency estimates of individual branches are evaluated by using the DEA approach.
Next, once the range and number of classes have been identified by experts, the number of clusters is determined by an agglomerative hierarchical clustering algorithm based on statistical methods.
We then divide the raw data into k clusters by means of self-organizing map (SOM) neural networks.
Finally, all clusters are fed into the reduced multivariate polynomial model to predict the classes of data.
The need for countering Advanced Persistent Threat (APT) attacks has led to solutions that ubiquitously monitor system activities in each host, and perform timely attack investigation over the monitoring data for analyzing attack provenance.
However, existing query systems based on relational databases and graph databases lack language constructs to express key properties of major attack behaviors, and often execute queries inefficiently since their semantics-agnostic design cannot exploit the properties of system monitoring data to speed up query execution.
To address this problem, we propose a novel query system built on top of existing monitoring tools and databases, which is designed with novel types of optimizations to support timely attack investigation.
Our system provides (1) a domain-specific data model and storage scheme for scalability, (2) a domain-specific query language, the Attack Investigation Query Language (AIQL), that integrates critical primitives for attack investigation, and (3) an optimized query engine that exploits the characteristics of the data and the semantics of the queries to efficiently schedule query execution.
We deployed our system in NEC Labs America comprising 150 hosts and evaluated it using 857 GB of real system monitoring data (containing 2.5 billion events).
Our evaluations on a real-world APT attack and a broad set of attack behaviors show that our system surpasses existing systems in both efficiency (124x over PostgreSQL, 157x over Neo4j, and 16x over Greenplum) and conciseness (SQL, Neo4j Cypher, and Splunk SPL contain at least 2.4x more constraints than AIQL).
Clock synchronization is a widely discussed topic in the engineering literature.
Ensuring that individual clocks are closely aligned is important in network systems, since the correct timing of various events in a network is usually necessary for proper system implementation.
However, many existing clock synchronization algorithms update clock values abruptly, resulting in discontinuous clocks which have been shown to lead to undesirable behavior.
In this paper, we propose using the pulse-coupled oscillator model to guarantee clock continuity, demonstrating two general methods for achieving continuous phase evolution in any pulse-coupled oscillator network.
We provide rigorous mathematical proof that the pulse-coupled oscillator algorithm is able to converge to the synchronized state when the phase continuity methods are applied.
We provide simulation results supporting these proofs.
We further investigate the convergence behavior of other pulse-coupled oscillator synchronization algorithms using the proposed methods.
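One generic way to obtain the continuous phase evolution discussed above is to absorb a pulse-induced phase correction through a temporary frequency boost rather than an instantaneous jump. The sketch below illustrates only this general idea, not the paper's specific methods:

```python
def smooth_phase_advance(phi0, delta, window=0.2, rate=1.0, dt=0.01):
    """Absorb a pulse-induced phase correction `delta` by raising the
    oscillator frequency over a `window` of time, so the phase evolves
    continuously instead of jumping."""
    boost = delta / window            # temporary extra frequency
    phi, trace = phi0, [phi0]
    for _ in range(int(round(window / dt))):
        phi += (rate + boost) * dt    # continuous, bounded-slope evolution
        trace.append(phi)
    return trace
```

At the end of the window the oscillator has the same phase it would have had after an abrupt jump of `delta`, but every intermediate step changes the phase by at most `(rate + delta/window) * dt`, so the clock remains continuous.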
A software project has "Hero Developers" when 80% of contributions are delivered by 20% of the developers.
Are such heroes a good idea?
Are too many heroes bad for software quality?
Is it better to have more or fewer heroes for different kinds of projects?
To answer these questions, we studied 661 projects from a public open source software (OSS) GitHub and 171 projects from an Enterprise GitHub.
We find that hero projects are very common.
In fact, as projects grow in size, nearly all projects become hero projects.
These findings motivated us to look more closely at the effects of heroes on software development.
Analysis shows that the frequency of closing issues and bugs is not significantly affected by the presence of heroes or by project type (Public or Enterprise).
Similarly, the time needed to resolve an issue/bug/enhancement is not affected by heroes or project type.
This is a surprising result since, before looking at the data, we expected that increasing the number of heroes on a project would slow down how fast that project reacts to change.
However, we do find a statistically significant association between heroes, project types, and enhancement resolution rates.
Heroes do not affect enhancement resolution rates in Public projects.
However, in Enterprise projects, more heroes increase the rate at which projects complete enhancements.
In summary, our empirical results call for a revision of a long-held truism in software engineering.
Software heroes are far more common and valuable than suggested by the literature, particularly for medium to large Enterprise developments.
Organizations should reflect on better ways to find and retain more of these heroes.
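The 80/20 hero criterion used throughout this study can be sketched as a small helper (the thresholds are the paper's 80%/20% figures; the function itself is our illustrative reading):

```python
def is_hero_project(commits_per_dev, dev_frac=0.2, contrib_frac=0.8):
    """True when the top `dev_frac` of developers account for at least
    `contrib_frac` of all contributions (the 80/20 hero definition)."""
    counts = sorted(commits_per_dev, reverse=True)
    top = max(1, int(round(dev_frac * len(counts))))
    return sum(counts[:top]) >= contrib_frac * sum(counts)
```

For example, a five-developer project with commit counts [500, 40, 30, 20, 10] is a hero project (one developer delivers over 80% of the work), while ten developers each contributing equally are not.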
In this paper, an online adaptation algorithm for bipedal walking on uneven surfaces with height uncertainty is proposed.
In order to generate walking patterns on flat terrains, the trajectories in the task space are planned to satisfy the dynamic balance and slippage avoidance constraints, and also to guarantee smooth landing of the swing foot.
To ensure smooth landing of the swing foot on surfaces with height uncertainty, the preplanned trajectories in the task space should be adapted.
The proposed adaptation algorithm consists of two stages.
In the first stage, once the swing foot reaches its maximum height, supervisory control is initiated until touch-down is detected.
After the detection, the trajectories in the task space are modified to guarantee smooth landing.
In the second stage, this modification is preserved during the Double Support Phase (DSP), and released in the next Single Support Phase (SSP).
Effectiveness of the proposed online adaptation algorithm is experimentally verified through realization of the walking patterns on the SURENA III humanoid robot, designed and fabricated at CAST.
The walking is tested on a surface with various flat obstacles, where the swing foot is prone to land on the ground either early or late.
This short paper provides a description of an architecture for the acquisition and use of knowledge by intelligent agents over a restricted domain of the Internet Infrastructure.
The proposed architecture is added to an intelligent agent deployment model over a very useful server for Internet Autonomous System administrators.
Such servers, being heavily dependent on arbitrary and occasional updates by human beings, become unreliable.
This is a position paper that proposes three research questions that are still in progress.
Virality of online content on social networking websites is an important but esoteric phenomenon often studied in fields like marketing, psychology and data mining.
In this paper we study viral images from a computer vision perspective.
We introduce three new image datasets from Reddit, and define a virality score using Reddit metadata.
We train classifiers with state-of-the-art image features to predict virality of individual images, relative virality in pairs of images, and the dominant topic of a viral image.
We also compare machine performance to human performance on these tasks.
We find that computers perform poorly with low level features, and high level information is critical for predicting virality.
We encode semantic information through relative attributes.
We identify the 5 key visual attributes that correlate with virality.
We create an attribute-based characterization of images that can predict relative virality with 68.10% accuracy (SVM+Deep Relative Attributes) -- better than humans at 60.12%.
Finally, we study how human prediction of image virality varies with different `contexts' in which the images are viewed, such as the influence of neighbouring images, images recently viewed, as well as the image title or caption.
This work is a first step in understanding the complex but important phenomenon of image virality.
Our datasets and annotations will be made publicly available.
This paper discusses models for dialogue state tracking using recurrent neural networks (RNN).
We present experiments on the standard dialogue state tracking (DST) dataset, DSTC2.
On the one hand, RNN models have become the state-of-the-art models in DST; on the other hand, most state-of-the-art models are only turn-based and require dataset-specific preprocessing (e.g., DSTC2-specific) to achieve such results.
We implemented two architectures which can be used in incremental settings and require almost no preprocessing.
We compare their performance to the benchmarks on DSTC2 and discuss their properties.
With only trivial preprocessing, the performance of our models is close to the state-of-the-art results.
As the saying goes, a single image is worth a thousand words.
As a result, image search has become a very popular mechanism for Web searchers.
In image search, the results produced by the search engine should be a set of images along with their Web page Uniform Resource Locators (URLs).
A Web searcher can perform two types of image search: Text-to-Image and Image-to-Image search.
In Text-to-Image search, the query is text.
Based on the input text, the system generates a set of images along with their Web page URLs as output.
In Image-to-Image search, on the other hand, the query is an image, and based on this image the system generates a set of images along with their Web page URLs as output.
In current practice, the Text-to-Image search mechanism does not always return perfect results.
It matches the text data and then displays the corresponding images as output, which are not always relevant.
To resolve this problem, Web researchers have introduced the Image-to-Image search mechanism.
In this paper, we propose an alternate approach to Image-to-Image search using histograms.
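A minimal sketch of histogram-based Image-to-Image search follows (toy grayscale pixel lists and hypothetical URLs; a real engine would index colour histograms of crawled Web images):

```python
def histogram(pixels, bins=8):
    """Normalized `bins`-bin grayscale histogram (a toy stand-in for
    the colour histograms an image search engine would index)."""
    h = [0] * bins
    for p in pixels:
        h[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in h]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def image_to_image_search(query_pixels, index, top_k=3):
    """`index` is a list of (url, pixels); returns the URLs of the
    `top_k` most similar images, most similar first."""
    q = histogram(query_pixels)
    ranked = sorted(index, key=lambda item: -similarity(q, histogram(item[1])))
    return [url for url, _ in ranked[:top_k]]
```

A mostly dark query image then ranks a dark indexed image above a half-dark one, and both above a bright one, regardless of any text attached to the pages.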
Several end-to-end deep learning approaches have been recently presented which extract either audio or visual features from the input images or audio signals and perform speech recognition.
However, research on end-to-end audiovisual models is very limited.
In this work, we present an end-to-end audiovisual model based on residual networks and Bidirectional Gated Recurrent Units (BGRUs).
To the best of our knowledge, this is the first audiovisual fusion model which simultaneously learns to extract features directly from the image pixels and audio waveforms and performs within-context word recognition on a large publicly available dataset (LRW).
The model consists of two streams, one for each modality, which extract features directly from mouth regions and raw waveforms.
The temporal dynamics in each stream/modality are modeled by a 2-layer BGRU and the fusion of multiple streams/modalities takes place via another 2-layer BGRU.
A slight improvement in the classification rate over end-to-end audio-only and MFCC-based models is reported in clean audio conditions and at low levels of noise.
In the presence of high levels of noise, the end-to-end audiovisual model significantly outperforms both audio-only models.
A practical limitation of deep neural networks is their high degree of specialization to a single task and visual domain.
Recently, inspired by the successes of transfer learning, several authors have proposed to learn instead universal, fixed feature extractors that, used as the first stage of any deep network, work well for several tasks and domains simultaneously.
Nevertheless, such universal features are still somewhat inferior to specialized networks.
To overcome this limitation, in this paper we propose to consider instead universal parametric families of neural networks, which still contain specialized problem-specific models, but differ only by a small number of parameters.
We study different designs for such parametrizations, including series and parallel residual adapters, joint adapter compression, and parameter allocations, and empirically identify the ones that yield the highest compression.
We show that, in order to maximize performance, it is necessary to adapt both shallow and deep layers of a deep network, but the required changes are very small.
We also show that these universal parametrizations are very effective for transfer learning, where they outperform traditional fine-tuning techniques.
This paper describes the deployment and implementation of a blockchain to improve security, knowledge, intelligence, and collaboration during inter-agent communication processes in restricted domains of the Internet Infrastructure.
It proposes the application of a platform-independent blockchain on a particular model of agents, which can also be used in similar proposals, since the results on the specific model were satisfactory.
The state-of-the-art performance of deep learning algorithms has led to a considerable increase in the utilization of machine learning in security-sensitive and critical applications.
However, it has recently been shown that a small and carefully crafted perturbation in the input space can completely fool a deep model.
In this study, we explore the extent to which face recognition systems are vulnerable to geometrically-perturbed adversarial faces.
We propose a fast landmark manipulation method for generating adversarial faces, which is approximately 200 times faster than the previous geometric attacks and obtains 99.86% success rate on the state-of-the-art face recognition models.
To further force the generated samples to be natural, we introduce a second attack constrained by the semantic structure of the face, which runs at half the speed of the first attack with a success rate of 99.96%.
Both attacks are extremely robust against the state-of-the-art defense methods, with success rates equal to or greater than 53.59%.
Code is available at https://github.com/alldbi/FLM.
The Multiple Instance Hybrid Estimator for discriminative target characterization from imprecisely labeled hyperspectral data is presented.
In many hyperspectral target detection problems, acquiring accurately labeled training data is difficult.
Furthermore, each pixel containing target is likely to be a mixture of both target and non-target signatures (i.e., sub-pixel targets), making extracting a pure prototype signature for the target class from the data extremely difficult.
The proposed approach addresses these problems by introducing a data mixing model and optimizing the response of the hybrid sub-pixel detector within a multiple instance learning framework.
The proposed approach iterates between estimating a set of discriminative target and non-target signatures and solving a sparse unmixing problem.
After learning target signatures, a signature based detector can then be applied on test data.
Both simulated and real hyperspectral target detection experiments show the proposed algorithm is effective at learning discriminative target signatures and achieves superior performance over state-of-the-art comparison algorithms.
For linear codes, undetected errors are the only type of errors remaining after hard decision and automatic-repeat-request (ARQ), yet their correction has received little attention.
In concatenated channel coding, suboptimal source coding, and joint source-channel coding, constraints among successive codewords may be utilized to improve decoding performance.
In this paper, list decoding is used to correct the undetected errors.
The proportion of corrected errors is noticeably improved, especially on Hamming codes and Reed-Muller codes, reaching about 40% in some cases.
This improvement, however, is significant only after the final codewords are selected from the lists based on the constraints among successive transmitted codewords.
The selection algorithm is investigated here to complete the list decoding procedure in the application of a Markov context model.
The performance of the algorithm is analysed and a lower bound of the correctly selected probability is derived to determine the proper context length.
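The selection of final codewords under a Markov context model can be sketched as a Viterbi search over the decoder's candidate lists (a generic formulation with a hypothetical bigram model `trans`; the paper's selection algorithm may differ in detail):

```python
import math

def select_codewords(lists, trans, init):
    """Pick one codeword per position from the list-decoder outputs so
    that the Markov context probability of the whole sequence is
    maximized (Viterbi over the candidate lists).  `init[c]` and
    `trans[a][b]` are prior and bigram probabilities of codewords."""
    # dp maps each candidate at the current position to (log-prob, path)
    dp = {c: (math.log(init.get(c, 1e-12)), [c]) for c in lists[0]}
    for cands in lists[1:]:
        nxt = {}
        for c in cands:
            score, path = max(
                (lp + math.log(trans.get(p, {}).get(c, 1e-12)), path)
                for p, (lp, path) in dp.items())
            nxt[c] = (score, path + [c])
        dp = nxt
    return max(dp.values())[1]
```

With a context model that strongly favours "A" following "A", the search picks the self-consistent sequence even when each position's list contains several surviving codewords.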
The idea of a two wheel self-balancing robot has become very popular among control system researchers worldwide over the last decade.
This paper presents one variant of the implementation of a self-balancing robot using the VEX Robotics Kit.
Convolutional neural networks (CNNs) are widely used in many image recognition tasks due to their extraordinary performance.
However, training a good CNN model can still be a challenging task.
In a training process, a CNN model typically learns a large number of parameters over time, which usually results in different performance.
Often, it is difficult to explore the relationships between the learned parameters and the model performance due to a large number of parameters and different random initializations.
In this paper, we present a visual analytics approach to compare two different snapshots of a trained CNN model taken after different numbers of epochs, so as to provide some insight into the design or the training of a better CNN model.
Our system compares snapshots by exploring the differences in operation parameters and the corresponding blob data at different levels.
A case study has been conducted to demonstrate the effectiveness of our system.
We present GeniePath, a scalable approach for learning adaptive receptive fields of neural networks defined on permutation-invariant graph data.
In GeniePath, we propose an adaptive path layer consisting of two complementary functions designed for breadth and depth exploration, respectively: the former learns the importance of different-sized neighborhoods, while the latter extracts and filters signals aggregated from neighbors different hops away.
Our method works in both transductive and inductive settings, and extensive experiments compared with competitive methods show that our approaches yield state-of-the-art results on large graphs.
Objectives: The article provides an overview of current trends in personal sensor, signal, and imaging informatics that are based on emerging mobile computing and communications technologies enclosed in a smartphone, enabling the provision of personal, pervasive health informatics services.
Methods: The article reviews examples of these trends from the PubMed and Google scholar literature search engines, which, by no means claim to be complete, as the field is evolving and some recent advances may not be documented yet.
Results: There exist critical technological advances in the surveyed smartphone technologies, employed in provision and improvement of diagnosis, acute and chronic treatment and rehabilitation health services, as well as in education and training of healthcare practitioners.
However, the most emerging trend relates to a routine application of these technologies in a prevention/wellness sector, helping its users in self-care to stay healthy.
Conclusions: Smartphone-based personal health informatics services exist, but still have a long way to go to become an everyday, personalized healthcare-provisioning tool in the medical field and in a clinical practice.
The main challenges to their widespread adoption involve a lack of user acceptance stemming from the variable credibility and reliability of applications and solutions, as they a) lack an evidence-based approach; b) have low levels of medical professional involvement in their design and content; c) are provided in an unreliable way, negatively influencing their usability; and, in some cases, d) are industry-driven, hence exposing bias in the information provided, for example towards particular types of treatment or intervention procedures.
Machine Learning on graph-structured data is an important and omnipresent task for a vast variety of applications including anomaly detection and dynamic network analysis.
In this paper, a deep generative model is introduced to capture continuous probability densities corresponding to the nodes of an arbitrary graph.
In contrast to existing learning formulations in the area of discriminative pattern recognition, we propose a scalable generative optimization algorithm theoretically proven to capture the distributions at the nodes of a graph.
Our model is able to generate samples from the probability densities learned at each node.
This probabilistic data generation model, i.e. convolutional graph auto-encoder (CGAE), is devised based on the localized first-order approximation of spectral graph convolutions, deep learning, and the variational Bayesian inference.
We apply our CGAE to a new problem, the spatio-temporal probabilistic solar irradiance prediction.
Multiple solar radiation measurement sites in a wide area in northern states of the US are modeled as an undirected graph.
Using our proposed model, the distribution of future irradiance given historical radiation observations is estimated for every site/node.
Numerical results on the National Solar Radiation Database show state-of-the-art performance for probabilistic radiation prediction on geographically distributed irradiance data in terms of reliability, sharpness, and continuous ranked probability score.
There is surprisingly little known about agenda setting for international development in the United Nations (UN) despite it having a significant influence on the process and outcomes of development efforts.
This paper addresses this shortcoming using a novel approach that applies natural language processing techniques to countries' annual statements in the UN General Debate.
Every year UN member states deliver statements during the General Debate on their governments' perspective on major issues in world politics.
These speeches provide invaluable information on state preferences on a wide range of issues, including international development, but have largely been overlooked in the study of global politics.
This paper identifies the main international development topics that states raised in these speeches between 1970 and 2016, and examines the country-specific drivers of international development rhetoric.
Automatic feature extraction using neural networks has accomplished remarkable success for images, but for sound recognition, these models are usually modified to fit the nature of the multi-dimensional temporal representation of the audio signal in spectrograms.
This may not efficiently harness the time-frequency representation of the signal.
The ConditionaL Neural Network (CLNN) takes into consideration the interrelation between the temporal frames, and the Masked ConditionaL Neural Network (MCLNN) extends upon the CLNN by forcing a systematic sparseness over the network's weights using a binary mask.
The masking allows the network to learn about frequency bands rather than bins, mimicking a filterbank used in signal transformations such as MFCC.
Additionally, the Mask is designed to consider various combinations of features, which automates the feature hand-crafting process.
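A simplified version of such a band-limiting binary mask can be generated as follows (the actual MCLNN mask design, including its feature-combination patterns, is more elaborate than this sketch):

```python
def band_mask(n_in, n_out, bandwidth, overlap):
    """Binary mask assigning each hidden unit a contiguous band of
    input (frequency) bins, with successive bands shifted by
    `bandwidth - overlap` bins and wrapping around the edges."""
    stride = bandwidth - overlap
    mask = [[0] * n_in for _ in range(n_out)]
    for j in range(n_out):
        start = (j * stride) % n_in
        for b in range(bandwidth):
            mask[j][(start + b) % n_in] = 1
    return mask
```

Applying such a mask element-wise to a weight matrix restricts each hidden unit to a band of adjacent frequency bins, which is what lets the network learn about bands rather than individual bins.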
We applied the MCLNN for the Environmental Sound Recognition problem using the Urbansound8k, YorNoise, ESC-10 and ESC-50 datasets.
The MCLNN has achieved competitive performance compared to state-of-the-art Convolutional Neural Networks and hand-crafted attempts.
A new stereoscopic image quality assessment database rendered using the 2D-image-plus-depth source, called MCL-3D, is described and the performance benchmarking of several known 2D and 3D image quality metrics using the MCL-3D database is presented in this work.
Nine image-plus-depth sources are first selected, and a depth image-based rendering (DIBR) technique is used to render stereoscopic image pairs.
Distortions applied to either the texture image or the depth image before stereoscopic image rendering include: Gaussian blur, additive white noise, down-sampling blur, JPEG and JPEG-2000 (JP2K) compression and transmission error.
Furthermore, the distortion caused by imperfect rendering is also examined.
The MCL-3D database contains 693 stereoscopic image pairs, where one third of them are of resolution 1024x728 and two thirds are of resolution 1920x1080.
The pair-wise comparison was adopted in the subjective test for user friendliness, and the Mean Opinion Score (MOS) can be computed accordingly.
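One standard way to turn pairwise-comparison counts into per-image quality scores is the Bradley-Terry minorization-maximization iteration sketched below (the MCL-3D study's exact MOS computation may differ):

```python
def bradley_terry(wins, n_items, iters=200):
    """Estimate latent quality scores from pairwise-comparison counts
    via the Bradley-Terry MM iteration; wins[i][j] counts how often
    item i was preferred over item j by the subjects."""
    p = [1.0] * n_items
    for _ in range(iters):
        new = []
        for i in range(n_items):
            num = sum(wins[i][j] for j in range(n_items) if j != i)
            den = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                      for j in range(n_items) if j != i)
            new.append(num / den if den else p[i])
        total = sum(new)
        p = [v * n_items / total for v in new]   # fix the arbitrary scale
    return p
```

The recovered scores respect the dominance pattern in the vote counts, so an item preferred in most of its comparisons ends up with the highest score.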
Finally, we evaluate the performance of several 2D and 3D image quality metrics applied to MCL-3D.
All texture images, depth images, rendered image pairs in MCL-3D and their MOS values obtained in the subjective test are available to the public (http://mcl.usc.edu/mcl-3d-database/) for future research and development.
With the ever-increasing demand for cloud computing services, planning and management of cloud resources has become a more and more important issue, which directly affects resource utilization, SLA compliance, and customer satisfaction.
But before any management strategy is devised, a good understanding of applications' workloads in virtualized environments is the foundation of resource management methods.
Unfortunately, little work has been focused on this area.
Lack of raw data could be one reason; another is that people still use the traditional models or methods developed for non-virtualized environments.
The study of applications' workloads in virtualized environments should account for their peculiar features compared to non-virtualized environments.
In this paper, we analyze the workload demands that reflect applications' behavior and the impact of virtualization.
The results are obtained from an experimental cloud testbed running web applications, specifically the RUBiS benchmark application.
We profile the workload dynamics on both virtualized and non-virtualized environments and compare the findings.
The experimental results are valuable for us to estimate the performance of applications on computer architectures, to predict SLA compliance or violation based on the projected application workload and to guide the decision making to support applications with the right hardware.
Functional Electrical Stimulation (FES) systems are successful in restoring motor function and supporting paralyzed users.
Commercially available FES products are open loop, meaning that the system is unable to adapt to changing conditions in the user and their muscles, which results in muscle fatigue and poor stimulation protocols.
This is because it is difficult to close the loop between stimulation and monitoring of muscle contraction using adaptive stimulation.
FES causes electrical artefacts which make it challenging to monitor muscle contractions with traditional methods such as electromyography (EMG).
We look to overcome this limitation by combining FES with novel mechanomyographic (MMG) sensors to monitor muscle activity during stimulation in real time.
To provide a meaningful task we built an FES cycling rig with a software interface that enabled us to perform adaptive recording and stimulation, and then combine this with sensors to record forces applied to the pedals using force sensitive resistors (FSRs), crank angle position using a magnetic incremental encoder and inputs from the user using switches and a potentiometer.
We illustrated this with a closed-loop stimulation algorithm that used the inputs from the sensors to control the output of a programmable RehaStim 1 FES stimulator (Hasomed) in real-time.
This recumbent bicycle rig was used as a testing platform for FES cycling.
The algorithm was designed to respond to a change in requested speed (RPM) from the user and change the stimulation power (% of maximum current mA) until this speed was achieved and then maintain it.
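A minimal sketch of such a closed-loop update is given below (a plain proportional control law with assumed gain and limits, not the algorithm deployed on the rig):

```python
def control_step(power, rpm, target_rpm, gain=0.5, p_min=0.0, p_max=100.0):
    """One proportional-control update: raise the stimulation power
    (% of maximum current) when the measured cadence is below the
    requested one, lower it when above, clamped to a safe range."""
    error = target_rpm - rpm
    return min(p_max, max(p_min, power + gain * error))
```

Iterating this update against a toy linear leg response (cadence proportional to stimulation power) drives the measured cadence to the requested RPM and then holds it there.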
We describe a method to produce a network where current methods such as DeepFool have great difficulty producing adversarial samples.
Our construction suggests some insights into how deep networks work.
We provide a reasoned analysis of why our construction is difficult to defeat, and show experimentally that our method is hard to defeat with both Type I and Type II attacks using several standard networks and datasets.
This SafetyNet architecture is applied to an important and novel application, SceneProof, which can reliably detect whether an image is a picture of a real scene or not.
SceneProof applies to images captured with depth maps (RGBD images) and checks if a pair of image and depth map is consistent.
It relies on the relative difficulty of producing naturalistic depth maps for images in post processing.
We demonstrate that our SafetyNet is robust to adversarial examples built from currently known attacking approaches.
Size-Change Termination is an increasingly popular technique for verifying program termination.
These termination proofs are deduced from an abstract representation of the program in the form of "size-change graphs".
We present algorithms that, for certain classes of size-change graphs, deduce a global ranking function: an expression that ranks program states, and decreases on every transition.
A ranking function serves as a witness for a termination proof, and is therefore interesting for program certification.
The particular form of the ranking expressions that represent SCT termination proofs sheds light on the scope of the proof method.
The complexity of the expressions is also interesting, both practically and theoretically.
While deducing ranking functions from size-change graphs has already been shown possible, the constructions in this paper are simpler and more transparent than previously known.
They improve the upper bound on the size of the ranking expression from triply exponential down to singly exponential (for certain classes of instances).
We claim that this result is, in some sense, optimal.
To this end, we introduce a framework for lower bounds on the complexity of ranking expressions and prove exponential lower bounds.
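To make the notion of a global ranking function concrete, here is a hypothetical two-variable loop (our own toy example, not one from the paper) whose two transitions match the size-change pattern "x decreases, y resets" and "x stays, y decreases"; the expression x*B + y decreases on every transition and bounds the number of iterations:

```python
import random

B = 100  # assumed upper bound on y in this toy program

def rank(x, y):
    # Global ranking function: the lexicographic pair (x, y)
    # flattened into a single number, valid because 0 <= y < B.
    return x * B + y

rng = random.Random(42)
x, y = 5, 7
prev, steps = rank(x, y), 0
while x > 0:
    if y > 0 and rng.random() < 0.5:
        y -= 1                          # transition t2: x unchanged, y decreases
    else:
        x, y = x - 1, rng.randrange(B)  # transition t1: x decreases, y reset
    assert rank(x, y) < prev            # the ranking decreases on every step
    prev, steps = rank(x, y), steps + 1
assert steps <= 5 * B + 7               # run length bounded by the initial rank
```

Flattening a lexicographic pair into one expression is the kind of ranking expression whose size the exponential bounds above concern.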
We propose a novel convolutional neural network architecture for estimating geospatial functions such as population density, land cover, or land use.
In our approach, we combine overhead and ground-level images in an end-to-end trainable neural network, which uses kernel regression and density estimation to convert features extracted from the ground-level images into a dense feature map.
The output of this network is a dense estimate of the geospatial function in the form of a pixel-level labeling of the overhead image.
To evaluate our approach, we created a large dataset of overhead and ground-level images from a major urban area with three sets of labels: land use, building function, and building age.
We find that our approach is more accurate for all tasks, in some cases dramatically so.
There is a recent interest in developing statistical filtering methods for stochastic optimization (FSO) by leveraging a probabilistic perspective of incremental proximity methods (IPMs).
The existing FSO methods are derived based on the Kalman filter (KF) and the extended KF (EKF).
Unlike classical stochastic optimization methods such as stochastic gradient descent (SGD) and typical IPMs, such KF-type algorithms possess a desirable property: they do not require pre-scheduling of the learning rate for convergence.
However, on the other side, they have inherent limitations inherited from the nature of KF mechanisms.
It is a consensus that the class of particle filters (PFs) outperforms the KF and its variants remarkably for nonlinear and/or non-Gaussian statistical filtering tasks.
Hence, it is natural to ask if the FSO methods can benefit from PF theory to get around the limitations of the KF-type IPMs.
We provide an affirmative answer to the aforementioned question by developing three PF based stochastic optimization (PFSO) algorithms.
For performance evaluation, we apply them to solve a least-square fitting problem using a simulated data set, and the empirical risk minimization (ERM) problem in binary classification using real data sets.
Experimental results demonstrate that our algorithms remarkably outperform existing methods in terms of numerical stability, convergence speed, and flexibility in handling different types of loss functions.
Relative worst-order analysis is a technique for assessing the relative quality of online algorithms.
We survey the most important results obtained with this technique and compare it with other quality measures.
Early software effort estimation is a hallmark of successful software project management.
Building a reliable effort estimation model usually requires historical data.
Unfortunately, since the information available at early stages of software development is scarce, it is recommended to use software size metrics as the key cost factor of effort estimation.
Use Case Points (UCP) is a prominent size measure designed mainly for object-oriented projects.
Nevertheless, there are no established models that can translate UCP into its corresponding effort; therefore, most models use productivity as a second cost driver.
The productivity in those models is usually guessed by experts and does not depend on historical data, which makes it subject to uncertainty.
Thus, these models have not been well examined against large amounts of historical data.
In this paper, we designed a hybrid model that consists of classification and prediction stages using a support vector machine and radial basis neural networks.
The proposed model was constructed over a large number of observations collected from industrial and student projects.
The proposed model was compared against previous UCP prediction models.
The validation and empirical results demonstrated that the proposed model significantly surpasses these models on all datasets.
The main conclusion is that the environmental factors of UCP can be used to classify and estimate productivity.
This research explores and examines factors for supplier evaluation and their impact on process improvement, particularly focusing on a steel pipe manufacturing firm in Gujarat, India.
Data were collected using in-depth interviews.
The questionnaire primarily involves the perception of supplier evaluation.
Factors influencing supplier evaluation, and their influence on process improvement, are also examined in this study.
Model testing and validation were done using the partial least squares method.
The outcomes signified that the factors influencing supplier evaluation are quality, cost, delivery, and supplier relationship management.
The study showed, however, that the quality and cost factors for supplier evaluation are insignificant.
Delivery and supplier relationship management have a significant influence on supplier evaluation.
The research also depicted that supplier evaluation has a significant influence on process improvement.
Many researchers have considered quality, cost and delivery as the factors for evaluating the suppliers.
But for a company, it is essential to have a good relationship with the supplier.
Hence, supplier relationship management is also considered as a factor in this study.
Also, the case study company focused more on quality and cost factors for the supplier evaluation of the firm.
However, delivery and supplier relationship management are also equally important for a firm in evaluating the supplier.
In order to avoid unnecessary applications of the Miller-Rabin algorithm to the number in question, we resort to trial division by a few initial prime numbers, since such divisions take less time.
How far we should go with such divisions is the question we try to answer in this paper.
In theory, the matter is fully resolved; in practice, however, the theoretical answer is of little use.
Therefore, we present a solution that is probably irrelevant to theorists, but very useful to people who have spent many nights producing large (probably) prime numbers using their own software.
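The strategy above can be sketched as follows; the cut-off list of small primes here is an arbitrary illustrative choice, not the cut-off the paper derives:

```python
import random

SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]  # illustrative cut-off

def miller_rabin(n, rounds=20, rng=random.Random(1)):
    # Standard probabilistic Miller-Rabin test; assumes odd n > 3.
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # definitely composite
    return True  # probably prime

def is_probable_prime(n):
    if n < 2:
        return False
    for p in SMALL_PRIMES:  # cheap trial division filters most composites
        if n % p == 0:
            return n == p
    return miller_rabin(n)
```

Each trial division costs one modular reduction, while a Miller-Rabin round costs a full modular exponentiation, which is why extending the filter pays off up to a point.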
Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions.
Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition.
Inspired by the observation that the co-occurrences of the joints intrinsically characterize human actions, we take the skeleton as the input at each time slot and introduce a novel regularization scheme to learn the co-occurrence features of skeleton joints.
To train the deep LSTM network effectively, we propose a new dropout algorithm which simultaneously operates on the gates, cells, and output responses of the LSTM neurons.
Experimental results on three human action recognition datasets consistently demonstrate the effectiveness of the proposed model.
The advancement in technology has brought a new era in terrorism where Online Social Networks have become a major platform of communication with wide range of usage from message channeling to propaganda and recruitment of new followers in terrorist groups.
Meanwhile, during the terrorist attacks people use social networks for information exchange, mobilizing and uniting and raising money for the victims.
This paper critically analyses the specific usage of social networks in the times of terrorism attacks in developing countries.
We characterize binary words that have exactly two unbordered conjugates and show that they can be expressed as a product of two palindromes.
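Both notions in this statement are easy to check computationally; the following sketch (a sanity check of ours, not the paper's proof) brute-forces the claim on all binary words up to length 10:

```python
from itertools import product

def is_unbordered(w):
    # A word is unbordered if no proper nonempty prefix is also a suffix.
    return not any(w[:i] == w[-i:] for i in range(1, len(w)))

def unbordered_conjugates(w):
    # Conjugates of w are its rotations.
    return [w[i:] + w[:i] for i in range(len(w))
            if is_unbordered(w[i:] + w[:i])]

def is_palindrome_product(w):
    # Can w be written as uv with u and v both (possibly empty) palindromes?
    return any(w[:i] == w[:i][::-1] and w[i:] == w[i:][::-1]
               for i in range(len(w) + 1))

# Exhaustive sanity check of the claim on all binary words up to length 10.
for n in range(2, 11):
    for bits in product("ab", repeat=n):
        w = "".join(bits)
        if len(unbordered_conjugates(w)) == 2:
            assert is_palindrome_product(w)
```

For example, "aabb" has exactly the two unbordered conjugates "aabb" and "bbaa", and factors as the palindrome product "aa" + "bb".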
The current study examines how adequate coordination among different cognitive processes including visual recognition, attention switching, action preparation and generation can be developed via learning of robots by introducing a novel model, the Visuo-Motor Deep Dynamic Neural Network (VMDNN).
The proposed model is built on coupling of a dynamic vision network, a motor generation network, and a higher level network allocated on top of these two.
The simulation experiments using the iCub simulator were conducted for cognitive tasks including visual object manipulation responding to human gestures.
The results showed that synergetic coordination can be developed via iterative learning through the whole network when a spatio-temporal hierarchy and a temporal hierarchy self-organize in the visual pathway and the motor pathway, respectively, such that the higher level can manipulate them with abstraction.
Our Chapter in the upcoming Volume I: Computer Science and Software Engineering of Computing Handbook (Third edition), Allen Tucker, Teo Gonzales and Jorge L. Diaz-Herrera, editors, covers Algebraic Algorithms, both symbolic and numerical, for matrix computations and root-finding for polynomials and systems of polynomials equations.
We cover part of these large subjects and include basic bibliography for further study.
To meet space limitation we cite books, surveys, and comprehensive articles with pointers to further references, rather than including all the original technical papers.
Automatic segmentation of retinal blood vessels from fundus images plays an important role in the computer aided diagnosis of retinal diseases.
The task of blood vessel segmentation is challenging due to the extreme variations in morphology of the vessels against noisy background.
In this paper, we formulate the segmentation task as a multi-label inference task and utilize the implicit advantages of the combination of convolutional neural networks and structured prediction.
Our proposed convolutional neural network based model achieves strong performance and significantly outperforms the state-of-the-art for automatic retinal blood vessel segmentation on DRIVE dataset with 95.33% accuracy and 0.974 AUC score.
We investigate pivot-based translation between related languages in a low resource, phrase-based SMT setting.
We show that a subword-level pivot-based SMT model using a related pivot language is substantially better than word and morpheme-level pivot models.
It is also highly competitive with the best direct translation model, which is encouraging as no direct source-target training corpus is used.
We also show that combining multiple related language pivot models can rival a direct translation model.
Thus, the use of subwords as translation units coupled with multiple related pivot languages can compensate for the lack of a direct parallel corpus.
Understanding the structural controllability of a complex network requires identifying a Minimum Input nodes Set (MIS) of the network.
It has been suggested that finding an MIS is equivalent to computing a maximum matching of the network, where the unmatched nodes constitute an MIS.
However, the maximum matching of a network is often not unique, and finding all MISs may provide deep insights into the controllability of the network.
Finding all possible input nodes, which form the union of all MISs, is computationally challenging for large networks.
Here we present an efficient enumerative algorithm for the problem.
The main idea is to modify a maximum matching algorithm to make it efficient for finding all possible input nodes by computing only one MIS.
We rigorously proved the correctness of the new algorithm and evaluated its performance on synthetic and large real networks.
The experimental results showed that the new algorithm ran several orders of magnitude faster than the existing method on large real networks.
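The matching-based characterization can be sketched with a plain augmenting-path maximum matching; note this computes only a single MIS, whereas the paper's contribution, enumerating all possible input nodes, is not reproduced here:

```python
def maximum_matching(adj, n):
    # adj[u] = successors of node u in the directed network.
    # Bipartite view: left side = out-copies, right side = in-copies.
    match_to = [-1] * n  # match_to[v] = left node matched to right node v

    def augment(u, seen):
        for v in adj.get(u, []):
            if not seen[v]:
                seen[v] = True
                if match_to[v] == -1 or augment(match_to[v], seen):
                    match_to[v] = u
                    return True
        return False

    for u in range(n):
        augment(u, [False] * n)
    return match_to

# Directed star: the hub 0 points to the leaves 1..4.
match_to = maximum_matching({0: [1, 2, 3, 4]}, 5)
input_nodes = [v for v in range(5) if match_to[v] == -1]  # one MIS
# The hub can drive only one leaf, so four input nodes are needed.
```

On a directed chain 0 -> 1 -> 2, by contrast, every node but the head is matched, so a single input node at the head suffices.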
GFDM and WCP-COQAM are amongst the candidate physical layer modulation formats to be used in 5G, whose claimed lower out-of-band (OOB) emissions are important with respect to cognitive radio based dynamic spectrum access solutions.
In this study, we compare OFDM, GFDM and WCP-COQAM in terms of OOB emissions in a fair manner such that their spectral efficiencies are the same and OOB emission reduction techniques are applied to all of the modulation types.
Analytical PSD expressions are also correlated with the simulation based OOB emission results.
While maintaining the same spectral efficiency, we also compare carrier frequency offset immunities.
Memristors are low-power memory-holding resistors thought to be useful for neuromorphic computing, which can compute via spike interactions mediated through the device's short-term memory.
Using interacting spikes, it is possible to build an AND gate that computes OR at the same time, similarly a full adder can be built that computes the arithmetical sum of its inputs.
Here we show how these gates can be understood by modelling the memristors as a novel type of perceptron: one which is sensitive to input order.
The memristor's memory can change the input weights for later inputs; thus memristor gates cannot be accurately described by a single perceptron, requiring either a network of time-invariant perceptrons or a complex time-varying self-reprogrammable perceptron.
This work demonstrates the high functionality of memristor logic gates, and also that the addition of thresholding could enable the creation of a standard perceptron in hardware, which may be useful in building neural net chips.
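A toy, hypothetical model of such an order-sensitive unit (our own construction for illustration, not the paper's device equations) makes the idea concrete: each input transiently reprograms the weight seen by later inputs, so the same spike elicits different responses at different positions in a sequence.

```python
class OrderSensitivePerceptron:
    # Toy model of a memristive gate: each nonzero input transiently
    # boosts the weight seen by later inputs (short-term memory), so the
    # response to a spike depends on what came before it.
    def __init__(self, w=1.0, boost=0.5, threshold=1.2):
        self.w, self.boost, self.threshold = w, boost, threshold

    def process(self, spikes):
        w, fired = self.w, []
        for s in spikes:
            fired.append(w * s >= self.threshold)
            w += self.boost * s  # the input itself reprograms the weight
        return fired

p = OrderSensitivePerceptron()
# The same input value elicits different responses at different positions:
assert p.process([1, 1]) == [False, True]   # second spike sees w = 1.5
assert p.process([0, 1]) == [False, False]  # a zero spike leaves w unchanged
```

A standard perceptron, whose weights are fixed during inference, cannot reproduce this position dependence with a single unit.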
We present a novel semantic light field (LF) refocusing technique that can achieve unprecedented see-through quality.
Different from prior art, our semantic see-through (SST) differentiates rays in their semantic meaning and depth.
Specifically, we combine deep learning and stereo matching to provide each ray a semantic label.
We then design tailored weighting schemes for blending the rays.
Although simple, our solution can effectively remove foreground residues when focusing on the background.
At the same time, SST maintains smooth transitions in varying focal depths.
Comprehensive experiments on synthetic and new real indoor and outdoor datasets demonstrate the effectiveness and usefulness of our technique.
This letter deals with the controllability issue of complex networks.
An index is chosen to quantitatively measure the extent of controllability of a given network.
The effect of this index is analyzed based on empirical studies on various classes of network topologies, such as random, small-world, and scale-free networks.
This article addresses an open problem in the area of cognitive systems and architectures: namely the problem of handling (in terms of processing and reasoning capabilities) complex knowledge structures that can be at least plausibly comparable, both in terms of size and of typology of the encoded information, to the knowledge that humans process daily for executing everyday activities.
Handling a huge amount of knowledge, and selectively retrieving it according to the needs emerging in different situational scenarios, is an important aspect of human intelligence.
For this task, in fact, humans adopt a wide range of heuristics (Gigerenzer and Todd) due to their bounded rationality (Simon, 1957).
In this perspective, one of the requirements that should be considered for the design, realization, and evaluation of intelligent cognitively inspired systems is their ability to heuristically identify and retrieve, from the general knowledge stored in their artificial Long-Term Memory (LTM), the knowledge that is synthetically and contextually relevant.
This requirement, however, is often neglected.
Currently, artificial cognitive systems and architectures are not able, de facto, to deal with complex knowledge structures that are even slightly comparable to the knowledge heuristically managed by humans.
In this paper I will argue that this is not only a technological problem but also an epistemological one and I will briefly sketch a proposal for a possible solution.
This paper presents a summary of the first Workshop on Building Linguistically Generalizable Natural Language Processing Systems, and the associated Build It Break It, The Language Edition shared task.
The goal of this workshop was to bring together researchers in NLP and linguistics with a shared task aimed at testing the generalizability of NLP systems beyond the distributions of their training data.
We describe the motivation, setup, and participation of the shared task, provide discussion of some highlighted results, and discuss lessons learned.
We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation.
The problem of clustering features arises naturally in text classification for instance, to reduce dimensionality by grouping words together and identify synonyms.
The sample clustering problem on the other hand, applies to multiclass problems where we are allowed to make multiple predictions and the performance of the best answer is recorded.
We derive a unified optimization formulation highlighting the common structure of these problems and produce algorithms whose core iteration complexity amounts to a k-means clustering step, which can be approximated efficiently.
We extend these results to combine sparsity and clustering constraints, and develop a new projection algorithm on the set of clustered sparse vectors.
We prove convergence of our algorithms on random instances, based on a union of subspaces interpretation of the clustering structure.
Finally, we test the robustness of our methods on artificial data sets as well as real data extracted from movie reviews.
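The core projection step, replacing a vector by one taking at most k distinct values, reduces to a k-means clustering of its entries; a minimal one-dimensional Lloyd sketch of ours (the paper's algorithm uses this step inside a larger iteration, with different initialization details) is:

```python
def kmeans_project_1d(v, k, iters=50):
    # Project vector v onto the set of vectors taking at most k distinct
    # values: replace each entry by the mean of its cluster, computed by a
    # plain 1D Lloyd iteration (assumes k >= 2 and not all entries equal).
    centers = [min(v) + i * (max(v) - min(v)) / (k - 1) for i in range(k)]
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: (x - centers[c]) ** 2)
                  for x in v]                        # assignment step
        for c in range(k):
            pts = [x for x, l in zip(v, labels) if l == c]
            if pts:
                centers[c] = sum(pts) / len(pts)     # update step
    return [centers[l] for l in labels]
```

For example, projecting [1.0, 1.1, 5.0, 5.2] with k = 2 yields [1.05, 1.05, 5.1, 5.1]: each entry is replaced by its cluster mean.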
Next generation multi-beam SatCom architectures will heavily exploit full frequency reuse schemes along with interference management techniques, e.g., precoding or multiuser detection, to drastically increase the system throughput.
In this framework, we address the problem of the user selection for multicast precoding by formulating it as a clustering problem.
By introducing a novel mathematical framework, we design fixed-/variable-size clustering algorithms that group users into simultaneously precoded and served clusters while maximising the system throughput.
Numerical simulations are used to validate the proposed algorithms and to identify the main system-level trade-offs.
With the rapid advancement of wireless network technology, the usage of WSNs in real-time applications such as military operations and forest monitoring is increasing.
Generally, WSNs operate in unattended environments and handle critical data.
Authenticating the user trying to access the sensor memory is one of the critical requirements.
Many researchers have proposed remote user authentication schemes focusing on various parameters.
In 2013, Li et al. proposed a temporal-credential-based mutual authentication and key agreement scheme for WSNs.
Li et al. claimed that their scheme is secure against all major cryptographic attacks and requires less computation due to its use of hash functions instead of encryption operations.
Unfortunately, in this paper we show that their scheme is vulnerable to offline password guessing attacks, stolen smart card attacks, and password leakage, and that it fails to provide data privacy.
A heavy path in a weighted graph represents a notion of connectivity and ordering that goes beyond two nodes.
The heaviest path of length l in the graph is simply a sequence of nodes with edges between them such that the sum of edge weights is maximum among all paths of length l. Trivially, the heaviest edge in the graph is the heaviest path of length 1, representing a heavy connection between (any) two existing nodes.
This can be generalized in many different ways for more than two nodes, one of which is finding the heavy weight paths in the graph.
In an influence network, this represents a highway for spreading information from a node to one of its indirect neighbors at distance l. Moreover, a heavy path implies an ordering of nodes.
For instance, we can discover which ordering of songs (tourist spots) on a playlist (travel itinerary) is more pleasant to a user or a group of users who enjoy all songs (tourist spots) on the playlist (itinerary).
This can also serve as a hard optimization problem, maximizing different types of quantities of a path such as score, flow, probability or surprise, defined as edge weight.
Therefore, if one can solve the Heavy Path Problem (HPP) efficiently, they can as well use HPP for modeling and reduce other complex problems to it.
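For small graphs the definition can be made concrete by brute force; the sketch below is our own exponential enumeration, purely illustrative, while the paper targets efficient solutions:

```python
from itertools import permutations

def heaviest_path(weights, l):
    # weights[(u, v)] = weight of the undirected edge {u, v}.
    # Brute force over all simple paths with l edges; only viable for
    # small graphs, since HPP is a hard optimization problem in general.
    nodes = {u for edge in weights for u in edge}
    w = lambda a, b: weights.get((a, b), weights.get((b, a)))
    best, best_path = float("-inf"), None
    for seq in permutations(nodes, l + 1):
        if all(w(a, b) is not None for a, b in zip(seq, seq[1:])):
            total = sum(w(a, b) for a, b in zip(seq, seq[1:]))
            if total > best:
                best, best_path = total, seq
    return best, best_path

# Triangle: the heaviest 2-edge path chains the two heaviest edges.
best, path = heaviest_path({(0, 1): 3, (1, 2): 5, (0, 2): 1}, l=2)
```

On the triangle above, the heaviest path of length 1 is the edge of weight 5, while the heaviest path of length 2 is the chain 0-1-2 with weight 8.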
Conventional approaches to image de-fencing use multiple adjacent frames for segmentation of fences in the reference image and are limited to restoring images of static scenes only.
In this paper, we propose a de-fencing algorithm for images of dynamic scenes using an occlusion-aware optical flow method.
We divide the problem of image de-fencing into the tasks of automated fence segmentation from a single image, motion estimation under known occlusions and fusion of data from multiple frames of a captured video of the scene.
Specifically, we use a pre-trained convolutional neural network to segment fence pixels from a single image.
The knowledge of spatial locations of fences is used to subsequently estimate optical flow in the occluded frames of the video for the final data fusion step.
We cast the fence removal problem in an optimization framework by modeling the formation of the degraded observations.
The inverse problem is solved using fast iterative shrinkage thresholding algorithm (FISTA).
Experimental results show the effectiveness of the proposed algorithm.
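FISTA itself is a standard accelerated proximal gradient method; a minimal one-dimensional sketch, on an example problem of our own choosing rather than the paper's de-fencing objective, is:

```python
def soft(x, t):
    # Soft-thresholding: the proximal operator of t * |x|.
    return max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0)

def fista_1d(grad_f, L, lam, x0, iters=200):
    # One-dimensional FISTA for min_x f(x) + lam*|x|, where grad_f is the
    # gradient of the smooth part f and L its Lipschitz constant.
    x, y, t = x0, x0, 1.0
    for _ in range(iters):
        x_new = soft(y - grad_f(y) / L, lam / L)       # proximal gradient step
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

# f(x) = 0.5*(x - 3)^2 has gradient x - 3 and L = 1; with lam = 1 the
# minimizer is the soft-thresholded value soft(3, 1) = 2.
x_star = fista_1d(lambda x: x - 3.0, L=1.0, lam=1.0, x0=0.0)
```

In the inverse problem above the same iteration runs over image-sized vectors, with the data-formation model supplying the smooth term and the regularizer supplying the proximal step.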
The NUbots are an interdisciplinary RoboCup team from The University of Newcastle, Australia.
The team has a history of strong contributions in the areas of machine learning and computer vision.
The NUbots have participated in RoboCup leagues since 2002, placing first several times in the past.
In 2014 the NUbots also partnered with the University of Newcastle Mechatronics Laboratory to participate in the RobotX Marine Robotics Challenge, which resulted in several new ideas and improvements to the NUbots vision system for RoboCup.
This paper summarizes the history of the NUbots team, describes the roles and research of the team members, gives an overview of the NUbots' robots, their software system, and several associated research projects.
Automatic parking is being heavily developed by car manufacturers and suppliers.
To date, there are two problems with automatic parking.
First, there are no openly available segmentation labels of parking slots on a panoramic surround view (PSV) dataset.
Second, it remains unclear how to detect parking slots and road structures robustly.
Therefore, in this paper, we build a public PSV dataset.
At the same time, we propose a highly fused convolutional network (HFCN) based segmentation method for parking slots and lane markings based on the PSV dataset.
A surround-view image is made of four calibrated images captured from four fisheye cameras.
We collect and label more than 4,200 surround view images for this task, which contain various illuminated scenes of different types of parking slots.
A VH-HFCN network is proposed, which adopts an HFCN as the base, with an extra efficient VH-stage for better segmenting various markings.
The VH-stage consists of two independent linear convolution paths with vertical and horizontal convolution kernels respectively.
This modification enables the network to robustly and precisely extract linear features.
We evaluated our model on the PSV dataset and the results showed outstanding performance in ground markings segmentation.
Based on the segmented markings, parking slots and lanes are acquired by skeletonization, Hough line transform, and line arrangement.
Formal Concept Analysis (FCA) is a data analysis method which enables the discovery of hidden knowledge existing in data.
A kind of hidden knowledge extracted from data is association rules.
Different quality measures were reported in the literature to extract only relevant association rules.
Given a dataset, the choice of a good quality measure remains a challenging task for a user.
Given a quality measures evaluation matrix according to semantic properties, this paper describes how FCA can highlight quality measures with similar behavior in order to help the user make this choice.
The aim of this article is the discovery of clusters of Interestingness Measures (IM), able to validate those found via the hierarchical and partitioning clustering methods AHC and k-means.
Then, based on a theoretical study of sixty-one interestingness measures according to nineteen properties, proposed in a recent study, FCA identifies several groups of measures.
We propose the residual expansion (RE) algorithm: a global (or near-global) optimization method for nonconvex least squares problems.
Unlike most existing nonconvex optimization techniques, the RE algorithm is not based on either stochastic or multi-point searches; therefore, it can achieve fast global optimization.
Moreover, the RE algorithm is easy to implement and successful in high-dimensional optimization.
The RE algorithm exhibits excellent empirical performance in terms of k-means clustering, point-set registration, optimized product quantization, and blind image deblurring.
In this paper some new experimental results about the statistical characterization of the non-line-of-sight (NLOS) bias affecting time-of-arrival (TOA) estimation in ultrawideband (UWB) wireless localization systems are illustrated.
Then, these results are exploited to assess the performance of various maximum-likelihood (ML) based algorithms for joint TOA localization and NLOS bias mitigation.
Our numerical results evidence that the accuracy of all the considered algorithms is appreciably influenced by the LOS/NLOS conditions of the propagation environment.
A large amount of research effort has been dedicated to adapting boosting for imbalanced classification.
However, boosting methods are yet to be satisfactorily immune to class imbalance, especially for multi-class problems.
This is because most of the existing solutions for handling class imbalance rely on expensive cost set tuning for determining the proper level of compensation.
We show that the assignment of weights to the component classifiers of a boosted ensemble can be thought of as a game of Tug of War between the classes in the margin space.
We then demonstrate how this insight can be used to attain a good compromise between the rare and abundant classes without having to resort to cost set tuning, which has long been the norm for imbalanced classification.
The solution is based on a lexicographic linear programming framework which requires two stages.
Initially, class-specific component weight combinations are found so as to minimize a hinge loss individually for each of the classes.
Subsequently, the final component weights are assigned so that the maximum deviation from the class-specific minimum loss values (obtained in the previous stage) is minimized.
Hence, the proposal is not only restricted to two-class situations, but is also readily applicable to multi-class problems.
Additionally, we also derive the dual formulation corresponding to the proposed framework.
Experiments conducted on artificial and real-world imbalanced datasets as well as on challenging applications such as hyperspectral image classification and ImageNet classification establish the efficacy of the proposal.
Unified Virtual Memory (UVM) was recently introduced on NVIDIA GPUs.
Through software and hardware support, UVM provides a coherent shared memory across the entire heterogeneous node, migrating data as appropriate.
The older CUDA programming style is akin to older large-memory UNIX applications which used to directly load and unload memory segments.
Newer CUDA programs have started taking advantage of UVM for the same reasons of superior programmability that UNIX applications long ago switched to assuming the presence of virtual memory.
Therefore, checkpointing of UVM will become increasingly important, especially as NVIDIA CUDA continues to gain wider popularity: 87 of the top 500 supercomputers in the latest listings are GPU-accelerated, with a current trend of ten additional GPU-based supercomputers each year.
A new scalable checkpointing mechanism, CRUM (Checkpoint-Restart for Unified Memory), is demonstrated for hybrid CUDA/MPI computations across multiple computer nodes.
CRUM supports a fast, forked checkpointing, which mostly overlaps the CUDA computation with storage of the checkpoint image in stable storage.
The runtime overhead of using CRUM is 6% on average, and the time for forked checkpointing is seen to be a factor of up to 40 times less than traditional, synchronous checkpointing.
Patterns of interdisciplinarity in science can be quantified through diverse complementary dimensions.
This paper takes as a case study the scientific environment of a generalist journal in Geography, Cybergeo, in order to introduce a novel methodology combining citation network analysis and semantic analysis.
We collect a large corpus of around 200,000 articles with their abstracts and the corresponding citation network that provides a first citation classification.
Relevant keywords are extracted for each article through text-mining, allowing us to construct a semantic classification.
We study the qualitative patterns of relations between endogenous disciplines within each classification, and finally show the complementarity of classifications and of their associated interdisciplinarity measures.
The tools we develop accordingly are open and reusable for similar large scale studies of scientific environments.
Humor is an integral part of human lives.
Despite its tremendous impact, we perhaps surprisingly do not yet have a detailed understanding of humor.
As interactions between humans and AI systems increase, it is imperative that these systems are taught to understand subtleties of human expressions such as humor.
In this work, we are interested in the question - what content in a scene causes it to be funny?
As a first step towards understanding visual humor, we analyze the humor manifested in abstract scenes and design computational models for them.
We collect two datasets of abstract scenes that facilitate the study of humor at both the scene-level and the object-level.
We analyze the funny scenes and explore the different types of humor depicted in them via human studies.
We model two tasks that we believe demonstrate an understanding of some aspects of visual humor.
The tasks involve predicting the funniness of a scene and altering the funniness of a scene.
We show that our models perform well quantitatively, and qualitatively through human studies.
Our datasets are publicly available.
LTE-Unlicensed (LTE-U) has recently attracted worldwide interest to meet the explosion in cellular traffic data.
By using carrier aggregation (CA), licensed and unlicensed bands are integrated to enhance transmission capacity while maintaining reliable and predictable performance.
As there may exist other conventional unlicensed band users, such as Wi-Fi users, LTE-U users have to share the same unlicensed bands with them.
Thus, an optimized resource allocation scheme to ensure the fairness between LTE-U users and conventional unlicensed band users is critical for the deployment of LTE-U networks.
In this paper, we investigate an energy efficient resource allocation problem in LTE-U coexisting with other wireless networks, which aims at guaranteeing fairness among the users of different radio access networks (RANs).
We formulate the problem as a multi-objective optimization problem and propose a semi-distributed matching framework with a partial information-based algorithm to solve it.
We demonstrate our contributions with simulations in which various network densities and traffic load levels are considered.
Existing methods for arterial blood pressure (BP) estimation directly map the input physiological signals to output BP values without explicitly modeling the underlying temporal dependencies in BP dynamics.
As a result, these models suffer from accuracy decay over a long time and thus require frequent calibration.
In this work, we address this issue by formulating BP estimation as a sequence prediction problem in which both the input and target are temporal sequences.
We propose a novel deep recurrent neural network (RNN) consisting of multilayered Long Short-Term Memory (LSTM) networks, which incorporate (1) a bidirectional structure to access larger-scale context information of the input sequence, and (2) residual connections to allow gradients in the deep RNN to propagate more effectively.
The proposed deep RNN model was tested on a static BP dataset, and it achieved root mean square error (RMSE) of 3.90 and 2.66 mmHg for systolic BP (SBP) and diastolic BP (DBP) prediction respectively, surpassing the accuracy of traditional BP prediction models.
On a multi-day BP dataset, the deep RNN achieved RMSE of 3.84, 5.25, 5.80 and 5.81 mmHg for SBP prediction on the 1st day, 2nd day, 4th day and 6th month after the 1st day, and 1.80, 4.78, 5.0 and 5.21 mmHg for the corresponding DBP predictions, outperforming all previous models by a notable margin.
The experimental results suggest that modeling the temporal dependencies in BP dynamics significantly improves the long-term BP prediction accuracy.
Artificial Neural Networks (ANNs) have found widespread applications in tasks such as pattern recognition and image classification.
However, hardware implementations of ANNs using conventional binary arithmetic units are computationally expensive, energy-intensive and have large area overheads.
Stochastic Computing (SC) is an emerging paradigm which replaces these conventional units with simple logic circuits and is particularly suitable for fault-tolerant applications.
Spintronic devices, such as Magnetic Tunnel Junctions (MTJs), are capable of replacing CMOS in memory and logic circuits.
In this work, we propose an energy-efficient use of MTJs, which exhibit probabilistic switching behavior, as Stochastic Number Generators (SNGs), which forms the basis of our NN implementation in the SC domain.
Further, error resilient target applications of NNs allow us to introduce Approximate Computing, a framework wherein accuracy of computations is traded-off for substantial reductions in power consumption.
We propose approximating the synaptic weights in our MTJ-based NN implementation, in ways brought about by properties of our MTJ-SNG, to achieve energy-efficiency.
We design an algorithm that can perform such approximations within a given error tolerance in a single-layer NN in an optimal way owing to the convexity of the problem formulation.
We then use this algorithm and develop a heuristic approach for approximating multi-layer NNs.
To give a perspective of the effectiveness of our approach, a 43% reduction in power consumption was obtained with less than 1% accuracy loss on a standard classification problem, with 26% being brought about by the proposed algorithm.
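To illustrate the stochastic-computing idea the abstract builds on, the following minimal sketch encodes values as random bitstreams and multiplies them with a bitwise AND, the way an SC arithmetic unit would. This is a generic software illustration: a pseudo-random generator stands in for the MTJ-based SNG, and the function names are ours, not the paper's.

```python
import random

def sng(p, n, rng):
    """Stochastic number generator: a length-n bitstream whose fraction
    of 1s approximates the probability p. In the paper's hardware, an
    MTJ's probabilistic switching plays this role; here a software RNG
    stands in for it."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def sc_multiply(x, y):
    """In unipolar stochastic computing, multiplication of two encoded
    values reduces to a bitwise AND of their bitstreams."""
    return [a & b for a, b in zip(x, y)]

def decode(stream):
    """Recover the encoded value as the fraction of 1s in the stream."""
    return sum(stream) / len(stream)

rng = random.Random(0)
n = 10_000
a, b = 0.6, 0.5
product = decode(sc_multiply(sng(a, n, rng), sng(b, n, rng)))
print(round(product, 2))  # close to 0.6 * 0.5 = 0.30
```

Longer bitstreams trade latency for accuracy, which is exactly the knob that makes SC attractive for fault-tolerant, approximate workloads such as neural networks.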
Estimation, recognition, and near-future prediction of 3D trajectories based on their two-dimensional projections available from one camera source is an exceptionally difficult problem due to uncertainty in the trajectories and the environment, the high dimensionality of the trajectory states, and the lack of sufficient labeled data.
In this article, we propose a solution to solve this problem based on a novel deep learning model dubbed Disjunctive Factored Four-Way Conditional Restricted Boltzmann Machine (DFFW-CRBM).
Our method improves state-of-the-art deep learning techniques for high-dimensional time-series modeling by introducing a novel tensor factorization capable of driving fourth-order Boltzmann machines to considerably lower energy levels at no additional computational cost.
DFFW-CRBMs can accurately estimate, recognize, and perform near-future prediction of three-dimensional trajectories from their 2D projections while requiring only a limited amount of labeled data.
We evaluate our method on both simulated and real-world data, showing its effectiveness in predicting and classifying complex ball trajectories and human activities.
Traditional approaches for complementary product recommendations rely on behavioral and non-visual data such as customer co-views or co-buys.
However, certain domains such as fashion are primarily visual.
We propose a framework that harnesses visual cues in an unsupervised manner to learn the distribution of co-occurring complementary items in real world images.
Our model learns a non-linear transformation between the two manifolds of source and target complementary item categories (e.g., tops and bottoms in outfits).
Given a large dataset of images containing instances of co-occurring object categories, we train a generative transformer network directly on the feature representation space by casting it as an adversarial optimization problem.
Such a conditional generative model can produce multiple novel samples of complementary items (in the feature space) for a given query item.
The final recommendations are selected from the closest real world examples to the synthesized complementary features.
We apply our framework to the task of recommending complementary tops for a given bottom clothing item.
The recommendations made by our system are diverse, and are favored by human experts over the baseline approaches.
Generative models with an encoding component such as autoencoders currently receive great interest.
However, training of autoencoders is typically complicated by the need to train separate encoder and decoder models that have to be enforced to be reciprocal to each other.
To overcome this problem, by-design reversible neural networks (RevNets) have previously been used as generative models, either directly optimizing the likelihood of the data under the model or using an adversarial approach on the generated data.
Here, we instead investigate their performance using an adversary on the latent space in the adversarial autoencoder framework.
We investigate the generative performance of RevNets on the CelebA dataset, showing that generative RevNets can generate coherent faces with similar quality as Variational Autoencoders.
This first attempt to use RevNets inside the adversarial autoencoder framework slightly underperformed relative to recent advanced generative models using an autoencoder component on CelebA, but this gap may diminish with further optimization of the training setup of generative RevNets.
In addition to the experiments on CelebA, we show a proof-of-principle experiment on the MNIST dataset suggesting that adversary-free trained RevNets can discover meaningful latent dimensions without pre-specifying the number of dimensions of the latent sampling distribution.
In summary, this study shows that RevNets can be employed in different generative training settings.
Source code for this study is at https://github.com/robintibor/generative-reversible
Localization performance in wireless networks has been traditionally benchmarked using the Cramer-Rao lower bound (CRLB), given a fixed geometry of anchor nodes and a target.
However, by endowing the target and anchor locations with distributions, this paper recasts this traditional, scalar benchmark as a random variable.
The goal of this work is to derive an analytical expression for the distribution of this now random CRLB, in the context of Time-of-Arrival-based positioning.
To derive this distribution, this work first analyzes how the CRLB is affected by the order statistics of the angles between consecutive participating anchors (i.e., internodal angles).
This analysis reveals an intimate connection between the second largest internodal angle and the CRLB, which leads to an accurate approximation of the CRLB.
Using this approximation, a closed-form expression for the distribution of the CRLB, conditioned on the number of participating anchors, is obtained.
Next, this conditioning is eliminated to derive an analytical expression for the marginal CRLB distribution.
Since this marginal distribution accounts for all target and anchor positions, across all numbers of participating anchors, it therefore statistically characterizes localization error throughout an entire wireless network.
This paper concludes with a comprehensive analysis of this new network-wide-CRLB paradigm.
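The CRLB the abstract analyzes can be made concrete with the standard 2-D TOA Fisher information matrix, which depends only on the bearings of the participating anchors as seen from the target. The sketch below assumes unit-range anchors and a common ranging-noise standard deviation sigma; it illustrates the geometry dependence the paper exploits, not the paper's derived distribution.

```python
import math

def toa_crlb_trace(anchor_angles, sigma=1.0):
    """Trace of the CRLB for 2-D TOA positioning, assuming anchors at
    the given bearings from the target and ranging noise with standard
    deviation sigma. The FIM is (1/sigma^2) * sum_i u_i u_i^T, where
    u_i is the unit vector toward anchor i."""
    fxx = sum(math.cos(t) ** 2 for t in anchor_angles) / sigma ** 2
    fyy = sum(math.sin(t) ** 2 for t in anchor_angles) / sigma ** 2
    fxy = sum(math.cos(t) * math.sin(t) for t in anchor_angles) / sigma ** 2
    det = fxx * fyy - fxy ** 2
    return (fxx + fyy) / det  # trace of the inverse of the 2x2 FIM

# Three anchors spaced 120 degrees apart give trace = 4*sigma^2/N = 4/3.
angles = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]
print(round(toa_crlb_trace(angles), 4))  # 1.3333
```

Randomizing `anchor_angles` and tabulating `toa_crlb_trace` is exactly the step that turns this scalar benchmark into the random variable whose distribution the paper derives.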
We took part in the YouTube-8M Video Understanding Challenge hosted on Kaggle, and achieved the 10th place within less than one month's time.
In this paper, we present an extensive analysis and solution to the underlying machine-learning problem based on frame-level data, where major challenges are identified and corresponding preliminary methods are proposed.
Notably, the proposed strategies combined with a uniformly-averaged multi-crop ensemble were sufficient to reach our ranking.
We also report methods we believe to be promising but did not have enough time to train to convergence.
We hope this paper could serve, to some extent, as a review and guideline of the YouTube-8M multi-label video classification benchmark, inspiring future attempts and research.
In computer simulations of the learning process, it is usually assumed that all elements of the training material are assimilated with equal durability.
In practice, however, the knowledge that a student actively uses is remembered much better.
For a more precise study of didactic systems, a multi-component model of learning is proposed.
It takes into account: 1) the transition of weak knowledge into trustworthy knowledge; 2) the difference in the rates at which trustworthy and weak knowledge are forgotten.
It is assumed that the rate of increase of a student's knowledge is proportional to: 1) the difference between the level of the teacher's requirements and the amount of learned knowledge; 2) the amount of learned knowledge raised to some power.
Examples of using the multi-component model to study situations in the learning process are considered, and the resulting graphs of the student's knowledge level over time are presented.
A generalized model of learning, which takes into account the varying complexity of the elements of the educational material, is proposed.
The possibility of creating a training program for students of pedagogical institutes is considered.
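The assumptions above can be turned into a small simulation. The following Euler-integration sketch is our own reading of the model, with assumed rate constants (`alpha`, `beta`, `gamma_weak`, `gamma_strong`) and an assumed functional form; the paper's exact equations may differ.

```python
def simulate(L=1.0, alpha=0.5, beta=0.1, gamma_weak=0.2,
             gamma_strong=0.01, p=0.5, dt=0.01, steps=5000):
    """Euler integration of an assumed two-component learning model:
    weak knowledge U is acquired at a rate proportional to the gap
    (L - U - Z) between the teacher's requirement L and the learned
    amount, times (U + Z)**p; it turns into trustworthy knowledge Z at
    rate beta; and both components are forgotten, weak knowledge much
    faster (gamma_weak >> gamma_strong). All constants are illustrative."""
    U, Z = 0.05, 0.0  # small initial knowledge to bootstrap the power term
    for _ in range(steps):
        total = U + Z
        acquire = alpha * max(L - total, 0.0) * total ** p
        dU = acquire - beta * U - gamma_weak * U
        dZ = beta * U - gamma_strong * Z
        U += dt * dU
        Z += dt * dZ
    return U, Z

U, Z = simulate()
print(round(U, 3), round(Z, 3))
```

Plotting `U + Z` against time reproduces the kind of knowledge-level curve the abstract refers to: fast initial growth toward the requirement level, with durable knowledge slowly replacing weak knowledge.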
A tower is a sequence of words alternating between two languages in such a way that every word is a subsequence of the following word.
The height of the tower is the number of words in the sequence.
If there is no infinite tower (a tower of infinite height), then the height of all towers between the languages is bounded.
We study upper and lower bounds on the height of maximal finite towers with respect to the size of the NFA (the DFA) representation of the languages.
We show that the upper bound is polynomial in the number of states and exponential in the size of the alphabet, and that it is asymptotically tight if the size of the alphabet is fixed.
If the alphabet may grow, then, using an alphabet of size approximately the number of states of the automata, the lower bound on the height of towers is exponential with respect to that number.
In this case, there is a gap between the lower and upper bound, and the asymptotically optimal bound remains an open problem.
Since, in many cases, the constructed towers are sequences of prefixes, we also study towers of prefixes.
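The definitions above are easy to check mechanically. The sketch below uses two toy languages of our own choosing (not the paper's constructions) to exhibit a finite tower of height 4 that also happens to be a sequence of prefixes.

```python
def is_subseq(u, v):
    """True if u embeds into v as a scattered subsequence."""
    it = iter(v)
    return all(ch in it for ch in u)

# Two small hypothetical languages over the alphabet {a, b}:
L1 = {"b", "abab"}
L2 = {"ab", "ababab"}

# A tower of height 4: the words alternate between L1 and L2, and each
# word is a subsequence (here even a prefix) of the next.
tower = ["b", "ab", "abab", "ababab"]
assert all(tower[i] in (L1 if i % 2 == 0 else L2) for i in range(4))
assert all(is_subseq(tower[i], tower[i + 1]) for i in range(3))
print(len(tower))  # tower height: 4
```

The `ch in it` idiom consumes the iterator as it searches, so each character of `u` must be found strictly after the previous match, which is exactly the subsequence relation.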
Tabular notations, in particular SCR specifications, have proved to be a useful means for formally describing complex requirements.
The SCR method offers a powerful family of analysis tools, known as the SCR Toolset, but its availability is restricted by the Naval Research Laboratory of the USA.
This toolset applies different kinds of analysis considering the whole set of behaviours associated with a requirements specification.
In this paper we present a tool for describing and analyzing SCR requirements descriptions, that complements the SCR Toolset in two aspects.
First, its use is not restricted by any institution, and it resorts to a standard model-checking tool for analysis; second, it allows the analysis to be concentrated on particular sets of behaviours (subsets of the whole specification) that correspond to particular scenarios explicitly mentioned in the specification.
We take an operational notation that allows the engineer to describe behavioural "scenarios" by means of programs, and provide a translation into Promela to perform the analysis via Spin, an efficient off-the-shelf model checker freely available.
In addition, we apply the SCR method to a Pacemaker system and we use its tabular specification as a running example of this article.
We define the task of salient structure (SS) detection to unify the saliency-related tasks like fixation prediction, salient object detection, and other detection of structures of interest.
In this study, we propose a unified framework for SS detection by modeling the two-pathway-based guided search strategy of biological vision.
Firstly, context-based spatial prior (CBSP) is extracted based on the layout of edges in the given scene along a fast visual pathway, called non-selective pathway.
This is a rough, non-selective estimate of the locations where potential SSs are present.
Secondly, another flow of local feature extraction is executed in parallel along the selective pathway.
Finally, Bayesian inference is used to integrate local cues guided by CBSP, and to predict the exact locations of SSs in the input scene.
The proposed model is invariant to size and features of objects.
Experimental results on four datasets (two fixation prediction datasets and two salient object datasets) demonstrate that our system achieves competitive performance for SS detection (i.e., both fixation prediction and salient object detection) compared to the state-of-the-art methods.
Differential privacy is a promising approach to privacy preserving data analysis with a well-developed theory for functions.
Despite recent work on implementing systems that aim to provide differential privacy, the problem of formally verifying that these systems have differential privacy has not been adequately addressed.
This paper presents the first results towards automated verification of source code for differentially private interactive systems.
We develop a formal probabilistic automaton model of differential privacy for systems by adapting prior work on differential privacy for functions.
The main technical result of the paper is a sound proof technique based on a form of probabilistic bisimulation relation for proving that a system modeled as a probabilistic automaton satisfies differential privacy.
The novelty lies in the way we track quantitative privacy leakage bounds using a relation family instead of a single relation.
We illustrate the proof technique on a representative automaton motivated by PINQ, an implemented system that is intended to provide differential privacy.
To make our proof technique easier to apply to realistic systems, we prove a form of refinement theorem and apply it to show that a refinement of the abstract PINQ automaton also satisfies our differential privacy definition.
Finally, we begin the process of automating our proof technique by providing an algorithm for mechanically checking a restricted class of relations from the proof technique.
Features that capture well the textural patterns of a certain class of images are crucial for the performance of texture segmentation methods.
The manual selection of features or designing new ones can be a tedious task.
Therefore, it is desirable to automatically adapt the features to a certain image or class of images.
Typically, this requires a large set of training images with similar textures and ground truth segmentation.
In this work, we propose a framework to learn features for texture segmentation when no such training data is available.
The cost function for our learning process is constructed to match a commonly used segmentation model, the piecewise constant Mumford-Shah model.
This means that the features are learned such that they provide an approximately piecewise constant feature image with a small jump set.
Based on this idea, we develop a two-stage algorithm which first learns suitable convolutional features and then performs a segmentation.
We note that the features can be learned from a small set of images, from a single image, or even from image patches.
The proposed method achieves a competitive rank in the Prague texture segmentation benchmark, and it is effective for segmenting histological images.
Continual data collection and widespread deployment of machine learning algorithms, particularly the distributed variants, have raised new privacy challenges.
In a distributed machine learning scenario, the dataset is stored among several machines and they solve a distributed optimization problem to collectively learn the underlying model.
We present a secure multi-party computation inspired privacy preserving distributed algorithm for optimizing a convex function consisting of several possibly non-convex functions.
Each individual objective function is privately stored with an agent while the agents communicate model parameters with neighbor machines connected in a network.
We show that our algorithm can correctly optimize the overall objective function and learn the underlying model accurately.
We further prove that under a vertex connectivity condition on the topology, our algorithm preserves privacy of individual objective functions.
We establish limits on what a coalition of adversaries can learn by observing the messages and states shared over the network.
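The secure multi-party computation idea behind the algorithm can be illustrated with additive secret sharing: each agent masks its private value with random shares exchanged with the others, so that no single value is revealed yet the network-wide sum is preserved. This is a generic illustration of the masking principle, not the paper's full distributed-optimization protocol.

```python
import random

def private_sum(values, rng):
    """Additive secret-sharing sketch: agent i draws a random share for
    every agent, subtracts all shares it sends and adds all shares it
    receives. Individual values are hidden behind the random masks, but
    the masks cancel in the aggregate, so the sum is exact."""
    n = len(values)
    shares = [[rng.uniform(-10, 10) for _ in range(n)] for _ in range(n)]
    masked = [values[i] - sum(shares[i]) + sum(shares[j][i] for j in range(n))
              for i in range(n)]
    return masked

rng = random.Random(1)
vals = [3.0, -1.5, 4.5]
masked = private_sum(vals, rng)
print(round(sum(masked), 6))  # 6.0 -- the sum survives masking
```

The vertex-connectivity condition in the paper plays the analogous role to requiring enough honest share-exchanging neighbors here: a coalition observing only some links cannot reconstruct an honest agent's value.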
This study analyzes how web audiences flow across online digital features.
We construct a directed network of user flows based on sequential user clickstreams for all popular websites (n=1761), using traffic data obtained from a panel of a million web users in the United States.
We analyze these data to identify constellations of websites that are frequently browsed together in temporal sequences, both by similar user groups in different browsing sessions as well as by disparate users.
Our analyses thus render visible previously hidden online collectives and generate insight into the varied roles that curatorial infrastructures may play in shaping audience fragmentation on the web.
As long as human beings exist on this earth, there will be confidential images intended for a limited audience.
These images have to be transmitted in such a way that no unauthorized person gets knowledge of them.
DNA sequences play a vital role in modern cryptography and DNA sequence based cryptography renders a helping hand for transmission of such confidential images over a public insecure channel as the intended recipient alone can decipher them.
This paper outlines an integrated encryption scheme based on DNA sequences and scrambling according to magic square of doubly even order pattern.
Since there is negligible correlation between the original and encrypted images, this method resists statistical cryptanalytic attacks.
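The doubly-even magic square driving the scrambling step can be built with the classic construction: fill the cells row-wise with 1..n*n and invert a fixed pattern of cells. The code below shows that construction and one plausible way such a square could order pixels; the `scramble` routine is our illustration, and the paper's exact scrambling rule may differ.

```python
def doubly_even_magic(n):
    """Classic construction for n % 4 == 0: fill 1..n*n row-wise, then
    replace v with n*n + 1 - v on cells where the row and column both
    fall in {0, 3} mod 4, or both fall in {1, 2} mod 4."""
    assert n % 4 == 0
    sq = [[i * n + j + 1 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            if (i % 4 in (0, 3)) == (j % 4 in (0, 3)):
                sq[i][j] = n * n + 1 - sq[i][j]
    return sq

def scramble(pixels, square):
    """Permute a flat pixel list by visiting positions in the order of
    the magic-square entries (an illustrative scrambling rule)."""
    n = len(square)
    order = sorted(range(len(pixels)), key=lambda k: square[k // n][k % n])
    return [pixels[k] for k in order]

m = doubly_even_magic(4)
print([sum(row) for row in m])  # [34, 34, 34, 34] -- the magic constant
```

Because every row, column and diagonal sums to the magic constant n*(n*n + 1)/2, the induced permutation spreads neighboring pixels far apart, which is what destroys the correlation mentioned above.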
Face recognition has achieved great progress owing to the fast development of deep neural networks in the past few years.
As an important part of deep neural networks, a number of loss functions have been proposed that significantly improve the state-of-the-art methods.
In this paper, we propose a new loss function called Minimum Margin Loss (MML), which aims at enlarging the margin between overly close class-centre pairs so as to enhance the discriminative ability of the deep features.
MML supervises the training process together with the Softmax Loss and the Centre Loss, and also compensates for a shortcoming of Softmax + Centre Loss.
The experimental results on MegaFace, LFW and YTF datasets show that the proposed method achieves the state-of-the-art performance, which demonstrates the effectiveness of the proposed MML.
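The core of a minimum-margin penalty can be sketched in a few lines: accumulate a hinge term for every pair of class centres closer than the margin. This toy version operates on plain coordinate tuples rather than learned embeddings, and the function name is ours; in the paper the term is optimized jointly with Softmax and Centre Loss during training.

```python
def minimum_margin_loss(centres, margin):
    """Hinge-style penalty on overly close class-centre pairs: each pair
    whose Euclidean distance d falls below the margin contributes
    (margin - d), pushing such centres apart during training."""
    loss = 0.0
    n = len(centres)
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(centres[i], centres[j])) ** 0.5
            loss += max(0.0, margin - d)
    return loss

# Only the first pair (distance 0.5) violates the margin of 1.0.
centres = [(0.0, 0.0), (0.5, 0.0), (4.0, 0.0)]
print(minimum_margin_loss(centres, margin=1.0))  # 0.5
```

Well-separated centres contribute nothing, so the gradient acts only where discrimination is actually at risk, which is the stated motivation for MML.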
As with computer vision before it, remote sensing has been radically changed by the introduction of Convolutional Neural Networks.
Land cover mapping, object detection and scene understanding in aerial images rely more and more on deep learning to achieve new state-of-the-art results.
Recent architectures such as Fully Convolutional Networks (Long et al., 2015) can even produce pixel level annotations for semantic mapping.
In this work, we show how to use such deep networks to detect, segment and classify different varieties of wheeled vehicles in aerial images from the ISPRS Potsdam dataset.
This allows us to tackle object detection and classification on a complex dataset made up of visually similar classes, and to demonstrate the relevance of such a subclass modeling approach.
In particular, we want to show that deep learning is also suitable for object-oriented analysis of Earth Observation data.
First, we train an FCN variant on the ISPRS Potsdam dataset and show how the learnt semantic maps can be used to extract precise segmentations of vehicles, which allows us to study the distribution of vehicles in the city.
Second, we train a CNN to perform vehicle classification on the VEDAI (Razakarivony and Jurie, 2016) dataset, and transfer its knowledge to classify candidate segmented vehicles on the Potsdam dataset.
A major challenge in consumer credit risk portfolio management is to classify households according to their risk profile.
In order to build such risk profiles, it is necessary to employ an approach that systematically analyses the data to detect important relationships, interactions, dependencies and associations amongst the available continuous and categorical variables, and that accurately generates profiles of the most interesting household segments according to their credit risk.
The objective of this work is to employ a knowledge discovery from database process to identify groups of indebted households and describe their profiles using a database collected by the Consumer Credit Counselling Service (CCCS) in the UK.
Employing a framework that can use categorical and continuous data together to find hidden structures in unlabelled data, we established the ideal number of clusters and described them in order to identify the households that exhibit a high propensity for excessive debt levels.
The rise of social media is enabling people to freely express their opinions about products and services.
The aim of sentiment analysis is to automatically determine subject's sentiment (e.g., positive, negative, or neutral) towards a particular aspect such as topic, product, movie, news etc.
Deep learning has recently emerged as a powerful machine learning technique to tackle a growing demand of accurate sentiment analysis.
However, limited work has been conducted to apply deep learning algorithms to languages other than English, such as Persian.
In this work, two deep learning models (deep autoencoders and deep convolutional neural networks (CNNs)) are developed and applied to a novel Persian movie reviews dataset.
The proposed deep learning models are analyzed and compared with the state-of-the-art shallow multilayer perceptron (MLP) based machine learning model.
Simulation results demonstrate the enhanced performance of deep learning over state-of-the-art MLP.
The Information Flow Framework (IFF) is a descriptive category metatheory currently under development, which is being offered as the structural aspect of the Standard Upper Ontology (SUO).
The architecture of the IFF is composed of metalevels, namespaces and meta-ontologies.
The main application of the IFF is institutional: the notion of institutions and their morphisms are being axiomatized in the upper metalevels of the IFF, and the lower metalevel of the IFF has axiomatized various institutions in which semantic integration has a natural expression as the colimit of theories.
Recent studies indicate the feasibility of full-duplex (FD) bidirectional wireless communications.
Due to its potential to increase the capacity, analyzing the performance of a cellular network that contains full-duplex devices is crucial.
In this paper, we consider maximizing the weighted sum-rate of downlink and uplink of an FD heterogeneous OFDMA network where each cell consists of an imperfect FD base-station (BS) and a mixture of half-duplex and imperfect full-duplex mobile users.
To this end, first, the joint problem of sub-channel assignment and power allocation for a single cell network is investigated.
Then, the proposed algorithms are extended to solve the optimization problem for an FD heterogeneous network in which intra-cell and inter-cell interferences are taken into account.
Simulation results demonstrate that in a single cell network, when all the users and the BSs are perfect FD nodes, the network throughput could be doubled.
Otherwise, the performance improvement is limited by the inter-cell interference, inter-node interference, and self-interference.
We also investigate the effect of the percentage of FD users on the network performance in both indoor and outdoor scenarios, and analyze the effect of the self-interference cancellation capability of the FD nodes on the network performance.
For a closed-loop system with a contention-based multiple-access network on its sensor link, the Medium Access Controller (MAC) may discard some packets when the traffic on the link is high.
We use a local state-based scheduler to select a few critical data packets to send to the MAC.
In this paper, we analyze the impact of such a scheduler on the closed-loop system in the presence of traffic, and show that there is a dual effect with state-based scheduling.
In general, this makes the optimal scheduler and controller hard to find.
However, by removing past controls from the scheduling criterion, we find that certainty equivalence holds.
This condition is related to the classical result of Bar-Shalom and Tse, and it leads to the design of a scheduler with a certainty equivalent controller.
This design, however, does not result in an equivalent system to the original problem, in the sense of Witsenhausen.
Computing the estimate is difficult, but can be simplified by introducing a symmetry constraint on the scheduler.
Based on these findings, we propose a dual predictor architecture for the closed-loop system, which ensures separation between scheduler, observer and controller.
We present an example of this architecture, which illustrates a network-aware event-triggering mechanism.
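The event-triggering idea can be illustrated with a scalar plant: the scheduler forwards a measurement only when the innovation exceeds a threshold, and the observer otherwise propagates its own prediction. All parameters (`a`, `noise`, `threshold`) are illustrative, and this open-loop estimation sketch omits the controller and the dual-predictor structure of the paper.

```python
import random

def event_triggered_run(steps=200, a=0.9, noise=0.5, threshold=1.0, seed=0):
    """Toy event-triggered scheduling for the scalar plant
    x[k+1] = a*x[k] + w[k]: the observer predicts xhat[k+1] = a*xhat[k],
    and the scheduler sends the true state (a 'critical packet') only
    when the prediction error exceeds the threshold."""
    rng = random.Random(seed)
    x, xhat, sent = 0.0, 0.0, 0
    for _ in range(steps):
        x = a * x + rng.gauss(0.0, noise)  # plant update with process noise
        xhat = a * xhat                    # observer prediction
        if abs(x - xhat) > threshold:      # critical packet: transmit
            xhat = x
            sent += 1
    return sent, abs(x - xhat)

sent, err = event_triggered_run()
print(sent, round(err, 3))
```

By construction the estimation error never exceeds the threshold at the end of a step, while the number of transmissions stays well below the number of sampling instants, which is the traffic reduction the scheduler is after.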
A number of differences have emerged between modern and classic approaches to constituency parsing in recent years, with structural components like grammars and feature-rich lexicons becoming less central while recurrent neural network representations rise in popularity.
The goal of this work is to analyze the extent to which information provided directly by the model structure in classical systems is still being captured by neural methods.
To this end, we propose a high-performance neural model (92.08 F1 on PTB) that is representative of recent work and perform a series of investigative experiments.
We find that our model implicitly learns to encode much of the same information that was explicitly provided by grammars and lexicons in the past, indicating that this scaffolding can largely be subsumed by powerful general-purpose neural machinery.
This paper provides guidance to an analyst who wants to extract insight from a spreadsheet model.
It discusses the terminology of spreadsheet analytics, how to prepare a spreadsheet model for analysis, and a hierarchy of analytical techniques.
These techniques include sensitivity analysis, tornado charts, and backsolving (or goal-seeking).
This paper presents native-Excel approaches for automating these techniques, and discusses add-ins that are even more efficient.
Spreadsheet optimization and spreadsheet Monte Carlo simulation are briefly discussed.
The paper concludes by calling for empirical research and describing desired features of spreadsheet sensitivity-analysis and spreadsheet optimization add-ins.
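The one-at-a-time sensitivity analysis behind a tornado chart is simple to automate outside Excel as well. The sketch below swings each input plus or minus 10% around a base case and sorts inputs by output range; the profit model and all numbers are hypothetical.

```python
def tornado(model, base, swing=0.10):
    """One-at-a-time sensitivity analysis behind a tornado chart: swing
    each input +/- swing around the base case, record the output range,
    and sort inputs by impact (widest bar first)."""
    rows = []
    for name in base:
        lo = dict(base); lo[name] = base[name] * (1 - swing)
        hi = dict(base); hi[name] = base[name] * (1 + swing)
        out_lo, out_hi = model(**lo), model(**hi)
        rows.append((name, min(out_lo, out_hi), max(out_lo, out_hi)))
    return sorted(rows, key=lambda r: r[2] - r[1], reverse=True)

# Hypothetical profit model: profit = (price - variable) * volume - fixed
profit = lambda price, volume, fixed, variable: (price - variable) * volume - fixed
base = dict(price=10.0, volume=1000.0, fixed=2000.0, variable=6.0)
rows = tornado(profit, base)
for name, lo, hi in rows:
    print(name, lo, hi)
```

The sorted output is exactly the bar ordering a tornado chart would display, with the most influential input on top; in Excel the same sweep is typically driven by a data table or a short macro.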
Rare diseases affect a relatively small number of people, which limits investment in research for treatments and cures.
Developing an efficient method for rare disease detection is a crucial first step towards subsequent clinical research.
In this paper, we present a semi-supervised learning framework for rare disease detection using generative adversarial networks.
Our method takes advantage of the large amount of unlabeled data for disease detection and achieves the best results in terms of precision-recall score compared to baseline techniques.
Recent developments in quaternion-valued widely linear processing have established that the exploitation of complete second-order statistics requires consideration of both the standard covariance and the three complementary covariance matrices.
Although such matrices have a tremendous amount of structure and their decomposition is a powerful tool in a variety of applications, the non-commutative nature of the quaternion product has been prohibitive to the development of quaternion uncorrelating transforms.
To this end, we introduce novel techniques for a simultaneous decomposition of the covariance and complementary covariance matrices in the quaternion domain, whereby the quaternion version of the Takagi factorisation is explored to diagonalise symmetric quaternion-valued matrices.
This gives new insights into the quaternion uncorrelating transform (QUT) and forms a basis for the proposed quaternion approximate uncorrelating transform (QAUT) which simultaneously diagonalises all four covariance matrices associated with improper quaternion signals.
The effectiveness of the proposed uncorrelating transforms is validated by simulations on both synthetic and real-world quaternion-valued signals.
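The non-commutativity obstacle the abstract mentions is easy to see directly from the Hamilton product. The following self-contained sketch multiplies the basis quaternions i and j in both orders; it illustrates why familiar matrix uncorrelating transforms do not carry over unchanged, not the proposed QUT/QAUT themselves.

```python
def qmul(p, q):
    """Hamilton product of quaternions represented as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

i, j = (0, 1, 0, 0), (0, 0, 1, 0)
print(qmul(i, j))  # (0, 0, 0, 1)  -> k
print(qmul(j, i))  # (0, 0, 0, -1) -> -k
```

Since i*j = k while j*i = -k, eigen-style factorizations that implicitly reorder products must be rebuilt with care, which is where the quaternion Takagi factorisation enters.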
Virtual reality allows the creation of situations that can be experienced under the user's control, without risk, in a very flexible way.
This lets users develop skills and gain the confidence to work in real conditions with real equipment.
VR is therefore widely used as a training and learning tool.
More recently, VR has also shown its potential in the rehabilitation and therapy fields, because it gives users the ability to repeat their actions several times and to progress at their own pace.
In this communication, we present our work in the development of a wheelchair simulator designed to allow children with multiple disabilities to familiarize themselves with the wheelchair.
This paper focuses on a multimodal language understanding method for carry-and-place tasks with domestic service robots.
We address the case of ambiguous instructions, that is, when the target area is not specified.
For instance "put away the milk and cereal" is a natural instruction where there is ambiguity regarding the target area, considering environments in daily life.
Conventionally, this instruction can be disambiguated from a dialogue system, but at the cost of time and cumbersome interaction.
Instead, we propose a multimodal approach, in which the instructions are disambiguated using the robot's state and environment context.
We develop the Multi-Modal Classifier Generative Adversarial Network (MMC-GAN) to predict the likelihood of different target areas considering the robot's physical limitation and the target clutter.
Our approach, MMC-GAN, significantly improves accuracy compared with baseline methods that use instructions only or simple deep neural networks.
Semantically understanding complex driver encounter behavior, in which two or more vehicles are spatially close to each other, can potentially benefit the decision-making design of autonomous cars.
This paper presents a framework for analyzing various encounter behaviors by decomposing driving encounter data into small building blocks, called driving primitives, using nonparametric Bayesian learning (NPBL) approaches, which offer a flexible way to gain insight into complex driving encounters without any prerequisite knowledge.
The effectiveness of the proposed primitive-based framework is validated on 976 naturalistic driving encounters, from which more than 4000 driving primitives are learned using a sticky HDP-HMM, which combines a hidden Markov model (HMM) with a hierarchical Dirichlet process (HDP).
A dynamic time warping method integrated with k-means clustering is then developed to cluster all the extracted driving primitives into groups.
Experimental results show that 20 kinds of driving primitives are capable of representing the basic components of the driving encounters in our database.
This primitive-based analysis methodology potentially reveals underlying information of vehicle-vehicle encounters for self-driving applications.
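The clustering step described above combines dynamic time warping with k-means-style grouping. A minimal sketch of one plausible realization for one-dimensional primitive signals is given below; the paper's exact procedure is not reproduced here, and the names `dtw` and `cluster_primitives` are illustrative, not from the paper. A medoid-style update is used because plain k-means centroids are not well defined under DTW distances.

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def cluster_primitives(seqs, k, iters=20, seed=0):
    """k-medoids-style clustering on a pairwise DTW distance matrix."""
    rng = np.random.default_rng(seed)
    n = len(seqs)
    dist = np.array([[dtw(s, t) for t in seqs] for s in seqs])
    medoids = list(rng.choice(n, size=k, replace=False))
    for _ in range(iters):
        # Assign each primitive to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new = []
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                new.append(medoids[c])
                continue
            # New medoid: member minimizing within-cluster distance sum.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new.append(members[np.argmin(within)])
        if new == medoids:
            break
        medoids = new
    return np.argmin(dist[:, medoids], axis=1)
```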
Suicide is an important but often misunderstood problem, one that researchers are now seeking to better understand through social media.
Due in large part to the fuzzy nature of what constitutes suicidal risk, most supervised approaches for learning to automatically detect suicide-related activity in social media require a great deal of human labor to train.
However, humans themselves have diverse or conflicting views on what constitutes suicidal thoughts.
Obtaining reliable gold-standard labels is therefore fundamentally challenging and, we hypothesize, depends largely on what is asked of the annotators and which slice of the data they label.
We conducted multiple rounds of data labeling and collected annotations from crowdsourcing workers and domain experts.
We aggregated the resulting labels in various ways to train a series of supervised models.
Our preliminary evaluations show that using unanimously agreed labels from multiple annotators is helpful to achieve robust machine models.
Semantic parsing aims at mapping natural language to machine interpretable meaning representations.
Traditional approaches rely on high-quality lexicons, manually-built templates, and linguistic features which are either domain- or representation-specific.
In this paper we present a general method based on an attention-enhanced encoder-decoder model.
We encode input utterances into vector representations, and generate their logical forms by conditioning the output sequences or trees on the encoding vectors.
Experimental results on four datasets show that our approach performs competitively without using hand-engineered features and is easy to adapt across domains and meaning representations.
Software Defined Networking (SDN) can effectively improve the performance of traffic engineering and has promising application foreground in backbone networks.
New energy-saving schemes must therefore take SDN into account, which is especially important given the rapidly increasing energy consumption of telecom and ISP networks.
At the same time, the introduction of SDN in a current network must be incremental in most cases, for both technical and economic reasons.
During this period, operators have to manage hybrid networks, where SDN and traditional protocols coexist.
In this paper, we study the energy efficient traffic engineering problem in hybrid SDN/IP networks.
We first formulate a mathematical optimization model that accounts for the hybrid SDN/IP routing mode.
As the problem is NP-hard, we propose a fast heuristic algorithm named HEATE (Hybrid Energy-Aware Traffic Engineering).
In our proposed HEATE algorithm, the IP routers perform shortest-path routing using distributed OSPF link-weight optimization.
The SDN nodes perform multi-path routing with traffic-flow splitting coordinated by the global SDN controller.
The HEATE algorithm finds the optimal setting of the OSPF link weights and the splitting ratios of the SDN nodes.
Thus traffic is aggregated onto a subset of links, and the underutilized links can be turned off to save energy.
Computer simulation results show that our algorithm achieves a significant improvement in energy efficiency in hybrid SDN/IP networks.
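As background to the OSPF side of such schemes: once link weights are fixed, each IP router's forwarding reduces to shortest-path computation over those weights. The following minimal Dijkstra sketch is not the HEATE algorithm itself (whose weight and split-ratio optimization is described in the paper); it only illustrates the mechanism HEATE exploits, namely that raising the weight of an underutilized link pushes shortest paths away from it so the link can be emptied and switched off.

```python
import heapq

def ospf_shortest_path(graph, src, dst):
    """Dijkstra over OSPF link weights.
    graph: {node: {neighbor: weight}} for a directed weighted graph."""
    dist = {src: 0}
    prev = {}
    pq = [(0, src)]
    visited = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in visited:
            continue
        visited.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Reconstruct the path by walking the predecessor map backwards.
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return list(reversed(path)), dist[dst]
```

For example, with links A-B (weight 1), B-C (weight 1) and A-C (weight 4), traffic from A to C follows A-B-C; raising the B-C weight to 10 diverts it onto the direct A-C link, leaving B-C idle.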
Blockchain platforms, such as Ethereum, allow a set of actors to maintain a ledger of transactions without relying on a central authority and to deploy scripts, called smart contracts, that are executed whenever certain transactions occur.
These features can be used as basic building blocks for executing collaborative business processes between mutually untrusting parties.
However, implementing business processes using the low-level primitives provided by blockchain platforms is cumbersome and error-prone.
In contrast, established business process management systems, such as those based on the standard Business Process Model and Notation (BPMN), provide convenient abstractions for rapid development of process-oriented applications.
This article demonstrates how to combine the advantages of a business process management system with those of a blockchain platform.
The article introduces a blockchain-based BPMN execution engine, namely Caterpillar.
Like any BPMN execution engine, Caterpillar supports the creation of instances of a process model and allows users to monitor the state of process instances and to execute tasks thereof.
The specificity of Caterpillar is that the state of each process instance is maintained on the (Ethereum) blockchain and the workflow routing is performed by smart contracts generated by a BPMN-to-Solidity compiler.
The Caterpillar compiler supports a large array of BPMN constructs, including subprocesses, multi-instance activities and event handlers.
The paper describes the architecture of Caterpillar, and the interfaces it provides to support the monitoring of process instances, the allocation and execution of work items, and the execution of service tasks.
With the increasing usage of smartphones, there is a corresponding increase in the phone metadata generated by individuals using these devices.
Managing the privacy of personal information on these devices can be a complex task.
Recent research has suggested the use of social and behavioral data for automatically recommending privacy settings.
This paper is the first effort to connect users' phone use metadata with their privacy attitudes.
Based on a 10-week long field study involving phone metadata collection via an app, and a survey on privacy attitudes, we report that an analysis of cell phone metadata may reveal vital clues to a person's privacy attitudes.
Specifically, a predictive model based on phone usage metadata significantly outperforms a comparable personality features-based model in predicting individual privacy attitudes.
The results motivate a newer direction of automatically inferring a user's privacy attitudes by looking at their phone usage characteristics.
The Turing Test (TT) checks for human intelligence, rather than any putative general intelligence.
It involves repeated interaction requiring learning in the form of adaption to the human conversation partner.
It is a macro-level post-hoc test in contrast to the definition of a Turing Machine (TM), which is a prior micro-level definition.
This raises the question of whether learning is just another computational process, i.e. can be implemented as a TM.
Here we argue that learning or adaption is fundamentally different from computation, though it does involve processes that can be seen as computations.
To illustrate this difference we compare (a) designing a TM and (b) learning a TM, defining them for the purpose of the argument.
We show that there is a well-defined sequence of problems which are not effectively designable but are learnable, in the form of the bounded halting problem.
Some characteristics of human intelligence are reviewed, including its interactive nature, learning abilities, imitative tendencies, linguistic ability and context-dependency.
A story that explains some of these is the Social Intelligence Hypothesis.
If this is broadly correct, this points to the necessity of a considerable period of acculturation (social learning in context) if an artificial intelligence is to pass the TT.
Whilst it is always possible to 'compile' the results of learning into a TM, this would not be a designed TM and would not be able to continually adapt (pass future TTs).
We conclude three things, namely that: a purely "designed" TM will never pass the TT; that there is no such thing as a general intelligence, since intelligence necessarily involves learning; and that learning/adaption and computation should be clearly distinguished.
In this paper, a novel Quantum Double Delta Swarm (QDDS) algorithm, modeled after the mechanism of convergence to the center of the attractive potential field generated within a single well of a double Dirac delta well setup, is put forward and its preliminaries discussed.
Theoretical foundations and experimental illustrations are incorporated to provide a first basis for further development, specifically in the refinement of solutions and applicability to problems in high-dimensional spaces.
Simulations are carried out over varying dimensionality on four benchmark functions, viz. Rosenbrock, Rastrigin, Griewank and Sphere, as well as on the multidimensional Finite Impulse Response (FIR) filter design problem, with different population sizes.
Test results show that the algorithm yields superior results compared with some related reports in the literature, while reinforcing the need for substantial future work to deliver near-optimal results consistently, especially as dimensionality scales up.
We extend the notion of the distance to a measure from Euclidean space to probability measures on general metric spaces, as a way to perform topological data analysis that is robust to noise and outliers.
We then give an efficient way to approximate the sub-level sets of this function by a union of metric balls and extend previous results on sparse Rips filtrations to this setting.
This robust and efficient approach to topological data analysis is illustrated with several examples from an implementation.
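For the empirical measure on a finite sample with mass parameter m0 = k/n, the distance to a measure at a query point reduces to the root mean squared distance to the k nearest sample points. A minimal Euclidean sketch follows; the paper's contribution concerns general metric spaces and sparse sub-level-set approximations, which this does not cover. The example below shows the robustness property: at an outlier, the plain nearest-point distance is zero, while the distance to a measure stays large.

```python
import numpy as np

def distance_to_measure(x, points, k):
    """Empirical distance-to-measure at x: root mean squared distance
    from x to its k nearest sample points (mass parameter m0 = k/n)."""
    d2 = np.sort(np.sum((points - x) ** 2, axis=1))[:k]
    return np.sqrt(d2.mean())
```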
An overwhelming number of true and false news stories are posted and shared in social networks, and users diffuse the stories based on multiple factors.
Diffusion of news stories from one user to another depends not only on the stories' content and the genuineness but also on the alignment of the topical interests between the users.
In this paper, we propose a novel Bayesian nonparametric model that incorporates homogeneity of news stories as the key component that regulates the topical similarity between the posting and sharing users' topical interests.
Our model extends hierarchical Dirichlet process to model the topics of the news stories and incorporates Bayesian Gaussian process latent variable model to discover the homogeneity values.
We train our model on a real-world social network dataset and find homogeneity values of news stories that strongly relate to their labels of genuineness and their contents.
Finally, we show that the supervised version of our model predicts the labels of news stories better than the state-of-the-art neural network and Bayesian models.
arXiv is a popular pre-print server focusing on natural science disciplines (e.g. physics, computer science, quantitative biology).
As a platform focused on easy publishing services, it does not provide enhanced search functionality, but it offers programming interfaces that allow external parties to add such services.
This paper presents extensions of the open source framework arXiv Sanity Preserver (SP).
With respect to the original framework, it derestricts the topical focus and allows for text-based search and visualisation of all papers in arXiv.
To this end, all papers are stored in a unified back-end; the extension provides enhanced search and ranking facilities and allows the exploration of arXiv papers by a novel user interface.
This paper is concerned with how to make efficient use of social information to improve recommendations.
Most existing social recommender systems assume that people share similar preferences with their social friends, which, however, may not hold true due to the various motivations for making online friends and the dynamics of online social networks.
Inspired by recent causal process based recommendations that first model user exposures towards items and then use these exposures to guide rating prediction, we utilize social information to capture user exposures rather than user preferences.
We assume that people get information of products from their online friends and they do not have to share similar preferences, which is less restrictive and seems closer to reality.
Under this new assumption, in this paper, we present a novel recommendation approach (named SERec) to integrate social exposure into collaborative filtering.
We propose two methods to implement SERec, namely social regularization and social boosting, each with different ways to construct social exposures.
Experiments on four real-world datasets demonstrate that our methods outperform the state-of-the-art methods on top-N recommendations.
Further study compares the robustness and scalability of the two proposed methods.
We now advocate a novel physical layer security solution that is unique to our previously proposed GPSM scheme with the aid of the proposed antenna scrambling.
The novelty and contribution of our paper lies in three aspects: 1/ principle: we introduce a `security key' generated at Alice that is unknown to both Bob and Eve, where the design goal is that the publicly unknown security key imposes a barrier only for Eve.
2/ approach: we achieve this by conveying useful information only through the activation of RA indices, which is in turn concealed by the unknown security key in the form of randomly scrambled symbols used in place of the conventional modulated symbols in the GPSM scheme.
3/ design: we consider both Circular Antenna Scrambling (CAS) and Gaussian Antenna Scrambling (GAS) in detail, and the resultant security capacity of both designs is quantified and compared.
Provenance, or information about the sources, derivation, custody or history of data, has been studied recently in a number of contexts, including databases, scientific workflows and the Semantic Web.
Many provenance mechanisms have been developed, motivated by informal notions such as influence, dependence, explanation and causality.
However, there has been little study of whether these mechanisms formally satisfy appropriate policies or even how to formalize relevant motivating concepts such as causality.
We contend that mathematical models of these concepts are needed to justify and compare provenance techniques.
In this paper we review a theory of causality based on structural models that has been developed in artificial intelligence, and describe work in progress on using causality to give a semantics to provenance graphs.
Pulmonary vein isolation (PVI) is a common procedure for the treatment of atrial fibrillation (AF).
A successful isolation produces a continuous lesion (scar) completely encircling the veins that stops activation waves from propagating to the atrial body.
Unfortunately, the encircling lesion is often incomplete, becoming a combination of scar and gaps of healthy tissue.
These gaps are potential causes of AF recurrence, which requires a redo of the isolation procedure.
Late-gadolinium enhanced cardiac magnetic resonance (LGE-CMR) is a non-invasive method that may also be used to detect gaps, but it is currently a time-consuming process, prone to high inter-observer variability.
In this paper, we present a method to semi-automatically identify and quantify ablation gaps.
Gap quantification is performed through minimum path search in a graph where every node is a scar patch and the edges are the geodesic distances between patches.
We propose the Relative Gap Measure (RGM) to estimate the percentage of gap around a vein, which is defined as the ratio of the overall gap length and the total length of the path that encircles the vein.
Additionally, an advanced version of the RGM has been developed to integrate gap quantification estimates from different scar segmentation techniques into a single figure-of-merit.
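Under the definition above, the RGM is simply the gap fraction of the vein-encircling path. A minimal sketch follows; the function name and its inputs (precomputed geodesic gap lengths) are illustrative assumptions, not the paper's implementation. An RGM of 0 indicates a fully scarred, complete encirclement, while 1 indicates no scar at all.

```python
def relative_gap_measure(gap_lengths, total_path_length):
    """Relative Gap Measure: ratio of the overall gap length to the
    total length of the path encircling the vein.
    gap_lengths: geodesic lengths of healthy-tissue gaps along the path.
    total_path_length: length of the full encircling path."""
    if total_path_length <= 0:
        raise ValueError("path length must be positive")
    return sum(gap_lengths) / total_path_length
```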
Population-based statistical and regional analysis of gap distribution was performed using a standardised parcellation of the left atrium.
We have evaluated our method on synthetic and clinical data from 50 AF patients who underwent PVI with radiofrequency ablation.
The population-based analysis concluded that, in the processed data, the left superior PV is more prone to lesion gaps while the left inferior PV tends to have fewer gaps (p<0.05 in both cases).
This type of information can be very useful for the optimization and objective assessment of PVI interventions.
Resources for non-English languages are scarce, and this paper addresses this problem in the context of machine translation by automatically extracting parallel sentence pairs from the multilingual articles available on the Internet.
In this paper, we use an end-to-end Siamese bidirectional recurrent neural network to generate parallel sentences from comparable multilingual articles in Wikipedia.
Subsequently, we show that using the harvested dataset improves BLEU scores on both NMT and phrase-based SMT systems for the low-resource language pairs English--Hindi and English--Tamil, when compared to training exclusively on the limited bilingual corpora collected for these language pairs.
The deluge of data rates in today's networks imposes a cost burden on backhaul network design.
Developing cost efficient backhaul solutions becomes an exciting, yet challenging, problem.
Traditional technologies for backhaul networks include either radio-frequency backhauls (RF) or optical fibers (OF).
While RF is a cost-effective solution as compared to OF, it supports lower data rate requirements.
Another promising backhaul solution is the free-space optics (FSO) as it offers both a high data rate and a relatively low cost.
FSO, however, is sensitive to environmental conditions, e.g., rain, fog, and line-of-sight blockage.
This paper combines both RF and FSO advantages and proposes a hybrid RF/FSO backhaul solution.
It considers the problem of minimizing the cost of the backhaul network by choosing either OF or hybrid RF/FSO backhaul links between the base-stations (BS) so as to satisfy data rate, connectivity, and reliability constraints.
It shows that under a specified realistic assumption about the cost of OF and hybrid RF/FSO links, the problem is equivalent to a maximum weight clique problem, which can be solved with moderate complexity.
Simulation results show that the proposed solution shows a close-to-optimal performance, especially for practical prices of the hybrid RF/FSO links.
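For intuition about the reduction mentioned above: a maximum weight clique selects a set of mutually compatible choices with the largest total weight. The following brute-force reference solver is usable only on tiny instances; the moderate-complexity solvers the paper refers to are specialized algorithms, not this sketch, and the encoding of backhaul choices as nodes and weights is the paper's, not shown here.

```python
from itertools import combinations

def max_weight_clique(nodes, edges, weight):
    """Brute-force maximum-weight clique (exponential; reference only).
    nodes: iterable of hashable nodes.
    edges: set of frozenset({u, v}) pairs for an undirected graph.
    weight: dict mapping node -> weight."""
    nodes = list(nodes)
    best, best_w = [], float("-inf")
    for r in range(1, len(nodes) + 1):
        for cand in combinations(nodes, r):
            # A clique requires every pair of candidates to be adjacent.
            if all(frozenset(p) in edges for p in combinations(cand, 2)):
                w = sum(weight[v] for v in cand)
                if w > best_w:
                    best, best_w = list(cand), w
    return best, best_w
```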
The search engine is tightly coupled with social networks and is primarily designed for users to acquire information of interest.
Specifically, the search engine assists information dissemination in social networks, i.e., enabling users to access content of interest via keyword search and promoting the transfer of content from source users directly to potentially interested users.
Accompanying such processes, the social network evolves as new links emerge between users with common interests.
However, there is no clear understanding of such a "chicken-and-egg" problem, namely, new links encourage more social interactions, and vice versa.
In this paper, we aim to quantitatively characterize the social network evolution phenomenon driven by a search engine.
First, we propose a search network model for social network evolution.
Second, we adopt two performance metrics, namely, degree distribution and network diameter.
Theoretically, we prove that the degree distribution follows an intensified power-law, and the network diameter shrinks.
Third, we quantitatively show that the search engine accelerates the rumor propagation in social networks.
Finally, based on four real-world data sets (i.e., CDBLP, Facebook, Weibo Tweets, P2P), we verify our theoretical findings.
Furthermore, we find that the search engine dramatically increases the speed of rumor propagation.
Graph processing is becoming increasingly prevalent across many application domains.
In spite of this prevalence, there is little research about how graphs are actually used in practice.
We conducted an online survey aimed at understanding: (i) the types of graphs users have; (ii) the graph computations users run; (iii) the types of graph software users use; and (iv) the major challenges users face when processing their graphs.
We describe the participants' responses to our questions highlighting common patterns and challenges.
We further reviewed user feedback in the mailing lists, bug reports, and feature requests in the source repositories of a large suite of software products for processing graphs.
Through our review, we were able to answer some new questions that were raised by participants' responses and identify specific challenges that users face when using different classes of graph software.
The participants' responses and data we obtained revealed surprising facts about graph processing in practice.
In particular, real-world graphs represent a very diverse range of entities and are often very large, and scalability and visualization are undeniably the most pressing challenges faced by participants.
We hope these findings can guide future research.
Computational modeling of visual saliency has become an important research problem in recent years, with applications in video quality estimation, video compression, object tracking, retargeting, summarization, and so on.
While most visual saliency models for dynamic scenes operate on raw video, several models have been developed for use with compressed-domain information such as motion vectors and transform coefficients.
This paper presents a comparative study of eleven such models as well as two high-performing pixel-domain saliency models on two eye-tracking datasets using several comparison metrics.
The results indicate that highly accurate saliency estimation is possible based only on a partially decoded video bitstream.
The strategies that have shown success in compressed-domain saliency modeling are highlighted, and certain challenges are identified as potential avenues for further improvement.
The World Wide Web (WWW) allows people to share information (data) from large database repositories globally.
The amount of information has grown across billions of databases.
Searching this information requires specialized tools known generically as search engines.
Although many search engines are available today, retrieving meaningful information remains difficult.
To overcome this problem, semantic web technologies are playing a major role in enabling search engines to retrieve meaningful information intelligently.
In this paper, we present a survey of the search engine generations and the role of search engines in the intelligent web and semantic search technologies.
Computer algebra systems are a great help for mathematical research but sometimes unexpected errors in the software can also badly affect it.
As an example, we show how we detected an error in Mathematica when computing determinants of matrices of integer numbers: not only does it compute the determinants wrongly, but it also produces different results if one evaluates the same determinant twice.
Recognition of surgical gesture is crucial for surgical skill assessment and efficient surgery training.
Prior works on this task are based on either variant graphical models such as HMMs and CRFs, or deep learning models such as Recurrent Neural Networks and Temporal Convolutional Networks.
Most of the current approaches usually suffer from over-segmentation and therefore low segment-level edit scores.
In contrast, we present an essentially different methodology by modeling the task as a sequential decision-making process.
An intelligent agent is trained using reinforcement learning with hierarchical features from a deep model.
Temporal consistency is integrated into our action design and reward mechanism to reduce over-segmentation errors.
Experiments on the JIGSAWS dataset demonstrate that the proposed method performs better than state-of-the-art methods in terms of edit score and is on par in frame-wise accuracy.
Our code will be released later.
In this paper, we study the generation of maximal Poisson-disk sets with varying radii.
First, we present a geometric analysis of gaps in such disk sets.
This analysis is the basis for maximal and adaptive sampling in Euclidean space and on manifolds.
Second, we propose efficient algorithms and data structures to detect gaps and update gaps when disks are inserted, deleted, moved, or have their radius changed.
We build on the concepts of the regular triangulation and the power diagram.
Third, we show how our analysis can contribute to the state of the art in surface remeshing.
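As a contrast to the gap-detection machinery described above, the classical baseline for generating Poisson-disk sets with varying radii is naive dart throwing, which rejects conflicting candidates but gives no maximality guarantee; detecting and filling the remaining gaps is precisely what the paper's regular-triangulation and power-diagram analysis enables. The sketch below uses the sum of radii as the conflict criterion, one common choice among several.

```python
import math
import random

def dart_throwing(width, height, radius_fn, attempts=2000, seed=1):
    """Naive dart throwing for Poisson-disk sets with varying radii.
    A candidate disk is accepted if it conflicts with no accepted disk
    (conflict: center distance < sum of radii). Not guaranteed maximal;
    gaps may remain after all attempts are exhausted."""
    rng = random.Random(seed)
    disks = []
    for _ in range(attempts):
        x, y = rng.uniform(0, width), rng.uniform(0, height)
        r = radius_fn(x, y)  # radius may vary over the domain
        if all(math.hypot(x - px, y - py) >= r + pr for px, py, pr in disks):
            disks.append((x, y, r))
    return disks
```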
Aspect Term Extraction (ATE), a key sub-task in Aspect-Based Sentiment Analysis, aims to extract explicit aspect expressions from online user reviews.
We present a new framework for tackling ATE.
It can exploit two useful clues, namely opinion summary and aspect detection history.
Opinion summary is distilled from the whole input sentence, conditioned on each current token for aspect prediction, and thus the tailor-made summary can help aspect prediction on this token.
Another clue is the information of aspect detection history, and it is distilled from the previous aspect predictions so as to leverage the coordinate structure and tagging schema constraints to upgrade the aspect prediction.
Experimental results over four benchmark datasets clearly demonstrate that our framework can outperform all state-of-the-art methods.
Test functions are important to validate and compare the performance of optimization algorithms.
There have been many test or benchmark functions reported in the literature; however, there is no standard list or set of benchmark functions.
Ideally, test functions should have diverse properties so that they can be truly useful for testing new algorithms in an unbiased way.
For this purpose, we have reviewed and compiled a rich set of 175 benchmark functions for unconstrained optimization problems with diverse properties in terms of modality, separability, and valley landscape.
This is by far the most complete set of functions in the literature, and it can be expected that this set will be used for the validation of new optimization algorithms in the future.
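Four of the most widely used functions in such benchmark sets (Rosenbrock, Rastrigin, Griewank and Sphere, which also appear elsewhere in this collection) can be stated compactly. All four have a global minimum value of 0, attained at the all-ones point for Rosenbrock and at the origin for the others.

```python
import numpy as np

def sphere(x):
    """Unimodal, separable: sum of squares."""
    return float(np.sum(x ** 2))

def rosenbrock(x):
    """Unimodal valley landscape; minimum at x = (1, ..., 1)."""
    return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2
                        + (1.0 - x[:-1]) ** 2))

def rastrigin(x):
    """Highly multimodal, separable; minimum at the origin."""
    return float(10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))

def griewank(x):
    """Multimodal, non-separable product term; minimum at the origin."""
    i = np.arange(1, x.size + 1)
    return float(np.sum(x ** 2) / 4000.0
                 - np.prod(np.cos(x / np.sqrt(i))) + 1.0)
```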
The vast parallelism, exceptional energy efficiency and extraordinary information density inherent in DNA molecules are being explored for computing, data storage and cryptography.
DNA cryptography is an emerging field of cryptography.
In this paper, a novel encryption algorithm is devised based on number conversion, DNA digital coding and PCR amplification, which can effectively prevent attacks.
Data treatment is used to transform the plaintext into ciphertext, providing excellent security.
The present survey aims at presenting the current machine learning techniques employed in security game domains.
Specifically, we focus on papers and works developed by the Teamcore group at the University of Southern California, which has explored several directions in this field.
After a brief introduction to Stackelberg Security Games (SSGs) and the poaching setting, the rest of the work presents how to model a boundedly rational attacker by taking her human behavior into account, then describes how to face the problem of the attacker's payoffs being undefined and how to estimate them, and finally presents how online learning techniques have been exploited to learn a model of the attacker.
In the past few years, deep reinforcement learning has been proven to solve problems which have complex states like video games or board games.
The next step for intelligent agents would be the ability to generalize between tasks and to use prior experience to pick up new skills more quickly.
However, most current reinforcement learning algorithms suffer from catastrophic forgetting even when facing a very similar target task.
Our approach enables the agents to generalize knowledge from a single source task, and boost the learning progress with a semisupervised learning method when facing a new task.
We evaluate this approach on Atari games, which is a popular reinforcement learning benchmark, and show that it outperforms common baselines based on pre-training and fine-tuning.
This paper introduces a novel framework for modeling interacting humans in a multi-stage game.
This "iterated semi network-form game" framework has the following desirable characteristics: (1) Bounded rational players, (2) strategic players (i.e., players account for one another's reward functions when predicting one another's behavior), and (3) computational tractability even on real-world systems.
We achieve these benefits by combining concepts from game theory and reinforcement learning.
To be precise, we extend the bounded rational "level-K reasoning" model to apply to games over multiple stages.
Our extension allows the decomposition of the overall modeling problem into a series of smaller ones, each of which can be solved by standard reinforcement learning algorithms.
We call this hybrid approach "level-K reinforcement learning".
We investigate these ideas in a cyber battle scenario over a smart power grid and discuss the relationship between the behavior predicted by our model and what one might expect of real human defenders and attackers.
Neural program embeddings have shown much promise recently for a variety of program analysis tasks, including program synthesis, program repair, fault localization, etc.
However, most existing program embeddings are based on syntactic features of programs, such as raw token sequences or abstract syntax trees.
Unlike images and text, a program has an unambiguous semantic meaning that can be difficult to capture by only considering its syntax (i.e. syntactically similar programs can exhibit vastly different run-time behavior), which makes syntax-based program embeddings fundamentally limited.
This paper proposes a novel semantic program embedding that is learned from program execution traces.
Our key insight is that program states, expressed as sequential tuples of live variable values, not only capture program semantics more precisely, but also offer a more natural fit for Recurrent Neural Networks to model.
We evaluate different syntactic and semantic program embeddings on predicting the types of errors that students make in their submissions to an introductory programming class and two exercises on the CodeHunt education platform.
Evaluation results show that our new semantic program embedding significantly outperforms the syntactic program embeddings based on token sequences and abstract syntax trees.
In addition, we augment a search-based program repair system with the predictions obtained from our semantic embedding, and show that search efficiency is also significantly improved.
A convex network can be defined as a network such that every connected induced subgraph includes all the shortest paths between its nodes.
A fully convex network would therefore be a collection of cliques stitched together in a tree.
In this paper, we study the largest high-convexity part of empirical networks obtained by removing the least number of edges, which we call a convex skeleton.
A convex skeleton is a generalisation of a network spanning tree in which each edge can be replaced by a clique of arbitrary size.
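The convexity notion above can be checked directly on small graphs: a node subset is convex if every shortest path (in the full graph) between two of its members stays inside the subset. Below is a BFS-based check; the name `is_convex` is illustrative and not from the paper, and this sketch is a verifier, not the paper's skeleton-extraction method.

```python
from collections import deque
from itertools import combinations

def is_convex(adj, subset):
    """Check that the induced subgraph on `subset` contains all
    shortest paths (in the full graph) between its nodes.
    adj: {node: set of neighbors} for an undirected graph."""
    subset = set(subset)
    for s, t in combinations(subset, 2):
        # BFS from s, recording all predecessors on shortest paths.
        dist, preds = {s: 0}, {s: set()}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    preds[v] = {u}
                    q.append(v)
                elif dist[v] == dist[u] + 1:
                    preds[v].add(u)
        if t not in dist:
            return False  # s and t are disconnected
        # Walk back from t: every node on any shortest path must be inside.
        stack, seen = [t], set()
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            if u not in subset:
                return False
            stack.extend(preds[u])
    return True
```

For example, on a 4-cycle, the subset of three consecutive nodes is not convex, since one of the two shortest paths between the endpoints passes through the excluded node.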
We present different approaches for extracting convex skeletons and apply them to social collaboration and protein interaction networks, autonomous systems graphs, and food webs.
We show that the extracted convex skeletons retain the degree distribution, clustering, connectivity, distances, node positions, and community structure, while making the shortest paths between the nodes largely unique.
Moreover, in the Slovenian computer scientists' coauthorship network, a convex skeleton retains the strongest ties between the authors, unlike a spanning tree, a high-betweenness backbone, or a high-salience skeleton.
A convex skeleton thus represents a simple definition of a network backbone with applications in coauthorship and other social collaboration networks.
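To make the defining property concrete: a node subset S is convex exactly when every node lying on any shortest path between two members of S is itself in S. A minimal sketch of this check using plain BFS (the example graph and subsets are illustrative, not from the paper):

```python
from collections import deque

def bfs_dist(adj, src):
    """Shortest-path distances (in hops) from src to every reachable node."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_convex(adj, S):
    """S is convex iff every node on any shortest path between
    two members of S is itself a member of S."""
    S = set(S)
    dist = {u: bfs_dist(adj, u) for u in adj}
    for u in S:
        for v in S:
            d = dist[u].get(v)
            if d is None:
                continue
            for w in dist[u]:
                if w in S:
                    continue
                dw = dist[w].get(v)
                if dw is not None and dist[u][w] + dw == d:
                    return False  # w lies on a shortest u-v path outside S
    return True

# 4-cycle 0-1-2-3-0: an edge is convex, a 3-node path around it is not.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(is_convex(cycle, {0, 1}))     # True
print(is_convex(cycle, {0, 1, 2}))  # False: 0-3-2 is also a shortest 0-2 path
```

A convex skeleton, by this definition, is a spanning subgraph in which the whole node set passes such a check after removing as few edges as possible.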
High-resolution depth maps can be inferred from low-resolution depth measurements and an additional high-resolution intensity image of the same scene.
To that end, we introduce a bimodal co-sparse analysis model, which is able to capture the interdependency of registered intensity and depth information.
This model is based on the assumption that the co-supports of corresponding bimodal image structures are aligned when computed by a suitable pair of analysis operators.
No analytic form of such operators exists, and we propose a method for learning them from a set of registered training signals.
This learning process is done offline and returns a bimodal analysis operator that is universally applicable to natural scenes.
We use this to exploit the bimodal co-sparse analysis model as a prior for solving inverse problems, which leads to an efficient algorithm for depth map super-resolution.
Biological plastic neural networks are systems of extraordinary computational capabilities shaped by evolution, development, and lifetime learning.
The interplay of these elements leads to the emergence of adaptive behavior and intelligence.
Inspired by such intricate natural phenomena, Evolved Plastic Artificial Neural Networks (EPANNs) use simulated evolution in silico to breed plastic neural networks with a large variety of dynamics, architectures, and plasticity rules: these artificial systems are composed of inputs, outputs, and plastic components that change in response to experiences in an environment.
These systems may autonomously discover novel adaptive algorithms, and lead to hypotheses on the emergence of biological adaptation.
EPANNs have seen considerable progress over the last two decades.
Current scientific and technological advances in artificial neural networks are now setting the conditions for radically new approaches and results.
In particular, the limitations of hand-designed networks could be overcome by more flexible and innovative solutions.
This paper brings together a variety of inspiring ideas that define the field of EPANNs.
The main methods and results are reviewed.
Finally, new opportunities and developments are presented.
Community detection is a key data analysis problem across different fields.
During the past decades, numerous algorithms have been proposed to address this issue.
However, most work on community detection does not address the issue of statistical significance.
Although some research efforts have been made towards mining statistically significant communities, deriving an analytical solution of the p-value for one community under the configuration model is still a challenging problem that remains unsolved.
To partially fill this void, we present a tight upper bound on the p-value of a single community under the configuration model, which can be used to quantify the statistical significance of each community analytically.
Meanwhile, we present a local search method to detect statistically significant communities in an iterative manner.
Experimental results demonstrate that our method is comparable with the competing methods on detecting statistically significant communities.
Convolutional Neural Networks (CNNs) are among the most successful methods in many areas, such as image classification.
However, the amount of memory and computation needed for CNN inference prevents them from running efficiently on mobile devices, which have limited memory and computational ability.
One method to compress CNNs is to compress the layers iteratively, i.e., layer-by-layer compression and fine-tuning, using CP decomposition of the convolutional layers.
To compress with CP-decomposition, rank selection is important.
In the previous approach, where rank selection is based on the sensitivity of each layer, the average rank of the network was still selected arbitrarily.
Additionally, the ranks of all layers were decided before the whole iterative compression process, even though the rank of a layer can change after fine-tuning.
Therefore, this paper proposes selecting the rank of each layer using Variational Bayesian Matrix Factorization (VBMF), which is more systematic than the arbitrary approach.
Furthermore, to account for the change in each layer's rank after fine-tuning, rank selection is applied just before compressing the target layer, i.e., after fine-tuning in the previous iteration.
The results show better accuracy along with a higher compression rate when compressing AlexNet's convolutional layers.
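VBMF derives an analytic threshold on the singular values of an unfolded weight tensor; the sketch below illustrates the flavor of such systematic, data-driven rank selection with a deliberately simplified stand-in rule (threshold a fixed fraction of the largest singular value), not the actual VBMF estimator:

```python
import numpy as np

def select_rank(kernel, tau=0.1):
    """Unfold a conv kernel (out_ch, in_ch, kh, kw) along the output-channel
    mode and count singular values above tau * largest singular value.
    NOTE: a crude stand-in for VBMF's analytically derived threshold."""
    out_ch = kernel.shape[0]
    unfolded = kernel.reshape(out_ch, -1)
    s = np.linalg.svd(unfolded, compute_uv=False)
    return int(np.sum(s > tau * s[0]))

rng = np.random.default_rng(0)
# Build a kernel that is exactly rank 3 along the output-channel mode.
A = rng.standard_normal((16, 3))
B = rng.standard_normal((3, 8 * 3 * 3))
kernel = (A @ B).reshape(16, 8, 3, 3)
print(select_rank(kernel))  # 3: only the three true components survive
```

The point of the paper's per-iteration application is that this selection would be re-run on each layer after the fine-tuning of the previous one, since fine-tuning can change the effective rank.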
The article describes the prospects of model base management system design automation for decision support systems and suggests a toolbox scheme for design automation based on intelligent technologies.
We here summarize our experience running a challenge with open data for musical genre recognition.
These notes motivate the task and the challenge design, show some statistics about the submissions, and present the results.
Full-reference image quality assessment (FR-IQA) techniques compare a reference and a distorted/test image and predict the perceptual quality of the test image in terms of a scalar value representing an objective score.
The evaluation of FR-IQA techniques is carried out by comparing the objective scores from the techniques with the subjective scores (obtained from human observers) provided in the image databases used for the IQA.
Hence, we reasonably assume that the goal of a human observer is to rate the distortion present in the test image.
Goal-oriented tasks are processed by the human visual system (HVS) through top-down processing, which actively searches for local distortions driven by the goal.
Therefore, local distortion measures in an image are important for top-down processing.
At the same time, bottom-up processing also takes place signifying spontaneous visual functions in the HVS.
To account for this, global perceptual features can be used.
Therefore, we hypothesize that the resulting objective score for an image can be derived from the combination of local and global distortion measures calculated from the reference and test images.
We calculate the local distortion by measuring the local correlation differences from the gradient and contrast information.
For global distortion, dissimilarity of the saliency maps computed from a bottom-up model of saliency is used.
The motivation behind the proposed approach has been thoroughly discussed, accompanied by an intuitive analysis.
Finally, experiments conducted on six benchmark databases suggest the effectiveness of the proposed approach, which achieves competitive performance with state-of-the-art methods and improves overall performance.
People are rated and ranked for algorithmic decision making in an increasing number of applications, typically based on machine learning.
Research on how to incorporate fairness into such tasks has prevalently pursued the paradigm of group fairness: giving adequate success rates to specifically protected groups.
In contrast, the alternative paradigm of individual fairness has received relatively little attention, and this paper advances this less explored direction.
The paper introduces a method for probabilistically mapping user records into a low-rank representation that reconciles individual fairness and the utility of classifiers and rankings in downstream applications.
Our notion of individual fairness requires that users who are similar in all task-relevant attributes (e.g., job qualification), disregarding all potentially discriminating attributes (e.g., gender), should have similar outcomes.
We demonstrate the versatility of our method by applying it to classification and learning-to-rank tasks on a variety of real-world datasets.
Our experiments show substantial improvements over the best prior work for this setting.
Recently, arithmetic coding has attracted the attention of many scholars because of its high compression capability.
Accordingly, in this paper a method which adds secrecy to this well-known source code is proposed.
Finite state arithmetic code (FSAC) is used as source code to add security.
Its finite state machine (FSM) characteristic is exploited to insert random jumps during the source coding process.
In addition, a Huffman code is designed for each state to make decoding possible even in jumps.
Being prefix-free, Huffman codes are useful for tracking the correct states when an authorized user decodes with the correct symmetric pseudo-random key.
The robustness of our proposed scheme is further reinforced by adding extra uncertainty through swapping the outputs of the Huffman codes in each state.
Several test images are used for inspecting the validity of the proposed Huffman Finite State Arithmetic Coding (HFSAC).
The results of several experiments, key-space analysis, statistical analysis, and key- and plaintext-sensitivity tests show that HFSAC, with little effect on compression efficiency, provides an efficient and secure way for real-time image encryption and transmission.
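The scheme leans on the prefix-free property of Huffman codes to keep the authorized decoder synchronized across the random state jumps. A minimal Huffman construction and a prefix-freeness check (the symbol frequencies are illustrative):

```python
import heapq

def huffman(freqs):
    """Build a Huffman code table {symbol: bitstring} from symbol frequencies."""
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, n, merged))
        n += 1
    return heap[0][2]

def is_prefix_free(code):
    """In sorted order, any prefix violation shows up in an adjacent pair."""
    words = sorted(code.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

code = huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
print(is_prefix_free(code))             # True: no codeword prefixes another
print(len(code["a"]) < len(code["f"]))  # True: frequent symbols get shorter codes
```

Prefix-freeness is what lets the decoder resynchronize on a per-state basis: as soon as a valid codeword for the current state is consumed, the symbol boundary is unambiguous.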
Nowadays, with the help of advanced technology, modern vehicles are made up not only of mechanical devices but also of highly complex electronic devices and connections to the outside world.
There are around 70 Electronic Control Units (ECUs) in a modern vehicle, communicating with each other over the standard communication protocol known as the Controller Area Network (CAN-Bus), which provides communication rates of up to 1 Mbps.
There are different types of in-vehicle network protocols and bus systems, namely the Controller Area Network (CAN), Local Interconnect Network (LIN), Media Oriented Systems Transport (MOST), and FlexRay.
Even though CAN-Bus is considered the de facto standard for in-vehicle network communication, it inherently lacks fundamental security features by design, such as message authentication.
This security limitation has paved the way for adversaries to penetrate the vehicle network and carry out malicious activities, posing danger to both driver and passengers.
In particular, modern vehicular networks are no longer closed systems; they are open to the outside world through external interfaces such as Bluetooth and GPS.
Therefore, it creates new opportunities for attackers to remotely take full control of the vehicle.
The objective of this research is to survey the current limitations of the CAN-Bus protocol in terms of secure communication, along with the solutions that researchers in the automotive community have proposed to overcome these limitations at different layers.
In this paper, the tracking control problem of a class of uncertain Euler-Lagrange systems subjected to unknown input delay and bounded disturbances is addressed.
To this end, a novel delay-dependent control law, referred to as Adaptive Robust Outer Loop Control (AROLC), is proposed.
Compared to conventional predictor-based approaches, the proposed controller can negotiate any input delay within a stipulated range, without knowing the delay or its variation.
The maximum allowable input delay is computed through Razumikhin-type stability analysis.
AROLC also provides robustness against disturbances due to input delay, parametric variations, and unmodelled dynamics through a switching control law.
The novel adaptive law allows the switching gain to adjust itself online in accordance with the tracking error, without any prior knowledge of the uncertainties.
The uncertain system, employing AROLC, is shown to be Uniformly Ultimately Bounded (UUB).
As a proof of concept, experiments are carried out on a nonholonomic wheeled mobile robot with various time-varying as well as fixed input delays, and the proposed controller achieves better tracking accuracy than the predictor-based methodology.
For many people suffering from motor disabilities, assistive devices controlled with only brain activity are the only way to interact with their environment.
Natural tasks often require different kinds of interactions, involving different controllers the user should be able to select in a self-paced way.
We developed a Brain-Computer Interface (BCI) allowing users to switch between four control modes in a self-paced way in real-time.
Since the system is devised to be used in domestic environments in a user-friendly way, we selected non-invasive electroencephalographic (EEG) signals and convolutional neural networks (CNNs), known for their ability to find the optimal features in classification tasks.
We tested our system using the Cybathlon BCI computer game, which embodies all the challenges inherent to real-time control.
Our preliminary results show that an efficient architecture (SmallNet), with only one convolutional layer, can classify 4 mental activities chosen by the user.
The BCI system is run and validated online.
It is kept up to date using newly collected signals during play, reaching an online accuracy of 47.6%, whereas most approaches only report results obtained offline.
We found that models trained with data collected online better predicted the behaviour of the system in real-time.
This suggests that similar (CNN based) offline classifying methods found in the literature might experience a drop in performance when applied online.
Compared to our previous decoder of physiological signals relying on blinks, we doubled the number of states among which the user can transition, opening the opportunity for finer, self-paced control of the specific subtasks composing natural grasping.
Our results are comparable to those shown at the Cybathlon BCI Race, but further improvements in accuracy are required.
Recently deep learning based recommendation systems have been actively explored to solve the cold-start problem using a hybrid approach.
However, the majority of previous studies proposed a hybrid model where collaborative filtering and content-based filtering modules are independently trained.
The end-to-end approach, which takes different modality data as input and jointly trains the model, can provide better optimization, but it has not been fully explored yet.
In this work, we propose a deep content-user embedding model, a simple and intuitive architecture that combines the user-item interaction and music audio content.
We evaluate the model on music recommendation and music auto-tagging tasks.
The results show that the proposed model significantly outperforms the previous work.
We also discuss various directions to improve the proposed model further.
We examine volume computation of general-dimensional polytopes and more general convex bodies, defined as the intersection of a simplex with one family of parallel hyperplanes and either another family of parallel hyperplanes or a family of concentric ellipsoids.
Such convex bodies appear in modeling and predicting financial crises.
The impact of crises on the economy (labor, income, etc.) makes their detection of prime interest.
Certain features of dependencies in the markets clearly identify times of turmoil.
We describe the relationship between asset characteristics by means of a copula; each characteristic is either a linear or quadratic form of the portfolio components, hence the copula can be constructed by computing volumes of convex bodies.
We design and implement practical algorithms in the exact and approximate setting, we experimentally juxtapose them and study the tradeoff of exactness and accuracy for speed.
We analyze the following methods in order of increasing generality: rejection sampling relying on uniform sampling of the simplex, which is the fastest approach but inaccurate for small volumes; exact formulae based on the computation of integrals of probability distribution functions; an optimized Lawrence sign decomposition method, since the polytopes at hand are shown to be simple; and Markov chain Monte Carlo algorithms using random walks based on the hit-and-run paradigm, generalized to nonlinear convex bodies and relying on new methods for computing an enclosed ball; the latter is experimentally extended to non-convex bodies with very encouraging results.
Our C++ software, based on CGAL and Eigen and available on GitHub, is shown to be very effective in up to 100 dimensions.
Our results offer novel, effective means of computing portfolio dependencies and an indicator of financial crises, which is shown to correctly identify past crises.
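The first and fastest of the listed methods is easy to sketch: draw uniform points in the unit simplex via normalized exponentials, keep those satisfying the extra hyperplane constraints, and scale the acceptance rate by the simplex volume 1/d!. The constraint below is an illustrative example, not one of the paper's copula models:

```python
import math, random

def uniform_simplex_point(d, rng):
    """Uniform point in {x >= 0, sum(x) <= 1}: draw d+1 exponentials,
    normalize, and drop the last coordinate."""
    e = [-math.log(1.0 - rng.random()) for _ in range(d + 1)]
    s = sum(e)
    return [x / s for x in e[:d]]

def volume_estimate(d, constraints, n, seed=0):
    """Estimate vol{x in simplex : a.x <= b for all (a, b)} by rejection."""
    rng = random.Random(seed)
    hits = sum(
        all(sum(ai * xi for ai, xi in zip(a, x)) <= b for a, b in constraints)
        for x in (uniform_simplex_point(d, rng) for _ in range(n))
    )
    simplex_vol = 1.0 / math.factorial(d)
    return simplex_vol * hits / n

# 2D: slice the triangle {x, y >= 0, x + y <= 1} with x <= 0.5.
# Exact volume: 1/2 - (1/2)(1/2)^2 = 0.375.
est = volume_estimate(2, [((1.0, 0.0), 0.5)], n=20000)
print(est)
```

As the abstract notes, this estimator degrades when the body occupies a small fraction of the simplex, since the acceptance rate (and hence the effective sample size) collapses.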
We present a novel method for obtaining high-quality, domain-targeted multiple choice questions from crowd workers.
Generating these questions can be difficult without trading away originality, relevance or diversity in the answer options.
Our method addresses these problems by leveraging a large corpus of domain-specific text and a small set of existing questions.
It produces model suggestions for document selection and answer distractor choice which aid the human question generation process.
With this method we have assembled SciQ, a dataset of 13.7K multiple choice science exam questions (Dataset available at http://allenai.org/data.html).
We demonstrate that the method produces in-domain questions by providing an analysis of this new dataset and by showing that humans cannot distinguish the crowdsourced questions from original questions.
When using SciQ as additional training data to existing questions, we observe accuracy improvements on real science exams.
In this paper, we propose a unified framework and an algorithm for the problem of group recommendation where a fixed number of items or alternatives can be recommended to a group of users.
The problem of group recommendation arises naturally in many real world contexts, and is closely related to the budgeted social choice problem studied in economics.
We frame the group recommendation problem as choosing a subgraph with the largest group consensus score in a completely connected graph defined over the item affinity matrix.
We propose a fast greedy algorithm with strong theoretical guarantees, and show that the proposed algorithm compares favorably to state-of-the-art group recommendation algorithms according to commonly used relevance and coverage performance measures on benchmark datasets.
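The paper's exact consensus objective is not reproduced here; as a minimal illustration of the greedy subgraph-selection idea, the sketch below assumes the consensus score of a subset is the group's average relevance of selected items plus the pairwise affinity among them (both forms are assumptions for illustration):

```python
import numpy as np

def greedy_group_rec(rel, aff, k):
    """Greedily pick k items maximizing an assumed consensus score:
    group-average relevance of picked items + pairwise affinity among them.
    rel: (n_users, n_items) relevance; aff: (n_items, n_items) affinity."""
    avg_rel = rel.mean(axis=0)
    chosen = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(len(avg_rel)):
            if i in chosen:
                continue
            # marginal gain of adding item i to the current subgraph
            gain = avg_rel[i] + sum(aff[i, j] + aff[j, i] for j in chosen)
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
    return chosen

rel = np.array([[5.0, 1.0, 0.0, 2.0],
                [4.0, 0.0, 1.0, 2.0]])
aff = np.full((4, 4), 0.1)  # mild uniform affinity between all items
picks = greedy_group_rec(rel, aff, k=2)
print(picks)  # [0, 3]: the two items with highest group-average relevance
```

Each greedy step costs O(n·k), so the whole selection is fast even for large catalogs, which is the appeal of this family of algorithms.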
Multiple-input multiple-output (MIMO) millimeter wave (mmWave) systems are vulnerable to hardware impairments due to operating at high frequencies and employing a large number of radio-frequency (RF) hardware components.
In particular, nonlinear power amplifiers (PAs) employed at the transmitter distort the signal when operated close to saturation due to energy efficiency considerations.
In this paper, we study the performance of a MIMO mmWave hybrid beamforming scheme in the presence of nonlinear PAs.
First, we develop a statistical model for the transmitted signal in such systems and show that the spatial direction of the inband distortion is shaped by the beamforming filter.
This suggests that even in the large antenna regime, where narrow beams can be steered toward the receiver, the impact of nonlinear PAs should not be ignored.
Then, by employing a realistic power consumption model for the PAs, we investigate the trade-off between spectral and energy efficiency in such systems.
Our results show that increasing the transmit power level when the number of transmit antennas grows large can be counter-effective in terms of energy efficiency.
Furthermore, using numerical simulation, we show that when the transmit power is large, analog beamforming leads to higher spectral and energy efficiency compared to digital and hybrid beamforming schemes.
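The paper's statistical PA model is not given here; a common nonlinearity used to emulate the described saturation behavior is the Rapp solid-state PA model, sketched below with illustrative parameters:

```python
import numpy as np

def rapp_pa(x, gain=1.0, v_sat=1.0, p=2.0):
    """Rapp PA model: near-linear for small inputs, amplitude saturating
    at v_sat for large ones (phase left undistorted here)."""
    a = gain * np.abs(x)
    am = a / (1.0 + (a / v_sat) ** (2 * p)) ** (1.0 / (2 * p))
    return am * np.exp(1j * np.angle(x))

small = rapp_pa(0.01 + 0j)   # deep in the backoff region
large = rapp_pa(100.0 + 0j)  # deep in saturation
print(abs(small))  # ~0.01: essentially linear
print(abs(large))  # ~1.0: clipped to v_sat
```

Driving the PA closer to v_sat improves energy efficiency but grows the distortion term, which is exactly the spectral-vs-energy-efficiency trade-off the abstract studies.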
This paper deals with a method for generating realistic labeled masses.
Recently, there have been many attempts to apply deep learning to various bio-image computing fields including computer-aided detection and diagnosis.
In order to learn deep network model to be well-behaved in bio-image computing fields, a lot of labeled data is required.
However, in many bioimaging fields, the large-size of labeled dataset is scarcely available.
Although a few researches have been dedicated to solving this problem through generative model, there are some problems as follows: 1) The generated bio-image does not seem realistic; 2) the variation of generated bio-image is limited; and 3) additional label annotation task is needed.
In this study, we propose a realistic labeled bio-image generation method through visual feature processing in latent space.
Experimental results have shown that mass images generated by the proposed method were realistic and had wide expression range of targeted mass characteristics.
Inter-domain routing is a crucial part of the Internet designed for arbitrary policies, economical models, and topologies.
This versatility translates into a substantially complex system that is hard to comprehend.
Monitoring the inter-domain routing infrastructure is however essential for understanding the current state of the Internet and improving it.
In this paper we design a methodology to answer two simple questions: Which are the common transit networks used to reach a certain AS?
How much does this AS depend on these transit networks?
To answer these questions, we digest AS paths advertised with the Border Gateway Protocol (BGP) into AS graphs and measure node centrality, that is, the likelihood of an AS to lie on paths between two other ASes.
Our proposal relies solely on the AS hegemony metric, a new way to quantify node centrality while taking into account the bias towards the partial view offered by BGP.
Our analysis using 14 years of BGP data refines our knowledge of Internet flattening and also exhibits the consolidated position of tier-1 networks in today's IPv4 and IPv6 Internet.
We also study the connectivity to two content providers (Google and Akamai) and investigate the AS dependency of networks hosting DNS root servers.
These case studies emphasize the benefits of the proposed method to assist ISPs in planning and assessing infrastructure deployment.
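The full AS hegemony metric corrects for the bias of BGP's partial view; its core idea, however, is simple to sketch: count how often each AS appears as a transit hop in the observed paths toward a target (the AS paths below are illustrative, and the viewpoint-bias correction is omitted):

```python
from collections import Counter

def transit_centrality(paths):
    """Fraction of observed AS paths in which each AS appears as a transit
    hop (neither origin nor destination). Simplified: no viewpoint-bias
    correction, unlike the actual AS hegemony metric."""
    counts = Counter()
    for path in paths:
        for asn in set(path[1:-1]):  # interior hops only
            counts[asn] += 1
    return {asn: c / len(paths) for asn, c in counts.items()}

# Toy AS paths from three viewpoints toward the same origin AS 64500
paths = [
    [64496, 64501, 64500],
    [64498, 64501, 64500],
    [64497, 64502, 64501, 64500],
]
scores = transit_centrality(paths)
print(scores[64501])  # 1.0: every observed path transits AS 64501
```

A score near 1 marks an AS the target fully depends on for transit, which is the kind of dependency the case studies on Google, Akamai, and DNS root servers quantify.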
Understanding the cognitive evolution of researchers as they progress in academia is an important but complex problem, belonging to a class of problems that often require the development of models to gain further understanding of the intricacies of the domain.
The research question that we address in this paper is how to effectively model this temporal cognitive mental development of prolific researchers.
Our proposed solution to this problem is based on noting that the academic progression and notability of a researcher are linked with a progressive increase in the citation count for the scholar's refereed publications quantified using indices such as the Hirsch index.
In other words, we propose that the yearly cognitive increment of a scholar be quantified using a function of the scholar's citation index, thereby treating the index as a discrete approximation of the scholar's cognitive development.
Using validated agent-based modeling, a paradigm presented as part of our previous work, i.e., the Cognitive Agent-based Computing framework, we present both formal and visual agent-based complex network representations of this cognitive evolution in the form of a Temporal Cognitive Level Network (TCLN) model.
As a proof of the effectiveness of this approach, we demonstrate validation of the model using historic data of citations.
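The index driving the proposed yearly increment, the Hirsch index, is straightforward to compute from a scholar's citation counts (the counts below are illustrative):

```python
def h_index(citations):
    """Largest h such that the scholar has h papers with >= h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    while h < len(cites) and cites[h] >= h + 1:
        h += 1
    return h

print(h_index([6, 5, 3, 1, 0]))  # 3: three papers with at least 3 citations
print(h_index([]))               # 0: no publications yet
```

Evaluating this per year over a scholar's publication history yields the discrete cognitive-level trajectory the TCLN model is built on.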
In general, the problem of finding a minimum spanning tree for a weighted directed graph is difficult but solvable.
There are many differences between the problems for directed and undirected graphs; therefore, algorithms for undirected graphs cannot usually be applied to the directed case.
In this paper, we examine the kinds of weights for which the problems are equivalent and a minimum spanning tree of a directed graph may be found by a simple algorithm for an undirected graph.
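For reference, the "simple algorithm for an undirected graph" alluded to can be any classical MST routine; a minimal Kruskal sketch with union-find (the example graph is illustrative):

```python
def kruskal(n, edges):
    """Minimum spanning tree of an undirected graph on nodes 0..n-1.
    edges: list of (weight, u, v). Returns (total_weight, tree_edges)."""
    parent = list(range(n))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # adding this edge creates no cycle
            parent[ru] = rv
            total += w
            tree.append((u, v))
    return total, tree

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
weight, tree = kruskal(4, edges)
print(weight)  # 7: edges 0-1, 1-2, 2-3
```

In the directed case this greedy edge-by-edge choice is generally invalid (an arborescence requires Edmonds' algorithm), which is why characterizing the weights for which the two problems coincide is the interesting question.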
In this paper, we present a learning based approach to depth fusion, i.e., dense 3D reconstruction from multiple depth images.
The most common approach to depth fusion is based on averaging truncated signed distance functions, which was originally proposed by Curless and Levoy in 1996.
While this method is simple and provides great results, it is not able to reconstruct (partially) occluded surfaces and requires a large number of frames to filter out sensor noise and outliers.
Motivated by the availability of large 3D model repositories and recent advances in deep learning, we present a novel 3D CNN architecture that learns to predict an implicit surface representation from the input depth maps.
Our learning based method significantly outperforms the traditional volumetric fusion approach in terms of noise reduction and outlier suppression.
By learning the structure of real world 3D objects and scenes, our approach is further able to reconstruct occluded regions and to fill in gaps in the reconstruction.
We demonstrate that our learning based approach outperforms both vanilla TSDF fusion as well as TV-L1 fusion on the task of volumetric fusion.
Further, we demonstrate state-of-the-art 3D shape completion results.
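The classical baseline being compared against fuses per-voxel truncated signed distances by running weighted averaging; a 1D sketch of that fusion rule (truncation distance, weights, and observations are illustrative):

```python
import numpy as np

def fuse_tsdf(tsdf_acc, w_acc, tsdf_new, w_new, trunc=1.0):
    """Running weighted average of truncated signed distance values,
    the volumetric fusion rule of Curless and Levoy (1996)."""
    tsdf_new = np.clip(tsdf_new, -trunc, trunc)
    fused = (w_acc * tsdf_acc + w_new * tsdf_new) / (w_acc + w_new)
    return fused, w_acc + w_new

# Two noisy 1D "depth" observations of a surface at x = 0
voxels = np.linspace(-2, 2, 5)
obs1 = np.clip(voxels + 0.2, -1, 1)  # this sensor reads the surface at -0.2
obs2 = np.clip(voxels - 0.2, -1, 1)  # this one reads it at +0.2
tsdf, w = fuse_tsdf(obs1, np.ones(5), obs2, np.ones(5))
print(tsdf[2])  # 0.0: the zero crossing (surface) averages out to x = 0
```

Averaging cancels zero-mean noise, but it has no notion of object shape, which is why occluded regions stay empty; the learned 3D CNN replaces this per-voxel rule with a prediction informed by shape priors.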
It has been believed that stochastic feedforward neural networks (SFNNs) have several advantages beyond deterministic deep neural networks (DNNs): they have more expressive power allowing multi-modal mappings and regularize better due to their stochastic nature.
However, training large-scale SFNN is notoriously harder.
In this paper, we aim at developing efficient training methods for SFNN, in particular using known architectures and pre-trained parameters of DNN.
To this end, we propose a new intermediate stochastic model, called Simplified-SFNN, which can be built upon any baseline DNN and approximates certain SFNNs by simplifying the upper latent units above the stochastic ones.
The main novelty of our approach is in establishing the connection between three models, i.e., DNN->Simplified-SFNN->SFNN, which naturally leads to an efficient training procedure of the stochastic models utilizing pre-trained parameters of DNN.
Using several popular DNNs, we show how they can be effectively transferred to the corresponding stochastic models for both multi-modal and classification tasks on MNIST, TFD, CASIA, CIFAR-10, CIFAR-100 and SVHN datasets.
In particular, we train a stochastic model of 28 layers and 36 million parameters; training such a large-scale stochastic network would be significantly challenging without using Simplified-SFNN.
Semantic segmentation of motion capture sequences plays a key part in many data-driven motion synthesis frameworks.
It is a preprocessing step in which long recordings of motion capture sequences are partitioned into smaller segments.
Afterwards, additional methods like statistical modeling can be applied to each group of structurally-similar segments to learn an abstract motion manifold.
The segmentation task however often remains a manual task, which increases the effort and cost of generating large-scale motion databases.
We therefore propose an automatic framework for semantic segmentation of motion capture data using a dilated temporal fully-convolutional network.
Our model outperforms a state-of-the-art model in action segmentation, as well as three networks for sequence modeling.
We further show that our model is robust against highly noisy training labels.
The predominant use of wireless access networks is for media streaming applications, which are only gaining popularity as ever more devices become available for this purpose.
However, current access networks treat all packets identically, and lack the agility to determine which clients are most in need of service at a given time.
Software reconfigurability of networking devices has seen wide adoption, and this in turn implies that agile control policies can now be instantiated on access networks.
The goal of this work is to design, develop, and demonstrate FlowBazaar, a market-based approach to create a value chain from the application on one side to algorithms operating over reconfigurable infrastructure on the other, so that applications are able to obtain the necessary resources for optimal performance.
Using YouTube video streaming as an example, we illustrate how FlowBazaar is able to adaptively provide such resources and attain a high QoE for all clients at a wireless access point.
This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting.
Inspired by network pruning techniques, we exploit redundancies in large deep networks to free up parameters that can then be employed to learn new tasks.
By performing iterative pruning and network re-training, we are able to sequentially "pack" multiple tasks into a single network while ensuring minimal drop in performance and minimal storage overhead.
Unlike prior work that uses proxy losses to maintain accuracy on older tasks, we always optimize for the task at hand.
We perform extensive experiments on a variety of network architectures and large-scale datasets, and observe much better robustness against catastrophic forgetting than prior work.
In particular, we are able to add three fine-grained classification tasks to a single ImageNet-trained VGG-16 network and achieve accuracies close to those of separately trained networks for each task.
Code available at https://github.com/arunmallya/packnet
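The core PackNet mechanic, freeing the lowest-magnitude weights so a later task can reuse them, can be sketched as iterative magnitude pruning with per-task masks (the pruning fraction is illustrative, and the real system retrains after each pruning round):

```python
import numpy as np

def prune_lowest(weights, free_mask, frac):
    """Zero out the lowest-magnitude fraction `frac` of the weights still
    owned by the current task (free_mask == True); the zeroed slots become
    the free capacity handed to the next task."""
    owned = np.flatnonzero(free_mask)
    n_prune = int(len(owned) * frac)
    order = owned[np.argsort(np.abs(weights[owned]))]
    to_free = order[:n_prune]
    weights = weights.copy()
    weights[to_free] = 0.0
    new_free = np.zeros_like(free_mask)
    new_free[to_free] = True
    return weights, new_free

rng = np.random.default_rng(1)
w = rng.standard_normal(100)
w1, free1 = prune_lowest(w, np.ones(100, dtype=bool), frac=0.5)
print(free1.sum())  # 50 slots freed for task 2
w2, free2 = prune_lowest(rng.standard_normal(100), free1, frac=0.5)
print(free2.sum())  # 25 slots remain free for task 3
```

Because weights surviving a round are frozen for their task, earlier tasks keep their exact parameters, which is how catastrophic forgetting is avoided without proxy losses.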
We present a methodology for fast prototyping of morphologies and controllers for robot locomotion.
Going beyond simulation-based approaches, we argue that the form and function of a robot, as well as their interplay with real-world environmental conditions are critical.
Hence, fast design and learning cycles are necessary to adapt robot shape and behavior to their environment.
To this end, we present a combination of laminate robot manufacturing and sample-efficient reinforcement learning.
We leverage this methodology to conduct an extensive robot learning experiment.
Inspired by locomotion in sea turtles, we design a low-cost crawling robot with variable, interchangeable fins.
Learning is performed using both bio-inspired and original fin designs in an artificial indoor environment as well as a natural environment in the Arizona desert.
The findings of this study show that static policies developed in the laboratory do not translate to effective locomotion strategies in natural environments.
In contrast to that, sample-efficient reinforcement learning can help to rapidly accommodate changes in the environment or the robot.
This paper presents a memory efficient architecture that implements the Multi-Scale Line Detector (MSLD) algorithm for real-time retinal blood vessel detection in fundus images on a Zynq FPGA.
This implementation benefits from the FPGA parallelism to drastically reduce the memory requirements of the MSLD from two images to a few values.
The architecture is optimized in terms of resource utilization by reusing the computations and optimizing the bit-width.
The throughput is increased by designing fully pipelined functional units.
The architecture achieves accuracy comparable to its software implementation while running 70x faster for low-resolution images.
For high-resolution images, it achieves an acceleration factor of 323x.
A human computation system can be viewed as a distributed system in which the processors are humans, called workers.
Such systems harness the cognitive power of a group of workers connected to the Internet to execute relatively simple tasks, whose solutions, once grouped, solve a problem that systems equipped with only machines could not solve satisfactorily.
Examples of such systems are Amazon Mechanical Turk and the Zooniverse platform.
A human computation application comprises a group of tasks, each of which can be performed by one worker.
Tasks might have dependencies among each other.
In this study, we propose a theoretical framework to analyze this type of application from a distributed systems point of view.
Our framework is established on three dimensions that represent different perspectives in which human computation applications can be approached: quality-of-service requirements, design and management strategies, and human aspects.
By using this framework, we review human computation in the perspective of programmers seeking to improve the design of human computation applications and managers seeking to increase the effectiveness of human computation infrastructures in running such applications.
In doing so, besides integrating and organizing what has been done in this direction, we also put into perspective the fact that the human aspects of the workers in such systems introduce new challenges in terms of, for example, task assignment, dependency management, and fault prevention and tolerance.
We discuss how they are related to distributed systems and other areas of knowledge.
In this paper we consider the uplink of a massive MIMO communication system using 5G New Radio-compliant multiple access, which is to co-exist with a radar system using the same frequency band.
We propose a system model taking into account the reverberation (clutter) produced by the radar system at the massive MIMO receiver.
Then, we propose several linear receivers for uplink data-detection, ranging from the simple channel-matched beamformer to the zero-forcing and linear minimum mean square error receivers for clutter disturbance rejection.
Our results show that the clutter may have a strong effect on the performance of the cellular communication system, but the use of large-scale antenna arrays at the base station is key to providing increased robustness against it, at least as far as data-detection is concerned.
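The three receiver families mentioned above have standard textbook forms. As a rough sketch (not the paper's clutter-aware derivation; the function names and the lumping of clutter into a single noise term are illustrative assumptions), they can be written as:

```python
import numpy as np

def mrc(H, y):
    """Channel-matched (maximum-ratio combining) receiver."""
    return H.conj().T @ y

def zf(H, y):
    """Zero-forcing receiver: pseudo-inverse of the channel matrix."""
    return np.linalg.pinv(H) @ y

def lmmse(H, y, sigma2):
    """Linear MMSE receiver; sigma2 lumps noise (and here, clutter) power."""
    HH = H.conj().T
    return np.linalg.solve(HH @ H + sigma2 * np.eye(H.shape[1]), HH @ y)
```

On a noiseless channel, zero-forcing recovers the transmitted symbols exactly, and LMMSE approaches it as the assumed noise power goes to zero.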
Data-driven analysis of complex networks has been a focus of research for decades.
An important question is to discover the relation between various network characteristics in real-world networks and how these relationships vary across network domains.
A related research question is to study how well the network models can capture the observed relations between the graph metrics.
In this paper, we apply statistical and machine learning techniques to answer the aforementioned questions.
We study 400 real-world networks along with 2400 networks generated by five frequently used network models with previously fitted parameters, to make each generated graph as similar as possible to its real counterpart.
We find that the correlation profiles of the structural measures significantly differ across network domains and the domain can be efficiently determined using a small selection of graph metrics.
The goodness-of-fit of the network models and the best performing models themselves highly depend on the domains.
Using machine learning techniques, we found it relatively easy to decide whether a network is real or model-generated.
We also investigate what structural properties make it possible to achieve a good accuracy, i.e. what features the network models cannot capture.
Deep neuroevolution and deep reinforcement learning (deep RL) algorithms are two popular approaches to policy search.
The former is widely applicable and rather stable, but suffers from low sample efficiency.
By contrast, the latter is more sample efficient, but the most sample efficient variants are also rather unstable and highly sensitive to hyper-parameter setting.
So far, these families of methods have mostly been compared as competing tools.
However, an emerging approach consists in combining them so as to get the best of both worlds.
Two previously existing combinations use either an ad hoc evolutionary algorithm or a goal exploration process together with the Deep Deterministic Policy Gradient (DDPG) algorithm, a sample efficient off-policy deep RL algorithm.
In this paper, we propose a different combination scheme using the simple cross-entropy method (CEM) and Twin Delayed Deep Deterministic policy gradient (TD3), another off-policy deep RL algorithm which improves over DDPG.
We evaluate the resulting method, CEM-RL, on a set of benchmarks classically used in deep RL.
We show that CEM-RL benefits from several advantages over its competitors and offers a satisfactory trade-off between performance and sample efficiency.
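The cross-entropy method at the heart of this combination is itself compact. A minimal sketch on a toy objective follows; in CEM-RL the samples would be policy parameters scored by environment rollouts and interleaved with TD3 gradient steps, which are omitted here:

```python
import numpy as np

def cem(objective, dim, n_iter=100, pop_size=30, elite_frac=0.2, seed=0):
    """Cross-entropy method: sample from a Gaussian, keep the elite
    fraction, and refit the Gaussian to the elites. A small noise floor
    on the standard deviation prevents premature collapse."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(n_iter):
        samples = rng.normal(mean, std, size=(pop_size, dim))
        scores = np.array([objective(s) for s in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]  # top performers
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 0.05
    return mean

# Toy objective with optimum at (3, 3).
best = cem(lambda p: -np.sum((p - 3.0) ** 2), dim=2)
```

The refit-to-elites loop is what makes CEM stable and widely applicable, at the cost of needing many rollouts per iteration, which is exactly the sample-efficiency gap the TD3 component addresses.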
Motion planning under differential constraints, also known as kinodynamic motion planning, is one of the canonical problems in robotics.
Currently, state-of-the-art methods evolve around kinodynamic variants of popular sampling-based algorithms, such as Rapidly-exploring Random Trees (RRTs).
However, there are still challenges remaining, for example, how to include complex dynamics while guaranteeing optimality.
If the open-loop dynamics are unstable, exploration by random sampling in control space becomes inefficient.
We describe a new sampling-based algorithm, called CL-RRT#, which leverages ideas from the RRT# algorithm and a variant of the RRT algorithm that generates trajectories using closed-loop prediction.
The idea of planning with closed-loop prediction allows us to handle complex unstable dynamics and avoids the need to find computationally hard steering procedures.
The search technique presented in the RRT# algorithm allows us to improve the solution quality by searching over alternative reference trajectories.
Numerical simulations using a nonholonomic system demonstrate the benefits of the proposed approach.
A number of methods have been proposed over the last decade for encoding information using deoxyribonucleic acid (DNA), giving rise to the emerging area of DNA data embedding.
Since a DNA sequence is conceptually equivalent to a sequence of quaternary symbols (bases), DNA data embedding (diversely called DNA watermarking or DNA steganography) can be seen as a digital communications problem where channel errors are tantamount to mutations of DNA bases.
Depending on the use of coding or noncoding DNA hosts, which, respectively, denote DNA segments that can or cannot be translated into proteins, DNA data embedding is essentially a problem of communications with or without side information at the encoder.
In this paper the Shannon capacity of DNA data embedding is obtained for the case in which DNA sequences are subject to substitution mutations modelled using the Kimura model from molecular evolution studies.
Inferences are also drawn with respect to the biological implications of some of the results presented.
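For the noncoding case (no side information at the encoder), the capacity under substitution mutations reduces to a symmetric-channel computation. The sketch below, under the Kimura two-parameter model, is illustrative of that reduction rather than the paper's full analysis:

```python
import numpy as np

def kimura_capacity(alpha, beta):
    """Capacity in bits/base of the memoryless substitution channel given
    by the Kimura two-parameter model: alpha is the transition probability
    (A<->G, C<->T) and beta each transversion probability. Every row of
    the 4x4 channel matrix is a permutation of the same vector, so the
    channel is symmetric and capacity is log2(4) minus the row entropy."""
    row = [1 - alpha - 2 * beta, alpha, beta, beta]
    h = -sum(p * np.log2(p) for p in row if p > 0)
    return 2.0 - h
```

With no mutations the capacity is the full 2 bits per base, and it decreases as the mutation rates grow.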
Given a graph where every node has certain attributes associated with it and some nodes have labels associated with them, Collective Classification (CC) is the task of assigning labels to every unlabeled node using information from the node as well as its neighbors.
It is often the case that a node is not only influenced by its immediate neighbors but also by higher order neighbors, multiple hops away.
Recent state-of-the-art models for CC learn end-to-end differentiable variations of Weisfeiler-Lehman (WL) kernels to aggregate multi-hop neighborhood information.
In this work, we propose a Higher Order Propagation Framework, HOPF, which provides an iterative inference mechanism for these powerful differentiable kernels.
Such a combination of classical iterative inference mechanism with recent differentiable kernels allows the framework to learn graph convolutional filters that simultaneously exploit the attribute and label information available in the neighborhood.
Further, these iterative differentiable kernels can scale to larger hops beyond the memory limitations of existing differentiable kernels.
We also show that existing WL kernel-based models suffer from the problem of Node Information Morphing where the information of the node is morphed or overwhelmed by the information of its neighbors when considering multiple hops.
To address this, we propose a specific instantiation of HOPF, called the NIP models, which preserves the node information at every propagation step.
The iterative formulation of NIP models further helps in incorporating distant hop information concisely as summaries of the inferred labels.
We do an extensive evaluation across 11 datasets from different domains.
We show that existing CC models do not provide consistent performance across datasets, while the proposed NIP model with iterative inference is more robust.
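The node-information-morphing effect can be seen in a few lines: with plain self-plus-neighbor averaging on a toy path graph, the share of a node's own features in its representation shrinks hop by hop. The propagation rule here is a deliberate simplification for illustration, not any specific WL-kernel model:

```python
import numpy as np

# Path graph on 5 nodes; propagation averages a node with its neighbors.
n = 5
A = np.zeros((n, n))
for u in range(n - 1):
    A[u, u + 1] = A[u + 1, u] = 1.0
P = A + np.eye(n)
P = P / P.sum(axis=1, keepdims=True)

H = np.eye(n)  # one-hot features: column j tracks node j's contribution
self_weight = []
for _ in range(6):
    H = P @ H
    self_weight.append(H[2, 2])  # middle node's share of its own features
# After six hops the node's own features account for under a third of its
# representation: they have been "morphed" by the neighborhood signal.
```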
This paper analyzes a well-known website, Book-Crossing, from two angles. The first angle focuses on the direct relations between users and books.
Many things can be inferred from this part of the analysis, such as who is more interested in reading than others, and why.
Which books are most popular and which users are most active and why?
The task requires the use of certain social network analysis measures (e.g. degree centrality).
What does it mean when two users like the same book?
Is it the same when other two users have one thousand books in common?
Who is more likely to be a friend of whom and why?
Are there specific people in the community who are more qualified to establish large circles of social relations?
These questions (and others) are answered through the second part of the analysis, which probes the potential social relations between users in this community.
Although these relationships do not exist explicitly, they can be inferred with the help of affiliation network analysis and techniques such as m-slice.
Spreadsheets are used to develop application software that is distributed to users.
Unfortunately, the users often have the ability to change the programming statements ("source code") of the spreadsheet application.
This causes a host of problems.
By critically examining the suitability of spreadsheet computer programming languages for application development, six "application development features" are identified, with source code protection being the most important.
We investigate the status of these features and discuss how they might be implemented in the dominant Microsoft Excel spreadsheet and in the new Google Spreadsheet.
Although Google Spreadsheet currently provides no source code control, its web-centric delivery model offers technical advantages for future provision of a rich set of features.
Excel has a number of tools that can be combined to provide "pretty good protection" of source code, but weak passwords reduce its robustness.
User access to Excel source code must be considered a programmer choice rather than an attribute of the spreadsheet.
Convolutional LDPC ensembles, introduced by Felstrom and Zigangirov, have excellent thresholds and these thresholds are rapidly increasing as a function of the average degree.
Several variations on the basic theme have been proposed to date, all of which share the good performance characteristics of convolutional LDPC ensembles.
We describe the fundamental mechanism which explains why "convolutional-like" or "spatially coupled" codes perform so well.
In essence, the spatial coupling of the individual code structure has the effect of increasing the belief-propagation (BP) threshold of the new ensemble to its maximum possible value, namely the maximum-a-posteriori (MAP) threshold of the underlying ensemble.
For this reason we call this phenomenon "threshold saturation."
This gives an entirely new way of approaching capacity.
One significant advantage of such a construction is that one can create capacity-approaching ensembles with an error correcting radius which is increasing in the blocklength.
Our proof makes use of the area theorem of the BP-EXIT curve and the connection between the MAP and BP threshold recently pointed out by Measson, Montanari, Richardson, and Urbanke.
Although we prove the connection between the MAP and the BP threshold only for a very specific ensemble and only for the binary erasure channel, empirically a threshold saturation phenomenon occurs for a wide class of ensembles and channels.
More generally, we conjecture that for a large range of graphical systems a similar saturation of the "dynamical" threshold occurs once individual components are coupled sufficiently strongly.
This might give rise to improved algorithms as well as to new techniques for analysis.
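The BP threshold that spatial coupling lifts to the MAP threshold can be computed for the BEC by scalar density evolution. A sketch for the uncoupled (3,6)-regular ensemble, whose BP threshold is approximately 0.4294 (versus a MAP threshold of approximately 0.4881):

```python
def bp_converges(eps, dv=3, dc=6, iters=5000):
    """Scalar density evolution for a (dv, dc)-regular LDPC ensemble on
    the binary erasure channel: the erasure probability of a
    variable-to-check message evolves as
    x <- eps * (1 - (1 - x)**(dc - 1))**(dv - 1)."""
    x = eps
    for _ in range(iters):
        x = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
        if x < 1e-7:
            return True
    return False

def bp_threshold(dv=3, dc=6, tol=1e-4):
    """Largest channel erasure probability for which BP succeeds,
    located by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bp_converges(mid, dv, dc):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Threshold saturation means the coupled version of this ensemble closes the gap between the BP value computed here and the MAP threshold.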
Health-related social media mining is a valuable tool for the early recognition of diverse adverse medical conditions.
Mostly, the existing methods are based on machine learning with knowledge-based learning.
This working note presents Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) based embeddings for automatic health text classification in social media mining.
For each task, two systems are built, each of which classifies tweets at the tweet level.
RNN and LSTM are used for extracting features and non-linear activation function at the last layer facilitates to distinguish the tweets of different categories.
The experiments are conducted on 2nd Social Media Mining for Health Applications Shared Task at AMIA 2017.
The experimental results are promising, and the proposed method is well suited to health text classification, primarily because it does not rely on any feature engineering mechanisms.
For many real-life Bayesian networks, common knowledge dictates that the output established for the main variable of interest increases with higher values for the observable variables.
We define two concepts of monotonicity to capture this type of knowledge.
We say that a network is isotone in distribution if the probability distribution computed for the output variable given specific observations is stochastically dominated by any such distribution given higher-ordered observations; a network is isotone in mode if a probability distribution given higher observations has a higher mode.
We show that establishing whether a network exhibits any of these properties of monotonicity is coNP^PP-complete in general, and remains coNP-complete for polytrees.
We present an approximate algorithm for deciding whether a network is monotone in distribution and illustrate its application to a real-life network in oncology.
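The "isotone in distribution" property rests on first-order stochastic dominance between output distributions. Checking dominance for distributions over ordered outcomes is a one-liner (a generic check, not the paper's approximate decision algorithm):

```python
import numpy as np

def stochastically_dominates(q, p):
    """True if q first-order stochastically dominates p over ordered
    outcomes, i.e. the CDF of q never exceeds the CDF of p."""
    return bool(np.all(np.cumsum(q) <= np.cumsum(p) + 1e-12))

# Shifting probability mass toward higher-ordered outcomes yields dominance:
low = [0.5, 0.3, 0.2]
high = [0.2, 0.3, 0.5]
```

A network is isotone in distribution when this relation holds between the output distributions computed for every pair of ordered observation vectors.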
Explainability and interpretability are two critical aspects of decision support systems.
Within computer vision, they are critical in certain tasks related to human behavior analysis such as in health care applications.
Despite their importance, it is only recently that researchers are starting to explore these aspects.
This paper provides an introduction to explainability and interpretability in the context of computer vision with an emphasis on looking at people tasks.
Specifically, we review and study those mechanisms in the context of first impressions analysis.
To the best of our knowledge, this is the first effort in this direction.
Additionally, we describe a challenge we organized on explainability in first impressions analysis from video.
We analyze in detail the newly introduced data set, the evaluation protocol, and summarize the results of the challenge.
Finally, derived from our study, we outline research opportunities that we foresee will be decisive in the near future for the development of the explainable computer vision field.
Secure multiparty computation (MPC) allows joint privacy-preserving computations on data of multiple parties.
Although MPC has been studied substantially, building solutions that are practical in terms of computation and communication cost is still a major challenge.
In this paper, we investigate the practical usefulness of MPC for multi-domain network security and monitoring.
We first optimize MPC comparison operations for processing high volume data in near real-time.
We then design privacy-preserving protocols for event correlation and aggregation of network traffic statistics, such as addition of volume metrics, computation of feature entropy, and distinct item count.
Optimizing performance of parallel invocations, we implement our protocols along with a complete set of basic operations in a library called SEPIA.
We evaluate the running time and bandwidth requirements of our protocols in realistic settings on a local cluster as well as on PlanetLab and show that they work in near real-time for up to 140 input providers and 9 computation nodes.
Compared to implementations using existing general-purpose MPC frameworks, our protocols are significantly faster, requiring, for example, 3 minutes for a task that takes 2 days with general-purpose frameworks.
This improvement paves the way for new applications of MPC in the area of networking.
Finally, we run SEPIA's protocols on real traffic traces of 17 networks and show how they provide new possibilities for distributed troubleshooting and early anomaly detection.
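The aggregation protocols rest on secret sharing. The simplest instance, additive sharing for summing volume metrics across providers, can be sketched as follows (a generic construction for illustration, not SEPIA's actual sharing scheme or API):

```python
import random

MOD = 2**31 - 1  # illustrative modulus

def share(secret, n, rng=random):
    """Split a value into n additive shares modulo MOD; any subset of
    fewer than n shares reveals nothing about the value."""
    shares = [rng.randrange(MOD) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

# Each input provider shares its traffic volume among the computation
# nodes; every node sums the shares it holds, and only the reconstructed
# total is revealed, never any individual value.
volumes = [100, 250, 17]
nodes = 3
per_node = [0] * nodes
for v in volumes:
    for i, s in enumerate(share(v, nodes)):
        per_node[i] = (per_node[i] + s) % MOD
total = sum(per_node) % MOD
```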
Peer review, evaluation, and selection is a fundamental aspect of modern science.
Funding bodies the world over employ experts to review and select the best proposals of those submitted for funding.
The problem of peer selection, however, is much more general: a professional society may want to give a subset of its members awards based on the opinions of all members; an instructor for a MOOC or online course may want to crowdsource grading; or a marketing company may select ideas from group brainstorming sessions based on peer evaluation.
We make three fundamental contributions to the study of procedures or mechanisms for peer selection, a specific type of group decision-making problem, studied in computer science, economics, and political science.
First, we propose a novel mechanism that is strategyproof, i.e., agents cannot benefit by reporting insincere valuations.
Second, we demonstrate the effectiveness of our mechanism by a comprehensive simulation-based comparison with a suite of mechanisms found in the literature.
Finally, our mechanism employs a randomized rounding technique that is of independent interest, as it solves the apportionment problem that arises in various settings where discrete resources such as parliamentary representation slots need to be divided proportionally.
Wit is a form of rich interaction that is often grounded in a specific situation (e.g., a comment in response to an event).
In this work, we attempt to build computational models that can produce witty descriptions for a given image.
Inspired by a cognitive account of humor appreciation, we employ linguistic wordplay, specifically puns, in image descriptions.
We develop two approaches which involve retrieving witty descriptions for a given image from a large corpus of sentences, or generating them via an encoder-decoder neural network architecture.
We compare our approach against meaningful baseline approaches via human studies and show substantial improvements.
We find that when a human is subject to similar constraints as the model regarding word usage and style, people rate the image descriptions generated by our model as slightly wittier than human-written witty descriptions.
Unsurprisingly, humans are almost always wittier than the model when they are free to choose the vocabulary, style, etc.
This paper attempts to explain consequences of the relational calculus not allowing relations to be domains of relations, and to suggest a solution for the issue.
Using SQL as an example, we describe the consequent problem of the multitude of different representations for relations; analyze in detail the disadvantages of the notions "TABLE" and "FOREIGN KEY"; and propose a comprehensive solution which includes a brand-new data language, abandonment of tables as a representation for relations, and a relatively small yet very significant alteration of the data storage concept, called a "multitable index".
We establish exact recovery for the Least Unsquared Deviations (LUD) algorithm of Ozyesil and Singer.
More precisely, we show that for sufficiently many cameras with given corrupted pairwise directions, where both camera locations and pairwise directions are generated by a special probabilistic model, the LUD algorithm exactly recovers the camera locations with high probability.
A similar exact recovery guarantee was established for the ShapeFit algorithm by Hand, Lee and Voroninski, but with typically less corruption.
This paper proposes a design for a hybrid, city-wide urban navigation system for moving agents demanding dedicated assistance.
The hybrid system combines GPS and vehicle-to-vehicle communication from an ad-hoc network of parked cars with RFID from fixed infrastructure, such as smart traffic lights, to enable a safely navigable city.
Applications for such a system include high-speed drone navigation and directing visually impaired pedestrians.
The Internet provides students with a unique opportunity to connect and maintain social ties with peers from other schools, irrespective of how far they are from each other.
However, little is known about the real structure of such online relationships.
In this paper, we investigate the structure of interschool friendship on a popular social networking site.
We use data from 36,951 students from 590 schools of a large European city.
We find that the probability of a friendship tie between students from neighboring schools is high and that it decreases with the distance between schools following a power law.
We also find that students are more likely to be connected if the educational outcomes of their schools are similar.
We show that this fact is not a consequence of residential segregation.
While high- and low-performing schools are evenly distributed across the city, this is not the case for the digital space, where schools turn out to be segregated by educational outcomes.
There is no significant correlation between the educational outcomes of a school and its geographical neighbors; however, there is a strong correlation between the educational outcomes of a school and its digital neighbors.
These results challenge the common assumption that the Internet is a borderless space, and may have important implications for the understanding of educational inequality in the digital age.
Mixed-criticality systems combine real-time components of different levels of criticality, i.e. severity of failure, on the same processor, in order to obtain good resource utilisation.
They must guarantee deadlines of highly-critical tasks at the expense of lower-criticality ones in the case of overload.
Present operating systems provide inadequate support for this kind of system, which is of growing importance in avionics and other verticals.
We present an approach that provides the required asymmetric integrity and its implementation in the high-assurance seL4 microkernel.
The infamous Facebook emotion contagion experiment is one of the most prominent and best-known online experiments based on the concept of what we here call "living labs".
In these kinds of experiments, real-world applications such as social web platforms trigger experimental switches inside their system to present experimental changes to their users - most of the time without the users being aware of their role as virtual guinea pigs.
In the Facebook example, the researchers changed the way users' personal timelines were compiled to test the influence on the users' moods and feelings.
The reactions to these experiments exposed the inherent ethical issues that such living-lab settings raise, mainly the study's lack of informed-consent procedures, as well as a more general critique of flaws in the experimental design.
In this chapter, we describe additional use cases: The so-called living labs that focus on experimentation with information systems such as search engines and wikis and especially on their real-world usage.
The living labs paradigm allows researchers to conduct research in real-world environments or systems.
In the field of information science and especially information retrieval - which is the scientific discipline that is concerned with the research of search engines, information systems, and search related algorithms and techniques - it is still common practice to perform in vitro or offline evaluations using static test collections.
Living labs are widely unknown or unavailable to academic researchers in these fields.
A main benefit of living labs is their potential to offer new ways and possibilities to experiment with information systems and especially their users, but on the other hand they introduce a whole set of ethical issues that we would like to address in this chapter.
One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters.
Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance.
To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used.
However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen.
In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning.
Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety.
It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability.
Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.
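The core safety rule is easy to state in code: fit a Gaussian process to the performance observations and only admit parameters whose lower confidence bound clears the threshold. The sketch below uses a hand-rolled scalar GP with arbitrary illustrative hyperparameters, not SafeOpt's actual implementation:

```python
import numpy as np

def gp_posterior(X, y, Xs, ell=0.3, sf=1.0, noise=1e-4):
    """GP regression with an RBF kernel on scalar inputs
    (hyperparameters here are illustrative choices)."""
    def k(A, B):
        d = A[:, None] - B[None, :]
        return sf ** 2 * np.exp(-0.5 * (d / ell) ** 2)
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = sf ** 2 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

def safe_candidates(X, y, Xs, threshold, beta=2.0):
    """SafeOpt-style rule: only parameters whose lower confidence bound
    mu - beta * sigma clears the safety threshold may be tried next."""
    mu, sigma = gp_posterior(X, y, Xs)
    return Xs[mu - beta * sigma >= threshold]
```

Candidates near well-performing observations stay in the safe set, while unexplored regions with high posterior uncertainty are excluded until nearby evaluations shrink their confidence intervals.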
Grammatical Evolution (GE) is a population-based evolutionary algorithm, where a formal grammar is used in the genotype to phenotype mapping process.
PonyGE2 is an open source implementation of GE in Python, developed at UCD's Natural Computing Research and Applications group.
It is intended as an advertisement and a starting-point for those new to GE, a reference for students and researchers, a rapid-prototyping medium for our own experiments, and a Python workout.
As well as providing the characteristic genotype to phenotype mapping of GE, a search algorithm engine is also provided.
A number of sample problems and tutorials on how to use and adapt PonyGE2 have been developed.
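The genotype-to-phenotype mapping at the core of GE fits in a few lines. The toy grammar and genome below (not PonyGE2's actual API) illustrate the codon-mod-rule-count scheme, including genome wrapping:

```python
# Each codon, taken modulo the number of productions for the leftmost
# non-terminal, selects that non-terminal's expansion.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["x"], ["1"]],
    "<op>": [["+"], ["*"]],
}

def ge_map(genome, start="<expr>", max_wraps=2):
    symbols, out, i, wraps = [start], [], 0, 0
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:
            out.append(sym)  # terminal symbol: emit it
            continue
        if i >= len(genome):
            if wraps >= max_wraps:
                raise ValueError("genome exhausted")
            i, wraps = 0, wraps + 1  # wrap: reread the genome
        rules = GRAMMAR[sym]
        choice = rules[genome[i] % len(rules)]
        i += 1
        symbols = list(choice) + symbols  # expand leftmost non-terminal
    return "".join(out)
```

For example, the genome [0, 1, 2, 1, 2] first expands `<expr>` into `<expr> <op> <expr>`, then resolves the pieces left to right, producing the phenotype "x+x".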
Handwriting recognition (HWR) is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens, and other devices.
In this paper, we use three classifiers to recognize handwriting: SVM, KNN, and a neural network.
Next Point-of-Interest (POI) recommendation is of great value for both location-based service providers and users.
Recently Recurrent Neural Networks (RNNs) have been proved to be effective on sequential recommendation tasks.
However, existing RNN solutions rarely consider the spatio-temporal intervals between neighbor check-ins, which are essential for modeling user check-in behaviors in next POI recommendation.
In this paper, we propose a new variant of LSTM, named STLSTM, which incorporates time gates and distance gates into LSTM to capture the spatio-temporal relation between successive check-ins.
Specifically, one time gate and one distance gate are designed to control the short-term interest update, and another time gate and distance gate are designed to control the long-term interest update.
Furthermore, to reduce the number of parameters and improve efficiency, we further integrate coupled input and forget gates with our proposed model.
Finally, we evaluate the proposed model using four real-world datasets from various location-based social networks.
Our experimental results show that our model significantly outperforms the state-of-the-art approaches for next POI recommendation.
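The gate structure can be sketched as a single cell update. The exact parameterization of the time and distance gates below is an illustrative assumption (one gate pair, simple affine dependence on the interval) rather than the paper's equations:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stlstm_step(x, h, c, dt, dd, W):
    """One step of an LSTM cell augmented with a time gate and a distance
    gate; dt and dd are the elapsed time and travelled distance since the
    previous check-in. W is a dict of (hypothetical) learned parameters."""
    i = sigmoid(W["Wi"] @ x + W["Ui"] @ h + W["bi"])  # input gate
    f = sigmoid(W["Wf"] @ x + W["Uf"] @ h + W["bf"])  # forget gate
    o = sigmoid(W["Wo"] @ x + W["Uo"] @ h + W["bo"])  # output gate
    g = np.tanh(W["Wg"] @ x + W["Ug"] @ h + W["bg"])  # candidate state
    # Spatio-temporal gates: the interval and distance modulate how much
    # of the new input reaches the cell state.
    T = sigmoid(W["Wt"] @ x + W["wt"] * dt + W["bt"])
    D = sigmoid(W["Wd"] @ x + W["wd"] * dd + W["bd"])
    c_new = f * c + i * T * D * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```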
Many imaging tasks require global information about all pixels in an image.
Conventional bottom-up classification networks globalize information by decreasing resolution; features are pooled and downsampled into a single output.
But for semantic segmentation and object detection tasks, a network must provide higher-resolution pixel-level outputs.
To globalize information while preserving resolution, many researchers propose the inclusion of sophisticated auxiliary blocks, but these come at the cost of a considerable increase in network size and computational cost.
This paper proposes stacked u-nets (SUNets), which iteratively combine features from different resolution scales while maintaining resolution.
SUNets leverage the information globalization power of u-nets in a deeper network architecture that is capable of handling the complexity of natural images.
SUNets perform extremely well on semantic segmentation tasks using a small number of parameters.
The safety, mobility, environmental, energy, and economic benefits of transportation systems, which are the focus of recent Connected Vehicles (CVs) programs, are potentially dramatic.
However, realization of these benefits largely hinges on the timely integration of the digital technology into the existing transportation infrastructure.
CVs must be enabled to broadcast and receive data to and from other CVs (Vehicle-to-Vehicle, or V2V communication), to and from infrastructure (Vehicle-to-Infrastructure, or V2I, communication) and to and from other road users, such as bicyclists or pedestrians (Vehicle-to-Other road users communication).
Further, for V2I-focused applications, the infrastructure and the transportation agencies that manage it must be able to collect, process, distribute, and archive these data quickly, reliably, and securely.
This paper focuses on V2I applications and studies current digital roadway infrastructure initiatives.
It highlights the importance of including digital infrastructure investment alongside investment in more traditional transportation infrastructure to keep up with the auto industry's push towards connecting vehicles to other vehicles.
By studying the current CV testbeds and Smart City initiatives, this paper identifies digital infrastructure components being used by public agencies.
It also examines public agencies' limited budgeting for digital infrastructure, and finds that current expenditure is inadequate for realizing the potential benefits of V2I applications.
Finally, the paper presents a set of recommendations, based on a review of current practices and future needs, designed to guide agencies responsible for transportation infrastructure.
Techniques for dense semantic correspondence have provided limited ability to deal with the geometric variations that commonly exist between semantically similar images.
While variations due to scale and rotation have been examined, practical solutions are lacking for more complex deformations, such as affine transformations, because of the tremendous size of the associated solution space.
To address this problem, we present a discrete-continuous transformation matching (DCTM) framework where dense affine transformation fields are inferred through a discrete label optimization in which the labels are iteratively updated via continuous regularization.
In this way, our approach draws solutions from the continuous space of affine transformations in a manner that can be computed efficiently through constant-time edge-aware filtering and a proposed affine-varying CNN-based descriptor.
Experimental results show that this model outperforms the state-of-the-art methods for dense semantic correspondence on various benchmarks.
Regular Path Queries (RPQs) are a type of graph query where answers are pairs of nodes connected by a sequence of edges matching a regular expression.
We study techniques for processing such queries on a distributed graph of data.
While many techniques assume the location of each data element (node or edge) is known, when the components of the distributed system are autonomous, the data will be arbitrarily distributed.
As the different query processing strategies are equivalently costly in the worst case, we isolate query-dependent cost factors and present a method to choose between strategies, using new query cost estimation techniques.
We evaluate our techniques using meaningful queries on biomedical data.
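To make the RPQ semantics concrete, here is a minimal sketch of the textbook product-automaton evaluation strategy (data graph × query automaton), written in Python with an illustrative hand-built DFA; it is a centralized baseline for intuition, not one of the distributed strategies the paper compares.

```python
from collections import deque

def rpq(edges, dfa_delta, start_state, accept_states):
    """Answer a Regular Path Query: return all node pairs (u, v) such that
    some path from u to v spells a word accepted by the DFA.
    edges: list of (src, label, dst); dfa_delta: dict (state, label) -> state."""
    adj = {}
    for u, a, v in edges:
        adj.setdefault(u, []).append((a, v))
    nodes = {u for u, _, _ in edges} | {v for _, _, v in edges}
    answers = set()
    for u in nodes:
        # BFS over the product of the data graph and the DFA.
        seen = {(u, start_state)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accept_states:
                answers.add((u, node))
            for label, nxt in adj.get(node, []):
                nstate = dfa_delta.get((state, label))
                if nstate is not None and (nxt, nstate) not in seen:
                    seen.add((nxt, nstate))
                    queue.append((nxt, nstate))
    return answers
```

For example, on a three-node graph, the query `a+b` (one or more `a`-edges followed by a `b`-edge) returns exactly the node pairs connected by a matching path.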
Sketching is an important activity for understanding, designing, and communicating different aspects of software systems such as their requirements or architecture.
Often, sketches start on paper or whiteboards, are revised, and may evolve into a digital version.
Users may then print a revised sketch, change it on paper, and digitize it again.
Existing tools focus on a paperless workflow, i.e., archiving analog documents, or rely on special hardware - they do not focus on integrating digital versions into the analog-focused workflow that many users follow.
In this paper, we present the conceptual design and a prototype of LivelySketches, a tool that supports the "round-trip" lifecycle of sketches from analog to digital and back.
The proposed workflow includes capturing both analog and digital sketches as well as relevant context information.
In addition, users can link sketches to other related sketches or documents.
They may access the linked artifacts and captured information using digital as well as augmented analog versions of the sketches.
We further present results from a formative user study with four students and outline possible directions for future work.
Cloud computing has emerged as a powerful and elastic platform for internet service hosting, yet it also draws concerns of the unpredictable performance of cloud-based services due to network congestion.
To offer predictable performance, the virtual cluster abstraction of cloud services has been proposed, which enables allocation and performance isolation regarding both computing resources and network bandwidth in a simplified virtual network model.
One issue arising in virtual cluster allocation is the survivability of tenant services against physical failures.
Existing works have studied virtual cluster backup provisioning with fixed primary embeddings, but have not considered the impact of primary embeddings on backup resource consumption.
To address this issue, in this paper we study how to embed virtual clusters survivably in the cloud data center, by jointly optimizing primary and backup embeddings of the virtual clusters.
We formally define the survivable virtual cluster embedding problem.
We then propose a novel algorithm, which computes the most resource-efficient embedding given a tenant request.
Since the optimal algorithm has high time complexity, we further propose a heuristic algorithm that is several orders of magnitude faster than the optimal solution, yet achieves similar performance.
Besides theoretical analysis, we evaluate our algorithms via extensive simulations.
A bicolored rectangular family (BRF) is the collection of all axis-parallel rectangles contained in a given region Z of the plane, formed by selecting a bottom-left corner from a set A and an upper-right corner from a set B.
We prove that the maximum independent set and the minimum hitting set of a BRF have the same cardinality and devise polynomial time algorithms to compute both.
As a direct consequence, we obtain the first polynomial time algorithm to compute minimum biclique covers, maximum cross-free matchings and jump numbers in a class of bipartite graphs that significantly extends convex bipartite graphs and interval bigraphs.
We also establish several connections between our work and other seemingly unrelated problems.
Furthermore, when the bicolored rectangular family is weighted, we show that the problem of finding the maximum weight of an independent set is NP-hard, and provide efficient algorithms to solve it on certain subclasses.
Unravelings are transformations from a conditional term rewriting system (CTRS, for short) over an original signature into an unconditional term rewriting system (TRS, for short) over an extended signature.
They are complete w.r.t. reduction, but they are not sound w.r.t. reduction for every CTRS.
Here, soundness w.r.t. reduction means that every reduction sequence of the corresponding unraveled TRS, of which the initial and end terms are over the original signature, can be simulated by the reduction of the original CTRS.
In this paper, we show that an optimized variant of Ohlebusch's unraveling for a deterministic CTRS is sound w.r.t. reduction if the corresponding unraveled TRS is left-linear or both right-linear and non-erasing.
We also show that soundness of the variant implies that of Ohlebusch's unraveling.
Finally, we show that soundness of Ohlebusch's unraveling is the weakest among those of the other unravelings and of a transformation, proposed by Serbanuta and Rosu, for (normal) deterministic CTRSs; i.e., soundness of each of them implies that of Ohlebusch's unraveling.
Deep image translation methods have recently shown excellent results, outputting high-quality images covering multiple modes of the data distribution.
There has also been increased interest in disentangling the internal representations learned by deep methods to further improve their performance and achieve a finer control.
In this paper, we bridge these two objectives and introduce the concept of cross-domain disentanglement.
We aim to separate the internal representation into three parts.
The shared part contains information for both domains.
The exclusive parts, on the other hand, contain only factors of variation that are particular to each domain.
We achieve this through bidirectional image translation based on Generative Adversarial Networks and cross-domain autoencoders, a novel network component.
Our model offers multiple advantages.
We can output diverse samples covering multiple modes of the distributions of both domains, perform domain-specific image transfer and interpolation, and perform cross-domain retrieval without the need for labeled data, requiring only paired images.
We compare our model to the state-of-the-art in multi-modal image translation and achieve better results for translation on challenging datasets as well as for cross-domain retrieval on realistic datasets.
This paper proposes a double-layered framework (or form of network) to integrate two mechanisms, termed consensus and conservation, achieving distributed solution of a linear equation.
The multi-agent framework considered in the paper is composed of clusters (which serve as a form of aggregating agent) and each cluster consists of a sub-network of agents.
By achieving consensus and conservation through agent-agent communications in the same cluster and cluster-cluster communications, distributed algorithms are devised for agents to cooperatively achieve a solution to the overall linear equation.
These algorithms outperform existing consensus-based algorithms, including but not limited to the following aspects: first, each agent does not have to know as much as a complete row or column of the overall equation; second, each agent only needs to control as few as two scalar states when the number of clusters and the number of agents are sufficiently large; third, the dimensions of agents' states in the proposed algorithms do not have to be the same (while in contrast, algorithms based on the idea of standard consensus inherently require all agents' states to be of the same dimension).
Both analytical proof and simulation results are provided to validate exponential convergence of the proposed distributed algorithms in solving linear equations.
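For reference, the consensus-based algorithms that this framework improves upon typically assume each agent knows a complete row of the equation. The following is a minimal sketch of such a standard projected-consensus baseline (consensus by neighbor averaging, the local constraint enforced by projection); the graph, matrix, and update rule are illustrative assumptions, not the paper's double-layered algorithm.

```python
import numpy as np

def consensus_solve(A, b, neighbors, iters=1000):
    """Projected-consensus baseline: agent i holds the full row (A[i], b[i])
    and a local estimate x_i; each step it averages neighbor estimates and
    projects the average onto its own hyperplane {x : A[i] @ x = b[i]}."""
    n = A.shape[0]
    X = np.zeros((n, A.shape[1]))                # one estimate per agent
    for _ in range(iters):
        X_new = np.empty_like(X)
        for i in range(n):
            m = X[list(neighbors[i])].mean(axis=0)   # consensus step
            a = A[i]
            m = m - (a @ m - b[i]) / (a @ a) * a     # enforce A[i] @ x = b[i]
            X_new[i] = m
        X = X_new
    return X

# Illustrative 3-agent example on a complete graph; unique solution (1, 1, 4).
A = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [1.0, 1.0, 1.0]])
b = np.array([2.0, 3.0, 6.0])
nbrs = {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2]}
X = consensus_solve(A, b, nbrs)
```

Note how each agent must store and manipulate a full n-dimensional state and a full row, which is exactly the requirement the proposed cluster-based algorithms relax.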
This paper presents a double jaw hand for industrial assembly.
The hand comprises two orthogonal parallel grippers with different mechanisms.
The inner gripper is made of a crank-slider mechanism which is compact and able to firmly hold objects like shafts.
The outer gripper is made of a parallelogram that has large stroke to hold big objects like pulleys.
The two grippers are connected by a prismatic joint along the hand's approaching vector.
The hand is able to hold two objects and perform in-hand manipulation like pull-in (insertion) and push-out (ejection).
This paper presents the detailed design and implementation of the hand, and demonstrates the advantages by performing experiments on two sets of peg-in-multi-hole assembly tasks as parts of the World Robot Challenge (WRC) 2018 using a bimanual robot.
Dense local descriptors and machine learning have been used with success in several applications, like classification of textures, steganalysis, and forgery detection.
We develop a new image forgery detector building upon descriptors recently proposed in the steganalysis field, suitably merging several of these descriptors, and optimizing an SVM classifier on the available training set.
Despite the very good performance, very small forgeries are hardly ever detected because they contribute very little to the descriptors.
Therefore, we also develop a simple but extremely specific copy-move detector based on region matching, and fuse the decisions so as to reduce the missed-detection rate.
Overall results appear to be extremely encouraging.
Software Defined Networks (SDN) provide vital benefits to network administrators by offering global visibility and network-wide control over the switching infrastructure of the network.
It is much more difficult to obtain the same benefits in the presence of middleboxes (MBs), due to (i) the lack of a proper topology discovery mechanism in environments with a mix of forwarding devices and middleboxes; (ii) the lack of generic APIs to abstract and gain control over these rigid and heterogeneous third-party middleboxes; and (iii) the lack of a generic network infrastructure framework to monitor and verify any specific device or path connectivity status in the network.
These limitations make the automation of network operations such as network-wide monitoring, policy enforcement, and rule placement much harder.
Hence, there is a strong push, even from middlebox vendors, to better handle the control and visibility aspects of the network in the presence of middleboxes.
In this paper, we propose a Unified network infrastructure framework for gaining global network visibility, by discovering the network topology in the presence of middleboxes, along with a framework to support the end-to-end path connectivity verification, independent of SDN.
We have also addressed security aspects and provided necessary APIs to support our framework.
This paper presents a novel method to predict future human activities from partially observed RGB-D videos.
Human activity prediction is generally difficult due to its non-Markovian property and the rich context between human and environments.
We use a stochastic grammar model to capture the compositional structure of events, integrating human actions, objects, and their affordances.
We represent the event by a spatial-temporal And-Or graph (ST-AOG).
The ST-AOG is composed of a temporal stochastic grammar defined on sub-activities, and spatial graphs representing sub-activities that consist of human actions, objects, and their affordances.
Future sub-activities are predicted using the temporal grammar and Earley parsing algorithm.
The corresponding action, object, and affordance labels are then inferred accordingly.
Extensive experiments are conducted to show the effectiveness of our model on both semantic event parsing and future activity prediction.
The LLVM compiler framework supports a selection of loop transformations such as vectorization, distribution and unrolling.
Each transformation is carried out by specialized passes that have been developed independently.
In this paper we propose an integrated approach to loop optimizations: A single dedicated pass that mutates a Loop Structure DAG.
Each transformation can make use of a common infrastructure such as dependency analysis, transformation preconditions, etc.
Multilingual topic models enable crosslingual tasks by extracting consistent topics from multilingual corpora.
Most models require parallel or comparable training corpora, which limits their ability to generalize.
In this paper, we first demystify the knowledge transfer mechanism behind multilingual topic models by defining an alternative but equivalent formulation.
Based on this analysis, we then relax the assumption of training data required by most existing models, creating a model that only requires a dictionary for training.
Experiments show that our new method effectively learns coherent multilingual topics from partially and fully incomparable corpora with limited amounts of dictionary resources.
Web intelligence can be considered a subset of Artificial Intelligence.
It uses existing data on the web to produce new data, knowledge, and wisdom to support decision making and new predictions for web users.
Artificial Intelligence is an ever-evolving field of computer science, and it is extensively used in a wide array of web-based business applications.
Although it is used substantially in web-based systems in developed countries, it has not been examined whether it is being used substantially in Sri Lanka.
Every Sri Lankan citizen depends on the public service throughout his or her lifetime, and at least three times: at birth, marriage, and death.
To provide most of these services to its citizens, the Sri Lankan government relies largely on its national web portal.
This paper presents a model to evaluate web intelligence capability based on weights assigned to key functionalities related to web intelligence.
The government websites were checked against the proposed criteria to assess the potential of using web intelligence technology to provide website-based services.
The results indicate that the open, public use of web intelligence techniques to provide web-based services to citizens through the government web portal is not satisfactory.
They also indicate that the lack of web intelligence technologies in public-service websites deprives both citizens and the government of most of the advantages such technology could provide.
In many advanced video-based applications, such as tracking and video surveillance, background modeling is a pre-processing step used to eliminate redundant data.
Over the past years, background subtraction has usually been based on low-level or hand-crafted features such as raw color components, gradients, or local binary patterns.
The performance of background subtraction algorithms suffers in the presence of challenges such as dynamic backgrounds, photometric variations, camera jitter, and shadows.
To handle these challenges and achieve accurate background modeling, we propose a unified framework based on image inpainting.
It is an unsupervised, hybrid Generative Adversarial algorithm for visual feature learning based on context prediction.
We also present a solution for random-region inpainting that fuses center-region and random-region inpainting using Poisson blending.
Furthermore, we evaluate foreground object detection by fusing our proposed method with morphological operations.
The comparison of our proposed method with 12 state-of-the-art methods shows its stability in the application of background estimation and foreground detection.
Automatic Speech Recognition (ASR) by machine is an attractive research topic in the signal processing domain and has attracted many researchers to contribute to this area.
In recent years, there have been many advances in automatic speech-reading systems that include both audio and visual speech features to recognize words under noisy conditions.
The objective of audio-visual speech recognition system is to improve recognition accuracy.
In this paper, we compute visual features using Zernike moments and audio features using Mel Frequency Cepstral Coefficients (MFCC) on the vVISWa (Visual Vocabulary of Independent Standard Words) dataset, which contains an isolated set of city names from 10 speakers.
The visual features were normalized and the dimension of the feature set was reduced by Principal Component Analysis (PCA) in order to recognize isolated word utterances in PCA space. Recognition of isolated words based on visual-only and audio-only features achieves 63.88% and 100% accuracy, respectively.
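As an illustration of the PCA dimensionality-reduction step described above, here is a minimal numpy sketch; the feature matrix is assumed to already contain the normalized Zernike-moment (or MFCC) features, which are computed elsewhere.

```python
import numpy as np

def pca_reduce(features, k):
    """Project mean-centered feature vectors onto the top-k principal
    components. features: (n_samples, n_features) array."""
    mean = features.mean(axis=0)
    centered = features - mean
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    components = Vt[:k]
    return centered @ components.T, components, mean
```

Recognition in "PCA space" then amounts to comparing the reduced coordinates of a test utterance against those of the training utterances.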
Recently, topic modeling has been widely used to discover the abstract topics in text corpora.
Most of the existing topic models are based on the assumption of a three-layer hierarchical Bayesian structure, i.e., each document is modeled as a probability distribution over topics, and each topic as a probability distribution over words.
However, the assumption is not optimal.
Intuitively, it is more reasonable to assume that each topic is a probability distribution over concepts, and each concept is a probability distribution over words, i.e., adding a latent concept layer between the topic layer and the word layer of the traditional three-layer assumption.
In this paper, we verify the proposed assumption by incorporating it into two representative topic models, obtaining two novel topic models.
Extensive experiments were conducted comparing the proposed models with the corresponding baselines, and the results show that the proposed models significantly outperform the baselines in terms of case study and perplexity, indicating that the new assumption is more reasonable than the traditional one.
We describe a simple but effective method for cross-lingual syntactic transfer of dependency parsers, in the scenario where a large amount of translation data is not available.
The method makes use of three steps: 1) a method for deriving cross-lingual word clusters, which can then be used in a multilingual parser; 2) a method for transferring lexical information from a target language to source language treebanks; 3) a method for integrating these steps with the density-driven annotation projection method of Rasooli and Collins (2015).
Experiments show improvements over the state-of-the-art in several languages used in previous work, in a setting where the only source of translation data is the Bible, a considerably smaller corpus than the Europarl corpus used in previous work.
Results using the Europarl corpus as a source of translation data show additional improvements over the results of Rasooli and Collins (2015).
We conclude with results on 38 datasets from the Universal Dependencies corpora.
This note explores the relation between the boxicity of undirected graphs and the Ferrers dimension of digraphs.
The unmanned air-vehicle (UAV) or mini-drones equipped with sensors are becoming increasingly popular for various commercial, industrial, and public-safety applications.
However, drones with uncontrolled deployment pose challenges for highly security-sensitive areas, such as presidential residences, nuclear plants, and commercial areas, because they can be used unlawfully.
In this article, to cope with security-sensitive challenges, we propose point-to-point and flying ad-hoc network (FANET) architectures to assist the efficient deployment of monitoring drones (MDr).
To capture an amateur drone (ADr), MDr must be able to detect, track, jam, and hunt the ADr in an efficient and timely manner.
We discuss the capabilities of the existing detection, tracking, localization, and routing schemes and also present the limitations in these schemes as further research challenges.
Moreover, the future challenges related to co-channel interference, channel model design, and cooperative schemes are discussed.
Our findings indicate that MDr deployment is necessary for countering ADr, and that intensive research and development is required to fill the gaps in the existing technologies.
Compressive sensing (CS) has recently emerged as a powerful framework for acquiring sparse signals.
The bulk of the CS literature has focused on the case where the acquired signal has a sparse or compressible representation in an orthonormal basis.
In practice, however, there are many signals that cannot be sparsely represented or approximated using an orthonormal basis, but that do have sparse representations in a redundant dictionary.
Standard results in CS can sometimes be extended to handle this case provided that the dictionary is sufficiently incoherent or well-conditioned, but these approaches fail to address the case of a truly redundant or overcomplete dictionary.
In this paper we describe a variant of the iterative recovery algorithm CoSaMP for this more challenging setting.
We utilize the D-RIP, a condition on the sensing matrix analogous to the well-known restricted isometry property.
In contrast to prior work, the method and analysis are "signal-focused"; that is, they are oriented around recovering the signal rather than its dictionary coefficients.
Under the assumption that we have a near-optimal scheme for projecting vectors in signal space onto the model family of candidate sparse signals, we provide provable recovery guarantees.
Developing a practical algorithm that can provably compute the required near-optimal projections remains a significant open problem, but we include simulation results using various heuristics that empirically exhibit superior performance to traditional recovery algorithms.
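For orientation, the classical CoSaMP iteration that this work generalizes can be sketched as follows. This is the standard Needell-Tropp algorithm for the sparse-in-an-orthonormal-basis setting; the paper's signal-space variant replaces the hard-thresholding and least-squares steps with near-optimal projections onto the model family of candidate sparse signals.

```python
import numpy as np

def cosamp(A, y, s, iters=10):
    """Classical CoSaMP: recover an s-sparse x from measurements y = A @ x."""
    n = A.shape[1]
    x = np.zeros(n)
    for _ in range(iters):
        r = y - A @ x                                # current residual
        proxy = A.T @ r                              # signal proxy
        omega = np.argsort(np.abs(proxy))[-2 * s:]   # largest 2s correlations
        T = np.union1d(omega, np.flatnonzero(x))     # merge supports
        z, *_ = np.linalg.lstsq(A[:, T], y, rcond=None)  # least squares on T
        x = np.zeros(n)
        keep = np.argsort(np.abs(z))[-s:]            # prune to s largest
        x[T[keep]] = z[keep]
        if np.linalg.norm(y - A @ x) < 1e-12:
            break
    return x
```

The D-RIP assumption plays the role, for redundant dictionaries, that the restricted isometry property plays in the recovery analysis of this classical iteration.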
We examine some variants of computation with closed timelike curves (CTCs), where various restrictions are imposed on the memory of the computer, and the information carrying capacity and range of the CTC.
We give full characterizations of the classes of languages recognized by polynomial time probabilistic and quantum computers that can send a single classical bit to their own past.
Such narrow CTCs are demonstrated to add the power of limited nondeterminism to deterministic computers, and lead to exponential speedup in constant-space probabilistic and quantum computation.
We show that, given a time machine with constant negative delay, one can implement CTC-based computations without the need to know about the runtime beforehand.
Learning to optimize - the idea that we can learn from data algorithms that optimize a numerical criterion - has recently been at the heart of a growing number of research efforts.
One of the most challenging issues within this approach is to learn a policy that is able to optimize over classes of functions that are fairly different from the ones that it was trained on.
We propose a novel way of framing learning to optimize as a problem of learning a good navigation policy on a partially observable loss surface.
To this end, we develop Rover Descent, a solution that allows us to learn a fairly broad optimization policy by training on a small set of prototypical two-dimensional surfaces that encompasses classically hard cases such as valleys, plateaus, cliffs, and saddles, using strictly zero-order information.
We show that, without having access to gradient or curvature information, we achieve state-of-the-art convergence speed on optimization problems not presented at training time such as the Rosenbrock function and other hard cases in two dimensions.
We extend our framework to optimize over high dimensional landscapes, while still handling only two-dimensional local landscape information and show good preliminary results.
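Rover Descent itself is not specified in enough detail here to reproduce; as a minimal illustration of strictly zero-order optimization on such two-dimensional landscapes, the following (1+1)-style random search uses only function evaluations, on the Rosenbrock function mentioned above. The step size and evaluation budget are illustrative assumptions.

```python
import numpy as np

def rosenbrock(p):
    x, y = p
    return (1 - x) ** 2 + 100.0 * (y - x ** 2) ** 2

def zero_order_search(f, x0, steps=2000, sigma=0.1, seed=0):
    """Simple (1+1) evolution strategy: propose a Gaussian perturbation and
    keep it only if it lowers the objective; no gradients or curvature used."""
    rng = np.random.default_rng(seed)
    x, fx = np.asarray(x0, float), f(x0)
    for _ in range(steps):
        cand = x + sigma * rng.standard_normal(x.size)
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
    return x, fx
```

A learned navigation policy, by contrast, chooses its moves from local observations of the loss surface rather than blind Gaussian proposals.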
We have employed a recent implementation of genetic algorithms to study a range of standard benchmark functions for global optimization.
It turns out that some of them are not very useful as challenging test functions, since they neither allow for a discrimination between different variants of genetic operators nor exhibit a dimensionality scaling resembling that of real-world problems, for example that of global structure optimization of atomic and molecular clusters.
The latter properties seem to be simulated better by two other types of benchmark functions.
One type is designed to be deceptive, exemplified here by Lunacek's function.
The other type offers additional advantages of markedly increased complexity and of broad tunability in search space characteristics.
For the latter type, we use an implementation based on randomly distributed Gaussians.
We advocate the use of these two types of test functions for algorithm development and benchmarking.
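A landscape built from randomly distributed Gaussians, of the kind advocated above, can be generated along the following lines; the coordinate ranges, peak heights, and widths here are illustrative assumptions, not the exact construction used in the study.

```python
import numpy as np

def make_gaussian_landscape(dim, n_peaks, seed=0, width=0.5):
    """Benchmark objective: negative sum of randomly placed Gaussian bumps.
    n_peaks and width tune the ruggedness of the search space."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform(-1.0, 1.0, size=(n_peaks, dim))
    heights = rng.uniform(0.5, 1.0, size=n_peaks)
    def f(x):
        d2 = ((np.asarray(x) - centers) ** 2).sum(axis=1)
        return -(heights * np.exp(-d2 / (2 * width ** 2))).sum()
    return f, centers, heights

f, centers, heights = make_gaussian_landscape(dim=2, n_peaks=20)
```

Increasing `n_peaks` and shrinking `width` yields more local minima and a harder, more deceptive landscape, which is the tunability property the text highlights.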
Motivated by recently derived fundamental limits on total (transmit + decoding) power for coded communication with VLSI decoders, this paper investigates the scaling behavior of the minimum total power needed to communicate over AWGN channels as the target bit-error-probability tends to zero.
We focus on regular-LDPC codes and iterative message-passing decoders.
We analyze scaling behavior under two VLSI complexity models of decoding.
One model abstracts power consumed in processing elements ("node model"), and another abstracts power consumed in wires which connect the processing elements ("wire model").
We prove that a coding strategy using regular-LDPC codes with Gallager-B decoding achieves order-optimal scaling of total power under the node model.
However, we also prove that regular-LDPC codes and iterative message-passing decoders cannot meet existing fundamental limits on total power under the wire model.
Further, if the transmit energy per bit is bounded, total power grows at a rate worse than that of uncoded transmission.
Complementing our theoretical results, we develop detailed physical models of decoding implementations using post-layout circuit simulations.
Our theoretical and numerical results show that approaching fundamental limits on total power requires increasing the complexity of both the code design and the corresponding decoding algorithm as communication distance is increased or error-probability is lowered.
In the context of 3D mapping, larger and larger point clouds are acquired with LIDAR sensors.
The Iterative Closest Point (ICP) algorithm is used to align these point clouds.
However, its complexity is directly dependent on the number of points to process.
Several strategies exist to address this problem by reducing the number of points.
However, they tend to underperform with non-uniform density, large sensor noise, spurious measurements, and large-scale point clouds, which is the case in mobile robotics.
This paper presents a novel sampling algorithm for the registration step of ICP, based on spectral decomposition analysis, called the Spectral Decomposition Filter (SpDF).
It preserves geometric information along the topology of point clouds and is able to scale to large environments with non-uniform density.
The effectiveness of our method is validated and illustrated by quantitative and qualitative experiments on various environments.
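The abstract does not give enough detail to reproduce SpDF itself; for contrast, here is one of the conventional subsampling strategies it is designed to improve upon: a voxel-grid filter, which keeps one centroid per occupied voxel and therefore does not adapt to non-uniform point density.

```python
import numpy as np

def voxel_grid_subsample(points, voxel_size):
    """Common ICP subsampling baseline: bucket points into cubic voxels and
    keep the centroid of each occupied voxel. points: (n, 3) array."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    buckets = {}
    for key, p in zip(map(tuple, keys), points):
        buckets.setdefault(key, []).append(p)
    return np.array([np.mean(b, axis=0) for b in buckets.values()])
```

With non-uniform density, the voxel size trades off detail in dense regions against over-decimation of sparse ones, which is the failure mode a density-aware filter like SpDF targets.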
We describe a method for attaching persistent metadata to an image.
The method can be interpreted as a template-based blind watermarking scheme, robust to common editing operations, namely: cropping, rotation, scaling, stretching, shearing, compression, printing, scanning, noise, and color removal.
Robustness is achieved through the reciprocity of the embedding and detection invariants.
The embedded patterns are real one-dimensional Mellin monomial patterns distributed over two dimensions.
The embedded patterns are scale invariant and can be directly embedded in an image by simple pixel addition.
Detection achieves rotation and general affine invariance by signal projection using implicit Radon transformation.
Embedded signals contract to one-dimension in the two-dimensional Fourier polar domain.
The real signals are detected by correlation with complex Mellin monomial templates.
Using a unique template of 4 chirp patterns we detect the affine signature with exquisite sensitivity and moderate security.
The practical implementation achieves efficiencies through fast Fourier transform (FFT) correspondences such as the projection-slice theorem, the FFT correlation relation, and fast resampling via the chirp-z transform.
The overall method utilizes orthodox spread spectrum patterns for the payload and performs well in terms of the classic robustness-capacity-visibility performance triangle.
Tags are entirely imperceptible with a mean SSIM greater than 0.988 in all cases tested.
Watermarked images survive almost all Stirmark attacks.
The method is ideal for attaching metadata robustly to both digital and analogue images.
The process of designing neural architectures requires expert knowledge and extensive trial and error.
While automated architecture search may simplify these requirements, the recurrent neural network (RNN) architectures generated by existing methods are limited in both flexibility and components.
We propose a domain-specific language (DSL) for use in automated architecture search which can produce novel RNNs of arbitrary depth and width.
The DSL is flexible enough to define standard architectures such as the Gated Recurrent Unit and Long Short Term Memory and allows the introduction of non-standard RNN components such as trigonometric curves and layer normalization.
Using two different candidate generation techniques, random search with a ranking function and reinforcement learning, we explore the novel architectures produced by the RNN DSL for language modeling and machine translation domains.
The resulting architectures do not follow human intuition yet perform well on their targeted tasks, suggesting the space of usable RNN architectures is far larger than previously assumed.
Optimization on manifolds is a rapidly developing branch of nonlinear optimization.
Its focus is on problems where the smooth geometry of the search space can be leveraged to design efficient numerical algorithms.
In particular, optimization on manifolds is well-suited to deal with rank and orthogonality constraints.
Such structured constraints appear pervasively in machine learning applications, including low-rank matrix completion, sensor network localization, camera network registration, independent component analysis, metric learning, dimensionality reduction and so on.
The Manopt toolbox, available at www.manopt.org, is a user-friendly, documented piece of software dedicated to simplifying experimentation with state-of-the-art Riemannian optimization algorithms.
We aim particularly at reaching practitioners outside our field.
Special scattered subwords, in which the gap lengths are taken from a given set, are defined.
The scattered subword complexity, which is the number of such scattered subwords, is computed for rainbow words.
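Under one natural reading of the definition (consecutive picked positions separated by a number of skipped letters taken from the given set), the scattered subword complexity of a rainbow word can be computed by a simple dynamic program. This sketch illustrates the counting problem only; it is not the closed-form result derived in the paper.

```python
def scattered_subword_count(word, gaps):
    """Count scattered subwords of `word` whose consecutive picked positions
    are separated by a gap (number of skipped letters) taken from `gaps`.
    For a rainbow word (all letters distinct), every admissible position
    sequence yields a distinct subword, so counting positions suffices."""
    n = len(word)
    # ways[i] = number of admissible subwords ending at position i
    ways = [1] * n                      # the single-letter subword at i
    for i in range(n):
        for j in range(i):
            if (i - j - 1) in gaps:     # gap between consecutive picks
                ways[i] += ways[j]
    return sum(ways)
```

For instance, with the gap set {0} the admissible subwords are exactly the nonempty contiguous factors, so a rainbow word of length n has n(n+1)/2 of them.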
This paper presents a wp-style calculus for obtaining expectations on the outcomes of (mutually) recursive probabilistic programs.
We provide several proof rules to derive one- and two-sided bounds for such expectations, and show the soundness of our wp-calculus with respect to a probabilistic pushdown automaton semantics.
We also give a wp-style calculus for obtaining bounds on the expected runtime of recursive programs that can be used to determine the (possibly infinite) time until termination of such programs.
The problem of state reconstruction and estimation is considered for a class of switched dynamical systems whose subsystems are modeled using linear differential-algebraic equations (DAEs).
Since this system class imposes time-varying dynamic and static (in the form of algebraic constraints) relations on the evolution of state trajectories, an appropriate notion of observability is presented which accommodates these phenomena.
Based on this notion, we first derive a formula for the reconstruction of the state of the system where we explicitly obtain an injective mapping from the output to the state.
In practice, such a mapping may be difficult to realize numerically and hence a class of estimators is proposed which ensures that the state estimate converges asymptotically to the real state of the system.
Real-world multi-agent planning problems cannot be solved with decision-theoretic planning methods due to their exponential complexity.
We approximate firefighting in rescue simulation as a spatially distributed task and model it as a multi-agent Markov decision process.
We use recent approximation methods for spatial task problems to reduce the model complexity.
Our approximations are single-agent, static task, shortest path pruning, dynamic planning horizon, and task clustering.
We create scenarios from RoboCup Rescue Simulation maps and evaluate our methods on these graph worlds.
The results show that our approach is faster and better than comparable methods and has negligible performance loss compared to the optimal policy.
We also show that our method has a similar performance as DCOP methods on example RCRS scenarios.
In this paper, we will demonstrate how Manhattan structure can be exploited to transform the Simultaneous Localization and Mapping (SLAM) problem, which is typically solved by a nonlinear optimization over feature positions, into a model selection problem solved by a convex optimization over higher order layout structures, namely walls, floors, and ceilings.
Furthermore, we show how our novel formulation leads to an optimization procedure that automatically performs data association and loop closure and which ultimately produces the simplest model of the environment that is consistent with the available measurements.
We verify our method on real world data sets collected with various sensing modalities.
The paper presents an original approach to concurrent optimization of the transmitting and receiving parts of adaptive communication systems (CS) with feedback channels.
Our results show that it is possible, and how, to design systems that transmit signals at a bit rate equal to the capacity of the forward channel under a given bit-error rate (BER).
These results can be used to design various classes of highly efficient, low-energy/size/cost CS, and they admit further development and extension.
Although the CSP (constraint satisfaction problem) is NP-complete, even in the case when all constraints are binary, certain classes of instances are tractable.
We study classes of instances defined by excluding subproblems.
This approach has recently led to the discovery of novel tractable classes.
The complete characterisation of all tractable classes defined by forbidding patterns (where a pattern is simply a compact representation of a set of subproblems) is a challenging problem.
We demonstrate a dichotomy in the case of forbidden patterns consisting of either one or two constraints.
This has allowed us to discover new tractable classes including, for example, a novel generalisation of 2SAT.
We investigate weak recognizability of deterministic languages of infinite trees.
We prove that for deterministic languages the Borel hierarchy and the weak index hierarchy coincide.
Furthermore, we propose a procedure that computes, for a deterministic automaton, an equivalent weak automaton of minimal index with a quadratic number of states.
The algorithm works within the time of solving the emptiness problem.
In this work, we propose a novel framework for privacy-preserving client-distributed machine learning.
It is motivated by the desire to achieve differential privacy guarantees in the local model of privacy in a way that satisfies all systems constraints using asynchronous client-server communication and provides attractive model learning properties.
We call it "Draw and Discard" because it relies on random sampling of models for load distribution (scalability), which also provides additional server-side privacy protections and improved model quality through averaging.
We present the mechanics of client and server components of "Draw and Discard" and demonstrate how the framework can be applied to learning Generalized Linear models.
We then analyze the privacy guarantees provided by our approach against several types of adversaries and showcase experimental results that provide evidence for the framework's viability in practical deployments.
We present a language independent, unsupervised approach for transforming word embeddings from source language to target language using a transformation matrix.
Our model handles the problem of data scarcity which is faced by many languages in the world and yields improved word embeddings for words in the target language by relying on transformed embeddings of words of the source language.
We initially evaluate our approach via word similarity tasks on a similar language pair - Hindi as source and Urdu as the target language, while we also evaluate our method on French and German as target languages and English as source language.
Our approach improves the current state of the art results - by 13% for French and 19% for German.
For Urdu, we saw an increment of 16% over our initial baseline score.
We further explore the prospects of our approach by applying it on multiple models of the same language and transferring words between the two models, thus solving the problem of missing words in a model.
We evaluate this on word similarity and word analogy tasks.
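The core of the transformation step described above can be sketched as an ordinary least-squares fit of a mapping matrix over a seed dictionary. The toy vectors below are illustrative stand-ins, not the actual Hindi/Urdu embeddings, and the closed-form solve is one common choice rather than necessarily the authors' exact training procedure.

```python
import numpy as np

# Toy source/target embeddings for a small seed dictionary (rows are words).
rng = np.random.default_rng(0)
W_true = rng.normal(size=(4, 4))          # hidden "ground-truth" mapping
X_src = rng.normal(size=(20, 4))          # source-language word vectors
Y_tgt = X_src @ W_true                    # target-language vectors (noise-free toy)

# Learn the transformation matrix by least squares: argmin_W ||X_src W - Y_tgt||_F
W, *_ = np.linalg.lstsq(X_src, Y_tgt, rcond=None)

# A new source-language word can now be projected into the target space.
new_word = rng.normal(size=(1, 4))
projected = new_word @ W
```

With noisy real embeddings one would typically constrain W to be orthogonal (the Procrustes variant), which tends to be more robust than the unconstrained fit above.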
Image Forensics has already achieved great results for the source camera identification task on images.
Standard approaches for data coming from Social Network Platforms cannot be applied due to different processes involved (e.g., scaling, compression, etc.).
Over 1 billion images are shared on the Internet each day, and information about their history from the moment they were acquired could be exploited for investigation purposes.
In this paper, a classification engine for the reconstruction of the history of an image is presented.
Specifically, exploiting K-NN and decision-tree classifiers and a priori knowledge acquired through image analysis, we propose an automatic approach that can determine which Social Network Platform has processed an image and which software application was used to upload it.
The engine makes use of proper alterations introduced by each platform as features.
Results, in terms of global accuracy on a dataset of 2720 images, confirm the effectiveness of the proposed strategy.
Mined Semantic Analysis (MSA) is a novel concept space model which employs unsupervised learning to generate semantic representations of text.
MSA represents textual structures (terms, phrases, documents) as a Bag of Concepts (BoC) where concepts are derived from concept rich encyclopedic corpora.
Traditional concept space models exploit only target corpus content to construct the concept space.
MSA, alternatively, uncovers implicit relations between concepts by mining for their associations (e.g., mining Wikipedia's "See also" link graph).
We evaluate MSA's performance on benchmark datasets for measuring semantic relatedness of words and sentences.
Empirical results show competitive performance of MSA compared to prior state-of-the-art methods.
Additionally, we introduce the first analytical study to examine statistical significance of results reported by different semantic relatedness methods.
Our study shows that the differences in results across top-performing methods can be statistically insignificant.
The study positions MSA as one of the state-of-the-art methods for measuring semantic relatedness, in addition to the inherent interpretability and simplicity of the generated semantic representation.
This paper presents a deep-learning based framework for addressing the problem of accurate cloud detection in remote sensing images.
This framework benefits from a Fully Convolutional Neural Network (FCN), which is capable of pixel-level labeling of cloud regions in a Landsat 8 image.
Also, a gradient-based identification approach is proposed to identify and exclude regions of snow/ice in the ground truths of the training set.
We show that using the hybrid of the two methods (threshold-based and deep-learning) improves the performance of the cloud identification process without the need to manually correct automatically generated ground truths.
On average, the Jaccard index and recall measure are improved by 4.36% and 3.62%, respectively.
Standardized corpora of undeciphered scripts, a necessary starting point for computational epigraphy, require laborious human effort for their preparation from raw archaeological records.
Automating this process through machine learning algorithms can be of significant aid to epigraphical research.
Here, we take the first steps in this direction and present a deep learning pipeline that takes as input images of the undeciphered Indus script, as found in archaeological artifacts, and returns as output a string of graphemes, suitable for inclusion in a standard corpus.
The image is first decomposed into regions using Selective Search and these regions are classified as containing textual and/or graphical information using a convolutional neural network.
Regions classified as potentially containing text are hierarchically merged and trimmed to remove non-textual information.
The remaining textual part of the image is segmented using standard image processing techniques to isolate individual graphemes.
This set is finally passed to a second convolutional neural network to classify the graphemes, based on a standard corpus.
The classifier can identify the presence or absence of the most frequent Indus grapheme, the "jar" sign, with an accuracy of 92%.
Our results demonstrate the great potential of deep learning approaches in computational epigraphy and, more generally, in the digital humanities.
Rollating walkers are popular mobility aids used by older adults to improve balance control.
There is a need to automatically recognize the activities performed by walker users to better understand activity patterns, mobility issues and the context in which falls are more likely to happen.
We design and compare several techniques to recognize walker related activities.
A comprehensive evaluation with control subjects and walker users from a retirement community is presented.
Event-based state estimation can achieve estimation quality comparable to traditional time-triggered methods, but with a significantly lower number of samples.
In networked estimation problems, this reduction in sampling instants does, however, not necessarily translate into better usage of the shared communication resource.
Because typical event-based approaches decide instantaneously whether communication is needed or not, free slots cannot be reallocated immediately, and hence remain unused.
In this paper, novel predictive and self-triggering protocols are proposed, which give the communication system time to adapt and reallocate freed resources.
From a unified Bayesian decision framework, two schemes are developed: self-triggers that predict, at the current triggering instant, the next one; and predictive triggers that indicate, at every time step, whether communication will be needed at a given prediction horizon.
The effectiveness of the proposed triggers in trading off estimation quality for communication reduction is compared in numerical simulations.
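As a minimal illustration of the self-triggering idea, consider an assumed scalar random-walk model (far simpler than the paper's Bayesian framework): between transmissions the remote estimator's error variance grows by the process noise Q each step, so at the current triggering instant the sensor can announce exactly how long it may stay silent.

```python
import math

def next_trigger(Q: float, bound: float) -> int:
    """Largest number of steps the sensor may stay silent while the
    predicted error variance k*Q still satisfies k*Q <= bound.
    (Toy model: the variance resets to 0 at each transmission.)"""
    return max(1, math.floor(bound / Q))

# With process noise Q = 0.2 and a tolerated variance bound of 1.0,
# the sensor can schedule its next transmission 5 steps ahead,
# freeing the communication slots in between for reallocation.
steps = next_trigger(0.2, 1.0)
```

A predictive trigger would answer the complementary question at every step: whether, a fixed horizon M ahead, the accumulated variance will have crossed the bound.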
We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNN).
Our network aims to distinguish the target area from the background on the basis of the pixel-level similarity between two object units.
The proposed network represents a target object using features from different depth layers in order to take advantage of both the spatial details and the category-level semantic information.
Furthermore, we propose a feature compression technique that drastically reduces the memory requirements while maintaining the capability of feature representation.
Two-stage training (pre-training and fine-tuning) allows our network to handle any target object regardless of its category (even if the object's type does not belong to the pre-training data) or of variations in its appearance through a video sequence.
Experiments on large datasets demonstrate the effectiveness of our model - against related methods - in terms of accuracy, speed, and stability.
Finally, we introduce the transferability of our network to different domains, such as the infrared data domain.
Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough.
In this paper, we propose a novel approach to better leveraging monolingual data for neural machine translation by jointly learning source-to-target and target-to-source NMT models for a language pair with a joint EM optimization method.
The training process starts with two initial NMT models pre-trained on parallel data for each direction, and these two models are iteratively updated by incrementally decreasing translation losses on training data.
In each iteration step, both NMT models are first used to translate monolingual data from one language to the other, forming pseudo-training data of the other NMT model.
Then two new NMT models are learnt from the parallel data together with the pseudo-training data.
Both NMT models are expected to improve, generating better pseudo-training data for the next step.
Experiment results on Chinese-English and English-German translation tasks show that our approach can simultaneously improve translation quality of source-to-target and target-to-source models, significantly outperforming strong baseline systems which are enhanced with monolingual data for model training including back-translation.
While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost.
This is particularly important for deep learning since these learners need hours (to weeks) to train the model.
Such long training time limits the ability of (a) a researcher to test the stability of their conclusion via repeated runs with different random seeds; and (b) other researchers to repeat, improve, or even refute that original work.
For example, recently, deep learning was used to find which questions in the Stack Overflow programmer discussion forum can be linked together.
That deep learning system took 14 hours to execute.
We show here that applying a very simple optimizer called differential evolution (DE) to fine-tune an SVM achieves similar (and sometimes better) results.
The DE approach terminated in 10 minutes, i.e., 84 times faster than the deep learning method.
We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis.
If researchers deploy some new and expensive process, that work should be baselined against some simpler and faster alternatives.
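For concreteness, here is a minimal DE loop of the common DE/rand/1/bin flavour; it is applied to a toy quadratic objective standing in for an SVM's cross-validation error, not to the paper's actual tuning setup.

```python
import numpy as np

def differential_evolution(f, bounds, pop=20, gens=100, F=0.8, CR=0.9, seed=0):
    """Minimal DE/rand/1/bin loop (a sketch of the optimizer the text
    calls DE; applied here to a toy objective, not SVM tuning)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    X = rng.uniform(lo, hi, size=(pop, len(lo)))      # initial population
    fit = np.array([f(x) for x in X])
    for _ in range(gens):
        for i in range(pop):
            # Mutation: combine three distinct other members.
            a, b, c = X[rng.choice([j for j in range(pop) if j != i], 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            # Binomial crossover, then greedy selection.
            cross = rng.random(len(lo)) < CR
            trial = np.where(cross, mutant, X[i])
            ft = f(trial)
            if ft < fit[i]:
                X[i], fit[i] = trial, ft
    return X[fit.argmin()], fit.min()

# Toy objective standing in for cross-validation error of an SVM:
best_x, best_f = differential_evolution(lambda x: float(np.sum(x**2)),
                                        bounds=[(-5, 5), (-5, 5)])
```

In the SVM-tuning setting, the objective would instead evaluate cross-validation error at the candidate hyperparameter vector.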
YouTube draws a large number of users who contribute actively by uploading videos or commenting on existing videos.
However, since its content is crowd-sourced and pushed onto it at a large scale, there is limited control over the content.
This allows malicious users to push inappropriate (unsafe) content (videos and comments), particularly around cartoon videos, which are typically watched by kids.
In this paper, we focus on the presence of content unsafe for children and on the users who promote it.
To detect child-unsafe content and its promoters, we follow two approaches: one based on supervised classification using an extensive set of video-level, user-level and comment-level features, and another based on a Convolutional Neural Network applied to video frames.
A detection accuracy of 85.7% is achieved, which can be leveraged to build a system providing a safe YouTube experience for kids.
Through detailed characterization studies, we are able to conclude that unsafe content promoters are less popular and engage less than other users.
Finally, using a network of unsafe content promoters and other users based on their engagements (likes, subscriptions and playlist additions) and other factors, we find that unsafe content sits very close to safe content and that unsafe content promoters form close-knit communities with other users, further increasing the likelihood of a child being exposed to unsafe content.
Facial expression recognition has been an active area in computer vision with application areas including animation, social robots, personalized banking, etc.
In this study, we explore the problem of image classification for detecting facial expressions based on features extracted from pre-trained convolutional neural networks trained on ImageNet database.
Features are extracted and transferred to a Linear Support Vector Machine for classification.
All experiments are performed on two publicly available datasets, JAFFE and CK+.
The results show that representations learned from pre-trained networks for a task such as object recognition can be transferred, and used for facial expression recognition.
Furthermore, for a small dataset, using features from earlier layers of the VGG19 network provides better classification accuracy.
Accuracies of 92.26% and 92.86% were achieved for the CK+ and JAFFE datasets respectively.
This is the preprint version of our paper in JOMS.
In this paper, two mHealth applications are introduced, which can be employed as terminals of a big-data-based health service to collect information for electronic medical records (EMRs).
The first one is a hybrid system for improving the user experience in the hyperbaric oxygen chamber by 3D stereoscopic virtual reality glasses and immersive perception.
Several HMDs have been tested and compared.
The second application is a voice interactive serious game as a likely solution for providing assistive rehabilitation tool for therapists.
The recorded voices of patients could be analysed to evaluate long-term rehabilitation results and, further, to predict the rehabilitation process.
Person re-identification (Re-ID) aims at recognizing the same person from images taken across different cameras.
To address this task, one typically requires a large amount of labeled data to train an effective Re-ID model, which might not be practical for real-world applications.
To alleviate this limitation, we choose to exploit a sufficient amount of pre-existing labeled data from a different (auxiliary) dataset.
By jointly considering such an auxiliary dataset and the dataset of interest (but without label information), our proposed adaptation and re-identification network (ARN) performs unsupervised domain adaptation, which leverages information across datasets and derives domain-invariant features for Re-ID purposes.
In our experiments, we verify that our network performs favorably against state-of-the-art unsupervised Re-ID approaches, and even outperforms a number of baseline Re-ID methods which require fully supervised data for training.
End-to-end trained Recurrent Neural Networks (RNNs) have been successfully applied to numerous problems that require processing sequences, such as image captioning, machine translation, and text recognition.
However, RNNs often struggle to generalise to sequences longer than the ones encountered during training.
In this work, we propose to optimise neural networks explicitly for induction.
The idea is to first decompose the problem in a sequence of inductive steps and then to explicitly train the RNN to reproduce such steps.
Generalisation is achieved as the RNN is not allowed to learn an arbitrary internal state; instead, it is tasked with mimicking the evolution of a valid state.
In particular, the state is restricted to a spatial memory map that tracks parts of the input image which have been accounted for in previous steps.
The RNN is trained for single inductive steps, where it produces updates to the memory in addition to the desired output.
We evaluate our method on two different visual recognition problems involving visual sequences: (1) text spotting, i.e. joint localisation and reading of text in images containing multiple lines (or a block) of text, and (2) sequential counting of objects in aerial images.
We show that inductive training of recurrent models enhances their generalisation ability on challenging image datasets.
In order to generate prime implicants for a given cube (minterm), most minimization methods increase the dimension of this cube by removing one literal from it at a time.
But this raises two problems of exponential complexity.
The first is selecting the order in which the literals are to be removed from the implicant at hand.
The second is the mechanism that checks whether a tentative literal removal is acceptable.
The reduced Offset concept has been developed to avoid these problems.
This concept is based on positional-cube representation where each cube is represented by two n-bit strings.
We show that each reduced Off-cube may be represented by a single n-bit string and propose a set of bitwise operations to be performed on such strings.
The experiments on single-output benchmarks show that this approach can significantly speed up the minimization process, improve the quality of its results and reduce the amount of memory required for this aim.
The goal of compressive sensing is efficient reconstruction of data from few measurements, sometimes leading to a categorical decision.
If only classification is required, reconstruction can be circumvented and the measurements needed are orders-of-magnitude sparser still.
We define enhanced sparsity as the reduction in number of measurements required for classification over reconstruction.
In this work, we exploit enhanced sparsity and learn spatial sensor locations that optimally inform a categorical decision.
The algorithm solves an l1-minimization to find the fewest entries of the full measurement vector that exactly reconstruct the discriminant vector in feature space.
Once the sensor locations have been identified from the training data, subsequent test samples are classified with remarkable efficiency, achieving performance comparable to that obtained by discrimination using the full image.
Sensor locations may be learned from full images, or from a random subsample of pixels.
For classification between more than two categories, we introduce a coupling parameter whose value tunes the number of sensors selected, trading accuracy for economy.
We demonstrate the algorithm on example datasets from image recognition using PCA for feature extraction and LDA for discrimination; however, the method can be broadly applied to non-image data and adapted to work with other methods for feature extraction and discrimination.
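A toy version of the l1-minimization step above can be posed as the usual linear program for basis pursuit; the random matrix below stands in for the PCA feature basis, and the sizes are illustrative only.

```python
import numpy as np
from scipy.optimize import linprog

# Find the fewest entries s of the full measurement vector with
# Theta @ s = w, via min ||s||_1 s.t. Theta s = w, using the standard
# split s = u - v with u, v >= 0.
rng = np.random.default_rng(2)
m, n = 12, 30
Theta = rng.normal(size=(m, n))              # stand-in feature basis (e.g. PCA modes)
s_true = np.zeros(n); s_true[[4, 17]] = [1.5, -2.0]
w = Theta @ s_true                           # discriminant vector in feature space

res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([Theta, -Theta]), b_eq=w,
              bounds=(0, None))
s = res.x[:n] - res.x[n:]
sensors = np.flatnonzero(np.abs(s) > 1e-6)   # learned sensor (pixel) locations
```

By LP optimality, the recovered s reconstructs w exactly and is never worse in l1 norm than the planted sparse solution.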
In this paper, the authors explore the security challenges in MANETs and propose a new approach that makes use of a bio-inspired methodology.
This paper elaborates various attacks which can be perpetrated on MANETs and current solutions to the aforementioned problems, and then it describes a Bio-Inspired Method which could be a possible solution to security issues in MANETs.
Internet of Things (IoT) systems have aroused enthusiasm and concerns.
Enthusiasm comes from their utility in people's daily lives, while concerns are associated with privacy issues.
By using two IoT systems as case-studies, we examine users' privacy beliefs, concerns and attitudes.
We focus on four major dimensions: the collection of personal data, the inference of new information, the exchange of information to third parties, and the risk-utility trade-off posed by the features of the system.
Altogether, 113 Brazilian individuals answered a survey about such dimensions.
Although their perceptions seem to be dependent on the context, there are recurrent patterns.
Our results suggest that IoT users can be classified into unconcerned, fundamentalists and pragmatists.
Most of them exhibit a pragmatist profile and believe in privacy as a right guaranteed by law.
One of the most concerning aspects regarding privacy is the exchange of personal information with third parties.
Individuals' perceived risk is negatively correlated with their perceived utility in the features of the system.
We discuss practical implications of these results and suggest heuristics to cope with privacy concerns when designing IoT systems.
Economic and environmental concerns compel network engineers to focus on energy-efficient access network design.
The optical network unit (ONU), being predominantly responsible for the energy consumption of an Ethernet Passive Optical Network (EPON), motivates us towards designing a novel protocol for saving energy at the ONU.
The proposed protocol exploits different low power modes (LPM) and opts for the suitable one using traffic prediction.
This scheme provides a significant improvement in energy efficiency over existing protocols, especially at high load (~40%).
A better understanding of the performance and a deeper insight into several design aspects can only be addressed through a detailed mathematical analysis.
The proposed protocol involves traffic prediction, which violates the Markovian property.
However, some pragmatic assumptions along with a proper selection of observation instances and state descriptions allow us to form a Discrete Time Markov Chain (DTMC) of the proposed algorithm.
Thus, the primary objective of this paper is to propose a novel scheme for achieving energy-efficiency at the ONU and to mathematically analyze its performance with the help of a DTMC.
The analysis reveals that the energy-efficiency is more sensitive to the power consumption of doze mode as compared to other LPM while the effect of sleep-to-wake-up time is minor.
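The DTMC analysis step can be illustrated generically: given the chain's transition matrix P, long-run quantities such as the fraction of time spent in each low-power mode follow from the stationary distribution. The 3-state P below is a toy stand-in, not the paper's actual ONU chain.

```python
import numpy as np

def stationary(P):
    """Stationary distribution pi of a DTMC: solve pi P = pi, sum(pi) = 1."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Illustrative 3-state chain, e.g. {active, doze, sleep}:
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.6, 0.1],
              [0.5, 0.0, 0.5]])
pi = stationary(P)
```

Weighting pi by the per-state power draw would then give the chain's long-run average power consumption.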
Unlike classification labels, position labels cannot be assigned manually by humans.
For this reason, generating supervision for precise object localization is a hard task.
This paper details a method to create large datasets for 3D object localization, with real world images, using an industrial robot to generate position labels.
Using knowledge of the robot's geometry, we are able to automatically synchronize the images of the two cameras with the object's 3D position.
We applied it to generate a screw-driver localization dataset with stereo images, using a KUKA LBR iiwa robot.
This dataset could then be used to train a CNN regressor to learn end-to-end stereo object localization from a set of two standard uncalibrated cameras.
We investigate the fundamental capacity limits of space-time journeys of information in mobile and Delay Tolerant Networks (DTNs), where information is either transmitted or carried by mobile nodes, using store-carry-forward routing.
We define the capacity of a journey (i.e., a path in space and time, from a source to a destination) as the maximum amount of data that can be transferred from the source to the destination in the given journey.
Combining a stochastic model (conveying all possible journeys) and an analysis of the durations of the nodes' encounters, we study the properties of journeys that maximize the space-time information propagation capacity, in bit-meters per second.
More specifically, we provide theoretical lower and upper bounds on the information propagation speed, as a function of the journey capacity.
In the particular case of random way-point-like models (i.e., when nodes move for a distance of the order of the network domain size before changing direction), we show that, for relatively large journey capacities, the information propagation speed is of the same order as the mobile node speed.
This implies that, surprisingly, in sparse but large-scale mobile DTNs, the space-time information propagation capacity in bit-meters per second remains proportional to the mobile node speed and to the size of the transported data bundles, when the bundles are relatively large.
We also verify that all our analytical bounds are accurate in several simulation scenarios.
We present the University at Buffalo's Airborne Networking and Communications Testbed (UB-ANC Drone).
UB-ANC Drone is an open software/hardware platform that aims to facilitate rapid testing and repeatable comparative evaluation of airborne networking and communications protocols at different layers of the protocol stack.
It combines quadcopters capable of autonomous flight with sophisticated command and control capabilities and embedded software-defined radios (SDRs), which enable flexible deployment of novel communications and networking protocols.
This is in contrast to existing airborne network testbeds, which rely on standard inflexible wireless technologies, e.g., Wi-Fi or Zigbee.
UB-ANC Drone is designed with emphasis on modularity and extensibility, and is built around popular open-source projects and standards developed by the research and hobby communities.
This makes UB-ANC Drone highly customizable, while also simplifying its adoption.
In this paper, we describe UB-ANC Drone's hardware and software architecture.
This paper presents the IMS contribution to the PolEval 2018 Shared Task.
We submitted systems for both of the Subtasks of Task 1.
In Subtask (A), which was about dependency parsing, we used our ensemble system from the CoNLL 2017 UD Shared Task.
The system first preprocesses the sentences with a CRF POS/morphological tagger and predicts supertags with a neural tagger.
Then, it employs multiple instances of three different parsers and merges their outputs by applying blending.
The system achieved the second place out of four participating teams.
In this paper we show which components of the system were the most responsible for its final performance.
The goal of Subtask (B) was to predict enhanced graphs.
Our approach consisted of two steps: parsing the sentences with our ensemble system from Subtask (A), and applying 12 simple rules to obtain the final dependency graphs.
The rules introduce additional enhanced arcs only for tokens with "conj" heads (conjuncts).
They do not predict semantic relations at all.
The system ranked first out of three participating teams.
In this paper we show examples of rules we designed and analyze the relation between the quality of automatically parsed trees and the accuracy of the enhanced graphs.
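To make the rule-based step concrete, here is a sketch of one plausible rule of the kind described (a conjunct inherits an enhanced arc from its first conjunct's head); it is an illustration, not one of the paper's actual 12 rules.

```python
def add_conj_arcs(tokens):
    """tokens: list of dicts with 'id', 'head', 'deprel'.
    Returns arcs as (head, deprel, dependent) triples: the basic tree
    plus one enhanced arc per token attached as "conj"."""
    by_id = {t["id"]: t for t in tokens}
    arcs = [(t["head"], t["deprel"], t["id"]) for t in tokens]  # basic tree
    for t in tokens:
        if t["deprel"] == "conj":
            first = by_id[t["head"]]                 # the first conjunct
            arcs.append((first["head"], first["deprel"], t["id"]))
    return arcs

# "Mary reads and writes": "writes" (4) is conj of "reads" (2),
# so it gains an enhanced root-level arc like its first conjunct.
sent = [{"id": 1, "head": 2, "deprel": "nsubj"},
        {"id": 2, "head": 0, "deprel": "root"},
        {"id": 3, "head": 4, "deprel": "cc"},
        {"id": 4, "head": 2, "deprel": "conj"}]
arcs = add_conj_arcs(sent)
```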
The paper presents a technique to improve human detection in still images using deep learning.
Our novel method, ViS-HuD, computes visual saliency map from the image.
Then the input image is multiplied by the map and the product is fed to the Convolutional Neural Network (CNN), which detects humans in the image.
A visual saliency map is generated using ML-Net and human detection is carried out using DetectNet.
ML-Net is pre-trained on SALICON for visual saliency detection, while DetectNet is pre-trained on the ImageNet database for image classification.
The CNNs of ViS-HuD were trained on two challenging databases - Penn Fudan and TUD-Brussels Benchmark.
Experimental results demonstrate that the proposed method achieves state-of-the-art performance on the Penn Fudan dataset, with 91.4% human detection accuracy, and an average miss-rate of 53% on the TUD-Brussels benchmark.
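The saliency-weighting step at the heart of ViS-HuD reduces to an elementwise multiplication; the arrays below are random stand-ins for a real image and an ML-Net saliency map.

```python
import numpy as np

# Weight the input image by its saliency map before feeding the detector.
rng = np.random.default_rng(3)
image = rng.random((64, 64, 3))            # H x W x C, values in [0, 1]
saliency = rng.random((64, 64))            # H x W, values in [0, 1]

weighted = image * saliency[..., None]     # broadcast map over channels
```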
We present a framework for efficient inference in structured image models that explicitly reason about objects.
We achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time.
Crucially, the model itself learns to choose the appropriate number of inference steps.
We use this scheme to learn to perform inference in partially specified 2D models (variable-sized variational auto-encoders) and fully specified 3D models (probabilistic renderers).
We show that such models learn to identify multiple objects - counting, locating and classifying the elements of a scene - without any supervision, e.g., decomposing 3D images with various numbers of objects in a single forward pass of a neural network.
We further show that the networks produce accurate inferences when compared to supervised counterparts, and that their structure leads to improved generalization.
The SINTAGMA information integration system is an infrastructure for accessing several different information sources together.
Besides providing a uniform interface to the information sources (databases, web services, web sites, RDF resources, XML files), semantic integration is also needed.
Semantic integration is carried out by providing a high-level model and the mappings to the models of the sources.
When a query is executed against the high-level model, it is transformed into a low-level query plan: a piece of Prolog code that answers the high-level query.
This transformation is done in two phases.
First, the Query Planner produces a plan as a logic formula expressing the low-level query.
Next, the Query Optimizer transforms this formula to executable Prolog code and optimizes it according to structural and statistical information about the information sources.
This article discusses the main ideas of the optimization algorithm and its implementation.
Hash-based message authentication codes are an extremely simple yet hugely effective construction for producing keyed message digests using shared secrets.
HMACs have seen widespread use as ad-hoc digital signatures in many Internet applications.
While messages signed with an HMAC are secure against sender impersonation and tampering in transit, if used alone they are susceptible to replay attacks.
We propose a construction that extends HMACs to produce a keyed message digest that has a finite validity period.
We then propose a message signature scheme that uses this time-dependent MAC along with a unique message identifier to calculate a set of authentication factors from which a recipient can readily detect and ignore replayed messages, thus providing perfect resistance against replay attacks.
We further analyse time-based message authentication codes and show that they provide stronger security guarantees than plain HMACs, even when used independently of the aforementioned replay attack resistant message signature scheme.
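The general idea of a MAC with a finite validity period can be sketched by mixing a coarse time bucket into the HMAC input; the paper's exact construction is not given in the abstract, so the bucket scheme, period, and skew tolerance below are illustrative assumptions.

```python
import hmac
import hashlib
import time

def timed_mac(key: bytes, message: bytes, period: int = 30, now=None) -> bytes:
    """HMAC over the message plus the current time bucket; the tag
    verifies only while the bucket is unchanged (validity <= period s)."""
    bucket = int((time.time() if now is None else now) // period)
    return hmac.new(key, message + bucket.to_bytes(8, "big"),
                    hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes,
           period: int = 30, now=None, skew: int = 1) -> bool:
    """Accept tags from the current bucket or `skew` adjacent buckets,
    using a constant-time comparison to avoid timing side channels."""
    t = time.time() if now is None else now
    return any(hmac.compare_digest(tag,
               timed_mac(key, message, period, now=t + d * period))
               for d in range(-skew, skew + 1))
```

A tag issued at one time fails verification once the validity window has passed, which is the property that lets a recipient reject replays of old messages.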
In many cases, government data is still "locked" in several "data silos", even within the boundaries of a single (inter-)national public organization with disparate and distributed organizational units and departments spread across multiple sites.
Opening data and enabling its unified querying from a single site in an efficient and effective way is a semantic application integration and open government data challenge.
This paper describes how NARA is using Semantic Web technology to implement an application integration approach within the boundaries of its organization via opening and querying multiple governmental data sources from a single site.
The generic approach proposed, namely S3-AI, provides support to answering unified, ontology-mediated, federated queries to data produced and exploited by disparate applications, while these are being located in different organizational sites.
S3-AI preserves the ownership, autonomy and independence of applications and data.
The paper extensively demonstrates S3-AI, using the D2RQ and Fuseki technologies, for addressing the needs of a governmental "IT helpdesk support" case.
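A federated, ontology-mediated query of the kind S3-AI answers can be illustrated with SPARQL `SERVICE` clauses that union results from each site's endpoint. The endpoint URLs, vocabulary, and class names below are hypothetical placeholders, not the paper's actual schema.

```python
# Hypothetical endpoints: in the S3-AI setting each site exposes its
# relational data as RDF via D2RQ and serves it with a Fuseki endpoint.
SITES = ["http://site-a.example.org/sparql",
         "http://site-b.example.org/sparql"]

def federated_query(sites):
    """Build one federated SPARQL query that unions helpdesk-ticket
    data from every site's endpoint via SERVICE clauses."""
    blocks = " UNION ".join(
        "{ SERVICE <%s> { ?ticket a ex:HelpdeskTicket ; ex:status ?status } }"
        % s for s in sites)
    return ("PREFIX ex: <http://example.org/helpdesk#>\n"
            "SELECT ?ticket ?status WHERE { %s }" % blocks)
```

Submitting the resulting string to any SPARQL 1.1 endpoint delegates each `SERVICE` block to the remote site, so the data never has to be centralized.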
The governing equations for electromagneto-thermomechanical systems are well established and thoroughly derived in the literature, but have been limited to small deformations.
This assumption provides an "ease" in the formulation: the electromagnetic fields are governed in a Eulerian frame, while the thermomechanics is solved in a Lagrangean frame.
It is possible to map the Eulerian frame to the current placement of the matter and the Lagrangean frame to a reference placement.
The assumption of small deformations eliminates the distinction between current and initial placement such that electromagnetism and thermomechanics are formulated in the same frame.
We present a rigorous and thermodynamically consistent derivation of governing equations for fully coupled electromagneto-thermomechanical systems properly handling finite deformations.
A clear separation of the different frames is necessary.
In this work, we solve thermomechanics in the Lagrangean frame and electromagnetism in the Eulerian frame and manage the interaction between the fields.
The approach is similar to its analog in fluid structure interaction, but additionally challenging because the electromagnetic governing equations must also be solved within the solid body while following their own different set of transformation rules.
We further present a mesh-morphing algorithm necessary to accommodate finite deformations to solve the electromagnetic fields outside of the material body.
We illustrate the use of the new formulation by developing an open-source implementation using the FEniCS package and applying this implementation to several engineering problems in electromagnetic structure interaction undergoing large deformations.
The task of drug-target interaction prediction holds significant importance in pharmacology and therapeutic drug design.
In this paper, we present FRnet-DTI, an auto encoder and a convolutional classifier for feature manipulation and drug target interaction prediction.
Two convolutional neural networks are proposed: one model is used for feature manipulation and the other for classification.
Using the first method FRnet-1, we generate 4096 features for each of the instances in each of the datasets and use the second method, FRnet-2, to identify interaction probability employing those features.
We have tested our method on four gold standard datasets exhaustively used by other researchers.
Experimental results show that our method significantly improves over the state-of-the-art on three of the four drug-target interaction gold standard datasets, on both the area under the Receiver Operating Characteristic curve (auROC) and the area under the Precision-Recall curve (auPR) metrics.
We also introduce twenty new potential drug-target pairs for interaction based on high prediction scores.
Codes available: https://github.com/farshidrayhanuiu/FRnet-DTI/ Web implementation: http://farshidrayhan.pythonanywhere.com/FRnet-DTI/
Intelligent Transportation Systems (ITS) use data and information technology to improve the operation of our transportation network.
ITS contributes to sustainable development by using technology to make the transportation system more efficient, improving our environment by reducing emissions, reducing the need for new construction, and improving our daily lives through reduced congestion.
A key component of ITS is traveler information.
The Oregon Department of Transportation (ODOT) recently implemented a new traveler information system on selected freeways to provide drivers with travel time estimates that allow them to make more informed decisions about routing to their destinations.
The ODOT project aims to improve traffic flow and promote efficient traffic movement, which can reduce emissions rates and improve air quality.
The new ODOT system is based on travel data collected from a recently-increased set of sensors installed on its freeways.
Our current project investigates novel data cleaning methodologies and the integration of those methodologies into the prediction of travel times.
We use machine learning techniques on our archive to identify suspect data, and calculate revised travel times excluding this suspect data.
We compare the resulting travel time predictions to ground-truth data, and to predictions based on simple, rule-based data cleaning.
We report on the results of our study using qualitative and quantitative methods.
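The simple rule-based baseline mentioned above — discard speed readings outside a plausible range, then derive travel time from the cleaned per-segment means — can be sketched as follows. The thresholds and segment layout are illustrative assumptions, not ODOT's actual parameters, and this is the baseline, not the paper's machine-learning cleaning.

```python
def travel_time(segments, speeds, lo=5.0, hi=90.0):
    """Estimate corridor travel time (hours) from per-segment sensor
    speeds, excluding suspect readings outside [lo, hi] mph.

    `segments`: list of segment lengths in miles; `speeds`: list of
    per-segment speed readings (mph). Travel time for each segment is
    its length divided by the mean of the clean readings.
    """
    total = 0.0
    for length, readings in zip(segments, speeds):
        clean = [v for v in readings if lo <= v <= hi]
        if not clean:  # no trustworthy reading for this segment
            raise ValueError("no clean readings for a segment")
        total += length / (sum(clean) / len(clean))
    return total
```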
The term Big Data is usually used to describe the huge amounts of data generated by humans through digital media such as cameras, the internet, phones, and sensors.
By building advanced analytics on top of big data, one can predict many things about a user, such as their behavior and interests.
However before one can use the data, one has to address many issues for big data storage.
Two main issues are the need of large storage devices and the cost associated with it.
Synthetic DNA storage seems to be an appropriate solution to these issues of big data.
In 2013, Goldman and his colleagues from the European Bioinformatics Institute demonstrated the use of DNA as a storage medium with a capacity of 2.2 petabytes of information per gram of DNA, and retrieved the data successfully with a low error rate.
This significant step shows a promise for synthetic DNA storage as a useful technology for the future data storage.
Motivated by this, we have developed a software tool called DNACloud which makes it easy to store data on DNA.
In this work, we present a detailed description of the software.
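One core step of DNA storage in the Goldman et al. scheme is a rotating base-3 code: each base-3 digit of the encoded data selects one of the three nucleotides that differ from the previous one, which rules out homopolymer runs that are error-prone to synthesize and sequence. The sketch below shows only this mapping step, under the assumption that the data has already been converted to base-3 digits; it is not DNACloud's actual implementation.

```python
BASES = "ACGT"

def trits_to_dna(trits, prev="A"):
    """Map base-3 digits (0-2) to nucleotides so that no base repeats:
    each trit picks one of the three bases different from the previous
    base, so the output never contains a homopolymer run."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != prev]  # exactly 3 candidates
        prev = choices[t]
        out.append(prev)
    return "".join(out)
```

Decoding inverts the mapping: given the previous base, the observed base identifies the trit uniquely.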
The effects of molecularly targeted drug perturbations on cellular activities and fates are difficult to predict using intuition alone because of the complex behaviors of cellular regulatory networks.
An approach to overcoming this problem is to develop mathematical models for predicting drug effects.
Such an approach beckons for co-development of computational methods for extracting insights useful for guiding therapy selection and optimizing drug scheduling.
Here, we present and evaluate a generalizable strategy for identifying drug dosing schedules that minimize the amount of drug needed to achieve sustained suppression or elevation of an important cellular activity/process, the recycling of cytoplasmic contents through (macro)autophagy.
Therapeutic targeting of autophagy is currently being evaluated in diverse clinical trials but without the benefit of a control engineering perspective.
Using a nonlinear ordinary differential equation (ODE) model that accounts for activating and inhibiting influences among protein and lipid kinases that regulate autophagy (MTORC1, ULK1, AMPK and VPS34) and methods guaranteed to find locally optimal control strategies, we find optimal drug dosing schedules (open-loop controllers) for each of six classes of drugs and drug pairs.
Our approach is generalizable to designing monotherapy and multi-therapy drug schedules that affect different cell signaling networks of interest.
We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent.
The key idea is to focus on those parts of the image that contain richer information and zoom on them.
We train an intelligent agent that, given an image window, is capable of deciding where to focus the attention among five different predefined region candidates (smaller windows).
This procedure is iterated, providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap.
Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: a first one that computes new feature maps for each region proposal, and a second one that computes the feature maps for the whole image to later generate crops for each region proposal.
Experiments indicate better results for the overlapping candidate proposal strategy and a loss of performance for the cropped image features due to the loss of spatial resolution.
We argue that, while this loss seems unavoidable when working with large numbers of object candidates, the much smaller set of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions.
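The five predefined region candidates can be sketched as four corner sub-windows plus a central one; the exact side fractions below (3/4 of each side with overlap, 1/2 without) are plausible assumptions for illustration, not the paper's stated values.

```python
def region_candidates(x, y, w, h, overlap=True):
    """Five predefined sub-windows of the window (x, y, w, h): four
    corner regions plus a central one. With `overlap`, sub-windows span
    3/4 of each side so neighbours overlap; without, exactly half."""
    s = 0.75 if overlap else 0.5
    sw, sh = w * s, h * s
    return [
        (x, y, sw, sh),                                 # top-left
        (x + w - sw, y, sw, sh),                        # top-right
        (x, y + h - sh, sw, sh),                        # bottom-left
        (x + w - sw, y + h - sh, sw, sh),               # bottom-right
        (x + (w - sw) / 2, y + (h - sh) / 2, sw, sh),   # centre
    ]
```

At each step the agent picks one of these five windows to zoom into, so after k steps the attended region has shrunk by a factor of s**k per side.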
The number of optimization techniques in the combinatorial domain is large and diversified.
Nevertheless, real-world based benchmarks for testing algorithms are few.
This work creates an extensible real-world mail delivery benchmark to the Vehicle Routing Problem (VRP) in a planar graph embedded in the 2D Euclidean space.
The problem is multi-objective, on a roadmap with up to 25 vehicles and 30,000 deliveries per day.
Each instance models one generic day of mail delivery, allowing both comparison and validation of optimization algorithms for routing problems.
The benchmark may be extended to model other scenarios.
Herein, the problem of simultaneous localization of two sources given a modest number of samples is examined.
In particular, the strategy does not require knowledge of the target signatures of the sources a priori, nor does it exploit classical methods based on a particular decay rate of the energy emitted from the sources as a function of range.
Instead, general structural properties of the signatures, such as unimodality, are exploited.
The algorithm localizes targets based on the rotated eigenstructure of a reconstructed observation matrix.
In particular, the optimal rotation can be found by maximizing the ratio of the dominant singular value of the observation matrix over the nuclear norm of the optimally rotated observation matrix.
It is shown that this ratio has a unique local maximum leading to computationally efficient search algorithms.
Moreover, analytical results are developed to show that the squared localization error decreases at a rate faster than the baseline scheme.
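The ratio maximized in the rotation search — the dominant singular value of the observation matrix over its nuclear norm (the sum of all singular values) — is cheap to evaluate; the sketch below computes only this objective for a given candidate matrix, while the rotation itself depends on the sampling geometry and is omitted.

```python
import numpy as np

def rotation_objective(M):
    """Ratio of the dominant singular value of M to its nuclear norm.
    The ratio is 1 exactly when M is rank one, so maximizing it over
    rotations concentrates the observation's energy in one component."""
    s = np.linalg.svd(M, compute_uv=False)  # singular values, descending
    return s[0] / s.sum()
```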
Women are dramatically underrepresented in computer science at all levels in academia and account for just 15% of tenure-track faculty.
Understanding the causes of this gender imbalance would inform both policies intended to rectify it and employment decisions by departments and individuals.
Progress in this direction, however, is complicated by the complexity and decentralized nature of faculty hiring and the non-independence of hires.
Using comprehensive data on both hiring outcomes and scholarly productivity for 2659 tenure-track faculty across 205 Ph.D.-granting departments in North America, we investigate the multi-dimensional nature of gender inequality in computer science faculty hiring through a network model of the hiring process.
Overall, we find that hiring outcomes are most directly affected by (i) the relative prestige between hiring and placing institutions and (ii) the scholarly productivity of the candidates.
After including these and other features, adding gender did not significantly reduce modeling error.
However, gender differences do exist, e.g., in scholarly productivity, postdoctoral training rates, and in career movements up the rankings of universities, suggesting that the effects of gender are indirectly incorporated into hiring decisions through gender's covariates.
Furthermore, we find evidence that more highly ranked departments recruit female faculty at higher than expected rates, which appears to inhibit similar efforts by lower ranked departments.
These findings illustrate the subtle nature of gender inequality in faculty hiring networks and provide new insights to the underrepresentation of women in computer science.
In this paper we present the state of advancement of the French ANR WebStand project.
The objective of this project is to construct a customizable XML-based warehouse platform to acquire, transform, analyze, store, query and export data from the web, in particular mailing lists, with the final intention of using this data to perform sociological studies of social groups on the World Wide Web, with a specific emphasis on the temporal aspects of the data.
We are currently using this system to analyze the standardization process of the W3C, through its social network of standard setters.
We have witnessed the discovery of many techniques for network representation learning in recent years, ranging from encoding the context in random walks to embedding the lower order connections, to finding latent space representations with auto-encoders.
However, existing techniques are looking mostly into the local structures in a network, while higher-level properties such as global community structures are often neglected.
We propose a novel network representation learning framework called RUM (network Representation learning throUgh Multi-level structural information preservation).
In RUM, we incorporate three essential aspects of a node that capture a network's characteristics in multiple levels: a node's affiliated local triads, its neighborhood relationships, and its global community affiliations.
Therefore the framework explicitly and comprehensively preserves the structural information of a network, extending the encoding process both to the local end of the structural information spectrum and to the global end.
The framework is also flexible enough to take various community discovery algorithms as its preprocessor.
Empirical results show that the representations learned by RUM have demonstrated substantial performance advantages in real-life tasks.
State-of-the-art slot filling models for goal-oriented human/machine conversational language understanding systems rely on deep learning methods.
While multi-task training of such models alleviates the need for large in-domain annotated datasets, bootstrapping a semantic parsing model for a new domain using only the semantic frame, such as the back-end API or knowledge graph schema, is still one of the holy grail tasks of language understanding for dialogue systems.
This paper proposes a deep learning based approach that can utilize only the slot description in context without the need for any labeled or unlabeled in-domain examples, to quickly bootstrap a new domain.
The main idea of this paper is to leverage the encoding of the slot names and descriptions within a multi-task deep learned slot filling model, to implicitly align slots across domains.
The proposed approach is promising for solving the domain scaling problem and eliminating the need for any manually annotated data or explicit schema alignment.
Furthermore, our experiments on multiple domains show that this approach results in significantly better slot-filling performance when compared to using only in-domain data, especially in the low data regime.
Real-time semantic segmentation plays an important role in practical applications such as self-driving and robots.
Most semantic segmentation research focuses on improving estimation accuracy with little consideration of efficiency.
Several previous studies that emphasize high-speed inference often cannot produce high-accuracy segmentation results.
In this paper, we propose a novel convolutional network named Efficient Dense modules with Asymmetric convolution (EDANet), which employs an asymmetric convolution structure and incorporates the dilated convolution and the dense connectivity to achieve high efficiency at low computational cost and model size.
EDANet is 2.7 times faster than the existing fast segmentation network ICNet, while achieving a similar mIoU score without any additional context module, post-processing scheme, or pretrained model.
We evaluate EDANet on the Cityscapes and CamVid datasets and compare it with other state-of-the-art systems.
Our network can run with the high-resolution inputs at the speed of 108 FPS on a single GTX 1080Ti card.
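The efficiency gain of asymmetric convolution comes from factorizing a 3x3 kernel into a 3x1 followed by a 1x3 filter, which uses 6 weights instead of 9; for separable kernels the two give identical outputs. The sketch below demonstrates this on a single channel with plain numpy; it is an illustration of the structure, not EDANet's implementation.

```python
import numpy as np

def conv2d(x, k):
    """'valid' 2-D correlation of a single-channel image with kernel k."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def asymmetric_conv(x, col, row):
    """Factorized convolution: a 3x1 filter followed by a 1x3 filter.
    Uses 3 + 3 = 6 weights instead of the 9 of a full 3x3 kernel and,
    for separable kernels k = outer(col, row), computes the same map."""
    return conv2d(conv2d(x, col.reshape(3, 1)), row.reshape(1, 3))
```

Across C input and output channels the saving is the same 6/9 factor per kernel, which is where the reduced model size and computation come from.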
We propose a neural embedding algorithm called Network Vector, which learns distributed representations of nodes and the entire networks simultaneously.
By embedding networks in a low-dimensional space, the algorithm allows us to compare networks in terms of structural similarity and to solve outstanding predictive problems.
Unlike alternative approaches that focus on node level features, we learn a continuous global vector that captures each node's global context by maximizing the predictive likelihood of random walk paths in the network.
Our algorithm is scalable to real world graphs with many nodes.
We evaluate our algorithm on datasets from diverse domains, and compare it with state-of-the-art techniques in node classification, role discovery and concept analogy tasks.
The empirical results show the effectiveness and the efficiency of our algorithm.
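The random-walk paths whose predictive likelihood the embedding maximizes can be sampled with a few lines; this is a generic walk sampler for illustration, with walk length and walks-per-node as assumed hyperparameters, not the algorithm's full training loop.

```python
import random

def random_walks(adj, walk_len=10, walks_per_node=5, seed=0):
    """Sample fixed-length random-walk paths starting from every node;
    such paths serve as the training 'sentences' over which the node
    and network vectors are learned. `adj` maps a node to its
    neighbour list."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk, node = [start], start
            for _ in range(walk_len - 1):
                if not adj[node]:  # dead end: stop this walk early
                    break
                node = rng.choice(adj[node])
                walk.append(node)
            walks.append(walk)
    return walks
```

Sampling is linear in the number of walks times their length, which is what keeps the approach scalable to graphs with many nodes.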
Compute-and-forward (CAF) relaying is effective to increase bandwidth efficiency of wireless two-way relay channels.
In a CAF scheme, a relay is designed to decode a linear combination composed of transmitted messages from other terminals or relays.
The design of error-correcting codes and decoding algorithms suitable for CAF relaying schemes remains an important issue to be studied.
In this paper, we present an asymptotic performance analysis of LDPC codes over two-way relay channels based on density evolution (DE).
Because of the asymmetric characteristics of the channel, we use the population dynamics DE combined with DE formulas for asymmetric channels to obtain BP thresholds.
Additionally, we also evaluate the asymptotic performance of spatially coupled LDPC codes for two-way relay channels.
The results indicate that the spatially coupled codes yield improvements in the BP threshold compared with the corresponding uncoupled codes for two-way relay channels.
Finally, we compare the mutual information rate and rate achievability between the CAF scheme and the MAC separation decoding scheme.
We demonstrate the possibility that the CAF scheme has higher reliability in the high-rate region.
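To illustrate what a density-evolution BP-threshold computation looks like, the sketch below runs the standard DE recursion for a regular (dv, dc) LDPC ensemble on the binary erasure channel — a much simpler, symmetric setting than the asymmetric two-way relay channel analyzed in the paper, which requires the population-dynamics DE mentioned above.

```python
def bp_threshold(dv, dc, tol=1e-10, iters=10000):
    """BP threshold of a regular (dv, dc) LDPC ensemble on the BEC,
    found by bisection on the channel erasure probability eps.

    The DE recursion x <- eps * (1 - (1 - x)^(dc-1))^(dv-1) tracks the
    erasure fraction of variable-to-check messages; below the threshold
    it converges to zero, above it to a positive fixed point."""
    def converges(eps):
        x = eps
        for _ in range(iters):
            x_new = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
            if abs(x_new - x) < tol:
                break
            x = x_new
        return x < 1e-6

    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if converges(mid) else (lo, mid)
    return lo
```

For the (3, 6) ensemble this recovers the well-known threshold of about 0.429, against a Shannon limit of 0.5 for rate 1/2.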
