Textual style transfer involves modifying the style of a text while preserving its content, which presupposes that style can be separated from content. This paper investigates whether such a separation is possible, using sentiment transfer as a case study. Our experimental methodology frames style transfer as a multi-objective problem, balancing style shift with content preservation and fluency. Because parallel data for style transfer is scarce, we employ a variety of adversarial encoder-decoder networks in our experiments. We also use a probing methodology to analyse how these models encode style-related features in their latent spaces. The results of our experiments, further confirmed by a human evaluation, reveal an inherent trade-off between the multiple style transfer objectives, indicating that style cannot be usefully separated from content within these style-transfer systems.
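As an illustration of the probing idea, a minimal sketch (not the paper's actual setup; the latent codes, labels, and classifier below are placeholders) is to train a simple classifier on the encoder's latent representations and check whether the style label is still recoverable from them:

```python
# Minimal probing sketch: placeholder latent codes and labels, not the paper's data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

latents = np.random.randn(1000, 64)         # stand-in for z = encoder(x)
styles = np.random.randint(0, 2, 1000)      # stand-in for sentiment labels

z_tr, z_te, y_tr, y_te = train_test_split(latents, styles, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(z_tr, y_tr)
# High probe accuracy would suggest style information has not been removed from the latent code.
print("probe accuracy:", probe.score(z_te, y_te))
```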
Despite success on a wide range of problems in vision, generative adversarial networks (GANs) often suffer from inferior performance due to unstable training, especially for text generation. To address this issue, we propose a new variational GAN training framework that enjoys superior training stability. Our approach is inspired by a connection between GANs and reinforcement learning under a variational perspective. This connection leads to (1) probability ratio clipping, which regularizes generator training to prevent excessively large updates, and (2) a sample re-weighting mechanism, which improves discriminator training by downplaying bad-quality fake samples. Moreover, our variational GAN framework can provably overcome the training issue, present in many GANs, that an optimal discriminator cannot provide any informative gradient to the generator. By plugging the training approach into diverse state-of-the-art GAN architectures, we obtain significantly improved performance over a range of tasks, including text generation, text style transfer, and image generation.
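To make the first ingredient concrete, here is a minimal sketch of PPO-style probability ratio clipping applied to a generator update; the tensor names, reward signal, and clipping threshold are illustrative assumptions, not the authors' implementation:

```python
# Illustrative sketch of probability ratio clipping for a GAN generator update.
import torch

def clipped_generator_loss(logp_new, logp_old, reward, clip_eps=0.2):
    """logp_new / logp_old: log-probs of sampled outputs under the current and
    previous generator; reward: a discriminator-derived score per sample."""
    ratio = torch.exp(logp_new - logp_old)                              # importance ratio
    unclipped = ratio * reward
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * reward
    # Taking the elementwise minimum caps the objective, preventing excessively large updates.
    return -torch.min(unclipped, clipped).mean()

# Toy usage with 4 sampled outputs.
logp_new = torch.randn(4, requires_grad=True)
loss = clipped_generator_loss(logp_new, torch.randn(4), torch.rand(4))
loss.backward()
```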
Self-driving is one of the most challenging tasks of our time. A human or AI driver needs to localize the car on the map, detect static obstacles and dynamic objects, predict their movements, and choose a good vehicle trajectory. During my presentation, I will talk about modern algorithms for 3D object detection, prediction, localization, and motion planning, and how these algorithms help us overcome the challenges we face on the way to full autonomy.
Suppose you are monitoring discrete events in real time. Can you predict what events will happen in the future, and when? Can you fill in past events that you may have missed? A probability model that supports such reasoning is the neural Hawkes process (NHP), in which the Poisson intensities of K event types at time t depend on the history of past events. This autoregressive architecture can capture complex dependencies. It resembles an LSTM language model over K word types, but allows the LSTM state to evolve in continuous time.
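Roughly, in the spirit of the published NHP formulation, the intensity of event type k is obtained by passing a linear readout of the continuous-time LSTM state h(t) through a scaled softplus, which keeps it positive:

```latex
\lambda_k(t) = f_k\!\left(\mathbf{w}_k^{\top}\mathbf{h}(t)\right),
\qquad
f_k(x) = s_k \log\!\left(1 + e^{x/s_k}\right), \quad s_k > 0 .
```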
This talk will present the NHP model along with methods for estimating parameters (MLE and NCE), sampling predictions of the future (thinning), and imputing missing events (particle smoothing). I'll then show how to scale the NHP or the LSTM language model to large K, beginning with a temporal deductive database for a real-world domain, which can track how possible event types and other facts change over time as new events occur. We take the system state to be a collection of vector-space embeddings of these facts, and derive a deep recurrent architecture from the temporal Datalog program that specifies the database. We call this method "neural Datalog through time."
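For readers unfamiliar with thinning: the sampler draws candidate times from a homogeneous Poisson process whose rate upper-bounds the model intensity and accepts each candidate with the corresponding probability ratio. A minimal generic sketch (placeholder intensity, not the talk's code) is:

```python
# Minimal sketch of the thinning (rejection) sampler for the next event time.
import random, math

def sample_next_event(t, lam, lam_upper):
    """t: current time; lam(t): total intensity function;
    lam_upper: an upper bound on lam over the proposal horizon."""
    while True:
        # Propose a candidate gap from a homogeneous Poisson process with rate lam_upper.
        t += random.expovariate(lam_upper)
        # Accept the candidate with probability lam(t) / lam_upper.
        if random.random() < lam(t) / lam_upper:
            return t

# Example: a toy decaying intensity after an event at time 0.
next_t = sample_next_event(0.0, lambda s: 1.5 * math.exp(-s) + 0.2, lam_upper=1.7)
```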
This work was done with Hongyuan Mei and other collaborators including Guanghui Qin, Minjie Xu, and Tom Wan.
The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation (e.g. the content and position of objects in an image) which can be recovered by unsupervised learning algorithms. A recent line of work has argued that learning such representations offers several benefits, ranging from reduced sample complexity of downstream tasks to interpretability.
In this talk, I will discuss recent progress in the field and challenge some common assumptions. I will address the role of inductive biases in the theoretical impossibility of unsupervised learning of disentangled representations and provide a sober look at the performance of state-of-the-art approaches. I will further present how to go beyond the purely unsupervised setting in both theory and practice and discuss concrete applications to fairness and abstract visual reasoning.
Advances in information extraction have enabled the automatic construction of large knowledge graphs (KGs) like DBpedia, Freebase, YAGO and Wikidata. Learning rules from KGs is a crucial task for KG completion, cleaning and curation. This tutorial presents state-of-the-art rule induction methods, recent advances, research opportunities, and open challenges in this area. We put particular emphasis on the problem of learning exception-enriched rules from highly biased and incomplete data. Finally, we discuss possible extensions of classical rule induction techniques to account for unstructured resources (e.g., text) along with the structured ones.
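As a hypothetical example of the kind of exception-enriched (nonmonotonic) rule such methods aim to learn, a plain Horn rule can be augmented with a negated exception atom:

```latex
\mathit{livesIn}(X, Y) \leftarrow \mathit{marriedTo}(X, Z) \wedge \mathit{livesIn}(Z, Y) \wedge \neg\, \mathit{researcher}(X)
```

i.e., spouses usually share a place of residence, with a learned exception that accounts for cases where the pattern does not hold in the data.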
In traditional computational complexity, the convention is that a problem with a polynomial-time algorithm is already considered easy. In practice, however, there is a huge difference between an algorithm with running time n^2 and one with running time n^100. This observation gave rise to a more modern approach: fine-grained complexity, whose goal is to pinpoint the hardness of a problem as precisely as possible. In this talk, we will cover some basic techniques used in fine-grained complexity and the limitations of these techniques. We will then show that, under widely believed assumptions, we already have optimal algorithms for some of the cornerstone problems in machine learning.
The domain adaptation problem aims at learning a well-performing model trained on source data S (images, vectors, etc.) that is then applied to a different (but related) target sample T. Aside from its obvious practical utility, the setting is challenging from a theoretical point of view. In this work we introduce a novel approach to supervised domain adaptation, consisting of class-dependent fitting based on ideas from optimal transportation (OT) theory, which treats S and T as two mixtures of distributions. A parametrized OT distance is used as a fidelity measure between S and T, providing a toolbox for modelling possibly independent perturbations of the mixture components. The method is then used to describe the adaptation of the human immune system after a move to another climatic zone.
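As a minimal sketch of the general idea of an OT-based fidelity measure (illustrative only; the paper's parametrized, class-dependent distance is more specific), one can compute an entropically regularized OT cost between source and target feature samples with the POT library; the data and regularization strength below are placeholders:

```python
# Illustrative sketch: regularized OT cost as a fidelity measure between S and T.
import numpy as np
import ot  # Python Optimal Transport (POT)

def ot_fidelity(Xs, Xt, reg=0.1):
    a = np.full(len(Xs), 1.0 / len(Xs))          # uniform weights on source samples
    b = np.full(len(Xt), 1.0 / len(Xt))          # uniform weights on target samples
    M = ot.dist(Xs, Xt, metric="sqeuclidean")    # pairwise ground-cost matrix
    return ot.sinkhorn2(a, b, M, reg)            # entropically regularized OT cost

rng = np.random.default_rng(0)
Xs = rng.normal(size=(100, 5))                   # toy source sample S
Xt = rng.normal(loc=0.5, size=(120, 5))          # shifted toy target sample T
print(ot_fidelity(Xs, Xt))
```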