About End-user programming of virtual assistant skills and graphical user interfaces
Virtual assistants give end-users the capability to access their devices and web data using hundreds of thousands of predefined skills. Nonetheless, there is still a long tail of personal digital tasks that people want to automate. This dissertation explores how end-users can define useful personalized skills without learning a formal programming language. We empower an end-user to create a web-based skill by demonstrating it in a web browser using natural language, their mouse, and keyboard. Our tool is the first program-by-demonstration system that produces programs with control constructs. The system gives the user an easy-to-learn multimodal interface and generates code in a formal programming language that supports parameterization, function invocation, conditionals, and iterative execution. We show that a virtual assistant skill can greatly benefit from having a graphical interface, as users can monitor multiple queries simultaneously, re-run skills easily, and adjust settings using multiple modes of interaction. We developed a system that automatically translates a user's voice command into a reusable skill with a graphical user interface. Unlike previous work, which generates formulaic interfaces, we generate interesting and diverse interfaces using a novel template-based approach. To improve the aesthetics of graphical user interfaces, we use a technique called style transfer. We show that the previous formulation of style transfer cannot retain structure in an image, which causes the output to lack definition and legibility and renders restyled interfaces unusable. Our purely neural solution captures structure through the uncentered cross-covariance between features across different layers of a convolutional neural network.
By minimizing the squared error in the structure between the style and output images, our technology retains structure while generating results with texture in the background, shadow and contrast in the borders, consistency of design across edges, and an overall cohesiveness to the design. In summary, our system enables end-users to create web-based skills with automatically generated graphical user interfaces.
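The structure loss described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not say which layers are paired, how feature maps of different resolution are aligned, or how the statistic is normalized. The sketch assumes adjacent layers are paired, that all feature maps have already been resized to the same number of spatial positions, and that the statistic is divided by that number. The function names cross_covariance and structure_loss are hypothetical, and the features are plain Python lists rather than real network activations.

```python
# Illustrative sketch only: names, layer pairing, and normalization are assumptions.

def cross_covariance(feats_a, feats_b):
    """Uncentered cross-covariance between the feature maps of two layers.

    Each argument is a list of channels; each channel is a flat list of
    activations over the same n spatial positions (the two layers are
    assumed to have been resized to a common resolution beforehand).
    No mean is subtracted, hence "uncentered".
    """
    n = len(feats_a[0])
    return [
        [sum(x * y for x, y in zip(ch_a, ch_b)) / n for ch_b in feats_b]
        for ch_a in feats_a
    ]


def structure_loss(style_layers, output_layers):
    """Squared error between style and output cross-layer statistics,
    summed over adjacent layer pairs (an assumed pairing)."""
    loss = 0.0
    for i in range(len(style_layers) - 1):
        s = cross_covariance(style_layers[i], style_layers[i + 1])
        o = cross_covariance(output_layers[i], output_layers[i + 1])
        loss += sum(
            (sv - ov) ** 2
            for s_row, o_row in zip(s, o)
            for sv, ov in zip(s_row, o_row)
        )
    return loss


# Toy features: layer 0 has two channels, layer 1 has one, two positions each.
style = [[[1.0, 2.0], [0.0, 1.0]], [[1.0, 0.0]]]
output = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 0.0]]]
print(structure_loss(style, output))
```

In a real style-transfer pipeline this quantity would be computed on differentiable tensors and minimized by gradient descent over the output image, typically alongside other loss terms; the abstract does not specify those details.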