In this paper, we present a novel motion retargeting system based on a deep autoencoder that combines the Deep Convolution Inverse Graphics Network (DC-IGN) [Kulkarni et al. 2015] and the U-Net [Long et al. 2015] to produce high-quality human motion. The retargeted motion is generated fully automatically and naturally from the given input motion and bone length ratios. To validate the proposed motion retargeting system, we conduct several experiments and achieve higher accuracy with a lower computational burden than a conventional motion retargeting approach and other neural network architectures.
Recently, the targets of projection mapping have included various objects ranging from buildings to fish. However, it is difficult to project onto dynamic and deformable objects, such as leaves. Therefore, we propose a semi-automatic system to calculate the image registrations of projections for leaves and to interactively track the projection area. We describe our results with some animated effects on various shapes of leaves (Fig. 1).
This talk introduces specific blurring algorithms tailored for cartoon animation. With our tools, the artists create various effects as shown in Figure 1: smooth transitions from color to shadow color, moody effects and cartoon motion lines.
It is known that a static virtual object on a display is perceived as moving when the object, given a particular pattern of luminance contours, is presented against a background with dynamic luminance changes. The present study reports a novel technique in which printed objects with particular luminance contours appear to move against a background with dynamic luminance modulations. Our technique can illusorily express not only horizontal translations but also expansion-compression and rotation of the objects. We believe that our technique can be used to highlight static objects in paper advertisements by placing them against a background with dynamic luminance changes.
Recently, 2D animation (referring to Japanese animation in this paper) images have often been created from 3DCG. However, even if the 3DCG looks like 2D animation, the movement of its objects still looks like 3DCG. One reason is that 3DCG movement conveys three-dimensionality. We assumed that the frame rate and the change in object size along the depth axis are what give 3DCG this three-dimensionality. Therefore, we conducted a subjective evaluation to determine whether 3DCG looks like 2D animation when this three-dimensionality is reduced by adjusting the frame rate and suppressing the depth-dependent change in size through changes in the field of view (FoV) and projection method. The results confirm that the three-dimensionality can be reduced and that the resulting 3DCG movement is more similar to that of 2D animation.
In this paper, we propose a method that allows users to deform 3D facial model projection images with intuitive parameters for the axes of feature spaces that represent the style used for drawing faces for anime. The spaces are derived from hand-drawn and 3DCG face images with Non-negative Matrix Factorization. The results show typical anime-style features are applied successfully to 3DCG face images.
Effect animations are used in many applications, such as music videos and games, and effect templates are available in many software packages. However, it is difficult for an amateur to make an original animation. In this paper, we propose a deep-learning-based approach for generating effect animations. The approach uses a next-frame prediction model built on conditional Generative Adversarial Networks (cGAN) [Goodfellow et al. 2014] and lets users create a new animation easily by preparing reference effect videos. Conversely, users can also create animations that are difficult even for professional designers to make. The model is trained on the loss between a ground-truth frame and a predicted frame, and the trained model can repeatedly predict the next frame from previously generated frames to produce an animation video. In experiments, we show several results and demonstrate that we can partly control which frames are generated.
Dynamic Chinese Painting is a rendering style for animations. However, today's production method, which relies on artists to create suitable textures and sceneries, is time-consuming. In this paper, we propose a system that generates a realistic Yangzhou School Chinese Painting appearance for koi (decorative carp) animation in 3D space. Our system repaints the original texture, enhances the contours of the 3D input models, and adds fin stripes to fit the target style. For the interactive part, our system not only allows users to add ripples to the scene as user-scene interaction but also generates water streamlines as koi-scene interaction. Through our system, a new Dynamic Chinese Painting scene can be automatically generated from the 3D input models and user inputs.
Augmented FPV Drone Racing is a system that allows spectators to understand the situation of drone races easily by using augmented feedback techniques, including projection mapping and autonomous commentaries. In this project, we have been developing visualization solutions for FPV (first person view) drone racing to allow the spectators/pilots to understand easily (or intuitively) the race situation.
In this paper, we present a new SPH (Smoothed Particle Hydrodynamics) method for multi-fluid simulation. Our idea is to extend Implicit Incompressible SPH (IISPH) to multiple fluids using a particle number density. By replacing the formulation used to solve the pressure Poisson equation (PPE) in IISPH with a new equation based on particle densities, we can simulate multi-fluid scenes more stably while handling high density ratios, incompressibility, and large time steps. This paper shows simulations of bubble behaviors such as rising, merging, floating, and rupturing using the proposed method. Additionally, we present a method to represent the thin film generated by floating bubbles on the liquid surface by introducing an anisotropic filter into the surface tension computation.
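As a rough sketch of the number-density idea described above (the paper's exact discretization may differ), the neighbour count of a particle is independent of the neighbours' masses, so the density estimate does not smear across an interface between fluids of very different density; a PPE of IISPH type can then be driven by the deviation of this adapted density from its rest value.

```latex
% Particle number density: counts neighbours regardless of their masses.
\delta_i \;=\; \sum_j W(\mathbf{x}_i - \mathbf{x}_j,\, h)
% Adapted density: only particle i's own mass enters the estimate, so it stays
% sharp across an interface between a heavy fluid and a light fluid.
\tilde{\rho}_i \;=\; m_i\,\delta_i
```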
The human visual system uses cast shadows to judge the three-dimensional layout of an object. The purpose of this installation is to demonstrate novel visual illusions of depth and transparency for paper materials, induced by conventional light projection of cast-shadow patterns. Thus, this installation focuses on perceptual rather than technical aspects. While illuminating a target object, the spatial vicinity of the object is darkened to produce the visual impression of a shadow of the object. By controlling the blurriness of the cast shadow and/or the spatial distance between the object and its shadow, a perceptual change in the layout of the object is induced. The audience can interactively enjoy visual experiences wherein objects and letters on a paper perceptually float in the air. We also demonstrate that it is possible to edit the material appearance of a real object by manipulating the shape of its shadow; an opaque colored paper appears to be a transparent color film floating in the air.
Spectral rendering is necessary for rendering a scene with fluorescence, because fluorescence is a strongly wavelength-dependent phenomenon. We propose a method for efficiently rendering fluorescence under global illumination by importance sampling the wavelength, considering both the spectra of fluorescent materials and those of light sources.
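As an illustrative sketch (the paper's spectral representation and sampling density are not specified here), wavelength importance sampling over discretized spectra can be written as follows; the choice of target pdf, proportional to the product of the light spectrum and the material response, is an assumption.

```python
# Hedged sketch of wavelength importance sampling for fluorescence rendering.
# Assumes discretized spectra on a common wavelength grid; the target pdf is
# proportional to the product of the light source spectrum and the material
# response, which is one plausible choice, not necessarily the paper's.
import numpy as np

def sample_wavelength(wavelengths, light_spd, material_response, rng):
    """Return (lambda, inverse_pdf) for one Monte Carlo wavelength sample."""
    target = light_spd * material_response          # unnormalized importance
    pdf = target / target.sum()                     # discrete pdf over bins
    idx = rng.choice(len(wavelengths), p=pdf)
    bin_width = wavelengths[1] - wavelengths[0]
    # Monte Carlo weight: the contribution is divided by the pdf per unit wavelength.
    return wavelengths[idx], 1.0 / (pdf[idx] / bin_width)

rng = np.random.default_rng(0)
wl = np.arange(380.0, 781.0, 5.0)                   # visible range, 5 nm bins
light = np.exp(-((wl - 450.0) / 60.0) ** 2)         # toy bluish light source
material = np.exp(-((wl - 520.0) / 40.0) ** 2)      # toy excitation curve
lam, inv_pdf = sample_wavelength(wl, light, material, rng)
```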
We create a digital and artistic area that aims to illustrate and humanize, through visuals, sounds and words, the way in which westerners of the 21st century meet in the digital age. Based on artificial intelligence, the work interacts directly with the viewer by analyzing data such as the viewer's position and proximity to other people located in the area.
Massive Collaborative Animation Projects (MCAP) are a series of innovative international and intercollegiate student productions. MCAP allows students from all over the world to collaborate beyond their institutions, work with peers in other fields and cultures, and learn animation techniques by working on large-scale productions [Joel et al. 2018]. Students participate in MCAP either by taking animation courses that include MCAP as part of the curriculum or by joining MCAP as an extracurricular activity. MCAP started in 2016 and has thus far involved more than 160 students from various schools across six regions, including Mainland China, Taiwan, Japan, and South Korea. Combining findings from interviews with MCAP participants, this poster focuses on the unique experience of Asian students working on the first MCAP project, MCAP 1, a traditional 3D animated short film. It discusses the challenges, benefits, and lessons gained from working with students from different cultures, inspiring new ways in which animation education in Asia can be conducted.
Manga character drawing styles differ greatly among artists. To accurately cluster faces within an individual manga, we propose a method to adapt manga face representations to an individual manga. We use deep features trained for generic manga face recognition, and adapt them by deep metric learning (DML) for the target manga volume. DML uses pseudo positive and negative pairs defined by considering page and frame information. We performed experiments using a dataset comprising 104 manga volumes and found that our feature adaptation significantly improved the accuracy of manga face clustering.
Object pose estimation from an RGB image is essential for accomplishing many computer vision tasks, such as augmented reality and robot vision for grasping. Using structure from motion and domain randomization, we propose a method that, from a set of images, allows us to quickly generate large datasets to train a Convolutional Neural Network (ConvNet) for object pose estimation.
We introduced color enhancement factors to control the spectral power distribution of illumination, which enabled us to enhance one or more colors at once while retaining the color appearance of white. In experiments, color enhancement factors corresponding to red, green, and blue were calculated using color patches of a color chart and used for controlling a sixteen-color LED lighting system. The color chart and an old wood-block print were illuminated by the modulated light from the lighting system. By changing only three parameters, each color was enhanced continuously and independently with metameric white and the color balance under daylight maintained.
Adding multiple tactile sensations is a quick and useful method of enhancing immersion in virtual reality (VR) games by providing strong feedback to players, such as kinesthesia and cutaneous feedback. Our previous work on SoEs [Chen et al. 2016] and BoEs [Han et al. 2017] has shown the potential of haptic feedback for VR controllers that provide feedback through heat, air, vibration, and reaction force. Additionally, our other work on AoEs [Han et al. 2017] and AoEs+ [Han et al. 2018] employed integration devices to simulate two virtual environments simultaneously in a room-scale physical space, each offering multiple tactile sensations.
We investigate deep learning-based super-resolution of digital comic images. Techniques effective for comic images and a new efficient model are presented.
It is natural to use our hands to interact with a virtual world, but this capability is not yet widely available. The Leap Motion controller has brought 3D hand tracking to consumers, but its high cost prohibits mass adoption, especially for users in developing countries. To facilitate mass adoption, we present a do-it-yourself wearable with a material cost of only 1 US dollar which, coupled with a webcam, provides 6-DoF (degrees of freedom) per-fingertip tracking in real time. We also propose a novel solution to the pose ambiguity problem of a single square planar fiducial marker in monocular view.
Deep generative models such as generative adversarial networks (GANs) have shown impressive achievements in computer graphics applications. A GAN is trained to learn the distribution of target data and can generate new samples similar to the original data. However, most GAN-based networks suffer from the mode collapse problem, generating samples from only a single or a few modes of the target data distribution. To address mode collapse, we propose to adopt an autoencoder to learn the target data distribution in an encoded space. An importance sampling scheme is used to collect fake and real data samples in the encoded latent space and to calculate the similarity of the two distributions in the real data space. Experimental evaluation against a state-of-the-art method on synthetic and MNIST datasets shows the potential of our approach for reducing mode collapse and generating samples from diverse modes of the target data.
A large dataset of outdoor panoramas with ground-truth sun position (SP) labels can be valuable training data for learning outdoor illumination. In general, the sun position (if present) in an outdoor panorama corresponds to the pixel with the highest luminance and contrast with respect to neighboring pixels. However, neither image-based estimation nor manual annotation can obtain reliable SPs because of the complex interplay between sunlight and sky appearance. Here, we present an efficient and reliable approach to estimate the SP of an outdoor panorama from accessible metadata. Specifically, we focus on outdoor panoramas retrieved from Google Street View and leverage the built-in metadata as well as a well-established Solar Position Algorithm to propose a set of candidate SPs. Next, a custom-made luminance model is used to rank each candidate, and a confidence metric is computed to effectively filter out trivial cases (e.g., cloudy days or an occluded sun). We extensively evaluated the efficacy of our approach in an experimental study on a dataset of over 600 panoramas.
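A hedged sketch of the geometric part of such a pipeline: mapping an SPA-derived sun direction into equirectangular pixel coordinates, assuming the panorama metadata provides the compass heading of the image centre. The field names and axis conventions below are assumptions, not Street View's exact schema.

```python
# Hedged sketch: projecting a sun direction (azimuth/elevation from a Solar
# Position Algorithm) into an equirectangular panorama. Assumes the centre
# column of the image faces the metadata heading and rows span +90..-90 deg
# elevation from top to bottom; conventions vary between sources.
def sun_pixel(sun_azimuth_deg, sun_elevation_deg, pano_yaw_deg, width, height):
    rel_az = (sun_azimuth_deg - pano_yaw_deg + 180.0) % 360.0 - 180.0  # [-180, 180)
    u = (rel_az / 360.0 + 0.5) * width                 # column: centre = heading
    v = (90.0 - sun_elevation_deg) / 180.0 * height    # row: zenith at the top
    return int(u) % width, max(0, min(height - 1, int(v)))

# Example: sun at azimuth 135 deg, elevation 40 deg in a 4096x2048 panorama
# whose centre faces due north (yaw = 0).
col, row = sun_pixel(135.0, 40.0, 0.0, 4096, 2048)
```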
While recent graphics technology creates surprisingly realistic content, and such artificial creatures support immersive virtual experiences, humans can still recognize whether observed visual information is real or a graphics model. In this work, we propose a deep-learning-based classification method that hunts out graphics images among real images. To employ a deep learning approach, we built a graphics-real image dataset consisting of around 25K images. Quantitative classification and qualitative graphics-image hunting results are presented, which support interesting applications such as fake image detection and image realism enhancement.
Motion analysis and recognition frequently suffer from noisy motion capture data, not only because of the systematic noise of imaging devices but also because of motion-dependent non-systematic errors such as self-occlusion and failures in extracting motion dynamics from visual data. In this work, we propose a motion regeneration method that extracts only the statistically significant and distinct characteristics of human body motion and synthesizes new motion data. To this end, we convert 3D human body motion into a 2D motion texture that can be fed directly to a well-trained deep convolutional network. An autoencoder is trained on our 2D motion textures to learn, in the encoded space, only the essential characteristics of human body motion, discarding systematic noise and unexpected non-systematic errors that have nothing to do with the description of a particular motion. To verify the effectiveness of the regenerated motion, we perform a motion classification test on a public body motion dataset using our Long Short-Term Memory (LSTM) based method.
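A minimal sketch of the kind of 3D-motion-to-2D-texture conversion described above, assuming joint positions of shape (frames, joints, 3); the paper's actual normalization and layout may differ.

```python
# Hedged sketch of converting a 3D body-motion clip into a 2D "motion texture"
# that a convolutional autoencoder can consume.
import numpy as np

def motion_to_texture(motion):
    """Map (frames, joints, 3) joint positions to a (joints, frames, 3) image in [0, 1]."""
    # Root-centre each frame so the texture encodes pose, not global translation.
    centred = motion - motion[:, :1, :]
    # Per-channel min-max normalization into [0, 1], like an 8-bit RGB image.
    mins = centred.min(axis=(0, 1), keepdims=True)
    maxs = centred.max(axis=(0, 1), keepdims=True)
    normed = (centred - mins) / np.maximum(maxs - mins, 1e-8)
    # Rows = joints, columns = time, channels = x/y/z.
    return np.transpose(normed, (1, 0, 2)).astype(np.float32)

clip = np.random.rand(120, 21, 3)     # toy clip: 120 frames, 21 joints
texture = motion_to_texture(clip)     # shape (21, 120, 3), ready for a 2D CNN
```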
Open Illimitable Space System is a suite of configurable tools and the open source core of Illimitable Space System v2, which exhibits multimodal interaction and provides a platform for artists to enhance their performances by leveraging modern technology. It is an experimental C/C++ framework with wrappers and APIs for motion capture and other interactions. Recently, we extended OpenISS by enabling SOAP and REST APIs, bringing us closer to a more flexible and scalable architecture that can be made available as an interactive broadcast service over the Internet to a wider audience.
In April 2018, we gave a real-time live streaming demonstration to a team of Ubisoft representatives together with the ISSv2 project, presenting Microsoft Kinect-as-a-service prior to Microsoft's announcement of Kinect for Azure. During the demo, we served 43032 requests, up to 1474 requests per minute, with a total of 19 visitors. A subsequent real-time live streaming demonstration was given at the ChineseCHI workshop later in April at the CHI2018 conference, with a live application of the service in a dance performance.
To build an efficient processing scheme for a camera-based moving object detection function, we propose a method that evaluates the acceleration tolerance of the camera's motion. The motion of the camera causes background optical flow in the camera images, which hinders detection of actually moving objects. Effective background optical flow cancellation requires precise knowledge of the camera motion, which is estimated from the position and attitude of the vehicle on which the camera is mounted. In a typical measurement subsystem, this information is fetched at a fixed cycle that is independent of the moving object detection process. The gap between the fetch cycle and the processing cycle can affect the result of background optical flow cancellation. In this paper, we analyze the relationship between this cycle gap and the allowed motion acceleration of the camera, and then verify the result on an actual moving object detection system.
In this study, we developed a historical streetscape simulation system for local areas. In recent years, the loss or replacement of regional history and culture has become a pertinent issue in Japan owing to urbanization, depopulation, a declining birthrate, and an aging population. Under the cultural property law, measures are being taken to conserve and use cultural properties according to characteristics such as tangible cultural property, intangible cultural property, folk cultural property, monument, cultural landscape, and traditional building group. However, with changes in social environments, the historical culture of an area, previously cultivated and conveyed through the long history of local residents, has become difficult to pass on. In particular, things that have not been designated as cultural properties become buried or lost. To mitigate this problem, this study focuses on the historical cultural landscape of local areas and develops a landscape simulation system for communicating and passing it on in an easy-to-understand manner. Attempts have previously been made to reproduce historical landscapes by real-time rendering with 3D computer graphics (CG) and virtual reality technology [Fukuda et al. 2015; Boeykens 2011]. However, owing mainly to hardware limitations on drawing, these systems focused on single buildings; they did not reproduce city-level dynamics. Meanwhile, studies have been conducted to achieve urban-scale reproductions [Dylla et al. 2008; Jacobson 2005]. Because such large-scale developments are costly, only famous places are typically selected as target areas. In a previous study, we developed a historical landscape simulation system for a streetscape of the late Edo period. This system was installed as a permanent exhibition at regional museums, where it has been running stably. In the present study, we aim to extend the scale of this system in response to user requests and feedback.
To design disaster prevention plans during a tsunami, it is necessary to achieve consensus through collaboration between local governments and residents in terms of evacuation sites and evacuation routes. In this study, we implement a system that simulates evacuee behavior during a tsunami using massive agents and develop free and easily accessible tools that can be used by local governments and residents for disaster prevention.
Architectural decorative painting of floral patterns on the surface of the Chinese traditional palace buildings is a sophisticated and time-consuming process. An ancient Chinese building code, titled "Yingzao Fashi", describes a variety of decorative floral patterns in the Song Dynasty (960--1279). It is difficult to draw those floral patterns using existing digital vector graphics software because of their complexity. We developed a floral pattern synthesis system to ease the drawing process of decorative floral patterns. In this paper, a case study is presented to demonstrate that the proposed system is used to automatically draw the traditional Sea Punica Granatum architectural decorative patterns. The user moves the mouse cursor to draw the path of a stem. Then the leaves and flowers are synthesized. In addition, collision detection is implemented to control leaf density. Vector graphics pre-loading and threading techniques are used to achieve 0.3 ms rendering speed for user interaction.
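The collision-detection step that controls leaf density can be illustrated with a simple rejection test along the drawn stem path; the radii, spacing, and function names below are illustrative assumptions, not the system's actual parameters.

```python
# Hedged illustration of collision-based leaf density control: leaves are placed
# along the user-drawn stem path and rejected when they would overlap an
# already-placed leaf.
import math

def place_leaves(stem_path, leaf_radius, min_gap=0.0):
    """stem_path: list of (x, y) samples along the stroke. Returns accepted leaf centres."""
    placed = []
    for x, y in stem_path:
        too_close = any(
            math.hypot(x - px, y - py) < 2 * leaf_radius + min_gap
            for px, py in placed
        )
        if not too_close:
            placed.append((x, y))
    return placed

# Example: a straight stem sampled every 2 units; with radius-5 leaves the
# accepted centres end up roughly 10 units apart, so dense mouse sampling
# does not over-crowd the stem.
stem = [(i * 2.0, 0.0) for i in range(50)]
leaves = place_leaves(stem, leaf_radius=5.0)
```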
In 3D printing, using as little supporting material as possible is critical for efficiency and material saving. Although both model partition methods and multi-DOF (degree of freedom) 3D printers have been developed independently to tackle this problem, few existing approaches combine partition methods with multi-DOF printers. We present a global-optimization-based model decomposition method for multi-DOF 3D printers that achieves consistent printing with the least supporting material. We minimize the surface area that needs support, with the printing sequence determined inherently by a global single-objective optimization. We first describe the printing constraints of a multi-DOF 3D printer and then propose an optimization framework that satisfies these constraints. Experiments show the merits of our method.
Recently, research on modeling realistic hair from a real-life portrait image has achieved extraordinary results. Modeling cartoon hair from an anime-style image, in contrast, has not drawn much attention from researchers. Yeh et al. [2015] proposed a 2.5D approach that determines a layering order and estimates hidden portions of hair segments from an anime-style image. However, their results are only suitable for a single view angle. In this work, we propose a novel approach to model cartoon hair from an anime-style image that simulates an artist's workflow and supports a broader range of view angles than the method proposed by Yeh et al. [2015].
Although traditional computer-aided design (CAD) systems are mainly intended for expert users, research involving systems incorporating CG and interactive techniques that are easy to use by novices is also active. In this paper, we propose a design support system that can be used by a novice to easily design a craft band object of his or her desired pattern. We propose an algorithm that can automatically calculate geometric shapes based on rectangular parallelepipeds and cylinders according to the sizes desired by users (Fig. 1).
We propose a novel deformation structure that expands along the x-, y-, and z-axes simultaneously using the Ron Resch pattern. The Ron Resch pattern, devised by Ron Resch, is a folding pattern composed of combined triangles that expands along the x-, y-, and z-axes simultaneously and shows a particularly high expansion coefficient along the z-axis [Resch et al. 1974; Resch and Christiansen 1970].
We present PanoAnnotator, a semi-automatic system that facilitates the annotation of 2D indoor panoramas to obtain high-quality 3D room layouts. Observing that fully-automatic methods are often restricted to a subset of indoor panoramas and generate room layouts of mediocre quality, we instead propose a hybrid method that recovers high-quality room layouts by leveraging both automatic estimation and user edits. Specifically, our system first employs state-of-the-art methods to automatically extract 2D/3D features from the input panorama, from which an initial Manhattan-world layout is estimated. The user can then further edit the layout structure via a set of intuitive operations, while the system automatically refines the geometry according to the extracted features. The experimental results show that our automatic initialization outperforms a selected fully-automatic state-of-the-art method in producing room layouts with higher accuracy. In addition, our complete system reduces annotation time compared with a fully-manual tool for achieving the same high-quality results.
A 3D point cloud is a collection of unordered, sparse 3D points, unlike a densely structured color image. Therefore, applying a fixed-structure deep learning network to 3D point clouds is a challenging task in computer vision and graphics. Recently, researchers have proposed deep learning methods for 3D point clouds based on data conversion or simplification. However, they lose either local 3D shape information, for the sake of simplicity, or geometric locality, by using an array as input. In this paper we propose a new convolution technique for 3D point cloud description, named Polypod convolution, that is distribution independent and preserves both local and global 3D shape. Quantitative and qualitative evaluation results show the potential of our new network for 3D point cloud based deep learning applications.
Because of their variety in shapes, textures, and ecologies, insects have long been an important subject of natural science. Reconstructing the three-dimensional (3D) shapes and textures of insects in digital form has various benefits, such as freedom from deterioration, high space efficiency, and high accessibility. Beyond the scientific purpose, highly realistic insect models are also useful for creating graphics scenes, since insects are commonly seen in daily life.
A time-of-flight depth camera assumes that the target scene consists of opaque surfaces. Therefore, any translucent object, which causes a multipath problem in depth calculation, cannot be appropriately reconstructed. If we can detect translucent surfaces under the time-of-flight principle, the depth map obtained from opaque regions becomes more reliable, and translucent surfaces can be recovered afterward by a separate approach. In this paper, we propose a translucent surface detection method that uses multiple depth images obtained from different viewpoints. First, multiple depth maps are registered in 3D space based on the camera transformations. Our method classifies surfaces into three types: opaque, translucent, and undetermined. Raycasting through the registered depth maps investigates overlapped surfaces and identifies the respective surface types. Experimental evaluation on both synthetic 3D models and a real translucent object shows promising translucent surface detection results.
Decorative patterns are common in traditional Chinese architecture, as shown in Figure 1. However, scalable vector graphics (SVG) is not capable of representing interweaving and penetrating patterns. In this paper, we develop a web-based vector graphics editing system for interweaving and penetrating patterns. We propose a data structure for handling interweaving and penetration that allows users to assign a depth value to each edge of a polygon. As a result, when a polygon is clicked and moved to interweave with another one, the intersecting edge is calculated using linear interpolation of the depth values. In contrast, the conventional SVG format arranges two polygons on separate layers for interweaving and penetration; users need to split a polygon into multiple polygons and assign them to different layers. As shown in Figure 2, the proposed system handles interweaving and penetration intuitively and maintains the topology of the polygons. After editing, the system allows the user to save the drawing in both the standard SVG format and the proposed augmented depth-value format for future editing.
We present a load-sensitive surface that provides the exact 3D contact position and the force vector of a user's touch from low-cost load sensors, without any prior knowledge about the shape of the object. A load-based approach is advantageous because it provides a force vector; however, there is a mathematical ambiguity in the touch position along the line of action.
To address this problem, we develop an algorithm based on a pseudo-inverse matrix framework, utilizing the fact that users' hands always fluctuate slightly and unconsciously during touch interaction. In this paper, we introduce the design space of the surface, an outline of the algorithm, and highlighted applications. This technique for localizing the touch position in physical space enables us to design tangible interaction with inert everyday objects and to analyze user activities happening on those surfaces.
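For intuition, a minimal sketch of the underlying force/moment balance: for a purely normal load on a flat plate instrumented with load cells, the contact point is the centre of pressure, which coincides with the pseudo-inverse solution of the moment-balance system. The authors' full algorithm additionally exploits the small hand fluctuations to resolve the line-of-action ambiguity, which this sketch does not reproduce.

```python
# Hedged sketch of touch localization from load sensors (normal load only).
import numpy as np

sensor_xy = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.2], [0.3, 0.2]])  # metres
readings = np.array([1.2, 0.4, 2.1, 0.8])                               # newtons (vertical)

# Centre of pressure: force-weighted mean of the sensor positions.
total_force = readings.sum()
cop = (readings[:, None] * sensor_xy).sum(axis=0) / total_force

# Equivalent least-squares statement: find p such that a single point load of
# magnitude total_force at p produces the measured moments about the axes.
A = total_force * np.eye(2)
b = (readings[:, None] * sensor_xy).sum(axis=0)
p = np.linalg.pinv(A) @ b          # identical to cop in this simple case
```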
Projection-based mixed reality (MR) is a method of superimposition that projects virtual information onto the real environment using a projector. In previous studies, a method using a pico projector as an operation device was proposed; however, its projection range and the continuity of information presentation are limited. In this research, we hand over the projection from the pico projector to a ceiling projector installed in the environment, so that the projected information can be left in place. In this manner, multiple pieces of virtual information can be presented simultaneously and continuously in the real environment.
In this paper, for the purpose of preventing obesity, we developed a chewing amount control system that adds visual and auditory information during meals. Use of this system suggested that the quality of chewing was improved and the sense of fullness was enhanced.
As illustrated in Figure 1, the zheng has been one of the most representative plucked-string musical instruments in East Asia for over three thousand years. Although it is still popular, several factors block its spread. The instrument is precious and not suitable to carry along. Moreover, it is hard to learn on one's own.
This paper presents a study aimed at creating an aerial mixed reality environment by using a first-person-view drone. We develop a drone with a stereoscopic camera based on human binocular vision. Then, we present a novel mixed reality environment with bidirectional interaction between the real and virtual worlds. Our approach is effective for perceiving the depth of obstacles and provides a safe and exciting environment for drone flying.
A Virtual Reality project of the Fukushima Dai'ichi nuclear power plant was designed, modelled and programmed as a Synthetic Learning Environment (SLE) for education purposes, motivated by the fact that the disaster of March 2011 revealed much about Japan's lack of preparedness for nuclear accidents. An iterative process of design, make, share and reflect was adopted by the student developers. In Japan, the creative process is termed TKF: Tsukutte つくって; Katatte かたって; Furikaeru ふりかえる.
Finger motion is produced by a combination of forearm (extrinsic) muscles and hand (intrinsic) muscles. We have created a wearable fingerless glove controller that senses sEMG (surface electromyography) from intrinsic muscles using dry electrodes [Tsuboi et al. 2017]. Recognizing air-tapping gestures with a sensor attached to a wearable fingerless glove controller is a challenging problem. In this study, we focused on recognizing air-tapping motion using a CNN and evaluated its accuracy. The accuracy of intra-subject identification was 85.05%. Experiments are also being conducted in anticipation of character input in VR space [Grubert et al. 2018]. As a preliminary experiment on the use of the sEMG glove in VR space, we carried out a character input experiment in VR using the sEMG wearable fingerless glove controller. Based on the results, we discuss the efficiency of character input with the sEMG glove in VR space.
Most people can voluntarily control vergence eye movements. However, the possibility of using vergence as an active input remains largely unexplored. We present a novel human-computer interaction technique that allows a user to control the depth position of an object through voluntary vergence of the eyes. Our technique is similar to the mechanism for seeing the intended 3D image of an autostereogram, which requires cross-eyed or wall-eyed viewing. We invite the user to look at a visual target mounted on a linear motor and then consciously control their eye convergence to focus at a point in front of or behind the target. A camera measures the eye convergence and dynamically controls the motion of the linear motor based on the measured distance. Our technique can enhance existing eye-tracking methods by providing additional information in the depth dimension, and it has great potential for hands-free interaction and assistive applications.
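For reference, the standard symmetric-vergence geometry (an illustrative assumption, not a formula taken from the paper) relates the measured vergence angle to the fixation distance the linear motor should track.

```latex
% With interpupillary distance b and vergence angle \theta between the two lines
% of sight, the fixation point on the midline lies at distance
d \;=\; \frac{b}{2\,\tan\!\left(\theta/2\right)},
% so a larger measured convergence angle commands a nearer motor position and a
% smaller one commands a farther position.
```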
In this paper, we propose a human-marionette interaction system to enhance the interactivity of marionette shows. The proposed system is composed of a mechanical arm, an L-shaped screen, a Kinect, a computer, and audio equipment. Using gesture and voice recognition, the system recognizes the audience's gestures and voice to control the marionette and the audiovisual effects of the stage. Our system enables audience members to enjoy personalized human-marionette puppetry and to take on the role of performer.
We propose a novel system that enables a user to see stereoscopic 3DCG images in mid-air and interact with them directly, as shown in Figure 1. The system displays 3DCG objects with motion parallax, so the user can observe them in mid-air while perceiving a stereoscopic effect from the motion parallax. It is also possible to interact with the mid-air 3DCG objects with the fingers: the user can move, deform, and draw 3DCG objects as if they were really there.
In this paper, a "gaze navigation" method for an interactive visual narrative application is proposed, and a prototype system, developed for touchscreen computer devices, such as the iPad, is described. An interactive narrative application called "Cloudy Lady" authored by Negar Kaghazchi has an ethic story and the user naturally follows the visual cues applied in every scene, to unravel the story by exploring the surface area using omnidirectional scrolling. (Figure 1)
Unlike comic or picture books that run over many pages, or video/computer games with pre-designed paths marked by arrows and symbols, the story of this narrative unfolds through natural gaze navigation, and its structure allows the user to advance the story in any phase and any direction without restriction.
In this paper, we propose a rapid prototyping system with a combination of blocks that change shape and firmly adhere to each other. We focused on PCL (polycaprolactone), which is a plastic characterized by a low melting point, as a block material. PCL blocks can be transformed and bonded many times by melting them with hot water. In this research, we implemented a system that transforms external data into a block diagram and creates blocks. It enables rapid prototyping with more flexibility and stronger adhesion than ordinary block assembly.
This article explores the possibilities of a light and inexpensive way of providing haptic feedback through texture simulation using tactile vibration. With the popularization of virtual reality, the field of haptic feedback is in turmoil. The goal of this study is to present a moderately realistic but cheap way of simulating haptic feedback, including the texture of a surface, and to propose a system accessible to the great majority.
Jogging is a fundamental activity underlying various sports and is familiar as a daily exercise. Being able to do it alone, anytime and anywhere, is important, and motivation is a key issue for initiating and maintaining such exercise. The presence of others tends to improve physical performance, a phenomenon called social facilitation: individuals perform better on simple or well-rehearsed tasks in the presence of others. The presence of an attentive spectator makes a jogger's pace faster than in a no-spectator condition [Strube et al. 1981]. Furthermore, competition with a superior partner motivates the jogger and increases effort, which is called the Köhler effect. The presence of others is thus effective for exercise; however, it is difficult to find such a partner. Several virtual reality exercise systems have been studied to bridge this gap between individual and pair/group exercise. Software-generated virtual partners have already been applied to stationary biking, plank exercise, and rowing exercise with an immersive head-mounted or simple stationary display. People prefer changes in their surroundings while jogging. Mueller et al. tried to realize an outdoor jogging experience with a drone [Mueller and Muirhead 2015] and with a distant jogger heard through his or her voice [Mueller et al. 2010]. Jogging with an autonomous robot and a runner participating from another place could connect joggers with an appropriate partner anytime and anywhere. However, fully controlling a drone in an environment crowded with people and buildings is difficult, and voice communication attracts much attention from the surroundings.
This study aims to propose a new educational tool for interacting with artificial life (ALife). The proposed system consists of an ALife behavior simulation and a tactile aerial display with a combination of aerial imaging by retro-reflection and a force-feedback device. The content is mid-air interaction with a simulated school of fish-like ALife. The behavior is based on a modified BOIDs algorithm with a predator-prey relationship between different species. A user can enjoy interaction with the fish through aerial imaging and haptic feeling. The user can feel the activity of ALife as a force field using haptics.
Tangible interaction with customized products integrating sensors and actuators has recently grown into an interdisciplinary research area spanning computer graphics and human-robot interaction (e.g., [Groeger et al. 2016; Yu et al. 2018]). In this paper, we introduce tangible interaction into 3D printed modular robots. Our user study demonstrates that interacting with our robots can effectively improve human spatial ability, which plays an important role in a person's development in science, technology, engineering, and math (STEM).
The Tentacle Flora is a robotic sculpture inspired by the vision of a colony of sea anemones growing on coral. A shape-memory alloy actuator composed of BioMetal Fiber is used for each tentacle so that it can bend in three directions. The top of each actuator glows softly with a full-color LED, mimicking a bioluminescent organism. The Tentacle Flora evokes the beauty, wonder, and presence of living sea anemones in the depths of the ocean.
We propose a novel method to generate alphabet collage art from a single input image by replacing the partial curves of the image with the best-matched shape of alphabet letters. The salient structure of the image is preserved, and the contour is reconstructed with letters. In our framework, we first segment the input image into regions and extract the primary curve from each letter. Second, we analyze the structure of the region contour and the curve of a letter for finding the relationship between the salient contour in the image and the structure of the glyph of letters. We propose a modified partial curve matching to generate a stylized collage result with alphabet letters.
Dancheong is designed to decorate various parts of wooden buildings with beautiful and majestic colors. The painting process involves a stage called Cheoncho, in which many holes are punched into paper along a distinctive design pattern drawn on it so that the pattern can be transferred to a building part. To perform the Cheoncho process, a craftsman punches the holes one by one with a needle, repeating this action millions of times. To reduce this kind of time-consuming work, we propose a system that automatically performs the Cheoncho process, assisting a craftsman in copying the desired pattern to the target building part easily and accurately.
Lossy GPU-native compression formats such as BCn, ETC, and ASTC have been widely used to reduce the video memory footprint of textures. A downloadable GPU application such as a mobile 3D game should select the most suitable compression format and resolution for each texture, taking the target devices' varied capabilities and the download volume into consideration.
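One way such a per-texture selection could be organized is sketched below; the specific rules, thresholds, and function names are illustrative assumptions rather than the authors' actual policy, although the listed formats (ASTC, ETC2, BC1/BC7) are real GPU-native options.

```python
# Hedged sketch of a per-texture format/resolution selection heuristic.
def pick_format(has_alpha, needs_high_quality, supports_astc, supports_etc2):
    if supports_astc:
        # ASTC covers RGB and RGBA; smaller block sizes trade size for quality.
        return "ASTC 4x4" if needs_high_quality else "ASTC 8x8"
    if supports_etc2:
        return "ETC2 RGBA8" if has_alpha else "ETC2 RGB8"
    # Desktop-class fallback: BC7 handles alpha well, BC1 is the small RGB option.
    return "BC7" if (has_alpha or needs_high_quality) else "BC1"

def pick_resolution(base_size, download_budget_mb, estimated_mb_at_base):
    # Halve the texture until its estimated compressed size fits the download budget.
    size, est = base_size, estimated_mb_at_base
    while est > download_budget_mb and size > 64:
        size //= 2
        est /= 4.0            # compressed size scales roughly with pixel count
    return size
```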
Recent advances in computational photography have enabled the creation of images that contain additional attributes. However, capturing images of objects concealed behind obstructions or outside a camera's field of view is a challenge. We designed an optical transformation system that utilizes a conventional camera, a concave lens, and Micro-Mirror Array Plates (MMAPs) to enable images to be captured through small holes in walls or other obstructions. Our experimental prototype demonstrated that it was possible to capture images of the area on the other side of a wall through a 3-mm hole. Our system could be used to capture images from places where it is difficult to position a camera, such as within rubble in disaster areas.
Recently, immersive virtual reality (VR) environments using head-mounted displays (HMDs) have attracted attention as a new growth market owing to reasonable consumer prices and high accessibility compared with other VR devices. However, users perceive a cognitive mismatch caused by low-resolution images, which makes long sessions difficult. To address this, transmission techniques based on image resolution conversion have been studied. In this paper, we propose a novel foveated super-resolution convolutional neural network (SRCNN) for HMDs that uses an object tracking algorithm to reduce the computational load of rendering high-resolution images. We apply object tracking to the foveal region to compensate for the frame processing speed of eye-tracking devices, which is relatively slow for applying the resolution conversion. The SRCNN is applied to the cognitive (foveal) region, and standard interpolation is applied to the other regions to reduce the rendering cost. As a result, the computation is decreased by 90.4059%, and the PSNR is higher than that of a conventional foveated rendering algorithm.
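A hedged sketch of foveated upscaling of the kind described: the tracked foveal region is upscaled with a super-resolution model while the periphery uses cheap interpolation, and the two are blended with a radial mask. Bicubic zoom stands in for the SRCNN here purely so the sketch runs; it is not the paper's network.

```python
# Hedged sketch of foveated super-resolution compositing.
import numpy as np
from scipy.ndimage import zoom

def foveated_upscale(low_res, fovea_center, fovea_radius, scale=2):
    sr_placeholder = lambda img: zoom(img, (scale, scale, 1), order=3)  # stand-in for SRCNN
    cheap = zoom(low_res, (scale, scale, 1), order=1)                   # bilinear periphery
    fine = sr_placeholder(low_res)
    h, w = cheap.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    cx, cy = fovea_center[0] * scale, fovea_center[1] * scale
    dist = np.sqrt((xx - cx) ** 2 + (yy - cy) ** 2)
    # Smooth falloff from 1 inside the fovea to 0 outside it.
    mask = np.clip(1.0 - (dist - fovea_radius * scale) / (0.5 * fovea_radius * scale), 0, 1)
    return mask[..., None] * fine + (1 - mask[..., None]) * cheap

frame = np.random.rand(270, 480, 3).astype(np.float32)        # toy low-res frame
out = foveated_upscale(frame, fovea_center=(240, 135), fovea_radius=60)
```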
The practical use of virtual reality (VR) is expected to extend to a wide range of applications after its successful deployment in the entertainment sector. Currently, the domain of VR applications crosses the boundary between the real world and the virtual world. Therefore, we propose a real-virtual bridge, a conceptual model that can be used to mediate between real and virtual objects. We introduce the concept and architecture of the real-virtual bridge and describe two implementations of the bridge on smartphones and microscopes. Although concepts similar to a real-virtual bridge exist in science fiction works, we explicitly define its architecture in this paper. We believe that a real-virtual bridge promotes emerging applications by integrating real and virtual objects.
The scientific visualization of a chromatophore, a photosynthetic organelle, required the creation of a software integration pipeline to combine scientific software, visual effects tools, and custom camera choreography software. Furthermore, the rendering of this visualization in both fulldome and 4K3D format was done using a custom supercomputer rendering pipeline.
Yuki-tsumugi is a traditional Japanese silk fabric. In its production, a splashed pattern based on a picture is first laid out on a sheet of dedicated grid paper. Next, the piece goods are woven based on this pattern plan. Finally, a kimono is produced from the piece goods. However, estimating how the appearance will change at each step of production is difficult. This is because a pattern plan is created by combining basic Kasuri patterns, the yarn weaving rules must be considered, and the piece-goods pattern can be cut in an infinite number of ways.
Vulkan is the most recent graphics rendering API, designed mainly for the benefit of gaming and highly intensive workloads where the CPU becomes the bottleneck. While Vulkan is quite amenable to games and other heavy rendering tasks, its use is often discouraged for 2D applications rendering and editing videos with animated effects. In this paper, we present the first Vulkan-based animation and effects engine for mobile video rendering to test these claims. We compare our solution with the preloaded video editor applications in mobile phones. We see a significant improvement in all regards, with a 30 FPS playback for 4K videos achieved using 30% less memory and 20% less power.
The analysis of bipartite networks is critical in many application domains, such as studying gene expression in bio-informatics. One important task is missing link prediction, which infers the existence of new links based on currently observed ones. However, in practice, analysts need to utilize their domain knowledge based on the algorithm outputs in order to make sense of the results. We propose a novel visual analysis framework, MissBi, which allows for examining and understanding missing links in bipartite networks. Some initial feedback from a management school professor has demonstrated the effectiveness of the tool.
Heat maps are an important tool for eye tracking data analysis and visualization. They intuitively express the areas watched by an observer, but conventional methods ignore the saccade information that expresses gaze shifts. Building on the conventional approach, this paper presents a novel heat map generation method for eye tracking data. The proposed method uses a mixed data structure of fixation points and saccades and adds a heat map deformation calculation driven by the saccade data. The proposed method has the advantage of indicating the gaze transition path while also visualizing the gaze region. Related applications will benefit from this work.
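A minimal sketch of the mixed fixation/saccade accumulation idea: fixations deposit Gaussian heat weighted by duration, and saccades deposit attenuated heat along the path between consecutive fixations. The kernel size and weights below are arbitrary illustrative choices, not the paper's deformation model.

```python
# Hedged sketch of a fixation + saccade heat map.
import numpy as np

def heat_map(fixations, size, sigma=25.0, saccade_weight=0.3, samples=20):
    """fixations: list of (x, y, duration). Returns an (H, W) float heat map."""
    h, w = size
    yy, xx = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w), dtype=np.float32)

    def splat(x, y, weight):
        # Isotropic Gaussian splat centred at (x, y).
        return weight * np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))

    for i, (x, y, dur) in enumerate(fixations):
        heat += splat(x, y, dur)
        if i + 1 < len(fixations):                        # saccade to the next fixation
            nx, ny, _ = fixations[i + 1]
            for t in np.linspace(0.0, 1.0, samples):
                heat += splat(x + t * (nx - x), y + t * (ny - y), saccade_weight / samples)
    return heat / heat.max()

hm = heat_map([(100, 80, 0.4), (320, 200, 0.7), (150, 300, 0.3)], size=(400, 600))
```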
We propose two simple and efficient visualization techniques for assisting understanding of complex three-dimensional structures like human anatomy: (1) applying Screen Space Ambient Occlusion (SSAO), Depth of Field (DoF), and Depth Cueing (DC) to an original rendering result image in real-time, and (2) adding "caps" to thin polygonal tube structures which results in pseudo-thick, hollow structures with a small amount of additional polygons.
City-scale scenes often contain more geometry and texture than can fit in GPU memory at once. Our ongoing work seeks to minimise texture memory usage by streaming only view-relevant textures and to improve rendering performance using the parallelism offered by Vulkan, the latest generation of graphics API. Our results present high-performance rendering of a city with streaming textures after CPU and GPU culling.
Students commencing tertiary studies in life sciences over the next decade are going to arrive at universities with a more advanced digital literacy and shifting expectations of how technology supports learning. This poses a challenge to traditional approaches to tertiary science education. As the increasingly digitally literate students arrive, and the barrier to entry to Virtual Reality (VR) drops, there is an opportunity to develop new learning activities in VR. Students may learn by using VR, but students can also explore complex scientific concepts by creating their own VR content, while developing digital and creation fluencies. We describe how first-year students explored biological science concepts with a series of learning activities based in content creation for VR. We outline the feedback students provided on these activities and how this feedback informed further development of the learning activities.
Automatic colorization is a significant task, especially for the anime industry. An original trace image to be colorized contains not only outlines but also the boundary contour lines of shadows and highlight areas. Unfortunately, these lines tend to decrease the consistency among images. Thus, this paper provides a cleaning pre-process for anime datasets to improve the prediction quality of a fully convolutional network, and a refinement post-process to enhance the output of the network.
We propose a system of real-time measurement and visualization for three-dimensional (3D) sound field by using the optical see-through head mounted display (OSTHMD) with simultaneous localization and mapping (SLAM). By using an estimation of spatial mapping, the system achieves free movement of measurement positions in a broad area without multiple AR markers. Visualizing the 3D sound intensity of an entire room by the proposed system helps us to design the sound field within a space.
We present a new and novel graph visualization technique designed specifically for virtual reality (VR). Ring graphs organize graph nodes by categorical attribute along rings that are placed in a spherical layout. Links between nodes are drawn within the rings using an edge bundling technique. This 3D placement of data takes advantage of the stereoscopic environment that VR offers. We conducted a user study comparing our ring visualization to a traditional node-based graph visualization and found that the ring graph had higher usability, both in terms of accuracy in completing a set of tasks and in lower task completion time.
Animation production, once entirely hand-drawn, has increased its speed and reduced its costs by applying NPR technology. For example, there are research efforts and products for generating animations of waving hair from a single image in which a still scene of an animated character is drawn. However, animations created by previous methods often feel strange compared to hand-drawn animation.
Because previous methods do not take the techniques of experienced animators into account, a huge amount of time and high skill are necessary to make the generated scene closer to hand-drawn animation. Therefore, we developed a tool that reproduces the techniques used by animators to describe the waving behavior of hair, for the purpose of generating animation without this strangeness. The parameters needed for our method can be determined automatically using a kind of genetic algorithm.
In this paper, we propose an interface to support post-match play-by-play analysis of a hand-to-hand fighting game based on the two players' eye movements. In the domain of e-Sports, the "fighting game" genre refers to hand-to-hand combat games in which two players fight each other by manipulating their respective martial artist characters within the same game screen. An e-Sports match, like a professional chess match, is followed by analysis and commentary about the performance of the players. In this study, we constructed an interface for visualizing information about the match based on the players' eye movements to facilitate post-match play-by-play analysis and commentary. Our interface highlights commonalities and differences in the areas on the screen where the players focus their attention, as well as commonalities and differences in the direction of their eye movements.
In this paper, an image photographed by a light field camera is reconstructed on a light field display and projected in mid-air using a Fresnel lens and aerial imaging by retro-reflection (AIRR). This allows a user to face the 3D aerial image of a user in another location and hold a realistic conversation.
Recent Japanese animation is progressively using three-dimensional computer graphics (3DCG). However, Japanese animation created according to the Japanese traditional method called "limited animation" is different from photorealistic motion of 3DCG. In particular, hair motion obtained via this method is different from that obtained by physical calculation. In this study, we formulate a method of hair motion of traditional Japanese hand-drawn animation.
dongSpace is a wide-area multiplayer interactive game system based on mixed reality technology. The system uses a head-mounted display to track the user's posture, allowing the user to move across more than 2,000 square meters of outdoor environment. A laptop carried by the user renders the game screen in real time, providing high-definition, wide-field-of-view content. The system also provides users with a tracked hand-held simulation gun that can be used to shoot or interact with virtual environment objects. The system integrates display, tracking, interaction, and computing devices, which breaks through the limitations of traditional MR systems.
In the world of virtual reality, the task of constructing a playback data set to navigate through a scene has traditionally required a particularly inefficient procedure. The conventional method of taking pictures and videos with a pinhole camera model is costly due to the slow run time and memory space required. We propose a method which takes advantage of a less costly setup and improves the visual quality of the final images. This method allows users to choose the desired viewpoint, as well as whether the output should be computed as a panoramic or perspective image. This entire procedure consists of four steps: structure from motion (SfM), image rectification and depth estimation, 3D reconstruction, and view synthesis.
This paper proposes an optical system that can capture a user from the viewpoint of a mid-air CG character. The mid-air imaging system enables us to display a CG character in real space. In order for users to interact with this character, we must observe their behavior from the character's viewpoint. Therefore, we propose a method of capturing from the mid-air image position by arranging a light source display and a camera at a conjugate position using a half mirror, optically transferring them with micro-mirror array plates. The contribution of this system is capturing the full face of a user from the position of the mid-air image.
Depth estimation in scene reconstruction remains one of the main issues in the world of virtual reality. We propose a method that uses the low-cost camera setups and procedures specified in previous papers for stereoscopic 360 imaging. However, instead of using angular disparity, we use the spherical radius to label depth values for scene reconstruction. The experimental results show that the reconstructed shape is less distorted when directly optimizing spherical radii than when optimizing angular disparities.
We propose a novel approach to applying realistic makeup over a diverse set of skin tones on mobile phones using augmented reality. Our method mimics the real-world layering techniques that makeup artists use. We can accurately represent the five most commonly used materials found in commercial makeup products: matte, velvet, glossy, glitter, and metallic. We apply skin smoothing to even out the natural skin tone and tone mapping to further blend the source and synthetic layers.
The dynamic range of a display is much lower than that perceived by the human eye. This problem has been studied from the perspectives of both photography and display [Debevec and Malik 1997; Hirsch et al. 2014].
In this work, we propose a novel VR software system aiming to disrupt the healthcare training industry with the first psychomotor Virtual Reality (VR) surgical training solution. Our system generates a fail-safe, realistic environment for surgeons to master and extend their skills in an affordable and portable solution. We deliver an educational tool for orthopedic surgeries that enhances the learning procedure with gamification elements, advanced interactivity, and cooperative features in an immersive VR operating theater. Our methodology transforms medical training into a cost-effective, easily and broadly accessible process. We also propose a fully customizable SDK platform able to generate educational VR simulations with minimal adaptation. The latter is accomplished by prototyping the learning pipeline into structured, independent, and reusable segments, which are combined to generate more complex behaviors. Our architecture supports all current and forthcoming VR HMDs and standard 3D content generation.