SPONSORED WHITE PAPER

Creating Entertainment Robots that
Bond with Humans

An ARM Interview by Jan Howells, Editor IQ online

BN-17 home robot

Doraemon to Become a Family Member in 2010?

IQ So what is your definition of an “entertainment robot?”

Haga We refer to low-cost products that have features for interacting with humans as “entertainment robots.” Toy manufacturers have been developing these products based on appealing characters from cartoons and comic books. Going forward, however, Bandai plans to create its own characters and develop them into mainstay products.
Entertainment robots hold great potential as an exciting new product line, with their unique character identities.

IQ One of those is Doraemon the Robot Version 1.0 (19,800 yen plus tax) released in March 2004. First I’d like to ask you about your aims for the Real Dream Doraemon Project (RDDP), the general-purpose robot platform to be finalized in 2010, and how the project is going.

Haga The development team is pursuing two major themes in RDDP. One is the development of a general-purpose platform for making robots based on characters that appear in comics and cartoons. The other is the establishment of a foundation for making the entertainment robot itself into a character.

Think of the general-purpose platform as a mechanism or vessel for adopting characters from cartoons and comics (in other words, content). One of our objectives is to see what kinds of different products Bandai can develop using the platform as a kind of “content container.” We believe the robots we create with this platform do not necessarily have to look like Doraemon to be successful.

Just so there’s no misunderstanding, RDDP is not a project to make a realistic robot based on the Doraemon cartoon character. The purpose of the project is to create the basic robot technology to bring Doraemon to life by 2010. Making a real Doraemon is something that cannot be done with current technology and will likely take many more years to accomplish.

BN-17 home robot and Yoshinori Haga, Director of Bandai Robot Laboratory Mechatronics Life Design Manager Technology Development Office, Bandai Co., Ltd.

IQ From what I understand, in addition to speaking about 750 different languages, Doraemon the Robot will have sensors in its head, tail and hands to sense its environment and will feature smooth movements. What will the mechanism on the inside look like?

Haga The software uses the engine from the BN-1, our earlier cat robot. The BN-1 recognizes objects using an infrared tactile sensor and runs away if it touches something. Doraemon the Robot will have a voice recognition device so it can recognize keywords spoken by humans, and we took special care to ensure that it can respond more flexibly. It will interact closely with humans, asking for help when it falls over, for example.

IQ What type of system is in place at your company for developing and producing robots?

Haga The Technology Development Office in which I work has about 20 members, most of whom are involved in mechatronics. However, there are no sections in the company that work solely on developing robots. We have a project called the Bandai Robot Laboratory, and there are five main members working on it, including myself. All of us work in other divisions and participate part-time on the project at the same time. There are no robot production plants within Bandai, so everything after designing the mold is done with the cooperation of other manufacturers in Japan and overseas.

IQ So your strategy, then, is to actively incorporate outside expertise?

Haga We use lots of products and technology from other companies to develop our robots, starting with the engines for voice and image recognition. Take image recognition technology, for example. Recently we adopted the Evolution Robotics software platform (ERSP). We form alliances to implement superior technologies that are already on the market and develop what isn’t available on our own. Another example is our collaboration with audio manufacturer Kenwood to improve the sound quality in Little Jammer, an entertainment robot that performs music in a band with other robots.

Autonomous behavior that sequential logic control can’t deliver

IQ You’ve been involved in robot development at Bandai from the outset.

Haga I was fortunate in that, when I started working at Bandai, I was put into an environment where ideas were valued and could be developed with very little constraint. This process was used to create the robots in the BN series. The BN-17, the latest in the series, is scheduled for release this year. It is a robot for the home, developed on the concept of a helping hand that works for the user when given a task. The user can easily write scripts for the robot to perform such tasks as moving towards an object until it is in sight and taking a picture of it, or downloading email and attaching pictures it took to a reply if a certain keyword is found in the message. No matter how cute a robot is, it can never take the place of a live pet. When we started development on the BN-17 we set out to make not just a pet, but a robot that could actually perform useful tasks for the owner.

Image of Doraemon on Character Street on the
northeast side of Bandai headquarters

IQ What is the background to development on the BN series?

Haga BN is the abbreviation for Bandai New Property, our project for new model development. All of the names of the main prototypes we developed as part of that project begin with the letters BN. Many of these models never make it to market, but are created simply to amass technology and expertise, or as part of our research to gauge the reaction of test users. The shape and specifications for each model in the BN series differ according to the theme. Some of them build upon the acclaimed features of previous models in the series, and others are developed from scratch with a completely different concept. However, all of them are developed with the common purpose of verifying market reaction to new behaviors and new features for recognizing voices or communicating.

IQ What kinds of development themes have been used in the previous models of the BN series?

Haga The themes can be broken down into two major categories: sequential-logic robots, which can be programmed by the user, and autonomous robots, which use subsumption architecture. Some robots are a combination of the two.

TansorBorg, an autonomic programming robot that allows users to experience robot programming

Typical of the programmable robots are the BN-0 WonderBorg robot and its successor models, which are modeled after insects. The TansorBorg robot released in 2005 improves on the WonderBorg, allowing advanced program control. The shape is based on the NASA unmanned exploration craft that explored Mars in 2004. It uses an infrared sensor to find objects like empty cans and carry them to a specific location. The user can create a flow chart and input commands.
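As a rough illustration of the flow-chart style of programming described here, the can-collecting task could be written as a small state machine. This is a minimal sketch only: the state names, sensor flags (`ir_object`, `holding`, `at_goal`), and action names are assumptions made for the example and are not TansorBorg's actual command set.

```python
def step(state, sensors):
    """Return (next_state, action) for one pass through the flow chart."""
    if state == "search":
        # Turn on the spot until the infrared sensor reports an object.
        if sensors["ir_object"]:
            return "approach", "forward"
        return "search", "turn"
    if state == "approach":
        # Drive forward until the object is held.
        if sensors["holding"]:
            return "carry", "forward"
        return "approach", "forward"
    if state == "carry":
        # Keep carrying until the drop-off location is reached.
        if sensors["at_goal"]:
            return "done", "release"
        return "carry", "forward"
    return "done", "stop"


def run(readings, state="search"):
    """Feed a sequence of sensor readings through the flow chart."""
    actions = []
    for sensors in readings:
        state, action = step(state, sensors)
        actions.append(action)
    return state, actions
```

Each branch corresponds to one decision box in a flow chart, which is what makes this style "sequential logic": the robot does exactly one thing at a time, in the order the user laid out.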

IQ What kinds of features are available in autonomous robots?

Haga At the Bandai Robot Laboratory, the two requirements for autonomous robots are that they possess a power source and can make their own decisions on what action to take in a given situation. One entertainment robot that satisfies those requirements is Pichi-Pichi Gonzallez, released in September 2001. At around 1,300 yen, this product sold better than any other robot to date. It reacts to sound and runs away, using time-honored “mystery action” technology to move quickly while randomly changing directions. When voice signals are input through the internal capacitor microphone, the motor’s rotation is transmitted to the gearbox itself and randomly swings it around. At the same time, the motor runs for a set period of time and sends the robot running in unpredictable directions. It’s a simple mechanism, but the autonomous movements are fun to watch.

Subsumption architecture uses a growing set of simple reactive behaviors

IQ When did you first start thinking about getting robots to move autonomously?

Haga That would be around 1995. I suspected that the main toys people played with around the year 2010 would be robots and went about my job with that goal in mind. But I was beginning to suspect that radio-controlled robots that simply respond to commands given by the user would no longer be fun. So I started thinking about how to get robots to communicate with humans and exhibit reactions similar to living creatures. That was how I started looking into autonomous robots. When I decided to use artificial intelligence I became interested in this “subsumption architecture” I was hearing so much about. It was proposed by Rodney Brooks, an assistant professor at the Massachusetts Institute of Technology, in 1986 and later came to be known as “behavior-based artificial intelligence.”

Until then, artificial intelligence systems were mainly virtual creations of external-world models within computers and were based on the frame theory of having objectives accomplished by powers of deduction. The idea with robot movement is to measure the distance to the target object, locate obstacles in the path and then calculate the course requiring the least amount of energy using certain algorithms and send the commands to the control device. I think that similar processes are performed within the human brain, but I tried studying brain waves, and that method just didn’t seem right.
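To make the model-based approach concrete, the "locate obstacles, then calculate the least-energy course" step can be sketched as a shortest-path search over a map. The example below uses Dijkstra's algorithm on a small grid. The grid representation and the cost of one energy unit per move are assumptions chosen for illustration; the interview does not say which algorithms were actually tried.

```python
import heapq


def plan_path(grid, start, goal):
    """Dijkstra on a 4-connected grid; grid[r][c] == 1 marks an obstacle.

    Every move costs one unit of "energy". Returns (cost, path) or None
    if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    best = {start: 0}   # cheapest known cost to each cell
    prev = {}           # back-pointers for rebuilding the path
    queue = [(0, start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return cost, path[::-1]
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    prev[(nr, nc)] = cell
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return None
```

The whole world model has to be built and searched before the robot moves at all, which is the cost that behavior-based approaches try to avoid.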

Subsumption architecture is used in WonderBorg, the BN-1, Doraemon the Robot, the BN-17 and others. The BN-17 has an image-recognition feature, so even if it is given a command to move forward it detects obstacles on its own and automatically slows to a stop in front of them. A stronger stimulus is then required to move it forward. Some people like this “I’ll come if I feel like it” personality of the BN-1, saying it behaves like a real cat, but others say that they would like it more if it grew fond of the owner over time.

IQ Robot movements and behavior have become quite life-like, haven’t they?

Haga Yes, and the next theme is to find out whether or not subsumption can be applied to the movements based on communication with humans. Insects and cats don’t need to speak, so that aspect was not required before, but with humanoid robots this obstacle has to be overcome. But, you know, we’ve been performing trial and error on this for about two years, and it’s just not an easy task. We can’t seem to create the feeling that the robot is thinking and speaking on its own. However, our approach is not to create a jam-packed dictionary database. We just want to create something close to a personality where the robot can give various responses to certain keywords depending on the character. The thing is, we can’t do funny comebacks yet, so we are looking into having the robot refer to a dictionary when it receives a stimulus so that it might, for example, be able to determine whether the words it heard mean that it is being scolded or made fun of. Then, we would build on that, determining priority based on subsumption, to have the robot respond accordingly.
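The idea in this answer (look up what was heard in a dictionary, decide whether the robot is being scolded or teased, then pick a reply that fits the character) might look something like the sketch below. The word list, the priorities, and the "serious" and "lazy" characters with their responses are all invented for the example. The interview describes this as work in progress and gives no actual data.

```python
# Hypothetical dictionary mapping heard keywords to a stimulus category.
STIMULUS_DICTIONARY = {
    "bad": "scolded",
    "no": "scolded",
    "silly": "teased",
    "clumsy": "teased",
}

# Assumed priorities: a scolding outranks teasing when both are detected.
PRIORITY = {"scolded": 2, "teased": 1}

# Assumed per-character responses to each stimulus category.
CHARACTER_RESPONSES = {
    "serious": {"scolded": "apologize", "teased": "ignore"},
    "lazy": {"scolded": "sulk", "teased": "laugh"},
}


def choose_response(heard_words, character):
    """Classify the heard words, then answer according to the character."""
    categories = {
        STIMULUS_DICTIONARY[w] for w in heard_words if w in STIMULUS_DICTIONARY
    }
    if not categories:
        return None  # nothing recognized, so no reaction is triggered
    strongest = max(categories, key=PRIORITY.get)
    return CHARACTER_RESPONSES[character][strongest]
```

Swapping the character table while keeping the dictionary and priorities unchanged is one way the same engine could host different personalities.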

Hardware and software issues with controlling robots

IQ Don’t voice recognition processes place quite a load on the system?

Haga The CPU load with types like the BN-17 that have voice recognition functions is indeed high, so we handle those processes using a separate, independent system. Voice recognition has an effect on the speed of responses. When vast amounts of information are input, a quick response is impossible. Even with a short phrase, there is a three-second-or-so delay, and that’s just too slow. First we have to speed up the processing. We have the option of using high performance hardware and software, but that raises the problem of cost. Our premise is based on providing the products at a low cost, so we are willing to live with the delay to a certain extent. To deal with this, we try to hide the time lag by having the robot’s eyes start moving after it hears someone say “hello,” and then the ears pop up if it is a cat or something, all the while performing several processes in the background. That said, there are some processes that need to be fast and others that can be slow, so as I said earlier, we have other ways of dealing with it as well, such as separating the processors.
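The latency-hiding trick described here, starting visible gestures immediately while recognition runs in the background, can be sketched with a worker thread. This is an illustration under stated assumptions: `slow_recognize` is a stand-in that merely sleeps and normalizes text, and the gesture names are made up. The real robot offloads recognition to separate hardware rather than a Python thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_recognize(audio):
    """Hypothetical stand-in for a slow recognizer; sleeps to mimic delay."""
    time.sleep(0.05)
    return audio.lower().strip()


def respond(audio, gestures=("move eyes", "raise ears")):
    """Start recognition in the background, gesture first, then speak."""
    events = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_recognize, audio)  # runs concurrently
        for gesture in gestures:
            events.append(("gesture", gesture))       # immediate feedback
        events.append(("speech", pending.result()))   # wait only at the end
    return events
```

The total processing time is unchanged. What changes is that the user sees a reaction straight away, so the wait is less noticeable.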

IQ I heard that the BN-17 has 3 CPUs, is that correct?

Haga We use the 32-bit ARM9 for the wireless LAN camera, which uses Linux, and a 16-bit H8 Tiny class processor for the virtual control device, which controls reflexes. So there are two computers (CPUs) inside the unit. The BN-17 was the first model to use a 32-bit processor. By separating the web camera, which is a resource hog, we were able to lighten the load on the CPU that controls the robot’s reflexes.

The BN-17 and Doraemon the Robot include a unique feature called “word spotting” for recognizing keywords contained in spoken phrases. This voice-recognition feature is processed on an external computer connected via wireless LAN, so that it does not place a load on the CPU that controls reflexive responses to movements.
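In outline, word spotting means picking known keywords out of a longer utterance instead of trying to understand the whole sentence. The sketch below shows the idea on a text transcript, which is a simplifying assumption: the real feature works on audio, and its keyword list and matching method are not described in the interview.

```python
import re


def spot_keywords(transcript, keywords):
    """Return the known keywords found in a transcript, in spoken order."""
    words = re.findall(r"[a-z']+", transcript.lower())
    wanted = {k.lower() for k in keywords}
    return [w for w in words if w in wanted]
```

For example, given the phrase "Hey, could you come HERE and take a picture?" and the keywords "come", "picture" and "stop", only "come" and "picture" are reported, and the rest of the sentence is ignored.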

IQ What kind of CPU did previous models in the BN series use?

Haga The BN-7 uses the most CPU power. We developed the control circuit for the servo motor in the arm in-house, and used six PIC microcomputers to control the motor. It also has a high-end H8 processor to control the reflexes. It’s even equipped with a Windows 2000 computer, which is the system that performs overall controls.

The cat-like BN-1 robot uses three 16-bit CPUs. The main system is an H8. The subsystem uses a processor made by a Taiwanese manufacturer to control the sensors and graphics in the eyes. However, the development objectives for each model in the BN series are different, so a higher model number doesn’t mean more CPUs.

IQ What do you use for the robot engine OS?

Haga Actually, the robot engine is directly coded. Bandai has not yet decided what the OS should be. The reason is that no single robot OS on the market is used throughout the industry, and several theories are being argued. Certainly, many people these days would mention Real Time and ITRON OSs, but the OS we want is not really of that type.

When you get right down to it, what we want is an OS that can take in the large volumes of information input through the robot’s sensors all at the same time, process it and provide almost instantaneous output. Let’s say hypothetically that a robot has about 30 physical sensors, and a sensor fusion layer is placed between the sensors and the robot engine to supply it with 200 items of data once every 10 milliseconds. The OS used to process this data will surely require a level of parallel processing that is different in nature from conventional real-time processors. At present, our engine cannot handle parallel processing, so a scheduling system is used instead. However, as you know, using an OS and processing the data through scheduling is time-consuming, so we use direct coding rather than an OS.
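Working through the hypothetical figures in this answer shows why scheduling overhead matters: 200 items every 10 milliseconds is 20,000 items per second, or about 50 microseconds per item if they are handled one after another. The short calculation below only restates that arithmetic. The function names are illustrative.

```python
def items_per_second(items_per_cycle, cycle_ms):
    """Data items arriving per second for a given cycle length."""
    return items_per_cycle * 1000 / cycle_ms


def budget_us_per_item(items_per_cycle, cycle_ms):
    """Time available per item, in microseconds, if processed serially."""
    return cycle_ms * 1000 / items_per_cycle


rate = items_per_second(200, 10)      # 20000.0 items per second
budget = budget_us_per_item(200, 10)  # 50.0 microseconds per item
```

Any per-item cost from an OS scheduler has to fit inside that 50-microsecond budget, which helps explain the preference for direct coding on a small processor.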

IQ So the most suitable OS depends on the specs of the robot.

Haga Some people have suggested building robots using Real Time Linux. If you’re talking about making a few robots that perform procedural processes, that might be okay. That would be a case of sequential processing doing one thing first and then another next. If we take that path, we could split the modules in two when too much time is required for processing, like we did with the BN-17. The only problem with this solution is that if you are going to have several processes performed at the same time on a single processor, what kind of OS do you need? Are you going to do operations that require speed on the driver level? Well, then what do we use for the robot driver? The questions never end. From this point on we’ll need to start getting serious about such things. We want to create a general-purpose robot platform, so it will need to be easy to switch characters. It will be like a descriptive engine. In one sense, it will be very close to a robot OS.

IQ Will there be hardware issues if robots continue to become more advanced?

Haga Yes, one of those issues will be the CPU architecture. To improve the accuracy of voice recognition, the voice input spectrum is broken down according to formant frequency and fed into the CPU, and the robot can determine its reaction based on whether the voice sounds happy or sad. It would be nice when making that kind of mechanism if there was a processor that could process all that input data all at once. The software algorithm itself is simple. All it does is numerical comparison. However, there is quite a large amount of information involved. For example, if a large amount of information, say 10 megabytes or more, is input, you would want a processor that could extract something useful within 100 to 200 milliseconds. This would require a highly specialized CPU.
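The figures quoted here imply a sustained throughput of roughly 50 to 100 megabytes per second just to pass over the input once. The calculation below restates that arithmetic, taking one megabyte as 10^6 bytes (an assumption, since the interview does not specify).

```python
def required_throughput_mb_per_s(megabytes, deadline_ms):
    """Throughput needed to read `megabytes` once within `deadline_ms`."""
    return megabytes * 1000 / deadline_ms


fast_case = required_throughput_mb_per_s(10, 100)  # 100.0 MB/s
slow_case = required_throughput_mb_per_s(10, 200)  # 50.0 MB/s
```

This counts a single pass over the data only. Comparing the input against many stored patterns multiplies the work accordingly.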

IQ Is a feature to instantly load a matrix necessary?

Haga Sure. Some time ago there was a thing called content-addressable memory, developed to search for patterns similar to those that were input. I guess this feature would be something like an extremely high-speed version of that. If such a feature were available, then within a short period of, say, 0.5 seconds after inputting a formant pattern, the robot could respond almost immediately by changing its expression to a grimace.
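The lookup described here, finding the stored pattern most similar to an input, is "all numerical comparison", and can be sketched in software as a nearest-match search. The stored formant values and labels below are invented for illustration. A hardware content-addressable memory would compare against all entries in parallel, whereas this loop checks them one at a time.

```python
def closest_pattern(stored, query):
    """Return the label of the stored pattern nearest to `query`.

    `stored` maps a label to a tuple of numbers. Similarity is measured
    by squared Euclidean distance (an assumed metric).
    """
    def distance(pattern):
        return sum((a - b) ** 2 for a, b in zip(pattern, query))

    return min(stored, key=lambda label: distance(stored[label]))


# Made-up formant frequencies in Hz, for illustration only.
PATTERNS = {
    "happy": (800, 1800, 2600),
    "sad": (500, 1200, 2300),
}
mood = closest_pattern(PATTERNS, (780, 1750, 2500))
```

With these example values the query lands closest to the "happy" entry. The cost of the sequential version grows with the number of stored patterns, which is the motivation for wanting the comparison done in hardware.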

The reaction of soccer robots is still too slow. The mechatronics aspect sometimes falls behind, but it’s thought that in order to have the robots respond like humans, the information from the sensors would have to be checked no less than every millisecond. That might be possible if the number of sensors on the robot were reduced, but for the soccer robot to handle spatial recognition it would need to be able to handle a huge number of sensors and a vast amount of information.

IQ Does the human brain serve as a reference for processing?

Haga The individual processes performed by the synapses in the brain are not that fast, but they are massively parallel, so when we look at these processes on the macroscopic level they are performed rather quickly. When it comes to spatial recognition, however, the human eye does not scan like a television, so it is quite different from a robot. As for dynamic vision, the batting robot exhibited at the World Exposition in Aichi is pretty advanced. Its movements could be even more interesting if it could process other information about its environment besides the ball.

IQ That sort of thing will affect the cost of production, won’t it?

Haga Even if they don’t perform very complicated tasks, robots have an overwhelming number of parts and components compared to home electronics. The implementation and assembly costs for each one are considerable. The motors and sensors must be connected with screws that must be tightened. As long as many workers have to be involved in the production of each robot they will never be as cheap as home-electronics goods. The price might be more comparable to cars. Even if the cost of each part goes down, by the time the product is assembled it will have become expensive due to the labor costs. Of course, reducing the cost of the parts will contribute to lowering the overall cost. We have set the BN-17 price to well under 100,000 yen, but if we were to use an advanced processor the man-hours would force the price up. Using an external computer keeps the price of the robot itself down.

IQ What will be the price bracket for the BN-17?

Haga We plan to set the price well below 100,000 yen. We’re not trying to limit the target buyers, but they will likely be between 20 and 40 years of age, either fathers who like computers and robots or young men. The user will need to set up a wireless LAN, though. Our hope is that the robot will be purchased by fathers who will say to their children, “Let’s teach the BN-17 how to clean up the house.” That’s the kind of scene we envision. I believe it will be a stepping stone for when we introduce our robot characters to the world in the future. We want our robots to be able to communicate in the home with people of different age groups, like parents and children, or of the same age group, like siblings or friends, and respond spontaneously to new types of play. For that reason we have enhanced the application software and plugged in a huge number of details. In other words, we want to create a home robot that can connect with people by playing with them and doing things for them. We don’t want to build robots just for geeks to stare at.

Taking the program to the next level using genetic algorithms

IQ The BN-17 includes improved communication features such as the ability to use a wireless LAN to establish a constant connection with an external computer. What types of hidden potential does it have?

Haga The communication features can be used to communicate with external computers that have voice and image recognition software and make remote control of the robot possible as well. The robot platform is a container of sorts in which to place characters like Doraemon. Yet, if the basic data that control Doraemon’s behavior don’t change, the user may eventually tire of the robot. That means we need to come up with a way to update the data. We hope to use the external communication features to accomplish that task.

IQ Let’s consider the robot’s self-repair features. What if a certain part got damaged, like a reflex behavior set (panel)? How does debugging take place?

Haga The robots are equipped with many reflex behavior sets, so if one of them becomes damaged or begins malfunctioning, the robot engine is programmed to look for the next functional panel. We are developing a simple debugging tool in-house to check whether all the panels are functioning. Inspecting a particular panel is relatively simple, but when you have a large number of panels debugging becomes extremely complicated. You just have to gradually combine the minimum required features and repeat the tests. You freeze the robot, retrieve the potential map and go through and think about what to tweak in an analog manner. In normal manufacturing this method would never be used because it is so inefficient! This sequential debugging method can be used with the current number of panels, which is around 200 or so, but when the number rises into the thousands it will require quite a bit of manpower. We will definitely need to find new debugging methods in the future. That is one issue we will have to tackle.

At the same time, from a product standpoint, it is desirable that the robot continue functioning even if a few panels are missing. In fact, the robot engine will search for the next available panel if a few are missing, so the user won’t really notice.
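The fallback behavior described in this answer, skipping a damaged panel and using the next functional one, can be sketched as a search over an ordered list. The panel fields (`name`, `handles`, `working`) and the example panel names are hypothetical. The interview does not describe how panels are actually stored or checked.

```python
def select_panel(panels, stimulus):
    """Return the first working panel that handles the stimulus, or None."""
    for panel in panels:
        if panel["working"] and stimulus in panel["handles"]:
            return panel["name"]
    return None  # no usable panel: the behavior is simply unavailable


def diagnose(panels):
    """List the panels that are currently not functioning."""
    return [panel["name"] for panel in panels if not panel["working"]]
```

As long as some later panel covers the same stimulus, the user sees a response and the fault goes unnoticed. Only when every candidate is missing does the behavior disappear, which matches the gradual degradation described here.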

However, if enough panels are unplugged it will be like taking the memory card out of HAL 9000 from “2001: A Space Odyssey”: functionality will gradually degrade. When enough panels are removed, the robot’s behavior will become awkward, and it will only be able to repeat certain functions over and over again. Technically, it is possible for the robot to do a self-diagnosis through memory checks or other methods to determine which panel was disconnected. As robots become more advanced, this approach will become increasingly necessary.

What is Subsumption Architecture?
Subsumption architecture is a way of producing complex behavior through the accumulation of simple reactive behaviors. With insect-like robots, examples include the simple behavior of moving to the left when nudged on the right and to the right when nudged on the left, as well as moving forward when nudged from behind. By combining these behaviors the robot appears to be moving on its own, as if searching for the exit in a maze. However, the Bandai Robot Laboratory realizes that robots that only react the same way each time are boring as toys, so higher levels are added to change the robots’ behavior, so they may run away instead and even countercharge occasionally. Various efforts have been made to provide the illusion of changing emotions. The sets of reactive behaviors are called panels, and developers now believe that robots can be made to exhibit more life-like behavior by creating an advanced structure to switch these panels.

Roughly speaking, robot mechanisms that use subsumption are those that take in information input via sensors, supply the information to the robot engine for processing, and output the results from the robot engine to the actuator. The robot engine is composed mainly of modules related to reactive behaviors and modules that adjust the priority of reactive behaviors. A subsumption architecture is built into the robot engine so that information input via the multiple sensors can be processed in an integrated fashion allowing the robot to choose the reactive behavior appropriate for each situation. External stimuli are fed to the robot engine through physical sensors, but with the BN-1 and other models, another layer, called sensor fusion, further classifies the information that is input into more meaningful categories before supplying it to the robot engine.

Humans map the list of reactive behaviors, but the robot prioritizes the behavior itself based on the potential at a particular instant. Non-linear behavior is produced by creative touches to the weighting method. The image is not of an entire system controlled centrally but of multiple processes joined up gradually so as to produce natural behavior.
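The pipeline described in this sidebar (sensors, a sensor fusion layer, a set of reactive behaviors, and a weighting step that picks the behavior with the highest potential) can be sketched as follows. The sensor names, thresholds, weights, and the `mood` multiplier are all assumptions made for the example. This is a simplified priority-selection scheme in the spirit of the description, not Bandai's robot engine.

```python
def fuse(raw):
    """Sensor fusion: classify raw readings into meaningful categories."""
    return {
        "nudged_left": raw.get("left_bumper", 0) > 0,
        "nudged_right": raw.get("right_bumper", 0) > 0,
        "nudged_behind": raw.get("rear_bumper", 0) > 0,
        "loud": raw.get("mic", 0.0) > 0.8,  # assumed loudness threshold
    }


# One "panel": (behavior, triggering category, base weight). Humans write
# this list; the robot picks from it at run time.
PANEL = [
    ("run_away", "loud", 3.0),
    ("move_right", "nudged_left", 2.0),
    ("move_left", "nudged_right", 2.0),
    ("move_forward", "nudged_behind", 1.0),
]


def choose_behavior(raw, mood=None):
    """Pick the triggered behavior with the highest potential."""
    mood = mood or {}
    percepts = fuse(raw)
    potentials = {
        name: weight * mood.get(name, 1.0)  # mood rescales the base weight
        for name, trigger, weight in PANEL
        if percepts[trigger]
    }
    if not potentials:
        return "idle"
    return max(potentials, key=potentials.get)
```

Changing the `mood` multipliers over time is one simple way the same stimulus could produce different reactions, which is the effect the higher levels are said to provide.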

IQ How much autonomy to give robots is perhaps a philosophical issue, but I think it would be scary if, to use an extreme example, the robot quit listening to commands the instant the switch was turned on.

Haga I don’t know if such a robot would ever be made, but I would be interested to see how the market would react to a robot that starts collecting information on its own as soon as it is switched on and acts just like the character upon which it is based. I think the user’s impression of the robot depends on the way the robot reacts. For example, some users may not appreciate being told by the robot in response to a command, “You’re going to make me do the same thing again? That’s boring!” or, if it hits an obstacle during the execution of a program, “This is lame!” Conversely, should the robot instead be completely subservient and apologize? I think it’s okay for the personality of the robot to be different from one unit to the next. You know, like one lot of serious Doraemons and another lot of lazy ones.

IQ So you will have to see how the market reacts.

Haga Well, more than likely the types of robots that users look for will be different from person to person right from the start. Some people will go for home robots that studiously perform tasks, while others will look for goofy robot friends to cheer them up. Robots that are borderline useless may be rejected as products. I do think that robots that perform tasks for the user will be a big hit. To be perfectly honest, though, it is impossible to know what kind of reaction is best without assessing users’ reactions through trial and error. We are trying to increase robots’ interaction with users. For example, we want to use TansorBorg as a springboard to look for ways to make insect-like robots more fun, by holding robot workshops and the like and having people try out various new features. I think this approach will lead to a lot of surprising discoveries. We hope to engage in such activities on an ongoing basis.

IQ Do you think robots will ever be able to learn and grow on their own?

Haga What we’re looking into at present is taking logs. By that, I don’t mean stuff like “the value on Sensor 1 was 3” or “Motor 3 was set to 1.” I’m talking about something totally different. For example, it might be something like a diary where the robot records that it received praise from its owner for a certain action or behavior. At this point we still haven’t been able to create robots with long-term memory or the ability to keep records, but these features will likely be necessary in the future.

These logs will be used as a case history of sorts and at the same time will provide a way of reviewing how the robot responded in a certain situation. For example, let’s say the robot sends a batch of its logs to the central system nightly. The central system would do a comparative review on the logs, decide how the program might be made better and send the new program and data back to the robots. If we can establish this type of environment, we will be one step closer to having robots that can “grow.” It would provide a further level of autonomy on a platform with updatable contents.

Still, it would be difficult for a single robot to use artificial intelligence to study and improve on its own. However, if we were talking about tens of thousands of robots in different homes, these types of things could be processed statistically. Genetic algorithms would be applied to address successes and failures, mimicking evolution. If we had 10,000 terminals, we could collect a large amount of statistics at once. With just one robot, transition between generations would require a great deal of time, but with numerous robots it can be done all at once, greatly accelerating improvements to the program.

Although the number of elements is small, the genetic algorithm tests we have performed do provide some leads. For example, with genetic algorithms for sumo, when two robots are matched against each other, one with random genetic code and the other with programmed code, the one with random code loses to the other at a ratio of about 7 to 3, but as the generations progress the program improves and the robots become equally matched. We had to have them face off about 200 times in order to produce those results. Now, if we had 10,000 robots to play with daily, and the logs were sent to the central system every day, we could mix the information and make dramatic advances in the program.
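For readers unfamiliar with genetic algorithms, the loop of selection, crossover and mutation can be sketched in a few lines. The example is a toy: the genome is a list of bits and fitness is simply the number of ones, standing in for a measured success rate such as sumo wins. The population size, mutation rate, and fitness function are assumptions, not the settings used in Bandai's tests.

```python
import random


def fitness(genome):
    """Toy fitness: count of 1 bits (stand-in for a measured win rate)."""
    return sum(genome)


def evolve(pop_size=20, genome_len=16, generations=30,
           mutation_rate=0.05, seed=0):
    """Run a simple genetic algorithm; return best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = [max(map(fitness, population))]
    for _ in range(generations):
        # Selection: the fitter half of the population become parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = [parents[0][:]]  # elitism: keep the best unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)       # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]                 # random bit flips
            children.append(child)
        population = children
        history.append(max(map(fitness, population)))
    return history
```

Each generation here is evaluated instantly. With physical robots, every fitness evaluation is a real match or a day of use, which is why pooling logs from thousands of units would shorten the time per generation so dramatically.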

IQ It looks like you still have lots of challenges ahead of you in robot development.

Haga Yes, we do. We have had many successes to date with getting the BN-17 to reproduce life-like behavior, but there are still many things we have yet to achieve. We plan to tackle those challenges from here on out. Thank you for the interesting questions today.

Author: An ARM Interview, by Jan Howells, Editor IQ online

Synopsis: In this issue we take a close look at the “Real Dream Doraemon Project,” a plan by Bandai to develop a general-purpose robot platform by 2010. We talked to Yoshinori Haga, Director of the Bandai Robot Laboratory, about that aim and the background behind it. Bandai, which has been researching autonomous robots in the run-up to this project since the 1980s, selected ARM9 for use in the BN-17, the newest model and flagship of the BN series. In this interview, Haga spoke with us about several topics, including the future of entertainment robots as envisioned by Bandai, the concepts behind the architecture that will form the basis for that vision and current directions at the company. As Haga’s comments make clear, the robot industry, which is attracting widespread attention as a newly emerging key industry, holds high expectations for ARM processors.

Yoshinori Haga
Director of Bandai Robot Laboratory
Mechatronics Life Design Manager
Technology Development Office
Bandai Co., Ltd.

Mr. Haga entered Bandai after graduating from Kogakuin University with a major in electronic engineering. He was assigned to Tochigi Plant, which at the time was developing toys, and later took a leading role in the development of entertainment robots. Mr. Haga has had a strong interest in computers since high school and studied neural networks at the university. At the Tochigi Plant he developed NetBorg, a remote-controlled mobile robot with an internal camera, which led to the creation of the BN series.

January 3, 2008

All material on this site copyright © 2003-2008 techfocus media, inc. All rights reserved.
FPGA and Structured ASIC Journal