honoluluadvertiser.com

The Honolulu Advertiser

Posted on: Monday, February 16, 2004

Mars rover fix similar to PC repair

By Matthew Fordahl
Associated Press

ALAMEDA, Calif. — It's a PC user's nightmare: You're almost done with a lengthy e-mail, or about to finish a report at the office, and the computer crashes for no apparent reason. It tries to restart but never quite finishes booting. Then it crashes again. And again.

Getting caught in such a loop is frustrating enough on Earth. But imagine what it's like when the computer is more than 100 million miles away on Mars. That's what mission controllers faced when the Mars rover Spirit stopped communicating last month.

Ultimately, the fix that saved Spirit wasn't that different from how a PC would be repaired on Earth. It's just that the folks who have their hardware on Mars — and the eyes of the world on them — are better prepared for disaster.

Tech support for an $820 million mission is a cautious affair. Tools to recover from and fix any problem must be built into the system before launch. The systems' behaviors need to be completely understood and predictable.

"Luckily, during the design period, we anticipated that we might get into a situation like this," said Glenn Reeves, who oversees the software aboard the Mars rovers Sprit and Opportunity at NASA's Jet Propulsion Laboratory.

For stability, reliability and predictability, mission designers did not bust the budget and design the hardware or software from scratch. Instead, they turned to hardware and software that's been used in space before and has a proven track record on Earth as well.

"The advantage of using commercial software is it's well-known, and it's well deployed," said Mike Deliman, an engineer at Alameda, Calif.-based Wind River Systems Inc., which made the rovers' operating system. "It has been used throughout the world in hundreds of thousands of applications."

The operating system, VxWorks, has its roots in software developed to help Francis Ford Coppola gain more control over a film editing system. But the developers, David Wilner and Jerry Fiddler, saw a greater potential and eventually formed Wind River, named for the mountains in Wyoming. VxWorks became a formal product in 1987.

The operating system is embedded in systems that control jetliners and atomic colliders, anti-lock braking systems in cars and even heart pacemakers. It's also been used successfully in the Mars Pathfinder lander, Mars Odyssey orbiter and Stardust comet probe.

A key advantage VxWorks has over Microsoft Corp.'s Windows or the Unix operating system is that it is nimble enough to react quickly to any scenario that might crop up.

"If your heartbeat goes irregular, you don't want it to take five minutes to figure out that your heartbeat has gone irregular," Deliman said in his office filled with computers, an empty fish tank and a few dog toys. "You want to be able to catch it right off the bat."

That kind of immediate, predictable response is simply not available yet in Windows or Unix.

VxWorks operates within only 32 megabytes of random access memory, and parts of it can be modified remotely without having to restart the entire system. (Windows users also can have fixes automatically sent, but restarts are very often required.)

VxWorks also can be tweaked to accommodate different hardware, Deliman said.

In the rovers, the hardware is a single-board computer called the RAD6000. The RAD6000, except for its protection from radiation, is similar to IBM's RS6000 server, which was popular among businesses in the 1990s.

The computer, which costs up to $300,000, runs at a fraction of the speed of today's desktop computers. It also has other limits, such as just 128 megabytes of random access memory.

But Spirit and Opportunity carry more flash memory — the same type used in digital cameras to store pictures — than any other spacecraft.

That turned out to be part of the problem that temporarily halted Spirit in its tracks.

All computers, through the operating system, need to keep track of their files, whether they're on a hard disk or, as in the case of the rovers, in flash memory. And each file requires a little bit of memory.

After seven months of cruising between Earth and Mars and a couple of weeks on the ground, thousands of files had accumulated in flash memory, and keeping track of them gobbled up the 32 megabytes allocated for the operating system.

After more than two weeks on the ground, Spirit's computer reset itself. Over and over again. From the perspective of controllers on Earth, the device just stopped communicating.
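
The arithmetic behind that failure can be sketched in a few lines of Python. This is only an illustration of how per-file bookkeeping can exhaust a fixed memory budget, not the rovers' actual software: the 32-megabyte budget comes from the figures above, while the per-file cost, the baseline usage and the function names are hypothetical values chosen to make the numbers easy to follow.

```python
# Illustrative sketch only: the per-file cost and baseline usage below are
# made-up numbers, not measurements from the rovers' flight software.

RAM_BUDGET_BYTES = 32 * 1024 * 1024   # 32 MB allocated to the operating system (from the article)
BYTES_PER_FILE_ENTRY = 4 * 1024       # hypothetical bookkeeping cost per tracked file
BASELINE_BYTES = 24 * 1024 * 1024     # hypothetical memory the system needs for everything else


def max_trackable_files(budget_bytes, bytes_per_entry, baseline_bytes=0):
    """Return how many file entries fit in the budget once baseline use is subtracted."""
    available = budget_bytes - baseline_bytes
    if available <= 0:
        return 0
    return available // bytes_per_entry


def can_track(file_count, budget_bytes, bytes_per_entry, baseline_bytes=0):
    """Return True if file_count entries fit in the remaining memory."""
    return file_count <= max_trackable_files(budget_bytes, bytes_per_entry, baseline_bytes)


if __name__ == "__main__":
    limit = max_trackable_files(RAM_BUDGET_BYTES, BYTES_PER_FILE_ENTRY, BASELINE_BYTES)
    print("files that fit:", limit)
    for count in (1000, 3000):
        ok = can_track(count, RAM_BUDGET_BYTES, BYTES_PER_FILE_ENTRY, BASELINE_BYTES)
        print(count, "files:", "fits" if ok else "out of memory")
```

Under these assumed numbers, a few thousand files are enough to exceed the budget. A system that runs out of memory while starting up, restarts, and then tries to track the same files again would be stuck in the kind of loop described above.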