Survey of head-up display technologies

As cool as my baseball cap headset looks, it is totally impractical in the real world. Not surprising, since it was built using off-the-shelf components.

But I was wondering, if I was working with people who actually knew what they were doing, what would be possible? In particular, what are the options for overlaying distant-focus images on a person’s field of view? A fresnel lens + half-mirrored reflector works, but the image quality isn’t great and it’s kind of bulky, so there must be better options.

Source: Laster Technologies

The method used by Laster Technologies of France looks sensible. The focusing and reflection are handled by a single half-mirrored curved reflecting surface, presumably a parabolic segment. It’s cheap and easy to implement, and there’s not much that can go wrong. I’m not sure if it’s compact enough to be used in glasses, but it could certainly be used in a hat- or helmet-mounted configuration.

I wonder how clearly the contents of the display can be seen by others, given that the image is being projected forwards and not entirely reflected. With a baseball cap you could probably use the bill to block line-of-sight to the OLED, but it could be a problem with glasses.

Source: SBG Labs

The DigiLens by SBG Labs uses some pretty amazing technology. You really need to watch their video to see what’s going on, but here’s the gist of it: essentially, you can duplicate any lens arrangement using a hologram (which is far more compact than the lenses would be). Unfortunately, holograms only work with monochromatic light, so full-colour images haven’t been possible. The SBG solution is to use switchable holograms (which I didn’t know were possible). The red, green, and blue components of the image are cycled in rapid succession and relayed to the reflective element. The reflective element consists of a sandwich of three switchable holograms which cycle between reflective/transparent in sync with the images. If you cycle this fast enough, apparently your eyes see it as a full-colour image. I’m guessing you could use the same technology to generate full-colour 3D holograms, but that’s another topic.

All very impressive, but I have some concerns. First, it sounds expensive. Second, power consumption is going to be higher than Laster’s solution because the reflective element is constantly switching. Third, rapidly-switched RGB is supposed to look like full colour, but I’ll believe it when I see it. And finally, I wonder what sort of image quality you get when you use monochromatic RGB. I know that laser light looks weird, but that may be due to it being coherent rather than monochromatic.

Source: Vuzix

The Vuzix STAR 1200 uses “patented quantum optic see-thru technology”, which means nothing to me. Does anyone know what technique they actually use?

Source: Lumus

Same problem with Lumus. They use a “patented LOE (Light-guide Optical Element) technology”, which might sound good from a marketing perspective, but it tells you nothing about how they actually work. Because it “shatters the perceived laws of conventional optics” I suspect they may be using SBG’s technology or a variant of it, but since I can’t be bothered doing a patent search, I don’t know for sure.

Are there any technologies I’ve missed?


Improved optics

Note the new visor.

Two of the problems with the original baseball cap head-up display were the dimness of the image and the difficulty of aligning the reflective screens. They turned out to be fairly easy to fix.

First, the two reflective screens, one for each eye, were replaced with a single screen. Going from two surfaces, each with two degrees of freedom, to a single surface with one degree of freedom makes it much easier to line up the image.

The original motivation for using two screens was to have the option of varying parallax to control the image distance. Nice in theory, but too hard to use in practice. I also think two screens looks cooler, but given how lame the whole setup looks, I don’t think it matters.

The dimness problem was solved by using a more reflective material. The iPhone case protector film was replaced with 15% VLT mirror-tint film, the stuff you put on windows. 15% VLT means 15 percent of visible light is transmitted, and presumably the remaining 85 percent is reflected. Melbourne has been overcast the past few days, so I haven’t had a chance to test it in strong light, but indoors it works really well.

Now that the screens are lined up properly and I can clearly see the image, I noticed another thing: more by luck than anything else, when the phone is displaying a live video feed, it lines up exactly with the real world. In a dark-ish room where the video is the main source of light, it’s good enough to navigate by. And it means augmented reality applications such as Wikitude will display points of interest in the correct location. Nice.

One final problem with the original design was the weight of the phone on the end of the cap. I have an HTC Desire Z with a slide-out keyboard, and it’s a fairly hefty device. I speculated that a lighter phone such as an iPhone or a keyboard-less Android might be more comfortable. Well, it turns out they are. A lot more comfortable.

They also avoid another Desire Z problem – when attached to the cap, the volume control buttons of the Desire Z rest on the bulldog clip. That causes the “volume down” button to remain depressed, putting the phone into vibrate mode – and making the phone buzz continually to let me know. The other phones I tested are either too light to depress the buttons, or the buttons are located somewhere else. So now I’m tempted to buy a cheap second-hand Android as a dedicated augmented reality device.


ASCII cam

About a year ago I wrote a post about the brightness of various ASCII characters. It seemed like a fairly obscure and pointless thing to be investigating, but there was method to my madness. The characters identified by that research have been used in my new Android application, ASCII cam.
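The basic idea of ordering characters by brightness can be sketched in a few lines. This is only an illustration, not the code used in ASCII cam: the character ramp below is a hypothetical one (the characters identified by the earlier research aren't listed here), and the function names are made up for the example.

```python
# Hypothetical ramp, darkest to brightest when drawn light-on-black.
# The characters actually used by ASCII cam are not listed in this post.
RAMP = " .:-=+*#%@"


def brightness_to_char(value, ramp=RAMP):
    """Map a brightness in the range 0..255 to a character from the ramp."""
    value = max(0, min(255, value))           # clamp out-of-range input
    index = value * (len(ramp) - 1) // 255    # scale to a ramp index
    return ramp[index]


def image_to_ascii(rows):
    """Convert rows of 0..255 greyscale values into lines of text."""
    return ["".join(brightness_to_char(v) for v in row) for row in rows]


if __name__ == "__main__":
    # A small left-to-right gradient, three rows high.
    gradient = [[x * 255 // 19 for x in range(20)] for _ in range(3)]
    print("\n".join(image_to_ascii(gradient)))
```

For dark-on-white output (the line printer look) the same ramp would simply be reversed.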

The application was actually finished several months ago, but it’s only recently that I’ve had time to design a decent launch icon. I’m quite happy with the way it turned out.

Here’s the promotional blurb for the Android Market …

A long time ago computers didn’t have graphics. All they could do was show green characters on a black screen or black characters on white paper.

But programmers were resourceful. They discovered that if you filled the screen with certain characters and squinted you could sort of make out a picture. Thus ASCII art was born.

The ASCII cam application lets you view the world as ASCII art using the camera on an Android device. Characters are displayed as green-on-black, white-on-black, or, for that line printer look, as dark-grey-on-white.

It’s just the thing for Android owners suffering iPhone envy: a photography application that’s even more retro than Hipstamatic. Short of simulating clay tablets or papyrus, this is as old-school as it gets.

Pictures can be saved as either PNG images or text files.

And here’s a screen shot …


Baseball cap head-up display

For a while now I’ve been interested in augmented reality. Working in the field of geospatial science, I’m always looking for better ways to present location-based information, and augmented reality shows a lot of potential. Plus I’m a big fan of science fiction, and AR interfaces have featured in some of my favourite movies – The Terminator (1984) and Robocop (1987) – as well as some of my favourite books – Neuromancer (1984) and Virtual Light (1993).

But although the idea has been a part of popular culture for nearly three decades, the reality has been disappointing. Head-up displays have long been used by military pilots, and are starting to be deployed in luxury cars, but personal units have been gimmicks at best. There have been augmented reality motorcycle helmets and ski goggles, and Google is rumoured to be releasing Google Glasses by the end of the year, but none of them provide a full field-of-view display. This limits their ability to provide fully immersive graphics or even to play movies.

So I decided to build my own. How hard could it be? Not very, it turns out.

I put one together using a smartphone (an HTC Desire Z), a baseball cap, a couple of fresnel lenses, a plastic mirror, some mirrored film from an iPhone screen protector, and assorted office supplies. The smartphone is held in place with a rubber band, and whatever appears on the screen is displayed in the wearer’s field of view. The fresnel lenses are there to push the screen’s perceived distance out to about 50cm, otherwise you’d get serious eye strain.

It works pretty well. Lining up the reflective screens is fiddly, so future designs should lock them in place. They also don’t work too well in direct sunlight, since the mirrored film isn’t very reflective, so the reflected light from the screen is pretty dim. Ideally you’d want some sort of mirror that automatically adjusts to the ambient light, and you’d want the smartphone screen to dim at night, but these work fairly well indoors.

The image quality from fresnel lenses is mediocre at the best of times, but I was forced to use a pair of these in series to get the focal length short enough. As you’d expect, that creates a fair bit of distortion at the periphery, especially since the mounting isn’t very rigid. It would be better to use a single, rigid lens, with just the right focal length – preferably a bit shorter than my current setup, since the 50cm screen distance is a bit too close to be comfortable. I suspect that the best solution would be to do away with the fresnel lenses altogether and use curved mirror-tinted surfaces to display the image, but that’s beyond my design and manufacturing ability.
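The optics above can be roughed out with the thin-lens equation. The sketch below treats the two fresnel lenses as ideal thin lenses in contact and measures distances from the lens, which is a simplification of the real cap. The 30cm focal lengths are hypothetical values picked for illustration (the actual lenses aren't specified); only the 50cm image distance comes from the text.

```python
def combined_focal_length(f1_cm, f2_cm):
    """Two thin lenses in contact: 1/f = 1/f1 + 1/f2."""
    return 1.0 / (1.0 / f1_cm + 1.0 / f2_cm)


def screen_distance_for_image(f_cm, image_cm):
    """Lens-to-screen distance that puts a virtual image image_cm away.

    From 1/d_screen - 1/d_image = 1/f (virtual image, same side as screen).
    """
    return f_cm * image_cm / (f_cm + image_cm)


def virtual_image_distance(f_cm, screen_cm):
    """Apparent distance of the screen when it sits inside the focal length."""
    if screen_cm >= f_cm:
        raise ValueError("screen must be closer to the lens than f")
    return 1.0 / (1.0 / screen_cm - 1.0 / f_cm)


if __name__ == "__main__":
    f = combined_focal_length(30.0, 30.0)      # hypothetical lens pair
    d = screen_distance_for_image(f, 50.0)     # 50cm image, as in the post
    print(f"combined focal length: {f:.1f} cm")
    print(f"screen-to-lens distance: {d:.1f} cm")
    # Same screen position, slightly shorter focal length.
    print(f"image distance at f=14: {virtual_image_distance(14.0, d):.1f} cm")
```

Under these assumptions, two 30cm lenses behave like one 15cm lens, and the screen has to sit about 11.5cm from it to appear 50cm away. Leaving the screen where it is and shortening the focal length to 14cm pushes the image out to roughly 66cm, which is consistent with wanting a slightly shorter focal length.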

My HTC is fairly heavy with its slide-out keyboard, and with all that weight on the end of the bill you need to put the cap on nice and tight to keep it in place. Something a bit lighter, like an iPhone, would probably be more comfortable.

But it works as a proof-of-concept. It’s pretty cool to run Wikitude with some duct tape over the camera lens, and see all the points of interest rotate around as you turn your head. Now all it needs is some custom software so I can see where I’m going while I read my e-mail.

Taking pictures of an inside view of the display was tricky. In the end, I turned the cap upside-down, stuffed a t-shirt in it, and rested a manual-focus camera on top. The resulting pictures were taken through an upside-down display, but you get the idea.


Android X server

For the past few months I’ve been implementing an X11 server to run natively under Android. In the near future I may have need for a serializable user interface, so to get a better understanding of how they work I decided to implement the de facto standard, X11.

Well, it turns out the X protocol is bigger than I thought, but through sheer bloody-mindedness I got it finished. And it might actually be useful.

I had assumed that all internet-enabled smartphones would be sitting behind NAT-ing routers, both for security reasons and to conserve IPv4 addresses. But no, on the “3” network in Australia at least, phones all have externally-accessible IP addresses, meaning they can run servers. So you could potentially launch a Linux X application out in the cloud and have it display on your phone.

The user interface is fairly simple: touch the screen to move the pointer, and use the directional pad to activate the left/middle/right buttons. Update: the volume up/down buttons now work as mouse left/right buttons. Both virtual and physical keyboards are supported.

The source code is available at http://code.google.com/p/android-xserver/ under an MIT licence, and the application (called X Server) is available for free through the Android Market.

There are a few parts of the X protocol it doesn’t implement …

  • Dynamic colourmaps. Android only supports a 24-bit static colourmap.
  • Dashed lines, tiles, and stipples. There’s no native support for these in Android, and seriously, does anyone use them?
  • Drawing operations other than Copy and Xor. That’s all Android supports.
  • Queueing keyboard and pointer events during grabs.
  • Any extensions. There are hooks provided in the code, so if you’re feeling ambitious, try implementing RENDER and SHAPE. Quite a few applications use them.
  • Key click, auto-repeat, and keyboard LEDs.
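For reference, the two supported drawing operations correspond to the GC function codes GXcopy (3) and GXxor (6) in the core protocol. The sketch below shows what they do to a single pixel value. It is an illustration only, not code from the server, and raising an error for the other functions is an assumption made for the example rather than a description of what the server does.

```python
GX_COPY = 3  # GXcopy: the result is the source pixel
GX_XOR = 6   # GXxor: the result is source XOR destination


def apply_gc_function(function, src, dst):
    """Combine a source and destination pixel using a GC function code."""
    if function == GX_COPY:
        return src
    if function == GX_XOR:
        return src ^ dst
    # Assumption for this sketch: anything else is rejected outright.
    raise NotImplementedError(f"GC function {function} is not supported")


if __name__ == "__main__":
    background = 0x123456
    once = apply_gc_function(GX_XOR, 0xFFFFFF, background)
    twice = apply_gc_function(GX_XOR, 0xFFFFFF, once)
    print(hex(once), hex(twice))
```

Xor is worth having because applying it twice with the same source restores the original pixel, which is how applications traditionally draw rubber-band outlines.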

The server also ships without a window manager, which is a problem because a number of applications expect one to be running. The code includes a parameter specifying an Android service to be launched once the X server is running, and this is intended to start a window manager. But first someone will have to implement a window manager in Android, and doing that properly requires a re-implementation of Xlib. Not me, I’m afraid.

However, there is a workaround. Because access control is disabled by default, you can run a window manager remotely, e.g. fvwm -d xxx.xxx.xxx.xxx:0. Not very efficient in terms of network traffic, but it works.
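The :0 at the end of the display name is the display number, and by convention an X server listens for TCP connections on port 6000 plus that number. A small sketch of how a display name maps to a host and port follows. The function name is made up for the example, the address is a placeholder from the documentation range, and IPv6 addresses and local (host-less) displays are not handled.

```python
X11_BASE_PORT = 6000  # display N conventionally listens on TCP port 6000 + N


def parse_display(display):
    """Split 'host:display[.screen]' into (host, display number, TCP port)."""
    host, sep, rest = display.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:display, got {display!r}")
    display_number = int(rest.split(".")[0])  # drop the optional screen part
    return host, display_number, X11_BASE_PORT + display_number


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address, not a real phone.
    print(parse_display("192.0.2.10:0"))
```

So a remote client pointed at display :0 on the phone needs to be able to reach TCP port 6000 on the phone's address.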


PhD final draft complete

The last few months were spent at conferences and sitting in front of a computer knocking out the final draft of the thesis. I was working on another project over the Christmas break, but I’ll discuss that in another post. What I’m going to talk about here is the status of our Heritage Health Prize efforts.

Basically, we’ve stopped working on it. We haven’t submitted since August last year, and we’ve slipped to 12th position (although we climbed three places due to teams ahead of us merging). There are a few reasons …

  • We ran out of ideas. Simple as that. Actually, I’ve got a few ideas I haven’t tried, but they’re a lot of effort to implement and the pay-off won’t be worth it.
  • It’s not cost-effective. The US$3 million prize won’t go off, trust me. So the best we can hope for is $500,000. Shared between two people, for two years’ work, converted to Australian dollars, I’d be better off with a real job. And that’s assuming we win.
  • I’ve learned all I wanted to learn. I’ve never done data mining before, so one motivation for competing was to get up to speed on the latest techniques. Mission accomplished. I’ve now resurrected my undergrad linear regression skills and learned all about decision trees. When it comes to learning new skills I hit the point of diminishing returns long ago.
  • It isn’t useful. Probably the greatest pleasure I gain from writing software is knowing that someone will use it. I hate wasted effort. That’s why I much prefer the business world over academia. Unfortunately, due to privacy safeguards, the data provided in the competition is nothing like real world data, so the algorithms we develop will never be used in practice. That wasn’t the competition’s intention, but that’s the way it will play out.

Continuing on the last point, consider the following example. Probably the easiest hospitalization outcome to predict is childbirth. On the day a pregnant woman gets her first medical check-up, the doctor can pretty much pencil in the date she’ll need a hospital bed. Sure, some pregnancies end in miscarriage or late-term abortion, but they often require hospitalization as well.

Unfortunately, the HHP data doesn’t contain enough information to figure out the date of conception. Or to tell for certain if the patient was pregnant. Or if they had an abortion after discovering they were pregnant. You can tell when they actually gave birth (it’s a hospitalization event with a specific code), but when I tried to predict those outcomes I was wrong almost as often as I was right.

In other words, I think the world’s best data mining software, trained on crippled data, will be less effective at predicting hospitalization than a medical professional using real data. So the software will never be used, and all the effort will be wasted.


Rooted Android phone

After much stuffing around (documented here) I succeeded in rooting my HTC Tattoo. What was once running Android 1.6 is now sporting a shiny new 2.3.7 OS.

The new OS gives me access to the Bluetooth API, which I’ll use for testing peer-to-peer communications.


Javascript graphics

It has been a while since my last post. During that time I made a concerted push to complete the first draft of my thesis, and all of my 20 percent time was spent on the Heritage Health Prize (we got as high as fourth, but haven’t submitted for a while and slipped out of the top 10).

Anyway, the first draft of the thesis is done, so until I get back from some conferences and start on a re-write, I have a lot of free time. Right now probably 90 percent of my time is 20 percent time.

One technology I needed to brush up on was Javascript, in particular AJAX and using Javascript to do graphics. So I set myself the task of implementing an interactive demand curve. It’s something that may be useful for a dotcom idea I’m kicking around, but at the very least it will provide some working reference code that I can copy and paste to other projects.

When looking around for Javascript graphics libraries I came across Raphael. Very impressive. It’s a drop-in library that works with all modern browsers. It’s easy to use, and the results look great. Best of all, it provides an animate command, so when you update any element on screen it animates to the new position. It’s fun to watch, and no more effort to implement than a regular non-animated update.

If you’re interested, the interactive graph is here.


Planet Melbourne

For the past month all my 20% time has been spent on the Heritage Health Prize, which went live two weeks ago. Competing as Planet Melbourne we’re currently holding down 5th position among 100+ teams, so the early preparation is paying off.

Unfortunately, because of the high stakes involved in this competition, I can’t really say anything about algorithms or strategies. The difference in scores between the top teams is so tight that giving a competitor even a hint could push you out of the prizemoney. Personally, I think the 0.4 threshold for the $3 million prize is unlikely to be reached unless the June 4 data release is something special, but I’ll settle for half of the $500,000 consolation prize.


Heritage Health Prize

My current side project is the Heritage Health Prize, a data mining competition with a $3 million first prize. I’ve teamed up with a much smarter friend and started playing around with new algorithms, in particular Random Forests.

I think we’re in with a chance. I believe that the difference between mathematicians and programmers is that mathematicians try to model the world with equations, whereas programmers use look-up tables, and this competition seems to favour the programmers (like me).

To prevent individual patients from being identified, the dataset has been massaged to conceal rarely-occurring variables. So instead of providing the patient’s age as a number they give us a range, accurate to the nearest decade. Instead of a continuous variable, which could be plugged into an equation, we have nine discrete values, which favours a look-up table.
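To make the look-up table idea concrete, here is a toy sketch. It has nothing to do with our actual algorithms: the age-range labels, the outcome numbers, and the function names are all made up for illustration. The table simply stores the average outcome seen for each age range and returns it as the prediction.

```python
from collections import defaultdict


def build_lookup(records):
    """Average the outcome for each age range seen in the training records."""
    totals = defaultdict(lambda: [0.0, 0])   # age range -> [sum, count]
    for age_range, outcome in records:
        totals[age_range][0] += outcome
        totals[age_range][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def predict(lookup, age_range, default=0.0):
    """Look the age range up; fall back to a default for unseen ranges."""
    return lookup.get(age_range, default)


if __name__ == "__main__":
    # Made-up training records: (age range, days in hospital).
    training = [("20-29", 0), ("20-29", 2), ("70-79", 3), ("70-79", 5)]
    table = build_lookup(training)
    print(predict(table, "20-29"), predict(table, "70-79"))
```

No equation relating age to outcome is needed. With only nine possible values, the table can capture any shape of relationship the data happens to show.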

For the time being we’re limiting our efforts to building generic algorithms rather than focusing on the actual data. There are some inconsistencies in the dataset, and I wouldn’t be surprised if there are changes before the release of the rest of the data on May 4th.

But it looks like a fun competition, and I’m picking up some useful skills along the way.

