Microsoft – Photogrammetry, CO
The following are excerpts from the talk given by Franz Leberl at the Map Asia 2007 conference, held in Kuala Lumpur, Malaysia, from 14 to 16 August 2007.
That was then… and now…
Last year, when I was at Map Asia and presented the maps, everything was in 2D. Now, when you access https://maps.live.com, there is a 3D button on the 2D website. A dimension has been added, so people can orient themselves and get a sense of the steepness of the terrain. The focus was initially on the US and the rest of North America, followed by Europe and Japan (Asian cities have been neglected!).
What does it mean?
This means that the website is in 3D. It also means that more than 100 cities, such as Atlanta, Las Vegas and Los Angeles, are now present on the internet with their buildings and other features (Figure-1a & b). The user interface takes hardly any time to load and allows users to navigate in 3D.
Where does this come from?
It comes from a vision that Bill Gates formulated on his 50th birthday, celebrated in London. The vision, as he stated it, was: “You’ll be walking around in downtown London and be able to see the shops, the stores, see what the traffic is like. Walk in a shop and navigate the merchandise. Not in the flat, 2D interface that we have on the web today, but in a virtual reality walkthrough.” This means that the 3D models need to be created without even a minute of manual labour, in a completely automated way, from aerial photographs with a resolution of 10-15 cm per pixel.
This means that if, tomorrow, we map the entire world at 15 cm per pixel, covering only the land masses, with the cities and buildings for 6 billion people, we would create a dataset of 22 petabytes (1 petabyte = 1024^5 bytes). Imagine this for at least 200-300 cities! This is not all. If we add the street data at 2 cm per pixel and the indoor spaces of shopping malls, monuments, churches, temples and so on at 0.5 cm per pixel, imagine the size of the datasets created; handling all this becomes the real challenge. Yes, this is the vision of Bill Gates, it is the vision of Microsoft – Photogrammetry, and Microsoft is committed to it. However, what Microsoft – Photogrammetry does will be driven by Virtual Earth (VE) requirements. This will require sensing through aerial photogrammetry and automating the extraction of 3D data.
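To get a feel for where a figure of this size comes from, the following is a rough back-of-envelope sketch, not the calculation used in the talk. The land area of about 149 million km² and the 3 bytes per pixel (uncompressed 8-bit colour) are assumptions; only the 15 cm and 2 cm pixel sizes and the definition of a petabyte come from the text above.

```python
# Back-of-envelope estimate of raw imagery volume for the land masses.
LAND_AREA_KM2 = 149e6   # ASSUMPTION: approximate land area of the Earth
BYTES_PER_PIXEL = 3     # ASSUMPTION: uncompressed 8-bit RGB
GSD_AERIAL_M = 0.15     # from the talk: 15 cm per pixel
GSD_STREET_M = 0.02     # from the talk: 2 cm per pixel for street data
PETABYTE = 1024 ** 5    # the talk's definition of a petabyte


def imagery_petabytes(area_km2, gsd_m, bytes_per_pixel):
    """Raw data volume in petabytes for one coverage of an area."""
    area_m2 = area_km2 * 1e6
    pixels = area_m2 / (gsd_m ** 2)   # each pixel covers gsd_m x gsd_m
    return pixels * bytes_per_pixel / PETABYTE


def pixel_count_ratio(coarse_gsd_m, fine_gsd_m):
    """How many more pixels the finer resolution needs for the same area."""
    return (coarse_gsd_m / fine_gsd_m) ** 2


land_pb = imagery_petabytes(LAND_AREA_KM2, GSD_AERIAL_M, BYTES_PER_PIXEL)
street_factor = pixel_count_ratio(GSD_AERIAL_M, GSD_STREET_M)

print(f"Land masses at 15 cm: about {land_pb:.0f} PB")
print(f"Same area at 2 cm needs about {street_factor:.0f}x more pixels")
```

Under these assumptions the result lands in the same order of magnitude as the 22 petabytes quoted. The second figure shows why street-level and indoor data are the real challenge: the pixel count grows with the square of the resolution improvement.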
Some of the hardware used in this process is shown in Figure-2; the most important piece is the digital aerial camera, the UltraCamX (Figure-3 a & b).
This camera produces images of almost 10k by 15k pixels every time you push the trigger. The resolution in this case is around 6 cm per pixel, and we can get down to 2½ cm per pixel. We can, or must, try to get to the resolution that people want from street-side digital sensors. The DEM is produced automatically from the dataset.
But I was once asked by Bill Gates (in January 2006), “What’s so hard about large format digital aerial cameras?”
And I said, “You need to create 3GB per sec and you have to do it in terms of 432MB per trigger. You have to trigger the camera at an interval of 1 second or so in order to get high overlap as you fly. You have to store the data at the rate of 2TB in a single instrument, and you’d like to swap the instrument while you are in the air, so that you have an unlimited number of images in one aerial flight, without ever going back to the airport. You have to do this with a geometric accuracy of 2mm across the 15k pixels. Then you have to create each image with 7k gray values at almost 13 bits per pixel. And you have to do all this while it’s cold, while it is shaking, while the wind is blowing, while it is moist or bone dry.”
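The figures in this answer can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a trigger interval of exactly 1 second; both are assumptions. It suggests that 432 MB per trigger at one trigger per second corresponds to roughly 3.5 gigabits per second, so the quoted "3GB per sec" fits best if it is read as gigabits, which is again only an assumption.

```python
import math

# ASSUMPTION: decimal units; the talk does not say which convention it uses.
MB = 10 ** 6
TB = 10 ** 12

bytes_per_trigger = 432 * MB    # from the talk
trigger_interval_s = 1.0        # from the talk: "1 second or so"
storage_bytes = 2 * TB          # from the talk: 2 TB per storage unit
gray_values = 7000              # from the talk: "7k gray values"

# Sustained data rate while the camera is triggering.
bytes_per_second = bytes_per_trigger / trigger_interval_s
gigabits_per_second = bytes_per_second * 8 / 1e9

# How long one storage unit lasts before it has to be swapped.
images_per_unit = storage_bytes // bytes_per_trigger
minutes_per_unit = images_per_unit * trigger_interval_s / 60

# Bits needed to distinguish the quoted number of gray values.
bits_per_pixel = math.log2(gray_values)

print(f"Data rate: {gigabits_per_second:.2f} Gbit/s")
print(f"Images per storage unit: {images_per_unit}")
print(f"Continuous triggering per unit: {minutes_per_unit:.0f} min")
print(f"Bits for {gray_values} gray values: {bits_per_pixel:.2f}")
```

Under these assumptions one storage unit holds a few thousand frames, which illustrates why swapping the unit in the air matters for long flights, and 7k gray values does indeed need just under 13 bits.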
Bill Gates exclaimed, “Now I understand, thank you!” Perhaps there are 1000 of these cameras operational, flying in airplanes from South Africa to Pittsburgh and from Japan to Brazil, and we have installed 77 of these around the world.
For the accuracies, we flew the camera over a block (marked in blue in Figure-4). For the geometric accuracies, we flew 9 flight lines in the N-S direction and 5 flight lines in the E-W direction, with 80% forward and 60% sideways overlap. For the elevation accuracies, we took 79 images in the N-S direction and 90 images in the E-W direction. After processing, we obtained the elevation accuracies shown on the right-hand side of Table-1, and you have to relate them to the GSD (9 cm, 13 cm, etc.). What this really means is that the aerial triangulation is accurate to ½ a pixel, which is fantastic.
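The relationship between overlap, trigger interval and accuracy can be sketched as follows. This is a minimal illustration, not the flight planning used for the test block: the along-track image size of 10,000 pixels (taken loosely from the "almost 10k" above) and the aircraft ground speed of 70 m/s are assumptions, and the function names are hypothetical.

```python
def photo_base_m(pixels_along_track, gsd_m, forward_overlap):
    """Distance flown between two exposures for a given forward overlap."""
    footprint_m = pixels_along_track * gsd_m   # ground length of one image
    return footprint_m * (1.0 - forward_overlap)


def trigger_interval_s(base_m, ground_speed_m_s):
    """Time between exposures needed to keep that photo base."""
    return base_m / ground_speed_m_s


def accuracy_on_ground_m(gsd_m, pixel_fraction=0.5):
    """Ground accuracy when triangulation is good to a fraction of a pixel."""
    return gsd_m * pixel_fraction


PIXELS_ALONG_TRACK = 10_000   # ASSUMPTION: "almost 10k" pixels along track
GROUND_SPEED_M_S = 70.0       # ASSUMPTION: typical survey aircraft speed

for gsd in (0.09, 0.13):      # the GSD values quoted in the talk
    base = photo_base_m(PIXELS_ALONG_TRACK, gsd, forward_overlap=0.80)
    interval = trigger_interval_s(base, GROUND_SPEED_M_S)
    accuracy = accuracy_on_ground_m(gsd)
    print(f"GSD {gsd * 100:.0f} cm: base {base:.0f} m, "
          f"trigger every {interval:.1f} s, "
          f"half-pixel accuracy {accuracy * 100:.1f} cm")
```

The sketch shows why a smaller GSD or a higher overlap forces a shorter trigger interval, and how "½ a pixel" translates into centimetres on the ground for each GSD.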
Let me speculate about what will happen by the time we convene next year for Map Asia 2008! There’s a lot of software work going on. One interesting piece of software that can be expected will recognize moving objects and remove them automatically from the final digital terrain model. To illustrate this, look at the image (Figure-5).
On the left-hand side of the image there are many cars; some of them are driving and will be gone in 5 seconds, and the parked cars will probably be gone by tomorrow. So the ongoing software work is to automate the recognition of each car, remove it, and use an infill algorithm to replace the area with the surrounding texture in an intelligent way (right-hand side of the image). This would even take into account pedestrians crossing the path. So that kind of software is being developed. It relies on the redundancy provided by the digital cameras. By next year, Microsoft will have more than 500 cities. At present the rate of production is more than 1 city per day, and by the second half of next year this will be accelerated to 2-3 cities per day. This will lead to 5000 cities in the next five years.
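One simple way to see how redundancy helps is a per-pixel median across several overlapping views of the same ground area. This is a swapped-in textbook technique used purely for illustration, not the recognition and infill software described in the talk. The tiny grayscale grids are hypothetical and are assumed to be already aligned to each other.

```python
from statistics import median


def median_composite(views):
    """Per-pixel median over co-registered views of the same area.

    An object that appears at a given pixel in fewer than half of the
    views (for example a moving car) is replaced by the background value.
    """
    rows = len(views[0])
    cols = len(views[0][0])
    return [
        [median(view[r][c] for view in views) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2x3 road patch (gray value 100) seen in three overlapping
# images; a bright car (250) moves between the first two exposures.
views = [
    [[100, 100, 100],
     [100, 250, 100]],   # car at row 1, column 1
    [[100, 100, 100],
     [100, 100, 250]],   # car has moved to row 1, column 2
    [[100, 100, 100],
     [100, 100, 100]],   # car has left the patch
]

composite = median_composite(views)
print(composite)
```

A moving car disappears from the composite because it covers each pixel in only one of the three views. A parked car would be present in every view and would survive the median, which is why recognition and texture infill, as described above, are still needed.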
Another interesting piece of software that is coming is “Photosynth”. This software takes a large collection of photos of a place or an object, analyzes them for similarities, and displays them in a reconstructed three-dimensional space. Figure-6 shows several photographs of the Piazza San Marco in Venice, Italy. The software loads them, performs an automated orientation, and creates a 3D model from the hundreds or thousands of images, which can then be managed and presented back to us. This is the management of images that are not very large. Then there is the other software, “Seadragon”, which can manage thousands of digital aerial images, each up to ½ GB at 16 bits per pixel. Next year, Photosynth will be a part of VE.
Another new thing is Street Side Imaging at Microsoft. This technology has been developed to stitch imagery together (Figure-7). The huge image collection is captured while cars are driven around, and the images are later stitched together. This capability will also be available a year from now.
So I look forward to the continued huge success of our system, and next year we will report some more exciting news.