Building a Virtual City from the Real World

Time Warner Center, converted by SparseWorld

From the early days of modern computing, the ability to simulate massive virtual worlds has been an attractive and lucrative concept. The games SimCity and Grand Theft Auto, featuring increasingly elaborate worlds in each version, have sold millions of copies each.

The 1995 movie “Hackers” imagined a filesystem that looked like an intricate virtual city. Massively multiplayer online games (MMOs) like World of Warcraft and multiplayer sandbox games like Minecraft have given players huge, even infinite worlds to explore or shape. In each case, the world is either created through tireless work by designers and graphic artists, or randomly generated by a procedural algorithm. Building a virtual world that mimics real-world locations, by contrast, has invariably carried a prohibitive cost.

Three trends in modern computing have now come together to make detailed virtual copies of real-world locations possible. First, in an effort to support more detailed real-world navigation, many companies and academic groups have either manually created 3D models of buildings in large cities or developed the technology to create such models automatically. Google Earth contains a combination of automatically generated buildings and models hand-created by users and employees. Microsoft and others offer mapping products with 3D buildings. The Computer Graphics and Immersive Technologies Laboratory at the University of Southern California has developed a method of creating 3D building models from LiDAR data. Second, extensive high-resolution orthoimagery (geometrically corrected aerial and satellite imagery) and terrain information are freely available from sources such as the United States Geological Survey’s EROS service. Third, in the quest to satisfy the needs of so-called “Big Data”, a plethora of techniques have been developed to process large quantities of data in a highly parallel manner, which is vital for generating detailed virtual worlds in reasonable time.


On the hypothesis that data and technology have developed sufficiently to make the creation of virtual worlds from data about the physical world feasible, I have created a system called SparseWorld. SparseWorld combines orthoimagery, bathymetric and elevation data from the USGS EROS service, and 3D buildings from Google’s 3D Warehouse. It can generate a full photorealistic terrain model of New York City with select buildings for the creative sandbox game Minecraft in a few hours on a cluster of servers containing an aggregate 300 cores and 200 GB of RAM. In the remainder of this article, I will explain how SparseWorld works, discuss some of the most significant technical hurdles I faced, and conclude with a look forward at what’s next for SparseWorld.

How SparseWorld Works

As a proof-of-concept prototype, the SparseWorld system is written in Python and combines components from several existing projects. The TopoMC project, itself built on a Python library called PyMCLevel, can generate scaled-down wilderness terrain from USGS elevation and groundcover data. TopoMC provided a base on which to code a converter that could also pull in satellite imagery, generate full-scale terrain (in which one virtual meter equals one real-world meter), and parallelize the terrain conversion across many cores or many machines. Because sample 3D building models from Google’s 3D Warehouse are available in the Collada format, I used the PyCollada library to parse the structure of 3D buildings and convert them into voxelized models appropriate for inclusion in a virtual world. I designed and implemented my own parallelization system on top of Python’s multithreading capabilities to allow terrain segments and buildings to be converted in parallel.
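To make the idea of converting terrain segments in parallel concrete, here is a minimal sketch using the standard library's `concurrent.futures`. It is not SparseWorld's actual parallelization system: `convert_tile` and `convert_all` are hypothetical names, and the per-tile work is a placeholder. It only illustrates that independent tiles can be handed to a pool of workers and collected in order.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_tile(tile_xy):
    """Hypothetical stand-in for converting one terrain tile."""
    tx, ty = tile_xy
    # Real work would read the terrain data for this tile and emit blocks;
    # here we only return a label so the flow of data is visible.
    return (tx, ty, f"tile_{tx}_{ty}")


def convert_all(tiles, max_workers=4):
    """Convert independent tiles in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        return list(pool.map(convert_tile, tiles))


tiles = [(x, y) for x in range(2) for y in range(2)]
results = convert_all(tiles)
```

Because of CPython's global interpreter lock, CPU-bound conversion work would likely benefit more from a process pool or from spreading work across machines; a thread pool is used here only to keep the sketch short.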

SparseWorld collects, converts, and combines the map and building datasets in six steps, five of which are currently implemented:

  1. GetRegion Phase: Determine what areas of real-world terrain data need to be fetched, download the relevant elevation, landcover, and orthoimagery data, and stitch together pieces as necessary.
  2. PrepRegion Phase: Warp and combine data into one large 8-layer GeoTIFF image. Layers are elevation, landcover, core depth, bathymetric depth, terrain red channel, terrain green channel, terrain blue channel, and terrain IR channel.
  3. BuildRegion Phase 1: Generate Minecraft tiles (16 meter x 16 meter vertical slices of the terrain) from the terrain data. This phase is well-suited to parallelization.
  4. BuildRegion Phase 2: Weld tiles into regions, 512 meter x 512 meter vertical slices of the terrain, each of which is stored in a single file. This phase is reasonably well-suited to parallelization.
  5. StreetCorrect Phase: Generate 2D splines from OpenStreetMap data, and use them to correct building shadows and overlaps over streets in the orthoimagery. (Planned)
  6. BuildingConvert Phase: Generate voxelized building, structure, and tree models from Collada 3D models, then place onto terrain. (Implemented, data missing) This phase will be parallelized with a pool of converter workers, a crossbar, and a pool of terrain region workers. (Planned)
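
The relationship between the 16-meter tiles and 512-meter regions in the two BuildRegion phases can be sketched as simple integer arithmetic. The function names below are hypothetical, and `x` and `z` are assumed to be horizontal offsets in meters from the world origin; only the 16 and 512 figures come from the phase descriptions above.

```python
TILE_SIZE = 16       # meters per tile edge (BuildRegion Phase 1)
REGION_SIZE = 512    # meters per region edge (BuildRegion Phase 2)
TILES_PER_REGION_EDGE = REGION_SIZE // TILE_SIZE  # 32 tiles along each edge


def tile_of(x, z):
    """Tile index containing the horizontal position (x, z), in meters."""
    # Floor division keeps negative coordinates in the correct tile.
    return (x // TILE_SIZE, z // TILE_SIZE)


def region_of_tile(tx, tz):
    """Region index containing the tile (tx, tz)."""
    return (tx // TILES_PER_REGION_EDGE, tz // TILES_PER_REGION_EDGE)


def group_tiles_by_region(tiles):
    """Bucket tile indices by the region file they would be welded into."""
    regions = {}
    for tx, tz in tiles:
        regions.setdefault(region_of_tile(tx, tz), []).append((tx, tz))
    return regions


groups = group_tiles_by_region([(0, 0), (31, 31), (32, 0), (-1, 5)])
```

Grouping tiles this way suggests why the welding phase also parallelizes: each region depends only on its own 32 x 32 block of tiles.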

In addition to these phases, it is clear that missing detail or errors in existing datasets will require some manual verification and tuning of the final results. While the architecture of the SparseWorld system was straightforward to design, the technical hurdles were significant, ranging from opaque file formats to game engine limitations to difficulty obtaining necessary datasets.

Technical Hurdles

Developing the SparseWorld system pushed me to solve several interesting subproblems, some related to my specialty of distributed computing, some in fields unfamiliar to me. The earliest challenge I ran across was converting mesh building models into properly-shaped and -textured voxel models. This was further complicated by details of the Collada model format that initially eluded me. Later, after I implemented terrain creation, conversions between different types of geographical coordinates led to small, hard-to-trace errors in terrain orientation and the alignment of buildings onto the terrain. Finally, while not strictly novel, developing an efficient method of parallelizing the terrain and building conversion required carefully considering the dependencies between the various phases of creating a world. I’ll focus on the process of voxelizing a mesh as the most interesting of the issues faced.

The process used to voxelize mesh models of buildings into voxel models appropriate for Minecraft went through several revisions and refinements. The very first version, a “strawman” or naïve implementation, computed the rectangular prism enclosing each mesh triangle and filled every voxel that intersected the prism with stone. This produced blocky but recognizable models, showing that I was correctly interpreting the original models’ mesh coordinates. The second version iterated through all of the voxels in the rectangular prism, filling a voxel only if the mesh triangle intersected it. While this made most meshes convert properly, particularly thin triangles would occasionally leave holes in the resulting model. A colleague specializing in scientific computing helped me develop a final refinement that iterates over the surface of the mesh rather than the volume of the enclosing prism, reducing the complexity of mesh conversion from O(n^3) to O(n^2) while improving accuracy and making texture-mapping much simpler.
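
The surface-iteration idea can be illustrated with a short sketch. This is not the SparseWorld implementation: it is one plausible way to walk a triangle's surface, sampling points on a barycentric grid and marking the voxel that contains each sample. The function names, the `step` parameter, and the assumption that mesh coordinates are already in meters (one voxel per meter) are all illustrative, and this simple sampling is not guaranteed to be gap-free for every triangle.

```python
import math


def voxelize_triangle(a, b, c, step=0.5):
    """Return the set of integer voxels hit by sampling the triangle's surface.

    a, b, c are (x, y, z) vertices in voxel units; step is the maximum
    spacing between neighboring samples along any edge.
    """
    def dist(p, q):
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

    longest = max(dist(a, b), dist(b, c), dist(c, a))
    n = max(1, math.ceil(longest / step))  # subdivisions along each edge

    voxels = set()
    for i in range(n + 1):
        for j in range(n + 1 - i):
            # Barycentric weights (u, v, w) sum to 1 and stay inside the triangle.
            u, v = i / n, j / n
            w = 1.0 - u - v
            point = tuple(u * a[k] + v * b[k] + w * c[k] for k in range(3))
            voxels.add(tuple(math.floor(coord) for coord in point))
    return voxels


def voxelize_mesh(triangles, step=0.5):
    """Union of the voxels touched by every triangle in the mesh."""
    voxels = set()
    for a, b, c in triangles:
        voxels |= voxelize_triangle(a, b, c, step)
    return voxels


tri = ((0.5, 0.5, 0.5), (3.5, 0.5, 0.5), (0.5, 3.5, 0.5))
filled = voxelize_triangle(*tri)
```

The number of samples grows with the square of the triangle's edge length rather than with the volume of its bounding prism, and each sample's barycentric weights could also be reused to look up a texture coordinate.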

What’s Next for SparseWorld

The current SparseWorld system is capable of generating 277 million square meters of terrain representing Manhattan Island and surrounding areas, comprising 71 billion cubic meters of compressed world information, in a few hours using a handful of powerful machines. The most pressing technical issue is the parallelization of the building conversion, which is at the top of the feature-implementation list. The biggest non-technical issue is the acquisition of a full 3D building dataset. While I have reached out to Google, Bing, and others, I have had little success acquiring the necessary data to create a complete map of the city with all buildings.
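
As a rough check on these figures, the arithmetic below relates the quoted area to the quoted volume. The 256-meter world height is an assumption (Minecraft's build height limit at the time, at one block per meter), not a number stated in this article; the 512-meter region size comes from the BuildRegion description.

```python
AREA_M2 = 277_000_000      # terrain area quoted above, in square meters
WORLD_HEIGHT_M = 256       # assumption: 256-block build height, 1 block = 1 m
REGION_EDGE_M = 512        # region edge length from BuildRegion Phase 2

volume_m3 = AREA_M2 * WORLD_HEIGHT_M               # cubic meters of world data
region_files = AREA_M2 / (REGION_EDGE_M ** 2)      # approximate region count

print(f"{volume_m3 / 1e9:.1f} billion cubic meters")
print(f"about {region_files:.0f} region files")
```

Under that assumption the volume comes to roughly 71 billion cubic meters, which matches the figure quoted above.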

Beyond those two issues, future development will focus on increasing the fidelity of the generated worlds. For example, a colleague has experimented with using computer vision techniques to recognize repeated patterns in buildings, which could be applied to identify windows in building models’ textures. Google Earth’s dataset includes accurate placement and models of trees in many cities, which would improve SparseWorld’s guesses of where trees should go based on landcover data and the infrared channel of USGS orthoimagery. Finally, to make the system performant, a compiled language rather than Python would be preferable, at the considerable development cost of re-implementing libraries like PyMCLevel and PyCollada.
