I have a table with polygons that I wish to display in a map. I would like the smaller polygons to appear on top of larger polygons. Is there a way to set the drawing order such that the larger polygon outlines don't cover up the smaller ones?
I've tried using the polygon area attribute as a transparency factor within the layer's symbology properties - but this is not working well enough.
Since you already have a field carrying the area, just run the Sort tool on the feature class using that field with the DESCENDING option; larger features will then draw first, leaving the smaller ones on top.
Note: The Sort tool requires an advanced license.
You could add a new field called Size, and then field calc either Small or Large into the new field. Then symbolize by Unique Values based on the Size field. Then go into Advanced --> Symbol Levels, make sure Draw this layer using symbol levels specified below is checked, and that the Small label is above the Large label.
However you choose to do it, what you're looking for is the Symbol Levels option.
If these are gdb feature classes, another option would be to use a Definition Query that is based on the SHAPE_AREA field. Just right-click on the layer in the TOC and copy and paste it. Place the new layer on top of the old layer, and in the new layer add a definition query that will filter out any features that are bigger than ____. Now you have the smaller features on top of the larger features.
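The ordering principle behind all of these answers can be sketched in plain Python (the feature records and the SHAPE_AREA field below are illustrative stand-ins, not an ArcGIS API):

```python
# Sketch of the drawing-order principle: render larger polygons first so
# that smaller ones are painted last, on top. Feature records here are
# hypothetical dicts, not real geodatabase rows.
def drawing_order(features, area_field="SHAPE_AREA"):
    """Return features sorted by descending area (largest drawn first)."""
    return sorted(features, key=lambda f: f[area_field], reverse=True)

features = [
    {"name": "parcel", "SHAPE_AREA": 120.0},
    {"name": "county", "SHAPE_AREA": 50_000.0},
    {"name": "state", "SHAPE_AREA": 2_000_000.0},
]

order = [f["name"] for f in drawing_order(features)]
print(order)  # ['state', 'county', 'parcel'] -- small parcel drawn last, on top
```

Sorting descending by area is exactly what the Sort-tool answer does at the table level, and what Symbol Levels do at the renderer level.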
Choosing the format in which to store your data is often a balance of many factors, including the needs of your organization or users, purpose of the data, size of the data, analysis or data maintenance requirements, and so on. However, in terms of speed, shapefiles are generally fastest, followed by personal geodatabases, then file geodatabases. For optimal map performance, the data in your map should reside locally on the computer that is being used to view it rather than on a remote machine. For maps that will typically display relatively small areas of large datasets, serving data via ArcSDE will yield significant performance benefits over storing this same data in files.
Avoid personal geodatabases for any situation where you must have multiuser access or are serving the map, since this format was not designed for these purposes.
Here are some additional considerations for setting up your data:
- Keeping all projections the same—If possible, keep all data in a single projection and use the same projection in the data frame when working in ArcMap. This is particularly important to keep in mind when editing data or authoring data to be served. When all layers are in the same projection, the performance penalty incurred by on-the-fly projection calculations can be avoided.
- Working with joined or related data—Data from appended fields accessed through joins and relates can be used to symbolize and label features, to perform queries, and for many other operations. However, accessing data through joins and relates can slow performance. See Essentials of joining tables for details on optimizing table joins. In addition, you can export the data to new feature classes that contain the joined or related information.
- Using attribute or spatial indexes—If the data source allows it, index any fields used for querying or rendering. Indexes are specific to each data format. For more information, see Modifying indexes in shapefiles by adding a spatial index and A quick tour of setting a spatial index (geodatabases). The geoprocessing framework also provides a tool to create attribute indexes: the Add_Attribute_Index tool.
- Simplifying data—Use simplified or generalized versions of layers when displaying at smaller scales. For example, a detailed map of world coastlines may draw slowly at full scale. If this layer is simplified to have fewer vertices and line segments, it will draw much faster with little difference in appearance at a small scale. In addition, simplified data can improve performance for querying and identify operations.
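The simplification idea in the last point can be illustrated with a minimal Ramer-Douglas-Peucker sketch in plain Python. This is a generic illustration of vertex reduction, not the algorithm ArcGIS necessarily uses internally:

```python
import math

def _point_line_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * px - dx * py + bx * ay - by * ax) / math.hypot(dx, dy)

def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: recursively drop vertices that lie within
    `tolerance` of the line between the segment endpoints."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the endpoint-to-endpoint line.
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_line_dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax <= tolerance:
        return [points[0], points[-1]]          # everything in between is noise
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right                     # avoid duplicating the split vertex

# A toy "coastline" of 8 vertices; simplification keeps only 4 of them.
coast = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
print(simplify(coast, 1.0))
```

Fewer vertices means fewer line segments to project, clip, and draw, which is why a generalized copy of a layer renders faster at small scales.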
Types of feature classes
Vector features (geographic objects with vector geometry) are versatile and frequently used geographic data types, well suited for representing features with discrete boundaries, such as streets, states, and parcels. A feature is an object that stores its geographic representation, which is typically a point, line, or polygon, as one of its properties (or fields) in the row. In ArcGIS, feature classes are homogeneous collections of features with a common spatial representation and set of attributes stored in a database table, for example, a line feature class for representing road centerlines.
When creating a feature class, you'll be asked to set the type of features to define the type of feature class (point, line, polygon, and so forth).
Generally, feature classes are thematic collections of points, lines, or polygons, but there are seven feature class types. The first three are supported in databases and geodatabases. The last four are only supported in geodatabases.
- Points: Features that are too small to represent as lines or polygons, as well as point locations (such as GPS observations).
- Lines: Represent the shape and location of geographic objects, such as street centerlines and streams, too narrow to depict as areas. Lines are also used to represent features that have length but no area, such as contour lines and boundaries.
- Polygons: A set of many-sided area features that represents the shape and location of homogeneous feature types such as states, counties, parcels, soil types, and land-use zones.
- Annotation: Map text including properties for how the text is rendered. For example, in addition to the text string of each annotation, other properties are included such as the shape points for placing the text, its font and point size, and other display properties. Annotation can also be feature linked and can contain subclasses.
- Dimensions: A special kind of annotation that shows specific lengths or distances, for example, to indicate the length of a side of a building or land parcel boundary or the distance between two features. Dimensions are heavily used in design, engineering, and facilities applications for GIS.
- Multipoints: Features that are composed of more than one point. Multipoints are often used to manage arrays of very large point collections, such as lidar point clusters, which can contain literally billions of points. Using a single row for such point geometry is not feasible. Clustering these into multipoint rows enables the geodatabase to handle massive point sets.
- Multipatches: A 3D geometry used to represent the outer surface, or shell, of features that occupy a discrete area or volume in three-dimensional space. Multipatches comprise planar 3D rings and triangles that are used in combination to model a three-dimensional shell. Multipatches can be used to represent anything from simple objects, such as spheres and cubes, to complex objects, such as iso-surfaces and buildings.
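As a rough illustration of the multipoint idea above, here is how a large flat point list could be batched into multipoint "rows." This is plain Python, not a geodatabase API, and the row size of 1,000 is an arbitrary assumption:

```python
# Sketch of the multipoint concept: instead of one table row per point,
# batch huge point collections (e.g., lidar returns) into multipoint rows
# of a fixed size, so the table holds thousands of rows, not billions.
def to_multipoints(points, points_per_row=1000):
    """Group a flat point list into multipoint rows of at most
    points_per_row points each."""
    return [points[i:i + points_per_row]
            for i in range(0, len(points), points_per_row)]

lidar = [(x * 0.5, x * 0.25) for x in range(2500)]  # 2,500 fake returns
rows = to_multipoints(lidar)
print(len(rows), [len(r) for r in rows])  # 3 [1000, 1000, 500]
```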
Navigating in the data frame and working with its layers
The Tools toolbar is one of the primary ways that you interact with geographic information displayed in the data frame. It contains tools for working with the contents within the active data frame, for example, to pan and zoom your map, to identify features, and to measure distances.
- Zoom In: Allows you to zoom in to a geographic window by clicking a point or dragging a box.
- Zoom Out: Allows you to zoom out from a geographic window by clicking a point or dragging a box.
- Pan: Allows you to pan the data frame.
- Full Extent: Allows you to zoom to the full extent of your map.
- Fixed Zoom In: Allows you to zoom in on the center of your data frame.
- Fixed Zoom Out: Allows you to zoom out from the center of your data frame.
- Go Back To Previous Extent: Allows you to go back to the previous extent.
- Go To Next Extent: Allows you to go forward to the next extent.
- Select Features: Allows you to select features graphically, by clicking or dragging a box around them. You can also use the Select By Polygon, Lasso, Circle, and Line tools to select features using graphics drawn on the screen.
- Clear Selected Features: Unselects all the currently selected features in the active data frame.
- Select Elements: Allows you to select, resize, and move text, graphics, and other objects placed on the map.
- Identify: Identifies the geographic feature or place on which you click.
- Hyperlink: Triggers hyperlinks from features.
- HTML Popup: Triggers HTML pop-up windows from features.
- Measure: Measures distances and areas on your map.
- Find: Finds features in the map.
- Find Route: Allows you to calculate point-to-point routes and driving directions.
- Go To XY: Allows you to type an x,y location and navigate to it.
- Time Slider: Opens a time slider window for working with time-aware layers and tables.
- Create Viewer Window: Allows you to create a new viewer window by dragging a rectangle.
In addition, right-clicking in the data frame displays a context menu of data navigation tools.
Interactive panning and zooming using basemap layers
Smooth, continuous panning and zooming can be turned on and is especially productive when using basemap layers.
GIS is playing an increasing role in managing human activities. Emerging technology will expand this role enormously.
Today, our world is evolving rapidly, increasingly influenced by millions of independent human activities. As a result of these activities, our world is becoming more populated and urbanized; more technical and specialized; more connected, globalized, and informed; and, some people would suggest, more fragile.
These human activities are impacting our environment, earth's natural places, biodiversity, and the availability of natural resources that support civilization as we know it. Many people believe these activities are not sustainable and that, without more intelligent planning and management, they will negatively affect our future.
The evidence of human activity is everywhere. Scientific measurements suggest that the earth is getting warmer, thus changing climate, sea level, and potentially many aspects of our environment.
These trends suggest that we need to take responsibility for better managing our world.
The Growing Role of Geography and GIS
Geography is the science of our world. It involves the study of the earth and all the contents and processes that are evolving around us. GIS builds on geographic science by providing an information system for organizing, managing, and integrating complex scientific data and knowledge. GIS is also a framework for making this knowledge accessible to scientists, planners, decision makers, and the public.
GIS is particularly valuable for managing human activities. The tens of thousands of successful GIS applications provide evidence of this. GIS brings our geographic measurements together with powerful tools for visualization, analysis, and modeling. These technologies are increasingly being integrated into the planning and decision-making operational workflows of our organizations.
GIS and this geographic approach are providing a new opportunity for reorganizing our methods and approaches to consider all the factors. The implication of this is vast and will eventually affect the planning, design, engineering, and management activities of nearly every profession.
GIS Users Are Making a Difference
Our users' work benefits many areas. At the level of basic science, GIS users are improving understanding of how the planet works at all scales. At the application level, GIS users are analyzing complex situations, visualizing problems, and creating geographic plans and solutions; they are also increasing efficiency, reducing costs, and helping people make faster and better decisions that consider all the geographic factors necessary to create a sustainable future. And GIS users are improving the processes of communication and collaboration, helping to better coordinate work across organizations.
All the hundreds of thousands of individual GIS efforts are clearly making huge contributions toward better management of our planet. Nevertheless, there is much more to be done. Current technology is evolving in such a way that, for the first time, all our individual efforts can be connected and integrated to provide a basis for all of society to benefit from the power of GIS.
The Vision of a System of Systems: The GeoWeb
Deploying GIS on the Internet can lead to a distributed and multiparticipant GIS. While still in its early development, we are already using this as a platform to publish maps and share geographic data. But with appropriate Web services standards, it is also possible to link and connect GIS Web sites to integrate our individual datasets into new applications that model geography and support a multitude of applications.
This type of GIS will evolve into something I like to call the GeoWeb, a wide network of distributed GIS services that describe and model everything known about our planet: the sum of our geographic knowledge. This framework already supports map and data publishing, metadata cataloging, and the discovery of geospatial services. Over time, it will expand to support the dynamic connection of a whole host of distributed GIS services, including data management, modeling, GIS analysis, and advanced visualization. This can provide a platform for us to look at bigger problems that depend on cross-organization and cross-discipline collaboration.
The notion of a system of systems for geospatial information was first suggested by the National Academy of Sciences Mapping Science Committee and was referred to as the National Spatial Data Infrastructure. More recently, this architecture has been adopted by the National Oceanic and Atmospheric Administration and others as part of their architecture for the Global Earth Observation System of Systems (GEOSS). GEOSS will serve as a framework for integrating the large number of global remote-sensing systems into a loosely coupled network available to many participants. We are now seeing how this same architecture will serve as a framework for the GeoWeb.
The GeoWeb "system of systems" will be used for a whole range of applications and purposes, supporting regional, national, and even global applications. This new working environment will provide many benefits beyond what we are doing in individual systems and organizations; it will help us better manage our world.
This multiuser, distributed GeoWeb will require GIS services-oriented technology, leadership, sharing policies, and GIS professionals who have the knowledge and experience, as well as an interest in sharing their efforts as part of a larger network.
This GeoWeb will be constructed using standardized spatial data models, with GIS metadata portals for organization and discovery, and will be implemented by various collaborative agreements to share and contribute knowledge and information.
The GeoWeb will support a multitude of GIS services with participants organized loosely into various geospatial communities.
These communities will range from consumers who are interested in finding or navigating to a location or looking at maps, images, and visualizations to professionals focusing on sophisticated logistics, real-time situational awareness, geospatial modeling, decision support, and integration of GIS services into enterprise environments.
GIS knowledge will permeate virtually all our government and business enterprises, providing better geographic awareness of what is happening, improving decision making regarding all levels of human activity, and delivering many benefits.
Over time, these communities will expand, interoperate more, and become increasingly synergistic. This will be fueled by increasingly easy-to-use GIS technology. The GeoWeb will evolve rapidly, not driven by a few thousand, or even a few hundred thousand, but by millions of participants. We already see evidence of this in the popularity of consumer mapping and visualization Web sites from Google, MapQuest, and Microsoft.
We are moving into a geodata-rich society, with more geospatial information and more access to it. It has been suggested that within the next six years we will have a hundred times more satellite imagery available. GPS location data and the real-time monitoring of various geographic phenomena will be increasingly available in consumer, as well as professional, applications. These measurements will be served on the Web and available as GIS application building blocks in GIS portals designed for geospatial knowledge discovery. Together, all these patterns will cause an increase in spatial literacy among virtually all our social institutions and the citizens who support them.
The evolution of GIS is being supported by powerful enabling technology that includes faster (multicore) processors, inexpensive data storage, high-performance networks, standards for open/interoperable protocols (e.g., XML, SOAP), and more locationally aware wireless devices for accessing GeoWeb services. At the same time, GIS software is evolving new capabilities and services-oriented architectures that will make GIS use more productive.
Esri Software Strategy
The fundamental goal of Esri's software development is to help users solve their problems. We are doing this by investing in four fundamental software areas:
- Enhancing the core desktop GIS platform
- Strengthening and simplifying geodata management
- Extending the GIS server environment
- Providing more access to mobile GIS tools and GIS Web services
These individual strategies are being realized in an integrated network architecture that leverages the multiple ways our users deploy our tools (e.g., mobile, desktop, client/server, and networks).
ArcGIS 9.2: Responding to Users' Requests
The current Esri development focus is on the ArcGIS 9.2 release, which is targeted to be available in the first half of 2006. This release is large and responds to literally hundreds of user requests for additional functionality.
This release also focuses on making major improvements in user documentation, quality, and the usability of our software.
There are major enhancements in all four ArcGIS Desktop products (ArcReader, ArcView, ArcEditor, and ArcInfo) and their extensions. Improvements include areas such as data compilation, geoprocessing, data management, interoperability, cartography, charting, and animation.
Data Compilation and Editing
Data compilation/editing will be enhanced to include a full suite of COGO construction tools for supporting land records data entry inside ArcEditor and ArcInfo. There will also be many improvements in CAD integration, including better annotation support, native rendering, and improved tools for georeferencing. There will be improved raster-to-vector conversion tools for better feature recognition and extraction.
In addition, ArcGIS Survey Analyst will implement a complete workflow for cadastral data management.
ArcGIS 9.2 will extend and improve support for standards-based interoperability and will also add new data sources to the ArcGIS Data Interoperability extension, providing new direct read/conversion capabilities. This extension will support "transformations" of complex data from one set to another: not just the data format conversion but also the underlying schema reorganization necessary for semantic translation.
This Extract, Transform, and Load type of interoperability procedure is particularly important for interoperability between and among systems on the Internet. Examples of large users in the United States already doing this include the Bureau of Land Management, Department of Homeland Security, and the U.S. Geological Survey. In addition, hundreds of local governments, such as Sacramento County, California, and Salt Lake City, Utah, are using this extension to dynamically support their data transformation remapping for all types of datasets.
ArcGIS 9.2 will offer many new advancements in cartography, particularly for creating high-quality maps. Perhaps the largest of these is in the area of cartographic editing and finishing (i.e., being able to edit and persist map symbology in a geodatabase). The editing capability is similar to that supported with graphics packages often used for cartographic finishing. This means cartographers can use advanced rules-based technology for automatic, computer-generated mapping and interactive graphic tools to apply an artistic touch to the design and finishing of maps.
ArcGIS 9.2 will support the ability to store multiple graphic representations for a single feature in a geodatabase. This means that a single set of features can be used to support multiple types of map products at different scales, while at the same time maintaining a single set of features for "one-touch" editing.
ArcGIS 9.2 will also have a series of cartographic generalization tools that perform a variety of geometric operations on GIS features, including line simplification, polygon aggregation, and building simplification.
This year, ArcGIS will introduce a new sketching tool for geographic information. This tool will support many fields of geographic design, such as landscape architecture, city and regional planning, forestry, military planning, and other types of strategic designs in which sketch visualization is important.
The Historic 1988 Fires in Yellowstone National Park
Get a sense of how the fires progressed
- Turn on the Historic Yellowstone Fire layer by checking the box to the left of its name.
- If they are on, turn off the Yellowstone Facilities, Yellowstone Town, Natl Wildlife Refuge, Teton Natl Park, Yellowstone National Park, and National Forest layers.
- To investigate the dates of the fire, open the attribute table for the Historic Yellowstone Fire layer and sort the DATE field in ascending and descending order. NOTE: The date format is year-month-day. Search for the names of the first and last fires of 1988. The unburned areas have the date "1988".
- Right-click on the Historic Yellowstone Fire layer in the Table of Contents and select Open Attribute Table.
- Scroll across the attribute table to find the DATE field in the last column.
- Right-click on the DATE field heading and select Sort Ascending to find the first fire of 1988. Then switch to Sort Descending to find the last fire of 1988.
First fire: Fan Fire, June 30, 1988. Last fire: Clover-Mist, October 10, 1988.
Create and execute a Query before finding the statistics on the number of acres burned by the major fires
- First, set up a query to locate the North Fork Fire. Then open the Historic Yellowstone Fires attribute table. Finally, right-click the ACRES field to get the statistics on the North Fork Fire. Repeat this process for the other large fires, including Clover-Mist, Mink, Storm Creek, and Hellroaring.
- Click the Selection > Select By Attributes menu option to open the Select By Attributes window.
- A new window opens. Move it to where you can see both the Select by Attributes window and the map.
- In the Select by Attributes window, double click on FIRENAME, then click once on the equals sign. Click on Get Unique Values and then double click on the words North Fork. (FIRENAME = 'North Fork'). Click Apply and OK.
- To find out how many acres were burned by the North Fork Fire, right click the Historic Yellowstone Fires layer and click on Open Attribute Table. Right click on the ACRES field header and click on Statistics in the context menu.
- A new window opens. Look at the Sum field for the total acres burned in the North Fork Fire. 827 records are selected on the map, as displayed in the Count field of the Statistics window.
- North Fork - 531,225.451 acres
- Clover-Mist - 360,055.750 acres
- Mink - 144,687.751 acres
- Storm Creek - 143,650.534 acres
- Hellroaring - 101,974.311 acres
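The Select By Attributes + Statistics workflow above can be mirrored in plain Python. The records below are a tiny illustrative subset, not the real attribute table:

```python
# Sketch of "Select By Attributes, then field Statistics": filter rows
# where FIRENAME = 'North Fork', then compute Count and Sum on ACRES.
# These four records are made up for illustration.
records = [
    {"FIRENAME": "North Fork", "ACRES": 300000.0},
    {"FIRENAME": "North Fork", "ACRES": 231225.451},
    {"FIRENAME": "Clover-Mist", "ACRES": 360055.75},
    {"FIRENAME": "Mink", "ACRES": 144687.751},
]

# FIRENAME = 'North Fork'
selected = [r for r in records if r["FIRENAME"] == "North Fork"]

count = len(selected)                       # the Count in the Statistics window
total = sum(r["ACRES"] for r in selected)   # the Sum in the Statistics window
print(count, round(total, 3))  # 2 531225.451
```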
A Historical Geographic Information System (HGIS) of Nubia Based on the William J. Bankes Archive (1815-1822)
The William J. Bankes Archive, Dorchester, is an impressive collection of original material concerning the archaeological, anthropological and natural heritage of Nubia and was amassed in the years 1815-1822. In the last two hundred years, many geo-human factors caused radical changes in the region. In a landscape almost untouched for centuries, the signs of the interactions between the ancient human communities and the natural environment were much clearer in Bankes’ times than now. Digital humanities offer powerful tools to manage and visualize large amounts of data and GIS in particular is an effective form of relational database, where all items of data have a position on the earth. This paper presents the methodology and the preliminary results of a research project that aims at a draft reconstruction of ancient Nubia based on the Bankes Archive. Archaeological, historical, natural history and ethnographic information extracted from the documents will be georeferenced in the GIS. Original maps, landscape views and epigraphic copies will also be made available on-line.
Geoprocessing service configurations
Geoprocessing services can be created by publishing two different ArcGIS Desktop resources: a geoprocessing toolbox or an ArcMap document (.mxd) containing tool layers.
- When you publish a toolbox, all tools within the toolbox become geoprocessing tasks within the geoprocessing service.
- When you publish an ArcMap document, all tool layers within the map document become geoprocessing tasks within the geoprocessing service. (Tool layers are created by dragging and dropping tools into the ArcMap table of contents.)
- When publishing an ArcMap document containing tool layers, you can also specify that you want the ArcMap document to become a map service that will be used to draw the output of tasks. A map service that draws task outputs is called a result map service.
These three configurations are illustrated below.
Geoprocessing service from a toolbox
When you publish a toolbox, all tools within the toolbox become geoprocessing tasks. Data output by tasks is transported back to the client.
Geoprocessing services with a source map document
If you have used geoprocessing tools in an ArcMap session, you know that tools can often use layers found in the ArcMap table of contents, as well as data on disk.
In the same way, your geoprocessing task can use layers found in its source map document. The source map document, in this case, acts as a container of layers. You can make layers in the source map document input parameters to your task. In the graphic below, the Data to extract variable is an input parameter that allows the user to choose layers in the source map document.
Geoprocessing tasks can only access layers found in their source map document—they cannot access layers found in other map services or in the client application.
There are performance benefits to using layers from a source map document in your model or script processes. The illustration below shows a model that uses a network dataset, StreetsNetwork, to construct a route analysis layer. The StreetsNetwork variable can either reference a layer (which it does in this case) or a dataset on disk. Opening a network dataset is expensive relative to other kinds of datasets because network datasets contain several advanced data structures and tables that must be read and cached. By using the layer instead of the dataset, there is a performance advantage: ArcMap opens the dataset once, caches basic properties of the dataset, and keeps the dataset open. When the model executes, the dataset does not have to be reopened, since the source map document already has it opened—a performance boost. Conversely, if the StreetsNetwork variable directly referenced the dataset, the dataset would be opened each time the model executes—a performance degradation.
For network analysis, you always want the network dataset as a layer in the source map document and to use that layer in model variables. For other kinds of datasets, such as features and rasters, the performance advantage of using layers in the source map document is very slight.
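The open-once-and-cache behavior described above can be sketched as follows. Here, open_network_dataset and the Layer class are hypothetical stand-ins, not the ArcGIS object model:

```python
# Sketch of why referencing a layer beats reopening the dataset on every
# model run: the layer caches the opened dataset, so the expensive open
# happens exactly once across repeated executions.
open_count = 0

def open_network_dataset(path):
    """Stand-in for an expensive open (reads indexes, builds caches)."""
    global open_count
    open_count += 1
    return {"path": path, "edges": []}

class Layer:
    """Mimics a source-map-document layer: opens lazily, then reuses."""
    def __init__(self, path):
        self.path = path
        self._dataset = None

    def dataset(self):
        if self._dataset is None:                     # open only on first use
            self._dataset = open_network_dataset(self.path)
        return self._dataset

streets = Layer("StreetsNetwork")
for _ in range(5):             # five model executions
    ds = streets.dataset()     # opened once, reused four times
print(open_count)  # 1
```

Referencing the dataset path directly would instead call open_network_dataset on every execution, which is the degradation the documentation warns about.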
Geoprocessing services with a result map service
Geoprocessing services can have a result map service to create a digital map image of task results. Digital maps contain visual representations of geographic datasets that communicate information. Digital maps are transported across the Web as images (such as a .jpg). A map image, byte for byte, contains far more human-interpretable information than raw features in a feature class. Map images are also manageable—they are easily compressed, they can be tiled into manageable chunks, and there are established methods for transporting and viewing them across the Web.
Map images are created by an ArcGIS Server map service and are the result of publishing an ArcMap document (.mxd). Because of the characteristics of a map image, you may want to create one for the results of your geoprocessing task and transport the image across the Web rather than transporting the result dataset or datasets. Geoprocessing services can have a result map service used by ArcGIS Server to create map images of your output data.
Result map services should be used when:
- The result of your task is a (potentially) large dataset.
- The data type of your output is unsupported by the client, such as rasters in ArcGIS Explorer. In this case, you use the result map service to display the output.
- You want to protect the result of your task by allowing it to only be viewed as a map and not downloaded as a dataset.
- You have complex cartography that needs to be drawn by the result map service and not by the client.
When you use result map services, it is important to realize that there are two services—the geoprocessing service and the result map service. These two services execute independently of each other. When the task executes, ArcGIS Server executes the geoprocessing task first, then executes the result map service to draw the output of the geoprocessing service. Because of this execution order, the result map service needs datasets on disk produced by the geoprocessing service. This means that the output of the tasks in the geoprocessing service must be datasets on disk, not layers or in-memory datasets.
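The two-stage execution order described above can be sketched as a tiny pipeline. All function names here are hypothetical stand-ins, not the ArcGIS Server API:

```python
import os
import tempfile

# Sketch of the execution order: the geoprocessing task runs first and
# must write its output dataset to disk, because the result map service
# runs afterward and can only draw datasets it finds on disk.
def run_gp_task(workspace):
    """Stage 1: the geoprocessing task writes its result dataset to disk."""
    out_path = os.path.join(workspace, "task_output.txt")
    with open(out_path, "w") as f:
        f.write("buffered features")
    return out_path

def run_result_map_service(dataset_path):
    """Stage 2: the result map service reads the on-disk dataset to draw it."""
    if not os.path.exists(dataset_path):
        raise FileNotFoundError("the map service needs the dataset on disk")
    with open(dataset_path) as f:
        return f"map image of: {f.read()}"

with tempfile.TemporaryDirectory() as ws:
    output = run_gp_task(ws)                # geoprocessing service first...
    image = run_result_map_service(output)  # ...then the result map service
print(image)  # map image of: buffered features
```

An in-memory or layer output would fail at stage 2, which is why task outputs must be datasets on disk.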
As new information and communication technologies have altered so many aspects of our daily lives over the past decades, they have simultaneously stimulated a shift in the types of data that we collect, produce, and analyze. This changing data landscape is often referred to as "big data." Big data is distinguished from "small data" not only by its high volume but also by the velocity, variety, exhaustivity, resolution, relationality, and flexibility of the datasets. This entry discusses the visualization of big spatial datasets. As many such datasets contain geographic attributes or are situated and produced within geographic space, cartography takes on a pivotal role in big data visualization. Visualization of big data is frequently and effectively used to communicate and present information, but it is in making sense of big data – generating new insights and knowledge – that visualization is becoming an indispensable tool, making cartography vital to understanding geographic big data. Although visualization of big data presents several challenges, human experts can use visualization in general, and cartography in particular, aided by interfaces and software designed for this purpose, to effectively explore and analyze big data.
Poorthuis, A. (2018). Big Data Visualization. The Geographic Information Science & Technology Body of Knowledge (3rd Quarter 2018 Edition), John P. Wilson (Ed.). DOI: 10.22224/gistbok/2018.3.5.
This entry was first published on September 16, 2018.
This Topic is also available in the following editions: DiBiase, D., DeMers, M., Johnson, A., Kemp, K., Luck, A. T., Plewe, B., and Wentz, E. (2006). Computational issues in cartography and visualization. The Geographic Information Science & Technology Body of Knowledge. Washington, DC: Association of American Geographers. (2nd Quarter 2016, first digital).
Big data: Datasets that are characterized not only by their high volume but also by their velocity, variety, exhaustivity, resolution, relationality, and flexibility.
Volume: The amount of data necessary to be considered ‘big’ data. Typically, the volume of big data is measured in terabytes or petabytes, or in millions to billions of observations.
Velocity: The frequency with which a dataset is updated. Typically, big data is produced or updated in real-time or at a fine temporal granularity.
Variety: The diversity of data points available within and between data sets. Big data typically consists of a wide range of structured and unstructured datasets from different sources and provenances.
Exhaustivity: A term that describes the scope of big data. For big data, the data set is typically as wide as possible, focused on entire populations rather than samples.
Resolution: The granularity and detail in big data. Big data is typically as detailed as possible, including being indexical in identifying the objects under study.
Relationality: The extent to which different datasets can be joined together based on common attributes. One of the defining characteristics of big data is its ability to be connected to other datasets.
Flexibility: The ability of a dataset to be easily extended (with additional attributes) and expanded (by adding additional observations).
Data Reduction: A strategy used to reduce the amount of data or summarize relevant parts of a dataset.
Filtering: The subsetting of a dataset based on attributes of the data.
Subsampling: The subsetting of a dataset based on stochastic sampling.
Aggregation: The combination of multiple data points into a higher-level aggregation.
Projection: A data reduction strategy that ‘maps’ data points to either a smaller number of dimensions or narrower data range.
2.1 What is Big Data?
New information and communication technologies have altered many aspects of our daily lives over the past decades, and simultaneously stimulated a palpable shift in the types of data that companies, governments, scientists, and individuals are able to collect, produce, and analyze. These emerging datasets are often referred to as big data. The term ‘big data’ was coined in the 1990s (Diebold, 2012). While the exact definition of big data remains somewhat fluid, there have been several efforts to define its core characteristics. One of the most commonly used definitions is based on the “three V’s” (Laney, 2001):
- Volume. Big data is massive and is often measured in terabytes or petabytes, or consists of millions or billions of observations.
- Velocity. Big data is produced or updated in real-time or at a fine temporal granularity.
- Variety. Big data consists of a wide range of structured and unstructured datasets from different sources and provenances.
Although the 3V definition is succinct, new and alternative definitions of the concept have also been developed that help to further distinguish big data from “small data.” A useful synthesis of these definitions adds four additional dimensions to the 3V definition (see Kitchin, 2013; Kitchin, 2014; and Kitchin & McArdle, 2016 for an extensive review):
- Exhaustivity. The scope of big data is as wide as possible, focused on entire populations rather than samples.
- Resolution. Big data is as detailed as possible, including being indexical in identifying the objects under study.
- Relationality. Big data can be connected easily. Different datasets can be joined together based on common attributes.
- Flexibility. Big data can be easily extended (with additional attributes) and expanded (by adding additional observations).
A data source or dataset does not need to exhibit all seven characteristics to be considered big data, and there is no exact threshold that differentiates small and big data. Instead, it is an accepted notion that there exists a gray transition zone between the two. Further, multiple different forms or ‘species’ of big data may exist at the same time (Kitchin & McArdle, 2016). However, regardless of the semantics, it is clear that many of the datasets that are produced, analyzed, and visualized in the 21st century differ significantly from their 20th century counterparts, prompting a re-evaluation of the role of cartography and visualization in this process.
1.2 The Relevance of Big Data for GIS&T
A large portion of big data is geographic in nature and, as such, big data has had a large impact on the geographic disciplines. Spatial big data ranges from mobile phone and traffic data to social media platforms (see Social Media Analytics) and credit card transactions, to air quality sensors and satellite imagery – each of which provides not only a data point, but a geographic location associated with that data point. All of these datasets can potentially help us better understand the world around us (see Citizen Science with GIS&T) and thus have seen an uptake in spatial research (Arribas-Bel, 2014; Goodchild, 2007; Graham & Shelton, 2013). The increasing prevalence of these types of datasets has spurred an entirely new discipline of Data Science, and some people working in GIS and related fields have started to relabel themselves as "spatial data scientists," as can be seen in the new Center for Spatial Data Science at the University of Chicago and the Geographic Data Science Lab at the University of Liverpool.
More importantly, big data might change how we approach spatial analysis and visualization. While we now have access to unparalleled, large quantities of heterogeneous data about the world around us, it remains a formidable challenge to understand and interact with this data in meaningful ways. As a result, new approaches have been developed to help automate many aspects of data analysis, such as automated machine learning approaches, artificial intelligence and other "unsupervised" computational methods (see Artificial Intelligence). While these automated approaches can be useful additions to our toolbox, the human role in spatial data analysis and visualization remains essential. As Shneiderman (2014) argues, while computer-led data analysis might be effective for well understood topics, the creation of new knowledge and breakthroughs requires human experts who can use and understand visualizations to gain new insights. Visualization is an indispensable tool to make sense of big data, which makes cartography vital to understanding geographic big data.
In the domain of big data visualization, we can distinguish roughly two types of visualizations: those that aid in visual thinking and those meant for visual communication (DiBiase, 1990) (see Cartography & Science and Geovisualization for a more in-depth discussion). Visual communication is best done with a "Map-to-See," a straightforward cartographic representation to be understood in the blink of an eye (Kraak, 1988). On the other hand, visual thinking is often done through more complex cartographic products that may take a while to be fully understood: a "Map-to-Read."
In the context of big data, visual communication has been employed by companies, news desks, and scientists (see Narrative & Storytelling, forthcoming) to communicate findings, present narratives, or sometimes simply to impress on the reader the complexity or size of the underlying dataset. A clear example of the latter is the so-called "hairball" visualizations in which complex, large networks are visualized with an equally complex ball of lines (Krzywinski, Birol, Jones, & Marra, 2012). Within cartography, an analogous example is projecting a big dataset consisting of spatial points directly onto a map, resulting in a complex representation with millions of dots. Although many big data sets are indeed visualized to present and communicate – often in beautiful and compelling ways – ultimately the use of big data within this map use mode is not significantly different from that of small or more conventional data sets.
In the "visual thinking" mode, visualization is inextricably linked with big data for the purposes of exploration and analysis, and specifically to make sense of big data and generate new (scientific) knowledge (Fox & Hendler, 2011). Although it comes with its own set of challenges (see next section), visualization allows researchers to explore, analyze, and synthesize datasets that are too large, complex, and heterogeneous to understand by merely looking at the raw data. Visualization as such is an indispensable tool in this process and an important driving force in complex analyses of big data (see Geovisual Analytics).
Figure 1: Examples of 'hairball'-type visualizations. From left to right: an example of a namesake network visualization; a map of global passenger air routes (Josullivan.58 / CC-BY-3.0, https://commons.wikimedia.org/wiki/File:World_airline_routes.png); and a map displaying over 6 billion tweets, showcasing Mapbox's mapping platform (Eric Fisher / CC-BY-2.0, https://www.flickr.com/photos/walkingsf/15869589271/in/photostream/).
The most obvious set of challenges with big data visualization are computational in nature. In its simplest form, it can be a challenge for conventional CPU-based mapping software to draw increasingly large amounts of data points (see Graphics Processing Units). Large datasets can also complicate even basic functions, such as data storage. For example, the file size of a standard shapefile in a Geographic Information System is limited to 2GB (or roughly 70 million point features) and 255 attributes, and each field is limited to 254 characters. Many big datasets exceed these limits, which warrants new file formats. In addition, the unstructured nature of many big data sets does not necessarily fit in the structured rigidity of conventional relational databases. New database ontologies (such as document-oriented and other NoSQL formats) have been developed to address these issues.
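The feature limit mentioned above can be checked with back-of-the-envelope arithmetic. The sketch below follows the published per-record layout for point features in the .shp file (record header plus shape type plus two coordinate doubles); treat the result as an estimate, since attributes live in the separate .dbf file:

```python
# Rough point-feature capacity of a .shp file under the 2 GB limit.
# Each point record: 8-byte record header + 4-byte shape type
# + two 8-byte doubles for X and Y = 28 bytes.
RECORD_BYTES = 8 + 4 + 8 + 8
FILE_LIMIT = 2 ** 31       # 2 GB, from the format's signed 32-bit offsets
HEADER_BYTES = 100         # fixed main-file header

max_points = (FILE_LIMIT - HEADER_BYTES) // RECORD_BYTES
print(f"~{max_points / 1e6:.0f} million point features")  # prints: ~77 million point features
```

The exact cut-off depends on implementation details, but the order of magnitude matches the "roughly 70 million" figure, and makes concrete why many big datasets simply cannot be stored in this format.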
Another set of challenges with the visualization of big data lies within the domain of visualization itself. It should be noted here that these issues are not inherently unique to big data. Rather, big data significantly amplifies many pre-existing challenges in cartography and forces us to acknowledge and address them explicitly. The most obvious of these challenges is related to the size of the data. Simply visualizing or plotting such a large number of data points might create confusing visualizations that yield no insights (cf. the hairball visualization discussed above) or visualizations that hide or obscure data, often referred to as overplotting (see (Dang, Wilkinson, & Anand, 2010) for a discussion).
Many spatial big datasets contain precise geographic coordinates for each observation, which poses another, paradoxical challenge: the ease with which these coordinates can be plotted as points on a map may lure us into a potentially narrow or constraining visualization of big data (Crampton et al., 2013). On the flip side, some big data contains less precise, but still spatial, references to vernacular place names, neighborhoods, and spatial regions that might not be easily mapped to the discrete geometry of a polygon.
Of course, the "richness" or heterogeneity of such data presents additional questions. For example, how can the qualitative textual data of social media be effectively visualized? This is particularly the case for datasets that have real-time or frequent temporal updates, meaning that the dataset may constantly be in a state of flux. Finally, the unstructured nature of big data also means that observations might be inaccurate or less precise. In other words, potential uncertainty within the data might need to be accounted for in the visualization as well (see Representing Uncertainty).
4.3 Representation, Ethics, and Privacy
Apart from technical challenges, it is important to be cognizant of a series of ethical challenges for big data visualization. While ethics form an important part of the entire domain of GIS&T (see Professional & Practical Ethics of GIS&T and Cartography & Power), big data may enlarge or amend those ethical issues. A particularly notable example is the privacy of those whose data are mapped and visualized. Conventional datasets typically aggregate social data to census tracts or other administrative geographies, while many big datasets provide precise coordinate pairs, oftentimes at the level of the individual. Visualizing such data with the same precision may do harm to people. Conversely, coordinate pairs might also be spoofed or altered deliberately, potentially placing people in locations which they have never visited (Zhao & Sui, 2017). There are many additional issues surrounding the visualization of big data (e.g., representation, consent, bias) and it is an essential part of any project to be cognizant of these (see boyd & Crawford, 2012; Zook et al., 2017; Zwitter, 2014 for an overview).
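One common safeguard when mapping individual-level coordinates is to generalize them before visualization. The sketch below is a minimal illustration of that idea; the sample point and precision threshold are hypothetical, and real projects would pair this with further anonymization measures:

```python
# Snap precise coordinates to a coarser grid before visualization, so
# individual observations cannot be pinpointed on the map.
# NOTE: an illustrative sketch only, not a complete anonymization scheme.
def generalize(lat: float, lon: float, decimals: int = 2):
    """Round coordinates; 2 decimals is roughly 1 km at the equator."""
    return round(lat, decimals), round(lon, decimals)

precise = (40.712776, -74.005974)   # a hypothetical individual-level point
coarse = generalize(*precise)       # -> (40.71, -74.01)
```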
5.1 Data Reduction
To address some of the challenges above, one important approach to big data visualization is to ‘make big data small’ (Poorthuis & Zook, 2017; Poorthuis, Zook, Shelton, Graham, & Stephens, 2015), an approach that falls within the domain of data reduction or summarization. Visualizations of complex, large datasets do not have to be complex or large themselves. Sarikaya (2017) distinguishes four specific reduction strategies, which are not dissimilar from strategies employed in cartographic generalization (see Scale & Generalization):
- Filtering. Subsetting a dataset based on attributes of the data. For example, only including records relevant to the process under study.
- Subsampling. Subsetting a dataset based on stochastic sampling, for example by taking a random sample when it is unnecessary to visualize the entire dataset.
- Aggregation. Combining multiple data points into a higher-level aggregation. This can be done with a bottom-up approach, by clustering proximate or similar points (see Classification & Clustering, forthcoming), or top-down, by aggregating individual points to a higher spatial unit (e.g., administrative region) (see Aggregation of Spatial Entities).
- Projection. Unstructured or high-dimensional big data can be simplified by ‘mapping’ data points to either a smaller number of dimensions or a narrower data range. In its simplest form, this can be done manually, but larger datasets require the use of automated techniques that range from Principal Components Analysis (see Analyzing Multidimensional Attributes, forthcoming) to newer machine learning techniques (see Machine Learning Programming for GIS, forthcoming).
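The first three strategies above can be sketched in a few lines. The (lon, lat, tag) records and the "flood" tag below are hypothetical stand-ins for a big point dataset, and a real workflow would use proper spatial indexing rather than this naive 10-degree grid:

```python
import random
from collections import Counter

random.seed(42)
# Hypothetical (lon, lat, tag) records standing in for a big point dataset.
records = [(random.uniform(-180, 180), random.uniform(-90, 90),
            random.choice(["flood", "other"])) for _ in range(100_000)]

# Filtering: keep only the records relevant to the process under study.
flood = [r for r in records if r[2] == "flood"]

# Subsampling: a stochastic sample when plotting everything is unnecessary.
sample = random.sample(records, k=1_000)

# Aggregation (top-down): count points per 10-degree grid cell,
# which can then be mapped as a choropleth or gridded density surface.
cells = Counter((int(lon // 10), int(lat // 10)) for lon, lat, _ in records)
```

Each strategy trades detail for legibility: the 100,000 raw points become a filtered subset, a plottable sample, or a few hundred aggregate cells.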
Figure 2: Big data can be made "small" through the use of data reduction strategies that yield summary visualizations. Figure reproduced with permission from Sarikaya (2017).
5.2 Visual Strategies
Data reduction strategies make big data small in order to use relatively conventional, straightforward cartographic techniques. However, depending on the nature of the data and the purpose of the visualization, this is not always an option. Explicitly incorporating big data in cartography, without the simplification from data reduction, is still at the cutting edge of the field, full of new challenges and opportunities (see (Robinson et al., 2017) for an overview). A canon of techniques has yet to crystalize but several examples of strategies can be identified (see Table 1).
Table 1: Examples of visual strategies for big data. Image credits: Kumar, Morstatter, & Liu (2014), reproduced with permission; Sophie Engle / GPL-3.0 (cf. Holten & Van Wijk, 2009); Kraak and Kveladze (2017), CC-BY-4.0; Nost, Rosenfeld, Vincent, Moore, & Roth (2017), CC-BY-NC-ND-4.0; Dheeraj Savala / MIT license.
The process of big data visualization relies heavily on computationally intensive procedures, which requires us to work in close concert with our computers. To facilitate this process, big data visualization is often done in an exploratory, interactive fashion with interfaces and software that enable the user to quickly perform a series of exploratory analyses through the visualization of different aspects of a dataset (see Exploratory Spatial Data Analysis (forthcoming) and UI/UX Design). These interfaces can be custom-made for a specific project or tailored for use with big data. An example of such a project is imMens, a browser-based system that allows users to explore millions of multivariate data points in an interactive, real-time environment (Liu, Jiang, & Heer, 2013). To enable this, the system pre-computes visualizations in a way similar to webmap tilesets (see Web Mapping) and it performs calculations in parallel to make sure the computer can ‘keep up’ with the user (see Parallel Programming and GIS Applications, forthcoming). More conventional, off-the-shelf software has also been adapted to enable big data visualization. For example, ArcGIS is now using both GPU rendering and parallel processing, and popular data science languages (e.g., Python and R) provide authoring environments for interactive visualization and the tight coupling of analysis and visualization (see Jupyter Notebooks, forthcoming).
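The tileset idea mentioned above rests on keying every pre-computed summary by a (zoom, x, y) triple, so that only the tiles currently in view need to be fetched. A sketch of the standard web-map ("slippy map") Web Mercator indexing formula that such systems build on (this is the generic scheme, not imMens' specific implementation):

```python
import math

def lonlat_to_tile(lon: float, lat: float, zoom: int):
    """Map a WGS84 coordinate to its Web Mercator tile index at a zoom level."""
    n = 2 ** zoom                                   # tiles per axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)              # tile column
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return zoom, x, y                               # key for the pre-computed tile

lonlat_to_tile(0.0, 0.0, 2)   # -> (2, 2, 2)
```

Because the key is cheap to compute, a viewer can look up pre-aggregated summaries at interactive rates instead of re-scanning millions of raw points on every pan or zoom.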
Figure 3: An example of an exploratory, interactive software interface visualizing big data (Chen et al., 2016). It allows the discovery of movement patterns in social media data through both data reduction (e.g., filtering) and visualization strategies (e.g., multiple linked views). Reproduced with permission (http://vis.pku.edu.cn/trajectoryvis/en/weibogeo.html).
It is clear that the smooth interaction between user and computer is crucial to gain insight from big data visualization. Therefore, approaches to big data visualization should not exclusively focus on performance, the computational aspects of processing data or specific visual challenges, but also on effective interface and experience design (see UI/UX Design and Usability Engineering & Evaluation). In this way, big data visualization necessarily combines the backend (computation) and frontend (visualization) of cartography in a tight coupling, in which both human and computer work together to create new insights from data.
Arribas-Bel, D. (2014). Accidental, open and everywhere: Emerging data sources for the understanding of cities. Applied Geography, 49, 45–53. DOI: 10.1016/j.apgeog.2013.09.012
Shneiderman, B. (2014). The Big Picture for Big Data: Visualization. Science, 343(6172), 730–730. DOI: 10.1126/science.343.6172.730-a
Boyd, D., & Crawford, K. (2012). Critical Questions for Big Data. Information, Communication & Society, 15(5), 662–679. DOI: 10.1080/1369118X.2012.678878
Chen, S., Yuan, X., Wang, Z., Guo, C., Liang, J., Wang, Z., et al. (2016). Interactive Visual Discovering of Movement Patterns from Sparsely Sampled Geo-tagged Social Media Data. IEEE Transactions on Visualization and Computer Graphics, 22(1), 270–279. DOI: 10.1109/TVCG.2015.2467619
Crampton, J. W., Graham, M., Poorthuis, A., Shelton, T., Stephens, M., Wilson, M. W., & Zook, M. A. (2013). Beyond the geotag: situating “big data” and leveraging the potential of the geoweb. Cartography and Geographic Information Science, 40(2), 130–139. DOI: 10.1080/15230406.2013.777137
Dang, T. N., Wilkinson, L., & Anand, A. (2010). Stacking Graphic Elements to Avoid Over-Plotting. IEEE Transactions on Visualization and Computer Graphics, 16(6), 1044–1052. DOI: 10.1109/TVCG.2010.197
DiBiase, D. (1990). Visualization in the earth sciences. Earth and Mineral Sciences, 59(2), 13–18.
Diebold, F. X. (2012). A Personal Perspective on the Origin(s) and Development of “Big Data”: The Phenomenon, the Term, and the Discipline, Second Version. SSRN Electronic Journal. DOI: 10.2139/ssrn.2202843
Fox, P., & Hendler, J. (2011). Changing the Equation on Scientific Data Visualization. Science, 331(6018), 705–708. DOI: 10.1126/science.1197654
Goodchild, M. F. (2007). Citizens as sensors: the world of volunteered geography. GeoJournal, 69(4), 211–221. DOI: 10.1007/s10708-007-9111-y
Graham, M., & Shelton, T. (2013). Geography and the future of big data, big data and the future of geography. Dialogues in Human Geography, 3(3), 255–261. DOI: 10.1177/2043820613513121
Holten, D., & Van Wijk, J. J. (2009). Force‐Directed Edge Bundling for Graph Visualization. Computer Graphics Forum, 28(3), 983–990. DOI: 10.1111/j.1467-8659.2009.01450.x
Kitchin, R. M. (2013). Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography, 3(3), 262–267. DOI: 10.1177/2043820613513388
Kitchin, R. M. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1), 1–12. DOI: 10.1177/2053951714528481
Kitchin, R. M., & McArdle, G. (2016). What makes Big Data, Big Data? Exploring the ontological characteristics of 26 datasets. Big Data & Society, 3(1), 205395171663113. DOI: 10.1177/2053951716631130
Kraak, M.-J. (1988). Computer-assisted cartographical three-dimensional imaging techniques (Doctoral Dissertation). Delft University Press, Delft.
Krzywinski, M., Birol, I., Jones, S. J., & Marra, M. A. (2012). Hive plots—rational approach to visualizing networks. Briefings in Bioinformatics, 13(5), 627–644. DOI: 10.1093/bib/bbr069
Kumar, S., Morstatter, F., & Liu, H. (2014). Twitter Data Analytics. New York, NY: Springer New York. DOI: 10.1007/978-1-4614-9372-3
Kraak, M. J., & Kveladze, I. (2017). Narrative of the annotated Space–Time Cube–revisiting a historical event. Journal of maps, 13(1), 56-61. DOI: 10.1080/17445647.2017.1323034
Laney, D. (2001). 3D Data Management: Controlling Data Volume, Velocity, and Variety. Retrieved August 29, 2015, from http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Manageme.
Liu, Z., Jiang, B., & Heer, J. (2013). imMens: Real‐time Visual Querying of Big Data. Computer Graphics Forum, 32(3), 421–430. DOI: 10.1111/cgf.12129
Nost, E., Rosenfeld, H., Vincent, K., Moore, S. A., & Roth, R. E. (2017). HazMatMapper: an online and interactive geographic visualization tool for exploring transnational flows of hazardous waste and environmental justice. Journal of Maps, 13(1), 14–23. DOI: 10.1080/17445647.2017.1282384
Poorthuis, A., & Zook, M. A. (2017). Making Big Data Small: Strategies to Expand Urban and Geographical Research Using Social Media. Journal of Urban Technology, 36, 1–21. DOI: 10.1080/10630732.2017.1335153
Poorthuis, A., Zook, M. A., Shelton, T., Graham, M., & Stephens, M. (2015). Using Geotagged Digital Social Data in Geographic Research. In N. Clifford, S. French, M. Cope, & S. Gillespie (Eds.), Key Methods in Geography (3rd ed.).
Robinson, A. C., Demšar, U., Moore, A. B., Buckley, A., Jiang, B., Field, K., et al. (2017). Geospatial big data and cartography: research challenges and opportunities for making maps that matter. International Journal of Cartography, 18(5), 1–29. DOI: 10.1080/23729333.2016.1278151
Sarikaya, A. T. (2017). Targeting Designs of Scalable, Exploratory Summary Visualizations (Doctoral Dissertation). The University of Wisconsin - Madison, Madison, WI.
Zhao, B., & Sui, D. Z. (2017). True lies in geospatial big data: detecting location spoofing in social media. Annals of GIS, 23(1), 1–14. DOI: 10.1080/19475683.2017.1280536
Zook, M. A., Barocas, S., boyd, D., Crawford, K., Keller, E., Gangadharan, S. P., et al. (2017). Ten simple rules for responsible big data research. PLoS Computational Biology, 13(3), e1005399. DOI: 10.1371/journal.pcbi.1005399
Zwitter, A. (2014). Big Data ethics. Big Data & Society, 1(2), 1–6. DOI: 10.1177/2053951714559253
The Screen Actors Guild sets certain required pay scales for actors. In some cases actors will agree to pay adjustments or a lack of credit as a favor to a friend directing the film (Robin Williams in Baron Munchausen) or just to have the chance to cameo in the film for fun. In some cases an actor won't be billed so as not to spoil a surprise. Lately the Marvel films (Avengers, Iron Man, etc.) have been not billing cameo actors for this reason.
Lastly, in some cases it's a gross oversight. James Earl Jones had to wait decades for the digital reissue of Star Wars to get screen credit as the voice of Darth Vader.
I haven't seen Mystic River, so I don't know how big Eli Wallach's role is and I don't recognize Olivia Williams enough to look for her in X-Men: First Class, so I can't speak to these two specifically but, on the topic of Cameos:
The Wiki article for "Cameo appearance" pretty much does all of the answering.
These roles are generally small, many of them non-speaking ones, and are commonly either appearances in a work in which they hold some special significance (such as actors from an original movie appearing in its remake), or renowned people making uncredited appearances.
And why aren't they credited?
Cameos are generally not credited because of their brevity, or a perceived mismatch between the celebrity's stature and the film or TV show in which he or she is appearing. Many are publicity stunts.
Some performers opt not to take credit for a project. They will do this either for professional reasons (they don't want to be associated with a really bad film) or because the process of working on the film was unpleasant. For example, Don Cheadle in Ocean's Eleven refused to be listed in the credits:
KW: I’ve noticed that you sometimes appear uncredited in movies, like in Ocean’s 11 [sic] and Rush Hour 2. Why is that?
DC: For different reasons. I did Rush Hour 2 just as kind of a laugh, so I didn’t really need a credit. To me, it was fine if people recognized me. And if they didn’t, that was fine, too. With Ocean’s, there was some stuff that happened behind the scenes that I didn’t like how it went down, so I just said, “Take my name off it.”
The situation must have gotten better, though, as he came back for two more films, in which he was credited. With Rush Hour 2, I'm guessing that fits more into the cameo-type appearance.
There's another possible explanation, as hinted at above with non-speaking roles. If a performer doesn't say anything on screen, they can be considered an extra, and extras never get credited. Even when cast in a speaking role, if all of a performer's lines are cut from the final edit, they may not get credit.
Extras get paid very little in films, even if a film is shot in an area where SAG has extras jurisdiction (SAG extras in theatrical projects currently make $157/day for 8 hours). If there's no SAG jurisdiction, they usually get paid minimum wage.
Speaking roles in SAG films get paid on a daily or weekly rate depending on the number of shooting days for their role. Bigger name performers can earn double or triple scale or, if they're big enough, can set their own salary. Sometimes they will work out a no-quote rate with a lower budget project they just want to work on, which means they're taking less than usual, but it's in their contract that production cannot divulge what the performer made. They will still get credit and will often say, "I'm taking a pay cut, so give me a single or shared card at the beginning/end rather than just putting my name in the scroll" – and sometimes they'll get it.
The current SAG Rate sheet good through mid-2017 for full-budget Theatrical projects can be downloaded from their site here.
SAG doesn't set rates for big name talent, their agents and managers do, so talent can take as little as they want provided it's, at minimum, SAG scale.