• Joe Wilkinson

Dot Distribution Map: Data Science Spotlight

Communication is one of the most critical aspects of data science. What good is data if we can't explain what it means and why it's important? Humans are built to see patterns; doing so helps us interpret our surroundings faster and react swiftly. Data visualization techniques are essential because they let us skip sifting through text-based data and go straight to the patterns.

One exciting data visualization technique is the dot distribution map (DDM), also known as a dot density map. A DDM is a map that uses dots to show where a phenomenon occurs. By placing the points on the map, we can quickly get a sense of the geographic relationship between our data points. DDMs are an excellent tool for showing where clusters of data points reside, and they show a level of granularity that most other map visualizations can't match. This is vital because geographic data is hard to make sense of with simple numbers. You can't take an average of addresses. You can't find their standard deviation. The data only makes sense when you place it on a map.

The dots on the map can be either one-to-one or one-to-many. A one-to-one DDM is the most accurate and straightforward: each dot represents a single data point. This gives a view of the entire dataset that one-to-many cannot. In a one-to-many DDM, each dot on the map represents a collection of data points. Using one-to-many lets us zoom the map out without losing representation for the data points.
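
To make the one-to-many idea concrete, here is a minimal sketch in Python. It assumes the data points are plain (x, y) pairs and that "a collection of data points" means a fixed number of points falling in the same square grid cell. The function name, the grid approach, and the sample coordinates are all illustrative assumptions, not a standard method.

```python
import math
import random
from collections import Counter


def one_to_many_dots(points, cell_size, points_per_dot, seed=0):
    """Return one dot for every `points_per_dot` points in a grid cell.

    Hypothetical helper: each dot is placed at a random position inside
    the cell it summarizes, so a dot shows density, not an exact location.
    """
    rng = random.Random(seed)  # seeded so the layout is reproducible
    # Count how many points fall into each square grid cell.
    counts = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in points
    )
    dots = []
    for (col, row), n in sorted(counts.items()):
        # Leftover points (fewer than points_per_dot) get no dot.
        for _ in range(n // points_per_dot):
            dots.append(((col + rng.random()) * cell_size,
                         (row + rng.random()) * cell_size))
    return dots


# Made-up sample data: four points in one cell, one isolated point.
points = [(0.1, 0.2), (0.4, 0.9), (0.7, 0.3), (0.8, 0.8), (2.5, 2.5)]
dots = one_to_many_dots(points, cell_size=1.0, points_per_dot=2)
print(len(dots), "dots drawn for", len(points), "points")
```

Notice the trade-off in this sketch: the isolated point disappears entirely because it does not fill a whole dot. That is the loss of detail you accept in exchange for a readable zoomed-out map.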

Choosing between the two depends on the number of individual data points, the size of the map, and what's critical for your use case. When tracking the outbreak of a disease, it's important to see each data point. When looking at census data from the United States, on the other hand, you probably don't need to see all 300 million people.


One-to-many Dot Distribution Map

The size of a dot can have a significant impact on the conclusions drawn from the visualization and can be a considerable weakness of DDMs. Each dot must be visible, no matter how much area the map covers. Because a dot has to stay visible, it covers more ground the further we zoom out, so granularity becomes more of an issue. Know your use case and what level of granularity the information you need requires. If you're trying to stop the spread of a disease, you will need to know precisely where in a city, maybe even in a neighborhood, your observations lie. In this case, a high level of granularity becomes a necessity.
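
A rough calculation shows how quickly a fixed-size dot loses precision as the map zooms out. The sketch below assumes a Web Mercator web map with 256-pixel tiles, where ground resolution at zoom 0 on the equator is about 156,543 meters per pixel and halves with each zoom level. The 6-pixel dot, the latitude, and the function names are illustrative assumptions.

```python
import math

# Ground resolution at the equator, zoom level 0, for 256-pixel
# Web Mercator tiles (Earth's circumference divided by 256).
EQUATOR_METERS_PER_PIXEL_Z0 = 156543.03392


def meters_per_pixel(latitude_deg, zoom):
    """Approximate ground distance covered by one screen pixel."""
    return (EQUATOR_METERS_PER_PIXEL_Z0
            * math.cos(math.radians(latitude_deg))
            / (2 ** zoom))


def dot_footprint_meters(dot_diameter_px, latitude_deg, zoom):
    """Ground width hidden under a dot of the given pixel diameter."""
    return dot_diameter_px * meters_per_pixel(latitude_deg, zoom)


# Hypothetical example: a 6-pixel dot at latitude 39.1 degrees north.
for zoom in (4, 10, 16):
    width = dot_footprint_meters(6, 39.1, zoom)
    print(f"zoom {zoom}: a 6 px dot covers about {width:,.0f} m")
```

At a street-level zoom the dot covers a few meters, while at a national zoom the same dot can cover an area far larger than a neighborhood. This is why the dot's position should be read as approximate once you zoom out.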

DDMs can distinguish what each data point means by using multiple colors or symbols. This lets us visualize geographic differences based on different conditions. It is often used to show differences in location based on demographic data. An excellent example is showing the lingering segregation of cities in our country.
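
One way to prepare data for a color-coded DDM is to split the points into one layer per category and give each layer its own color. This is a minimal sketch assuming each record is an (x, y, category) tuple. The function name, the category labels, the coordinates, and the color names are all made up for illustration.

```python
from collections import defaultdict


def color_layers(records, palette):
    """Group (x, y, category) records into one colored layer per category."""
    grouped = defaultdict(lambda: {"xs": [], "ys": []})
    for x, y, category in records:
        grouped[category]["xs"].append(x)
        grouped[category]["ys"].append(y)
    # Assign colors in sorted category order so the mapping is stable
    # from run to run; the palette is reused if categories outnumber it.
    layers = {}
    for i, category in enumerate(sorted(grouped)):
        layers[category] = {
            "color": palette[i % len(palette)],
            "xs": grouped[category]["xs"],
            "ys": grouped[category]["ys"],
        }
    return layers


# Hypothetical records: (longitude, latitude, group label).
records = [
    (-94.58, 39.10, "group_a"),
    (-94.60, 39.05, "group_b"),
    (-94.55, 39.12, "group_a"),
]
layers = color_layers(records, palette=["tab:blue", "tab:orange"])
for category, layer in layers.items():
    print(category, layer["color"], len(layer["xs"]), "dots")
```

Each layer could then be drawn with a plotting library of your choice, for example one scatter call per layer in matplotlib, so that every category appears in its own color with a matching legend entry.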


Dot Distribution Map differentiated by color

Overall, dot distribution maps are a powerful data visualization technique. They are best suited to a narrow set of use cases, but when the fit is right, they allow for an instant understanding of the data. A DDM is so intuitive that when someone sees one, they quickly grasp the geographic relationships between the data points.

Conclusions

Pros

  • Shows geographic relationships

  • High granularity of data

Cons

  • Size of dots can misrepresent location

  • Requires a large dataset

  • Size of the map is highly dictated by the number of data points


©2019 by Code the Block a 501(c)3 Organization.