OVERVIEW

Histogram was first introduced by Karl Pearson as a visualization to present the distribution of a set of data. It shows the estimation of the probability distribution for a continuous variable like a range of value, which is referred to be “bin”. Histogram is easily recognized as a bar chart with no space between each bar because the bins are consecutive and non-overlap intervals of data. The bins and the bars have to be adjacent and equal of size.

1

  • NOTE: The above histogram is an example from book “Excel Dashboard and Report”. Each bin is a range of unit sold with the bar showing the number of sale representatives in each bin. It includes a line of cumulative percentage in blue. The distribution was dimmed in the background with a light gray color to differentiate with the line.

CREATE ON EXCEL

  • Concept: First, we need to create the bin with equal intervals of data. Then we count a number of Sale representatives in each bin with cumulative % for each bin. Then we create a combo chart with a line and a bar chart.
  • Data: Open Excel file -> Tab Data
  • Raw Data includes only information of sale for each sale representative.

2.JPG

  • Using Frequency Formula to create range and count data for each bin

3.JPG

{=FREQUENCY(B3:B246,D3:D13)}

NOTE: The frequency formula is an array formula (with the bracket {}) pointing to the data (B3: B246) and the bins (D3: D13). Type in the formula in the first cell, then select the entire range up to the last bin, click on the formula bar to edit (or use F2) then use combo Ctrl + Shift + Enter. The formula will automatically apply to all the cells of the range and become an array formula.

  • Create final bin with range and count of sale representatives with cumulative %

4.JPG

NOTE: Unit Sold is created by the two consecutive bins above. The count of sale representative is the count of the end of the bin. For example, range 0-5 will have 8 representatives, which is equal to the number of bin 5. The cumulative % is calculated by taking the ratio of the sum from the first range up to the range it is counting over the total sum of representatives.

=SUM($H$3:H3)/SUM($H$3:$H$12)
  • Create a combo chart with bar graph and line graph

5.JPG

  • Erase the space between each bar by setting 0% Series Overlap and 0% Gap Width

6.JPG

  • Format the graph with dimmed Axis and bar color in light gray. The line is highlighted with blue color and showing the label.

TIPS AND OPINIONS

  • Color: Use proper color for the bar graphhere I use color light gray not to interrupt the line but still give a great visualization to see the distribution.
  • Bins: With the data set, different bin structures will tell a different story. The more intervals of data, the more detail of the distribution will be visualized. Proper use of bin range will make it easier for the reader to understand the story behind.

REFERENCES:

  • Alexander, M., & Walkenbach, J. (2016). Excel dashboards and reports. Hoboken, NJ: John Wiley & Sons.

About the Author BBnguyen

I am interested and working, researching about the new technology for business intelligence, big data, IoT and statistical analysis.

Trả lời

Mời bạn điền thông tin vào ô dưới đây hoặc kích vào một biểu tượng để đăng nhập:

WordPress.com Logo

Bạn đang bình luận bằng tài khoản WordPress.com Đăng xuất / Thay đổi )

Twitter picture

Bạn đang bình luận bằng tài khoản Twitter Đăng xuất / Thay đổi )

Facebook photo

Bạn đang bình luận bằng tài khoản Facebook Đăng xuất / Thay đổi )

Google+ photo

Bạn đang bình luận bằng tài khoản Google+ Đăng xuất / Thay đổi )

Connecting to %s