CO2412 Computational Thinking
Lecture 16 - Info Representation.pdf

Information Representation & Relative Frequency

Organising & Presenting Data Graphically

Data in raw form is usually not easy to use for decision making.
Some type of organisation is needed:
- Table
- Graph
Techniques Reviewed here:
- Bar Charts & Pie Charts
- Ordered Array
- Stem & Leaf Display
- Frequency Distributions, histograms and polygons
- Contingency Tables
- Scatter Diagrams

Variables

A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration.

Examples include:
- Hair Colour
- White Blood Cell Count
- Time to failure of a computer component

Types of Variables

- Quantitative (numerical): either discrete or continuous
- Qualitative (categorical)

Tables and Charts for Categorical Data

Categorical Data.png

The Summary Table

Summarise data by category.png

Dot Plots

The dot plot is the simplest graph for quantitative data. Plot each measurement as a point on a horizontal axis, stacking points that share the same value.

Example Set using Dot Plots

The set used for the following example is: 4, 5, 5, 7, 6
horizontal way to show data.png
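
The stacking rule can be sketched in plain text for the example set above. This is a minimal illustration that assumes integer measurements; `dot_plot` is a hypothetical helper name, not part of the lecture material.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one column per value, duplicates stacked."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    tallest = max(counts.values())
    rows = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(tallest, 0, -1):
        row = "".join(
            " * " if counts[v] >= level else "   " for v in range(lo, hi + 1)
        )
        rows.append(row.rstrip())
    # Horizontal axis: every integer between the minimum and maximum.
    rows.append("".join(f"{v:^3}" for v in range(lo, hi + 1)).rstrip())
    return "\n".join(rows)

plot = dot_plot([4, 5, 5, 7, 6])
print(plot)
```

The value 5 occurs twice, so its column holds two stacked points while 4, 6 and 7 hold one each.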
Interpretating Graphs - Location & Speed.png
Interpreting Graphs - Shapes.png
Interpreting Graphs - Outliers.png

Bar & Pie Charts

Bar charts and pie charts are often used for categorical data.
The height of each bar, or the size of each pie slice, shows the frequency or percentage for that category.
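
The frequencies and percentages behind such a chart can be computed as follows. This is a sketch under stated assumptions: the hair-colour observations are made up for illustration, and the row of `#` characters only stands in for the height of a bar.

```python
from collections import Counter

# Hypothetical observations of a categorical variable (hair colour).
hair = ["brown", "black", "brown", "blonde", "brown", "black", "red", "brown"]

counts = Counter(hair)
total = len(hair)

# One row per category: (category, frequency, percentage), most common first.
summary = [(cat, n, 100 * n / total) for cat, n in counts.most_common()]

for cat, n, pct in summary:
    # The run of '#' plays the role of the bar height in a bar chart.
    print(f"{cat:<8}{n:>3}{pct:>7.1f}%  {'#' * n}")
```

The percentage column is what a pie chart would use for slice sizes, since the percentages sum to 100.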

Example Bar Chart

Bar Chart Example.png

Example Pie Chart

Pie Chart Example.png

Organising Numerical Data

Numerical Data.png

The Ordered Array

An ordered array arranges data in rank order, from the minimum value to the maximum value in the dataset.

This format provides several analytical benefits:
- Shows Ranges: Clearly defines the minimum and maximum values within the dataset.
- Highlights Variability: Offers insights into the distribution and variability across the range.
- Identifies Outliers: Makes unusual observations easier to spot.
- Limitations: In large datasets, the ordered array becomes less practical as a tool for analysis.
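
A short sketch of building an ordered array and reading off the range. The raw values are hypothetical and chosen only for illustration.

```python
# Hypothetical raw data, in the order it was collected.
raw = [38, 21, 42, 27, 32, 18, 63, 26, 33, 21]

# The ordered array is the same data sorted in ascending order.
ordered = sorted(raw)

# The minimum and maximum sit at the two ends of the ordered array.
minimum, maximum = ordered[0], ordered[-1]
data_range = maximum - minimum

print(ordered)
print(f"min={minimum}, max={maximum}, range={data_range}")
```

In the sorted output the value 63 stands apart from the rest, which is how an ordered array makes a possible outlier easier to spot.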

Example Ordered Array

Unordered raw data and ordered result.png

Stem-and-Leaf Diagram

A stem-and-leaf diagram is a simple way to see distribution details in a dataset.

Method: Separate each value in the sorted data series into its leading digits (the stem) and its trailing digit (the leaf).
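
The method can be sketched as below, assuming non-negative integers with the tens as the stem and the units as the leaf. The data values and the helper name `stem_and_leaf` are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens) to its sorted list of leaves (units)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 38 -> stem 3, leaf 8
        rows[stem].append(leaf)
    return dict(rows)

# Hypothetical data for illustration.
data = [24, 41, 27, 30, 21, 38, 26, 24, 32, 27]
diagram = stem_and_leaf(data)

for stem in sorted(diagram):
    leaves = " ".join(str(leaf) for leaf in diagram[stem])
    print(f"{stem} | {leaves}")
```

Because the values are sorted before they are split, the leaves on each row come out in ascending order.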

Example Stem-and-Leaf Diagram

Stem-and-Leaf-Example.png

Completed Stem-and-Leaf Diagram

stem-and-leaf completed 1.png

Using other Stem Units

stem-and-leaf 2.png

Completed Stem-and-Leaf diagram for other Stem Units

stem and leaf complete 2.png

Frequency Distribution

A frequency distribution is a structured representation of data, often displayed as a list or a table.

The frequency distribution includes:
- Class Groupings: Ranges or intervals within which data values fall.
- Corresponding Frequencies: The number of data points that fall into each grouping or category.
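
A minimal sketch of tallying values into class groupings. It assumes equal-width classes of the form "lower but less than upper"; the temperatures are illustrative values, not read from the figures, and `frequency_distribution` is a hypothetical helper name.

```python
def frequency_distribution(values, lower, width, n_classes):
    """Count values falling in n_classes equal-width classes [lo, lo + width)."""
    freqs = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} falls outside the class groupings")
        freqs[index] += 1
    classes = [
        (lower + i * width, lower + (i + 1) * width) for i in range(n_classes)
    ]
    return list(zip(classes, freqs))

# Illustrative daily high temperatures (assumed, 20 observations).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

table = frequency_distribution(temps, lower=10, width=10, n_classes=5)
for (lo, hi), freq in table:
    print(f"{lo} but less than {hi}: {freq}")
```

Every value lands in exactly one class, so the frequencies sum to the number of observations.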

Class Intervals & Class Boundaries

Each class grouping has the same width.
Determine the width of each interval: width of interval (approx.) = range / number of desired class groupings.
Use at least 5 but no more than 15 groupings.
Class boundaries never overlap.
Round up the interval width to get desirable endpoints.
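
The width rule can be worked through in code. This is a sketch under assumptions: the temperatures are illustrative, and `round_to` is a hypothetical parameter that rounds the width up to a convenient multiple.

```python
import math

def class_boundaries(values, n_classes, round_to=10):
    """Return (width, boundaries) for equal-width classes covering values."""
    data_range = max(values) - min(values)
    approx_width = data_range / n_classes
    # Round the approximate width up to a multiple of round_to.
    width = max(round_to, math.ceil(approx_width / round_to) * round_to)
    # Start at the multiple of the width at or below the minimum.
    start = math.floor(min(values) / width) * width
    boundaries = [start + i * width for i in range(n_classes + 1)]
    # Rounding can leave the maximum uncovered; add classes until it fits.
    while boundaries[-1] <= max(values):
        boundaries.append(boundaries[-1] + width)
    return width, boundaries

# Illustrative daily high temperatures (assumed, 20 observations).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

width, boundaries = class_boundaries(temps, n_classes=5)
print(width, boundaries)
```

Here the range is 58 - 12 = 46, so the approximate width for 5 groupings is 46 / 5 = 9.2, which rounds up to 10 and gives boundaries at 10, 20, 30, 40, 50 and 60.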

Why use Frequency Distribution

This format helps to summarise and analyse data effectively by highlighting patterns and trends.

Example Frequency Distribution

A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature.
unordered array.png
Sort raw data in ascending order.png
Data without CF.png

Cumulative Frequency

Data in ordered way with CF.png
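
A cumulative frequency is the running total of the class frequencies. The sketch below assumes illustrative class labels and frequencies rather than the values shown in the figure.

```python
from itertools import accumulate

# Illustrative classes and frequencies (assumed).
classes = ["10 but less than 20", "20 but less than 30", "30 but less than 40",
           "40 but less than 50", "50 but less than 60"]
freqs = [3, 6, 5, 4, 2]

# Running total of frequencies up to and including each class.
cumulative = list(accumulate(freqs))
total = cumulative[-1]
cumulative_pct = [100 * c / total for c in cumulative]

for label, f, cf, pct in zip(classes, freqs, cumulative, cumulative_pct):
    print(f"{label:<22}{f:>3}{cf:>4}{pct:>7.1f}%")
```

The last cumulative frequency always equals the total number of observations, so the final cumulative percentage is 100.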

The Histogram

A graph of the data in a frequency distribution is called a histogram.
The class boundaries (or class midpoints) are shown on the horizontal axis.
The vertical axis is either frequency, relative frequency, or percentage.

Bars of the appropriate heights are used to represent the number of observations within each class.
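
The quantities a histogram displays can be computed as follows. This is a text-only sketch with assumed boundaries and frequencies; a plotting library would normally draw the bars.

```python
# Illustrative class boundaries and frequencies (assumed).
boundaries = [10, 20, 30, 40, 50, 60]
freqs = [3, 6, 5, 4, 2]
total = sum(freqs)

rows = []
for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
    midpoint = (lower + upper) / 2  # label for the horizontal axis
    rows.append((midpoint, f, f / total))  # frequency and relative frequency

for midpoint, f, rel in rows:
    # Bar length stands in for bar height; rel is shown as a percentage.
    print(f"{midpoint:>5.1f} | {'#' * f:<6} {f:>2}  {rel:.0%}")
```

Whether the vertical axis shows frequency, relative frequency or percentage, the bars keep the same relative heights, so the shape of the histogram does not change.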

Example Histogram

Histogram on daily high temp.png

The Frequency Polygon

Frequency Polygon.png
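
A frequency polygon joins points plotted at each class midpoint, at the height of that class's frequency. The sketch below computes those points, assuming equal-width classes and following the common convention of closing the polygon on the axis one class width beyond each end. The data and the name `polygon_points` are hypothetical.

```python
def polygon_points(boundaries, freqs):
    """Return (midpoint, frequency) points, closed with zeros at both ends."""
    width = boundaries[1] - boundaries[0]
    midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
    points = list(zip(midpoints, freqs))
    # Anchor the polygon to the axis one class width beyond each end.
    first = (midpoints[0] - width, 0)
    last = (midpoints[-1] + width, 0)
    return [first] + points + [last]

# Illustrative class boundaries and frequencies (assumed).
points = polygon_points([10, 20, 30, 40, 50, 60], [3, 6, 5, 4, 2])
print(points)
```

Joining consecutive points with straight lines gives the polygon.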

Misusing Graphs & Ethical Issues

Guidelines for good graphs:
- Do not distort the data
- Avoid unnecessary adornments (no "chart junk")
- Use a scale for each axis on a two-dimensional graph
- The vertical axis scale should begin at 0
- Properly label all axes
- The graph should contain a title
- Use the simplest graph for a given dataset

Summary

Data in raw form is usually not easy to use for decision making. Some type of organisation is needed:
- Table
- Graph

Techniques reviewed in this chapter:
- Bar charts, pie charts, and Pareto diagrams
- Ordered array and stem-and-leaf display
- Frequency distributions, histograms and polygons
- Cumulative distributions and ogives
- Contingency tables and side-by-side bar charts
- Scatter diagrams and time series plots