Part 3: Deciding On Outliers

Questions 21 and 22

Choose TWO letters, A–E.
Which TWO opinions about removing outliers do the students express?

    A. It should be done only after checking data quality and context.
    B. It always improves model accuracy.
    C. It can hide real but rare events that matter.
    D. It is unethical in academic research.
    E. It is unnecessary if you use any machine learning model.

Questions 23 and 24

Choose TWO letters, A–E.
Which TWO predictions about outlier detection in the workplace are the students doubtful about?

    A. Most companies will adopt one universal threshold for every dataset.
    B. Regulators will require clearer documentation of anomaly handling decisions.
    C. Unsupervised anomaly detection will replace rule-based checks entirely.
    D. More teams will use robust statistics instead of "delete and forget".
    E. Outliers will disappear as sensors and logging improve.

Questions 25–30

What comment do the students make about each method?
Choose SIX answers from the box and write the correct letter, A–G, next to Questions 25–30.

Comments

    A. It is a classic method but breaks down on heavy-tailed data.
    B. It is more reliable for skewed distributions than the z-score.
    C. It reduces the impact of extremes without deleting records.
    D. It can spot unusual cases but is hard to explain to managers.
    E. It is often used for quick monitoring in dashboards.
    F. It depends heavily on domain knowledge and good definitions of normal.
    G. It helps modelling because it resists outliers rather than chasing them.

Methods

25 Z-score rule
26 IQR rule, Tukey fences
27 Winsorising, capping extremes
28 Robust regression, for example Huber loss
29 Isolation Forest
30 Manual rule-based flags, business rules