Aggregation

When you combine data from many different sources or time periods to lower the chance that any single individual can be identified.
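As a rough illustration of the idea, the sketch below averages a per-user value by group and drops the user identifiers. The records, the `aggregate_by_city` helper, and the `min_group_size` cutoff are all hypothetical and chosen only for this example. Suppressing small groups is one common precaution, not a guarantee of anonymity.

```python
from collections import defaultdict

# Hypothetical per-user records: (user_id, city, minutes_active)
records = [
    ("u1", "Lyon", 30),
    ("u2", "Lyon", 50),
    ("u3", "Oslo", 20),
    ("u4", "Oslo", 40),
    ("u5", "Oslo", 60),
]

def aggregate_by_city(rows, min_group_size=2):
    """Average minutes per city, discarding user IDs and small groups."""
    groups = defaultdict(list)
    for _user_id, city, minutes in rows:
        groups[city].append(minutes)
    # Groups below the cutoff are omitted because a tiny group can
    # still point to one person.
    return {
        city: sum(values) / len(values)
        for city, values in groups.items()
        if len(values) >= min_group_size
    }

summary = aggregate_by_city(records)
print(summary)
```

Raising `min_group_size` trades detail for privacy: fewer groups are reported, but each reported figure blends more individuals.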

Augment

When a machine, software, or function extends a person’s abilities or potential while maintaining their agency.

Automate

When a machine, software, or function performs a task without user involvement.

Binary Classification

When an ML model predicts which of two categories an example falls into, based on a set of features.
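A minimal sketch of the idea, assuming a hand-written scoring rule in place of a trained model. The feature names, weights, and threshold are invented for illustration; a real model would learn them from labeled data.

```python
# Hypothetical weights; a trained model would learn these from data.
WEIGHTS = {"num_links": 0.2, "has_urgent_word": 0.5}

def is_spam(features, threshold=0.5):
    """Return True for "spam", False for "not spam"."""
    # Weighted sum of the features; missing features count as 0.
    score = sum(w * features.get(name, 0) for name, w in WEIGHTS.items())
    return score >= threshold

print(is_spam({"num_links": 3, "has_urgent_word": 1}))
print(is_spam({"num_links": 1, "has_urgent_word": 0}))
```

Every example ends up in exactly one of the two categories, and the threshold controls where the boundary between them falls.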

Classification

When a machine learning model assigns an input to a category. The simplest classification answers an identification question with “yes” or “no”. For example, if a model were shown a picture of a cat, it could classify it as “Cat” or “Not a cat”. More complex classifications sort items into one of several groups.

Confidence Level, Model Confidence

The confidence level for a model is a statistical measure of how certain the model is about a prediction or outcome.
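One common proxy for confidence, sketched here as an assumption and not as the only definition, is the probability a model assigns to its top prediction. The example converts hypothetical raw class scores into probabilities with a softmax and reports the highest one. Such values are not necessarily well calibrated.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def predict_with_confidence(scores):
    """Return the most likely label and the probability assigned to it."""
    probs = softmax(scores)
    label = max(probs, key=probs.get)
    return label, probs[label]

# Hypothetical raw scores from a model for a single image.
label, confidence = predict_with_confidence({"cat": 2.0, "dog": 0.0})
print(label, round(confidence, 2))
```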

Context Errors

Situations when the product output doesn’t make sense in the user’s current context. Often, this output is perceived as irrelevant by the user.

Counterfactuals

An explanation of why something was classified as not within the given class, usually in the form of a statement of how the world would have to be different for a desirable outcome to occur.
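To make this concrete, the sketch below searches for the smallest change to one input that would flip a rejection into an approval. The `approve` rule, its numbers, and the `income_counterfactual` helper are hypothetical and stand in for a real model.

```python
def approve(income, debt):
    """Hypothetical decision rule standing in for a trained model."""
    return income - 2 * debt >= 50

def income_counterfactual(income, debt, step=1, max_steps=1000):
    """Smallest income increase that flips a rejection to an approval.

    Returns None if the case is already approved or no increase within
    the search range changes the outcome.
    """
    if approve(income, debt):
        return None
    for i in range(1, max_steps + 1):
        if approve(income + i * step, debt):
            return i * step
    return None

increase = income_counterfactual(income=40, debt=5)
print(f"Approved if income were higher by {increase}.")
```

The result reads as a counterfactual statement: the outcome would have been different if the income had been higher by that amount.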

Data Cascades

Compounding events in which data issues cause negative downstream effects and result in technical debt over time.

Data Collection and Labeling

How product teams get the data they need and apply meaningful labels to it. For example: acquiring millions of images of cats and dogs correctly labeled as “cat” or “dog”.