selected_top_l3
"/home/yossef/notes/Su/selected_top/selected_top_l3.md"
path: Su/selected_top/selected_top_l3.md
- **fileName**: selected_top_l3
- **Created on**: 2025-04-13 01:37:46
What is machine learning?
It is a system that learns from examples and produces accurate results through
self-improvement; it makes predictions by learning.
The list of attributes used to solve a problem is called a feature vector.
- Example: gender and age.
The target we want to predict is called the label.
- Example: Willing to buy or not (True / False).
The rules used to make predictions are called the model.
The program that is used to generate the model is called the algorithm.
The process of generating a model is called training or learning,
and the set of data used in this process is called the training data.
A single data instance is called a sample.
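The terms above can be tied together in a minimal Python sketch. The data values, the age-threshold rule, and the names `train_stump` and `model` are hypothetical, chosen only for illustration; real algorithms are more elaborate.

```python
# Training data: each sample is (feature vector, label).
# Feature vector = (gender, age); label = willing to buy (True / False).
# All values are made up for illustration.
training_data = [
    (("F", 22), False),
    (("M", 25), False),
    (("F", 41), True),
    (("M", 38), True),
    (("F", 30), False),
    (("M", 52), True),
]


def train_stump(data):
    """Algorithm: pick the age threshold that misclassifies the fewest samples.

    Returns the model, a rule that maps a feature vector to a predicted label.
    """
    ages = sorted(features[1] for features, _ in data)
    best_threshold, best_errors = None, len(data) + 1
    for threshold in ages:
        errors = sum(
            (features[1] >= threshold) != label for features, label in data
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors

    def model(features):
        # The learned rule: predict True when age reaches the threshold.
        return features[1] >= best_threshold

    model.threshold = best_threshold
    return model


model = train_stump(training_data)  # training (learning)
prediction = model(("F", 45))       # predict the label of a new sample
print(model.threshold, prediction)
```

Here `training_data` is the training data, each tuple is a sample, `train_stump` plays the role of the algorithm, and the returned `model` is the rule used to predict.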
Machine learning can be grouped into:
- Supervised learning: the label information is available during training.
- Unsupervised learning: the label information is unavailable during training.
- Reinforcement learning: the machine learns by trial and error, receiving
  rewards for correct actions and penalties for mistakes.
The label of a classification problem is discrete.
What are the types of classification?
- If the label has only two classes, it is called binary classification.
- If the label has more than two classes, it is called multiclass classification.
Binary classification examples:
- Face verification (True, False)
- Sentiment analysis (positive, negative)
- Spam filtering (True, False)
- Cancer diagnosis (benign, malignant)
Multiclass classification examples:
- Face identification (Alice, Bob, Charles, …)
- Object recognition (flower, ball, cup, dog, …)
- Weather prediction (sunny, rainy, foggy, …)
- Behavior recognition (walking, running, dancing, …)
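The binary/multiclass distinction depends only on how many distinct classes the label takes. A small sketch, assuming a hypothetical helper `classification_type` and made-up label lists:

```python
def classification_type(labels):
    """Return 'binary' for two distinct classes, 'multiclass' for more."""
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("need at least two distinct classes")
    return "binary" if n_classes == 2 else "multiclass"


# Hypothetical labels for two of the example tasks.
spam_labels = [True, False, False, True]
weather_labels = ["sunny", "rainy", "foggy", "sunny"]

print(classification_type(spam_labels))     # binary
print(classification_type(weather_labels))  # multiclass
```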
Regression: the label of a regression problem is continuous. Examples:
- Stock price prediction.
- Temperature prediction.
- Salary estimation.
- House price prediction.
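As an illustration of predicting a continuous label, here is a minimal sketch of fitting a straight line by ordinary least squares. The house sizes and prices are made up, and `fit_line` is a hypothetical helper; least squares is just one possible choice of algorithm.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: feature = house size, label = price (continuous).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(sizes, prices)  # training
predicted = slope * 90.0 + intercept        # predict the price of a new house
print(slope, intercept, predicted)
```

Unlike a classifier, the model here outputs a number on a continuous scale rather than one of a fixed set of classes.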
What is unsupervised learning?
A model is trained to discover previously undetected patterns in a data set
with no pre-existing labels. Examples:
- Social network user grouping.
- Image segmentation.
- Anomaly detection.
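Grouping without labels can be sketched with a tiny one-dimensional k-means (k = 2). The user ages are made up, `kmeans_1d` is a hypothetical helper, and k-means is only one of many clustering techniques, picked here for brevity.

```python
def kmeans_1d(values, iterations=10):
    """Group 1-D values into two clusters using k-means with k = 2."""
    # Start from the smallest and largest value as initial cluster centers.
    centers = [min(values), max(values)]
    assignment = []
    for _ in range(iterations):
        # Assign each value to its nearest center.
        assignment = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            for v in values
        ]
        # Move each center to the mean of its assigned values.
        for k in (0, 1):
            members = [v for v, a in zip(values, assignment) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignment, centers


# Made-up user ages; note that no labels are given.
ages = [18, 21, 19, 62, 65, 70]
groups, centers = kmeans_1d(ages)
print(groups, centers)
```

The algorithm never sees a label; the two groups emerge only from how the values cluster together.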
What is Spark MLlib?
MLlib is Spark's machine learning library, designed to make machine learning
tasks practical on big data.
At a high level, MLlib provides tools such as:
- ML Algorithms: classification, regression, clustering
- Featurization: feature extraction, reduction, selection
- Pipelines: tools for constructing and tuning ML pipelines
- Utilities: linear algebra, statistics, data handling
What is a DataFrame?
It is a higher-level data type in Spark, built on top of the Resilient
Distributed Dataset (RDD), that organizes data into named columns, like a
table in a relational (SQL) database.
continue:[[]]
before:./selected_top_l2.md