Free Practice Questions for Scikit-learn Professional Practitioner Certification
Study with 399 exam-style practice questions designed to help you prepare for the Scikit-learn Professional Practitioner Certification.
All Domains
Practice with randomly mixed questions from all topics
Domain Mode
Practice questions from a specific topic area
Exam Details
Key information about Scikit-learn Professional Practitioner Certification
- Multiple choice
- True/False
- Fill in the blank
Professional
mid-level data scientist
Exam Topics & Skills Assessed
Skills measured (from the official study guide)
Domain 1: Machine learning concepts
The advanced mental model. Probabilistic outputs, regularization regimes, and what overfitting does to soft predictions.
Subdomain 1.1: Supervised and unsupervised learning, regression, classification, clustering, dimensionality reduction
Subdomain 1.2: Model families, tree-based, linear, ensemble, neighbors
Subdomain 1.3: Regularization, L1, L2, Elastic Net
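As an illustration of Subdomain 1.3, the sketch below compares how L1, L2, and Elastic Net penalties treat uninformative features. The synthetic dataset and the alpha values are assumptions chosen for the example, not settings taken from the study guide.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data (an assumption for illustration): 20 features, 5 informative.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0
)

models = {
    "L1 (Lasso)": Lasso(alpha=1.0),
    "L2 (Ridge)": Ridge(alpha=1.0),
    "Elastic Net": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

# Count coefficients that each penalty drives exactly to zero.
zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    zero_counts[name] = int(np.sum(model.coef_ == 0.0))
    print(f"{name}: {zero_counts[name]} of {X.shape[1]} coefficients are exactly zero")
```

The L1 penalty tends to zero out weak features, while the L2 penalty shrinks coefficients without eliminating them. Elastic Net mixes the two penalties through l1_ratio.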
Subdomain 1.4: Hard and soft predictions, predict vs predict_proba
Subdomain 1.5: Overfitting and underfitting, impact on soft predictions
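Subdomains 1.4 and 1.5 can be illustrated together. This is a minimal sketch on an assumed synthetic dataset: predict returns hard class labels, predict_proba returns soft per-class probabilities, and a fully grown decision tree that memorizes its training set produces soft predictions that collapse to 0 or 1.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

clf = LogisticRegression().fit(X, y)
hard = clf.predict(X[:5])        # hard predictions: one class label per row
soft = clf.predict_proba(X[:5])  # soft predictions: one probability per class

# The hard prediction is the class with the highest predicted probability.
recovered = clf.classes_[np.argmax(soft, axis=1)]

# An unconstrained tree grows until its leaves are pure, so on the training
# set its "probabilities" are all exactly 0.0 or 1.0.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
tree_soft = tree.predict_proba(X)

print(hard)
print(soft.round(3))
print(np.unique(tree_soft))
```

Overconfident soft predictions like the tree's are a common symptom of overfitting, even when the hard predictions look reasonable.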
Domain 2: Model building and evaluation
Pick the baseline, regularize the noise, ensemble when warranted, and choose the metric that fits the problem.
Subdomain 2.1: Linear models as baselines
Subdomain 2.2: Handling correlation with regularization and feature selection
Subdomain 2.3: Bagging and boosting, the working ensemble methods
Subdomain 2.4: Choosing metrics for outliers and imbalanced settings
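For Subdomain 2.4, the sketch below shows why the metric choice matters. The toy arrays are hypothetical: a classifier that always predicts the majority class on a 95/5 split, and a regression target with a single large outlier.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    mean_absolute_error,
    mean_squared_error,
    median_absolute_error,
)

# Imbalanced labels (hypothetical): 95 negatives, 5 positives.
X = np.zeros((100, 1))
y = np.array([0] * 95 + [1] * 5)
majority = DummyClassifier(strategy="most_frequent").fit(X, y)
y_pred = majority.predict(X)

acc = accuracy_score(y, y_pred)               # 0.95, looks good
bal_acc = balanced_accuracy_score(y, y_pred)  # 0.5, no better than chance

# Regression with one outlier in the target (hypothetical values).
y_true = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
y_hat = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

mse = mean_squared_error(y_true, y_hat)       # 1805.0, dominated by the outlier
mae = mean_absolute_error(y_true, y_hat)      # 19.0
med_ae = median_absolute_error(y_true, y_hat) # 0.0, ignores the outlier

print(acc, bal_acc, mse, mae, med_ae)
```

Plain accuracy rewards the majority-class guess, while balanced accuracy averages recall over classes. Squared error amplifies the single outlier, and median absolute error is unaffected by it.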
Domain 3: Interpretation and communication
Read the plot, name the failure mode, explain it without using the word probability twice.
Subdomain 3.1: Visualizing results with intermediate matplotlib and seaborn techniques
Subdomain 3.2: Interpreting model outputs and performance metrics
Subdomain 3.3: Communicating results to non-technical stakeholders
Domain 4: Data preprocessing
Heatmaps, PCA, polynomial features, label propagation. The shaping work that makes a real-world dataset trainable.
Subdomain 4.1: Loading parquet datasets
Subdomain 4.2: Heatmaps and PCA for first look
Subdomain 4.3: Identifying strongly correlated features
Subdomain 4.4: Missing values in the target via label propagation
Subdomain 4.5: Feature engineering with PolynomialFeatures, SplineTransformer
Subdomain 4.6: Combining features with FeatureUnion
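Subdomains 4.5 and 4.6 can be sketched in one example: FeatureUnion fits several transformers on the same input and concatenates their outputs column-wise. The random input matrix and the choice of PCA plus PolynomialFeatures are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import PolynomialFeatures

# Random input (an assumption for illustration): 50 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

union = FeatureUnion(
    [
        ("pca", PCA(n_components=2)),  # 2 projected components
        ("poly", PolynomialFeatures(degree=2, include_bias=False)),  # 3 linear + 6 quadratic terms
    ]
)

# Each transformer sees the full X; outputs are stacked side by side.
Xt = union.fit_transform(X)
print(Xt.shape)  # (50, 11)
```

The result has 2 + 9 = 11 columns. A FeatureUnion can itself be used as a step inside a Pipeline ahead of an estimator.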
Domain 5: Model selection and validation
Group structure, non-i.i.d. data, nested CV, stable hyperparameters across folds.
Subdomain 5.1: Cross-validation with group structure and non-i.i.d. data
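For Subdomain 5.1, a minimal sketch of group-aware splitting with GroupKFold. The toy arrays and the group labels (for instance, one group per patient or per user) are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Toy data (hypothetical): 12 samples from 4 groups of 3 samples each.
X = np.arange(24).reshape(12, 2)
y = np.array([0, 1] * 6)
groups = np.repeat([0, 1, 2, 3], 3)

cv = GroupKFold(n_splits=4)
overlaps = []
for train_idx, test_idx in cv.split(X, y, groups):
    # Groups shared between train and test would leak information.
    shared = set(groups[train_idx]) & set(groups[test_idx])
    overlaps.append(shared)
    print("test groups:", sorted(set(groups[test_idx])), "shared with train:", shared)
```

Every group appears in the test set of exactly one fold and never on both sides of a split, which is the property a plain KFold does not guarantee.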
Subdomain 5.2: Hyperparameter tuning, GridSearchCV, RandomizedSearchCV
Subdomain 5.3: Stability of optimal hyperparameters via nested cross-validation
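Subdomain 5.3 can be sketched as a GridSearchCV wrapped in an outer cross_validate call. The dataset, the LogisticRegression estimator, and the grid over C are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_validate

# Synthetic data (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)  # tunes hyperparameters
outer_cv = KFold(n_splits=4, shuffle=True, random_state=1)  # estimates generalization

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=inner_cv)

# return_estimator=True keeps the fitted search object from each outer fold.
results = cross_validate(search, X, y, cv=outer_cv, return_estimator=True)

best_C_per_fold = [est.best_params_["C"] for est in results["estimator"]]
print("best C per outer fold:", best_C_per_fold)
print("outer test scores:", results["test_score"])
```

If the selected value changes a lot from one outer fold to the next, the tuning is unstable and the single "best" hyperparameter from one search should be treated with caution.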
Techniques & products