Free Practice Questions for Scikit-learn Professional Practitioner Certification
Study with 380 exam-style practice questions designed to help you prepare for the Scikit-learn Professional Practitioner Certification.
Random Questions
Practice with randomly mixed questions from all topics
Domain Mode
Practice questions from a specific topic area
Exam Information
Exam Details
Key information about Scikit-learn Professional Practitioner Certification
- Multiple choice
- True/False
- Fill in the blank
Level: Professional
Target audience: mid-level data scientist
Exam Topics & Skills Assessed
Skills measured (from the official study guide)
Domain 1: Machine learning concepts
Subdomain 1.1: Supervised and unsupervised learning
Supervised and unsupervised learning (regression, classification, clustering, dimensionality reduction)
Subdomain 1.2: Types of model families
Types of model families (tree-based, linear, ensemble, neighbors)
Subdomain 1.3: Regularization
Regularization (L1, L2, Elastic Net)
Subdomain 1.4: Hard and soft predictions in classification
Hard and soft predictions in classification (predict vs predict_proba)
Subdomain 1.5: Model overfitting and underfitting impact on soft predictions
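To make the hard versus soft prediction distinction in Subdomain 1.4 concrete, here is a minimal sketch. The synthetic dataset and the choice of `LogisticRegression` are assumptions made for illustration and are not taken from the study guide.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (illustrative assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = LogisticRegression().fit(X, y)

# Hard predictions: one class label per sample.
hard = clf.predict(X[:5])

# Soft predictions: one probability per class per sample.
soft = clf.predict_proba(X[:5])

# The hard prediction corresponds to the most probable class.
hard_from_soft = clf.classes_[np.argmax(soft, axis=1)]

print(hard)
print(soft.round(3))
```

Soft predictions carry the model's confidence, which is why overfitting or underfitting can distort them even when the hard labels look acceptable.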
Domain 2: Model building and evaluation
Subdomain 2.1: Linear models as baselines
Subdomain 2.2: Handling correlation with regularization and feature selection
Subdomain 2.3: Understanding of bagging and boosting ensemble methods
Subdomain 2.4: Correct choice of metrics
Correct choice of metrics (presence of outliers, imbalanced settings, etc.)
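As one illustration of why metric choice matters in imbalanced settings (Subdomain 2.4), the sketch below scores a trivial majority-class predictor. The 95/5 class split is a made-up example, not data from the exam.

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Hypothetical imbalanced labels: 95 negatives and 5 positives.
y_true = np.array([0] * 95 + [1] * 5)

# A trivial "model" that always predicts the majority class.
y_pred = np.zeros(100, dtype=int)

acc = accuracy_score(y_true, y_pred)           # 0.95
bal = balanced_accuracy_score(y_true, y_pred)  # 0.5

print(f"accuracy={acc:.2f}, balanced accuracy={bal:.2f}")
```

Plain accuracy rewards the predictor for ignoring the minority class, while balanced accuracy (the mean of per-class recall) shows that it is no better than chance.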
Domain 3: Interpretation of results & communication
Subdomain 3.1: Visualizing model results using intermediate plotting techniques
Visualizing model results using intermediate plotting techniques (matplotlib, seaborn)
Subdomain 3.2: Interpreting and communicating model outputs and performance metrics to non-technical stakeholders
Domain 4: Data preprocessing
Subdomain 4.1: Loading parquet datasets
Subdomain 4.2: Visualizing data with intermediate plotting techniques
Visualizing data with intermediate plotting techniques (heatmaps, PCA)
Subdomain 4.3: Identify strongly correlated features
Subdomain 4.4: Handling missing values in the target by using label propagation
Subdomain 4.5: Feature engineering
Feature engineering using PolynomialFeatures, SplineTransformer, etc.
Subdomain 4.6: Combining features with FeatureUnion
Domain 5: Model selection and validation
Subdomain 5.1: Broader understanding of cross-validation techniques
Broader understanding of cross-validation techniques (group structure, non-i.i.d. data, etc.)
Subdomain 5.2: Performing hyperparameter tuning
Performing hyperparameter tuning using GridSearchCV, RandomizedSearchCV
Subdomain 5.3: Stability of optimal hyperparameters across splits with nested cross-validation
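A minimal sketch of nested cross-validation, as named in Subdomain 5.3, is shown below. The dataset, the estimator, and the grid of `C` values are assumptions chosen for illustration. An inner `GridSearchCV` tunes the hyperparameter, and an outer `cross_validate` loop both scores the tuned model and exposes which value won in each outer split.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_validate

# Synthetic data (illustrative assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Inner loop selects hyperparameters; outer loop evaluates the whole procedure.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=inner_cv)

# return_estimator=True keeps the fitted search object from each outer fold.
results = cross_validate(search, X, y, cv=outer_cv, return_estimator=True)

best_C_per_fold = [est.best_params_["C"] for est in results["estimator"]]
outer_scores = results["test_score"]

print("best C per outer fold:", best_C_per_fold)
print("outer test scores:", outer_scores.round(3))
```

If the selected value changes substantially from one outer fold to the next, the "optimal" hyperparameter is unstable and should be reported with that caveat.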
Techniques & products