AWS Certified Machine Learning – Specialty (MLS-C01) Practice Exam

Exam Name: MLS-C01 Practice Exam – AWS Certified Machine Learning Specialty (2026 Updated)
Exam Provider: Amazon Web Services (AWS)
Certification Type: Specialty-Level Certification (Advanced Machine Learning, Data Engineering, Modeling & Deployment on AWS)
Total Practice Questions: 100 Advanced MCQs (Scenario-Based + Feature Engineering + Modeling + Evaluation + Deployment + MLOps)
Exam Domains Covered:
• Data Engineering (data collection, transformation, feature engineering)
• Exploratory Data Analysis (EDA, visualization, feature selection)
• Modeling (classification, regression, clustering, deep learning)
• Evaluation (metrics, bias-variance trade-off, cross-validation)
• Deployment (real-time endpoints, batch inference, CI/CD pipelines)
• Monitoring & Optimization (drift detection, retraining, performance tuning)
• AWS ML Services (SageMaker, Feature Store, Model Monitor, pipelines)
Questions in Real Exam:
• Total: ~65 Questions
• Scenario-heavy with real-world ML workflows
• Focus on advanced ML concepts and AWS integration
Exam Duration:
• Total Time: 180 Minutes
• Complex problem-solving questions requiring deep ML knowledge
• Emphasis on practical application and architecture decisions
Passing Score:
• Scaled Score: 750 / 1000
• Requires strong ML theory and AWS service expertise
• Focus on real-world production ML systems
Question Format:
• Multiple Choice & Multiple Response
• Scenario-Based ML Problem Solving
• Feature Engineering & Data Processing Cases
• Model Evaluation & Optimization Questions
• Deployment & Monitoring Scenarios
Difficulty Level: Advanced to Expert (Specialty-Level + Production ML Scenarios)
Key Knowledge Areas:
• Advanced feature engineering (encoding, scaling, dimensionality reduction)
• Model selection and tuning (hyperparameters, ensembles, deep learning)
• Evaluation metrics (precision, recall, F1, ROC-AUC, RMSE, NDCG)
• Bias-variance trade-off and overfitting/underfitting handling
• Deployment strategies (SageMaker endpoints, batch transform, CI/CD)
• Monitoring (data drift, concept drift, model performance tracking)
• Distributed training and optimization (GPU, parallelization)
• MLOps practices (pipelines, versioning, reproducibility)
Common Exam Traps:
• Data leakage during preprocessing or feature engineering
• Using incorrect evaluation metrics for imbalanced datasets
• Ignoring feature scaling or transformation requirements
• Misinterpreting precision vs recall trade-offs
• Overfitting due to excessive model complexity
• Not considering drift detection and retraining strategies
• Choosing incorrect deployment method (real-time vs batch)
• Ignoring cost and latency trade-offs in production ML
Skills Developed:
• Designing scalable end-to-end ML pipelines
• Performing advanced feature engineering and data preparation
• Selecting and tuning models for optimal performance
• Evaluating models using appropriate metrics and validation strategies
• Deploying ML models in production with AWS services
• Monitoring, maintaining, and improving ML systems over time
Study Strategy:
• Focus on real-world ML scenarios and decision-making
• Practice feature engineering and data preprocessing techniques
• Understand evaluation metrics and when to use them
• Learn SageMaker services and distributed training concepts
• Study model tuning, regularization, and cross-validation
• Analyze production ML systems and failure scenarios
• Take timed mock exams and review detailed explanations
Best For:
• Machine learning engineers and data scientists
• AI/ML specialists working with AWS
• Cloud engineers implementing ML pipelines
• Professionals preparing for advanced ML certifications
Career Benefits:
• Validates advanced machine learning expertise on AWS
• Opens roles in ML engineering, data science, and AI architecture
• Enhances skills in production ML and MLOps
• Increases earning potential in AI-driven industries
• Recognized as one of the most advanced AWS certifications
Updated: 2026 Latest Version – Based on AWS MLS-C01 Exam Guide & Real Exam Patterns

1.

A dataset contains highly skewed numerical features. What is the BEST preprocessing step?

A. Ignore
B. Log transformation
C. One-hot encoding
D. Remove feature

Answer: B
Rationale: Highly skewed data can hurt model performance and convergence. A log transformation compresses the long tail of right-skewed, positive-valued features and brings the distribution closer to symmetric, which makes patterns easier for many models to learn and improves training stability.
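
Illustration: a minimal sketch of the effect described above, using only the Python standard library. The `amounts` list is made-up data and `skewness` is a hypothetical helper (population form of the third standardized moment); neither comes from the exam guide.

```python
import math

def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed feature (for example, transaction amounts).
amounts = [1, 2, 2, 3, 3, 4, 5, 8, 20, 150, 1200]

# log1p(x) = log(1 + x) stays defined at x = 0, unlike log(x).
log_amounts = [math.log1p(v) for v in amounts]

skew_before = skewness(amounts)
skew_after = skewness(log_amounts)
print(f"skewness before: {skew_before:.2f}, after: {skew_after:.2f}")
```

The transformed feature is still right-skewed, only much less so; features containing negative values would need a different transform.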


2.

A classification dataset is heavily imbalanced. What is the BEST approach?

A. Use accuracy
B. Use F1-score and resampling
C. Ignore imbalance
D. Increase features

Answer: B
Rationale: Accuracy can be misleading for imbalanced datasets. Using F1-score ensures a balance between precision and recall, while resampling techniques help the model learn from minority class examples effectively.
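
Illustration: a small sketch of why accuracy misleads on imbalanced data. The labels and both sets of predictions are invented (5 positives out of 100), and the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95

# Model A never predicts the positive class.
always_negative = [0] * 100

# Model B finds 4 of the 5 positives but raises 6 false alarms.
useful = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

for name, preds in (("always negative", always_negative), ("useful", useful)):
    print(name, accuracy(y_true, preds), round(f1_score(y_true, preds), 3))
```

Model A wins on accuracy while finding no positives at all; F1 ranks the two models the other way around.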


3.

A developer wants to detect anomalies in streaming data. What is BEST?

A. Regression
B. Isolation Forest
C. Classification
D. Clustering

Answer: B
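Rationale: Isolation Forest is designed for anomaly detection: it isolates outliers through random partitioning, and points that are isolated in few splits are flagged as anomalous. It scales well to large datasets. On AWS, the closely related Random Cut Forest algorithm is offered as a SageMaker built-in and is the usual choice for streaming data.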


4.

A model shows high variance. What is the BEST solution?

A. Increase complexity
B. Add regularization
C. Reduce data
D. Ignore

Answer: B
Rationale: High variance indicates overfitting. Regularization techniques such as L1 or L2 penalize model complexity, helping the model generalize better to unseen data while reducing sensitivity to noise.


5.

A developer wants to reduce the dimensionality of high-dimensional data. What is BEST?

A. Add features
B. PCA
C. Ignore
D. Increase epochs

Answer: B
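Rationale: PCA reduces dimensionality by projecting the features onto a smaller set of principal components that retain most of the variance. Strictly, this is feature extraction rather than feature selection: the original features are combined, not dropped. The result is less redundancy and lower computational cost.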


6.

A developer wants to optimize hyperparameters efficiently. What is BEST?

A. Grid search
B. Random search
C. Bayesian optimization
D. Manual tuning

Answer: C
Rationale: Bayesian optimization fits a probabilistic model to the results of earlier trials and uses it to choose the next hyperparameters to try. It typically needs fewer training runs than grid or random search, which matters most when each run is expensive.


7.

A developer wants real-time ML predictions. What is BEST?

A. Batch transform
B. SageMaker endpoint
C. S3
D. EC2

Answer: B
Rationale: SageMaker endpoints provide low-latency, real-time inference capabilities and can scale automatically based on incoming request volume, making them ideal for production applications.


8.

A developer wants batch predictions for large datasets. What is BEST?

A. Endpoint
B. Batch transform
C. EC2
D. Lambda

Answer: B
Rationale: Batch transform is designed for offline processing of large datasets, allowing cost-effective inference without maintaining always-on endpoints.


9.

A developer wants to prevent overfitting. What is BEST?

A. Increase epochs
B. Early stopping
C. Ignore
D. Reduce data

Answer: B
Rationale: Early stopping halts training when validation performance starts to degrade, preventing the model from overfitting to training data and improving generalization.
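
Illustration: a minimal sketch of patience-based early stopping. The validation losses are invented, and `early_stopping_epoch` is a hypothetical helper, not a SageMaker or framework API.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Ran out of epochs without triggering the stop condition.
    return len(val_losses) - 1, best_epoch

# Hypothetical validation curve: improves, then degrades.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=2)
print(f"stop at epoch {stop_epoch}, restore weights from epoch {best_epoch}")
```

In practice the weights from `best_epoch` are restored, since the epochs spent waiting for `patience` to run out are already past the optimum.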


10.

A developer wants to explain model predictions. What is BEST?

A. Ignore
B. SHAP
C. PCA
D. EC2

Answer: B
Rationale: SHAP values provide interpretable insights into how each feature contributes to a model’s predictions, improving transparency and trust in ML systems.


11.

A developer wants to detect data drift. What is BEST?

A. Ignore
B. Compare feature distributions over time
C. Increase features
D. Use EC2

Answer: B
Rationale: Data drift occurs when input distributions change over time. Monitoring feature distributions helps identify drift early, ensuring model performance remains stable in production.
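
Illustration: one common way to compare feature distributions is the Population Stability Index (PSI). The sketch below is an assumption-laden example, not how SageMaker Model Monitor computes drift: the data is synthetic, `psi` is a hypothetical helper, and the equal-width binning over the baseline range is a simplification.

```python
import math
import random

def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    expected = bin_fractions(baseline)
    actual = bin_fractions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))

rng = random.Random(0)
training = [rng.gauss(50, 10) for _ in range(5000)]   # baseline at training time
same_dist = [rng.gauss(50, 10) for _ in range(5000)]  # production, no drift
shifted = [rng.gauss(60, 10) for _ in range(5000)]    # production, mean shifted

psi_stable = psi(training, same_dist)
psi_drifted = psi(training, shifted)
print(f"PSI without drift: {psi_stable:.3f}, with drift: {psi_drifted:.3f}")
```

A frequently quoted rule of thumb treats PSI above roughly 0.25 as a significant shift; the exact cut-off is a convention, not a guarantee.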


12.

A developer wants scalable ML pipelines. What is BEST?

A. Manual
B. SageMaker pipelines
C. EC2
D. S3

Answer: B
Rationale: SageMaker Pipelines automate ML workflows including preprocessing, training, and deployment, enabling reproducibility and scalability across teams.


13.

A developer wants secure data storage. What is BEST?

A. Public S3
B. S3 with encryption (KMS)
C. EC2
D. RDS

Answer: B
Rationale: Encrypting data at rest using KMS ensures compliance and security, protecting sensitive ML datasets from unauthorized access.


14.

A developer wants model monitoring. What is BEST?

A. CloudTrail
B. SageMaker Model Monitor
C. Config
D. Lambda

Answer: B
Rationale: Model Monitor tracks data drift, prediction quality, and anomalies in production, ensuring long-term model reliability.


15.

A developer wants to handle missing values. What is BEST?

A. Ignore
B. Imputation
C. Remove dataset
D. EC2

Answer: B
Rationale: Imputation techniques such as mean, median, or model-based filling preserve dataset size while handling missing values effectively.


16.

A developer wants feature scaling. What is BEST?

A. Ignore
B. Normalize/standardize
C. EC2
D. S3

Answer: B
Rationale: Feature scaling ensures consistent input ranges, improving convergence speed and performance for algorithms like gradient descent and neural networks.


17.

A developer wants CI/CD for ML. What is BEST?

A. Manual
B. CodePipeline
C. EC2
D. S3

Answer: B
Rationale: CodePipeline automates ML deployment workflows, ensuring consistent and repeatable model releases.


18.

A developer needs an evaluation metric for regression. What is BEST?

A. Accuracy
B. RMSE
C. Precision
D. Recall

Answer: B
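Rationale: RMSE is the square root of the mean squared prediction error. It is expressed in the target's units, penalizes large errors more heavily than MAE, and is widely used for evaluating regression models. Accuracy, precision, and recall are classification metrics.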


19.

A developer wants clustering. What is BEST?

A. Regression
B. K-means
C. Classification
D. EC2

Answer: B
Rationale: K-means groups similar data points into clusters, making it suitable for unsupervised learning tasks.


20.

A developer wants a production ML system. What is BEST?

A. Single model
B. End-to-end ML pipeline
C. EC2
D. S3

Answer: B
Rationale: A complete ML pipeline ensures scalability, reproducibility, monitoring, and continuous improvement, which are critical for production-grade machine learning systems.

21.

A model performs poorly due to multicollinearity. What is the BEST solution?

A. Add features
B. Remove correlated features
C. Increase epochs
D. Ignore

Answer: B
Rationale: Multicollinearity occurs when features are highly correlated, making model coefficients unstable and less interpretable. Removing redundant features improves model stability and performance.


22.

A developer wants to improve recall in a fraud detection model. What is BEST?

A. Increase threshold
B. Decrease threshold
C. Ignore
D. Remove data

Answer: B
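Rationale: Lowering the classification threshold labels more cases as positive, which raises recall by capturing more true positives. The cost is more false positives (lower precision), a trade-off that is usually acceptable in fraud detection, where missed fraud tends to cost more than an extra review.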
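
Illustration: a short sketch of the threshold trade-off. The labels and scores are invented, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical fraud scores: the first four transactions are fraudulent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.92, 0.71, 0.48, 0.35, 0.55, 0.40, 0.30, 0.22, 0.15, 0.05]

for threshold in (0.7, 0.5, 0.3):
    p, r = precision_recall(y_true, scores, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Sweeping the threshold across all score values and plotting the resulting pairs gives the precision-recall curve.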


23.

A dataset has outliers affecting model performance. What is BEST?

A. Ignore
B. Remove or cap outliers
C. Increase features
D. EC2

Answer: B
Rationale: Outliers can distort statistical relationships and model predictions. Removing or capping them improves robustness and model accuracy.


24.

A developer wants distributed training for large datasets. What is BEST?

A. Single instance
B. SageMaker distributed training
C. S3
D. Lambda

Answer: B
Rationale: Distributed training allows models to be trained across multiple instances, reducing training time and enabling handling of large datasets efficiently.


25.

A developer wants to reduce training time. What is BEST?

A. Increase data
B. Use GPU instances
C. Ignore
D. Remove model

Answer: B
Rationale: GPUs accelerate matrix computations used in ML training, significantly reducing training time for deep learning and large models.


26.

A model suffers from underfitting. What is BEST?

A. Reduce complexity
B. Increase complexity
C. Remove features
D. Ignore

Answer: B
Rationale: Underfitting occurs when a model is too simple to capture patterns. Increasing complexity or adding features helps improve learning.


27.

A developer needs an evaluation metric for ranking. What is BEST?

A. Accuracy
B. NDCG
C. RMSE
D. MAE

Answer: B
Rationale: NDCG evaluates ranking quality by considering both relevance and position, making it ideal for recommendation systems and search ranking tasks.
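
Illustration: a minimal NDCG sketch. It assumes the linear-gain form (relevance divided by log2 of position + 1); another common variant uses a gain of 2^relevance - 1. The relevance grades are invented.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    # rank is 0-based, so the discount for the first position is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance grades of the returned items, in ranked order.
good_ranking = [3, 2, 1, 0]  # most relevant item first
poor_ranking = [0, 1, 2, 3]  # most relevant item last

ndcg_good = ndcg_at_k(good_ranking, k=4)
ndcg_poor = ndcg_at_k(poor_ranking, k=4)
print(f"NDCG@4 good: {ndcg_good:.3f}, poor: {ndcg_poor:.3f}")
```

Both lists contain the same items, so a position-blind metric would score them equally; NDCG separates them because relevant items placed lower are discounted.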


28.

A developer wants to reduce variance. What is BEST?

A. Increase complexity
B. Regularization or more data
C. Ignore
D. Remove features

Answer: B
Rationale: High variance indicates overfitting. Regularization or adding more training data helps improve generalization and reduce sensitivity to noise.


29.

A developer wants to monitor concept drift. What is BEST?

A. Ignore
B. Monitor prediction distributions over time
C. Increase features
D. EC2

Answer: B
Rationale: Concept drift occurs when the relationship between features and labels changes. It shows up most clearly in accuracy trends once ground-truth labels arrive; shifts in the prediction distribution serve as an earlier proxy while labels are delayed.


30.

A developer wants feature importance for tree-based models. What is BEST?

A. Ignore
B. Built-in feature importance
C. EC2
D. S3

Answer: B
Rationale: Tree-based models like Random Forest provide built-in feature importance scores, helping identify key predictors.


31.

A developer wants scalable inference. What is BEST?

A. EC2
B. SageMaker endpoint autoscaling
C. S3
D. Lambda

Answer: B
Rationale: Autoscaling endpoints dynamically adjust resources based on traffic, ensuring consistent performance.


32.

A developer wants batch ML workflow automation. What is BEST?

A. Manual
B. Step Functions or pipelines
C. EC2
D. S3

Answer: B
Rationale: Workflow orchestration automates batch processes and ensures reliability.


33.

A developer wants model explainability for compliance. What is BEST?

A. Ignore
B. SHAP or LIME
C. EC2
D. S3

Answer: B
Rationale: SHAP and LIME provide interpretable explanations required for regulatory compliance.


34.

A developer wants to reduce latency for inference. What is BEST?

A. Increase model size
B. Optimize model or use smaller model
C. EC2
D. S3

Answer: B
Rationale: Smaller models or optimized architectures reduce inference latency and improve performance in real-time applications.


35.

A developer wants to handle categorical variables with many categories. What is BEST?

A. One-hot encoding
B. Target encoding
C. Ignore
D. EC2

Answer: B
Rationale: Target encoding reduces dimensionality compared to one-hot encoding and improves performance for high-cardinality categorical features.


36.

A developer wants to detect anomalies in logs. What is BEST?

A. Regression
B. Unsupervised learning
C. Classification
D. EC2

Answer: B
Rationale: Unsupervised models identify unusual patterns without labeled data, making them ideal for anomaly detection.


37.

A developer wants evaluation for imbalanced data. What is BEST?

A. Accuracy
B. Precision/Recall or F1
C. RMSE
D. MAE

Answer: B
Rationale: Precision, recall, and F1-score provide better insights into performance when class distributions are imbalanced.


38.

A developer wants to improve model robustness. What is BEST?

A. Ignore
B. Cross-validation
C. EC2
D. S3

Answer: B
Rationale: Cross-validation evaluates the model on several train/validation splits, revealing whether performance holds up across different subsets of the data or depends on one lucky split.


39.

A developer wants feature drift detection. What is BEST?

A. Ignore
B. Compare feature distributions over time
C. EC2
D. S3

Answer: B
Rationale: Monitoring changes in feature distributions helps identify drift and maintain model performance.


40.

A developer wants a production ML system. What is BEST?

A. Single model
B. End-to-end pipeline with monitoring
C. EC2
D. S3

Answer: B
Rationale: A full ML pipeline ensures scalability, monitoring, retraining, and continuous improvement, which are essential for production-grade machine learning systems.

41.

A model’s training loss decreases but validation loss increases. What is the issue?

A. Underfitting
B. Overfitting
C. Data leakage
D. EC2

Answer: B
Rationale: This pattern clearly indicates overfitting, where the model memorizes training data instead of learning general patterns. Regularization, early stopping, or more data can help improve generalization.


42.

A developer wants to handle missing categorical values. What is BEST?

A. Remove rows
B. Use “Unknown” category
C. Ignore
D. EC2

Answer: B
Rationale: Assigning a separate category for missing values preserves data and avoids bias introduced by removing rows, especially when missingness itself may carry information.


43.

A developer wants to reduce dimensionality while preserving interpretability. What is BEST?

A. PCA
B. Feature selection
C. Ignore
D. Increase features

Answer: B
Rationale: PCA reduces dimensions but sacrifices interpretability. Feature selection keeps original features, making models easier to explain while reducing complexity.


44.

A developer wants to train deep learning models faster. What is BEST?

A. CPU
B. GPU or distributed training
C. Ignore
D. S3

Answer: B
Rationale: GPUs accelerate matrix computations and distributed training allows parallel processing, significantly reducing training time for large deep learning models.


45.

A developer wants to prevent data leakage in preprocessing. What is BEST?

A. Normalize before split
B. Fit preprocessing only on training data
C. Ignore
D. EC2

Answer: B
Rationale: Preprocessing steps like scaling must be fitted only on training data to avoid leaking information from validation/test sets into training.
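
Illustration: a minimal sketch of fitting a standardizer on training data only. The values and helper names are hypothetical; the "leaky" statistics are computed only for contrast.

```python
import statistics

def fit_standardizer(train_values):
    """Learn mean and standard deviation from the training split only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    return mean, std

def transform(values, mean, std):
    """Apply previously fitted statistics to any split."""
    return [(v - mean) / std for v in values]

train = [10.0, 12.0, 14.0, 16.0, 18.0]
test = [40.0, 42.0]  # held-out data with a different range

# Correct: statistics come from the training split, then get reused.
mean, std = fit_standardizer(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)

# Leaky: fitting on train + test lets test values shape the training features.
leaky_mean, leaky_std = fit_standardizer(train + test)

print(f"train-only mean: {mean:.2f}, leaky mean: {leaky_mean:.2f}")
```

The same rule applies to imputation values, encoders, and PCA: fit on the training split, then apply the fitted transform everywhere else.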


46.

A developer wants to evaluate classification threshold impact. What is BEST?

A. Accuracy
B. Precision-recall curve
C. RMSE
D. MAE

Answer: B
Rationale: Precision-recall curves show trade-offs across thresholds, making them ideal for tuning classification performance, especially in imbalanced datasets.


47.

A developer wants to improve training stability. What is BEST?

A. Increase learning rate
B. Normalize features
C. Ignore
D. Remove data

Answer: B
Rationale: Feature normalization ensures consistent input ranges, improving convergence stability and reducing training oscillations.


48.

A developer wants to handle high-cardinality categorical features. What is BEST?

A. One-hot encoding
B. Target encoding
C. Ignore
D. EC2

Answer: B
Rationale: One-hot encoding becomes inefficient with many categories. Target encoding reduces dimensionality while preserving predictive signal.


49.

A developer wants to monitor prediction quality in production. What is BEST?

A. CloudTrail
B. SageMaker Model Monitor
C. Config
D. Lambda

Answer: B
Rationale: Model Monitor runs scheduled monitoring jobs against captured endpoint traffic, checking data quality, model quality (when ground-truth labels are supplied), and drift, so regressions surface after deployment.


50.

A developer wants to retrain models automatically. What is BEST?

A. Manual
B. Scheduled pipeline
C. Ignore
D. EC2

Answer: B
Rationale: Automated retraining pipelines ensure models stay updated with new data and adapt to changing patterns.


51.

A developer wants scalable feature storage. What is BEST?

A. S3
B. SageMaker Feature Store
C. EC2
D. RDS

Answer: B
Rationale: Feature Store ensures consistent feature usage across training and inference, improving reproducibility and scalability.


52.

A developer wants to reduce inference cost. What is BEST?

A. Larger model
B. Smaller optimized model
C. Ignore
D. EC2

Answer: B
Rationale: Smaller or optimized models reduce compute usage, lowering inference costs while maintaining acceptable accuracy.


53.

A developer wants to detect label drift. What is BEST?

A. Ignore
B. Monitor prediction vs actual labels
C. EC2
D. S3

Answer: B
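Rationale: Label drift occurs when the distribution of the target labels shifts over time (for example, the fraud rate doubles). Comparing predictions with actual labels as ground truth arrives, and tracking the label distribution against the training baseline, helps detect it.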


54.

A developer wants to improve model generalization. What is BEST?

A. Overfit
B. Cross-validation
C. Ignore
D. Remove data

Answer: B
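Rationale: Cross-validation does not change the model by itself, but it gives a more reliable estimate of generalization than a single split, which guides model selection and tuning toward models that generalize.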


55.

A developer wants ensemble learning. What is BEST?

A. Single model
B. Combine multiple models
C. Ignore
D. EC2

Answer: B
Rationale: Ensembles combine predictions from multiple models, improving accuracy and robustness.


56.

A developer wants feature interaction detection. What is BEST?

A. Ignore
B. Tree-based models
C. EC2
D. S3

Answer: B
Rationale: Tree-based models automatically capture feature interactions, improving predictive performance.


57.

A developer wants to optimize memory usage during training. What is BEST?

A. Increase batch size
B. Reduce batch size
C. Ignore
D. EC2

Answer: B
Rationale: Smaller batch sizes reduce memory consumption, enabling training on limited resources.


58.

A developer wants distributed inference. What is BEST?

A. Single instance
B. Auto-scaling endpoints
C. Ignore
D. S3

Answer: B
Rationale: Auto-scaling endpoints distribute requests across instances, ensuring high throughput and availability.


59.

A developer wants reproducible experiments. What is BEST?

A. Manual
B. Fix random seeds + track configs
C. Ignore
D. EC2

Answer: B
Rationale: Fixing seeds and tracking configurations ensures consistent results across experiments.


60.

A developer wants a production-grade ML system. What is BEST?

A. Single script
B. End-to-end pipeline with monitoring and retraining
C. EC2
D. S3

Answer: B
Rationale: A full ML lifecycle pipeline ensures scalability, monitoring, and continuous improvement, which are essential for production environments.

61.

A developer accidentally uses test data during feature scaling. What is the issue?

A. Overfitting
B. Data leakage
C. Underfitting
D. EC2

Answer: B
Rationale: Using test data in preprocessing leaks information into training, resulting in overly optimistic evaluation results and poor real-world performance. All transformations must be fit only on training data.


62.

A model has high bias and low variance. What is the BEST action?

A. Increase regularization
B. Increase model complexity
C. Reduce data
D. Ignore

Answer: B
Rationale: High bias indicates underfitting. Increasing model complexity or adding features allows the model to capture more patterns and improve accuracy.


63.

A developer wants to evaluate ranking in recommendation systems. What is BEST?

A. Accuracy
B. Precision@K or NDCG
C. RMSE
D. MAE

Answer: B
Rationale: Ranking metrics like Precision@K and NDCG evaluate relevance and order of results, making them ideal for recommendation systems.


64.

A developer wants to reduce training time for large datasets. What is BEST?

A. Increase features
B. Distributed training
C. Ignore
D. S3

Answer: B
Rationale: Distributed training splits workloads across multiple nodes, significantly reducing training time and enabling scalability for large datasets.


65.

A model shows stable accuracy but declining business KPIs. What is the issue?

A. Overfitting
B. Concept drift
C. Underfitting
D. EC2

Answer: B
Rationale: Concept drift occurs when the relationship between inputs and outputs changes. Accuracy measured on a fixed or stale evaluation set can stay flat while live predictions become less relevant, so the decline appears in business KPIs first.


66.

A developer wants to improve model robustness to noise. What is BEST?

A. Ignore
B. Regularization + data augmentation
C. Increase epochs
D. EC2

Answer: B
Rationale: Regularization reduces overfitting, and data augmentation exposes the model to variations, improving robustness to noise.


67.

A developer wants to detect subtle anomalies in high-dimensional data. What is BEST?

A. Regression
B. Autoencoder
C. Classification
D. EC2

Answer: B
Rationale: Autoencoders learn compressed representations and can detect anomalies based on reconstruction error, making them suitable for high-dimensional anomaly detection.


68.

A developer wants to monitor feature drift in production. What is BEST?

A. Ignore
B. Compare statistical distributions over time
C. Increase features
D. EC2

Answer: B
Rationale: Monitoring statistical changes in features helps identify drift and maintain model performance.


69.

A developer wants to optimize hyperparameters under budget constraints. What is BEST?

A. Grid search
B. Bayesian optimization
C. Ignore
D. Manual

Answer: B
Rationale: Bayesian optimization efficiently explores the search space, reducing computation cost compared to exhaustive methods.


70.

A developer wants to reduce inference latency. What is BEST?

A. Larger model
B. Model quantization or pruning
C. Ignore
D. EC2

Answer: B
Rationale: Quantization and pruning reduce model size and computation, improving inference speed, usually with only a small accuracy loss that should be verified on a validation set.


71.

A developer wants to ensure reproducibility. What is BEST?

A. Ignore
B. Fix seeds and track configurations
C. Increase features
D. EC2

Answer: B
Rationale: Fixing random seeds and tracking configurations ensures consistent results across runs, which is critical for debugging and auditing.


72.

A developer wants to reduce variance without increasing bias too much. What is BEST?

A. Remove data
B. Ensemble methods
C. Ignore
D. Increase epochs

Answer: B
Rationale: Ensemble methods combine multiple models to reduce variance while maintaining predictive power.


73.

A developer wants to handle class imbalance in deep learning. What is BEST?

A. Ignore
B. Weighted loss function
C. Increase features
D. EC2

Answer: B
Rationale: Weighted loss functions penalize misclassification of minority classes more heavily, improving performance on imbalanced datasets.


74.

A developer wants to improve interpretability of complex models. What is BEST?

A. Ignore
B. SHAP or LIME
C. Increase features
D. EC2

Answer: B
Rationale: SHAP and LIME provide explanations for model predictions, improving transparency and trust.


75.

A developer wants efficient feature storage for ML pipelines. What is BEST?

A. S3
B. Feature Store
C. EC2
D. RDS

Answer: B
Rationale: Feature Store ensures consistency between training and inference, improving reproducibility and scalability.


76.

A developer wants automated retraining based on drift detection. What is BEST?

A. Manual
B. Event-driven pipeline
C. Ignore
D. EC2

Answer: B
Rationale: Event-driven pipelines trigger retraining automatically when drift is detected, ensuring models remain accurate over time.


77.

A developer wants to reduce memory usage during training. What is BEST?

A. Increase batch size
B. Reduce batch size
C. Ignore
D. EC2

Answer: B
Rationale: Smaller batch sizes require less memory, making training feasible on limited resources.


78.

A developer wants to improve generalization across datasets. What is BEST?

A. Overfit
B. Cross-validation
C. Ignore
D. Remove features

Answer: B
Rationale: Cross-validation checks that performance is consistent across different data splits, giving a more trustworthy estimate of generalization on which to base model choices.


79.

A developer wants scalable inference for global users. What is BEST?

A. Single instance
B. Auto-scaling endpoints + multi-region deployment
C. Ignore
D. S3

Answer: B
Rationale: Auto-scaling and multi-region deployments ensure low latency and high availability for global users.


80.

A developer wants full production ML lifecycle. What is BEST?

A. Single model
B. End-to-end pipeline with monitoring, retraining, and CI/CD
C. EC2
D. S3

Answer: B
Rationale: A complete ML lifecycle pipeline ensures scalability, monitoring, retraining, and continuous improvement, which are essential for production-grade systems.

81.

A model’s ROC-AUC is high, but precision is low for the positive class. What is the BEST action?

A. Increase threshold
B. Decrease threshold
C. Optimize for precision-recall trade-off
D. Ignore

Answer: C
Rationale: ROC-AUC can be misleading with class imbalance. Optimizing the precision-recall trade-off (e.g., tuning threshold or using PR curves) targets performance on the positive class, improving practical utility.


82.

A time-series model leaks seasonality from future periods during feature engineering. What is the issue?

A. Overfitting
B. Data leakage
C. Underfitting
D. EC2

Answer: B
Rationale: Using future-derived features (e.g., rolling stats computed with future windows) leaks information, inflating validation performance. Features must be computed using only past data at prediction time.


83.

A developer wants to reduce cold-start latency for real-time endpoints. What is BEST?

A. Larger instances only
B. Provisioned concurrency / warm pools
C. Ignore
D. S3

Answer: B
Rationale: Keeping capacity initialized avoids cold starts and reduces tail latency for sporadic traffic. In SageMaker, this means provisioned concurrency on serverless endpoints, or a nonzero minimum instance count on auto-scaled real-time endpoints.


84.

A model trained on historical data degrades after a pricing policy change. What is the issue?

A. Label noise
B. Concept drift
C. Data imbalance
D. EC2

Answer: B
Rationale: Policy changes alter the relationship between inputs and targets. Concept drift requires monitoring and retraining with recent data reflecting the new regime.


85.

A developer wants to compare models fairly across datasets of different scales. What is BEST?

A. Accuracy
B. Normalized metrics (e.g., R², MAPE)
C. RMSE only
D. Ignore

Answer: B
Rationale: Scale-dependent metrics like RMSE aren’t comparable across datasets. Normalized metrics enable fair comparison and better model selection decisions.
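
Illustration: a small sketch showing that RMSE depends on the target's units while MAPE does not. The price data is invented: the second dataset is the first one multiplied by 1000, so relative errors are identical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined when a true value is 0."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Every prediction is off by exactly 10% of the true value.
small_true = [100.0, 200.0, 400.0]
small_pred = [110.0, 180.0, 440.0]

# Same data expressed in units 1000 times smaller.
large_true = [v * 1000 for v in small_true]
large_pred = [v * 1000 for v in small_pred]

print("RMSE:", rmse(small_true, small_pred), rmse(large_true, large_pred))
print("MAPE:", mape(small_true, small_pred), mape(large_true, large_pred))
```

RMSE grows by the same factor of 1000 as the data, while MAPE stays at 10%, which is why only the latter supports comparison across datasets with different scales.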


86.

A model suffers from exploding gradients during training. What is BEST?

A. Increase learning rate
B. Gradient clipping
C. Add features
D. Ignore

Answer: B
Rationale: Gradient clipping caps gradient norms, stabilizing training and preventing divergence, especially in deep or recurrent networks.
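
Illustration: a minimal sketch of clipping by global norm on a flat list of gradient values. `clip_by_global_norm` is a hypothetical helper written for this example; deep learning frameworks ship their own equivalents.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their L2 norm is at most max_norm.

    Returns (clipped_grads, norm_before_clipping).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    # One shared scale factor keeps the gradient direction unchanged.
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

grads = [30.0, -40.0]  # L2 norm is 50
clipped, norm_before = clip_by_global_norm(grads, max_norm=5.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(f"norm before: {norm_before}, after: {norm_after:.4f}, clipped: {clipped}")
```

Because every component is scaled by the same factor, the update still points in the same direction; only the step size is capped.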


87.

A developer needs low-latency inference with minimal accuracy loss. What is BEST?

A. Larger model
B. Quantization and distillation
C. Ignore
D. EC2

Answer: B
Rationale: Quantization reduces precision and distillation transfers knowledge to smaller models, lowering latency and cost while preserving most accuracy.


88.

A dataset has severe class imbalance and rare positives are critical. Which loss function is BEST?

A. MSE
B. Weighted cross-entropy / focal loss
C. Accuracy
D. MAE

Answer: B
Rationale: Weighted losses or focal loss emphasize hard/rare examples, improving recall on minority classes crucial for tasks like fraud detection.
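
Illustration: a sketch of both losses for a single binary prediction. It assumes the usual focal-loss form, -alpha_t * (1 - p_t)^gamma * log(p_t); the `pos_weight`, `gamma`, and `alpha` values are arbitrary example settings, not recommendations.

```python
import math

def weighted_bce(p, y, pos_weight=1.0):
    """Binary cross-entropy for one example; positives are scaled by pos_weight.

    p is the predicted probability of the positive class (0 < p < 1).
    """
    return -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one example: down-weights well-classified examples."""
    p_t = p if y == 1 else 1 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1 - alpha
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)

# A positive example the model gets right (easy) and one it misses (hard).
easy_p, hard_p = 0.9, 0.1

for name, p in (("easy", easy_p), ("hard", hard_p)):
    print(name,
          f"bce={weighted_bce(p, 1):.4f}",
          f"weighted={weighted_bce(p, 1, pos_weight=10.0):.4f}",
          f"focal={focal_loss(p, 1):.4f}")
```

Class weighting scales every positive example by the same factor, whereas the focal term shrinks the loss of easy examples far more than that of hard ones.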


89.

A developer wants to ensure training/serving skew is minimized. What is BEST?

A. Separate pipelines
B. Shared feature definitions via Feature Store
C. Ignore
D. EC2

Answer: B
Rationale: Using a centralized Feature Store ensures identical transformations for training and inference, preventing skew and improving consistency.


90.

A model’s performance varies significantly across data slices (e.g., regions). What is BEST?

A. Ignore
B. Slice-based evaluation and mitigation
C. Increase epochs
D. EC2

Answer: B
Rationale: Evaluating by slices uncovers bias or subgroup issues. Mitigation may include rebalancing, separate models, or feature adjustments to ensure fairness and robustness.


91.

A developer wants faster experimentation under tight budgets. What is BEST?

A. Grid search
B. Early-stopping + successive halving
C. Ignore
D. EC2

Answer: B
Rationale: Successive halving/Hyperband with early stopping allocates resources efficiently, discarding poor configs early and reducing compute costs.


92.

A model uses high-cardinality categorical features with leakage risk. What is BEST?

A. One-hot encode all
B. Target encoding with CV folds
C. Ignore
D. EC2

Answer: B
Rationale: Target encoding can leak labels; applying it within cross-validation folds prevents leakage while handling high cardinality efficiently.
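
Illustration: a minimal sketch of out-of-fold target encoding. The fold assignment (row index modulo `n_folds`) and the fallback to the training-fold mean for unseen categories are simplifying assumptions; the data and the helper name are hypothetical.

```python
from collections import defaultdict

def out_of_fold_target_encode(categories, targets, n_folds=3):
    """Encode each row with the target mean of its category,
    computed only from rows in the other folds."""
    n = len(categories)
    fold_of = [i % n_folds for i in range(n)]  # simple deterministic folds
    encoded = [0.0] * n
    for fold in range(n_folds):
        train_idx = [i for i in range(n) if fold_of[i] != fold]
        # Fallback for categories absent from the training folds.
        prior = sum(targets[i] for i in train_idx) / len(train_idx)
        sums = defaultdict(float)
        counts = defaultdict(int)
        for i in train_idx:
            sums[categories[i]] += targets[i]
            counts[categories[i]] += 1
        for i in range(n):
            if fold_of[i] == fold:
                c = categories[i]
                encoded[i] = sums[c] / counts[c] if counts[c] else prior
    return encoded

categories = ["a", "a", "a", "b", "b", "c"]
targets = [1, 0, 1, 0, 0, 1]

encoded = out_of_fold_target_encode(categories, targets, n_folds=3)
print(encoded)
```

Category "c" appears once, so a naive per-category mean would encode that row with its own label; the out-of-fold version never sees the row's own target.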


93.

A developer needs consistent offline/online metrics alignment. What is BEST?

A. Different metrics
B. Mirror online KPIs in offline evaluation
C. Ignore
D. EC2

Answer: B
Rationale: Aligning offline metrics with business KPIs ensures improvements translate to real-world impact, avoiding misleading offline gains.


94.

A streaming pipeline requires near-real-time feature computation. What is BEST?

A. Batch-only
B. Streaming features with low-latency store
C. Ignore
D. S3

Answer: B
Rationale: Streaming feature pipelines (e.g., incremental aggregates) enable timely predictions and reduce staleness for real-time use cases.


95.

A model exhibits calibration issues (overconfident probabilities). What is BEST?

A. Ignore
B. Platt scaling or isotonic regression
C. Increase epochs
D. EC2

Answer: B
Rationale: Calibration methods adjust predicted probabilities to better reflect true likelihoods, improving decision thresholds and downstream business logic.


96.

A developer wants to detect silent failures post-deployment. What is BEST?

A. CloudTrail only
B. Canary releases + shadow testing
C. Ignore
D. EC2

Answer: B
Rationale: Canary and shadow deployments compare new vs baseline behavior safely, catching regressions before full rollout.


97.

A large NLP model exceeds memory during training. What is BEST?

A. Increase batch size
B. Gradient accumulation / mixed precision
C. Ignore
D. S3

Answer: B
Rationale: Gradient accumulation simulates larger batches without extra memory, and mixed precision reduces memory footprint and speeds training.
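
Illustration: a sketch of gradient accumulation on a one-parameter linear model with mean squared error. The data, the weight `w`, and the helper are invented; the equivalence shown relies on the micro-batches being of equal size.

```python
def mse_gradient(w, batch):
    """Gradient of mean squared error w.r.t. w for the model y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

# Hypothetical (x, y) pairs and current weight.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

# What a single large batch would compute (needs all 4 examples in memory).
full_batch_grad = mse_gradient(w, data)

# Gradient accumulation: process 2 examples at a time, then sum.
micro_batches = [data[0:2], data[2:4]]
accumulated = 0.0
for micro in micro_batches:
    # Dividing by the number of micro-batches makes the sum equal the
    # full-batch mean gradient.
    accumulated += mse_gradient(w, micro) / len(micro_batches)

print(f"full batch: {full_batch_grad:.4f}, accumulated: {accumulated:.4f}")
```

Only one micro-batch needs to be held in memory at a time, and the optimizer step is taken once after the loop, so the update matches the larger batch.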


98.

A developer wants robust model selection under noise. What is BEST?

A. Single split
B. Repeated cross-validation
C. Ignore
D. EC2

Answer: B
Rationale: Repeated CV reduces variance in estimates, yielding more reliable model comparisons in noisy datasets.


99.

A model’s features shift seasonally causing periodic errors. What is BEST?

A. Ignore
B. Seasonal features + periodic retraining
C. Increase epochs
D. EC2

Answer: B
Rationale: Incorporating seasonal indicators and scheduling retraining aligns the model with recurring patterns, stabilizing performance.


100.

A developer needs end-to-end governance for ML in production. What is BEST?

A. Ad-hoc scripts
B. Versioned pipelines + monitoring + audit logs
C. Ignore
D. EC2

Answer: B
Rationale: Governance requires versioning, lineage, monitoring, and auditing to ensure reproducibility, compliance, and reliable operations at scale.