AWS Certified Machine Learning - Specialty (MLS-C01) Exam Prep

Exam Name	MLS-C01 Practice Exam – AWS Certified Machine Learning Specialty (2026 Updated)
Exam Provider	Amazon Web Services (AWS)
Certification Type	Specialty-Level Certification (Advanced Machine Learning, Data Engineering, Modeling & Deployment on AWS)
Total Practice Questions	100 Advanced MCQs (Scenario-Based + Feature Engineering + Modeling + Evaluation + Deployment + MLOps)
Exam Domains Covered	• Data Engineering (data collection, transformation, feature engineering) • Exploratory Data Analysis (EDA, visualization, feature selection) • Modeling (classification, regression, clustering, deep learning) • Evaluation (metrics, bias-variance trade-off, cross-validation) • Deployment (real-time endpoints, batch inference, CI/CD pipelines) • Monitoring & Optimization (drift detection, retraining, performance tuning) • AWS ML Services (SageMaker, Feature Store, Model Monitor, pipelines)
Questions in Real Exam	• Total: ~65 Questions • Scenario-heavy with real-world ML workflows • Focus on advanced ML concepts and AWS integration
Exam Duration	• Total Time: 180 Minutes • Complex problem-solving questions requiring deep ML knowledge • Emphasis on practical application and architecture decisions
Passing Score	• Scaled Score: 750 / 1000 • Requires strong ML theory and AWS service expertise • Focus on real-world production ML systems
Question Format	• Multiple Choice & Multiple Response • Scenario-Based ML Problem Solving • Feature Engineering & Data Processing Cases • Model Evaluation & Optimization Questions • Deployment & Monitoring Scenarios
Difficulty Level	Advanced to Expert (Specialty-Level + Production ML Scenarios)
Key Knowledge Areas	• Advanced feature engineering (encoding, scaling, dimensionality reduction) • Model selection and tuning (hyperparameters, ensembles, deep learning) • Evaluation metrics (precision, recall, F1, ROC-AUC, RMSE, NDCG) • Bias-variance trade-off and overfitting/underfitting handling • Deployment strategies (SageMaker endpoints, batch transform, CI/CD) • Monitoring (data drift, concept drift, model performance tracking) • Distributed training and optimization (GPU, parallelization) • MLOps practices (pipelines, versioning, reproducibility)
Common Exam Traps	• Data leakage during preprocessing or feature engineering • Using incorrect evaluation metrics for imbalanced datasets • Ignoring feature scaling or transformation requirements • Misinterpreting precision vs recall trade-offs • Overfitting due to excessive model complexity • Not considering drift detection and retraining strategies • Choosing incorrect deployment method (real-time vs batch) • Ignoring cost and latency trade-offs in production ML
Skills Developed	• Designing scalable end-to-end ML pipelines • Performing advanced feature engineering and data preparation • Selecting and tuning models for optimal performance • Evaluating models using appropriate metrics and validation strategies • Deploying ML models in production with AWS services • Monitoring, maintaining, and improving ML systems over time
Study Strategy	• Focus on real-world ML scenarios and decision-making • Practice feature engineering and data preprocessing techniques • Understand evaluation metrics and when to use them • Learn SageMaker services and distributed training concepts • Study model tuning, regularization, and cross-validation • Analyze production ML systems and failure scenarios • Take timed mock exams and review detailed explanations
Best For	• Machine learning engineers and data scientists • AI/ML specialists working with AWS • Cloud engineers implementing ML pipelines • Professionals preparing for advanced ML certifications
Career Benefits	• Validates advanced machine learning expertise on AWS • Opens roles in ML engineering, data science, and AI architecture • Enhances skills in production ML and MLOps • Increases earning potential in AI-driven industries • Recognized as one of the most advanced AWS certifications
Updated	2026 Latest Version – Based on AWS MLS-C01 Exam Guide & Real Exam Patterns

1.

A dataset contains highly skewed numerical features. What is the BEST preprocessing step?

A. Ignore
B. Log transformation
C. One-hot encoding
D. Remove feature

Answer: B
Rationale: Highly skewed features can hurt model performance and slow convergence. A log transformation compresses the long tail of a right-skewed, positive-valued feature, bringing its distribution closer to symmetric so that patterns are easier for models to learn.
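
The effect of a log transformation on skew can be sketched with the standard library alone. The `amounts` values and the `skewness` helper below are hypothetical, chosen only for illustration; `log1p` is used instead of a plain log so that zero values would remain valid.

```python
import math

def skewness(values):
    """Population (Fisher-Pearson) skewness of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed feature (for example, transaction amounts).
amounts = [1, 2, 2, 3, 3, 4, 5, 8, 20, 150, 900]

# log1p(x) = log(1 + x) is defined at zero, unlike log(x).
log_amounts = [math.log1p(a) for a in amounts]

raw_skew = skewness(amounts)      # strongly positive
log_skew = skewness(log_amounts)  # still positive, but much smaller
```

The transformed feature is still right-skewed, only less so; a log transformation is not a guarantee of normality.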

2.

A classification dataset is heavily imbalanced. What is the BEST approach?

A. Use accuracy
B. Use F1-score and resampling
C. Ignore imbalance
D. Increase features

Answer: B
Rationale: Accuracy can be misleading for imbalanced datasets. Using F1-score ensures a balance between precision and recall, while resampling techniques help the model learn from minority class examples effectively.
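
A small sketch of why accuracy misleads here, assuming a hypothetical dataset with 5% positives and a baseline that always predicts the majority class:

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # baseline that never predicts the positive class

baseline_acc = accuracy(y_true, y_pred)  # high, despite finding no positives
baseline_f1 = f1_score(y_true, y_pred)   # zero, which exposes the problem
```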

3.

A developer wants to detect anomalies in streaming data. What is BEST?

A. Regression
B. Isolation Forest
C. Classification
D. Clustering

Answer: B
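Rationale: Isolation Forest is designed for anomaly detection: it isolates points through random partitioning, and anomalies need fewer splits to isolate than normal points. It scales well to large datasets; on AWS, the related Random Cut Forest algorithm available in SageMaker is built with streaming data in mind.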

4.

A model shows high variance. What is the BEST solution?

A. Increase complexity
B. Add regularization
C. Reduce data
D. Ignore

Answer: B
Rationale: High variance indicates overfitting. Regularization techniques such as L1 or L2 penalize model complexity, helping the model generalize better to unseen data while reducing sensitivity to noise.

5.

A developer wants to reduce the dimensionality of high-dimensional data. What is BEST?

A. Add features
B. PCA
C. Ignore
D. Increase epochs

Answer: B
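Rationale: PCA reduces dimensionality by projecting features onto a smaller set of principal components that retain most of the variance. This removes redundancy among correlated features, which can improve performance and reduce computational cost. Note that PCA creates new combined features rather than selecting a subset of the original ones.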

6.

A developer wants to optimize hyperparameters efficiently. What is BEST?

A. Grid search
B. Random search
C. Bayesian optimization
D. Manual tuning

Answer: C
Rationale: Bayesian optimization intelligently explores the hyperparameter space using probabilistic models, making it more efficient than grid or random search, especially for complex ML models.

7.

A developer wants real-time ML predictions. What is BEST?

A. Batch transform
B. SageMaker endpoint
C. S3
D. EC2

Answer: B
Rationale: SageMaker endpoints provide low-latency, real-time inference and, with auto scaling configured, adjust capacity to incoming request volume, making them well suited to production applications.

8.

A developer wants batch predictions for large datasets. What is BEST?

A. Endpoint
B. Batch transform
C. EC2
D. Lambda

Answer: B
Rationale: Batch transform is designed for offline processing of large datasets, allowing cost-effective inference without maintaining always-on endpoints.

9.

A developer wants to prevent overfitting. What is BEST?

A. Increase epochs
B. Early stopping
C. Ignore
D. Reduce data

Answer: B
Rationale: Early stopping halts training when validation performance starts to degrade, preventing the model from overfitting to training data and improving generalization.
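
The stopping rule can be sketched as a patience counter over validation losses. The loss values and the `patience` setting below are hypothetical; in a real training loop the losses would be computed one epoch at a time and the weights from `best_epoch` would be restored.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Ran out of epochs before the patience limit was hit.
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses: improving, then degrading.
val_losses = [0.90, 0.70, 0.60, 0.58, 0.61, 0.65, 0.72]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=2)
```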

10.

A developer wants to explain model predictions. What is BEST?

A. Ignore
B. SHAP
C. PCA
D. EC2

Answer: B
Rationale: SHAP values provide interpretable insights into how each feature contributes to a model’s predictions, improving transparency and trust in ML systems.

11.

A developer wants to detect data drift. What is BEST?

A. Ignore
B. Compare feature distributions over time
C. Increase features
D. Use EC2

Answer: B
Rationale: Data drift occurs when input distributions change over time. Monitoring feature distributions helps identify drift early, ensuring model performance remains stable in production.
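
One way to compare feature distributions over time is the Population Stability Index (PSI). The rationale does not name a specific statistic, so PSI is an assumption here, as are the sample values, the bin edges, and the 0.25 cutoff (a common rule of thumb, not a fixed standard).

```python
import math

def bin_proportions(values, edges, eps=1e-6):
    """Share of values in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    # eps avoids log(0) when a bin is empty.
    return [max(c / total, eps) for c in counts]

def psi(baseline, current, edges):
    """Population Stability Index between two samples over shared bins."""
    p = bin_proportions(baseline, edges)
    q = bin_proportions(current, edges)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Hypothetical feature values at training time and in production.
edges = [0, 20, 40, 60]
baseline = [5, 10, 12, 15, 18, 22, 25, 28, 30, 45]
current = [15, 18, 22, 25, 30, 35, 38, 42, 45, 50]

psi_same = psi(baseline, baseline, edges)   # identical samples give zero
psi_shifted = psi(baseline, current, edges)
drift_detected = psi_shifted > 0.25         # rule-of-thumb threshold
```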

12.

A developer wants scalable ML pipelines. What is BEST?

A. Manual
B. SageMaker pipelines
C. EC2
D. S3

Answer: B
Rationale: SageMaker Pipelines automate ML workflows including preprocessing, training, and deployment, enabling reproducibility and scalability across teams.

13.

A developer wants secure data storage. What is BEST?

A. Public S3
B. S3 with encryption (KMS)
C. EC2
D. RDS

Answer: B
Rationale: Encrypting data at rest with KMS-managed keys helps meet compliance requirements and, combined with restrictive access policies, protects sensitive ML datasets from unauthorized access.

14.

A developer wants model monitoring. What is BEST?

A. CloudTrail
B. SageMaker Model Monitor
C. Config
D. Lambda

Answer: B
Rationale: Model Monitor runs scheduled checks on deployed models, covering data quality, model quality, bias drift, and feature attribution drift, which supports long-term model reliability.

15.

A developer wants to handle missing values. What is BEST?

A. Ignore
B. Imputation
C. Remove dataset
D. EC2

Answer: B
Rationale: Imputation techniques such as mean, median, or model-based filling preserve dataset size while handling missing values effectively.

16.

A developer wants feature scaling. What is BEST?

A. Ignore
B. Normalize/standardize
C. EC2
D. S3

Answer: B
Rationale: Feature scaling puts inputs on comparable ranges, which speeds convergence and improves performance for models trained with gradient descent, such as neural networks.

17.

A developer wants CI/CD for ML. What is BEST?

A. Manual
B. CodePipeline
C. EC2
D. S3

Answer: B
Rationale: CodePipeline automates ML deployment workflows, ensuring consistent and repeatable model releases.

18.

A developer wants an evaluation metric for regression. What is BEST?

A. Accuracy
B. RMSE
C. Precision
D. Recall

Answer: B
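Rationale: RMSE is the square root of the mean squared prediction error. It is expressed in the same units as the target, penalizes large errors more heavily than MAE, and is widely used for evaluating regression models.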
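
A short sketch of how RMSE reacts to a single large error, compared with MAE. The target and prediction values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [10, 20, 30, 40]
y_pred_even = [12, 18, 32, 38]   # four errors of size 2
y_pred_spike = [10, 20, 30, 48]  # one error of size 8

# Both prediction sets have the same MAE, but the single large
# error doubles the RMSE because errors are squared before averaging.
mae_even, mae_spike = mae(y_true, y_pred_even), mae(y_true, y_pred_spike)
rmse_even, rmse_spike = rmse(y_true, y_pred_even), rmse(y_true, y_pred_spike)
```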

19.

A developer wants clustering. What is BEST?

A. Regression
B. K-means
C. Classification
D. EC2

Answer: B
Rationale: K-means groups similar data points into clusters, making it suitable for unsupervised learning tasks.

20.

A developer wants a production ML system. What is BEST?

A. Single model
B. End-to-end ML pipeline
C. EC2
D. S3

Answer: B
Rationale: A complete ML pipeline ensures scalability, reproducibility, monitoring, and continuous improvement, which are critical for production-grade machine learning systems.

21.

A model performs poorly due to multicollinearity. What is BEST solution?

A. Add features
B. Remove correlated features
C. Increase epochs
D. Ignore

Answer: B
Rationale: Multicollinearity occurs when features are highly correlated, making model coefficients unstable and less interpretable. Removing redundant features improves model stability and performance.

22.

A developer wants to improve recall in a fraud detection model. What is BEST?

A. Increase threshold
B. Decrease threshold
C. Ignore
D. Remove data

Answer: B
Rationale: Lowering the classification threshold increases recall by capturing more true positives, which is critical in fraud detection, at the cost of more false positives and typically lower precision.
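
The threshold effect can be sketched on a handful of hypothetical fraud scores. The labels, scores, and the two thresholds below are made up for illustration.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are flagged positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels (1 = fraud) and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.45, 0.35, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]

high_threshold = precision_recall(y_true, scores, 0.5)  # misses two frauds
low_threshold = precision_recall(y_true, scores, 0.3)   # catches all four, more false alarms
```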

23.

A dataset has outliers affecting model performance. What is BEST?

A. Ignore
B. Remove or cap outliers
C. Increase features
D. EC2

Answer: B
Rationale: Outliers can distort statistical relationships and model predictions. Removing or capping them improves robustness and model accuracy.

24.

A developer wants distributed training for large datasets. What is BEST?

A. Single instance
B. SageMaker distributed training
C. S3
D. Lambda

Answer: B
Rationale: Distributed training allows models to be trained across multiple instances, reducing training time and enabling handling of large datasets efficiently.

25.

A developer wants to reduce training time. What is BEST?

A. Increase data
B. Use GPU instances
C. Ignore
D. Remove model

Answer: B
Rationale: GPUs accelerate matrix computations used in ML training, significantly reducing training time for deep learning and large models.

26.

A model suffers from underfitting. What is BEST?

A. Reduce complexity
B. Increase complexity
C. Remove features
D. Ignore

Answer: B
Rationale: Underfitting occurs when a model is too simple to capture patterns. Increasing complexity or adding features helps improve learning.

27.

A developer wants a ranking evaluation metric. What is BEST?

A. Accuracy
B. NDCG
C. RMSE
D. MAE

Answer: B
Rationale: NDCG evaluates ranking quality by considering both relevance and position, making it ideal for recommendation systems and search ranking tasks.
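
A minimal sketch of NDCG, assuming the common formulation with a linear gain and a log2 position discount (other variants use an exponential gain). The relevance grades are hypothetical.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: rel_i / log2(position + 1), positions from 1."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Relevance grades of returned items, listed in ranked order.
perfect_ranking = ndcg([3, 2, 1, 0])   # best items first
reversed_ranking = ndcg([0, 1, 2, 3])  # same items, worst order
```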

28.

A developer wants to reduce variance. What is BEST?

A. Increase complexity
B. Regularization or more data
C. Ignore
D. Remove features

Answer: B
Rationale: High variance indicates overfitting. Regularization or adding more training data helps improve generalization and reduce sensitivity to noise.

29.

A developer wants to monitor concept drift. What is BEST?

A. Ignore
B. Monitor prediction distributions and accuracy over time
C. Increase features
D. EC2

Answer: B
Rationale: Concept drift occurs when relationships between features and labels change. Monitoring prediction distributions and, once ground-truth labels arrive, accuracy trends helps detect this issue early.

30.

A developer wants feature importance for tree-based models. What is BEST?

A. Ignore
B. Built-in feature importance
C. EC2
D. S3

Answer: B
Rationale: Tree-based models like Random Forest provide built-in feature importance scores, helping identify key predictors.

31.

A developer wants scalable inference. What is BEST?

A. EC2
B. SageMaker endpoint autoscaling
C. S3
D. Lambda

Answer: B
Rationale: Autoscaling endpoints dynamically adjust resources based on traffic, ensuring consistent performance.

32.

A developer wants batch ML workflow automation. What is BEST?

A. Manual
B. Step Functions or pipelines
C. EC2
D. S3

Answer: B
Rationale: Workflow orchestration automates batch processes and ensures reliability.

33.

A developer wants model explainability for compliance. What is BEST?

A. Ignore
B. SHAP or LIME
C. EC2
D. S3

Answer: B
Rationale: SHAP and LIME provide interpretable explanations required for regulatory compliance.

34.

A developer wants to reduce latency for inference. What is BEST?

A. Increase model size
B. Optimize model or use smaller model
C. EC2
D. S3

Answer: B
Rationale: Smaller models or optimized architectures reduce inference latency and improve performance in real-time applications.

35.

A developer wants to handle categorical variables with many categories. What is BEST?

A. One-hot encoding
B. Target encoding
C. Ignore
D. EC2

Answer: B
Rationale: Target encoding reduces dimensionality compared to one-hot encoding and improves performance for high-cardinality categorical features.

36.

A developer wants to detect anomalies in logs. What is BEST?

A. Regression
B. Unsupervised learning
C. Classification
D. EC2

Answer: B
Rationale: Unsupervised models identify unusual patterns without labeled data, making them ideal for anomaly detection.

37.

A developer wants evaluation for imbalanced data. What is BEST?

A. Accuracy
B. Precision/Recall or F1
C. RMSE
D. MAE

Answer: B
Rationale: Precision, recall, and F1-score provide better insights into performance when class distributions are imbalanced.

38.

A developer wants to improve model robustness. What is BEST?

A. Ignore
B. Cross-validation
C. EC2
D. S3

Answer: B
Rationale: Cross-validation measures how consistently the model performs across different subsets of the data, exposing fragile models before deployment.

39.

A developer wants feature drift detection. What is BEST?

A. Ignore
B. Compare feature distributions over time
C. EC2
D. S3

Answer: B
Rationale: Monitoring changes in feature distributions helps identify drift and maintain model performance.

40.

A developer wants a production ML system. What is BEST?

A. Single model
B. End-to-end pipeline with monitoring
C. EC2
D. S3

Answer: B
Rationale: A full ML pipeline ensures scalability, monitoring, retraining, and continuous improvement, which are essential for production-grade machine learning systems.

41.

A model’s training loss decreases but validation loss increases. What is the issue?

A. Underfitting
B. Overfitting
C. Data leakage
D. EC2

Answer: B
Rationale: This pattern clearly indicates overfitting, where the model memorizes training data instead of learning general patterns. Regularization, early stopping, or more data can help improve generalization.

42.

A developer wants to handle missing categorical values. What is BEST?

A. Remove rows
B. Use “Unknown” category
C. Ignore
D. EC2

Answer: B
Rationale: Assigning a separate category for missing values preserves data and avoids bias introduced by removing rows, especially when missingness itself may carry information.

43.

A developer wants to reduce dimensionality while preserving interpretability. What is BEST?

A. PCA
B. Feature selection
C. Ignore
D. Increase features

Answer: B
Rationale: PCA reduces dimensions but sacrifices interpretability. Feature selection keeps original features, making models easier to explain while reducing complexity.

44.

A developer wants to train deep learning models faster. What is BEST?

A. CPU
B. GPU or distributed training
C. Ignore
D. S3

Answer: B
Rationale: GPUs accelerate matrix computations and distributed training allows parallel processing, significantly reducing training time for large deep learning models.

45.

A developer wants to prevent data leakage in preprocessing. What is BEST?

A. Normalize before split
B. Fit preprocessing only on training data
C. Ignore
D. EC2

Answer: B
Rationale: Preprocessing steps like scaling must be fitted only on training data to avoid leaking information from validation/test sets into training.
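
A sketch of the fit-on-train, apply-everywhere pattern for standardization. The helper names and data are hypothetical; the key point is that the mean and standard deviation come from the training split only.

```python
import statistics

def fit_standardizer(train_values):
    """Learn scaling parameters from the training split only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)  # assumes the feature is not constant
    return mean, std

def transform(values, mean, std):
    """Apply previously learned parameters to any split."""
    return [(v - mean) / std for v in values]

train = [10.0, 12.0, 14.0, 16.0, 18.0]
test = [30.0, 32.0]  # never seen while fitting

mean, std = fit_standardizer(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)  # reuses the training parameters
```

Fitting on `train + test` instead would pull the mean toward the test values, which is exactly the leakage the question describes.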

46.

A developer wants to evaluate classification threshold impact. What is BEST?

A. Accuracy
B. Precision-recall curve
C. RMSE
D. MAE

Answer: B
Rationale: Precision-recall curves show trade-offs across thresholds, making them ideal for tuning classification performance, especially in imbalanced datasets.

47.

A developer wants to improve training stability. What is BEST?

A. Increase learning rate
B. Normalize features
C. Ignore
D. Remove data

Answer: B
Rationale: Feature normalization ensures consistent input ranges, improving convergence stability and reducing training oscillations.

48.

A developer wants to handle high-cardinality categorical features. What is BEST?

A. One-hot encoding
B. Target encoding
C. Ignore
D. EC2

Answer: B
Rationale: One-hot encoding becomes inefficient with many categories. Target encoding reduces dimensionality while preserving predictive signal.

49.

A developer wants to monitor prediction quality in production. What is BEST?

A. CloudTrail
B. SageMaker Model Monitor
C. Config
D. Lambda

Answer: B
Rationale: Model Monitor runs scheduled checks on captured endpoint traffic to track data drift and, once ground-truth labels are supplied, prediction quality, helping maintain model reliability after deployment.

50.

A developer wants to retrain models automatically. What is BEST?

A. Manual
B. Scheduled pipeline
C. Ignore
D. EC2

Answer: B
Rationale: Automated retraining pipelines ensure models stay updated with new data and adapt to changing patterns.

51.

A developer wants scalable feature storage. What is BEST?

A. S3
B. SageMaker Feature Store
C. EC2
D. RDS

Answer: B
Rationale: Feature Store ensures consistent feature usage across training and inference, improving reproducibility and scalability.

52.

A developer wants to reduce inference cost. What is BEST?

A. Larger model
B. Smaller optimized model
C. Ignore
D. EC2

Answer: B
Rationale: Smaller or optimized models reduce compute usage, lowering inference costs while maintaining acceptable accuracy.

53.

A developer wants to detect label drift. What is BEST?

A. Ignore
B. Monitor prediction vs actual labels
C. EC2
D. S3

Answer: B
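Rationale: Label drift occurs when the distribution of the target labels changes over time. Comparing predictions with actual outcomes as ground truth arrives reveals such shifts; a change in the relationship between inputs and outputs is concept drift, which the same comparison also helps surface.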

54.

A developer wants to improve model generalization. What is BEST?

A. Overfit
B. Cross-validation
C. Ignore
D. Remove data

Answer: B
Rationale: Cross-validation estimates how well performance holds up on unseen data, so models and hyperparameters can be chosen for generalization rather than fit to a single split.

55.

A developer wants ensemble learning. What is BEST?

A. Single model
B. Combine multiple models
C. Ignore
D. EC2

Answer: B
Rationale: Ensembles combine predictions from multiple models, improving accuracy and robustness.

56.

A developer wants feature interaction detection. What is BEST?

A. Ignore
B. Tree-based models
C. EC2
D. S3

Answer: B
Rationale: Tree-based models automatically capture feature interactions, improving predictive performance.

57.

A developer wants to optimize memory usage during training. What is BEST?

A. Increase batch size
B. Reduce batch size
C. Ignore
D. EC2

Answer: B
Rationale: Smaller batch sizes reduce memory consumption, enabling training on limited resources.

58.

A developer wants distributed inference. What is BEST?

A. Single instance
B. Auto-scaling endpoints
C. Ignore
D. S3

Answer: B
Rationale: Auto-scaling endpoints distribute requests across instances, ensuring high throughput and availability.

59.

A developer wants reproducible experiments. What is BEST?

A. Manual
B. Fix random seeds + track configs
C. Ignore
D. EC2

Answer: B
Rationale: Fixing seeds and tracking configurations makes results repeatable across experiments, although some GPU operations can remain nondeterministic.

60.

A developer wants a production-grade ML system. What is BEST?

A. Single script
B. End-to-end pipeline with monitoring and retraining
C. EC2
D. S3

Answer: B
Rationale: A full ML lifecycle pipeline ensures scalability, monitoring, and continuous improvement, which are essential for production environments.

61.

A developer accidentally uses test data during feature scaling. What is the issue?

A. Overfitting
B. Data leakage
C. Underfitting
D. EC2

Answer: B
Rationale: Using test data in preprocessing leaks information into training, resulting in overly optimistic evaluation results and poor real-world performance. All transformations must be fit only on training data.

62.

A model has high bias and low variance. What is the BEST action?

A. Increase regularization
B. Increase model complexity
C. Reduce data
D. Ignore

Answer: B
Rationale: High bias indicates underfitting. Increasing model complexity or adding features allows the model to capture more patterns and improve accuracy.

63.

A developer wants to evaluate ranking in recommendation systems. What is BEST?

A. Accuracy
B. Precision@K or NDCG
C. RMSE
D. MAE

Answer: B
Rationale: Ranking metrics like Precision@K and NDCG evaluate relevance and order of results, making them ideal for recommendation systems.

64.

A developer wants to reduce training time for large datasets. What is BEST?

A. Increase features
B. Distributed training
C. Ignore
D. S3

Answer: B
Rationale: Distributed training splits workloads across multiple nodes, significantly reducing training time and enabling scalability for large datasets.

65.

A model shows stable accuracy but declining business KPIs. What is the issue?

A. Overfitting
B. Concept drift
C. Underfitting
D. EC2

Answer: B
Rationale: Concept drift occurs when the relationship between inputs and outputs changes, causing model predictions to become less relevant to the business even though accuracy measured on historical evaluation data stays stable.

66.

A developer wants to improve model robustness to noise. What is BEST?

A. Ignore
B. Regularization + data augmentation
C. Increase epochs
D. EC2

Answer: B
Rationale: Regularization reduces overfitting, and data augmentation exposes the model to variations, improving robustness to noise.

67.

A developer wants to detect subtle anomalies in high-dimensional data. What is BEST?

A. Regression
B. Autoencoder
C. Classification
D. EC2

Answer: B
Rationale: Autoencoders learn compressed representations and can detect anomalies based on reconstruction error, making them suitable for high-dimensional anomaly detection.

68.

A developer wants to monitor feature drift in production. What is BEST?

A. Ignore
B. Compare statistical distributions over time
C. Increase features
D. EC2

Answer: B
Rationale: Monitoring statistical changes in features helps identify drift and maintain model performance.

69.

A developer wants to optimize hyperparameters under budget constraints. What is BEST?

A. Grid search
B. Bayesian optimization
C. Ignore
D. Manual

Answer: B
Rationale: Bayesian optimization efficiently explores the search space, reducing computation cost compared to exhaustive methods.

70.

A developer wants to reduce inference latency. What is BEST?

A. Larger model
B. Model quantization or pruning
C. Ignore
D. EC2

Answer: B
Rationale: Quantization and pruning reduce model size and computation, improving inference speed, often without significant accuracy loss.

71.

A developer wants to ensure reproducibility. What is BEST?

A. Ignore
B. Fix seeds and track configurations
C. Increase features
D. EC2

Answer: B
Rationale: Fixing random seeds and tracking configurations makes results repeatable across runs, which is critical for debugging and auditing.

72.

A developer wants to reduce variance without increasing bias too much. What is BEST?

A. Remove data
B. Ensemble methods
C. Ignore
D. Increase epochs

Answer: B
Rationale: Ensemble methods combine multiple models to reduce variance while maintaining predictive power.

73.

A developer wants to handle class imbalance in deep learning. What is BEST?

A. Ignore
B. Weighted loss function
C. Increase features
D. EC2

Answer: B
Rationale: Weighted loss functions penalize misclassification of minority classes more heavily, improving performance on imbalanced datasets.

74.

A developer wants to improve interpretability of complex models. What is BEST?

A. Ignore
B. SHAP or LIME
C. Increase features
D. EC2

Answer: B
Rationale: SHAP and LIME provide explanations for model predictions, improving transparency and trust.

75.

A developer wants efficient feature storage for ML pipelines. What is BEST?

A. S3
B. Feature Store
C. EC2
D. RDS

Answer: B
Rationale: Feature Store ensures consistency between training and inference, improving reproducibility and scalability.

76.

A developer wants automated retraining based on drift detection. What is BEST?

A. Manual
B. Event-driven pipeline
C. Ignore
D. EC2

Answer: B
Rationale: Event-driven pipelines trigger retraining automatically when drift is detected, ensuring models remain accurate over time.

77.

A developer wants to reduce memory usage during training. What is BEST?

A. Increase batch size
B. Reduce batch size
C. Ignore
D. EC2

Answer: B
Rationale: Smaller batch sizes require less memory, making training feasible on limited resources.

78.

A developer wants to improve generalization across datasets. What is BEST?

A. Overfit
B. Cross-validation
C. Ignore
D. Remove features

Answer: B
Rationale: Cross-validation checks that performance is consistent across different data splits, guiding the choice of models that generalize.

79.

A developer wants scalable inference for global users. What is BEST?

A. Single instance
B. Auto-scaling endpoints + multi-region deployment
C. Ignore
D. S3

Answer: B
Rationale: Auto-scaling and multi-region deployments ensure low latency and high availability for global users.

80.

A developer wants a full production ML lifecycle. What is BEST?

A. Single model
B. End-to-end pipeline with monitoring, retraining, and CI/CD
C. EC2
D. S3

Answer: B
Rationale: A complete ML lifecycle pipeline ensures scalability, monitoring, retraining, and continuous improvement, which are essential for production-grade systems.

81.

A model’s ROC-AUC is high, but precision is low for the positive class. What is the BEST action?

A. Increase threshold
B. Decrease threshold
C. Optimize for precision-recall trade-off
D. Ignore

Answer: C
Rationale: ROC-AUC can be misleading with class imbalance. Optimizing the precision-recall trade-off (e.g., tuning threshold or using PR curves) targets performance on the positive class, improving practical utility.

82.

A time-series model leaks seasonality from future periods during feature engineering. What is the issue?

A. Overfitting
B. Data leakage
C. Underfitting
D. EC2

Answer: B
Rationale: Using future-derived features (e.g., rolling stats computed with future windows) leaks information, inflating validation performance. Features must be computed using only past data at prediction time.
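
A past-only rolling feature can be sketched as below. The `trailing_mean` helper and the sales series are hypothetical; the window deliberately excludes the current and all later observations.

```python
def trailing_mean(series, window):
    """Mean of the `window` values strictly before each position.

    Positions without enough history get None instead of a partial
    or forward-looking average.
    """
    features = []
    for t in range(len(series)):
        if t < window:
            features.append(None)
        else:
            past = series[t - window:t]  # excludes series[t] and anything later
            features.append(sum(past) / window)
    return features

sales = [10, 12, 11, 15, 30, 14]
features = trailing_mean(sales, window=3)

# Changing a value at time 4 must not change the feature at time 4,
# because that feature is built from times 1 to 3 only.
tampered = list(sales)
tampered[4] = 999
tampered_features = trailing_mean(tampered, window=3)
```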

83.

A developer wants to reduce cold-start latency for real-time endpoints. What is BEST?

A. Larger instances only
B. Provisioned concurrency / warm pools
C. Ignore
D. S3

Answer: B
Rationale: Keeping capacity warm (provisioned concurrency on serverless endpoints, or a nonzero minimum instance count) avoids initialization delays, reducing tail latency for sporadic traffic patterns.

84.

A model trained on historical data degrades after a pricing policy change. What is the issue?

A. Label noise
B. Concept drift
C. Data imbalance
D. EC2

Answer: B
Rationale: Policy changes alter the relationship between inputs and targets. Concept drift requires monitoring and retraining with recent data reflecting the new regime.

85.

A developer wants to compare models fairly across datasets of different scales. What is BEST?

A. Accuracy
B. Normalized metrics (e.g., R², MAPE)
C. RMSE only
D. Ignore

Answer: B
Rationale: Scale-dependent metrics like RMSE aren’t comparable across datasets. Normalized metrics enable fair comparison and better model selection decisions.

86.

A model suffers from exploding gradients during training. What is BEST?

A. Increase learning rate
B. Gradient clipping
C. Add features
D. Ignore

Answer: B
Rationale: Gradient clipping caps gradient norms, stabilizing training and preventing divergence, especially in deep or recurrent networks.
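
Clipping by global norm can be sketched in a few lines. The function name and the gradient values are hypothetical; deep learning frameworks ship their own equivalents.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale the gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already within bounds, leave untouched
    scale = max_norm / norm
    return [g * scale for g in grads]  # direction preserved, length capped

exploding = [3.0, 4.0]  # L2 norm of 5
clipped = clip_by_global_norm(exploding, max_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for g in clipped))
```

Because every component is scaled by the same factor, the update direction is unchanged; only the step length is limited.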

87.

A developer needs low-latency inference with minimal accuracy loss. What is BEST?

A. Larger model
B. Quantization and distillation
C. Ignore
D. EC2

Answer: B
Rationale: Quantization reduces numeric precision (for example, from 32-bit floats to 8-bit integers) and distillation transfers knowledge to smaller models, lowering latency and cost while preserving most accuracy.

88.

A dataset has severe class imbalance and rare positives are critical. What is the BEST loss function?

A. MSE
B. Weighted cross-entropy / focal loss
C. Accuracy
D. MAE

Answer: B
Rationale: Weighted losses or focal loss emphasize hard/rare examples, improving recall on minority classes crucial for tasks like fraud detection.
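
Both losses can be sketched for a single binary example. This is a simplified illustration: the focal loss below omits the optional alpha class-balancing term, and `pos_weight` and `gamma` values are hypothetical settings.

```python
import math

def weighted_bce(y, p, pos_weight):
    """Binary cross-entropy with the positive class up-weighted."""
    return -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))

def focal_loss(y, p, gamma=2.0):
    """Binary focal loss (no alpha term): down-weights easy examples."""
    p_t = p if y == 1 else 1 - p  # probability assigned to the true class
    return -((1 - p_t) ** gamma) * math.log(p_t)

# A rare positive the model gets badly wrong (p = 0.1) versus
# a positive it already handles well (p = 0.9).
hard_bce = weighted_bce(1, 0.1, pos_weight=1.0)
easy_bce = weighted_bce(1, 0.9, pos_weight=1.0)
hard_weighted = weighted_bce(1, 0.1, pos_weight=10.0)  # 10x the plain loss
hard_focal = focal_loss(1, 0.1)
easy_focal = focal_loss(1, 0.9)  # almost no contribution
```

Class weighting scales the loss for every positive, while focal loss shifts emphasis toward whichever examples are currently misclassified.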

89.

A developer wants to ensure training/serving skew is minimized. What is BEST?

A. Separate pipelines
B. Shared feature definitions via Feature Store
C. Ignore
D. EC2

Answer: B
Rationale: Using a centralized Feature Store ensures identical transformations for training and inference, preventing skew and improving consistency.

90.

A model’s performance varies significantly across data slices (e.g., regions). What is BEST?

A. Ignore
B. Slice-based evaluation and mitigation
C. Increase epochs
D. EC2

Answer: B
Rationale: Evaluating by slices uncovers bias or subgroup issues. Mitigation may include rebalancing, separate models, or feature adjustments to ensure fairness and robustness.

91.

A developer wants faster experimentation under tight budgets. What is BEST?

A. Grid search
B. Early-stopping + successive halving
C. Ignore
D. EC2

Answer: B
Rationale: Successive halving/Hyperband with early stopping allocates resources efficiently, discarding poor configs early and reducing compute costs.
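
A minimal sketch of successive halving. The `toy_loss` objective, the candidate learning rates, and the budget units are all hypothetical; a real run would train each configuration for `budget` epochs and read back a validation loss.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of configs each round, multiplying the budget by eta."""
    budget = min_budget
    survivors = list(configs)
    history = []  # (budget per config, number of configs evaluated)
    while len(survivors) > 1:
        history.append((budget, len(survivors)))
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = ranked[:max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0], history

def toy_loss(learning_rate, budget):
    """Stand-in for validation loss: best near 0.1, improves with budget."""
    return abs(learning_rate - 0.1) + 1.0 / budget

candidates = [0.001, 0.01, 0.05, 0.1, 0.3, 0.5, 0.8, 1.0]
best, history = successive_halving(candidates, toy_loss)

# Total budget spent, versus giving every config the final-round budget.
spent = sum(b * n for b, n in history)
full_cost = len(candidates) * history[-1][0]
```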

92.

A model uses high-cardinality categorical features with leakage risk. What is BEST?

A. One-hot encode all
B. Target encoding with CV folds
C. Ignore
D. EC2

Answer: B
Rationale: Target encoding can leak labels; applying it within cross-validation folds prevents leakage while handling high cardinality efficiently.
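
Out-of-fold target encoding can be sketched as follows. The fold assignment by row index, the smoothing toward the fold's prior, and the data are assumptions made for illustration; in practice rows are usually shuffled into folds.

```python
def out_of_fold_target_encode(categories, targets, n_folds=3, prior_weight=1.0):
    """Encode each row with target statistics from the other folds only."""
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums, counts = {}, {}
        train_total, train_n = 0.0, 0
        # Collect statistics from rows outside the held-out fold.
        for i in range(n):
            if i % n_folds == fold:
                continue
            c = categories[i]
            sums[c] = sums.get(c, 0.0) + targets[i]
            counts[c] = counts.get(c, 0) + 1
            train_total += targets[i]
            train_n += 1
        prior = train_total / train_n
        # Encode held-out rows; unseen categories fall back to the prior.
        for i in range(fold, n, n_folds):
            c = categories[i]
            s, k = sums.get(c, 0.0), counts.get(c, 0)
            encoded[i] = (s + prior_weight * prior) / (k + prior_weight)
    return encoded

categories = ["a", "a", "a", "b", "b", "c"]
targets = [1, 0, 1, 0, 0, 1]
encoded = out_of_fold_target_encode(categories, targets)
```

The single "c" row has a target of 1, but its encoding comes entirely from other rows, so the label cannot leak into its own feature.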

93.

A developer needs consistent offline/online metrics alignment. What is BEST?

A. Different metrics
B. Mirror online KPIs in offline evaluation
C. Ignore
D. EC2

Answer: B
Rationale: Aligning offline metrics with business KPIs ensures improvements translate to real-world impact, avoiding misleading offline gains.

94.

A streaming pipeline requires near-real-time feature computation. What is BEST?

A. Batch-only
B. Streaming features with low-latency store
C. Ignore
D. S3

Answer: B
Rationale: Streaming feature pipelines (e.g., incremental aggregates) enable timely predictions and reduce staleness for real-time use cases.

95.

A model exhibits calibration issues (overconfident probabilities). What is BEST?

A. Ignore
B. Platt scaling or isotonic regression
C. Increase epochs
D. EC2

Answer: B
Rationale: Calibration methods adjust predicted probabilities to better reflect true likelihoods, improving decision thresholds and downstream business logic.

96.

A developer wants to detect silent failures post-deployment. What is BEST?

A. CloudTrail only
B. Canary releases + shadow testing
C. Ignore
D. EC2

Answer: B
Rationale: Canary and shadow deployments compare new vs baseline behavior safely, catching regressions before full rollout.

97.

A large NLP model exceeds memory during training. What is BEST?

A. Increase batch size
B. Gradient accumulation / mixed precision
C. Ignore
D. S3

Answer: B
Rationale: Gradient accumulation simulates larger batches without extra memory, and mixed precision reduces memory footprint and speeds training.
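
The gradient accumulation half of this answer can be sketched with a one-parameter linear model. The data, the weight, and the micro-batch size are hypothetical, and the equivalence shown holds when all micro-batches are the same size.

```python
def mse_gradient(w, xs, ys):
    """Gradient of mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def accumulated_gradient(w, xs, ys, micro_batch_size):
    """Average gradients over micro-batches before a single weight update."""
    total, steps = 0.0, 0
    for start in range(0, len(xs), micro_batch_size):
        end = start + micro_batch_size
        # Only one micro-batch needs to be held in memory at a time.
        total += mse_gradient(w, xs[start:end], ys[start:end])
        steps += 1
    return total / steps

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
w = 1.5

full_batch = mse_gradient(w, xs, ys)
accumulated = accumulated_gradient(w, xs, ys, micro_batch_size=2)
```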

98.

A developer wants robust model selection under noise. What is BEST?

A. Single split
B. Repeated cross-validation
C. Ignore
D. EC2

Answer: B
Rationale: Repeated CV reduces variance in estimates, yielding more reliable model comparisons in noisy datasets.

99.

A model’s features shift seasonally causing periodic errors. What is BEST?

A. Ignore
B. Seasonal features + periodic retraining
C. Increase epochs
D. EC2

Answer: B
Rationale: Incorporating seasonal indicators and scheduling retraining aligns the model with recurring patterns, stabilizing performance.

100.

A developer needs end-to-end governance for ML in production. What is BEST?

A. Ad-hoc scripts
B. Versioned pipelines + monitoring + audit logs
C. Ignore
D. EC2

Answer: B
Rationale: Governance requires versioning, lineage, monitoring, and auditing to ensure reproducibility, compliance, and reliable operations at scale.