Skills Laboratory - DataCraft Portfolio

Technical Skills in Action

Interactive demonstrations of programming languages, frameworks, and tools with live code examples

Programming Languages

Python

Intermediate • 3 years

75%

# Customer Segmentation & RFM Analysis
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler

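def generate_sample_data(num_customers=1000):
    """Generate synthetic customer data and summarize it for RFM analysis"""
    np.random.seed(42)

    # Generate RFM data
    recency = np.random.randint(2, 46, size=num_customers)    # values from 2 to 45
    frequency = np.random.randint(3, 21, size=num_customers)  # 3 to 20 orders
    avg_order_value = np.full(num_customers, 100)             # fixed $100 per order
    monetary = frequency * avg_order_value

    # Create DataFrame
    df = pd.DataFrame({
        'recency': recency,
        'frequency': frequency,
        'monetary': monetary,
        'orders': frequency,
        'avg_order_value': avg_order_value
    })

    df['total_spent'] = df['orders'] * df['avg_order_value']

    # Scale RFM features to zero mean and unit variance.
    # The result is kept in rfm_scaled; the summary below uses the unscaled values.
    rfm_cols = ['recency', 'frequency', 'monetary']
    scaler = StandardScaler()
    rfm_scaled = pd.DataFrame(scaler.fit_transform(df[rfm_cols]), columns=rfm_cols)

    # Calculate results
    high_value_threshold = df['total_spent'].quantile(0.8)
    return {
        # Customers whose total spend is above the 80th percentile
        'high_value': int((df['total_spent'] > high_value_threshold).sum()),
        # Mean total spend, used here as a simple CLV proxy
        'avg_clv': round(df['total_spent'].mean(), 2),
        # Share of customers with recency below 30, as a percentage
        'retention_rate': round((df['recency'] < 30).mean() * 100, 2)
    }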

# Run analysis on 1000 customers
result = generate_sample_data(1000)
print(f"High-value customers: {result['high_value']}")
print(f"Average CLV: ${result['avg_clv']:.2f}")
print(f"Retention rate: {result['retention_rate']:.1f}%")
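The snippet above summarizes the synthetic data but stops short of assigning customers to segments. A minimal sketch of quartile-based RFM scoring follows, assuming the same three columns; the helper name `score_rfm` and the `r_score`, `f_score`, `m_score`, and `rfm_score` columns are hypothetical and not part of the original demo.

```python
import numpy as np
import pandas as pd

def score_rfm(df, bins=4):
    """Add quantile-based R, F, M scores (1 = worst, bins = best) and their sum."""
    scored = df.copy()
    labels = list(range(1, bins + 1))
    # rank(method="first") breaks ties so qcut always receives unique bin edges.
    # Lower recency is better, so the label order is reversed for the R score.
    scored["r_score"] = pd.qcut(
        scored["recency"].rank(method="first"), bins, labels=labels[::-1]
    ).astype(int)
    scored["f_score"] = pd.qcut(
        scored["frequency"].rank(method="first"), bins, labels=labels
    ).astype(int)
    scored["m_score"] = pd.qcut(
        scored["monetary"].rank(method="first"), bins, labels=labels
    ).astype(int)
    scored["rfm_score"] = scored[["r_score", "f_score", "m_score"]].sum(axis=1)
    return scored

# Synthetic data in the same ranges as the demo (assumed, not real customers)
rng = np.random.default_rng(42)
customers = pd.DataFrame({
    "recency": rng.integers(2, 46, size=1000),
    "frequency": rng.integers(3, 21, size=1000),
})
customers["monetary"] = customers["frequency"] * 100

scored = score_rfm(customers)
print(scored["rfm_score"].describe())
```

Ranking before binning is one way to handle the heavy ties in integer-valued data; a real pipeline might prefer fixed business thresholds instead of quantiles.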

Interactive Demo

A/B Testing

Advanced • Statistical Analysis

85%

# Statistical Analysis in Python
import numpy as np
from scipy import stats

# A/B Test Analysis Function
def analyze_ab_test(control, treatment):
    """Perform A/B test analysis with t-test and effect size"""
    control = np.array(control)
    treatment = np.array(treatment)

    # Calculate basic statistics
    control_mean = np.mean(control)
    treatment_mean = np.mean(treatment)

    # Perform independent samples t-test
    t_stat, p_value = stats.ttest_ind(treatment, control)

    # Calculate effect size (Cohen's d)
    n1, n2 = len(control), len(treatment)
    var1 = np.var(control, ddof=1)
    var2 = np.var(treatment, ddof=1)
    pooled_std = np.sqrt(((n1-1)*var1 + (n2-1)*var2) / (n1+n2-2))
    effect_size = (treatment_mean - control_mean) / pooled_std

    return {
        'control_mean': control_mean,
        'treatment_mean': treatment_mean,
        'p_value': p_value,
        'effect_size': effect_size,
        'significant': p_value < 0.05
    }

# Sample data
control_group = [2.3, 2.1, 2.4, 2.2, 2.5, 2.0, 2.3]
treatment_group = [2.8, 2.9, 2.7, 3.0, 2.6, 2.8, 2.9]

# Run analysis
results = analyze_ab_test(control_group, treatment_group)
lift = (results['treatment_mean'] / results['control_mean'] - 1) * 100
print(f"Lift: {lift:.1f}%")
print(f"P-value: {results['p_value']:.4f}")
print(f"Significant: {results['significant']}")
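With only seven observations per group, the normality assumption behind the t-test is hard to check. One possible cross-check is an exact permutation test on the difference in means, sketched below with the standard library only. The function name `permutation_test` and its `tol` parameter are illustrative assumptions, not part of the original analysis.

```python
from itertools import combinations
from statistics import mean

def permutation_test(control, treatment, tol=1e-9):
    """Exact two-sided permutation test on the difference in means.

    Enumerates every way of relabeling the pooled observations and counts
    the relabelings whose absolute mean difference is at least as large
    as the observed one.
    """
    pooled = list(control) + list(treatment)
    n_total, n_treat = len(pooled), len(treatment)
    total = sum(pooled)
    observed = mean(treatment) - mean(control)

    extreme = 0
    count = 0
    for idx in combinations(range(n_total), n_treat):
        sum_t = sum(pooled[i] for i in idx)
        diff = sum_t / n_treat - (total - sum_t) / (n_total - n_treat)
        # tol guards against floating-point noise in the comparison
        if abs(diff) >= abs(observed) - tol:
            extreme += 1
        count += 1
    return observed, extreme / count

control_group = [2.3, 2.1, 2.4, 2.2, 2.5, 2.0, 2.3]
treatment_group = [2.8, 2.9, 2.7, 3.0, 2.6, 2.8, 2.9]

observed_diff, p_exact = permutation_test(control_group, treatment_group)
print(f"Observed difference: {observed_diff:.4f}")
print(f"Exact permutation p-value: {p_exact:.6f}")
```

Full enumeration is only practical for small samples like this one (3,432 relabelings); for larger groups a random subset of permutations is the usual substitute.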

Statistical Output

Database & SQL

Advanced SQL

PostgreSQL, MySQL, BigQuery

-- Customer Cohort Analysis
WITH monthly_cohorts AS (
  SELECT
    customer_id,
    DATE_TRUNC('month', first_order_date) as cohort_month,
    DATE_TRUNC('month', order_date) as order_month
  FROM customer_orders
),
cohort_data AS (
  SELECT
    cohort_month,
    order_month,
    COUNT(DISTINCT customer_id) as customers,
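    -- Count whole months including the year component, so periods past 11 months do not wrap to 0
    CAST(EXTRACT(YEAR FROM AGE(order_month, cohort_month)) * 12
      + EXTRACT(MONTH FROM AGE(order_month, cohort_month)) AS INTEGER) as period_number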
  FROM monthly_cohorts
  GROUP BY 1, 2, 4
)
SELECT
  cohort_month,
  period_number,
  customers,
  ROUND(100.0 * customers /
    FIRST_VALUE(customers) OVER (
      PARTITION BY cohort_month
      ORDER BY period_number
    ), 2) as retention_rate
FROM cohort_data
ORDER BY cohort_month, period_number;
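For readers who prefer to prototype outside the database, the same cohort logic can be sketched in pandas. The table below is a tiny made-up stand-in for `customer_orders`, and the function name `cohort_retention` is hypothetical.

```python
import pandas as pd

# Hypothetical orders with the three columns the query reads
orders = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "first_order_date": pd.to_datetime(
        ["2024-01-05"] * 3 + ["2024-01-20"] * 2 + ["2024-02-10"]
    ),
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-02-11", "2024-03-02",
        "2024-01-20", "2024-03-15", "2024-02-10",
    ]),
})

def cohort_retention(orders):
    """Distinct active customers per cohort month and period, with retention %."""
    df = orders.copy()
    cohort = df["first_order_date"].dt.to_period("M")
    order = df["order_date"].dt.to_period("M")
    df["cohort_month"] = cohort
    # Whole months between the order month and the cohort month
    df["period_number"] = (
        (order.dt.year - cohort.dt.year) * 12 + (order.dt.month - cohort.dt.month)
    )
    counts = (
        df.groupby(["cohort_month", "period_number"])["customer_id"]
        .nunique()
        .rename("customers")
        .reset_index()
    )
    # Rows are sorted by period within each cohort, so "first" plays the
    # role of FIRST_VALUE(...) ORDER BY period_number in the SQL version
    cohort_size = counts.groupby("cohort_month")["customers"].transform("first")
    counts["retention_rate"] = (100.0 * counts["customers"] / cohort_size).round(2)
    return counts

retention = cohort_retention(orders)
print(retention)
```

Note that retention can rise again in a later period, because a customer who skipped a month still counts when they return.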

Query Results Visualization


Database Technologies

PostgreSQL, Redis, BigQuery, Pinecone Vector DB, MongoDB

Frameworks & Libraries

Data Science

Pandas: 75%
NumPy: 72%
SciPy: 68%
Statsmodels: 45%

Machine Learning

Scikit-learn: 74%
TensorFlow: 37%
PyTorch: 42%
XGBoost: 40%

AI Engineering

LangChain: 65%
Ollama: 88%
Hugging Face: 62%
Pinecone: 68%

Visualization

Matplotlib: 73%
Seaborn: 71%
Plotly: 69%
Streamlit: 52%

AI & ML Models

Machine Learning Demonstrations

Try four live ML models with real-time predictions: customer churn, price optimization, sentiment analysis, and movie recommendations. Enter your own inputs to see the results immediately.

4 Live Models

77%+ Accuracy

Real-time Inference

Customer Churn Prediction

Random Forest • 77.47% Accuracy

Monthly Charges ($)

Tenure (months)

Total Charges ($)

Contract Type
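The churn demo takes four inputs and reports a Random Forest prediction. The sketch below shows what such a model could look like with scikit-learn. Everything in it is an assumption made for illustration: the data is synthetic, the churn rule is invented, the integer encoding of contract type is hypothetical, and the resulting accuracy has no relation to the 77.47% figure quoted for the live model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins for the four form inputs
tenure = rng.integers(1, 73, size=n)          # months
monthly = rng.uniform(20, 120, size=n)        # monthly charges ($)
total = tenure * monthly                      # total charges ($)
contract = rng.integers(0, 3, size=n)         # hypothetical encoding: 0, 1, 2

# Invented labeling rule: higher charges raise churn risk,
# longer tenure and longer contracts lower it
logit = 0.03 * (monthly - 70) - 0.05 * (tenure - 24) - 1.0 * contract
churn = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([monthly, tenure, total, contract])
X_train, X_test, y_train, y_test = train_test_split(
    X, churn, test_size=0.25, random_state=0, stratify=churn
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Score one hypothetical customer: $85/month, 3 months tenure, contract code 0
proba = model.predict_proba([[85.0, 3, 255.0, 0]])[0, 1]
print(f"Held-out accuracy on synthetic data: {accuracy:.3f}")
print(f"Estimated churn probability: {proba:.2f}")
```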

Dynamic Price Optimization

Gradient Boosting • Revenue Maximization

Product Category

Competitor Price ($)

Inventory Level

Demand Score (1-10)

Real-time Sentiment Analysis

Gemini AI • Multi-language

Enter Text to Analyze

Select Language

Movie Recommendation Engine

MongoDB Atlas Vector Search

Describe Your Movie Preference

Number of Recommendations: 5 (Auto), range 1 to 10

Minimum Match Score: 50% (Auto), range 50% to 100%
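The two controls above suggest a top-k search with a similarity cutoff. A minimal sketch of that idea follows, assuming the match score is cosine similarity between embedding vectors. It does not use or imitate the MongoDB Atlas Vector Search API; the function names, movie titles, and three-dimensional vectors are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(query_vec, catalog, top_k=5, min_score=0.5):
    """Return up to top_k (title, score) pairs with score >= min_score, best first."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in catalog.items()]
    matches = [(title, score) for title, score in scored if score >= min_score]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches[:top_k]

# Toy embeddings; a real system would get these from an embedding model
catalog = {
    "Movie A": [1.0, 0.0, 0.0],
    "Movie B": [0.8, 0.6, 0.0],
    "Movie C": [0.0, 1.0, 0.0],
    "Movie D": [0.0, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]

results = recommend(query, catalog, top_k=2, min_score=0.5)
for title, score in results:
    print(f"{title}: {score:.0%} match")
```

A brute-force scan like this is fine for a handful of items; a vector database replaces it with an approximate nearest-neighbor index once the catalog grows.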
