200 Python Exercises Generated Iteratively with Claude Code

Introduction

As part of my interview prep journey, I needed to practice my Python skills and statistical concepts. Instead of spending dozens of dollars on interview prep platforms or prompting AI for exercises one by one, I decided to scale up my AI‑generated Python questions and create a database of 200 Python exercises I can use to practice.

This database was created with Data Scientists in mind, covering topics relevant to day‑to‑day tasks, less frequent but important work, and deeper technical knowledge that comes up from time to time (we need to flex these muscles now and then, friends).

Hopefully, you'll find this as useful as I have.

What Is Included

200 Python coding exercises. Many of them go beyond writing functions or wrangling data and ask you to perform the kind of tasks a data scientist actually does on the job. They're not meant only to prep you for a Python coding interview; they also help you refresh concepts we are expected to know as professionals. The topics included are:

Topics & Subtopics (200 Python Exercises)

topic	sub_topic	topic_difficulty
data manipulation	DataFrame creation and inspection	beginner
data manipulation	Filtering with boolean masks	beginner
data manipulation	GroupBy aggregations	beginner
data manipulation	Joins and merges	beginner
data manipulation	Pivot and melt	beginner
data manipulation	Datetime parsing and resampling	beginner
data manipulation	Missing data handling and imputation	beginner
data manipulation	Window functions: rolling and expanding	intermediate
data manipulation	Vectorization vs apply	intermediate
data manipulation	Memory and performance optimization	expert
statistics and causal inference	Descriptive stats and distributions	beginner
statistics and causal inference	CLT and sampling distributions	beginner
statistics and causal inference	Confidence intervals	beginner
statistics and causal inference	Hypothesis tests (t and z)	beginner
statistics and causal inference	Power analysis and MDE	intermediate
statistics and causal inference	Linear regression (OLS) and diagnostics	intermediate
statistics and causal inference	Logistic regression and odds ratios	intermediate
statistics and causal inference	Fixed effects and panel regression	intermediate
statistics and causal inference	Difference-in-differences	expert
statistics and causal inference	Propensity score methods	expert
experimentation	Defining units and exposure	beginner
experimentation	Randomization and hashing	beginner
experimentation	Sample size and power planning	beginner
experimentation	AA tests and SRM checks	intermediate
experimentation	Guardrail metrics selection	intermediate
experimentation	CUPED variance reduction	intermediate
experimentation	Multiple testing control (FDR)	intermediate
experimentation	Sequential testing and alpha spending	expert
experimentation	Clustered and geo experiments	expert
experimentation	Switchback designs for platforms	expert

How I Built This

I used Claude Code to iterate over combinations of topic, subtopic, and difficulty level, experimenting with different exercise formats and validation approaches along the way. Working iteratively helped me refine the exercise structure and ensure broad coverage of the topics listed above.

Process

I explained the task and objectives to Claude, fed it the above table of topics and difficulty levels, and asked it to create a set of 15–20 datasets for the exercises to draw on.
I created a list of tasks, each specifying how many exercises to create given a topic, subtopic, and exercise difficulty:

{
  "group_id": 19,
  "topic": "statistics_and_causal_inference",
  "subtopic": "Difference-in-differences",
  "topic_difficulty": "expert",
  "exercise_count": 9,
  "difficulty_split": {"hard": 5, "hells_of_flame": 4},
  "datasets": ["employee_panel", "geo_experiment", "student_performance"],
  "id_range": "stat_056 to stat_064"
}
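
A task spec like the one above contains two internal consistency checks that are cheap to run before handing it to Claude: the difficulty split should add up to `exercise_count`, and `id_range` should span exactly that many IDs. The sketch below is illustrative only and is not the code used to build the database; the `validate_task` helper and the assumed `id_range` format are inferred from the single example shown.

```python
import re


def validate_task(task: dict) -> list[str]:
    """Return a list of consistency problems in one task spec (empty if none)."""
    problems = []
    expected = task["exercise_count"]

    # The per-difficulty counts should add up to the total.
    split_total = sum(task["difficulty_split"].values())
    if split_total != expected:
        problems.append(f"difficulty_split sums to {split_total}, expected {expected}")

    # Assumption: id_range always looks like "<prefix>_<start> to <prefix>_<end>".
    match = re.fullmatch(r"([a-z]+)_(\d+) to ([a-z]+)_(\d+)", task["id_range"])
    if match is None:
        problems.append(f"unrecognized id_range: {task['id_range']!r}")
    else:
        start_prefix, start, end_prefix, end = match.groups()
        span = int(end) - int(start) + 1
        if start_prefix != end_prefix or span != expected:
            problems.append(f"id_range covers {span} IDs, expected {expected}")
    return problems


task = {
    "group_id": 19,
    "topic": "statistics_and_causal_inference",
    "subtopic": "Difference-in-differences",
    "topic_difficulty": "expert",
    "exercise_count": 9,
    "difficulty_split": {"hard": 5, "hells_of_flame": 4},
    "datasets": ["employee_panel", "geo_experiment", "student_performance"],
    "id_range": "stat_056 to stat_064",
}

print(validate_task(task))  # prints []
```

Running a check like this over the whole task list would catch off-by-one ID ranges before any exercises are generated.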

I then manually prompted Claude to work through the groups one at a time, executing each task (creating the exercises according to the guidelines).


Once I had the initial output, I used the Claude API to review each exercise individually, checking for accuracy, appropriate difficulty calibration, and clear problem statements. This review pass improved the overall quality of the exercises.
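
A per-exercise review pass of this kind can be sketched as a simple loop. This is a hedged illustration, not the actual review script: the prompt wording, the `id` key, and the `ask_model` callable (whatever function sends a prompt to the model and returns its reply) are all assumptions. The `exercise` and `solution` keys match the ones used in the Instructions section.

```python
from typing import Callable

# The three checks named in the text above.
REVIEW_CRITERIA = (
    "accuracy",
    "appropriate difficulty calibration",
    "clear problem statement",
)


def build_review_prompt(exercise: dict) -> str:
    """Assemble one review prompt for a single exercise (wording is hypothetical)."""
    criteria = "\n".join(f"- {c}" for c in REVIEW_CRITERIA)
    return (
        "Review the following Python exercise against these criteria:\n"
        f"{criteria}\n\n"
        f"Exercise:\n{exercise['exercise']}\n\n"
        f"Solution:\n{exercise['solution']}\n"
    )


def review_exercises(exercises: list[dict], ask_model: Callable[[str], str]) -> list[dict]:
    """Review exercises one at a time; ask_model is supplied by the caller."""
    return [
        {"id": ex.get("id"), "review": ask_model(build_review_prompt(ex))}
        for ex in exercises
    ]


# Usage with a stub in place of a real API call:
sample = [{"id": "stat_056", "exercise": "Estimate a DiD effect.", "solution": "..."}]
reviews = review_exercises(sample, ask_model=lambda prompt: "looks fine")
print(reviews[0]["review"])  # prints looks fine
```

Keeping the model call behind a plain callable makes the loop easy to dry-run with a stub before spending API credits.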

Instructions

Initialize the instructor (ensure JSON files are in the same directory):

instructor = PythonInstructor()

Get an exercise by topic and difficulty:

exercise = instructor.get_exercise(topic='loops', difficulty='beginner')

Display the exercise prompt:

print(exercise.get('exercise'))

Attempt to solve the problem:

# ... your solution code here ...

View the solution when ready:

print(exercise.get('solution'))
print(exercise.get('expl'))
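
For readers curious about what such a lookup might do under the hood, here is a minimal stand-in. It is not the packaged `PythonInstructor`: the class name `SketchInstructor`, the `from_directory` helper, and the assumption that each JSON file holds a list of dicts with `topic` and `difficulty` keys are all hypothetical. Only the `exercise`, `solution`, and `expl` keys come from the steps above.

```python
import json
import random
from pathlib import Path


class SketchInstructor:
    """Hypothetical stand-in for PythonInstructor; not the packaged class."""

    def __init__(self, exercises: list[dict]):
        self.exercises = exercises

    @classmethod
    def from_directory(cls, directory: str = ".") -> "SketchInstructor":
        # Assumption: every *.json file in the directory holds a list of exercise dicts.
        exercises = []
        for path in sorted(Path(directory).glob("*.json")):
            with path.open(encoding="utf-8") as f:
                exercises.extend(json.load(f))
        return cls(exercises)

    def get_exercise(self, topic: str, difficulty: str) -> dict:
        """Return a random exercise matching both topic and difficulty."""
        matches = [
            ex for ex in self.exercises
            if ex.get("topic") == topic and ex.get("difficulty") == difficulty
        ]
        if not matches:
            raise LookupError(f"no exercise for topic={topic!r}, difficulty={difficulty!r}")
        return random.choice(matches)


# Usage with an in-memory exercise instead of files on disk:
demo = SketchInstructor([
    {"topic": "loops", "difficulty": "beginner",
     "exercise": "Sum a list.", "solution": "sum(xs)", "expl": "Built-in sum."},
])
print(demo.get_exercise(topic="loops", difficulty="beginner").get("exercise"))  # prints Sum a list.
```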

How to Get the Package?

Send me a LinkedIn invite and slide into my DMs!
