Python Basics Cheat Sheet
A quick reference cheat sheet for Python basics for data science. Quick refresher for beginners and intermediate users.
Oct 25, 2025 · 8 min read
Python dominates data science because it's intuitive, powerful, and backed by amazing libraries like pandas, numpy, and scikit-learn. Whether you're just starting out or need a quick refresher, this cheat sheet covers the essentials.
Pro tip: Keep this page bookmarked! You'll reference it constantly as you work through data science projects.
Working with Files
The Working Directory
The working directory is the folder Python resolves relative file paths against (e.g., C:/file/path).
import os

# Get current working directory
wd = os.getcwd()  # '/current/path'

# List files in directory
os.listdir(wd)

# Change working directory
os.chdir('new/working/directory')

# Common file operations
os.rename('old.txt', 'new.txt')  # Rename
os.remove('file.txt')            # Delete
os.mkdir('new_folder')           # Create folder
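The calls above change real files, so it can help to try them somewhere disposable first. The following is a minimal sketch, assuming a throwaway directory created with tempfile; the file and folder names are the illustrative ones from the snippet above, not files you need to have.

```python
import os
import tempfile

start_dir = os.getcwd()

# Work inside a temporary directory so no real files are touched
with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)
    try:
        os.mkdir('new_folder')                   # Create folder
        with open('old.txt', 'w') as f:          # Create a file to rename
            f.write('hello')
        os.rename('old.txt', 'new.txt')          # Rename
        after_rename = sorted(os.listdir('.'))   # ['new.txt', 'new_folder']
        os.remove('new.txt')                     # Delete
        after_remove = sorted(os.listdir('.'))   # ['new_folder']
    finally:
        os.chdir(start_dir)                      # Always restore the working directory
```

Restoring the original working directory in a finally block keeps later relative paths pointing where you expect, even if one of the file operations fails.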
Operators
Operators let you perform mathematical operations, comparisons, and logical tests. Master these fundamentals first.
Arithmetic Operators
# Addition
10 + 2  # 12

# Subtraction
10 - 2  # 8

# Multiplication
4 * 6  # 24

# Division
22 / 7  # 3.142857...

# Integer division
22 // 7  # 3

# Power (exponentiation)
3 ** 4  # 81

# Modulo (remainder)
22 % 7  # 1
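Integer division and modulo are two halves of the same operation, and the built-in divmod() returns both at once. A small sketch of how they fit together, including what happens with a negative number (// rounds down, toward negative infinity):

```python
a, b = 22, 7

# divmod returns (a // b, a % b) in one call
quotient, remainder = divmod(a, b)    # (3, 1)

# The two results always reconstruct the original number
check = quotient * b + remainder      # 22

# With negatives, // floors toward negative infinity
negative_floor = -22 // 7             # -4
negative_mod = -22 % 7                # 6
```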
Assignment Operators
# Assign a value
a = 5

# Change list item (x must already be a list)
x = [3, 2, 4]
x[0] = 1
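Assignment also comes in an augmented form that combines an arithmetic operator with =. A brief sketch, with variable names chosen only for illustration:

```python
total = 10
total += 5    # same as total = total + 5   -> 15
total -= 3    # 12
total *= 2    # 24
total //= 5   # 4

# Augmented assignment works on list items too
scores = [70, 80]
scores[0] += 5   # scores is now [75, 80]
```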
Comparison Operators
# Test equality
3 == 3  # True

# Test inequality
3 != 3  # False

# Greater than
3 > 1  # True

# Greater than or equal
3 >= 3  # True

# Less than
3 < 4  # True

# Less than or equal
3 <= 4  # True
Logical Operators
# Logical NOT
not (2 == 2)  # False

# Logical AND
(1 != 1) and (1 < 1)  # False

# Logical OR
(1 == 1) or (1 < 1)  # True
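Two details of these operators are worth seeing in action: and/or stop evaluating as soon as the result is known (short-circuiting), and comparisons can be chained. A minimal sketch, with illustrative variable names:

```python
values = []

# 'and' stops at the first falsy operand, so values[0] is never evaluated
has_positive_first = len(values) > 0 and values[0] > 0   # False, no IndexError

# 'or' returns the first truthy operand, handy for fallbacks
name = '' or 'Guest'                                      # 'Guest'

# Chained comparison: equivalent to (0 <= 5) and (5 < 10)
in_range = 0 <= 5 < 10                                    # True
```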
Lists
Lists are the bread and butter of data science. They store sequences of values: numbers, text, even other lists!
Use lists when you need ordered data that you'll iterate through or transform.
Creating Lists
# Create lists with [], elements separated by commas
x = [1, 3, 2, 4]
fruits = ['apple', 'banana', 'orange']
mixed = [1, 'hello', 3.14, True]
List Functions and Methods
x = [1, 3, 2, 4]

# Return sorted copy
sorted([3, 1, 2])  # [1, 2, 3]

# Sort in place
x.sort()

# Reverse order
reversed(x)  # Returns a reversed iterator; wrap in list() to see the values

# Reverse in place
x.reverse()

# Count elements
x.count(2)  # Number of times 2 appears
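The difference between the function and the method versions trips up many beginners: sorted() and reversed() leave the original alone, while .sort() and .reverse() modify the list and return None. A short sketch:

```python
x = [3, 1, 2]

new_list = sorted(x)             # [1, 2, 3]; x is still [3, 1, 2]
result = x.sort()                # x is now [1, 2, 3]; result is None

backwards = list(reversed(x))    # [3, 2, 1]; reversed() alone gives an iterator
descending = sorted(x, reverse=True)   # [3, 2, 1]
```

A common slip is writing x = x.sort(), which replaces the list with None.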
Selecting List Elements
Lists are zero-indexed (first element has index 0).
x = ['a', 'b', 'c', 'd', 'e']

x[0]    # 'a' (first element)
x[-1]   # 'e' (last element)
x[1:3]  # ['b', 'c'] (index 1 inclusive, index 3 exclusive)
x[2:]   # ['c', 'd', 'e'] (index 2 to the end)
x[:3]   # ['a', 'b', 'c'] (start up to index 3, exclusive)
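Slices also accept a third value, the step, and negative indices count from the end. A few more patterns, sketched on the same list:

```python
x = ['a', 'b', 'c', 'd', 'e']

every_other = x[::2]      # ['a', 'c', 'e'] (step of 2)
last_two = x[-2:]         # ['d', 'e']
reversed_copy = x[::-1]   # ['e', 'd', 'c', 'b', 'a']
shallow_copy = x[:]       # new list with the same elements
```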
Concatenating Lists
x = [1, 3, 6]
y = [10, 15, 21]

x + y  # [1, 3, 6, 10, 15, 21]
3 * x  # [1, 3, 6, 1, 3, 6, 1, 3, 6]
Dictionaries
Think of dictionaries as lookup tables. Perfect for storing structured data, survey responses, and configuration settings.
Use dictionaries when: You need fast lookups by name/key rather than position.
Creating Dictionaries
# Create a dictionary with {}
student = {'name': 'Alice', 'age': 22, 'grade': 'A'}
scores = {'math': 95, 'science': 87, 'history': 92}
Dictionary Functions and Methods
x = {'a': 1, 'b': 2, 'c': 3}

x.keys()       # dict_keys(['a', 'b', 'c'])
x.values()     # dict_values([1, 2, 3])
x['a']         # 1 (get value by key)
x.get('d', 0)  # 0 (get with default)
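Two patterns build directly on these methods: looping over key/value pairs with .items(), and using .get() with a default to count things. A minimal sketch, where the letters list is made up for illustration:

```python
x = {'a': 1, 'b': 2, 'c': 3}

# .items() yields (key, value) pairs
pairs = [(key, value) for key, value in x.items()]

# .get() with a default avoids a KeyError on the first occurrence
letters = ['a', 'b', 'a', 'c', 'a']
counts = {}
for letter in letters:
    counts[letter] = counts.get(letter, 0) + 1
```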
Dictionary Operations
# Add or update
student['gpa'] = 3.75

# Remove
del student['age']

# Check if key exists
'name' in student  # True
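del raises a KeyError if the key is missing. When that matters, .pop() with a default is a gentler option, and .update() changes several keys at once. A sketch using the same student dictionary; the 'email' key is hypothetical:

```python
student = {'name': 'Alice', 'age': 22, 'grade': 'A'}

age = student.pop('age')                # Removes the key and returns 22
missing = student.pop('email', None)    # Key absent: returns None, no error

# Add or overwrite several keys in one call
student.update({'grade': 'A+', 'gpa': 3.75})
```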
Strings
Work with text data efficiently. String manipulation is essential for cleaning data and extracting insights.
Creating Strings
In data science: You'll parse filenames, clean text columns, extract patterns.
# Single line strings
"DataCamp"
'DataCamp'

# Escape quotes
"He said, \"DataCamp\""

# Multi-line strings
"""
A Frame of Data
Tidy, Mine, Analyze It
Now You Have Meaning
"""
String Operations
# Avoid naming a variable 'str': it shadows the built-in type
text = "DataCamp"

text[0]                  # 'D' (first character)
text[0:4]                # 'Data' (substring)
text.upper()             # 'DATACAMP'
text.lower()             # 'datacamp'
text.title()             # 'Datacamp'
text.replace('a', 'e')   # 'DeteCemp' (replaces every 'a')
Combining Strings
"Data" + "Framed"        # 'DataFramed'
3 * "data "              # 'data data data '
"beekeepers".split('e')  # ['b', '', 'k', '', 'p', 'rs']
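Splitting and joining are most useful in combination with the cleaning methods. Here is a small sketch of tidying a filename, one of the tasks mentioned earlier; the filename itself is invented for the example:

```python
raw = "  Sales_Report_2024.CSV  "

cleaned = raw.strip().lower()               # 'sales_report_2024.csv'
stem, extension = cleaned.rsplit('.', 1)    # 'sales_report_2024', 'csv'
parts = stem.split('_')                     # ['sales', 'report', '2024']
label = ' '.join(parts).title()             # 'Sales Report 2024'
is_csv = cleaned.endswith('.csv')           # True
```

Because strings are immutable, each method returns a new string, which is why the calls can be chained.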
Functions
Functions transform data from one shape to another. They're the building blocks of data pipelines.
Functions keep your code DRY (Don't Repeat Yourself). Write once, use everywhere!
Basic Functions
def calculate_mean(numbers):
    """Calculate the mean of a list of numbers."""
    if not numbers:
        return 0
    return sum(numbers) / len(numbers)

# Usage
temperatures = [72, 68, 75, 82, 77]
avg_temp = calculate_mean(temperatures)
print(f"Average: {avg_temp}°F")
Function Parameters
# Default parameters
def greet(name="Guest"):
    return f"Hello, {name}"

greet()         # 'Hello, Guest'
greet("Alice")  # 'Hello, Alice'

# Multiple return values
def stats(data):
    return min(data), max(data), sum(data)/len(data)

min_val, max_val, mean = stats([1, 5, 3, 9, 2])
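Once a function has several defaults, calling it with keyword arguments keeps the call readable and lets you override only what you need. A minimal sketch; describe() and its parameters are hypothetical names chosen for illustration:

```python
def describe(data, decimals=1, label="value"):
    """Return a short summary string for a list of numbers."""
    mean = sum(data) / len(data)
    return f"{label}: mean={round(mean, decimals)}, n={len(data)}"

# Rely on the defaults
by_position = describe([1, 5, 3, 9, 2])

# Keyword arguments can be given in any order
by_keyword = describe([1, 5, 3, 9, 2], label="score", decimals=2)
```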
Comprehensions
Python's superpower for data transformations. List and dictionary comprehensions are usually more concise than the equivalent loops, and often faster.
List Comprehensions
Comprehensions can be up to roughly 30% faster than an equivalent append() loop, depending on the workload and Python version. Plus they're more Pythonic!
# Traditional loop
squared = []
for x in range(1, 6):
    squared.append(x ** 2)

# List comprehension
squared = [x ** 2 for x in range(1, 6)]

# With condition (filtering)
even_squares = [x ** 2 for x in range(1, 11) if x % 2 == 0]
# Result: [4, 16, 36, 64, 100]
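Speed claims are easy to check on your own machine with the standard timeit module. The following is a rough sketch, not a rigorous benchmark: the function names, list size, and repeat count are arbitrary choices, and the numbers will vary between runs and Python versions.

```python
import timeit

def with_loop(n=1000):
    """Build a list of squares with a for loop and append()."""
    squared = []
    for x in range(n):
        squared.append(x ** 2)
    return squared

def with_comprehension(n=1000):
    """Build the same list with a list comprehension."""
    return [x ** 2 for x in range(n)]

# Time 200 calls of each version
loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)

print(f"loop: {loop_time:.4f}s, comprehension: {comp_time:.4f}s")
```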
Dictionary Comprehensions
Transform data structures efficiently:
# Create dictionary of squares
squares = {x: x**2 for x in range(1, 6)}
# {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Filter and transform
temperatures = {'Mon': 72, 'Tue': 68, 'Wed': 75, 'Thu': 82}
hot_days = {day: temp for day, temp in temperatures.items() if temp > 75}
# {'Thu': 82}
Built-in Functions
Python ships with built-in functions that save you time and need no import. Learn these well.
enumerate()
Loop with both index and value together:
grades = [85, 92, 78, 96]

for index, grade in enumerate(grades):
    print(f"Student {index + 1}: {grade}%")
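enumerate() takes an optional start argument, which removes the need for index + 1. It also pairs well with max() when you want the position of the largest value. A short sketch on the same grades:

```python
grades = [85, 92, 78, 96]

# start=1 makes the counter begin at 1 instead of 0
lines = [f"Student {number}: {grade}%"
         for number, grade in enumerate(grades, start=1)]

# Find the index and value of the highest grade
best_index, best_grade = max(enumerate(grades), key=lambda pair: pair[1])
```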
zip()
Combine multiple lists:
students = ['Alice', 'Bob', 'Charlie']
scores = [85, 92, 78]

for student, score in zip(students, scores):
    print(f"{student}: {score}")

# Create dictionary
student_dict = dict(zip(students, scores))
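One thing to watch: zip() stops at the shortest input, so mismatched lists silently lose data. A sketch of that behavior, with itertools.zip_longest as one way to keep every item; the shortened scores list is deliberate:

```python
from itertools import zip_longest

students = ['Alice', 'Bob', 'Charlie']
scores = [85, 92]   # One score is missing on purpose

# zip() silently drops 'Charlie'
truncated = list(zip(students, scores))

# zip_longest() pads the shorter input with a fill value
padded = list(zip_longest(students, scores, fillvalue=None))

# zip(*pairs) reverses the operation ("unzips")
names, values = zip(*truncated)
```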
Error Handling
Real data is messy. Handle errors gracefully or your entire pipeline breaks.
def safe_divide(num1, num2):
    """Safely divide two numbers."""
    try:
        return num1 / num2
    except ZeroDivisionError:
        print("Cannot divide by zero")
        return None
    except TypeError:
        print("Both values must be numbers")
        return None
Always handle edge cases in data science. Missing values, type mismatches, and division by zero are common!
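To see why this matters in a pipeline, here is a sketch that applies a compact variant of safe_divide to a batch of messy inputs. This version catches both exceptions in one clause and skips the print calls, which is an assumption made to keep the example short; the input pairs are invented.

```python
def safe_divide(num1, num2):
    """Return num1 / num2, or None if the division is not possible."""
    try:
        return num1 / num2
    except (ZeroDivisionError, TypeError):
        return None

# A zero denominator and a string would each crash a plain division
pairs = [(10, 2), (5, 0), ('7', 2), (9, 3)]

results = [safe_divide(a, b) for a, b in pairs]
valid = [r for r in results if r is not None]
```

The bad rows become None instead of stopping the loop, so the valid results can still be used downstream.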
Modules
Organize your code into reusable modules. Essential for building larger projects.
Importing Packages
# Import without alias
import pandas

# Import with alias
import pandas as pd

# Import specific object
from pandas import DataFrame
Creating Your Own Module
# data_utils.py
"""Utility functions for data science."""

def mean(data):
    """Calculate the mean of a dataset."""
    return sum(data) / len(data)

def median(data):
    """Calculate the median of a dataset."""
    sorted_data = sorted(data)
    n = len(sorted_data)
    if n % 2 == 0:
        return (sorted_data[n//2 - 1] + sorted_data[n//2]) / 2
    return sorted_data[n//2]
Using Modules
# Import your module
import data_utils

# Use functions
temperatures = [72, 68, 75, 82, 77]
avg = data_utils.mean(temperatures)
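A quick way to gain confidence in hand-written helpers like these is to compare them with the standard library's statistics module. In this sketch the two functions are repeated inline so it runs without a data_utils.py file on disk:

```python
import statistics

def mean(data):
    """Calculate the mean of a dataset."""
    return sum(data) / len(data)

def median(data):
    """Calculate the median of a dataset."""
    sorted_data = sorted(data)
    n = len(sorted_data)
    if n % 2 == 0:
        return (sorted_data[n//2 - 1] + sorted_data[n//2]) / 2
    return sorted_data[n//2]

temperatures = [72, 68, 75, 82, 77]

# Compare against the standard library implementations
our_mean, lib_mean = mean(temperatures), statistics.mean(temperatures)
our_median, lib_median = median(temperatures), statistics.median(temperatures)
even_median = median([1, 2, 3, 4])   # Even-length case averages the middle two
```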
Standard Library Modules
Python's standard library is a goldmine for data science:
collections
Advanced data structures for complex operations:
from collections import Counter, defaultdict

# Counter for frequency analysis
votes = ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob']
vote_counts = Counter(votes)
print(vote_counts.most_common(2))
# [('Alice', 2), ('Bob', 2)]

# defaultdict(list) creates an empty list the first time a key is used
student_scores = defaultdict(list)
student_scores['Alice'].append(95)
student_scores['Bob'].append(87)
csv
Read and write CSV files, a staple of data science work:
import csv

# Reading CSV files (newline='' is recommended by the csv docs)
with open('data.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row['name'], row['score'])
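Writing works much the same way with csv.DictWriter. The sketch below writes to an in-memory io.StringIO buffer instead of a real file so it runs anywhere; the rows and column names are invented for the example. Note that every value comes back as a string when read.

```python
import csv
import io

rows = [
    {'name': 'Alice', 'score': 95},
    {'name': 'Bob', 'score': 87},
]

# Write the rows as CSV into an in-memory text buffer
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['name', 'score'])
writer.writeheader()
writer.writerows(rows)

# Rewind and read the same data back
buffer.seek(0)
loaded = list(csv.DictReader(buffer))

# csv does not infer types: convert numeric columns yourself
scores = [int(row['score']) for row in loaded]
```

To write to disk instead, replace the buffer with open('data.csv', 'w', newline='') inside a with block.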
json
Work with JSON data from APIs and files:
import json

# Convert to JSON string
data = {'name': 'Alice', 'age': 22, 'scores': [95, 87, 92]}
json_string = json.dumps(data)

# Parse from JSON string
parsed = json.loads(json_string)
Most modern APIs return JSON. You'll use json constantly when working with external data sources.
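A round trip through JSON does not always return exactly what went in, because JSON has fewer types than Python. A small sketch showing pretty-printing and the type mapping; the record is made up for illustration:

```python
import json

record = {'name': 'Alice', 'scores': (95, 87), 'passed': True, 'mentor': None}

# indent and sort_keys make the output easier to read and compare
text = json.dumps(record, indent=2, sort_keys=True)

restored = json.loads(text)

# Tuples come back as lists; True/None map to JSON true/null and back
scores_type = type(restored['scores'])   # list, not tuple
```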
Lambda Functions
One-line functions for quick operations:
Use lambdas with map(), filter(), and sorted(). They're perfect for applying simple transformations.
# Lambda for quick calculations
square = lambda x: x ** 2

# Use with built-in functions
numbers = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x ** 2, numbers))

# Sorting with custom key
students = [
    {'name': 'Alice', 'score': 85},
    {'name': 'Bob', 'score': 92},
    {'name': 'Charlie', 'score': 78}
]

# Sort by score
sorted_students = sorted(students, key=lambda x: x['score'], reverse=True)
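The same idea applies to filter(), and for simple key lookups operator.itemgetter from the standard library is a common alternative to a lambda. A brief sketch reusing the data above:

```python
from operator import itemgetter

numbers = [1, 2, 3, 4, 5]

# filter() keeps the items for which the lambda returns True
evens = list(filter(lambda x: x % 2 == 0, numbers))

students = [
    {'name': 'Alice', 'score': 85},
    {'name': 'Bob', 'score': 92},
    {'name': 'Charlie', 'score': 78},
]

# itemgetter('score') behaves like lambda x: x['score']
by_score = sorted(students, key=itemgetter('score'), reverse=True)
top_name = by_score[0]['name']
```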
Quick Reference Summary
Bookmark this page for instant lookup!
Data Structures Cheat Sheet
Lists & Dictionaries - Your Main Tools
# Creating collections
[1, 2, 3]                     # List of numbers
['a', 'b', 'c']               # List of strings
{'name': 'Alice', 'age': 22}  # Dictionary
x[0]                          # Access first element
x[-1]                         # Access last element
x[1:4]                        # Slice elements 1-3
Quick Snippets:
len(x) - Get length
x.append(item) - Add to list
x.keys() / x.values() - Dictionary methods
'key' in x - Check membership
List Comprehensions vs Loops
# Old way
result = []
for x in data:
    result.append(x * 2)

# Pythonic way
result = [x * 2 for x in data]

# With condition
evens = [x for x in data if x % 2 == 0]
Dictionary Comprehensions
{x: x**2 for x in range(5)}    # Create mapping
{x: x for x in data if x > 0}  # Filter while mapping
Functions & Control Flow
Essential Patterns
# Define function
def clean_data(x):
    return x.strip()

# Lambda (one-liner)
lambda x: x * 2

# Error handling
try:
    result = x / y
except ZeroDivisionError:
    result = 0
Most Important Rules 🎯
- Lists → Ordered data, iterations
- Dictionaries → Key-value lookups
- Comprehensions → Fast transformations
- Functions → Reusable logic
- Error handling → Real-world resilience