🐍 Python and Numbers: The Definitive Guide for Social Researchers [Part 2 of 2]

Are you struggling with numerical data analysis in your social research? Do you spend hours processing surveys and calculating statistics manually? In this second part, you'll learn to master numerical operations in Python: from basic number handling to professional result formatting. You'll discover how to perform precise calculations, efficiently compare values, and present your results with professional clarity.

Although these concepts are the foundation of quantitative analysis in any field, we've designed each example with the real challenges that social researchers face in mind: from survey analysis to community program evaluation.

What will you learn? From basic operations to advanced formatting techniques, each section includes practical examples from social research: demographic analysis, survey processing, social program evaluation, and more.

📌 Quick Content Guide

  • Advanced operations for social analysis

  • Data comparison and results evaluation

  • Professional formatting techniques

  • Effective analysis documentation

Estimated reading time: 25 minutes

1️⃣ Advanced Operations in Python

How would you calculate the increase in inequality in your region over the last decade? Or the real impact of a social program across different communities? The basic operations from part one aren't enough for these complex analyses.

We need more sophisticated tools. Imagine processing thousands of survey responses in seconds, or comparing development indices between regions with mathematical precision. These advanced operations will give you that power, transforming raw data into meaningful insights about social reality.

And if your field is different, don't worry: these tools are like a sharp scalpel. Once you learn to use one, it serves equally well for complex surgery or for cutting paper. The fascinating thing about Python is precisely that: the same techniques you'll use to analyze social inequality can help you optimize a supply chain or predict market trends.

1.1. 🔢 Special Mathematical Operators

💡 Key Concept: Special operators allow efficient complex calculations in social research.

Imagine you're analyzing a city's population growth, or calculating how many work groups you can form with your survey respondents. Python has special operators that facilitate these calculations:

Operator | What does it do?          | Practical example
**       | Raises to a power         | population = 1000 * (1.02 ** 5)   # 2% growth for 5 years
//       | Division without decimals | groups = 85 // 4                  # 21 complete groups of 4 people
%        | Division remainder        | leftover = 85 % 4                 # 1 person without a group
abs()    | Absolute value            | difference = abs(north - south)   # Distance between regions

Let's see these operators in action with real data:

# ================================================
# Population and Group Analysis
# ================================================

# Initial data
current_population = 10000    # City inhabitants
growth_rate = 1.03           # 3% annual growth
projection_years = 5         # 5-year projection
group_size = 6               # People per survey group

# Project growth using **
future_population = current_population * (growth_rate ** projection_years)  # Population in 5 years

# Organize survey groups using // and %
total_surveyed = 85          # People to survey
complete_groups = total_surveyed // group_size  # How many groups we can form
people_without_group = total_surveyed % group_size # People left over

# Compare regions using abs()
north_population = 5800      # Northern zone inhabitants
south_population = 4200      # Southern zone inhabitants
difference = abs(north_population - south_population)  # Difference between zones

# Show results
print("=== Analysis Results ===")
print(f"Current population: {current_population:,} inhabitants")
print(f"Population in {projection_years} years: {future_population:,.0f} inhabitants")
print("\nSurvey organization:")
print(f"Complete groups: {complete_groups} groups of {group_size}")
print(f"People without group: {people_without_group}")
print("\nRegional comparison:")
print(f"North-South Difference: {difference:,} inhabitants")

💡 Notice how:

  • Population growth works like this:

    • growth_rate = 1.03 means the population grows 3% each year.

    • ** repeats the multiplication: 1.03 ** 5 = 1.03 × 1.03 × 1.03 × 1.03 × 1.03.

  • Group division works like this:

    • 85 // 6 = 14 (we divide 85 by 6 and keep the whole number part).

    • 85 % 6 = 1 (the remainder of dividing 85 by 6).

  • Regional comparison uses abs():

    • abs(5800 - 4200) = 1600.

    • abs(4200 - 5800) = 1600 as well.
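
The three explanations above can be checked directly in Python. The following is a minimal sketch that reuses the same numbers; the variable names by_hand, with_operator, and rebuilt_total are chosen here only for illustration.

```python
# Check 1: ** is repeated multiplication
by_hand = 1.03 * 1.03 * 1.03 * 1.03 * 1.03
with_operator = 1.03 ** 5
print(f"By hand: {by_hand:.4f}  With **: {with_operator:.4f}")  # Both show 1.1593

# Check 2: // and % together account for every person
total_surveyed = 85
group_size = 6
complete_groups = total_surveyed // group_size       # 14
people_without_group = total_surveyed % group_size   # 1
rebuilt_total = complete_groups * group_size + people_without_group
print(f"{complete_groups} groups x {group_size} + {people_without_group} = {rebuilt_total}")

# Check 3: abs() ignores the order of the subtraction
same_distance = abs(5800 - 4200) == abs(4200 - 5800)
print(f"Same distance both ways?: {same_distance}")  # True
```

If the second check ever prints a total different from the number of people surveyed, the group calculation has a mistake somewhere.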

💡 Warning about common errors:

  • When dividing numbers, make sure the denominator is not zero (example: you can't divide by 0 when calculating percentages).

  • When converting text to a number (like with float()), make sure the text is a valid number (example: "25" and "25.5" work, "twenty-five" doesn't).

  • In a future lesson, we'll learn specific techniques to handle these errors.
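
As a small preview, here is a minimal sketch of what these two warnings look like in practice. It uses an if/else check, which this guide has not formally introduced, and the counts are hypothetical values chosen only for illustration. Reporting 0% when there is no data is an assumption of this sketch, not a rule.

```python
# Hypothetical survey counts, for illustration only
yes_responses = 0
total_responses = 0          # Nobody has answered yet

# Warning 1: only divide when the denominator is not zero
if total_responses != 0:
    yes_percentage = (yes_responses / total_responses) * 100
else:
    yes_percentage = 0.0     # Assumption: report 0% when there is no data

print(f"'Yes' percentage: {yes_percentage:.1f}%")

# Warning 2: float() needs text that is a valid number
age_text = "25.5"
age = float(age_text)        # Works: "25.5" is a valid number
print(f"Age: {age}")
# float("twenty-five") would stop the program with a ValueError
```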

1.2. 🔄 Update Operators

💡 Key Concept: Update operators simplify variable modification during data analysis.

After seeing special operators, let's look at a more efficient way to update values. Are you tired of writing number = number + 1 every time you want to increase a value? Python has shortcuts that make this easier:

Operator | Instead of writing... | You can write... | What does it do?
+=       | total = total + 5     | total += 5       | Add and store
-=       | total = total - 3     | total -= 3       | Subtract and store
*=       | total = total * 2     | total *= 2       | Multiply and store
/=       | total = total / 4     | total /= 4       | Divide and store

Let's see these shortcuts in action with a survey example:

# ================================================
# Counting survey responses
# ================================================

# Start the count
total_responses = 0         # Nobody has responded yet
yes_responses = 0          # People who said "yes"
no_responses = 0           # People who said "no"

# Morning's first responses
total_responses += 20      # 20 responses came in
yes_responses += 12        # 12 people said yes
no_responses += 8          # 8 people said no

# Afternoon responses
total_responses += 30      # 30 more responses
yes_responses += 15        # 15 said yes
no_responses += 15        # 15 said no

# Adjust for invalid responses
total_responses -= 5       # Remove 5 incorrectly filled responses
yes_responses -= 3         # 3 were from "yes"
no_responses -= 2         # 2 were from "no"

# Show final results
print("=== Survey Results ===")
print(f"Total valid responses: {total_responses}")
print(f"'Yes' responses: {yes_responses}")
print(f"'No' responses: {no_responses}")
print(f"'Yes' percentage: {(yes_responses / total_responses) * 100:.1f}%")

💡 Notice how:

  • Each operator has a clear use:

    • += adds values: total_responses += 20 increases from 0 to 20, then to 50.

    • -= subtracts values: yes_responses -= 3 reduces the total when there are errors.

    • The numbers reflect real data that we're accumulating.

  • The code is shorter and clearer:

    • Instead of total = total + 20, we use total += 20

    • Instead of yes = yes - 3, we use yes -= 3

  • The data tells a story:

    • We see the flow of responses throughout the day.

    • We handle errors by removing invalid responses.

    • We calculate percentages at the end.
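
The survey example uses only += and -=. The other two shortcuts from the table work the same way; the following is a minimal sketch with a hypothetical workshop budget (the amounts are invented for illustration).

```python
# Hypothetical workshop budget, for illustration only
budget = 1200            # Dollars available for the workshop

budget *= 2              # A donor doubles the budget: 1200 becomes 2400
print(f"After donation: ${budget:,}")

budget /= 4              # Split evenly across 4 sessions: 2400 becomes 600.0
print(f"Per session: ${budget:,.2f}")
```

Note that /= always stores a decimal number (a float), even when the division is exact, which is why the last line formats the result with two decimals.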

Checkpoint: Advanced Operations

You've learned:

  • Use of special operators (**, //, %, abs()).

  • Growth and proportion calculations.

  • Group division and regional comparisons.

  • Update operators (+=, -=, *=, /=).

Ready to continue? Make sure you can perform basic calculations before moving forward.

🎯 Challenge: Basic Operations


Let's Practice! 🎯

Now that we know special mathematical operators (**, //, %, abs()) and update operators (+=, -=, *=, /=), let's use them in a real case.

Your challenge:

Analyze the monthly budget of a social program and calculate its impact on the number of participants.

The output you should get is:

=== Community Budget Analysis ===
Total budget: $5,000
Monthly investment: $416
Monthly remainder: $8

=== Program Impact ===
Initial participants: 100
Participants after 3 months: 126
Absolute difference: 26

Available data:

# Initial data
budget = 5000           # Annual budget in dollars
months = 12             # Program months
participants = 100      # Initial participants
growth = 1.08           # Monthly growth (8%)
period = 3              # Months elapsed

SOLUTION:

Here's the code commented step by step:

# Initial data
budget = 5000            # Annual budget in dollars
months = 12             # Program months
participants = 100      # Initial participants
growth = 1.08          # Monthly growth (8%)
period = 3             # Months elapsed


# 1. Calculate budget distribution
# Divide budget by program months
monthly_investment = budget // months     # Integer division for monthly amount
remainder = budget % months              # Remainder for leftover

# 2. Calculate impact on participants
# Apply growth rate for each elapsed month
final_participants = participants * (growth ** period)  # Exponential growth
difference = abs(final_participants - participants)     # Absolute difference

# 3. Show results
# Budget results
print("=== Community Budget Analysis ===")
print(f"Total budget: ${budget:,}")
print(f"Monthly investment: ${monthly_investment}")
print(f"Monthly remainder: ${remainder}")

# Impact results
print("\n=== Program Impact ===")
print(f"Initial participants: {participants}")
print(f"Participants after 3 months: {final_participants:.0f}")
print(f"Absolute difference: {difference:.0f}")

Excellent work! You've practiced with:

  • Special mathematical operators like ** and abs()

  • Update operators like += and -=

  • Calculations with real social research data.

But an important question arises: how do we know if these results are good or bad? How do we know whether our program is successful? To answer this, we need to learn how to compare numbers.

2️⃣ Comparing Numbers

Now that we've mastered basic operations, the next step is comparing results. How do we know if our social program is working, or whether the situation improved or worsened? For this, we need comparison tools.

2.1. ⚖️ Value Comparison

💡 Key Concept: Precise comparisons allow us to evaluate results and make data-driven decisions.

Increased or decreased? Better or worse? Goal achieved or not? Social analysis lives by these comparisons. When evaluating whether the wage gap is closing, if voter turnout exceeded expectations, or if a social program meets objectives, we need precise tools to compare numbers.

Python provides comparison operators that transform complex social questions into clear and precise analyses. With them we can:

  • Detect if inequality exceeds critical thresholds.

  • Identify communities that need urgent intervention.

  • Evaluate public policy progress against goals.

  • Compare social indicators across periods or regions.

Let's look at Python's tools for these comparisons. Each operator is like a specific question for our data:

Operator | Question                        | Practical use
>        | Is it greater than?             | poverty_rate > 0.30      # Above 30%?
<        | Is it less than?                | literacy < 0.95          # Less than 95%?
>=       | Is it greater than or equal to? | participation >= goal    # Did we reach the goal?
<=       | Is it less than or equal to?    | inequality <= 0.40       # Acceptable Gini?
==       | Is it exactly equal to?         | gender_ratio == 1.0      # Exact parity?
!=       | Is it different from?           | behavior != expected     # Atypical case?

Let's see these operators in action with a practical example:

# ======================================
# Community workshop evaluation
# ======================================

# Workshop information
participant_age = 25        # Age of person who wants to participate
minimum_age = 18           # Minimum allowed age
attendees = 30             # People who came today
maximum_capacity = 35      # Maximum allowed in room
satisfaction = 4.5         # Rating given (1 to 5)
minimum_satisfaction = 4.0 # Minimum acceptable rating

# Check workshop rules
can_participate = participant_age >= minimum_age       # Meets age requirement?
space_available = attendees < maximum_capacity         # Is there room?
good_evaluation = satisfaction >= minimum_satisfaction # Did they like it?

# Show evaluation results
print("=== Today's Workshop Evaluation ===")
print(f"Participant age: {participant_age} years")
print(f"Can participate?: {can_participate}")
print()
print(f"Current attendees: {attendees}")
print(f"Space available?: {space_available}")
print()
print(f"Satisfaction: {satisfaction} / 5")
print(f"Good evaluation?: {good_evaluation}")

💡 Notice how:

  • We use clear names for variables:

    • participant_age is clearer than just age

    • maximum_capacity tells us exactly what that 35 means.

    • Descriptive names make the code more understandable.

  • Comparisons always give True or False:

    • participant_age >= minimum_age gives True because 25 ≥ 18

    • attendees < maximum_capacity gives True because 30 < 35

    • satisfaction >= minimum_satisfaction gives True because 4.5 ≥ 4.0

  • We store each result in a variable with a meaningful name:

    • can_participate stores whether the age requirement is met.

    • space_available stores if there's room.

    • good_evaluation stores if satisfaction is sufficient.

  • The printed messages are clear and organized:

    • We use titles with === to separate sections.

    • We leave blank lines with print() for better readability.

    • Each message explains the meaning of the result.
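
The workshop example uses >= and <, but not == or !=. The following is a minimal sketch of those two operators with whole-number counts; all values and variable names are hypothetical, chosen only for illustration.

```python
# Hypothetical attendance data, for illustration only
registered = 30              # People who signed up
attended = 30                # People who actually came
expected_sessions = 8        # Sessions planned
held_sessions = 7            # Sessions that took place

# == asks "exactly equal?" and != asks "different?"
full_attendance = attended == registered
schedule_changed = held_sessions != expected_sessions

print(f"Full attendance?: {full_attendance}")    # True
print(f"Schedule changed?: {schedule_changed}")  # True
```

Keep in mind that a single = stores a value, while a double == compares two values.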

2.2. 🔗 Combining Conditions in Social Analysis

💡 Key Concept: Combining conditions allows us to perform complex evaluations of social programs.

Simple comparisons are useful, but in social research, we frequently need to evaluate multiple criteria at once. For example, for our community workshop, we might ask:

  • Does the person meet the minimum age AND is there space available?

  • Did the workshop have good attendance OR good evaluation?

  • Were the workshop objectives NOT met?

Python offers three powerful tools for combining conditions:

Operator | Meaning                              | Practical use
and      | All conditions must be True          | age >= 18 and space_available   # Can they enter?
or       | At least one condition must be True  | is_holiday or is_sunday         # Is it closed?
not      | Inverts True to False and vice versa | not space_available             # Is it full?

Let's see how to combine these conditions in our workshop example:

# ================================================
# Complete workshop evaluation
# ================================================

# Basic information
age = 25                  # Participant's age
space_available = True    # Yes, there are spots available
day = "sunday"           # What day is today
has_registration = False # Not previously registered

# Participation requirements
minimum_age = 18          # Minimum allowed age
is_weekend = day == "saturday" or day == "sunday"  # Check if it's weekend
is_open = not is_weekend  # Open on weekdays

# Combine conditions
can_enter = age >= minimum_age and space_available  # Age OK AND space available
needs_registration = not has_registration          # NOT registered
can_register = can_enter and needs_registration    # Meets everything

# Show results
print("=== Participant Evaluation ===")
print(f"Participant age: {age}")             # Show age
print(f"Space available?: {space_available}")  # Show availability
print(f"Today's day: {day}")                 # Show what day it is
print(f"Is it open today?: {is_open}")        # Show if it's open
print(f"Already registered?: {has_registration}") # Show registration status

print("\nResults:")                           # Separator for results
print(f"Can enter?: {can_enter}")            # Show if requirements are met
print(f"Needs to register?: {needs_registration}")  # Show if registration needed
print(f"Ready to register?: {can_register}")       # Show final decision

💡 Notice how:

  • We use logical operators to make decisions:

    • and: can_enter = age >= minimum_age and space_available

    • or: is_weekend = day == "saturday" or day == "sunday"

    • not: is_open = not is_weekend

  • We build complex decisions step by step:

    • Basic condition: can_enter = age >= minimum_age and space_available

    • Schedule verification: is_weekend = day == "saturday" or day == "sunday"

    • Final decision: can_register = can_enter and needs_registration

  • The code is like a conversation:

    • We ask: Does the person meet the age requirement? Is there space?

    • We verify: Is it the weekend? Is the person already registered?

    • We decide: Can they register?
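
The example above answers the first question from the start of this section (minimum age AND space available). The other two questions, good attendance OR good evaluation and objectives NOT met, could be sketched as follows. The attendance goal of 32 is a hypothetical value added for illustration, and treating "one of the two is enough" as success is an assumption of this sketch.

```python
# Workshop data (attendance_goal is a hypothetical value)
attendees = 30
attendance_goal = 32
satisfaction = 4.5
minimum_satisfaction = 4.0

# Two simple comparisons
good_attendance = attendees >= attendance_goal          # False: 30 is below 32
good_evaluation = satisfaction >= minimum_satisfaction  # True: 4.5 is at least 4.0

# or: one True condition is enough
successful_workshop = good_attendance or good_evaluation

# not: flips the result
objectives_not_met = not successful_workshop

print(f"Good attendance OR good evaluation?: {successful_workshop}")  # True
print(f"Objectives NOT met?: {objectives_not_met}")                   # False
```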

Checkpoint: Comparisons

You've mastered:

  • Comparison operators (>, <, >=, <=, ==, !=).

  • Combining conditions (and, or, not).

  • Evaluating results against goals.

Before continuing: Practice combining at least two conditions in your own example.


Let's Practice! 🎯

Now that we know how to compare values (>, <, >=, <=, ==, !=) and combine conditions (and, or, not), let's apply these concepts to a real social research case.

Your challenge:

You are part of a team evaluating scholarship applications. You need to create a program that automatically determines which students meet the minimum requirements.

The output you should get is:

=== Scholarship Evaluation ===
Student 1: Qualifies
- Sufficient average: True
- Complete attendance: True
- Meets requirements: True

Student 2: Does not qualify
- Sufficient average: True
- Complete attendance: False
- Meets requirements: False

Student 3: Does not qualify
- Sufficient average: False
- Complete attendance: True
- Meets requirements: False

SOLUTION:

# ================================================
# Student data and requirements
# ================================================

# Minimum program requirements
minimum_average = 8.0     # Minimum required average
minimum_attendance = 0.9  # 90% minimum attendance

# Student 1 Data
average_1 = 8.5          # Average grade (0-10)
attendance_1 = 0.95      # 95% attendance
age_1 = 20              # Age in years

# Student 2 Data
average_2 = 9.0          # Average grade (0-10)
attendance_2 = 0.80      # 80% attendance
age_2 = 19              # Age in years

# Student 3 Data
average_3 = 7.5          # Average grade (0-10)
attendance_3 = 0.98      # 98% attendance
age_3 = 22              # Age in years

# ================================================
# Criteria evaluation
# ================================================

# 1. Student 1 Evaluation
# Check if each minimum requirement is met
sufficient_average_1 = average_1 >= minimum_average            # Sufficient average?
complete_attendance_1 = attendance_1 >= minimum_attendance     # Sufficient attendance?
qualifies_1 = sufficient_average_1 and complete_attendance_1   # Meets everything?

# 2. Student 2 Evaluation
sufficient_average_2 = average_2 >= minimum_average            # Sufficient average?
complete_attendance_2 = attendance_2 >= minimum_attendance     # Sufficient attendance?
qualifies_2 = sufficient_average_2 and complete_attendance_2   # Meets everything?

# 3. Student 3 Evaluation
sufficient_average_3 = average_3 >= minimum_average            # Sufficient average?
complete_attendance_3 = attendance_3 >= minimum_attendance     # Sufficient attendance?
qualifies_3 = sufficient_average_3 and complete_attendance_3   # Meets everything?

# ================================================
# Show results
# ================================================

print("=== Scholarship Evaluation ===")

# Student 1 Results
print("Student 1:", "Qualifies" if qualifies_1 else "Does not qualify")
print(f"- Sufficient average: {sufficient_average_1}")
print(f"- Complete attendance: {complete_attendance_1}")
print(f"- Meets requirements: {qualifies_1}\n")

# Student 2 Results
print("Student 2:", "Qualifies" if qualifies_2 else "Does not qualify")
print(f"- Sufficient average: {sufficient_average_2}")
print(f"- Complete attendance: {complete_attendance_2}")
print(f"- Meets requirements: {qualifies_2}\n")

# Student 3 Results
print("Student 3:", "Qualifies" if qualifies_3 else "Does not qualify")
print(f"- Sufficient average: {sufficient_average_3}")
print(f"- Complete attendance: {complete_attendance_3}")
print(f"- Meets requirements: {qualifies_3}")

💡 Notice how:

We use comparison operators to evaluate requirements:

  • average_1 >= minimum_average checks if grades are sufficient.

  • attendance_1 >= minimum_attendance verifies attendance meets criteria.

  • Each comparison gives us a True/False result.

We build evaluations step by step:

  • First check grades: sufficient_average_1 = average_1 >= minimum_average

  • Then check attendance: complete_attendance_1 = attendance_1 >= minimum_attendance

  • Finally combine both: qualifies_1 = sufficient_average_1 and complete_attendance_1

The code is like a scholarship committee meeting:

  • We ask: Does the student have good enough grades?

  • We verify: Did they attend enough classes?

  • We decide: Do they qualify for the scholarship?

Checkpoint: Scholarship Evaluation

You've mastered:

  • Using comparison operators to evaluate academic requirements.

  • Combining conditions with 'and' to ensure all criteria are met.

  • Presenting results clearly for each student.

Before continuing: Try creating your own evaluation system with at least two requirements that must both be met for success.

3️⃣ Showing and Formatting Numbers

Now that we know how to operate and compare numbers, how do we present these results professionally? In this section, we'll learn how to capture numerical data and display it with the appropriate format for reports and research.

3.1. 📥 Capturing and Displaying Numbers

💡 Key Concept: The correct capture and presentation of data is fundamental for effective communication of results.

How do we ask for a respondent's age? Or how do we show the percentage of positive responses with two decimals? Python has specific tools for this:

Tool    | What is it for?    | Example
input() | Ask for data       | age = input("Age?: ")
float() | Convert to number  | age = float(age)            # From text to number
{:.2f}  | Show 2 decimals    | print(f"Age: {age:.2f}")    # 25.00
{:,}    | Separate thousands | print(f"Total: {1234:,}")   # 1,234

Here's the example code showing these tools in action:

# ================================================
# Capturing and displaying survey data
# ================================================

# Ask for basic data
age = input("What's your age?: ")        # Receive text
age = float(age)                         # Convert to number

responses = input("How many surveys did you complete?: ")
responses = float(responses)

satisfaction = input("Satisfaction level (1-5)?: ")
satisfaction = float(satisfaction)

# Calculate some indicators
survey_goal = 100                                  # Goal was 100 surveys
goal_percentage = (responses / survey_goal) * 100  # Share of the goal reached

# Show results with formatting
print("=== Survey Results ===")
print(f"Surveyor age: {age:.0f} years")     # No decimals for age
print(f"Surveys completed: {responses:,.0f}")   # With thousands separator
print(f"Average satisfaction: {satisfaction:.1f}/5")  # One decimal
print(f"Goal percentage: {goal_percentage:.1f}%")   # One decimal for percentage

💡 Notice how:

  • We always follow two steps when asking for numbers:

    • First, input() receives the text: age = input("What's your age?: ")

    • Then we convert to number because:

      • input() always gives text: "25" is text, not a number.

      • We can't calculate with text: "25" + "5" = "255".

      • float() fixes this: converts "25" to the number 25.0

  • Each data type uses its format:

    • Age without decimals: {age:.0f}

    • Large numbers with commas: {responses:,.0f}

    • Percentages with one decimal: {goal_percentage:.1f}%
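
This guide converts every answer with float(). For values that can only be whole numbers, such as a count of surveys, Python's built-in int() is an alternative. The following is a minimal sketch; the text values stand in for what input() would return, so the example runs without typing anything.

```python
# Simulated answers: in a real session these would come from input()
surveys_text = "234"
satisfaction_text = "4.5"

surveys = int(surveys_text)              # Whole number: 234
satisfaction = float(satisfaction_text)  # Decimal number: 4.5

print(f"Surveys completed: {surveys:,}")      # No .0f needed for whole numbers
print(f"Satisfaction: {satisfaction:.1f}/5")
# int("4.5") would stop the program with a ValueError: int() only accepts whole-number text
```

Using float() everywhere, as the guide does, is the safer default when you don't know whether the answer will contain decimals.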

3.2. ✨ Special Formats for Numbers

💡 Key Concept: Professional number formatting improves the readability and understanding of our analyses.

Now that we know how to capture and display basic numbers, let's take a step further. How do we present large numbers in a readable way? Or percentages in the correct format? Python has special formats for every need:

Format | What does it do?              | Example          | Result
{:,}   | Separates thousands           | f"{1234567:,}"   | 1,234,567
{:.2f} | 2 decimals                    | f"{3.14159:.2f}" | 3.14
{:.1%} | Percentage                    | f"{0.756:.1%}"   | 75.6%
{:>8}  | Right aligns using 8 spaces * | f"{123:>8}"      | '     123'
{:<8}  | Left aligns using 8 spaces *  | f"{123:<8}"      | '123     '
{:^8}  | Centers using 8 spaces *      | f"{123:^8}"      | '  123   '

*The number after >, < or ^ indicates the total space reserved for displaying the value. For example:

name = "Ana"
age = 25
print(f"|{name:>10}|")  # |       Ana|
print(f"|{name:<10}|")  # |Ana       |
print(f"|{name:^10}|")  # |   Ana    |

Let's see these formats in action with a research example:

# ================================================
# Presenting research results
# ================================================

# Data from our study
population = 1234567       # Region inhabitants
poverty_rate = 0.2345      # 23.45% of population
average_income = 1234.5    # Monthly income

# Show data without formatting
print("=== Data Without Formatting ===")
print(f"Population: {population}")          # 1234567
print(f"Poverty rate: {poverty_rate}")      # 0.2345
print(f"Income: {average_income}")          # 1234.5

# Show with appropriate formatting
print("\n=== Data With Formatting ===")
print(f"Population: {population:,} inhabitants")       # 1,234,567
print(f"Poverty rate: {poverty_rate:.1%}")             # 23.5%
print(f"Average income: ${average_income:.2f}")        # $1234.50

# Show aligned data (useful for tables)
print("\n=== Aligned Data ===")
print(f"{'Region':^10} {'Population':>10}")         # Centered titles
print(f"North:    {12345:>10,}")                    #     12,345
print(f"South:    {9876:>10,}")                     #      9,876
print(f"East:     {45678:>10,}")                    #     45,678

💡 Notice how:

  • Each format improves readability:

    • {population:,} → 1,234,567 is more readable than 1234567.

  • Decimal control:

    • {average_income:.2f} → $1234.50 (2 decimals for money).

    • {poverty_rate:.1%} → 23.5% (1 decimal for percentages).

    • The number after the point controls how many decimals to show.

  • Alignments create ordered tables:

    • Center titles: {'Region':^10} centers the word "Region".

    • Align numbers: {12345:>10,} aligns figures to the right.
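
Width, thousands separators, and percentages can be combined in the same row. The following is a minimal sketch of a three-column table; the regional figures and the column widths (8, 12, and 10 characters) are hypothetical choices made for illustration.

```python
# Hypothetical regional figures, for illustration only
header = f"{'Region':<8}{'Population':>12}{'Poverty':>10}"
north = f"{'North':<8}{12345:>12,}{0.25:>10.1%}"
south = f"{'South':<8}{9876:>12,}{0.5:>10.1%}"

print(header)
print(north)
print(south)
```

Because every row uses the same widths, each line is exactly 30 characters long and the columns line up.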

Checkpoint: Formatting

You can now:

  • Capture numerical data with input()

  • Format numbers with decimals and separators

  • Present results professionally

Tip: Create a small "reference sheet" with the formats you'll use most often.


Let's Practice! 🎯

Now that we know how to capture and format numbers, let's create a small social research report.

Your challenge:

Create a program that asks for:

  • Number of surveys conducted

  • Percentage of positive responses (as a number from 0 to 100)

  • Budget used

And generates a professionally formatted report.

The interaction should look like this:

=== Survey Data Capture ===
Number of surveys conducted: 234
Percentage of positive responses: 67.5
Budget used: 1500.50

=== RESULTS REPORT ===
Total surveys:        234
Positive responses:   67.5%
Total cost:         $1,500.50
Cost per survey:    $6.41

SOLUTION:

# Ask for basic data
print("=== Survey Data Capture ===")
# Enter: 234
surveys = input("Number of surveys conducted: ")
# Enter: 67.5
positive = input("Percentage of positive responses: ")
# Enter: 1500.50
budget = input("Budget used: ")

# Convert text to numbers
surveys = float(surveys)
positive = float(positive)
budget = float(budget)

# Calculate cost per survey
cost_per_survey = budget / surveys

# Show report with formatting
print("\n=== RESULTS REPORT ===")
print(f"Total surveys:        {surveys:.0f}")
print(f"Positive responses:   {positive:.1f}%")
print(f"Total cost:         ${budget:,.2f}")
print(f"Cost per survey:    ${cost_per_survey:.2f}")

💡 Notice how:

1. We use input() to ask for the data:

  • For "surveys" we enter: 234

  • For "positive" we enter: 67.5

  • For "budget" we enter: 1500.50

2. We convert to numbers with float()

3. We do a simple calculation by dividing

4. We format numbers in the report:

  • Surveys without decimals: .0f

  • Percentage with one decimal: .1f

  • Money with two decimals and thousands separator: ,.2f

Excellent! You now master how to capture data and present it with professional formatting. But there's something else we need to learn: how do we ensure other researchers understand our code and can replicate our work?

4️⃣ Documenting Numbers in Python

Imagine sharing your code with another researcher. Will they know what your variables mean? Will they understand what units your numbers are in? Clear documentation is crucial for making your work replicable and useful for the scientific community.

4.1. 📝 Comments and Units

💡 Key Concept: Clear documentation of units and measures is essential for the reproducibility of social research.

When working with numerical data in social research, clarity is crucial. How do we know if a rate is in percentage or proportion? Or if an income is in pesos or dollars? Let's see how to document our numbers clearly and professionally.

Let's look at an example of how to properly document units:

# Social survey data
participants = 150       # people
response_rate = 0.85    # proportion (0-1)
average_time = 25       # minutes per survey
budget = 5000          # pesos

# Average demographic data
average_age = 34.5     # years
monthly_income = 1250  # dollars
education_level = 14.2 # years of study

# Show results with clear units
print("=== Survey Summary ===")
print(f"Sample: {participants} people")
print(f"Response rate: {response_rate:.1%}")     # Convert to percentage
print(f"Time per survey: {average_time} min")
print(f"Budget: ${budget:,.2f}")

# Show demographic data
print("\n=== Average Profile ===")
print(f"Age: {average_age:.1f} years")
print(f"Income: ${monthly_income:,.2f}/month")
print(f"Education: {education_level:.1f} years")

💡 Notice how:

  • Each variable has its unit clearly specified:

    • In the name: average_time, monthly_income

    • In the comment: # people, # dollars

    • In the print: people, years, $/month

  • Comments are consistent:

    • Always after the number: 150 # people

    • Include ranges when necessary: (0-1)

    • Specify the currency: # dollars, # pesos
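
One way to make units even harder to miss is to put them in the variable name itself. The following is a minimal sketch of that idea; the suffixes _min, _pesos, _usd, and _prop are a hypothetical naming convention, not a Python rule or a standard.

```python
# Unit suffixes in the names (hypothetical convention, for illustration)
average_time_min = 25          # minutes per survey
budget_pesos = 5000            # pesos
monthly_income_usd = 1250      # US dollars
response_rate_prop = 0.85      # proportion (0-1)

# The unit shown in each message matches the suffix in the name
print(f"Time per survey: {average_time_min} min")
print(f"Budget: {budget_pesos:,} pesos")
print(f"Income: {monthly_income_usd:,} USD/month")
print(f"Response rate: {response_rate_prop:.1%}")
```

With names like these, a reader can tell pesos from dollars even in a line of code that has no comment.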

4.2. 📋 Format Conventions

💡 Key Concept: Consistent format conventions ensure the professionalism and clarity of our analyses.

Comments are essential, but equally important is maintaining a consistent style. In social research, consistency in format is crucial. Imagine a report where:

  • Rates sometimes appear as "0.234" and other times as "23.4%".

  • Incomes vary between "1234.5" and "$1,234.50".

  • Ages are shown with different decimals: "34", "34.0", "34.00".

To avoid confusion, we establish clear rules:

  • Rates and proportions: always in percentage with one decimal (23.4%).

  • Monetary values: always with symbol, thousands separator, and two decimals ($1,234.50).

  • Ages and counts: without decimals (34 years, 150 people).
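
These rules can also be written down once and reused, so every number of the same kind is formatted the same way. The following is a minimal sketch; the names RATE_FORMAT, MONEY_FORMAT, and COUNT_FORMAT are hypothetical, and storing the formats in variables is just one possible way to stay consistent.

```python
# One format per kind of value (hypothetical names, for illustration)
RATE_FORMAT = ".1%"       # Rates: percentage with one decimal
MONEY_FORMAT = ",.2f"     # Money: thousands separator and two decimals
COUNT_FORMAT = ".0f"      # Ages and counts: no decimals

poverty_rate = 0.234      # Proportion (0-1)
income = 1234.5           # Pesos
age = 34                  # Years

# The format after the colon is taken from the variable inside the inner braces
print(f"Poverty rate: {poverty_rate:{RATE_FORMAT}}")   # 23.4%
print(f"Monthly income: ${income:{MONEY_FORMAT}}")     # $1,234.50
print(f"Average age: {age:{COUNT_FORMAT}} years")      # 34 years
```

If the team later decides to show rates with two decimals, only the first line needs to change.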

Let's see an example of good and bad practices:

# ================================================
# ❌ INCORRECT WAY: Inconsistent Format
# ================================================
# Variables without clear standard
poverty_rate = 0.234       # Proportion or percentage?
income = 1234.5            # What currency? How many decimals?
age = 34.00                # Why two decimals?

# Results difficult to interpret
print("=== Inconsistent Results ===")
print(f"Poverty: {poverty_rate}")     # Shows 0.234
print(f"Income: {income}")            # Shows 1234.5
print(f"Age: {age}")                  # Shows 34.0

print("\n")  # Separator between examples

# ================================================
# ✓ CORRECT WAY: Consistent Format
# ================================================
# Variables with clear documentation
poverty_rate = 0.234       # Proportion (0-1) for calculations
income = 1234.50           # Pesos, two decimals
age = 34                   # Years, no decimals

# Results with professional format
print("=== Consistent Results ===")
print(f"Poverty rate: {poverty_rate:.1%}")         # Shows 23.4%
print(f"Monthly income: ${income:,.2f}")           # Shows $1,234.50
print(f"Average age: {age:.0f} years")             # Shows 34 years
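
One way to keep these rules consistent across a longer analysis is to write each rule once and reuse it. The sketch below assumes three small helper functions; the names format_rate, format_money and format_count are hypothetical and not part of this guide's earlier code.

```python
# Hypothetical helpers: each one encodes a single format convention

def format_rate(proportion):
    """Proportion (0-1) shown as a percentage with one decimal."""
    return f"{proportion:.1%}"

def format_money(amount, symbol="$"):
    """Monetary value with symbol, thousands separator, and two decimals."""
    return f"{symbol}{amount:,.2f}"

def format_count(value, unit):
    """Ages and counts without decimals, followed by the unit."""
    return f"{value:,.0f} {unit}"

poverty_rate = 0.234   # Proportion (0-1)
income = 1234.5        # Pesos
age = 34               # Years

print(f"Poverty rate: {format_rate(poverty_rate)}")      # Shows 23.4%
print(f"Monthly income: {format_money(income)}")         # Shows $1,234.50
print(f"Average age: {format_count(age, 'years')}")      # Shows 34 years
```

If a convention changes later (for example, two decimals for rates), only the helper needs to be edited.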

Checkpoint: Documentation

You've mastered:

  • Using clear and consistent comments.

  • Documenting units and measures.

  • Maintaining format standards.

  • Creating readable and replicable code.

Before continuing: Review your previous code and add appropriate documentation following these guidelines:

  • Add units to all numerical variables.

  • Use consistent formatting for similar types of data.

  • Include ranges and valid values in comments.

  • Make sure variable names are descriptive.
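
As an illustration of this checklist, here is a minimal sketch with hypothetical survey values. The assert lines are an optional extra that this guide has not covered: they turn the documented ranges into checks that stop the program if a value falls outside them.

```python
# Hypothetical survey values, for illustration only
households_surveyed = 150        # households (count, 0 or more)
response_rate = 0.82             # proportion (0-1)
monthly_income_pesos = 8500.00   # pesos per month (0 or more)

# Optional: check that each value respects its documented range
assert households_surveyed >= 0, "households_surveyed must be zero or more"
assert 0 <= response_rate <= 1, "response_rate must be between 0 and 1"
assert monthly_income_pesos >= 0, "monthly_income_pesos must be zero or more"

print(f"Households surveyed: {households_surveyed:,} households")
print(f"Response rate: {response_rate:.1%}")
print(f"Monthly income: ${monthly_income_pesos:,.2f}/month")
```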


Final Challenge! 🎯

It's time to put everything into practice! This challenge includes all the concepts we've learned:

  • Advanced operations.

  • Comparisons.

  • Number capture and formatting.

  • Clear documentation.

Your challenge:

You'll create a program to analyze citizen participation data that will:

  1. Capture voting data.

  2. Perform participation calculations.

  3. Compare with established goals.

  4. Present a professional report.

The interaction should look like this:

=== Citizen Participation Analysis ===

Enter voting data:
Total number of voters: 15000
Votes cast: 12345
Participation goal (%): 85

=== PARTICIPATION REPORT ===
Total voters:          15,000
Votes cast:           12,345
Current participation:  82.3%
Established goal:      85.0%
Difference from goal:  -2.7%

=== RESULTS ANALYSIS ===
Goal reached?:         No
Missing votes:         405

SOLUTION:

# Citizen participation data
# Ask for basic data
print("=== Citizen Participation Analysis ===\n")

# Enter: 15000
total_voters = float(input("Total number of voters: "))
# Enter: 12345
votes_cast = float(input("Votes cast: "))
# Enter: 85
participation_goal = float(input("Participation goal (%): "))

# Calculate indicators
current_participation = (votes_cast / total_voters) * 100               # Percentage (0-100)
goal_difference = current_participation - participation_goal            # Percentage points
missing_votes = (participation_goal / 100 * total_voters) - votes_cast  # Votes needed to reach the goal

# Show report with professional formatting
print("\n=== PARTICIPATION REPORT ===")
print(f"Total voters:          {total_voters:,.0f}")
print(f"Votes cast:           {votes_cast:,.0f}")
print(f"Current participation:  {current_participation:.1f}%")
print(f"Established goal:      {participation_goal:.1f}%")
print(f"Difference from goal:  {goal_difference:+.1f}%")

print("\n=== RESULTS ANALYSIS ===")
print(f"Goal reached?:         {'Yes' if current_participation >= participation_goal else 'No'}")
print(f"Missing votes:         {abs(missing_votes):.0f}")
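
One case to keep in mind: abs() works for the sample data because the goal was not reached, but if participation exceeds the goal it would report the surplus as "Missing votes". A possible variant, assuming you would rather show 0 in that case, uses the built-in max(). The function name missing_votes_for_goal is hypothetical.

```python
def missing_votes_for_goal(total_voters, votes_cast, participation_goal):
    """Votes still needed to reach the goal (given in %); 0 if already reached."""
    votes_needed = participation_goal / 100 * total_voters
    return max(0, votes_needed - votes_cast)

# Same figures as the sample interaction: goal not reached
print(f"Missing votes:         {missing_votes_for_goal(15000, 12345, 85):,.0f}")  # Shows 405

# Hypothetical case where the goal is exceeded
print(f"Missing votes:         {missing_votes_for_goal(15000, 13000, 85):,.0f}")  # Shows 0
```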

💡 Notice how we used everything we learned:

  1. Advanced operations:

    • Percentage calculations with * and /

    • Absolute value with abs() for missing votes.

  2. Comparisons:

    • We check if we reached the goal.

    • We calculate percentage differences.

  3. Number formatting:

    • Thousands separator: {:,}

    • Percentages with one decimal: {:.1f}%

    • Sign in differences: {:+.1f}%

  4. Documentation:

    • Explanatory comments.

    • Units specified.

    • Descriptive variable names.
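
A detail worth checking in your own code is which percentage format matches your variable. The short sketch below, using the figures from the challenge, compares {:.1%} (for proportions between 0 and 1) with {:.1f}% (for values already multiplied by 100).

```python
participation = 12345 / 15000             # Proportion (0-1)
participation_pct = participation * 100   # Percentage (0-100)

# {:.1%} multiplies by 100 and adds the % sign itself
print(f"{participation:.1%}")             # Shows 82.3%

# {:.1f}% only rounds, so the value must already be a percentage
print(f"{participation_pct:.1f}%")        # Shows 82.3%

# Mixing the two up runs without error but gives a wrong figure
print(f"{participation_pct:.1%}")         # Shows 8230.0%

# Thousands separator and explicit sign
print(f"{15000:,}")                       # Shows 15,000
print(f"{-2.7:+.1f}%")                    # Shows -2.7%
print(f"{2.7:+.1f}%")                     # Shows +2.7%
```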

Congratulations! If you completed this final challenge, you've demonstrated that you can:

  • Capture and process numerical data.

  • Perform calculations and comparisons.

  • Present results professionally.

  • Document your code clearly.

Before moving on to the next part of our series, let's review everything we've learned.

In summary: Mastering numbers in Python for social research

We've explored the fundamental tools for working with numerical data in Python. From basic operations to professional presentation techniques, you now have the necessary skills to handle quantitative data in your social research.

What we learned:

  1. Advanced Operations

    • Powers and roots with **

    • Integer division with //

    • Remainder with %

    • Absolute value with abs()

    • Update operators (+=, -=, *=, /=)

  2. Number Comparison

    • Basic operators (>, <, >=, <=, ==, !=)

    • Combining conditions (and, or, not)

    • Evaluating results and goals

  3. Capture and Format

    • Using input() for numerical data

    • Converting with float()

    • Thousands format: {:,}

    • Controlled decimals: {:.2f}

    • Percentages: {:.1%}

  4. Professional Documentation

    • Explanatory comments.

    • Unit specification.

    • Descriptive variable names.

    • Consistent format.
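
As a compact recap, the sketch below combines most of these tools in one place. All figures are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical figures, for illustration only
responses = 0              # people
responses += 80            # first survey wave
responses += 77            # second survey wave (total: 157)

group_size = 12                            # people per focus group
full_groups = responses // group_size      # Integer division: 13 groups
left_over = responses % group_size         # Remainder: 1 person

variance_age = 6.25                        # years squared
std_dev_age = variance_age ** 0.5          # Square root with ** 0.5

gini_gap = abs(0.42 - 0.47)                # Size of the gap, ignoring its sign

# Combined comparison: at least 12 full groups and at most 3 people left over
enough_groups = full_groups >= 12 and left_over <= 3

response_share = responses / 200           # Proportion (0-1) of 200 people invited

print(f"Responses:      {responses:,} people")
print(f"Focus groups:   {full_groups} full, {left_over} left over")
print(f"Age std. dev.:  {std_dev_age:.2f} years")
print(f"Gini gap:       {gini_gap:.2f}")
print(f"Response share: {response_share:.1%}")
print(f"Enough groups?: {'Yes' if enough_groups else 'No'}")
```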

What's next?

This guide about numbers is just the beginning of your journey with Python. In upcoming installments we'll explore:

  1. Built-in Functions

    • Discover the tools that Python comes with "out of the box".

    • Learn to use len(), sum(), max(), min() and more.

    • Optimize your code by leveraging these powerful functions.

  2. Data Structures: Lists and Dictionaries

    • Organize your data in more complex structures.

    • Manipulate data sets efficiently.

    • Learn to process multiple survey responses at once.

Ready to keep learning? See you in the next installment about Python's built-in functions!


Did you find this guide useful? Share it with other social researchers and leave us your comments about what other topics you'd like us to cover!

 