3  Introduction to Python

Python is a versatile and powerful programming language that is widely used in data science for its simplicity, readability, and flexibility. It comes with a rich ecosystem of libraries and tools specifically designed for data analysis, statistical modeling, and machine learning. Python’s user-friendly syntax makes it an ideal choice for beginners in programming and allows for quick prototyping and iteration of data science projects.

In data science, Python is important because of its extensive libraries such as NumPy, Pandas, Seaborn, Plotly and scikit-learn, which provide advanced data manipulation, data analysis, and machine learning capabilities. These libraries enable data scientists to efficiently clean and preprocess data, perform complex statistical analysis, build and train machine learning models, and visualize the results.

As far as artificial intelligence is concerned, libraries such as TensorFlow, Keras and PyTorch, among others, allow you to develop and deploy your own AI-based services. More recently, the community of developers and vendors involved in what is known as generative AI tends to adopt a Python-first approach when releasing models and services. In this sense, being proficient in Python will give you an initial advantage.

3.1 Python as a programming language

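Python is a general-purpose, high-level programming language. Unlike languages such as C, which are compiled to machine code ahead of time, Python is usually described as an interpreted language: the standard interpreter (CPython) translates the source code into bytecode and executes it statement by statement at runtime. This makes it easy to run and test code interactively, which helps with debugging, but it generally costs performance compared to compiled languages.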

One of Python’s standout features is its dynamic typing. You don’t need to declare the types of variables explicitly; the type is determined by the value you assign. For instance, if you write x = 10, x refers to an integer, but if you later write x = "hello", x refers to a string instead. This ease of variable handling contributes to Python’s flexibility and simplicity.
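
A minimal sketch of dynamic typing in action: the built-in type() function reports the type of the value a name currently refers to, and the answer changes when the name is reassigned.

```python
# The same name can refer to values of different types over time.
x = 10
first_type = type(x).__name__
print(first_type)   # int

x = "hello"
second_type = type(x).__name__
print(second_type)  # str
```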

Memory management in Python is handled through automatic garbage collection. This means the language itself takes care of allocating and deallocating memory, so programmers don’t have to worry about manual memory management like in C or C++. This feature reduces the risk of memory leaks and makes Python a safer language to work with.

Python’s rich standard library is another significant asset. It includes modules for networking, file I/O, text processing, and many other tasks. For example, by importing the csv module, you can read and write tabular data files, and by importing the os module, you can interact with the operating system. (Popular data science libraries such as pandas are not part of the standard library; they are third-party packages that must be installed separately.) This means that you rarely need to write code from scratch, as much functionality is already available.
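
As a small illustration, the sketch below uses three standard-library modules (os, pathlib and math) without installing anything. The folder and file names are hypothetical and are only used to build a path; no file is actually read.

```python
import math
import os
from pathlib import Path

# os: ask the operating system for the current working directory.
current_dir = os.getcwd()
print("Working directory:", current_dir)

# pathlib: build a file path in a platform-independent way.
# 'data' and 'example.csv' are hypothetical names used only for illustration.
example_path = Path(current_dir) / "data" / "example.csv"
print("File name:", example_path.name)    # example.csv
print("Extension:", example_path.suffix)  # .csv

# math: ready-made mathematical functions.
print("Square root of 16:", math.sqrt(16))  # 4.0
```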

Python supports both Object-Oriented Programming (OOP) and functional programming paradigms. You can define classes and create objects, or you can use functions as first-class citizens to build your programs. This flexibility allows developers to choose the most suitable approach for their specific needs, whether they’re working on small scripts or large, complex applications.
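
To make the two styles concrete, here is a minimal sketch. The Circle class and the square function are hypothetical names invented for this illustration.

```python
# Object-oriented style: a class bundles data with behaviour.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        # Uses the same approximation of pi as elsewhere in this chapter.
        return 3.14159 * self.radius ** 2


circle = Circle(2)
print(circle.area())


# Functional style: functions are values that can be passed to other functions.
def square(n):
    return n * n


squares = list(map(square, [1, 2, 3, 4]))
print(squares)  # [1, 4, 9, 16]
```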

The language emphasizes clean, readable code through the use of indentation to define code blocks, replacing the need for curly braces or keywords. This results in code that is easier to read and maintain. For example, a simple function to greet someone might look like this:

def greet(name):
    print(f"Hello, {name}!")

In terms of performance, Python does have some limitations. Being an interpreted language, it is generally slower than compiled languages due to interpreter overhead. Additionally, in CPython (the reference implementation), the Global Interpreter Lock (GIL) ensures that only one thread executes Python bytecode at a time, which can be a bottleneck in CPU-bound multithreaded programs.

Broadly speaking, a Python program relies on:

  • Variables to store values and data

  • Operators to perform operations on variables and values

  • Statements as units of code that have an effect

Before we move on to the detailed explanation of each of these three concepts, please have a look at the following Python example.
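
A minimal sketch with made-up values, showing all three concepts in a few lines:

```python
# Variables: names that store values (the values here are made up).
price = 20.0
quantity = 3

# Operators: * multiplies two values, > compares them.
total = price * quantity
is_large_order = total > 50

# Statements: each line above is an assignment statement;
# the print() calls below are expression statements.
print("Total:", total)                 # Total: 60.0
print("Large order?", is_large_order)  # Large order? True
```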

3.1.1 Variables in Python

In Python, a variable is essentially a name that you give to a value stored in your program. Think of it like a labeled box where you can keep things organized. Instead of just carrying around a bunch of numbers or strings, you can put them into boxes (variables) with names that make sense to you.

To create a variable, you simply need to give it a name and assign it a value using the equals sign (=).

age = 21  # Variable 'age' is assigned the value 21
name = "Alice"  # Variable 'name' is assigned the value "Alice"
pi = 3.14159  # Variable 'pi' is assigned the value 3.14159

Tip. Variable Naming Rules

There are a few rules to remember when creating variables:

  • Start with a letter or underscore (_).
  • Follow with letters, numbers, or underscores.
  • Case-Sensitive: age, AGE, and Age are different variables.
  • No reserved keywords: Don’t use words that Python has reserved for its syntax like class, if, while, etc.
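
These rules can be checked programmatically. The sketch below combines the string method isidentifier() with the standard-library keyword module; the helper name is_valid_variable_name and the candidate names are made up for this illustration.

```python
import keyword

# Candidate names chosen for illustration.
candidates = ["age", "_total", "user_2", "2nd_place", "class", "my-var"]


def is_valid_variable_name(name):
    # str.isidentifier() checks the character rules;
    # keyword.iskeyword() rejects reserved words such as 'class'.
    return name.isidentifier() and not keyword.iskeyword(name)


for name in candidates:
    print(name, "->", is_valid_variable_name(name))
```

Only the first three candidates pass: 2nd_place starts with a digit, class is a reserved keyword, and my-var contains a hyphen.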

Variables in Python are dynamically typed, which means you do not need to declare their type explicitly. The type is inferred from the value you assign to them. Common data types include:

Integers: Whole numbers (int)

age = 21  # Variable 'age' is assigned the value 21

Floating-Point Numbers: Decimal numbers (float)

temperature = 98.6

Strings: Textual data (str)

message = "Hello, world!"

Booleans: True or False (bool)

is_sunny = True

There are other data types (e.g. dictionaries, tuples), but for the time being these four are enough for our purposes.
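
As a quick check, the built-in type() function shows which of these types a value has, and the functions int(), float(), str() and bool() convert between them. The values below are illustrative only.

```python
age = 21
temperature = 98.6
message = "Hello, world!"
is_sunny = True

# Collect the type name of each value.
type_names = [type(value).__name__ for value in (age, temperature, message, is_sunny)]
print(type_names)  # ['int', 'float', 'str', 'bool']

# Converting between types.
number_from_text = int("42")    # the string "42" becomes the integer 42
decimal_from_text = float("3.5")  # the string "3.5" becomes the float 3.5
text_from_number = str(age)     # the integer 21 becomes the string "21"
print(number_from_text + 1)     # 43
```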

You can use variables to store data and perform operations on them.

result = age + 5  # Adds 5 to the value of 'age'
greeting = "Hello, " + name  # Concatenates "Hello, " with the value of 'name'

Once you have a variable defined, you can use it in your program. You can print it, do math with it, or even manipulate it in various ways:

print("Your age is: " + str(age))
Your age is: 21
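
The call to str() above is needed because Python does not join a string and a number automatically. The sketch below shows the error you would get without it, and an f-string (the same syntax used in the greet function earlier) as an alternative.

```python
age = 21

# Concatenating a string and an integer directly raises a TypeError.
try:
    text = "Your age is: " + age
except TypeError as error:
    print("TypeError:", error)

# Converting explicitly with str() works, and so does an f-string.
with_str = "Your age is: " + str(age)
with_fstring = f"Your age is: {age}"
print(with_fstring)  # Your age is: 21
```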

Reassigning variables. You can change the value of a variable at any time.

age = 22  # Now the variable 'age' holds the value 22
pi = 12  # Changing the value of 'pi' to 12

print("Your new age is: " + str(age))
print("The new value of pi is: " + str(pi))
Your new age is: 22
The new value of pi is: 12

Tip. Variable Naming Best Practices
  • Use descriptive names for variables to make your code easier to understand.
  • Keep variable names consistent in style (e.g., use snake_case or camelCase consistently).

So, in a nutshell, variables in Python are just labels for storing data values. They help you manage and manipulate data in your programs easily.

3.1.2 Statements in Python

In Python, a statement is essentially a piece of code that performs some action. If you think of writing code like writing sentences in English, each statement is like a sentence that tells the computer what to do. The basic statements in Python are:

Assignment Statements: These are the statements where you assign values to variables.

age = 21
name = "Alice"

Expression Statements: These evaluate expressions, which are combinations of variables, operators, and values that can be evaluated to a single value.

print(age + 2)  # This prints the result of age + 2
23
print(13**2)  # This raises 13 to the power of 2 and prints the result
169

Import Statements: These load modules, which are files containing reusable Python code, often written by others.

import math  # Imports the math module from the standard library
print(math.sqrt(16))  # Uses the sqrt function from the math module
4.0
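
Two common variations of the import statement are sketched below: importing a module under an alias, and importing specific names from it. The alias m is an arbitrary choice for this illustration.

```python
import math as m           # import the module under a shorter alias
from math import sqrt, pi  # import specific names directly

print(m.floor(3.7))  # 3
print(sqrt(25))      # 5.0
print(round(pi, 2))  # 3.14
```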

In future chapters you will learn about other types of statements (e.g. loops, conditionals).

In summary:

  • Statements are the building blocks of Python programs.
  • Each statement instructs the computer to perform a specific task.
  • Statements can be simple (like assignments) or complex (like control flow statements).

3.1.3 Operators in Python

Operators are special symbols or keywords that you use to perform operations on variables and values. Think of operators as the tools or glue that let you put together and manipulate data. Here’s a breakdown of the different types of operators and how they’re used.

Arithmetic Operators: These operators perform basic arithmetic calculations such as addition, subtraction, multiplication, division, remainder, and exponentiation.

a = 5
b = 3
result = a + b  # result is 8
result = a * b  # result is 15
result = a / b  # result is 1.666...
result = a % b  # result is 2 (the remainder of 5 divided by 3)
result = a ** b  # result is 125 (5 raised to the power of 3)
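
Three further points are worth a short sketch, using the same values of a and b: subtraction, floor division (//), and the order in which operators are applied.

```python
a = 5
b = 3

difference = a - b    # 2
floor_result = a // b  # 1 (division rounded down to a whole number)

# Multiplication is applied before addition unless parentheses say otherwise.
no_parentheses = a + b * 2      # 11, because b * 2 is computed first
with_parentheses = (a + b) * 2  # 16, because a + b is computed first

print(difference, floor_result, no_parentheses, with_parentheses)
```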

Comparison Operators: These operators compare two values and return a Boolean value (True or False).

a = 5
b = 3
(a == b)  # is a equal to b?
False
a = 5
b = 3
(a != b)  # is a not equal to b?
True
a = 5
b = 3
(a > b)  # is a greater than b?
True
a = 5
b = 3
(a < b)  # is a less than b?
False
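
The remaining comparison operators, >= and <=, work the same way. Python also lets you chain comparisons, as in the sketch below (same values of a and b).

```python
a = 5
b = 3

at_least = a >= b      # is a greater than or equal to b?
at_most = a <= b       # is a less than or equal to b?
in_range = 1 < a < 10  # chained comparison: is a between 1 and 10?

print(at_least)  # True
print(at_most)   # False
print(in_range)  # True
```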

Logical Operators: These operators (and, or, not) are used to combine or negate conditions, that is, expressions that evaluate to True or False.

a = 5
b = 3
(a > 2 and b < 5)  # is a greater than 2 AND b less than 5 at the same time?
True
(a < 2 or b > 1)  # is a less than 2 OR b greater than 1?
True
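
The third logical operator, not, negates a condition. The sketch below also shows that and stops evaluating as soon as the result is known, which can be used to guard a division. The variable zero is introduced only for this illustration.

```python
a = 5
b = 3

negated = not (a > b)  # a > b is True, so negated is False
print(negated)  # False

# 'and' evaluates its right-hand side only if the left-hand side is True.
zero = 0
guarded = (zero != 0) and (a / zero > 1)  # the division is never attempted
print(guarded)  # False

safe_ratio_check = (b != 0) and (a / b > 1)  # 5 / 3 is greater than 1
print(safe_ratio_check)  # True
```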

So, operators are fundamental tools in Python programming that let you manipulate and evaluate values to make decisions, perform calculations, and manage data efficiently.

3.2 Python in practice

As with any other language, the best way to learn Python is to read code written by others and to write your own. These days it is also possible to use generative AI-based services (e.g. ChatGPT, Gemini) to speed up programming.

Generative AI for coding purposes

You will learn how to use generative AI for data science purposes later on. For the time being, however, feel free to submit the following prompt to ChatGPT:

“Give me python code to sort a list of 15 numbers”.

You will get something similar to the following:

# Define the list of 15 numbers
numbers = [23, 1, 45, 9, 12, 78, 34, 65, 29, 10, 5, 2, 47, 88, 19]

# Sort the list in ascending order using sorted()
sorted_numbers = sorted(numbers)

# Print the sorted list
print("Sorted list:", sorted_numbers)
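
Generated code is a good starting point for experiments. As a possible variation (not part of the generated answer), the sketch below sorts the same list in descending order and contrasts sorted(), which returns a new list, with the list method sort(), which changes the list in place.

```python
numbers = [23, 1, 45, 9, 12, 78, 34, 65, 29, 10, 5, 2, 47, 88, 19]

# sorted() returns a new list and leaves the original untouched.
descending = sorted(numbers, reverse=True)
print(descending[:3])  # [88, 78, 65]
print(numbers[:3])     # [23, 1, 45]

# list.sort() reorders the existing list in place and returns None.
numbers.sort()
print(numbers[:3])     # [1, 2, 5]
```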

First Example

Let’s have a look at the following Jupyter Notebook. It provides examples that take some information from the user (e.g. two numbers), perform some operations (e.g. multiplication), and produce an output.

Try to expand the example by developing some code that: (1) takes two numbers Base and Exponent and then (2) raises Base to the power of Exponent. Try to work it out on your own before checking the solution below:

# Taking input for the base (a) and exponent (b)
a = float(input("Enter the base (a): "))  # Example: 2
b = float(input("Enter the exponent (b): "))  # Example: 3

# Performing the power operation
result = a ** b  # This is a raised to the power of b

# Outputting the result
print("a raised to the power of b is:", result)
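
The solution above stops with an error if the user types something that is not a number. One possible refinement, sketched below, moves the conversion into a function that returns None for invalid input. The function name power_from_text is made up for this illustration, and try/except is a construct covered in more detail elsewhere.

```python
def power_from_text(base_text, exponent_text):
    """Convert two text inputs to numbers and return base ** exponent.

    Returns None if either input cannot be converted to a float.
    """
    try:
        base = float(base_text)
        exponent = float(exponent_text)
    except ValueError:
        return None
    return base ** exponent


print(power_from_text("2", "3"))    # 8.0
print(power_from_text("two", "3"))  # None
```

With real user input, the same function could be called as power_from_text(input("Enter the base (a): "), input("Enter the exponent (b): ")).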

Second Example

The following Jupyter Notebook illustrates how to use variables, data types (integers, floats, strings, lists), basic arithmetic operations, and printing. It also shows how to import the math and numpy libraries for mathematical calculations and how to use functions such as math.log, math.floor, math.pow, math.sqrt, and numpy.matmul, as well as how to compare two numbers using ==. Feel free to tweak the code and see the result!
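
A short sketch, independent of the notebook, of the math functions named above. numpy.matmul is left out so that the snippet runs with the standard library alone; the input values are arbitrary.

```python
import math

natural_log = math.log(math.e)  # natural logarithm; approximately 1.0
rounded_down = math.floor(3.7)  # largest whole number not above 3.7
power = math.pow(2, 3)          # 2 raised to the power of 3, as a float
root = math.sqrt(16)            # square root of 16

print(rounded_down)  # 3
print(power)         # 8.0
print(root)          # 4.0

# Comparing two numbers with ==: the float 8.0 equals the integer 8.
print(power == 2 ** 3)  # True
```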

Third Example

In this Jupyter Notebook you can further practice how to manipulate variables and perform mathematical operations. Feel free to tweak the code and see the result!

3.3 Conclusion

This chapter provided a foundational overview of Python, covering essential programming concepts such as variables, statements, and operators. By working through the examples and labs, you have gained a basic understanding of how to use Python for data-related purposes. The next chapter will illustrate how to use generative AI-based assistants to develop and debug code.

3.4 Further Readings

For those of you interested in more advanced Python concepts please feel free to check the following references: