Calculating Average with Error Handling

mikasa · (This post was last modified: May-03-2024, 09:44 AM by Gribouillis.)

I'm trying to write a Python program to calculate the average of a list of numbers, but I'm getting an error message: "TypeError: unsupported operand type(s) for +: 'int' and 'str'".

numbers = [1, 2, 3, "four", 5, "seven", "nine"]

def calculate_average(numbers):
  total = 0
  count = 0
  for num in numbers:
    try:
      total += int(num)  # Attempt to convert string to integer
      count += 1
    except ValueError:  # Handle conversion errors
      print(f"Error: Could not convert '{num}' to a number.")
  if count > 0:
    average = total / count
    return average
  else:
    return None  # Return None if no valid numbers found

average = calculate_average(numbers)

if average:
  print(f"The average of the numerical values is: {average}")
else:
  print("No valid numbers found in the list.")

Can someone help me point out what's wrong and suggest a solution?
Gribouillis wrote May-03-2024, 09:44 AM:
Clickbait link removed. Please read the Help Documents.

sawtooth500 · (This post was last modified: May-03-2024, 05:56 PM by sawtooth500.)

So are you just trying to do this manually as an exercise? If so, that's fine, but if you just need the average... you are working way too hard.

Just convert your entire list to ints, turn it into a numpy array, and use numpy's built-in function to find the mean.
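
A minimal sketch of that suggestion, assuming numpy is installed and that entries which cannot be converted should simply be skipped. The helper name to_int_or_none is made up for illustration and is not part of numpy:

```python
import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine"]

def to_int_or_none(value):
    """Hypothetical helper: return int(value), or None if it cannot be converted."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return None

converted = [to_int_or_none(v) for v in numbers]
valid = np.array([v for v in converted if v is not None])

# np.mean of an empty array gives nan plus a warning, so guard against it
average = float(np.mean(valid)) if valid.size else None
print(average)
```

With this list the four convertible values are 1, 2, 3 and 5, so the printed mean is 2.75.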

***snippsat*** · (This post was last modified: May-05-2024, 08:27 AM by snippsat.)

(May-03-2024, 08:46 AM)mikasa Wrote: but I'm getting an error message: "TypeError: unsupported operand type(s) for +: 'int' and 'str'".

You should not get that error message. This is what I get when I run your code with, e.g., Python 3.12.

Output:
Error: Could not convert 'four' to a number.
Error: Could not convert 'seven' to a number.
Error: Could not convert 'nine' to a number.
The average of the numerical values is: 2.75

So it works as it should, with no TypeError.
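
For reference, one way that exact message can arise is when a string is added to the running total without the int() conversion. This is only a guess at what an earlier version of the code may have looked like; the function name naive_total is hypothetical:

```python
numbers = [1, 2, 3, "four", 5]

def naive_total(values):
    """Assumed variant of the loop that adds values without converting them."""
    total = 0
    for value in values:
        total = total + value  # no int() call, so a str reaches the + operator
    return total

try:
    naive_total(numbers)
except TypeError as exc:
    message = str(exc)
    print(f"TypeError: {message}")
```

Note that int("four") raises ValueError rather than TypeError, which is why the posted code with the try/except never shows this message.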

To clean it up a little:

def calculate_average(numbers):
    total = 0
    count = 0
    for num in numbers:
        try:
            total += int(num)
            count += 1
        except ValueError:
            print(f"Error: Could not convert '{num}' to a number.")
    if count > 0:
        average = total / count
        return average

if __name__ == "__main__":
    numbers = [1, 2, 3, "four", 5, "seven", "nine"]
    average = calculate_average(numbers)
    if average is not None:
        print(f"The average of the numerical values is: {average}")
    else:
        print("No valid numbers found in the list.")

Output:
Error: Could not convert 'four' to a number.
Error: Could not convert 'seven' to a number.
Error: Could not convert 'nine' to a number.
The average of the numerical values is: 2.75
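
One edge case worth keeping in mind when testing the result: an average of exactly 0 is falsy, so a bare truthiness check cannot tell it apart from None. A small sketch with a made-up input list, using a trimmed copy of the function above:

```python
def calculate_average(numbers):
    total = 0
    count = 0
    for num in numbers:
        try:
            total += int(num)
            count += 1
        except ValueError:
            pass  # skip values that cannot be converted
    if count > 0:
        return total / count
    return None

average = calculate_average([-1, 1, "two"])
print(average)              # 0.0
print(bool(average))        # False
print(average is not None)  # True
```

Comparing against None keeps a legitimate average of 0.0 from being reported as "no valid numbers".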

paul18fr · (This post was last modified: May-06-2024, 04:17 AM by paul18fr.)

I cannot avoid a list comprehension here; maybe there's a simpler (and faster) way?
In the following, the pattern has been duplicated 1 million times, and it took approximately 4 seconds.

import numpy as np
import time

numbers = [1, 2, 3, "four", 5, "seven", "nine"]
# numbers = ["four", "seven", "nine"]
numbers = 1_000000*numbers

t0 = time.time()

M = [isinstance(i, int) for i in numbers]
M = np.asarray(M)
index = np.where(M == True)

if np.prod(np.shape(index)) == 0:
    print("No valid numbers found in the list.")
else:
    M = np.asarray(numbers)
    M = M[index].astype(int)
    Average = np.mean(M)
    print(f"The average of the numerical values is: {Average}")
    
t1 = time.time()
print(f"Duration = {(t1 - t0)}")

Output:
The average of the numerical values is: 2.75
Duration = 3.5027401447296143

paul18fr · May-06-2024, 08:06 AM

Here below is a more general way, for when both integers and floats exist in the list, but I'm wondering: is there a better way to directly combine the 2 boolean lists (to avoid stacking indexes)?

Output:
[ True  True  True False  True False False False]
[False False False False False False False  True]

combination:
[ True  True  True False  True False False  True]

import numpy as np
import time

numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
# numbers = ["four", "seven", "nine"]
numbers = 1_000000*numbers

t0 = time.time()

M_int   = np.asarray([isinstance(i, int) for i in numbers])
M_float = np.asarray([isinstance(i, float) for i in numbers])

index_int = np.where(M_int == True)
index_float = np.where(M_float == True)
index = np.hstack((index_int, index_float))

if np.prod(np.shape(index)) == 0:
    print("No valid numbers found in the list.")
else:
    M = np.asarray(numbers)
    M = M[index].astype(float)
    Average = np.mean(M)
    print(f"The average of the numerical values is: {Average}")
    
t1 = time.time()
print(f"Duration = {(t1 - t0)}")
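
On the question of combining the two boolean arrays directly: numpy boolean arrays support the element-wise | operator (np.logical_or does the same), so the index stacking can be skipped. A minimal sketch on the short list, assuming numpy is installed; the variable names are illustrative:

```python
import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]

m_int = np.asarray([isinstance(i, int) for i in numbers])
m_float = np.asarray([isinstance(i, float) for i in numbers])

# Element-wise OR of the two masks; np.logical_or(m_int, m_float) is equivalent
mask = m_int | m_float
print(mask)

# dtype=object keeps the original Python values instead of turning them all into strings
values = np.asarray(numbers, dtype=object)[mask].astype(float)
average = values.mean() if mask.any() else None
print(average)
```

Indexing with the combined mask also keeps the selected values in their original order, which stacking the int and float indexes does not.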

***snippsat*** · (This post was last modified: May-06-2024, 04:04 PM by snippsat.)

(May-06-2024, 04:17 AM)paul18fr Wrote: I cannot avoid a list comprehension here; maybe there's a simpler (and faster) way?
In the following, the pattern has been duplicated 1 million times, and it took approximately 4 seconds.

Just to mention that the first code works without a list comprehension.
You are making this more complicated than it needs to be.
You can just do it like this:

import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine"]
filtered_numbers = [i for i in numbers if isinstance(i, int)]
average = np.mean(filtered_numbers)
print(average)

Output:
2.75

Also, numpy may be overkill for a task like this; it is faster on large datasets (e.g. matrices), where it can do vectorized calculations.
To measure small pieces of code like this, use timeit.
An example:

import timeit
import numpy as np
from statistics import mean

def num_py():
    numbers = [1, 2, 3, "four", 5, "seven", "nine"] * 100
    filtered_numbers = [i for i in numbers if isinstance(i, int)]
    average = np.mean(filtered_numbers)
    #print(average)

def plain():
    # No libraries
    numbers = [1, 2, 3, "four", 5, "seven", "nine"] * 100
    filtered_numbers = [i for i in numbers if isinstance(i, int)]
    average = sum(filtered_numbers) / len(filtered_numbers)
    #print(average)

def stat():
    numbers = [1, 2, 3, "four", 5, "seven", "nine"] * 100
    filtered_numbers = [i for i in numbers if isinstance(i, int)]
    average = mean(filtered_numbers)
    #print(average)

if __name__ == '__main__':
    lst = ['num_py', 'plain', 'stat']
    for func in lst:
        time_used = timeit.Timer(f"{func}()", f'from __main__ import {func}').timeit(number=100000)
        print(f'{func} --> {time_used:.2f}')

Output:
num_py --> 11.04
plain --> 5.05
stat --> 18.80

This runs each function 100000 times and gives back the total time used, in seconds.
So the plain code is faster here, even with the list made 100 times bigger, so this task is not best suited for numpy.
That said, all of them work fine for this task; it is simple, and most of the work is done in the list comprehension.
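
As a side note on the statistics timing: the same module also provides fmean (Python 3.8 and later), which converts the data to floats and is documented as running faster than mean. A rough sketch for comparing the two on the filtered list; the run count is arbitrary and no particular timings are implied:

```python
import timeit
from statistics import fmean, mean

numbers = [1, 2, 3, "four", 5, "seven", "nine"] * 100
filtered_numbers = [i for i in numbers if isinstance(i, int)]

# Both give the same value for this data
print(mean(filtered_numbers), fmean(filtered_numbers))

# Time only the averaging step; the filtering is done once above
for func in (mean, fmean):
    time_used = timeit.timeit(lambda: func(filtered_numbers), number=1000)
    print(f"{func.__name__} --> {time_used:.4f}")
```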

paul18fr · (This post was last modified: May-07-2024, 07:03 AM by paul18fr.)

@snippsat: I figured out that simply using [i for i in numbers if isinstance(i, int)] is much more efficient and smarter: I bow down :)

import time
import numpy as np

n = 1_000_000
numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
# numbers = ["four", "seven", "nine"]
numbers = numbers*n
 
t0 = time.time()

filtered_numbers1 = np.hstack(( np.asarray([i for i in numbers if isinstance(i, int)]), 
                               np.asarray([i for i in numbers if isinstance(i, float)]) ))

if np.prod(np.shape(filtered_numbers1)) == 0:
    print("No valid numbers found in the list.")
else:
    Average = np.mean(filtered_numbers1)
    print(f"The average of the numerical values is: {Average}")
     
t1 = time.time()
print(f"Duration#1 = {(t1 - t0)}")


filtered_numbers2 =  [i for i in numbers if isinstance(i, int)] + \
                     [i for i in numbers if isinstance(i, float)]

if np.prod(np.shape(filtered_numbers2)) == 0:
    print("No valid numbers found in the list.")
else:
    average = sum(filtered_numbers2) / len(filtered_numbers2)
    print(f"The average of the numerical values is: {average}")
     
t2 = time.time()
print(f"Duration#2 = {(t2 - t1)}")

***snippsat*** · (This post was last modified: May-07-2024, 07:48 AM by snippsat.)

You can check for int and float in one isinstance call.

import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
filtered_numbers = [i for i in numbers if isinstance(i, (int, float))]
average = np.mean(filtered_numbers)
print(average)

Output:
2.666

Also, as mentioned, use timeit when measuring.
Here all the code is put in a string and run 1000000 times, and the total time used is returned.

import timeit

mycode = '''
import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
filtered_numbers = [i for i in numbers if isinstance(i, (int, float))]
average = np.mean(filtered_numbers)
#print(average)
'''

print(timeit.timeit(stmt=mycode, number=1000000))

Output:
8.478885899996385
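
Since timeit.timeit returns the total time for all runs, a per-run figure has to be computed by dividing by the run count. The sketch below also moves the list creation into the setup argument so it is not timed. To stay within the standard library it swaps np.mean for plain sum()/len(), so its numbers are not comparable with the output above:

```python
import timeit

# Executed once before timing starts
setup = '''
numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
'''

# Executed `runs` times
stmt = '''
filtered_numbers = [i for i in numbers if isinstance(i, (int, float))]
average = sum(filtered_numbers) / len(filtered_numbers)
'''

runs = 100_000
total_time = timeit.timeit(stmt=stmt, setup=setup, number=runs)
per_run = total_time / runs
print(f"total: {total_time:.3f} s, per run: {per_run * 1e6:.3f} us")
```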
