12 Python Features Every Data Scientist Should Know!
Source: 大数据应用 (Big Data Applications). About 5,700 words; suggested reading time 11 minutes.
This article takes a close look at 12 Python features that every data scientist should know.

    # list comprehension
    _list = [x**2 for x in range(1, 11)]

    # nested list comprehension to flatten list
    matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    flat_list = [num for row in matrix for num in row]

    print(_list)
    print(flat_list)

Output:

    [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

    # dictionary comprehension
    _dict = {var: var**2 for var in range(1, 11) if var % 2 != 0}

    # set comprehension
    # create a set of squares of numbers from 1 to 10
    _set = {x**2 for x in range(1, 11)}

    # generator comprehension
    _gen = (x**2 for x in range(1, 11))

    print(_dict)
    print(_set)
    print(list(g for g in _gen))

Output:

    {1: 1, 3: 9, 5: 25, 7: 49, 9: 81}
    {64, 1, 4, 36, 100, 9, 16, 49, 81, 25}
    [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

    for idx, value in enumerate(["a", "b", "c", "d"]):
        if idx % 2 == 0:
            print(value)

Output:

    a
    c

    x = [1, 2, 3, 4]
    y = [5, 6, 7, 8]

    # iterate over both arrays simultaneously
    for a, b in zip(x, y):
        print(a, b, a + b, a * b)

Output:

    1 5 6 5
    2 6 8 12
    3 7 10 21
    4 8 12 32

    def fib_gen(n):
        a, b = 0, 1
        for _ in range(n):
            yield a
            a, b = b, a + b

    res = fib_gen(10)
    print(list(r for r in res))

Output:

    [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

    numbers = range(10)
    even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
    print(even_numbers)

Output:

    [0, 2, 4, 6, 8]

    import pandas as pd

    data = {
        "sales_person": ["Alice", "Bob", "Charlie", "David"],
        "sale_amount": [100, 200, 300, 400],
    }
    df = pd.DataFrame(data)

    threshold = 250
    df["above_threshold"] = df["sale_amount"].apply(
        lambda x: True if x >= threshold else False
    )
    df

Output:

      sales_person  sale_amount  above_threshold
    0        Alice          100            False
    1          Bob          200            False
    2      Charlie          300             True
    3        David          400             True

    # reduce() is not a builtin in Python 3; it must be imported from functools
    from functools import reduce

    numbers = range(10)

    # Use map(), filter(), and reduce() to preprocess and aggregate the list of numbers
    even_numbers = filter(lambda x: x % 2 == 0, numbers)
    squares = map(lambda x: x**2, even_numbers)
    sum_of_squares = reduce(lambda x, y: x + y, squares)

    print(f"Sum of the squares of even numbers: {sum_of_squares}")

Output:

    Sum of the squares of even numbers: 120

    data = [1, 3, 5, 7]
    print(any(x % 2 == 0 for x in data))
    print(all(x % 2 == 1 for x in data))

Output:

    False
    True

    import random

    def random_numbers():
        while True:
            yield random.random()

    # Use next() to find the first number greater than 0.9
    num = next(x for x in random_numbers() if x > 0.9)
    print(f"First number greater than 0.9: {num}")

Sample output (the value differs from run to run):

    First number greater than 0.9: 0.9444805819267413

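09 defaultdict
defaultdict is a subclass of the built-in dict class that lets you supply a default value for missing keys.
defaultdict is useful for handling missing or incomplete data, for example when working with sparse matrices or feature vectors. It can also be used to count the frequencies of a categorical variable.
One example is counting how often each item occurs in a list. When int is passed as the default_factory argument, every missing key starts with a value of 0.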
    from collections import defaultdict

    count = defaultdict(int)
    for item in ['a', 'b', 'a', 'c', 'b', 'a']:
        count[item] += 1

    count

Output:

    defaultdict(int, {'a': 3, 'b': 2, 'c': 1})

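The sparse feature vector use case mentioned above can be sketched in the same way. This is only an illustration: the helper name sparse_dot and the sample vectors are hypothetical and do not come from the original article. It assumes each vector is stored as an index-to-value mapping that holds only the non-zero entries.

```python
from collections import defaultdict


def sparse_dot(u, v):
    """Dot product of two sparse vectors stored as index -> value mappings."""
    # Iterate over the smaller mapping; .get() reads v without inserting keys.
    if len(u) > len(v):
        u, v = v, u
    return sum(value * v.get(index, 0.0) for index, value in u.items())


# Hypothetical sparse vectors: only the non-zero entries are stored.
a = defaultdict(float)
b = defaultdict(float)
a[0], a[3] = 1.0, 2.0
b[3], b[7] = 4.0, 5.0

print(a[5])              # 0.0 (a missing index reads as the float default)
print(sparse_dot(a, b))  # 8.0 (only index 3 is non-zero in both vectors)
```

One design point to keep in mind: reading a missing key with square brackets, as in a[5], inserts that key with the default value, so the helper uses .get() to avoid growing the mapping during the computation.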
    from functools import partial

    def add(x, y):
        return x + y

    increment = partial(add, 1)
    increment(1)

Output:

    2

11 lru_cache
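lru_cache is a decorator in the functools module that caches a function's results in a cache of limited size.
lru_cache is useful for optimizing computationally expensive functions, or model training routines that may be called repeatedly with the same arguments.
Caching can speed up a function's execution and reduce the overall computational cost.
Here is an example that uses caching to compute Fibonacci numbers (https://en.wikipedia.org/wiki/Fibonacci_number) efficiently, a technique known in computer science as memoization.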
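    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fibonacci(n):
        if n <= 1:
            return n
        return fibonacci(n - 1) + fibonacci(n - 2)

    # 1e3 is a float, so the result is a floating-point approximation;
    # pass the int 1000 for an exact value. With a cold cache, recursion
    # this deep may exceed Python's default recursion limit.
    fibonacci(1e3)

Output:

    4.346655768693743e+208
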
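The "limited size" part of the description can be made visible with the cache_info() method that lru_cache attaches to the decorated function. The sketch below is illustrative only: slow_square is a hypothetical stand-in for an expensive computation, and maxsize=128 is an arbitrary choice.

```python
from functools import lru_cache


@lru_cache(maxsize=128)  # keep at most 128 results; least recently used are evicted
def slow_square(x):
    """Hypothetical stand-in for an expensive computation."""
    return x * x


# Repeated arguments are served from the cache instead of being recomputed.
for value in [2, 3, 2, 2, 3]:
    slow_square(value)

info = slow_square.cache_info()
print(info.hits, info.misses, info.currsize)  # 3 2 2
```

The first calls with 2 and 3 are misses; the three repeated calls are hits. Calling slow_square.cache_clear() resets the cache and its statistics.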
    from dataclasses import dataclass

    @dataclass
    class Person:
        name: str
        age: int
        city: str

    p = Person("Alice", 30, "New York")
    print(p)

Output:

    Person(name='Alice', age=30, city='New York')
