(Complete Edition) Computer Language Python: 100 pandas Exercises (with Answers)


1. Import pandas under the name pd.

In [1]: import pandas as pd
        import numpy as np

2. Print the version of pandas that has been imported.

In [2]: pd.__version__

3. Print out all the version information of the libraries that are required by the pandas library.

In [3]: pd.show_versions()

4. Create a DataFrame df from this dictionary data which has the index labels.

In [2]: data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
                'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
                'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
                'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
        labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
        df = pd.DataFrame(data, index=labels)

5. Display a summary of the basic information about this DataFrame and its data.

In [5]: df.info()
        # ...or...
        df.describe()

6. Return the first 3 rows of the DataFrame df.

In [6]: df.iloc[:3]
        # or equivalently
        df.head(3)

7. Select just the 'animal' and 'age' columns from the DataFrame df.

In [7]: df.loc[:, ['animal', 'age']]
        # or
        df[['animal', 'age']]

8. Select the data in rows [3, 4, 8] and in columns ['animal', 'age'].

In [3]: df.loc[df.index[[3, 4, 8]], ['animal', 'age']]

9. Select only the rows where the number of visits is greater than 3.

In [4]: df[df['visits'] > 3]

10. Select the rows where the age is missing, i.e. is NaN.

In [5]: df[df['age'].isnull()]

11. Select the rows where the animal is a cat and the age is less than 3.

In [6]: df[(df['animal'] == 'cat') & (df['age'] < 3)]

12. Select the rows where the age is between 2 and 4 (inclusive).

In [7]: df[df['age'].between(2, 4)]

13. Change the age in row 'f' to 1.5.

In [ ]: df.loc['f', 'age'] = 1.5

14. Calculate the sum of all visits (the total number of visits).

In [ ]: df['visits'].sum()

15. Calculate the mean age for each different animal in df.

In [8]: df.groupby('animal')['age'].mean()

16. Append a new row 'k' to df with your choice of values for each column. Then delete that row to return the original DataFrame.

In [ ]: # values follow the column order: animal, age, visits, priority
        df.loc['k'] = ['dog', 5.5, 2, 'no']
        # and then deleting the new row...
        df = df.drop('k')

17. Count the number of each type of animal in df.

In [9]: df['animal'].value_counts()

18. Sort df first by the values in the 'age' column in descending order, then by the values in the 'visits' column in ascending order.

In [10]: df.sort_values(by=['age', 'visits'], ascending=[False, True])

19. The 'priority' column contains the values 'yes' and 'no'. Replace this column with a column of boolean values: 'yes' should be True and 'no' should be False.

In [ ]: df['priority'] = df['priority'].map({'yes': True, 'no': False})

20. In the 'animal' column, change the 'snake' entries to 'python'.

In [14]: df['animal'] = df['animal'].replace('snake', 'python')
         print(df)

21. For each animal type and each number of visits, find the mean age. In other words, each row is an animal, each column is a number of visits and the values are the mean ages (hint: use a pivot table).

In [15]: df.pivot_table(index='animal', columns='visits', values='age', aggfunc='mean')

22. You have a DataFrame df with a column 'A' of integers. For example:

    df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7]})

How do you filter out rows which contain the same integer as the row immediately above?

In [16]: df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7]})
         df.loc[df['A'].shift() != df['A']]
         # Alternatively, we could use drop_duplicates() here. Note
         # that this removes *all* duplicates though, so it won't
         # give the desired result when a value recurs in
         # non-adjacent rows.

23. Given a DataFrame of numeric values, say

    df = pd.DataFrame(np.random.random(size=(5, 3)))  # a 5x3 frame of float values

how do you subtract the row mean from each element in the row?

In [ ]: df.sub(df.mean(axis=1), axis=0)

24. Suppose you have a DataFrame with 10 columns of real numbers, for example:

    df = pd.DataFrame(np.random.random(size=(5, 10)), columns=list('abcdefghij'))

Which column of numbers has the smallest sum? (Find that column's label.)

In [17]: df.sum().idxmin()

25. How do you count how many unique rows a DataFrame has (i.e. ignore all rows that are duplicates)?

In [ ]: len(df) - df.duplicated(keep=False).sum()
        # or perhaps more simply...
        len(df.drop_duplicates(keep=False))

26. You have a DataFrame that consists of 10 columns of floating-point numbers. Suppose that exactly 5 entries in each row are NaN values. For each row of the DataFrame, find the column which contains the third NaN value. (You should return a Series of column labels.)

In [ ]: (df.isnull().cumsum(axis=1) == 3).idxmax(axis=1)

27. A DataFrame has a column of groups 'grps' and a column of numbers 'vals'. For example:

    df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'),
                       'vals': [12, 345, 3, 1, 45, 14, 4, 52, 54, 23, 235, 21, 57, 3, 87]})

For each group, find the sum of the three greatest values.

In [ ]: df.groupby('grps')['vals'].nlargest(3).groupby(level=0).sum()
        # on pandas older than 2.0 this could also be written
        # df.groupby('grps')['vals'].nlargest(3).sum(level=0)

28. A DataFrame has two integer columns 'A' and 'B'. The values in 'A' are between 1 and 100 (inclusive). For each group of 10 consecutive integers in 'A' (i.e. (0, 10], (10, 20],
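
As a rough cross-check, the sketch below rebuilds the frame from exercise 4 and re-runs a few of the answers on a recent pandas release. It rests on two assumptions. First, the 'animal' list is taken to have ten entries ending in 'dog', so that it matches the ten index labels. Second, it targets pandas 2.x, where Series.sum(level=0) is no longer available, so the exercise 27 answer is written with groupby(level=0).sum(). The variable names (young_cats, total_visits, mean_age, grouped, top3_sum) are illustrative only and are not part of the exercises.

```python
import numpy as np
import pandas as pd

# Data from exercise 4. Assumption: the animal list has ten entries
# (ending in 'dog') so that it lines up with the ten labels.
data = {
    "animal": ["cat", "cat", "snake", "dog", "dog",
               "cat", "snake", "cat", "dog", "dog"],
    "age": [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
    "visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    "priority": ["yes", "yes", "no", "yes", "no",
                 "no", "no", "yes", "no", "no"],
}
labels = list("abcdefghij")
df = pd.DataFrame(data, index=labels)

# Exercise 11: cats younger than 3 (rows with a NaN age never match).
young_cats = df[(df["animal"] == "cat") & (df["age"] < 3)]

# Exercise 14: total number of visits.
total_visits = int(df["visits"].sum())

# Exercise 15: mean age per animal; NaN ages are skipped by mean().
mean_age = df.groupby("animal")["age"].mean()

# Exercise 27: sum of the three largest values in each group.
grouped = pd.DataFrame({
    "grps": list("aaabbcaabcccbbc"),
    "vals": [12, 345, 3, 1, 45, 14, 4, 52, 54, 23, 235, 21, 57, 3, 87],
})
# nlargest(3) returns a Series indexed by (group, original row), so a
# second groupby on level 0 collapses it back to one value per group.
top3_sum = grouped.groupby("grps")["vals"].nlargest(3).groupby(level=0).sum()
```

With this data, young_cats contains rows 'a' and 'f', total_visits is 19, and top3_sum is 409 for 'a', 156 for 'b' and 345 for 'c'.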
