Appendix A: Function Catalog

This is a (semi)comprehensive list of the Python functions we will use in this class. Some are built into Python; most come from the numpy, pandas, matplotlib, and cartopy libraries.

Week 1

print() prints to the screen the value of the variable in the parentheses, or the literal 'text' if it is in quote marks.

dataframe.head() prints the first 5 rows of the data table named dataframe.

dataframe.describe() prints some summary statistics of the data table named dataframe.

dataframe.loc[0]['column1'] refers to the element of the data table named dataframe in the row with index label 0 (the first, or zeroth, row when the default index is used) of the column named column1.
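As a minimal sketch of the three DataFrame calls above, the snippet below builds a small table by hand. The table name quakes and its column names are made up for illustration; they are not from the class data files.

```python
import pandas as pd

# Hypothetical earthquake table; the column names are invented for this example.
quakes = pd.DataFrame({
    "magnitude": [5.1, 6.3, 7.0, 5.8, 6.1, 5.5],
    "depth_km": [10.0, 35.0, 22.0, 8.0, 50.0, 15.0],
})

print(quakes.head())      # first 5 of the 6 rows
print(quakes.describe())  # count, mean, std, min, quartiles, and max of each column

# .loc selects by index label; with the default index, label 0 is the first row.
first_mag = quakes.loc[0, "magnitude"]
print(first_mag)          # 5.1
```

Writing quakes.loc[0, "magnitude"] picks the row and the column in one step and gives the same value as quakes.loc[0]["magnitude"].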

Most basic mapping:

plt.figure(figsize=(10,5))
ax = plt.axes(projection=ccrs.Mollweide())
ax.coastlines()
plt.show()

This defines a figure object, adds axes with a Mollweide map projection, adds coastlines, and displays the figure on the screen. Note: you will need to import cartopy.crs as ccrs and import matplotlib.pyplot as plt first for this to run.

ax.tissot() will add Tissot circles to a map. Do this to see the distortions created by the map projection.

plt.scatter(x,y,c=c,s=s) makes a matplotlib scatter plot of data located at (x,y) with color set by the variable c and marker size set by s. Pass c and s by keyword: if they are given positionally, matplotlib expects s before c.
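A small sketch of a scatter plot with color and marker size set from data may help here. The arrays are made-up values, and the variable name points is only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up points: x and y are positions, c sets the color, s sets the marker size.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 2.0, 4.0])
c = np.array([5.0, 6.0, 7.0, 8.0])       # e.g. a magnitude mapped to color
s = np.array([20.0, 40.0, 80.0, 160.0])  # marker areas in points squared

plt.figure(figsize=(6, 4))
# Keywords make it clear which array is the color and which is the size.
points = plt.scatter(x, y, c=c, s=s)
plt.colorbar(points)  # color scale for the values in c
plt.grid(True)
```

In a script, finish with plt.show() to display the figure, as in the mapping example above.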

plt.savefig('filename') saves the figure as filename.

ax.gridlines() (on a cartopy map) or plt.grid() (on a regular plot) adds gridlines to a figure.

Week 2

np.loadtxt is a numpy function to load data from a .txt (text) file format.

pd.read_csv() reads from a csv (comma separated values) formatted file into a Pandas DataFrame.

variablename.shape gives the dimensions of the array named variablename, for example (rows, columns) for a 2D array.

np.reshape(variablename, size) returns the array named variablename rearranged to the input size. The total number of elements must stay the same. For example:

a = np.array([[1,2,3], [4,5,6]])
np.reshape(a, 6)
array([1, 2, 3, 4, 5, 6])

plt.hist(x,bins=n) plots a histogram of the variable x with n bins. Setting density=True will normalize the histogram so that its total area is 1.

np.max(x) and np.min(x) return the maximum and minimum values of the input x.

np.mean(x) and np.std(x) return the mean and standard deviation of the input x.

len(x) returns the length of the array x.
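The summary functions above can be checked on a short array where the answers are easy to work out by hand. The numbers below are made up for this sketch.

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

print(len(x), x.shape)       # 8 (8,)
print(np.min(x), np.max(x))  # 2.0 9.0
print(np.mean(x))            # 5.0
print(np.std(x))             # 2.0
```

Note that np.std divides by the number of elements by default (the population standard deviation), not by the number of elements minus one.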

np.random.normal(loc,scale,size) generates size random points from a normal distribution with mean loc and standard deviation scale.

sorted(x) returns a new list with the values of the array x sorted in ascending order.

np.asarray(x) converts the input to an array.
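Because sorted() hands back a plain Python list, it is often followed by np.asarray. A minimal sketch with made-up numbers:

```python
import numpy as np

values = [3.2, 1.5, 2.7]   # a plain Python list
ordered = sorted(values)   # new sorted list; values itself is unchanged
print(ordered)             # [1.5, 2.7, 3.2]

arr = np.asarray(ordered)  # convert the list to a NumPy array
print(arr.shape)           # (3,)
print(arr * 2)             # element-wise math works on the array
```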

np.linspace(a,b,n) creates an array of n evenly spaced values from a to b, with both endpoints included.

np.arange(a,b,step) creates an array of values from a up to (but not including) b in steps of step.
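The two functions differ in how they treat the end value, which a short comparison makes visible. The variable names here are only for illustration.

```python
import numpy as np

# linspace: you choose the number of points, and the end value 1 is included.
evenly_spaced = np.linspace(0, 1, 5)  # values 0, 0.25, 0.5, 0.75, 1
print(evenly_spaced)

# arange: you choose the step, and the end value 1 is left out.
stepped = np.arange(0, 1, 0.25)       # values 0, 0.25, 0.5, 0.75
print(stepped)
```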

np.repeat(x,n,axis=0) repeats each element of the array x n times along the given axis (0 for rows, 1 for columns).

np.tile(x,n) makes an array by repeating the whole of x n times.
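A side-by-side sketch shows how the two kinds of repetition differ. The arrays x and m are made up for this example.

```python
import numpy as np

x = np.array([1, 2, 3])
print(np.repeat(x, 2))  # [1 1 2 2 3 3]  each element is repeated in place
print(np.tile(x, 2))    # [1 2 3 1 2 3]  the whole array is repeated

m = np.array([[1, 2], [3, 4]])
rows_repeated = np.repeat(m, 2, axis=0)  # rows become [1 2], [1 2], [3 4], [3 4]
print(rows_repeated)
```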

Week 3

Fancier histogram:

fig = plt.figure(1,(6,6))
ax = fig.add_subplot()
plt.hist(x,edgecolor='black',bins=4,label='Magnitude',log=True)
plt.xlabel('Magnitude, Mw', fontsize=16)
plt.ylabel('Number of Events', fontsize=16)
plt.title('Earthquake Magnitudes', fontsize=18)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.xlim([5, 9])
plt.grid(True)
plt.show()

plt.figure makes a figure object and add_subplot adds a subplot (a set of axes) to it. edgecolor sets the color of the bin outlines. log=True makes the y-axis logarithmic. fontsize sets the size of the fonts.

pd.to_datetime() converts a dataframe column of strings holding date and time data into datetime values.
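A minimal sketch of the conversion, assuming a column of date strings in year-month-day form. The table name events and the column name time are hypothetical.

```python
import pandas as pd

# Hypothetical table with the event times stored as text.
events = pd.DataFrame({"time": ["2011-03-11 05:46:24", "2004-12-26 00:58:53"]})

# Replace the string column with real datetime values.
events["time"] = pd.to_datetime(events["time"])

# Datetime columns expose parts of the date through the .dt accessor.
print(events["time"].dt.year.tolist())  # [2011, 2004]
```

Once the column holds datetime values, it can be sorted, subtracted, or plotted along a time axis.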

plt.annotate() adds an annotation (a text label, optionally with an arrow) to a plot.

Week 4

dataframe_name.drop(columns=['a','b']) drops the columns a and b from the dataframe dataframe_name.

np.isnan(x) returns a boolean array with True values where x has NaN values.
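The two Week 4 calls above can be tried together on a tiny table. This is a sketch with made-up data; the names table, trimmed, and mask are only for illustration.

```python
import numpy as np
import pandas as pd

table = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5.0, np.nan]})

# drop returns a new DataFrame; table itself still has all three columns.
trimmed = table.drop(columns=["a", "b"])
print(list(trimmed.columns))  # ['c']

x = trimmed["c"].values
mask = np.isnan(x)  # True where the value is NaN
print(mask)         # [False  True]
print(x[~mask])     # keep only the entries that are not NaN
```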

Defining a function:

def name_of_function(input_value):
    """
    Function to compute something

    parameters
    ----------
    input_value : input variable in units

    returns
    -------
    output : output variable in units
    """
    # write your code here
    output = 2 * input_value
    return output
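As one filled-in instance of the template above, here is a sketch of a unit-conversion function and a call to it. The function name km_to_m is hypothetical and chosen only for this example.

```python
def km_to_m(distance_km):
    """
    Convert a distance from kilometers to meters.

    parameters
    ----------
    distance_km : distance in kilometers

    returns
    -------
    distance_m : distance in meters
    """
    distance_m = 1000 * distance_km
    return distance_m


# Call the function and store what it returns.
result = km_to_m(2.5)
print(result)  # 2500.0
```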

Week 5

a = dataframe_name['column b'].values converts the DataFrame Series column b into the numpy array a.

a = dataframe_name['column b'].index[dataframe_name['column b']==x] returns the index values of the rows where dataframe_name['column b']==x is true.
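Both lookups can be seen on a four-row table. This is a sketch with made-up values; the variable names values and matches are only for illustration.

```python
import pandas as pd

dataframe_name = pd.DataFrame({"column b": [10, 20, 10, 30]})

# .values turns the column into a plain NumPy array.
values = dataframe_name["column b"].values
print(values)         # [10 20 10 30]

# Index labels of the rows where the column equals 10.
matches = dataframe_name["column b"].index[dataframe_name["column b"] == 10]
print(list(matches))  # [0, 2]
```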

np.power(base,exponent) is a numpy function that computes \(base^{exponent}\).

np.count_nonzero(x) counts the non-zero elements of the array x.

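np.delete(x,idx,axis=0) returns a copy of the array x with the entries at the indices idx removed along the given axis (0 or 1). (The name del cannot be used for a variable because it is a reserved word in Python.)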

np.unique(x) finds the unique elements of the array x.

a = dataframe_name.sort_values(by=['column b']) sorts the rows of the dataframe dataframe_name by the values in column b.
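The array helpers from this week are easiest to compare on one small array. The sketch below uses made-up numbers and ends with a sort of a hypothetical three-row table.

```python
import numpy as np
import pandas as pd

x = np.array([3, 0, 1, 3, 0, 2])

print(np.power(2, 3))        # 8
print(np.count_nonzero(x))   # 4
print(np.delete(x, [0, 1]))  # [1 3 0 2]  copy of x without indices 0 and 1
print(np.unique(x))          # [0 1 2 3]  sorted unique values

dataframe_name = pd.DataFrame({"column b": [3, 1, 2]})
a = dataframe_name.sort_values(by=["column b"])
print(a["column b"].tolist())  # [1, 2, 3]
```

sort_values keeps the original index labels attached to each row, so the index of a is no longer in order.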

Week 6

np.random.choice(a) generates a random sample pulled from elements in the 1D array a.

np.random.binomial(n,p,size=s) draws s samples from a binomial distribution with n trials and probability of success p, where n is an integer >= 0 and p is in the interval [0,1].

np.random.poisson(lam,size=n) draws n samples from a Poisson distribution with average rate of event occurrence lam.

np.random.gamma(shape, scale=1.0, size=n) draws n samples from a Gamma distribution with parameters shape (sometimes designated “k”) and scale (sometimes designated “theta”), where both parameters are > 0.
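A quick way to get a feel for these samplers is to draw many samples and compare the sample mean with the expected value. This is a sketch only: the parameter values are arbitrary, and the seed is set just so that reruns give the same draws.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the random draws repeat from run to run

pick = np.random.choice(np.array([1, 2, 3, 4, 5, 6]))  # one roll of a die
coin = np.random.binomial(10, 0.5, size=1000)          # heads in 10 flips, 1000 times
counts = np.random.poisson(3.0, size=1000)             # event counts with rate 3
waits = np.random.gamma(2.0, scale=1.5, size=1000)     # shape k=2, scale theta=1.5

print(pick)
print(coin.mean())    # should be close to n * p = 5
print(counts.mean())  # should be close to lam = 3
print(waits.mean())   # should be close to shape * scale = 3
```

The sample means will not match the expected values exactly; they get closer as size grows.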

a.append(b) Appends (adds on) the variable b to the end of the list a. NumPy arrays have no append method; for an array, np.append(a, b) returns a new array with b added.

np.sum(x) Adds up the elements of the array x.

np.cumsum(x) Returns the cumulative (running) sum of the elements of x, optionally along a given axis.
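The last three entries fit together in a short sketch: build up a list, convert it, then total it. The numbers are made up.

```python
import numpy as np

a = [1, 2]
a.append(3)          # the list a is changed in place
print(a)             # [1, 2, 3]

x = np.asarray(a)
print(np.sum(x))     # 6
print(np.cumsum(x))  # [1 3 6]  running total after each element
```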

