Math 300 - Homework Assignment #4 (back to Math 300 main page)

Suggetsed solutions could be found by these links: page_rank.m, run_rank.m, boole.m (using "for" loop), better_boole.m (using vector arithmetic), run_boole.m.

The assignment is due Monday, July 24, 11:59pm.

Due to some reasons we can not use public folder anymore. Please email me (o.dykhovychnyi@wsu.edu) your files instead of uploading them to your public folder.

Assignment is worth 110 points.

 Part 1 (45 points).

In this part you are required to create a MATLAB script named "run_rank.m" and a MATLAB function file named "page_rank.m".

The "run_rank.m" script should do the following:

  1. Load a square matrix from this file to the matrix variable M. The matrix descibes primitive network of 8 web pages using CSV format. Since the function I showed you in class requires some additional actions to get a matrix variable, you may use dlmread function. This function returns exatly what you need. More details on usage could be found here.
  2. Compute PageRank vector for the network descibed by M by calling page_rank(matr, alpha, e) function (which you are supposed to define in this file in a separate file as described below) with damping factor 0.85 and precision 10-6. The output should be stored into the vector variable named v and immediately printed to the console.
  3. Plot computed PageRank vector v using bar function. More details on usage could be found here. The axes should be labeled as "Pages" (x-axis) and "Rank" (y-axis).

The "page_rank.m" file should define the page_rank(matr, alpha, e) function as follows:

  1. The function has 3 input parameters:
  2. The function returns a column-vector - PageRank vector.
  3. The function shoud implement PageRank algortithm used by Google search engine to rank web pages resulting from a particual search query. The input of the algorithm is an adjacency matrix (input parameter matr) of a directed graph describing some particual network of web pages (see also this link). The algorithm transforms given matrix to a so called Google Matrix using given damping factor (input parameter alpha). Then the algorithm finds the eigenvector corrseponding to the dominant eigenvalue (the ones with the largest magnutude) of the Google Matrix using power iteration method (recall the document you typest in the previous assinment). The computed eigenvector called PageRank vector and it represents the probability distribution of the likelihood that a person randomly clicking on links will arrive at any particular page. This vector is the output of the algorithm. There are dozens of online materials explaining this algorithm, e.g. this one or the original article.
  4. The Google Matrix (G) should be computed from the adjacency matrix (A) as follows: First, compute matrix  B=(bi,j)  from matrix  A=(ai,j)  by dividing every element of  A  by the sum of all elements of the column to which given elemnt belongs, i.e.  bi,j=ai,j k = 1 N ak,j Then   G = α B + ( 1 - α ) 1 N C , where   N  is the size of the square matrices  A,B  and  C ; C is simply a matrix of all ones, i.e.   C = 1 1 1 1 1 1 Note that you will probably need "for" loop (to iterate over all columns of the matrix, but not rows!) together with sum function to compute B. More on sum function usage could be found here.
    The G matrix could be computed just with one line as soon as you got the B matrix.
  5. The power iteration method is extremely simple:
    1. You start with the vector x0 representing uniform distribution, i.e. all cordinates are 1N
    2. Your first iteration should compute vector x1 by multiplying x0 by your Google Matrix G, i.e. x1=Gx0
    3. Every next iteration (k) will also multiply vector obtained from preceeding iteration (k-1) by your Google Matrix G, i.e. xk=Gxk-1 .
    4. The algorithm should stop when xk-xk-12=(xk1-xk-11)2++(xkn-xk-1n)2e, where e denotes precision specified by the input parameter e.
    5. Then your PageRank vector is simply xk.
    Note that you probably will need to use "while" loop to implement this algorithm, and the variables for xk and xk-1 (which are column-vectors) should be defined before starting the loop (you can define xk-1 with all zeroes and xk as descibed in (a)). Also note that the expression used in (d) is called 2-norm or Euclidean norm of a vector (in your case vector which norm is computed is obtained by substracting vectos xk and xk-1) and norm function may be useful here. More details on usage could be found here.
  6. Note that your function should be able to process matrices of any size, i.e. you should not hardcode any particual size of matrix N inside your function. Instead it should be derived (and probably stored into some variable) inside your function by applying proper built-in function to your matrix matr.
  7. Your function should check whether input matrix matr is really a square matrix. If not, an error with a proper message should be raised using error function.
  8. Your function should not print any output messages to the console.
  9. The diagram produced by your script should look like this.

 Part 2 (45 points)

In this part you are required to create a MATLAB function named boole. The function should be stored in a separate file with a proper name.

This function should implement numerical integration algorithm using one of the Newton–Cotes formulas called Boole's rule:

  1. The function has 3 input parameters:
  2. The function returns a floating point number - value of the defintie integral.
  3. Here is how Boole's rule work. Consider the integral \[\int_a^b f(x)\,dx\] where \(f\) (fun input parameter) is twice continuously differentiable. Let \(a=x_0\lt x_1\lt\dots\lt x_n=b\) be an equally-spaced partition of \([a,b]\) (a and b input parameters), where we assume that \(n\) (n input parameter) is evenly divided by 4. When \(h=(b-a)/n\) is the uniform width of the subintervals, then accroing to Boole's rule your integral can be estimated as \[\int_a^b f(x)\,dx \sim \frac{2h}{45}\sum_{i=1}^{n/4} \left( 7f(x_{4i-4})+32f(x_{4i-3})+12f(x_{4i-2})+32f(x_{4i-1})+7f(x_{4i}) \right). \]
  4. (You can skip/ignore this step!)Your function should plot two things: original function on the given interval (using grid with n*10 points or more) as a blue solid line without any point markers; and values of your function at endpoints of all subintervals (i.e. \(f(x_0), f(x_1), \dots, f(x_n)\)) as a dashed red line using "x" points marker. For this purpose you should probably store both endpoints of your subintervals and values of your function at this points while doing computations, and then use these two sets to create the plot. The axes should be labeled properly, e.g. "x" and "f(x)".
  5. Your function should check whether values provided for a and b satifsfy a < b. If not, an error with a proper message should be raised using error function.
  6. Your function should check whether provided number of subintervals n is divisible by 4. If not, an error with a proper message should be raised using error function.
  7. Your function should not print any output messages to the console.

 Part 3 (20 points)

In this part you are required to create a MATLAB script named "run_boole.m".

The script shoud test the boole function created in Part 2 and do the following:

  1. Define 3 anonymous functions (in-line functions):
    1. f1 = \( |\sin(x) | + |\cos(x)| + x \) on [-π , π]
    2. f2 = \( \frac{1}{x} \) on [0.1, 5.1]
    3. f3 = \( x^3 - 2 x^2 - 5x + 1 \) on [-3, 4]
  2. For each of the above functions do the following:
    1. Compute the value of the integral on the given interval using built-in inegral function. You will use this value as a reference value to compute error.
    2. Compute the value of the integral on the given intervals using your boole function for five different partitions: n=4^1=4, n=4^2=16, n=4^3=64, n=4^4=256 and n=4^5=1024. All the computed values should be saved for the next step.(if you skipped step 4 of part 2, you can also skip the following highlighted stuff!)After computing integral for each of the given partitions, pause the script execution for 1 seconds. For this purpose you may use pause(1) function. More details on usage could be found here.
    3. Plot the following:
      • x-axis: powers of 4 that used to compute number of subintervals for every step in (b), i.e. n=1, n=2, n=3, n=4 and n=5;
      • y-axis: computation error, i.e. absolute value of the difference between the value computed with inegral function (step (a)) and the value computed with your boole function (step (b)) using corresponding partitioning.
      The axes should be properly labeled, e.g. "Number of subintervals" and "Error". The title of your plot should be the name of the function and corresponding interval, e.g. "f3 on [-3, 4]"
    4. The script execution should be paused and wait for user input. For this purpose you may use waitforbuttonpress function. ore details on usage could be found here

 Publishing your results.

As soon as you are done with all of the above, upload all the "*.m" files (4 files) to ~/public/homeworks/hw04 directory on adams.math.wsu.edu. Obviously, if directory doesn't exist yet, it should be created.

As soon as you are done with all of the above, send all your "*.m" files (4 files) to o.dykhovychnyi@wsu.edu.