Assignment 4, CS3346A, 2013

Assigned: Nov 4, 2013. Updated Nov 20.
Due date: Dec 4, 2013 (midnight)
Electronic submission (see how here).
Individual effort (no group work)
Total marks: 25% of the final marks

Note: This assignment is given early so that you can use the Chapter 13-14 material to prepare for the midterm. Because Dec 4 is the last day of the term and the course has no final exam, you must submit Assignment 4 on or before Dec 4. This deadline cannot be extended, and late submissions will not be accepted. If you cannot finish all the questions, submit partial solutions to get partial marks.

Update (none):

Question 1:
Textbook (3E): Question 13.13 (reproduced here)
Consider two medical tests, A and B, for a virus. Test A is 95% effective at recognizing the virus when it is present, but has a 10% false positive rate (indicating that the virus is present, when it is not). Test B is 90% effective at recognizing the virus, but has a 5% false positive rate. The two tests use independent methods of identifying the virus. The virus is carried by 1% of all people. Say that a person is tested for the virus using only one of the tests, and that test comes back positive for carrying the virus. Which test returning positive is more indicative of someone really carrying the virus? Justify your answer mathematically.
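The comparison in this question comes down to Bayes' rule applied once per test. The following is a minimal Python sketch, not a model answer; the function name `posterior_given_positive` and its parameter names are illustrative assumptions, and the written justification is still expected in mathematical form.

```python
def posterior_given_positive(prior, true_positive_rate, false_positive_rate):
    """Return P(virus | positive test) using Bayes' rule.

    prior               -- P(virus)
    true_positive_rate  -- P(positive | virus)
    false_positive_rate -- P(positive | no virus)
    """
    p_pos_and_virus = true_positive_rate * prior
    p_pos_and_clean = false_positive_rate * (1.0 - prior)
    # Normalize by P(positive) = P(positive, virus) + P(positive, no virus).
    return p_pos_and_virus / (p_pos_and_virus + p_pos_and_clean)


if __name__ == "__main__":
    # Numbers taken from the question statement.
    for name, tpr, fpr in [("A", 0.95, 0.10), ("B", 0.90, 0.05)]:
        print(name, posterior_given_positive(0.01, tpr, fpr))
```

The same helper applies to Question 2(b) once the prior and the reliability of the witness are substituted.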

Question 2:
Textbook (3E): Question 13.21 (reproduced here)
Suppose you are a witness to a nighttime hit-and-run accident involving a taxi in Athens. All taxis in Athens are blue or green. You swear, under oath, that the taxi was blue. Extensive testing shows that, under the dim lighting conditions, discrimination between blue and green is 75% reliable.
a. Is it possible to calculate the most likely color for the taxi? (Hint: distinguish carefully between the proposition that the taxi is blue and the proposition that it appears blue.)
b. What if you know that 9 out of 10 Athenian taxis are green?

Question 3:
Textbook (3E): Question 14.1 (reproduced here)
We have a bag of three biased coins a, b, and c with probabilities of coming up heads of 20%, 60%, and 80%, respectively. One coin is drawn randomly from the bag (with equal likelihood of drawing each of the three coins), and then the coin is flipped three times to generate the outcomes X1, X2, and X3.
a. Draw the Bayesian network corresponding to this setup and define the necessary CPTs.
b. Calculate which coin was most likely to have been drawn from the bag if the observed flips come out heads twice and tails once.
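For part (b), the posterior over coins is proportional to the prior times the likelihood of the observed flips, because the flips are conditionally independent given the coin. The sketch below is one way to check a hand calculation; the names `HEADS_PROB`, `PRIOR`, and `coin_posterior` are illustrative assumptions and not part of the textbook.

```python
# Probability of heads for each coin, from the question statement.
HEADS_PROB = {"a": 0.2, "b": 0.6, "c": 0.8}
# Each coin is equally likely to be drawn.
PRIOR = {coin: 1.0 / 3.0 for coin in HEADS_PROB}


def coin_posterior(num_heads, num_tails):
    """Return P(coin | flips) for every coin as a dict.

    The order of the flips does not matter: any binomial coefficient
    is the same for all coins and cancels during normalization.
    """
    unnormalized = {
        coin: PRIOR[coin] * p ** num_heads * (1.0 - p) ** num_tails
        for coin, p in HEADS_PROB.items()
    }
    total = sum(unnormalized.values())
    return {coin: value / total for coin, value in unnormalized.items()}


if __name__ == "__main__":
    posterior = coin_posterior(num_heads=2, num_tails=1)
    print(posterior, max(posterior, key=posterior.get))
```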

Question 4:
Textbook (3E): Question 14.11 (reproduced here)
In your local nuclear power station, there is an alarm that senses when a temperature gauge exceeds a given threshold. The gauge measures the temperature of the core. Consider the Boolean variables A (alarm sounds), FA (alarm is faulty), and FG (gauge is faulty) and the multivalued nodes G (gauge reading) and T (actual core temperature).
a. Draw a Bayesian network for this domain, given that the gauge is more likely to fail when the core temperature gets too high.
b. Is your network a polytree? Why or why not?
c. Suppose there are just two possible actual and measured temperatures, normal and high; the probability that the gauge gives the correct temperature is x when it is working, but y when it is faulty. Give the conditional probability table associated with G.
d. Suppose the alarm works correctly unless it is faulty, in which case it never sounds. Give the conditional probability table associated with A.
e. Suppose the alarm and gauge are working and the alarm sounds. Calculate an expression for the probability that the temperature of the core is too high, in terms of the various conditional probabilities in the network.

Updated Nov 20.

Question 5:
You will create a decision (Bayesian) network using Bayesian network software. Many free tools are available on the web, but note that the tool you choose must support decision nodes and utility nodes on top of a Bayesian network. The tool from AISpace is easy to use, and students have liked it in the past. Its authors also have a very useful online book that you can read. Students have also used MSBN from Microsoft in the past; you may use it too.

Reproduce the alarm belief network from the lecture notes and textbook, then use the software to compute the probabilities listed below. Make sure that the result of the first query matches the textbook value, (0.284, 0.716); this verifies that your network is constructed correctly.

  1. P(Burglary|JohnCalls=true and MaryCalls=true)
  2. P(Burglary|JohnCalls=true and MaryCalls=false)
  3. P(Burglary|JohnCalls=true)
  4. P(Burglary|JohnCalls=true, Earthquake=true)
  5. P(Burglary|Alarm=true, Earthquake=true)
  6. P(Earthquake|Alarm=true, Burglary=false)

Hand in a few screenshots showing the graphical structure of the belief network and the results. Briefly explain and compare the results you obtain.
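As a cross-check that does not depend on any particular tool, the queries can also be answered by brute-force enumeration of the full joint distribution. The sketch below assumes the standard CPT values of the textbook's burglary network (verify them against your copy); the helper names `bern`, `joint`, and `query` are illustrative and do not belong to any Bayesian network package. It does not replace the software screenshots required above.

```python
from itertools import product

# CPT values assumed from the textbook's burglary/alarm example.
P_B = 0.001                      # P(Burglary=true)
P_E = 0.002                      # P(Earthquake=true)
P_A = {                          # P(Alarm=true | Burglary, Earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}  # P(JohnCalls=true | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(MaryCalls=true | Alarm)

VARS = ("Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls")


def bern(p_true, value):
    """Probability of a Boolean value given P(value=true)."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Full joint probability, factored along the network structure."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


def query(target, evidence):
    """Return (P(target=true | evidence), P(target=false | evidence))."""
    totals = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(VARS)):
        world = dict(zip(VARS, values))
        # Keep only the worlds that agree with the evidence.
        if all(world[name] == val for name, val in evidence.items()):
            totals[world[target]] += joint(*values)
    z = totals[True] + totals[False]
    return totals[True] / z, totals[False] / z


if __name__ == "__main__":
    print(query("Burglary", {"JohnCalls": True, "MaryCalls": True}))
```

The other five queries follow by changing the `target` and `evidence` arguments, for example `query("Earthquake", {"Alarm": True, "Burglary": False})`.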

Question 6:
Expand the alarm Bayesian network to include a decision node and a utility node. The utility node is "MyCost" (or "MyProfit"), which reflects the cost (or profit) of the various outcomes. The decision node represents an action with two possible values: "Go home" and "Call security company" (i.e., not "Go home"). If you "Go home", you would likely interrupt the robbery; however, you may miss meetings with clients and lose a sale. If you "Call security company" to check on your house, you need to pay them a small amount of money, and they may not get there in time. Add the nodes needed to reflect this new information. Choose reasonable values for the probabilities and utilities.

Submit your network topology and the detailed parameters of the network. Show the optimal decisions under two different circumstances (for example: John called and I learned that a minor earthquake happened; should I go home?). Briefly explain the results you obtain.
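The optimal decision is the action with the highest expected utility under the posterior given the evidence. The sketch below shows only that final step for a simplified case in which utility depends on the action and on Burglary alone; every utility value and the names `UTILITY`, `expected_utility`, and `best_action` are hypothetical, and your own network should include the additional nodes (such as a lost sale or security arriving in time) that the question asks for.

```python
# Hypothetical utilities in dollars; all numbers are assumptions for illustration.
UTILITY = {
    ("go_home", True): -500,         # burglary interrupted, but a sale is lost
    ("go_home", False): -300,        # sale lost for no reason
    ("call_security", True): -2000,  # fee paid, security may arrive too late
    ("call_security", False): -50,   # fee only
}

ACTIONS = ("go_home", "call_security")


def expected_utility(action, p_burglary):
    """Expected utility of an action given P(Burglary=true | evidence)."""
    return (p_burglary * UTILITY[(action, True)]
            + (1.0 - p_burglary) * UTILITY[(action, False)])


def best_action(p_burglary):
    """Return the action that maximizes expected utility."""
    return max(ACTIONS, key=lambda action: expected_utility(action, p_burglary))


if __name__ == "__main__":
    # 0.284 is the textbook value of P(Burglary | JohnCalls, MaryCalls).
    for p in (0.284, 0.01):
        print(p, best_action(p))
```

With different evidence the posterior changes, and so can the preferred action, which is the kind of contrast the two required circumstances are meant to show.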

Question 7:
Create a decision network of your own for a fictional or actual situation in which you need to calculate probabilities given certain evidence or make optimal decisions. The network should contain at least five nodes for random variables, one decision node, and one utility node.

The submission requirements are the same as for Question 6.