Harbin Institute of Technology (Shenzhen) Machine Learning Exam Answers

1. Give the definitions or your comprehensions of the following terms. (12 points)
   - The inductive learning hypothesis (P17)
   - Overfitting (P49)
   - Consistent learner (P148)

2. Give brief answers to the following questions. (15 points)
   (a) If the size of a version space is |VS|, in general, what is the smallest number of queries that may be required by a concept learner using the optimal query strategy to perfectly learn the target concept? (P27) Since each optimally chosen membership query can at best halve the version space, ⌈log2 |VS|⌉ queries are required in general.
   (b) In general, decision trees represent a disjunction of conjunctions of constraints on the attribute values of instances. What expression does the following decision tree correspond to?
   [Decision tree figure lost in extraction; only its Yes/No leaf labels survive.]
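The figure itself is gone, but the mapping the question asks about can still be illustrated: each root-to-Yes-leaf path contributes one conjunction, and the tree as a whole is the disjunction of those conjunctions. Below is a minimal sketch over a hypothetical stand-in tree (the tuple encoding and the tree are assumptions, not the lost figure):

```python
# Sketch: read a decision tree off as a disjunction of conjunctions.
# A tree is either the leaf "Yes"/"No" or (attribute, subtree_if_true, subtree_if_false).

def yes_paths(tree, path=()):
    """Yield one conjunction (tuple of literals) per root-to-Yes-leaf path."""
    if tree == "Yes":
        yield path
    elif tree != "No":
        attr, if_true, if_false = tree
        yield from yes_paths(if_true, path + (attr,))
        yield from yes_paths(if_false, path + ("not " + attr,))

# hypothetical tree: test A; if A holds test B, otherwise test C
tree = ("A", ("B", "Yes", "No"), ("C", "Yes", "No"))
conjunctions = [" AND ".join(p) for p in yes_paths(tree)]
print(" OR ".join(f"({c})" for c in conjunctions))
# prints: (A AND B) OR (not A AND C)
```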

3. Give the explanation of inductive bias, and list the inductive bias of the CANDIDATE-ELIMINATION algorithm, decision tree learning (ID3), and the BACKPROPAGATION algorithm.

4. How can overfitting be solved in decision trees and in neural networks?
   Solution:
   - Decision tree: stop growing the tree earlier; post-pruning.
   - Neural network: weight decay; validation set (a sketch of both follows this list).
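As an illustration of the two neural-network remedies, here is a minimal sketch, not from the original answers, that combines L2 weight decay with validation-based early stopping for a linear unit; the data and all coefficient values are made up:

```python
# Sketch: weight decay + validation-set early stopping for a linear unit.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
X_train = rng.normal(size=(80, 3))
t_train = X_train @ true_w + rng.normal(scale=0.3, size=80)
X_val = rng.normal(size=(20, 3))
t_val = X_val @ true_w + rng.normal(scale=0.3, size=20)

w = np.zeros(3)
eta, lam = 0.005, 0.001                    # learning rate, weight-decay strength
best_w, best_val = w.copy(), np.inf

for epoch in range(500):
    grad = -2 * X_train.T @ (t_train - X_train @ w)  # d/dw of squared error
    w -= eta * (grad + 2 * lam * w)                  # decay shrinks every weight
    val_err = np.mean((t_val - X_val @ w) ** 2)
    if val_err < best_val:                 # remember the weights that generalize
        best_val, best_w = val_err, w.copy()
w = best_w                                 # early stopping: keep validation-best w
print(w)
```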

5. Prove that the LMS weight update rule w_i ← w_i + η (V_train(b) − V̂(b)) x_i performs a gradient descent to minimize the squared error. In particular, define the squared error E as in the text. Now calculate the derivative of E with respect to the weight w_i, assuming that V̂(b) is a linear function as defined in the text. Gradient descent is achieved by updating each weight in proportion to −∂E/∂w_i; therefore, you must show that the LMS training rule alters weights in this proportion for each training example it encounters. (E = Σ_{b ∈ training examples} (V_train(b) − V̂(b))²)
   Solution: As V̂(b) = w_0 + w_1 x_1 + … + w_n x_n, for a single training example b,
   ∂E/∂w_i = 2 (V_train(b) − V̂(b)) · ∂(V_train(b) − V̂(b))/∂w_i = −2 (V_train(b) − V̂(b)) x_i,
   since V_train(b) does not depend on w_i and ∂V̂(b)/∂w_i = x_i. Updating each weight in proportion to −∂E/∂w_i therefore gives Δw_i ∝ (V_train(b) − V̂(b)) x_i, which is exactly the form of the LMS rule, so LMS performs gradient descent on E.
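A quick numeric check of this derivation (a sketch with made-up board features x and target V_train, not part of the original answers), comparing the LMS direction against a finite-difference estimate of ∂E/∂w_i:

```python
# Sketch: verify -dE/dw_i = 2 (V_train - V_hat) x_i by finite differences.
import numpy as np

x = np.array([1.0, 2.0, -1.0, 0.5])       # x_0 = 1 absorbs the bias weight w_0
w = np.array([0.1, -0.3, 0.2, 0.05])
v_train = 1.5                              # made-up target value V_train(b)

def squared_error(w):
    return (v_train - w @ x) ** 2          # E for this single example

eps = 1e-6
for i in range(len(w)):
    dw = np.zeros_like(w)
    dw[i] = eps
    grad_i = (squared_error(w + dw) - squared_error(w - dw)) / (2 * eps)
    lms_i = (v_train - w @ x) * x[i]       # the LMS direction (V_train - V_hat) x_i
    assert np.isclose(-grad_i, 2 * lms_i)  # matches the derivative computed above
print("LMS direction is proportional to -dE/dw_i for every weight")
```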

6. True or false: if decision tree D2 is an elaboration of tree D1 (that is, ID3 could extend D1 into D2), then D1 is more general than D2, where h_j ≥_g h_k is defined by (∀x ∈ X)[(h_k(x) = 1) → (h_j(x) = 1)]. If true, give a proof; if false, a counterexample. (10 points)
   Solution: The hypothesis is false. One counterexample is A XOR B: if A ≠ B the training examples are all positive, while if A = B they are all negative. Then, using ID3 to extend D1, the new tree D2 will be equivalent to D1 (D2 is equal to D1).

7. Design a two-input perceptron that implements the boolean function A ∧ ¬B. Design a two-layer network of perceptrons that implements A XOR B. (10 points)
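One possible set of weights for question 7, as a runnable sketch (the particular thresholds and weights are one hand-picked solution among many):

```python
# Sketch: a perceptron for A AND (NOT B), and a two-layer network for A XOR B.

def perceptron(w0, w1, w2):
    """A threshold unit computing step(w0 + w1*a + w2*b)."""
    return lambda a, b: 1 if w0 + w1 * a + w2 * b > 0 else 0

a_and_not_b = perceptron(-0.5, 1.0, -1.0)  # fires only for A = 1, B = 0

# XOR as (A AND NOT B) OR (NOT A AND B): two hidden units plus an OR output unit
h1 = perceptron(-0.5, 1.0, -1.0)           # hidden unit: A AND NOT B
h2 = perceptron(-0.5, -1.0, 1.0)           # hidden unit: NOT A AND B
out = perceptron(-0.5, 1.0, 1.0)           # output unit: OR

def xor(a, b):
    return out(h1(a, b), h2(a, b))

for a in (0, 1):
    for b in (0, 1):
        assert a_and_not_b(a, b) == int(a == 1 and b == 0)
        assert xor(a, b) == (a ^ b)
print("both truth tables check out")
```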

8. Suppose a hypothesis space contains three hypotheses h1, h2, and h3, and the posterior probabilities of these hypotheses given the training data are 0.4, 0.3, and 0.3, respectively. If a new instance x is encountered, which is classified positive by h1 but negative by h2 and h3, then give the result and the detailed classification course of the Bayes optimal classifier. (10 points) (P125)
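A sketch of the classification course, assuming the textbook's P125 setup with the 0.4/0.3/0.3 posteriors above: sum the posteriors of the hypotheses voting for each label, then output the label with the larger total.

```python
# Sketch: Bayes optimal classification for question 8.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}  # P(h_i | D)
votes = {"h1": "+", "h2": "-", "h3": "-"}        # each h_i's label for x

# P(v | D) = sum of P(h | D) over hypotheses h that classify x as v
p_pos = sum(p for h, p in posteriors.items() if votes[h] == "+")  # 0.4
p_neg = sum(p for h, p in posteriors.items() if votes[h] == "-")  # 0.6

# The negative total wins, so the Bayes optimal classification is negative,
# even though the single most probable hypothesis h1 says positive.
print("Bayes optimal classification:", "+" if p_pos > p_neg else "-")
```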

9. Suppose S is a collection of training-example days described by attributes including Humidity, which can have the values High or Normal. Assume S is a collection containing 10 examples, [7+, 3−]. Of these 10 examples, suppose 3 of the positive and 2 of the negative examples have Humidity = High, and the remainder have Humidity = Normal. Please calculate the information gain due to sorting the original 10 examples by the attribute Humidity. (log2 1 = 0, log2 2 = 1, log2 3 ≈ 1.58, log2 4 = 2, log2 5 ≈ 2.32, log2 6 ≈ 2.58, log2 7 ≈ 2.81, log2 8 = 3, log2 9 ≈ 3.17, log2 10 ≈ 3.32) (5 points)
   Solution:
   (a) Here we denote S = [7+, 3−]; then Entropy([7+, 3−]) = −(7/10) log2(7/10) − (3/10) log2(3/10) ≈ 0.88.
   (b) Gain(S, Humidity) = Entropy(S) − Σ_{v ∈ Values(Humidity)} (|S_v| / |S|) Entropy(S_v), where Values(Humidity) = {High, Normal} and S_v = {s ∈ S | Humidity(s) = v}.
   S_High = [3+, 2−], |S_High| = 5: Entropy(S_High) = −(3/5) log2(3/5) − (2/5) log2(2/5) ≈ 0.972.
   S_Normal = [4+, 1−], |S_Normal| = 5: Entropy(S_Normal) = −(4/5) log2(4/5) − (1/5) log2(1/5) ≈ 0.72.
   Thus Gain(S, Humidity) ≈ 0.88 − (5/10 × 0.972 + 5/10 × 0.72) ≈ 0.034.
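The same computation as a runnable sketch, using exact logarithms rather than the rounded table values (so the gain comes out as 0.035 instead of 0.034):

```python
# Sketch: entropy and information gain for question 9, with exact logs.
from math import log2

def entropy(pos, neg):
    """Entropy of a collection with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                           # treat 0 * log2(0) as 0
            p = count / total
            result -= p * log2(p)
    return result

e_s = entropy(7, 3)                         # Entropy(S)        ~ 0.881
e_high = entropy(3, 2)                      # Entropy(S_High)   ~ 0.971
e_normal = entropy(4, 1)                    # Entropy(S_Normal) ~ 0.722
gain = e_s - (5 / 10) * e_high - (5 / 10) * e_normal
print(f"Gain(S, Humidity) = {gain:.3f}")    # ~ 0.035
```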

10. Finish the following algorithm. (10 points)
(1) GRADIENT-DESCENT(training_examples, η)
Each training example is a pair of the form ⟨x, t⟩, where x is the vector of input values and t is the target output value; η is the learning rate.
   - Initialize each w_i to some small random value
   - Until the termination condition is met, Do
     - Initialize each Δw_i to zero
     - For each ⟨x, t⟩ in training_examples, Do
       - Input the instance x to the unit and compute the output o
       - For each linear unit weight w_i, Do
         Δw_i ← Δw_i + η (t − o) x_i
     - For each linear unit weight w_i, Do
       w_i ← w_i + Δw_i
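The completed algorithm as a runnable sketch (the linear training data below are made up for illustration):

```python
# Sketch: batch gradient descent for a linear unit, as completed above.
import numpy as np

def gradient_descent(training_examples, eta=0.05, epochs=500):
    """training_examples: list of (x_vector, t) pairs; returns learned weights."""
    n = len(training_examples[0][0])
    w = np.random.default_rng(0).uniform(-0.05, 0.05, size=n)  # small random init
    for _ in range(epochs):               # termination: a fixed number of epochs
        delta_w = np.zeros(n)             # initialize each Delta w_i to zero
        for x, t in training_examples:
            o = w @ x                     # compute the linear unit's output o
            delta_w += eta * (t - o) * x  # Delta w_i <- Delta w_i + eta (t - o) x_i
        w += delta_w                      # w_i <- w_i + Delta w_i
    return w

# made-up data: t = 2*x1 - x2, with a constant x0 = 1 bias input
data = [(np.array([1.0, a, b]), 2 * a - b)
        for a in (0.0, 1.0, 2.0) for b in (0.0, 1.0)]
print(gradient_descent(data))             # approaches [0, 2, -1]
```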

(2) FIND-S Algorithm
   - Initialize h to the most specific hypothesis in H
   - For each positive training instance x
     - For each attribute constraint a_i in h
       - If the constraint a_i is satisfied by x
         Then do nothing
       - Else replace a_i in h by the next more general constraint that is satisfied by x
   - Output hypothesis h
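FIND-S as a runnable sketch over conjunctions of attribute constraints, with None standing for the most specific constraint and "?" for the fully general one; the EnjoySport-style toy data are illustrative:

```python
# Sketch: FIND-S over conjunctive hypotheses.

def find_s(examples, n_attrs):
    h = [None] * n_attrs                  # the most specific hypothesis in H
    for x, positive in examples:
        if not positive:                  # FIND-S ignores negative examples
            continue
        for i in range(n_attrs):
            if h[i] is None:              # constraint allows nothing yet:
                h[i] = x[i]               # tighten it to this example's value
            elif h[i] not in ("?", x[i]): # constraint not satisfied by x:
                h[i] = "?"                # replace with the next more general one
    return h

# toy EnjoySport-style data: (attribute tuple, is it a positive example?)
data = [
    (("Sunny", "Warm", "Normal", "Strong"), True),
    (("Sunny", "Warm", "High", "Strong"), True),
    (("Rainy", "Cold", "High", "Strong"), False),
]
print(find_s(data, 4))                    # ['Sunny', 'Warm', '?', 'Strong']
```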
