
(1)

## Bayesian Learning

Based on “Machine Learning”, T. Mitchell, McGraw-Hill, 1997, ch. 6

Acknowledgement:

The present slides are an adaptation of slides prepared by T. Mitchell

(2)

## Two Roles for the Bayesian Methods in Learning

1.

### Provides practical learning algorithms

by combining prior knowledge/probabilities with observed data:

• Naive Bayes learning algorithm

• Expectation Maximization (EM) learning algorithm (scheme):

learning in the presence of unobserved variables

• Bayesian Belief Network learning

2.

### Provides a useful conceptual framework

• Serves as a framework for evaluating other learning algorithms, e.g.

concept learning through general-to-specific ordering of hypotheses (Find-S and Candidate-Elimination),

neural networks, linear regression


• Provides additional insight into Occam’s razor

(3)

1. Basic Notions

Bayes' Theorem

Defining classes of hypotheses:
Maximum A Posteriori (MAP) hypotheses
Maximum Likelihood (ML) hypotheses

2. Learning MAP hypotheses

2.1 The brute-force MAP hypothesis learning algorithm
2.2 The Bayes optimal classifier
2.3 The Gibbs classifier
2.4 The Naive Bayes and the Joint Bayes classifiers
    Example: learning over text data using Naive Bayes
2.5 The Minimum Description Length (MDL) principle; MDL hypotheses

3. Learning ML hypotheses

3.1 ML hypotheses in learning real-valued functions
3.2 ML hypotheses in learning to predict probabilities
3.3 The Expectation Maximization (EM) algorithm

4. Bayesian Belief Networks

(4)

## 1 Basic Notions

• Product Rule:

probability of a conjunction of two events A and B:

P(A ∧ B) = P(A|B)P(B) = P(B|A)P(A)

• Bayes’ Theorem:

P(A|B) = P(B|A)P(A) / P(B)

• Theorem of total probability:

if events A1, . . . , An are mutually exclusive, with Σ_{i=1}^n P(Ai) = 1, then

P(B) = Σ_{i=1}^n P(B|Ai)P(Ai)

in particular

P(B) = P(B|A)P(A) + P(B|¬A)P(¬A)
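These identities can be verified numerically; the joint distribution below is an arbitrary toy example (the numbers are ours, not from the slides):

```python
# Check the product rule and the theorem of total probability
# on a toy joint distribution over two binary events A and B.
P = {
    (True, True): 0.20,    # P(A ∧ B)
    (True, False): 0.30,   # P(A ∧ ¬B)
    (False, True): 0.10,   # P(¬A ∧ B)
    (False, False): 0.40,  # P(¬A ∧ ¬B)
}

def p_a(a):  # marginal P(A = a)
    return sum(v for (x, _), v in P.items() if x == a)

def p_b(b):  # marginal P(B = b)
    return sum(v for (_, y), v in P.items() if y == b)

def p_b_given_a(b, a):  # conditional P(B = b | A = a)
    return P[(a, b)] / p_a(a)

# Product rule: P(A ∧ B) = P(B|A) P(A)
assert abs(P[(True, True)] - p_b_given_a(True, True) * p_a(True)) < 1e-12

# Total probability: P(B) = P(B|A)P(A) + P(B|¬A)P(¬A)
total = p_b_given_a(True, True) * p_a(True) + p_b_given_a(True, False) * p_a(False)
assert abs(p_b(True) - total) < 1e-12
```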

(5)

(6)

## Defining MAP and ML Hypotheses

The Maximum A Posteriori (MAP) hypothesis:

h_MAP = argmax_{h∈H} P(h|D) = argmax_{h∈H} P(D|h)P(h) / P(D) = argmax_{h∈H} P(D|h)P(h)

The Maximum Likelihood (ML) hypothesis, obtained when all hypotheses h_i ∈ H are assumed equally probable a priori:

h_ML = argmax_{h∈H} P(D|h)

(7)

## Exemplifying MAP Hypotheses

Suppose the following data characterize the lab result for cancer-suspect people:

P(cancer) = 0.008, P(¬cancer) = 0.992; hypotheses: h1 = cancer, h2 = ¬cancer
P(+|cancer) = 0.98, P(−|cancer) = 0.02; lab outcomes: D ⊆ {+, −}
P(+|¬cancer) = 0.03, P(−|¬cancer) = 0.97

Question: Should we diagnose a patient x whose lab result is positive as having cancer?

We have to find argmax{P(cancer|+), P(¬cancer|+)}. Applying Bayes' theorem (for D = {+}):

P(+|cancer)P(cancer) = 0.98 × 0.008 = 0.0078
P(+|¬cancer)P(¬cancer) = 0.03 × 0.992 = 0.0298

⇒ h_MAP = ¬cancer. (We can infer P(cancer|+) = 0.0078 / (0.0078 + 0.0298) ≈ 0.21.)
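The computation above can be replayed in a few lines (a sketch; the variable names are ours):

```python
# Posterior computation for the cancer lab-test example.
p_cancer = 0.008
p_pos_given_cancer = 0.98   # P(+ | cancer)
p_pos_given_not = 0.03      # P(+ | ¬cancer)

# Unnormalized posteriors P(+|h)P(h):
score_cancer = p_pos_given_cancer * p_cancer       # 0.98 * 0.008
score_not = p_pos_given_not * (1 - p_cancer)       # 0.03 * 0.992

h_map = "cancer" if score_cancer > score_not else "no cancer"

# Normalizing gives the true posterior P(cancer | +):
p_cancer_given_pos = score_cancer / (score_cancer + score_not)
```

Note how the small prior P(cancer) = 0.008 dominates the high test sensitivity: the MAP hypothesis is ¬cancer even after a positive result.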

(8)

## 2 Learning MAP Hypotheses

### 2.1 The Brute-Force MAP Learning Algorithm

1. For each hypothesis h ∈ H, calculate the posterior probability

P(h|D) = P(D|h)P(h) / P(D)

2. Output the hypothesis h_MAP with the highest posterior probability:

h_MAP = argmax_{h∈H} P(h|D)

(9)

## 2.2 The Bayes Optimal Classifier

Question: What is the most probable classification of a new instance x, given the training data D? (Note: it is not necessarily the classification given by h_MAP.)

Answer: the most probable classification is obtained by combining the predictions of all hypotheses h_i ∈ H, weighted by their posterior probabilities:

P(v_j|D) = Σ_{h_i∈H} P(v_j|h_i)P(h_i|D)

The Bayes optimal classification:

argmax_{v_j∈V} Σ_{h_i∈H} P(v_j|h_i)P(h_i|D)

(10)

### The Bayes Optimal Classifier: An Example

Let us consider three possible hypotheses:

P(h1|D) = 0.4, P(h2|D) = 0.3, P(h3|D) = 0.3. Obviously, h_MAP = h1.

Let's consider an instance x such that h1(x) = +, h2(x) = −, h3(x) = −

Question: What is the most probable classification of x?

P(−|h1) = 0, P(+|h1) = 1
P(−|h2) = 1, P(+|h2) = 0
P(−|h3) = 1, P(+|h3) = 0

Σ_{h_i∈H} P(+|h_i)P(h_i|D) = 0.4 and Σ_{h_i∈H} P(−|h_i)P(h_i|D) = 0.6

therefore

argmax_{v_j∈V} Σ_{h_i∈H} P(v_j|h_i)P(h_i|D) = −
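A minimal sketch of this weighted vote (names are ours):

```python
# Bayes optimal classification for the three-hypothesis example above.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}   # P(hi | D)
predictions = {"h1": "+", "h2": "-", "h3": "-"}  # hi(x)

def bayes_optimal(posteriors, predictions, values=("+", "-")):
    # For each class value, sum the posterior mass of the hypotheses
    # predicting that value, then pick the value with the largest mass.
    vote = {v: sum(p for h, p in posteriors.items() if predictions[h] == v)
            for v in values}
    return max(vote, key=vote.get), vote

label, vote = bayes_optimal(posteriors, predictions)
# vote == {"+": 0.4, "-": 0.6}, so label == "-" although h_MAP predicts "+"
```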

(11)

## 2.3 The Gibbs Classifier

### [Opper and Haussler, 1991]

Note: The Bayes optimal classifier provides the best result, but it can be expensive if there are many hypotheses.

Gibbs algorithm:

1. Choose one hypothesis at random, according to P(h|D)

2. Use this hypothesis to classify the new instance

Surprising fact [Haussler et al. 1994]:

If the target concept is selected randomly according to the P(h|D) distribution, then the expected error of Gibbs Classifier is no worse than twice the expected error of the Bayes optimal classifier!

E[errorGibbs] ≤ 2E[errorBayesOptimal]
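A minimal sketch of the two steps above, using Python's `random.choices` for the posterior-weighted draw; the hypothesis set is the toy one from the Bayes-optimal example:

```python
import random

# Gibbs classifier: sample ONE hypothesis from P(h|D) and use it alone,
# instead of averaging over all hypotheses as the Bayes optimal classifier does.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}   # P(hi | D)
predictions = {"h1": "+", "h2": "-", "h3": "-"}  # hi(x)

def gibbs_classify(posteriors, predictions, rng=random):
    hyps = list(posteriors)
    # Step 1: draw a hypothesis with probability proportional to P(h|D)
    h = rng.choices(hyps, weights=[posteriors[h] for h in hyps])[0]
    # Step 2: classify with that single hypothesis
    return predictions[h]
```

Over many draws, the classifier answers "−" about 60% of the time, matching the posterior mass of the hypotheses that predict "−".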

(12)

## 2.4 The Naive Bayes Classifier

When to use it:

The target function f takes value from a finite set V = {v1, . . . , vk}

Moderate or large training data set is available

The attributes ⟨a1, . . . , an⟩ that describe instances are conditionally independent w.r.t. the given classification:

P(a1, a2 . . . an|v_j) = Π_i P(a_i|v_j)

The most probable value of f(x) is:

v_MAP = argmax_{v_j∈V} P(v_j|a1, a2 . . . an) = argmax_{v_j∈V} P(a1, a2 . . . an|v_j)P(v_j) / P(a1, a2 . . . an)

= argmax_{v_j∈V} P(a1, a2 . . . an|v_j)P(v_j) = argmax_{v_j∈V} P(v_j) Π_i P(a_i|v_j), denoted v_NB

This is the so-called decision rule of the Naive Bayes classifier.

(13)

## The Joint Bayes Classifier

v_MAP = argmax_{v_j∈V} P(v_j|a1, a2 . . . an) = . . .

= argmax_{v_j∈V} P(a1, a2 . . . an|v_j)P(v_j) = argmax_{v_j∈V} P(a1, a2 . . . an, v_j), denoted v_JB

(14)


(15)

## The Naive Bayes Algorithm

Naive_Bayes_Learn(examples)

For each target value v_j
  P̂(v_j) ← estimate P(v_j)
  For each attribute value a_i of each attribute a
    P̂(a_i|v_j) ← estimate P(a_i|v_j)

Classify_New_Instance(x)

v_NB = argmax_{v_j∈V} P̂(v_j) Π_{a_i∈x} P̂(a_i|v_j)

(16)

## The Naive Bayes: An Example

Consider again the PlayTennis example, and the new instance

⟨Outlook = sun, Temp = cool, Humidity = high, Wind = strong⟩

We compute:

v_NB = argmax_{v_j∈V} P(v_j) Π_i P(a_i|v_j)

P(yes) = 9/14 = 0.64, P(no) = 5/14 = 0.36, . . .

P(strong|yes) = 3/9 = 0.33, P(strong|no) = 3/5 = 0.60

P(yes) P(sun|yes) P(cool|yes) P(high|yes) P(strong|yes) = 0.0053
P(no) P(sun|no) P(cool|no) P(high|no) P(strong|no) = 0.0206

→ v_NB = no
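The computation can be replayed as follows; the conditional estimates elided by the ". . ." on the slide are filled in with the standard PlayTennis counts from Mitchell's book:

```python
# Naive Bayes decision for the PlayTennis instance
# <Outlook=sun, Temp=cool, Humidity=high, Wind=strong>.
priors = {"yes": 9/14, "no": 5/14}
cond = {  # P(attribute value | class), from the standard PlayTennis counts
    "yes": {"sun": 2/9, "cool": 3/9, "high": 3/9, "strong": 3/9},
    "no":  {"sun": 3/5, "cool": 1/5, "high": 4/5, "strong": 3/5},
}
attrs = ["sun", "cool", "high", "strong"]

score = {}
for v in priors:
    s = priors[v]
    for a in attrs:
        s *= cond[v][a]   # multiply in each conditional estimate
    score[v] = s

v_nb = max(score, key=score.get)   # the Naive Bayes decision
```

The two products come out as roughly 0.0053 for "yes" and 0.0206 for "no", matching the slide.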

(17)

## The Naive Bayes: Subtleties

The conditional independence assumption

P(a1, a2 . . . an|v_j) = Π_i P(a_i|v_j)

is often violated, but Naive Bayes works surprisingly well anyway.

Note that we don't need the estimated posteriors P̂(v_j|x) to be correct; we need only that

argmax_{v_j∈V} P̂(v_j) Π_i P̂(a_i|v_j) = argmax_{v_j∈V} P(v_j)P(a1 . . . an|v_j)

(18)

## The problem of unseen data

What if none of the training instances with target value v_j have the attribute value a_i?

It follows that P̂(a_i|v_j) = 0, and therefore P̂(v_j) Π_i P̂(a_i|v_j) = 0.

The typical solution is to (re)define P̂(a_i|v_j), for each value a_i of the attribute a, using the m-estimate:

P̂(a_i|v_j) ← (n_c + mp) / (n + m), where

• n is the number of training examples for which v = v_j,

• n_c is the number of examples for which v = v_j and a = a_i,

• p is a prior estimate for P̂(a_i|v_j)
(for instance, if the attribute a has k values, then p = 1/k),

• m is a weight given to that prior estimate (i.e., the number of "virtual" examples)
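A direct transcription of the m-estimate (a sketch; the parameter names follow the bullets above):

```python
def m_estimate(n_c, n, p, m):
    """m-estimate of P(ai|vj): (n_c + m*p) / (n + m).

    n_c: number of examples with v = vj and a = ai
    n:   number of examples with v = vj
    p:   prior estimate (e.g. 1/k for an attribute with k values)
    m:   weight of the prior, i.e. the number of "virtual" examples
    """
    return (n_c + m * p) / (n + m)

# With m = 0 this reduces to the raw frequency n_c / n;
# with n_c = 0 and m > 0 it stays strictly positive, so a single
# unseen attribute value no longer zeroes out the whole product.
```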

(19)


(20)

## Learning to Classify Text: Main Design Issues

1. Represent each document by a vector of words

• one attribute per word position in the document

2. Learning:

• use training examples to estimate P(+), P(−), P(doc|+), P(doc|−)

• Naive Bayes conditional independence assumption:

P(doc|v_j) = Π_{i=1}^{length(doc)} P(a_i = w_k|v_j)

where P(a_i = w_k|v_j) is the probability that the word in position i is w_k, given v_j

• Make one more assumption:

∀i, m: P(a_i = w_k|v_j) = P(a_m = w_k|v_j) = P(w_k|v_j)

i.e., attributes are not only independent but also identically distributed

(21)

### Learn_naive_Bayes_text(Examples, Vocabulary)

1. Collect all words and other tokens that occur in Examples

Vocabulary ← all distinct words and other tokens in Examples

2. Calculate the required P(v_j) and P(w_k|v_j) probability terms

For each target value v_j in V

  docs_j ← the subset of Examples for which the target value is v_j

  P(v_j) ← |docs_j| / |Examples|

  Text_j ← a single document created by concatenating all members of docs_j

  n ← the total number of words in Text_j

  For each word w_k in Vocabulary

    n_k ← the number of times word w_k occurs in Text_j

    P(w_k|v_j) ← (n_k + 1) / (n + |Vocabulary|) (here we use the m-estimate)

(22)

### Classify_naive_Bayes_text(Doc)

positions ← all word positions in Doc that contain tokens from Vocabulary

Return

v_NB = argmax_{v_j∈V} P(v_j) Π_{i∈positions} P(a_i = w_k|v_j)
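The two procedures above can be sketched compactly as follows; the toy corpus and the simple token handling are our own illustration, not data from the slides:

```python
from collections import Counter

def learn_nb_text(examples, vocabulary):
    """examples: list of (list_of_words, label) pairs.
    Returns the priors P(v) and the smoothed estimates P(w|v)."""
    labels = {v for _, v in examples}
    priors, cond = {}, {}
    for v in labels:
        docs = [words for words, lab in examples if lab == v]
        priors[v] = len(docs) / len(examples)          # P(v_j)
        text = [w for words in docs for w in words]    # Text_j (concatenation)
        n = len(text)
        counts = Counter(text)
        # m-estimate with m = |Vocabulary| and p = 1/|Vocabulary|:
        cond[v] = {w: (counts[w] + 1) / (n + len(vocabulary))
                   for w in vocabulary}
    return priors, cond

def classify_nb_text(doc, priors, cond, vocabulary):
    positions = [w for w in doc if w in vocabulary]
    score = {}
    for v in priors:
        s = priors[v]
        for w in positions:
            s *= cond[v][w]
        score[v] = s
    return max(score, key=score.get)

# Invented two-class toy corpus:
examples = [(["good", "great"], "+"), (["bad", "awful"], "-"),
            (["good", "bad"], "+")]
vocabulary = {"good", "great", "bad", "awful"}
priors, cond = learn_nb_text(examples, vocabulary)
label = classify_nb_text(["good", "great"], priors, cond, vocabulary)
```

In practice the per-class scores underflow on long documents, so real implementations sum log-probabilities instead of multiplying.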

(23)

Given 1000 training documents from each of the 20 newsgroups, learn to classify new documents according to the newsgroup they came from:

comp.graphics, comp.os.ms-windows.misc, comp.sys.ibm.pc.hardware, comp.sys.mac.hardware, comp.windows.x, misc.forsale, rec.autos, rec.motorcycles, rec.sport.baseball, rec.sport.hockey, alt.atheism, soc.religion.christian, talk.religion.misc, talk.politics.mideast, talk.politics.misc, talk.politics.guns, sci.space, sci.crypt, sci.electronics, sci.med

Naive Bayes: 89% classification accuracy (having used 2/3 of each group for training; eliminated rare words and the 100 most frequent words)

(24)

### Learning Curve for 20 Newsgroups

[Figure: learning curve for the 20 Newsgroups task. Classification accuracy (0–100%) vs. training set size (100 to 10000 documents), for the Bayes, TFIDF, and PRTFIDF algorithms.]

(25)

## 2.5 The Minimum Description Length Principle

Occam's razor: prefer the shortest hypothesis.

Bayes analysis: prefer the hypothesis h_MAP:

h_MAP = argmax_{h∈H} P(D|h)P(h) = argmax_{h∈H} (log2 P(D|h) + log2 P(h))

= argmin_{h∈H} (−log2 P(D|h) − log2 P(h))

Interesting fact from Information Theory:

The optimal (shortest expected coding length) code for an event with probability p uses −log2 p bits.

So we can interpret:

−log2 P(h): the length of h under the optimal code

−log2 P(D|h): the length of D given h under the optimal code

Therefore we prefer the hypothesis h that minimizes the sum of these two description lengths.

(26)

### Bayes Analysis and the MDL Principle

We saw that a MAP learner prefers the hypothesis h that minimizes L_C1(h) + L_C2(D|h), where L_C(x) is the description length of x under encoding C:

h_MDL = argmin_{h∈H} (L_C1(h) + L_C2(D|h))

Example: H = decision trees, D = training data labels

• L_C1(h) is the number of bits to describe the tree h

• L_C2(D|h) is the number of bits to describe D given h

In the literature, applications of MDL to practical problems often include arguments justifying the choice of the encodings C1 and C2.

(27)

### For instance:

L_C2(D|h) = 0 if the examples are classified perfectly by h, and both the transmitter and the receiver know h.

Therefore, in this situation we need only describe the exceptions. So:

h_MDL = argmin_{h∈H} (length(h) + length(misclassifications))

In general, MDL trades off hypothesis size for training errors:

it might select a shorter hypothesis that makes a few errors over a longer hypothesis that perfectly classifies the data!

Consequence: in learning (for instance) decision trees, the MDL principle can work as an alternative to pruning.

(28)

(29)

## 3 Learning Maximum Likelihood (ML) Hypothesis

### ML Hypotheses as Least-Squared-Error Hypotheses

[Figure: a target function f, noisy observations y = f(x) + e, and the learned hypothesis h_ML.]

Problem: Consider learning a real-valued target function f : X → R from D, a training set consisting of examples ⟨x_i, d_i⟩, i = 1, . . . , m, with

• x_i assumed fixed (to simplify),

• d_i a noisy training value: d_i = f(x_i) + e_i,

• e_i a random variable (noise) drawn independently for each x_i, according to some Gaussian distribution with mean 0.

(30)

### Proposition

Considering H, a certain class of functions h : X → R such that h(x_i) = f(x_i), and assuming that the x_i are mutually independent given h,

the maximum likelihood hypothesis h_ML is the one that minimizes the sum of squared errors:

h_ML def= argmax_{h∈H} P(D|h) = argmin_{h∈H} Σ_{i=1}^m (d_i − h(x_i))²

(31)

Note: We will use the probability density function:

p(x0) def= lim_{ε→0} (1/ε) P(x0 ≤ x < x0 + ε)

h_ML = argmax_{h∈H} P(D|h) = argmax_{h∈H} Π_{i=1}^m p(d_i|h)

= argmax_{h∈H} Π_{i=1}^m p(e_i|h)    (since μ_i = f(x_i))

= argmax_{h∈H} Π_{i=1}^m p(d_i − f(x_i)|h)

= argmax_{h∈H} Π_{i=1}^m p(d_i − h(x_i)|h)    (using h(x_i) = f(x_i))

= argmax_{h∈H} Π_{i=1}^m (1/√(2πσ²)) e^{−(1/2)((d_i − h(x_i))/σ)²}

= argmax_{h∈H} Σ_{i=1}^m [ln (1/√(2πσ²)) − (1/2)((d_i − h(x_i))/σ)²]

= argmax_{h∈H} Σ_{i=1}^m −(1/2)((d_i − h(x_i))/σ)²

= argmax_{h∈H} Σ_{i=1}^m −(d_i − h(x_i))²

= argmin_{h∈H} Σ_{i=1}^m (d_i − h(x_i))²
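A quick numeric illustration of the conclusion; the synthetic data and the one-parameter hypothesis class h(x) = w·x are our simplification:

```python
# Under Gaussian noise, the ML hypothesis is the one minimizing the
# sum of squared errors. For lines through the origin, h(x) = w*x,
# the minimizer has the closed form w* = Σ x_i d_i / Σ x_i².
xs = [1.0, 2.0, 3.0, 4.0]
ds = [2.1, 3.9, 6.2, 7.8]   # noisy targets for f(x) = 2x (invented)

def sse(w):
    # sum of squared errors of the hypothesis h(x) = w*x
    return sum((d - w * x) ** 2 for x, d in zip(xs, ds))

w_star = sum(x * d for x, d in zip(xs, ds)) / sum(x * x for x in xs)

# Any nearby slope has a larger squared error:
assert sse(w_star) <= sse(w_star + 0.01)
assert sse(w_star) <= sse(w_star - 0.01)
```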

(32)

## Generalisations...

1. Similar derivations can be performed starting with other assumed noise distributions (than Gaussians), producing different results.

2. It was assumed that

a. the noise affects only f(x_i), and

b. no noise was recorded in the attribute values of the given examples x_i.

Otherwise, the analysis becomes significantly more complex.

(33)

### 3.2 ML hypotheses for Learning Probability Functions

Let us consider a non-deterministic function (i.e. one-to-many relation) f : X → {0,1}.

Given a set of independently drawn examples

D = {< x1, d1 >, . . . , < xm, dm >} where di = f(xi) ∈ {0,1},

we would like to learn a ML hypothesis for the probability function g(x) def.= P(f(x) = 1).

For example, h(x_i) = 0.92 if the probability that x_i is labeled 1 is 0.92.

Proposition: In this setting, h_ML = argmax_{h∈H} P(D|h) maximizes the sum Σ_{i=1}^m [d_i ln h(x_i) + (1 − d_i) ln(1 − h(x_i))].

Proof:

P(D|h) = Π_{i=1}^m P(x_i, d_i|h) = Π_{i=1}^m P(d_i|x_i, h) · P(x_i|h)

It can be assumed that x_i is independent of h, therefore:

P(D|h) = Π_{i=1}^m P(d_i|x_i, h) · P(x_i)

(34)

Proof (continued):

What we want to model is h(x_i) = P(d_i = 1|x_i, h). In a more general form:

P(d_i|x_i, h) = h(x_i) if d_i = 1, and 1 − h(x_i) if d_i = 0

In a more convenient mathematical form: P(d_i|x_i, h) = h(x_i)^{d_i} (1 − h(x_i))^{1−d_i}.

⇒ h_ML = argmax_{h∈H} Π_{i=1}^m [h(x_i)^{d_i} (1 − h(x_i))^{1−d_i} P(x_i)]

= argmax_{h∈H} Π_{i=1}^m h(x_i)^{d_i} (1 − h(x_i))^{1−d_i} · Π_{i=1}^m P(x_i)

= argmax_{h∈H} Π_{i=1}^m h(x_i)^{d_i} (1 − h(x_i))^{1−d_i}

= argmax_{h∈H} Σ_{i=1}^m [d_i ln h(x_i) + (1 − d_i) ln(1 − h(x_i))]

Note: The quantity −Σ_{i=1}^m [d_i ln h(x_i) + (1 − d_i) ln(1 − h(x_i))] is called cross-entropy; the above h_ML minimizes this quantity.
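A quick numeric check of the proposition for the simplest possible hypothesis class, constant predictions h(x) = p (the toy labels and the grid search are ours):

```python
import math

# For constant h(x) = p, the quantity
#   Σ [d_i ln h(x_i) + (1 − d_i) ln(1 − h(x_i))]
# should be maximized at the empirical frequency of 1s in the labels.
ds = [1, 1, 1, 0, 0]   # toy labels: 3/5 ones

def log_likelihood(p):
    return sum(d * math.log(p) + (1 - d) * math.log(1 - p) for d in ds)

# Grid search over p in (0, 1); the maximizer should land on 0.6:
best = max((p / 100 for p in range(1, 100)), key=log_likelihood)
```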

(35)

### 3.3 The Expectation Maximization (EM) Algorithm

[Dempster et al, 1977]

### Find (local) Maximum Likelihood hypotheses when data is only partially observable:

• Unsupervised learning (i.e., clustering):

the target value is unobservable

• Supervised learning:

some instance attributes are unobservable

### Some applications:

• Non-hierarchical clustering:

Estimate the means of k Gaussians

• Learn Hidden Markov Models

• Learn Probabilistic Context Free Grammars

• Train Radial Basis Function Networks

• Train Bayesian Belief Networks

(36)


(37)


(38)

(39)

## The EM Algorithmic Schema

Idea: replace missing values by estimated values:

1. initialize the parameters with arbitrary values;

2. estimate the missing values based on the current parameter values;

3. re-estimate the parameters using the complete data;

4. repeat the previous two steps until convergence.

Formally, starting from an arbitrary h(0) and iterating over t:

E step: compute E[Z | X; h(t)], the expected values of the unobserved variables Z, given the observed data X and the current hypothesis h(t);

M step: h(t+1) = argmax_h E_{P(Z|X; h(t))}[ln P(X, Z|h)]

(40)

### Methodology: How to apply an instance of the EM schema

Start with arbitrarily initialized parameters (t = 0), then repeat until convergence:

E step: apply the updating rules (1), computing the expected values of the latent variables under P(Z|X; h(t));

M step: apply the updating rules (2), obtaining the new parameters h(t+1); increment t.

(41)

### Methodology: How to derive an instance of the EM schema

Given h (the parameters/hypothesis), the "observable" variables/data X, and the "unobservable"/latent ("auxiliary") variables Z, start from the log-likelihood ln P(X|h) of the observable data and:

(1) compute the log-likelihood of the "complete" data;

(2) apply expectation linearity, obtaining the auxiliary function Q(h|h(t));

(3) apply Bayes' formula, together with the total probability formula (and sometimes the exponentiation trick), to derive the E-step updating rules E[Z | X; h(t)];

(4) solve the optimization problem h(t+1) = argmax_h E_{P(Z|X; h(t))}[ln P(X, Z|h)] (usually using partial derivatives, and sometimes Lagrange's method) to derive the M-step updating rules.

Then iterate the E and M steps starting from arbitrarily initialized parameters (t = 0).

(42)

(43)

[Figure: two simple mixture-of-Bernoulli models, each with a latent selector variable of parameter π:

X ∼ π·Bernoulli(θ) + (1 − π)·Bernoulli(2θ)

X ∼ π·Bernoulli(p) + (1 − π)·Bernoulli(q)]

(44)

EM algorithm application: a mixture of vectors of independent and identically distributed (i.i.d.) Bernoulli variables.

[Figure: a latent variable Z ∼ Bernoulli(π) selects component A or B; given the component, the ten observed variables X1, . . . , X10 are i.i.d. Bernoulli(θA), respectively Bernoulli(θB).]

X = (x1, . . . , x10) ∼ π·Bernoulli(x1; θA)·. . .·Bernoulli(x10; θA) + (1 − π)·Bernoulli(x1; θB)·. . .·Bernoulli(x10; θB)

Cf. “What is the expectation maximization algorithm?”, Chuong B. Do, Serafim Batzoglou, Nature Biotechnology, vol. 26, no. 8, 2008, pp. 897–899

(45)

Initialization:

assign arbitrary values θA(0), θB(0) in the interval (0, 1) to the parameters θA and θB, respectively;

Iterative body:

for t = 0, . . . , T − 1 (with T fixed in advance)
(or: until the log-likelihood of the observable data no longer increases significantly),
(or: until |θA(t) − θA(t+1)| < ε and |θB(t) − θB(t+1)| < ε, with ε fixed in advance), do

E step:

for i = 1, . . . , 5 compute

p_{i,A}^{(t+1)} = (θA(t))^{x_i} (1 − θA(t))^{10−x_i} / [(θA(t))^{x_i} (1 − θA(t))^{10−x_i} + (θB(t))^{x_i} (1 − θB(t))^{10−x_i}]

p_{i,B}^{(t+1)} = (θB(t))^{x_i} (1 − θB(t))^{10−x_i} / [(θA(t))^{x_i} (1 − θA(t))^{10−x_i} + (θB(t))^{x_i} (1 − θB(t))^{10−x_i}]

M step:

θA(t+1) = Σ_{i=1}^5 (x_i/10) p_{i,A}^{(t+1)} / Σ_{i=1}^5 p_{i,A}^{(t+1)} and θB(t+1) = Σ_{i=1}^5 (x_i/10) p_{i,B}^{(t+1)} / Σ_{i=1}^5 p_{i,B}^{(t+1)};

Return θA(t+1), θB(t+1).
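The pseudocode above translates directly to Python; the head counts `xs` and the initial values θA(0) = 0.6, θB(0) = 0.5 are the ones used in the cited Do & Batzoglou paper:

```python
# EM for the two-coin mixture: x_i = number of heads in 10 tosses
# of a coin chosen at random between coins A and B.
xs = [5, 9, 8, 4, 7]   # head counts from the Do & Batzoglou example

def em_two_coins(xs, theta_a=0.6, theta_b=0.5, iters=10):
    for _ in range(iters):
        num_a = den_a = num_b = den_b = 0.0
        for x in xs:
            # E step: responsibility of each coin for this sequence
            la = theta_a ** x * (1 - theta_a) ** (10 - x)
            lb = theta_b ** x * (1 - theta_b) ** (10 - x)
            p_a = la / (la + lb)
            p_b = 1 - p_a
            # accumulate the M-step sums (weighted heads fractions)
            num_a += (x / 10) * p_a
            den_a += p_a
            num_b += (x / 10) * p_b
            den_b += p_b
        # M step: re-estimate both biases
        theta_a, theta_b = num_a / den_a, num_b / den_b
    return theta_a, theta_b

theta_a, theta_b = em_two_coins(xs)
# converges to roughly theta_a ≈ 0.80, theta_b ≈ 0.52
```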

(46)

A mixture of vectors of independent Bernoulli distributions, applied to clustering of hand-written digits (MNIST)

2

1

### π

K

(p )1,i

~B

X1,1 X1,i X1,D

(p )2,i

~B

X2,1 X2,i X2,D

(p )K,i

~B

XK,1 X K,i XK,D

x1 xi xD 1

Z~Categorical

2 K

x CMU, 2015f, A. Smola, B. Poczos,

HW2, pr. 1

X = (x1, . . . , xD) π1·Bernoulli(x1;p1,1)·. . .·Bernoulli(xD;p1,D) +πK·Bernoulli(x1;pK,1)·. . .·Bernoulli(xD;pK,D).

(47)

(48)

## Conditional Independence

Definition: X is conditionally independent of Y given Z if the probability distribution governing X is independent of the value of Y, given the value of Z; that is, if

(∀x_i, y_j, z_k) P(X = x_i | Y = y_j, Z = z_k) = P(X = x_i | Z = z_k)

written more compactly: P(X|Y, Z) = P(X|Z).

The definition extends to sets of variables:

P(X1 . . . Xl | Y1 . . . Ym, Z1 . . . Zn) = P(X1 . . . Xl | Z1 . . . Zn)

Example: the Naive Bayes classifier uses conditional independence to justify

P(X1, X2|V) = P(X1|X2, V) P(X2|V) = P(X1|V) P(X2|V)

(49)

## A Bayes Net

[Figure: a Bayesian network over the variables Storm, BusTourGroup, Lightning, Campfire, Thunder, and ForestFire, together with the conditional probability table of Campfire given its parents Storm (S) and BusTourGroup (B):

|          | S,B | S,¬B | ¬S,B | ¬S,¬B |
| Campfire | 0.4 | 0.1  | 0.8  | 0.2   |
| ¬Campfire| 0.6 | 0.9  | 0.2  | 0.8   |
]

The network is defined by

• A directed acyclic graph, representing a set of conditional independence assertions:

Each node — representing a random variable — is asserted to be conditionally independent of its nondescendants, given its immediate predecessors.

Example: P(Thunder|ForestFire, Lightning) = P(Thunder|Lightning)

• A table of local conditional probabilities for each node/variable.

(50)

## Representing the Joint Probability Distribution

The network represents the joint probability distribution over all the variables Y1, . . . , Yn in factored form:

P(y1, . . . , yn) = Π_{i=1}^n P(y_i | Parents(Y_i))

where Parents(Y_i) denotes the immediate predecessors of Y_i in the graph, and y1, . . . , yn are values of the variables Y1, . . . , Yn.

Example: P(Storm, BusTourGroup, . . . , ForestFire)
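The factored form can be checked on a small invented net (the chain structure and the CPT numbers below are ours, not from the slides):

```python
from itertools import product

# Tiny chain-structured net: Storm -> Lightning -> Thunder.
p_storm = {True: 0.1, False: 0.9}
p_light = {True: {True: 0.7, False: 0.3},     # P(Lightning | Storm)
           False: {True: 0.01, False: 0.99}}
p_thunder = {True: {True: 0.95, False: 0.05}, # P(Thunder | Lightning)
             False: {True: 0.02, False: 0.98}}

def joint(s, l, t):
    # P(s, l, t) = P(s) * P(l | s) * P(t | l), per the factored form
    return p_storm[s] * p_light[s][l] * p_thunder[l][t]

# The factored joint is a proper distribution: it sums to 1.
total = sum(joint(s, l, t) for s, l, t in product([True, False], repeat=3))
assert abs(total - 1.0) < 1e-12
```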

(51)

## Inference in Bayesian Nets

### (b) Compute P(A)

[Figure: a Bayesian network with edges L, F → S and S → A, G, and the local probability tables:

P(L) = 0.4, P(F) = 0.6

P(S|L,F) = 0.8, P(S|L,¬F) = 0.6, P(S|¬L,F) = 0.5, P(S|¬L,¬F) = 0.3

P(A|S) = 0.7, P(A|¬S) = 0.3

P(G|S) = 0.8, P(G|¬S) = 0.2]
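Using the CPT values above, P(A) can be computed by enumeration, summing the joint over L, F, and S (a sketch):

```python
from itertools import product

# CPT values from the network above.
P_L, P_F = 0.4, 0.6
P_S = {(True, True): 0.8, (True, False): 0.6,
       (False, True): 0.5, (False, False): 0.3}  # P(S=true | L, F)
P_A = {True: 0.7, False: 0.3}                    # P(A=true | S)

def pr(event, p):
    # P(event) for a binary variable with P(True) = p
    return p if event else 1 - p

# P(A=true) = Σ_{l,f,s} P(l) P(f) P(s|l,f) P(A=true|s)
p_a = sum(pr(l, P_L) * pr(f, P_F) * pr(s, P_S[(l, f)]) * P_A[s]
          for l, f, s in product([True, False], repeat=3))
# p_a ≈ 0.516
```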

(52)

(53)

(54)

(55)

(56)

## Learning Bayes Nets: Gradient Ascent

Let w_ijk denote one entry in the conditional probability table for the variable Y_i in the network:

w_ijk = P(Y_i = y_ij | Parents(Y_i) = the list u_ik of values)

The gradient of the log-likelihood ln P_h(D) has the components

∂ln P_h(D)/∂w_ijk = Σ_{d∈D} P_h(y_ij, u_ik | d) / w_ijk

Perform gradient ascent by repeatedly

1. updating all w_ijk using the training data D:

w_ijk ← w_ijk + η Σ_{d∈D} P_h(y_ij, u_ik | d) / w_ijk

2. then renormalizing the w_ijk, to ensure that Σ_j w_ijk = 1 and 0 ≤ w_ijk ≤ 1.

(57)

We show that

∂ln P_h(D)/∂w_ijk = Σ_{d∈D} ∂ln P_h(d)/∂w_ijk = Σ_{d∈D} (1/P_h(d)) · ∂P_h(d)/∂w_ijk

Writing P_h(d) via the theorem of total probability, summing over all values y_ij′ of Y_i and all values u_ik′ of Parents(Y_i):

P_h(d) = Σ_{j′,k′} P_h(d | y_ij′, u_ik′) P_h(y_ij′, u_ik′) = Σ_{j′,k′} P_h(d | y_ij′, u_ik′) w_ij′k′ P_h(u_ik′)

Since w_ijk appears only in the term with j′ = j and k′ = k:

∂P_h(d)/∂w_ijk = P_h(d | y_ij, u_ik) P_h(u_ik)

(58)

Applying Bayes' theorem to P_h(d | y_ij, u_ik):

∂ln P_h(d)/∂w_ijk = (1/P_h(d)) P_h(d | y_ij, u_ik) P_h(u_ik)

= (1/P_h(d)) · (P_h(y_ij, u_ik | d) P_h(d) / P_h(y_ij, u_ik)) · P_h(u_ik)

= P_h(y_ij, u_ik | d) P_h(u_ik) / P_h(y_ij, u_ik)

= P_h(y_ij, u_ik | d) P_h(u_ik) / (w_ijk P_h(u_ik)) = P_h(y_ij, u_ik | d) / w_ijk

using P_h(y_ij, u_ik) = P_h(y_ij | u_ik) P_h(u_ik) = w_ijk P_h(u_ik). Summing over d ∈ D gives the stated gradient:

∂ln P_h(D)/∂w_ijk = Σ_{d∈D} P_h(y_ij, u_ik | d) / w_ijk

(59)


(60)

## Learning Bayes Nets (III)

