Junction Tree Algorithm and ArgMax Junction Tree Algorithm, Study notes of Machine Learning

These notes cover the Junction Tree Algorithm and the ArgMax Junction Tree Algorithm, including Collect & Distribute and algorithmic complexity, with examples and pseudocode.

Typology: Study notes

2021/2022

Uploaded on 05/11/2023

Tony Jebara, Columbia University
Machine Learning
4771
Instructor: Tony Jebara

Topic 18

  • The Junction Tree Algorithm
  • Collect & Distribute
  • Algorithmic Complexity
  • ArgMax Junction Tree Algorithm

JTA with many cliques

  • Problem: what if we have more than two cliques?

1) Update AB & BC

2) Update BC & CD

  • Problem: AB has not heard about CD!

After BC updates, it will be inconsistent for AB

  • Need to iterate the pairwise updates many times
  • This will eventually converge to consistent marginals
  • But, inefficient… can we do better?

[Figure: chain junction tree AB -- [B] -- BC -- [C] -- CD, shown before and after the pairwise updates]
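The inconsistency described above can be reproduced numerically. The following is a minimal sketch, assuming binary variables, a chain A -> B -> C -> D, evidence D = 1, and made-up CPT values (none of these numbers come from the slides). It runs the two pairwise updates in order, measures how far the B marginals of AB and BC disagree, then revisits the AB & BC pair.

```python
def sum_out_first(psi):
    """Sum a 2x2 table psi[u][s] over its first index, leaving a function of s."""
    return [psi[0][s] + psi[1][s] for s in range(2)]

def sum_out_second(psi):
    """Sum a 2x2 table psi[s][t] over its second index, leaving a function of s."""
    return [psi[s][0] + psi[s][1] for s in range(2)]

def update_pair(left, sep, right):
    """Two-clique update on left[u][s] -- sep[s] -- right[s][t], in place."""
    new = sum_out_first(left)              # message left -> right
    for s in range(2):
        for t in range(2):
            right[s][t] *= new[s] / sep[s]
    sep[:] = new
    new = sum_out_second(right)            # message right -> left
    for u in range(2):
        for s in range(2):
            left[u][s] *= new[s] / sep[s]
    sep[:] = new

def normalized(v):
    z = sum(v)
    return [x / z for x in v]

# Hypothetical CPTs (illustrative values only).
p_a = [0.6, 0.4]
p_b_a = [[0.7, 0.3], [0.2, 0.8]]   # p_b_a[a][b]
p_c_b = [[0.9, 0.1], [0.4, 0.6]]   # p_c_b[b][c]
p_d_c = [[0.5, 0.5], [0.1, 0.9]]   # p_d_c[c][d]

psi_ab = [[p_a[a] * p_b_a[a][b] for b in range(2)] for a in range(2)]
psi_bc = [row[:] for row in p_c_b]
psi_cd = [[0.0, p_d_c[c][1]] for c in range(2)]   # evidence D = 1 zeroes the d = 0 column
phi_b, phi_c = [1.0, 1.0], [1.0, 1.0]

def b_gap():
    """Largest disagreement between the B marginals held by AB and by BC."""
    from_ab = normalized(sum_out_first(psi_ab))
    from_bc = normalized(sum_out_second(psi_bc))
    return max(abs(x - y) for x, y in zip(from_ab, from_bc))

update_pair(psi_ab, phi_b, psi_bc)   # 1) update AB & BC
update_pair(psi_bc, phi_c, psi_cd)   # 2) update BC & CD
gap_after_one_sweep = b_gap()        # AB has not heard about CD

update_pair(psi_ab, phi_b, psi_bc)   # revisit AB & BC
gap_after_revisit = b_gap()
```

In this sketch the gap is clearly nonzero after the first sweep and vanishes once AB & BC is revisited, which is why an unordered schedule may need many repeated passes on larger trees.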

JTA: Collect & Distribute

  • Use tree recursion rather than iterate messages mindlessly!

initialize(DAG) {
    Pick root
    Set all variables as: $\psi_{C_i} = p(x_i \mid \pi_i)$, $\phi_S = 1$
}

collectEvidence(node) {
    for each child of node {
        update1(node, collectEvidence(child));
    }
    return(node);
}

distributeEvidence(node) {
    for each child of node {
        update2(child, node);
        distributeEvidence(child);
    }
}

update1(node w, node v) {
    $\phi^{*}_{V \cap W} = \sum_{V \setminus (V \cap W)} \psi_V$, $\quad \psi_W = \frac{\phi^{*}_{V \cap W}}{\phi_{V \cap W}} \psi_W$
}

update2(node w, node v) {
    $\phi^{**}_{V \cap W} = \sum_{V \setminus (V \cap W)} \psi_V$, $\quad \psi_W = \frac{\phi^{**}_{V \cap W}}{\phi^{*}_{V \cap W}} \psi_W$
}

normalize() {
    $p(X_C) = \frac{1}{\sum_C \psi^{**}_C} \psi^{**}_C \;\; \forall C$, $\quad p(X_S) = \frac{1}{\sum_S \phi^{**}_S} \phi^{**}_S \;\; \forall S$
}
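The recursion can be made concrete with a small runnable sketch. It assumes dictionary-based tables, binary variables, and a three-clique tree rooted at BC with children AB and CD; the `Clique` class, the `absorb` helper (which plays the role of both update1 and update2), and all CPT values are hypothetical choices for illustration, not part of the original slides.

```python
from itertools import product

class Clique:
    def __init__(self, variables, table):
        self.vars = tuple(variables)   # e.g. ("A", "B")
        self.table = dict(table)       # assignment tuple -> potential value
        self.children = []             # list of (child, separator vars, separator table)

def marginal(clique, keep):
    """Sum the clique table down to the variables in `keep`."""
    idx = [clique.vars.index(v) for v in keep]
    out = {}
    for assign, val in clique.table.items():
        key = tuple(assign[i] for i in idx)
        out[key] = out.get(key, 0.0) + val
    return out

def absorb(src, dst, sep_vars, sep):
    """phi* = sum_{V \\ S} psi_V, then psi_W = (phi* / phi) psi_W, then phi = phi*."""
    new = marginal(src, sep_vars)
    idx = [dst.vars.index(v) for v in sep_vars]
    for assign in dst.table:
        key = tuple(assign[i] for i in idx)
        dst.table[assign] *= new[key] / sep[key] if sep[key] else 0.0
    sep.clear()
    sep.update(new)

def collect(node):
    for child, sep_vars, sep in node.children:
        absorb(collect(child), node, sep_vars, sep)   # update1(node, child)
    return node

def distribute(node):
    for child, sep_vars, sep in node.children:
        absorb(node, child, sep_vars, sep)            # update2(child, node)
        distribute(child)

def normalize(node):
    z = sum(node.table.values())
    for key in node.table:
        node.table[key] /= z
    for child, _, sep in node.children:
        zs = sum(sep.values())
        for key in sep:
            sep[key] /= zs
        normalize(child)

# Hypothetical CPTs for A -> B -> C -> D, with evidence D = 1.
pa = {0: 0.6, 1: 0.4}
pb_a = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}   # key (a, b)
pc_b = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}   # key (b, c)
pd_c = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.1, (1, 1): 0.9}   # key (c, d)

pairs = list(product((0, 1), repeat=2))
ab = Clique(("A", "B"), {(a, b): pa[a] * pb_a[(a, b)] for a, b in pairs})
bc = Clique(("B", "C"), pc_b)
cd = Clique(("C", "D"), {(c, d): pd_c[(c, d)] if d == 1 else 0.0 for c, d in pairs})
ones = lambda: {(0,): 1.0, (1,): 1.0}
bc.children = [(ab, ("B",), ones()), (cd, ("C",), ones())]

collect(bc)
distribute(bc)
normalize(bc)   # ab.table now holds p(a, b | D = 1) in this sketch
```

Each separator is touched exactly twice, once on the way up and once on the way down, in contrast to repeating pairwise updates until convergence.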

Algorithmic Complexity

  • The 5 steps of JTA are all efficient:

1) Moralization

Polynomial in # of nodes

2) Introduce Evidence (fixed or constant)

Polynomial in # of nodes (convert pdf to slices)

3) Triangulate (Tarjan & Yannakakis 1984)

Suboptimal = polynomial, optimal = NP-hard

4) Construct Junction Tree (Kruskal)

Polynomial in # of cliques

5) Junction Tree Algorithm (Init,Collect,Distribute,Normalize)

Polynomial (linear) in # of cliques, Exponential in Clique Cardinality
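To see why step 5 is linear in the number of cliques but exponential in clique cardinality, it helps to count table entries. The helper below is a rough sketch that assumes every variable has the same number of states `k`; the function name is hypothetical, and the count ignores separator tables and constant factors.

```python
def jta_table_entries(clique_sizes, k):
    """Total number of clique-table entries when every variable has k states.

    Each clique over `size` variables needs a table with k**size entries,
    and each message pass touches every entry a constant number of times.
    """
    return sum(k ** size for size in clique_sizes)

small = jta_table_entries([2] * 6, k=2)         # six pairwise cliques
twice_as_many = jta_table_entries([2] * 12, k=2)
big_cliques = jta_table_entries([10] * 6, k=2)  # same count, larger cliques
```

Doubling the number of cliques doubles the work, while growing each clique from 2 to 10 binary variables multiplies each table by 2^8.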

Junction Tree Algorithm

  • Convert Directed Graph to Junction Tree
  • Initialize separators to 1 (and Z=1) and set clique tables to the CPTs in the Directed Graph

  • Run Collect, Distribute, Normalize
  • Get valid marginals from all ψ , φ tables

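$$p(X) = \frac{1}{Z} \frac{\prod_C \psi(X_C)}{\prod_S \phi(X_S)} = \frac{1}{1} \cdot \frac{p(x_1, x_2)\, p(x_3 \mid x_2)\, p(x_4 \mid x_3)\, p(x_5 \mid x_3)\, p(x_6 \mid x_5)\, p(x_7 \mid x_5)}{1 \times 1 \times 1 \times 1 \times 1}$$

[Figure: junction tree with cliques x1x2, x2x3, x3x4, x3x5, x5x6, x5x7]

$$p(X) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_2)\, p(x_4 \mid x_3)\, p(x_5 \mid x_3)\, p(x_6 \mid x_5)\, p(x_7 \mid x_5)$$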
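For the six pairwise cliques of this example, {x1,x2}, {x2,x3}, {x3,x4}, {x3,x5}, {x5,x6}, {x5,x7}, the junction tree construction step can be sketched as a maximum-weight spanning tree, where each candidate edge is weighted by the size of the separator it would create. This is a minimal Kruskal-style sketch with a hypothetical function name; the tree it returns is one valid choice among several, since all candidate edges here tie at weight 1.

```python
from itertools import combinations

cliques = [
    {"x1", "x2"}, {"x2", "x3"}, {"x3", "x4"},
    {"x3", "x5"}, {"x5", "x6"}, {"x5", "x7"},
]

def junction_tree_edges(cliques):
    """Kruskal on the clique graph: heaviest separators first, skip cycles."""
    candidates = sorted(
        ((len(cliques[i] & cliques[j]), i, j)
         for i, j in combinations(range(len(cliques)), 2)
         if cliques[i] & cliques[j]),
        reverse=True,
    )
    parent = list(range(len(cliques)))   # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    tree = []
    for _, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:                        # edge joins two components
            parent[ri] = rj
            tree.append((i, j, cliques[i] & cliques[j]))
    return tree

tree = junction_tree_edges(cliques)   # list of (clique index, clique index, separator)
```

Six cliques give five separators, matching the five factors of 1 in the denominator of the initialized junction tree.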

ArgMax Junction Tree Algorithm

  • We can also use JTA to find the max rather than the sum over the joint, to get the argmax of marginals & conditionals
  • Say we have some evidence:

$$p(X_F, X_E) = p(x_1, \ldots, x_n, x_{n+1}, \ldots, x_N)$$

  • Most likely (highest p) $X_F$?

$$X_F^{*} = \arg\max_{X_F} p(X_F, X_E)$$

  • What is the most likely state of a patient with fever & headache?

$$p_F^{*} = \max_{x_2, x_3, x_4, x_5} p(x_1{=}1, x_2, x_3, x_4, x_5, x_6{=}1)$$

$$= \max_{x_2} p(x_2 \mid x_1{=}1)\, p(x_1{=}1) \max_{x_3} p(x_3 \mid x_1{=}1) \max_{x_4} p(x_4 \mid x_2) \max_{x_5} p(x_5 \mid x_3)\, p(x_6{=}1 \mid x_2, x_5)$$

  • Solution: replace sum with max inside JTA update code:

$$\phi^{*}_{V \cap W} = \max_{V \setminus (V \cap W)} \psi_V, \quad \psi_W = \frac{\phi^{*}_{V \cap W}}{\phi_{V \cap W}} \psi_W$$

$$\phi^{**}_{V \cap W} = \max_{V \setminus (V \cap W)} \psi_V, \quad \psi_W = \frac{\phi^{**}_{V \cap W}}{\phi^{*}_{V \cap W}} \psi_W$$

  • Final potentials are max marginals:

$$\psi^{**}(X_C) = \max_{U \setminus C} p(X)$$

  • Highest value in potential is most likely:

$$X_C^{*} = \arg\max_{C} \psi^{**}(X_C)$$
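The step of pushing each max inside the product can be checked numerically. The sketch below assumes the six-variable network implied by the factorization above (x1 -> x2, x1 -> x3, x2 -> x4, x3 -> x5, and x2, x5 -> x6), binary variables, and made-up CPT values; it compares a brute-force max over x2..x5 with the nested form, using evidence x1 = 1 and x6 = 1.

```python
from itertools import product

# Hypothetical CPTs (illustrative values only), indexed [parent][child].
p1 = [0.7, 0.3]                          # p(x1)
p2_1 = [[0.8, 0.2], [0.3, 0.7]]          # p(x2 | x1)
p3_1 = [[0.6, 0.4], [0.1, 0.9]]          # p(x3 | x1)
p4_2 = [[0.5, 0.5], [0.25, 0.75]]        # p(x4 | x2)
p5_3 = [[0.9, 0.1], [0.35, 0.65]]        # p(x5 | x3)
p6_25 = {                                # p(x6 | x2, x5), keyed by (x2, x5)
    (0, 0): [0.95, 0.05], (0, 1): [0.4, 0.6],
    (1, 0): [0.3, 0.7], (1, 1): [0.1, 0.9],
}

def joint(x1, x2, x3, x4, x5, x6):
    return (p1[x1] * p2_1[x1][x2] * p3_1[x1][x3] * p4_2[x2][x4]
            * p5_3[x3][x5] * p6_25[(x2, x5)][x6])

# Brute force: enumerate all 2^4 settings of the unobserved variables.
brute = max(joint(1, x2, x3, x4, x5, 1)
            for x2, x3, x4, x5 in product((0, 1), repeat=4))

# Nested form: each max is pushed as far right as its variable allows.
nested = max(
    p2_1[1][x2] * p1[1]
    * max(
        p3_1[1][x3]
        * max(p4_2[x2][x4] for x4 in (0, 1))
        * max(p5_3[x3][x5] * p6_25[(x2, x5)][1] for x5 in (0, 1))
        for x3 in (0, 1)
    )
    for x2 in (0, 1)
)
```

The two values agree because all factors are nonnegative, so max distributes over the product in the same way that sum does in the ordinary JTA.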

ArgMax Junction Tree Algorithm

  • Why do I need the ArgMax junction tree algorithm?
  • Can't I just compute marginals using the Sum algorithm and then find the highest value in each marginal?
  • No!! Here's a counter-example:
  • Most likely is x1*=C and x2*=0
  • But the sub-marginals p(x1) and p(x2) do not reveal this…
  • The marginals would falsely imply that x1*=A and x2*=1

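$$p(x_1, x_2) = \begin{array}{c|ccc} & x_1{=}A & x_1{=}B & x_1{=}C \\ \hline x_2{=}0 & .14 & .05 & .27 \\ x_2{=}1 & .24 & .20 & .10 \end{array}$$

$$p(x_2) = \begin{array}{c|c} x_2{=}0 & .46 \\ x_2{=}1 & .54 \end{array}$$

$$p(x_1) = \begin{array}{ccc} A & B & C \\ \hline .38 & .25 & .37 \end{array}$$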
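The counter-example can be verified directly from the joint table p(x1, x2) with entries .14, .05, .27 for x2 = 0 and .24, .20, .10 for x2 = 1. The short sketch below (variable names are illustrative) computes both marginals, then compares the argmax of the joint with the pair obtained by maximizing each marginal separately.

```python
# Joint table from the counter-example, keyed by (x1, x2).
joint = {
    ("A", 0): 0.14, ("B", 0): 0.05, ("C", 0): 0.27,
    ("A", 1): 0.24, ("B", 1): 0.20, ("C", 1): 0.10,
}

# Sum-marginals p(x1) and p(x2).
p_x1, p_x2 = {}, {}
for (x1, x2), p in joint.items():
    p_x1[x1] = p_x1.get(x1, 0.0) + p
    p_x2[x2] = p_x2.get(x2, 0.0) + p

# Most likely joint configuration versus per-marginal maximizers.
joint_argmax = max(joint, key=joint.get)
marginal_argmax = (max(p_x1, key=p_x1.get), max(p_x2, key=p_x2.get))
```

Here the joint is maximized at (C, 0), while the marginals point to (A, 1), a configuration with lower joint probability.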